This is a preliminary patch to add hw csum hint support to
mvneta/mvpp2 xdp implementation
Tested-by: Matteo Croce <mcroce@linux.microsoft.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 'has_phy' field from struct mvpp2_port is no longer used.
Remove it.
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that the MDIO and phylink are supported in the ACPI
world, enable to use them in the mvpp2 driver. Ensure a backward
compatibility with the firmware whose ACPI description does
not contain the necessary elements for the proper phy handling
and fall back to relying on the link interrupts instead.
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most of the time during the RX is caused by the compound_head() call
done at the end of the RX loop:
│ build_skb():
[...]
│ static inline struct page *compound_head(struct page *page)
│ {
│ unsigned long head = READ_ONCE(page->compound_head);
65.23 │ ldr x2, [x1, #8]
Prefetch the page struct as soon as possible, to speedup the RX path
noticeabily by a ~3-4% packet rate in a drop test.
│ build_skb():
[...]
│ static inline struct page *compound_head(struct page *page)
│ {
│ unsigned long head = READ_ONCE(page->compound_head);
17.92 │ ldr x2, [x1, #8]
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the RX buffer, the received data starts after a headroom used to
align the IP header and to allow prepending headers efficiently.
The prefetch() should take this into account, and prefetch from
the very start of the received data.
We can see that ether_addr_equal_64bits(), which is the first function
to access the data, drops from the top of the perf top output.
prefetch(data):
Overhead Shared Object Symbol
11.64% [kernel] [k] eth_type_trans
prefetch(data + MVPP2_MH_SIZE + MVPP2_SKB_HEADROOM):
Overhead Shared Object Symbol
13.42% [kernel] [k] build_skb
10.35% [mvpp2] [k] mvpp2_rx
9.35% [kernel] [k] __netif_receive_skb_core
8.24% [kernel] [k] kmem_cache_free
7.97% [kernel] [k] dev_gro_receive
7.68% [kernel] [k] page_pool_put_page
7.32% [kernel] [k] kmem_cache_alloc
7.09% [mvpp2] [k] mvpp2_bm_pool_put
3.36% [kernel] [k] eth_type_trans
Also, move the eth_type_trans() call a bit down, to give the RAM more
time to prefetch the data.
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If Link Partner sends frames larger than RX buffer size, MAC mark it
as oversize but still would pass it to the Packet Processor.
In this scenario, Packet Processor scatter frame between multiple buffers,
but only a single buffer would be returned to the Buffer Manager pool and
it would not refill the poll.
Patch add handling of oversize error with buffer header handling, so all
buffers would be returned to the Buffer Manager pool.
Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Reported-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove some unreachable code, so to suppress this warning:
drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c: In function ‘mvpp2_prs_tcam_first_free’:
drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c:397:10: warning: comparison is always false due to limited range of data type [-Wtype-limits]
397 | if (end >= MVPP2_PRS_TCAM_SRAM_SIZE)
| ^~
Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mvpp2_run_xdp() has an unused parameter rxq, remove it.
Fixes: 07dd0a7aae ("mvpp2: add basic XDP support")
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
device_property_read_u32() may return different error codes.
Propagate it to the caller.
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace open coded variants of devm_clk_get_optional().
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the device_get_match_data() helper instead of open coding.
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In each iteration fwnode_for_each_available_child_node() bumps a reference
counting of a loop variable followed by dropping in on a next iteration,
Since in error case the loop is broken, we have to drop a reference count
by ourselves. Do it for port_fwnode in error case during ->probe().
Fixes: 248122212f ("net: mvpp2: use device_*/fwnode_* APIs instead of of_*")
Cc: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add parser entries for different IPv4 IHL values.
Each entry will set the L4 header offset according to the IPv4 IHL field.
L3 header offset will set during the parsing of the IPv4 protocol.
Because of missed parser support for IP header length > 20, RX IPv4 checksum HW offload fails
and skb->ip_summed set to CHECKSUM_NONE(checksum done by Network stack).
This patch adds RX IPv4 checksum HW offload capability for frames with IP header length > 20.
v1 --> v2
- Improve commit message.
Suggested-by: Dana Vardi <danat@marvell.com>
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We want to change the current ndo_xdp_xmit drop semantics because it will
allow us to implement better queue overflow handling. This is working
towards the larger goal of a XDP TX queue-hook. Move XDP_REDIRECT error
path handling from each XDP ethernet driver to devmap code. According to
the new APIs, the driver running the ndo_xdp_xmit pointer, will break tx
loop whenever the hw reports a tx error and it will just return to devmap
caller the number of successfully transmitted frames. It will be devmap
responsibility to free dropped frames.
Move each XDP ndo_xdp_xmit capable driver to the new APIs:
- veth
- virtio-net
- mvneta
- mvpp2
- socionext
- amazon ena
- bnxt
- freescale (dpaa2, dpaa)
- xen-frontend
- qede
- ice
- igb
- ixgbe
- i40e
- mlx5
- ti (cpsw, cpsw-new)
- tun
- sfc
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Reviewed-by: Camelia Groza <camelia.groza@nxp.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Shay Agroskin <shayagr@amazon.com>
Link: https://lore.kernel.org/bpf/ed670de24f951cfd77590decf0229a0ad7fd12f6.1615201152.git.lorenzo@kernel.org
PPv2 loopback port doesn't support RSS, so we should
skip RSS configurations for this port.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Reviewed-by: Marcin Wojtas <mw@semihalf.com>
Link: https://lore.kernel.org/r/1613652123-19021-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
With MTU less than 1500B on all ports, the driver uses per CPU pool mode.
If one of the ports set to jumbo frame MTU size, all ports move
to shared pools mode.
Here, buffer manager TX Flow Control reconfigured on all ports.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1KB is enough for loopback port, so 2KB can be distributed
between other ports.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
GENCONF_CTRL0_PORTX naming improved.
Non functional change.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use PTR_ERR_OR_ZERO instead of IS_ERR and PTR_ERR.
Non functional change.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use >= MVPP22 instead of != MVPP21.
Non functional change.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PPv2.1 contain 0 in Version ID register, priv->hw_version check
can be removed.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The condition should be skipped if CPU ID equal to nthreads.
The patch doesn't fix any actual issue since
nthreads = min_t(unsigned int, num_present_cpus(), MVPP2_MAX_THREADS).
On all current Armada platforms, the number of CPU's is
less than MVPP2_MAX_THREADS.
Fixes: e531f76757 ("net: mvpp2: handle cases where more CPUs are available than s/w threads")
Reported-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch check that TX FC firmware is running in CM3.
If not, global TX FC would be disabled.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Acked-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fix GMAC TX flow control autoneg.
Flow control autoneg wrongly were disabled with enabled TX
flow control.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Acked-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
New FIFO flow control feature was added in PPv23.
PPv2 FIFO polled by HW and trigger pause frame if FIFO
fill level is below threshold.
FIFO HW flow control enabled with CM3 RXQ&BM flow
control with ethtool.
Current FIFO thresholds is:
9KB for port with maximum speed 10Gb/s port
4KB for port with maximum speed 5Gb/s port
2KB for port with maximum speed 1Gb/s port
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Acked-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PP2v23 hardware supports a feature allowing to double the
size of BPPI by decreasing number of pools from 16 to 8.
Increasing of BPPI size protect BM drop from BPPI underrun.
Underrun could occurred due to stress on DDR and as result slow buffer
transition from BPPE to BPPI.
New BPPI threshold recommended by spec is:
BPPI low threshold - 640 buffers
BPPI high threshold - 832 buffers
Supported only in PPv23.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Acked-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch add ethtool flow control configuration support.
Tx flow control retrieved correctly by ethtool get function.
FW per port ethtool configuration capability added.
Patch also takes care about mtu change procedure, if PPv2 switch
BM pools during mtu change.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Acked-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds RXQ flow control configurations.
Flow control disabled by default.
Minimum ring size limited to 1024 descriptors.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Acked-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch enables global flow control in FW and in the phylink validate mask.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Acked-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The firmware needs to monitor the RX Non-occupied descriptor
bits for flow control to move to XOFF mode.
These bits need to be unmasked to be functional, but they will
not raise interrupts as we leave the RX exception summary
bit in MVPP2_ISR_RX_TX_MASK_REG clear.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Acked-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Flow Control periodic timer would be used if port in
XOFF to transmit periodic XOFF frames.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Acked-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
BM pool and RXQ size increased to support Firmware Flow Control.
Minimum depletion thresholds to support FC are 1024 buffers.
BM pool size increased to 2048 to have some 1024 buffers
space between depletion thresholds and BM pool size.
Jumbo frames require a 9888B buffer, so memory requirements
for data buffers increased from 7MB to 24MB.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Acked-by: Marcin Wojtas <mw@semihalf.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch add PPv23 version definition.
PPv23 is new packet processor in CP115.
Everything that supported by PPv22, also supported by PPv23.
No functional changes in this stage.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Acked-by: Marcin Wojtas <mw@semihalf.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently we have PP2v1 and PP2v2 hw-versions, with some different
handlers depending upon condition hw_version = MVPP21/MVPP22.
In a future there will be also PP2v3. Let's use now the generic
"if equal/notEqual MVPP21" for all cases instead of "if MVPP22".
This patch does not change any functionality.
It is not intended to introduce PP2v3.
It just modifies MVPP21/MVPP22 check-condition
bringing it to generic and unified form correct for new-code
introducing and PP2v3 net-next generation.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Acked-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds CM3 memory map.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This entry used when skipping the parser needed,
for example, the custom header pretended to ethernet header.
Suggested-by: Liron Himi <liron@marvell.com>
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Last TCAM data contains TCAM enable bit.
It should be written after SRAM data before entry enabled.
Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Link: https://lore.kernel.org/r/1612172139-28343-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Daniel Borkmann says:
====================
pull-request: bpf-next 2021-01-16
1) Extend atomic operations to the BPF instruction set along with x86-64 JIT support,
that is, atomic{,64}_{xchg,cmpxchg,fetch_{add,and,or,xor}}, from Brendan Jackman.
2) Add support for using kernel module global variables (__ksym externs in BPF
programs) retrieved via module's BTF, from Andrii Nakryiko.
3) Generalize BPF stackmap's buildid retrieval and add support to have buildid
stored in mmap2 event for perf, from Jiri Olsa.
4) Various fixes for cross-building BPF sefltests out-of-tree which then will
unblock wider automated testing on ARM hardware, from Jean-Philippe Brucker.
5) Allow to retrieve SOL_SOCKET opts from sock_addr progs, from Daniel Borkmann.
6) Clean up driver's XDP buffer init and split into two helpers to init per-
descriptor and non-changing fields during processing, from Lorenzo Bianconi.
7) Minor misc improvements to libbpf & bpftool, from Ian Rogers.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (41 commits)
perf: Add build id data in mmap2 event
bpf: Add size arg to build_id_parse function
bpf: Move stack_map_get_build_id into lib
bpf: Document new atomic instructions
bpf: Add tests for new BPF atomic operations
bpf: Add bitwise atomic instructions
bpf: Pull out a macro for interpreting atomic ALU operations
bpf: Add instructions for atomic_[cmp]xchg
bpf: Add BPF_FETCH field / create atomic_fetch_add instruction
bpf: Move BPF_STX reserved field check into BPF_STX verifier code
bpf: Rename BPF_XADD and prepare to encode other atomics in .imm
bpf: x86: Factor out a lookup table for some ALU opcodes
bpf: x86: Factor out emission of REX byte
bpf: x86: Factor out emission of ModR/M for *(reg + off)
tools/bpftool: Add -Wall when building BPF programs
bpf, libbpf: Avoid unused function warning on bpf_tail_call_static
selftests/bpf: Install btf_dump test cases
selftests/bpf: Fix installation of urandom_read
selftests/bpf: Move generated test files to $(TEST_GEN_FILES)
selftests/bpf: Fix out-of-tree build
...
====================
Link: https://lore.kernel.org/r/20210116012922.17823-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Patch didn't fix any issue, just improve parse flow
and align ipv4 parse flow with ipv6 parse flow.
Currently ipv4 kenguru parser first check IP protocol(TCP/UDP)
and then destination IP address.
Patch introduce reverse ipv4 parse, first destination IP address parsed
and only then IP protocol.
This would allow extend capability for packet L4 parsing and align ipv4
parsing flow with ipv6.
Suggested-by: Liron Himi <liron@marvell.com>
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Link: https://lore.kernel.org/r/1610289059-14962-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Packet Processor hardware not connected to MAC flow control unit and
cannot support TX flow control.
This patch disable flow control support.
Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Acked-by: Marcin Wojtas <mw@semihalf.com>
Link: https://lore.kernel.org/r/1610306582-16641-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The packet coalescing interrupt threshold has separated registers
for different aggregated/cpu (sw-thread). The required value should
be loaded for every thread but not only for 1 current cpu.
Fixes: 213f428f50 ("net: mvpp2: add support for TX interrupts and RX queue distribution modes")
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Link: https://lore.kernel.org/r/1608748521-11033-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
During GoP port 2 Networking Complex Control mode of operation configurations,
also GoP port 3 mode of operation was wrongly set.
Patch removes these configurations.
Fixes: f84bf386f3 ("net: mvpp2: initialize the GoP")
Acked-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Link: https://lore.kernel.org/r/1608462149-1702-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Current PPPoE+IPv6 entry is jumping to 'next-hdr'
field and not to 'DIP' field as done for IPv4.
Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Reported-by: Liron Himi <lironh@marvell.com>
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Link: https://lore.kernel.org/r/1608230266-22111-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Issue:
Flow control frame used to pause GoP(MAC) was delivered to the CPU
and created a load on the CPU. Since XOFF/XON frames are used only
by MAC, these frames should be dropped inside MAC.
Fix:
According to 802.3-2012 - IEEE Standard for Ethernet pause frame
has unique destination MAC address 01-80-C2-00-00-01.
Add TCAM parser entry to track and drop pause frames by destination MAC.
Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Link: https://lore.kernel.org/r/1608229817-21951-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Force link UP can be enabled by bootloader during tftpboot
and breaks NFS support.
Force link UP disabled during port init procedure.
Fixes: f84bf386f3 ("net: mvpp2: initialize the GoP")
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Acked-by: Marcin Wojtas <mw@semihalf.com>
Link: https://lore.kernel.org/r/1608216735-14501-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexei Starovoitov says:
====================
pull-request: bpf-next 2020-12-03
The main changes are:
1) Support BTF in kernel modules, from Andrii.
2) Introduce preferred busy-polling, from Björn.
3) bpf_ima_inode_hash() and bpf_bprm_opts_set() helpers, from KP Singh.
4) Memcg-based memory accounting for bpf objects, from Roman.
5) Allow bpf_{s,g}etsockopt from cgroup bind{4,6} hooks, from Stanislav.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (118 commits)
selftests/bpf: Fix invalid use of strncat in test_sockmap
libbpf: Use memcpy instead of strncpy to please GCC
selftests/bpf: Add fentry/fexit/fmod_ret selftest for kernel module
selftests/bpf: Add tp_btf CO-RE reloc test for modules
libbpf: Support attachment of BPF tracing programs to kernel modules
libbpf: Factor out low-level BPF program loading helper
bpf: Allow to specify kernel module BTFs when attaching BPF programs
bpf: Remove hard-coded btf_vmlinux assumption from BPF verifier
selftests/bpf: Add CO-RE relocs selftest relying on kernel module BTF
selftests/bpf: Add support for marking sub-tests as skipped
selftests/bpf: Add bpf_testmod kernel module for testing
libbpf: Add kernel module BTF support for CO-RE relocations
libbpf: Refactor CO-RE relocs to not assume a single BTF object
libbpf: Add internal helper to load BTF data by FD
bpf: Keep module's btf_data_size intact after load
bpf: Fix bpf_put_raw_tracepoint()'s use of __module_address()
selftests/bpf: Add Userspace tests for TCP_WINDOW_CLAMP
bpf: Adds support for setting window clamp
samples/bpf: Fix spelling mistake "recieving" -> "receiving"
bpf: Fix cold build of test_progs-no_alu32
...
====================
Link: https://lore.kernel.org/r/20201204021936.85653-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix to return negative error code -ENOENT from invalid configuration
error handling case instead of 0, as done elsewhere in this function.
Fixes: 4bb0432628 ("net: mvpp2: phylink support")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20201203141806.37966-1-wanghai38@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add napi_id to the xdp_rxq_info structure, and make sure the XDP
socket pick up the napi_id in the Rx path. The napi_id is used to find
the corresponding NAPI structure for socket busy polling.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/bpf/20201130185205.196029-7-bjorn.topel@gmail.com
Tx/Rx FIFO is a HW resource limited by total size, but shared
by all ports of same CP110 and impacting port-performance.
Do not divide the FIFO for ports which are not enabled in DTS,
so active ports could have more FIFO.
No change in FIFO allocation if all 3 ports on the communication
processor enabled in DTS.
The active port mapping should be done in probe before FIFO-init.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/1606154073-28267-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
rivers/net/ethernet/marvell/mvpp2/mvpp2_main.c:7084:36: warning: ‘mvpp2_acpi_match’ defined but not used [-Wunused-const-variable=]
7084 | static const struct acpi_device_id mvpp2_acpi_match[] = {
| ^~~~~~~~~~~~~~~~
Wrap the definition inside #ifdef/#endif.
Compile tested only.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Richard Cochran points out that SKBTX_IN_PROGRESS should be set when
the skbuff is queued for timestamping. Add this.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the functions mvpp2_isr_handle_xlg() and
mvpp2_isr_handle_gmac_internal(), the bool variable link is assigned a
true value in the case that a given bit of val is set. However, if the
bit is unset, no value is assigned to link and it is then passed to
mvpp2_isr_handle_link() without being initialised. Fix by assigning to
link the value of the bit test.
Build-tested on x86.
Fixes: 36cfd3a6e5 ("net: mvpp2: restructure "link status" interrupt handling")
Signed-off-by: Alex Dewar <alex.dewar90@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for timestamping transmit packets. We allocate SYNC
messages to queue 1, every other message to queue 0.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for receive timestamping. When enabled, the hardware adds
a timestamp into the receive queue descriptor for all received packets
with no filtering. Hence, we can only support NONE or ALL receive
filter modes.
The timestamp in the receive queue contains two bit sof seconds and
the full nanosecond timestamp. This has to be merged with the remainder
of the seconds from the TAI clock to arrive at a full timestamp before
we can convert it to a ktime for the skb hardware timestamp field.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for the TAI block in the mvpp2.2 hardware.
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check the first level interrupt status registers to determine how to
further process the port interrupt. We will need this to know whether
to invoke the link status processing and/or the PTP processing for
both XLG and GMAC.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The link interrupt is used for way more than just the link status; it
comes from a collection of units to do with the port. The Marvell
documentation describes the interrupt as "GOP port X interrupt".
Since we are adding PTP support, and the PTP interrupt uses this,
rename it to be more inline with the documentation.
This interrupt is also mis-named in the DT binding, but we leave that
alone.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The "link status" interrupt is used for more than just link status.
Restructure mvpp2_link_status_isr() so we can add additional handling.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
We got slightly different patches removing a double word
in a comment in net/ipv4/raw.c - picked the version from net.
Simple conflict in drivers/net/ethernet/ibm/ibmvnic.c. Use cached
values instead of VNIC login response buffer (following what
commit 507ebe6444 ("ibmvnic: Fix use-after-free of VNIC login
response buffer") did).
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Split the XLG and GMAC PCS implementations and switch between them
during the mac_prepare() method.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert mvpp2 to phylink's new pcs support.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the GMAC reset handling into mac_prepare() / mac_finish()
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ensure that the port is forced down while reconfiguring, controlling
this via mac_prepare() and mac_finish() so that it is down while we
are configuring our (future) PCS.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert mvpp2 to use the mac_prepare() and mac_finish() methods in
preparation to converting mvpp2 to split-PCS support.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tidy up the ACPI hack so that we can minimise the function prototypes
for this. This avoids adding further prototypes unnecessarily.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf-next 2020-08-04
The following pull-request contains BPF updates for your *net-next* tree.
We've added 73 non-merge commits during the last 9 day(s) which contain
a total of 135 files changed, 4603 insertions(+), 1013 deletions(-).
The main changes are:
1) Implement bpf_link support for XDP. Also add LINK_DETACH operation for the BPF
syscall allowing processes with BPF link FD to force-detach, from Andrii Nakryiko.
2) Add BPF iterator for map elements and to iterate all BPF programs for efficient
in-kernel inspection, from Yonghong Song and Alexei Starovoitov.
3) Separate bpf_get_{stack,stackid}() helpers for perf events in BPF to avoid
unwinder errors, from Song Liu.
4) Allow cgroup local storage map to be shared between programs on the same
cgroup. Also extend BPF selftests with coverage, from YiFei Zhu.
5) Add BPF exception tables to ARM64 JIT in order to be able to JIT BPF_PROBE_MEM
load instructions, from Jean-Philippe Brucker.
6) Follow-up fixes on BPF socket lookup in combination with reuseport group
handling. Also add related BPF selftests, from Jakub Sitnicki.
7) Allow to use socket storage in BPF_PROG_TYPE_CGROUP_SOCK-typed programs for
socket create/release as well as bind functions, from Stanislav Fomichev.
8) Fix an info leak in xsk_getsockopt() when retrieving XDP stats via old struct
xdp_statistics, from Peilin Ye.
9) Fix PT_REGS_RC{,_CORE}() macros in libbpf for MIPS arch, from Jerry Crunchtime.
10) Extend BPF kernel test infra with skb->family and skb->{local,remote}_ip{4,6}
fields and allow user space to specify skb->dev via ifindex, from Dmitry Yakunin.
11) Fix a bpftool segfault due to missing program type name and make it more robust
to prevent them in future gaps, from Quentin Monnet.
12) Consolidate cgroup helper functions across selftests and fix a v6 localhost
resolver issue, from John Fastabend.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Release skb memory in mvpp2_rx() if mvpp2_rx_refill routine fails
Fixes: b501585467 ("net: mvpp2: fix refilling BM pools in RX path")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that BPF program/link management is centralized in generic net_device
code, kernel code never queries program id from drivers, so
XDP_QUERY_PROG/XDP_QUERY_PROG_HW commands are unnecessary.
This patch removes all the implementations of those commands in kernel, along
the xdp_attachment_query().
This patch was compile-tested on allyesconfig.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200722064603.3350758-10-andriin@fb.com
priv->page_pool is an array, so comparing against it will always return true.
Do a meaningful check by checking priv->page_pool[0] instead.
While at it, clear the page_pool pointers on deallocation, or when an
allocation error happens during init.
Reported-by: Colin Ian King <colin.king@canonical.com>
Fixes: c2d6fe6163 ("mvpp2: XDP TX support")
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the transmit part of XDP support, which includes:
- support for XDP_TX in mvpp2_xdp()
- .ndo_xdp_xmit hook for AF_XDP and XDP_REDIRECT with mvpp2 as destination
mvpp2_xdp_submit_frame() is a generic function which is called by
mvpp2_xdp_xmit_back() when doing XDP_TX, and by mvpp2_xdp_xmit when
doing AF_XDP or XDP_REDIRECT target.
The buffer allocation has been reworked to be able to map the buffers
as DMA_FROM_DEVICE or DMA_BIDIRECTIONAL depending if native XDP is
in use or not.
Co-developed-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add XDP native support.
By now only XDP_DROP, XDP_PASS and XDP_REDIRECT
verdicts are supported.
Co-developed-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the page_pool API for memory management.
This is a prerequisite for native XDP support.
Tested-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In mvpp2_swf_bm_pool_init_percpu(), a reference to a struct
mvpp2_bm_pool is obtained traversing multiple structs, when a
local variable already points to the same object.
Fix it and, while at it, give the variable a meaningful name.
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set the flow control settings in mvpp2_mac_link_up() for 10G links
just as we do for 1G and slower links. This is now the preferred
location.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to read-modify-write a register, and use it in the phylink
helpers.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to convert the struct phylink_config pointer passed in
from phylink to the drivers internal struct mvpp2_port.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mvpp2 code has tests scattered amongst the code to determine
whether the port supports the XLG, and whether the port supports
RGMII mode.
Rather than having these tests scattered, provide a couple of helper
functions, so that future additions can ensure that they get these
tests correct.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add tlen field into struct tso_t, and change tso_start()
to return skb_transport_offset(skb) + tso->tlen
This removes from callers the need to use tcp_hdrlen(skb) and
will ease UDP segmentation offload addition.
v2: calls tso_start() earlier in otx2_sq_append_tso() [Jakub]
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The remove function does not destroy all
BM Pools when per cpu pool is active.
When reloading the mvpp2 as a module the BM Pools
are still active in hardware and due to the bug
have twice the size now old + new.
This eventually leads to a kernel crash.
v2:
* add Fixes tag
Fixes: 7d04b0b13b ("mvpp2: percpu buffers")
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ethtool rx and tx queue statistics are reporting wrong values.
Fix reading out the correct ones.
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
When rxhash is enabled on any ethernet port except the first in each CP
block, traffic flow is prevented. The analysis is below:
I've been investigating this afternoon, and what I've found, comparing
a kernel without 895586d5dc and with 895586d5dc applied is:
- The table programmed into the hardware via mvpp22_rss_fill_table()
appears to be identical with or without the commit.
- When rxhash is enabled on eth2, mvpp2_rss_port_c2_enable() reports
that c2.attr[0] and c2.attr[2] are written back containing:
- with 895586d5dc, failing: 00200000 40000000
- without 895586d5dc, working: 04000000 40000000
- When disabling rxhash, c2.attr[0] and c2.attr[2] are written back as:
04000000 00000000
The second value represents the MVPP22_CLS_C2_ATTR2_RSS_EN bit, the
first value is the queue number, which comprises two fields. The high
5 bits are 24:29 and the low three are 21:23 inclusive. This comes
from:
c2.attr[0] = MVPP22_CLS_C2_ATTR0_QHIGH(qh) |
MVPP22_CLS_C2_ATTR0_QLOW(ql);
So, the working case gives eth2 a queue id of 4.0, or 32 as per
port->first_rxq, and the non-working case a queue id of 0.1, or 1.
The allocation of queue IDs seems to be in mvpp2_port_probe():
if (priv->hw_version == MVPP21)
port->first_rxq = port->id * port->nrxqs;
else
port->first_rxq = port->id * priv->max_port_rxqs;
Where:
if (priv->hw_version == MVPP21)
priv->max_port_rxqs = 8;
else
priv->max_port_rxqs = 32;
Making the port 0 (eth0 / eth1) have port->first_rxq = 0, and port 1
(eth2) be 32. It seems the idea is that the first 32 queues belong to
port 0, the second 32 queues belong to port 1, etc.
mvpp2_rss_port_c2_enable() gets the queue number from it's parameter,
'ctx', which comes from mvpp22_rss_ctx(port, 0). This returns
port->rss_ctx[0].
mvpp22_rss_context_create() is responsible for allocating that, which
it does by looking for an unallocated priv->rss_tables[] pointer. This
table is shared amongst all ports on the CP silicon.
When we write the tables in mvpp22_rss_fill_table(), the RSS table
entry is defined by:
u32 sel = MVPP22_RSS_INDEX_TABLE(rss_ctx) |
MVPP22_RSS_INDEX_TABLE_ENTRY(i);
where rss_ctx is the context ID (queue number) and i is the index in
the table.
If we look at what is written:
- The first table to be written has "sel" values of 00000000..0000001f,
containing values 0..3. This appears to be for eth1. This is table 0,
RX queue number 0.
- The second table has "sel" values of 00000100..0000011f, and appears
to be for eth2. These contain values 0x20..0x23. This is table 1,
RX queue number 0.
- The third table has "sel" values of 00000200..0000021f, and appears
to be for eth3. These contain values 0x40..0x43. This is table 2,
RX queue number 0.
How do queue numbers translate to the RSS table? There is another
table - the RXQ2RSS table, indexed by the MVPP22_RSS_INDEX_QUEUE field
of MVPP22_RSS_INDEX and accessed through the MVPP22_RXQ2RSS_TABLE
register. Before 895586d5dc, it was:
mvpp2_write(priv, MVPP22_RSS_INDEX,
MVPP22_RSS_INDEX_QUEUE(port->first_rxq));
mvpp2_write(priv, MVPP22_RXQ2RSS_TABLE,
MVPP22_RSS_TABLE_POINTER(port->id));
and after:
mvpp2_write(priv, MVPP22_RSS_INDEX, MVPP22_RSS_INDEX_QUEUE(ctx));
mvpp2_write(priv, MVPP22_RXQ2RSS_TABLE, MVPP22_RSS_TABLE_POINTER(ctx));
Before the commit, for eth2, that would've contained '32' for the
index and '1' for the table pointer - mapping queue 32 to table 1.
Remember that this is queue-high.queue-low of 4.0.
After the commit, we appear to map queue 1 to table 1. That again
looks fine on the face of it.
Section 9.3.1 of the A8040 manual seems indicate the reason that the
queue number is separated. queue-low seems to always come from the
classifier, whereas queue-high can be from the ingress physical port
number or the classifier depending on the MVPP2_CLS_SWFWD_PCTRL_REG.
We set the port bit in MVPP2_CLS_SWFWD_PCTRL_REG, meaning that queue-high
comes from the MVPP2_CLS_SWFWD_P2HQ_REG() register... and this seems to
be where our bug comes from.
mvpp2_cls_oversize_rxq_set() sets this up as:
mvpp2_write(port->priv, MVPP2_CLS_SWFWD_P2HQ_REG(port->id),
(port->first_rxq >> MVPP2_CLS_OVERSIZE_RXQ_LOW_BITS));
val = mvpp2_read(port->priv, MVPP2_CLS_SWFWD_PCTRL_REG);
val |= MVPP2_CLS_SWFWD_PCTRL_MASK(port->id);
mvpp2_write(port->priv, MVPP2_CLS_SWFWD_PCTRL_REG, val);
Setting the MVPP2_CLS_SWFWD_PCTRL_MASK bit means that the queue-high
for eth2 is _always_ 4, so only queues 32 through 39 inclusive are
available to eth2. Yet, we're trying to tell the classifier to set
queue-high, which will be ignored, to zero. Hence, the queue-high
field (MVPP22_CLS_C2_ATTR0_QHIGH()) from the classifier will be
ignored.
This means we end up directing traffic from eth2 not to queue 1, but
to queue 33, and then we tell it to look up queue 33 in the RSS table.
However, RSS table has not been programmed for queue 33, and so it ends
up (presumably) dropping the packets.
It seems that mvpp22_rss_context_create() doesn't take account of the
fact that the upper 5 bits of the queue ID can't actually be changed
due to the settings in mvpp2_cls_oversize_rxq_set(), _or_ it seems that
mvpp2_cls_oversize_rxq_set() has been missed in this commit. Either
way, these two functions mutually disagree with what queue number
should be used.
Looking deeper into what mvpp2_cls_oversize_rxq_set() and the MTU
validation is doing, it seems that MVPP2_CLS_SWFWD_P2HQ_REG() is used
for over-sized packets attempting to egress through this port. With
the classifier having had RSS enabled and directing eth2 traffic to
queue 1, we may still have packets appearing on queue 32 for this port.
However, the only way we may end up with over-sized packets attempting
to egress through eth2 - is if the A8040 forwards frames between its
ports. From what I can see, we don't support that feature, and the
kernel restricts the egress packet size to the MTU. In any case, if we
were to attempt to transmit an oversized packet, we have no support in
the kernel to deal with that appearing in the port's receive queue.
So, this patch attempts to solve the issue by clearing the
MVPP2_CLS_SWFWD_PCTRL_MASK() bit, allowing MVPP22_CLS_C2_ATTR0_QHIGH()
from the classifier to define the queue-high field of the queue number.
My testing seems to confirm my findings above - clearing this bit
means that if I enable rxhash on eth2, the interface can then pass
traffic, as we are now directing traffic to RX queue 1 rather than
queue 33. Traffic still seems to work with rxhash off as well.
Reported-by: Matteo Croce <mcroce@redhat.com>
Tested-by: Matteo Croce <mcroce@redhat.com>
Fixes: 895586d5dc ("net: mvpp2: cls: Use RSS contexts to handle RSS tables")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The "info->fs.location" is a u32 that comes from the user via the
ethtool_set_rxnfc() function. We need to check for invalid values to
prevent a buffer overflow.
I copy and pasted this check from the mvpp2_ethtool_cls_rule_ins()
function.
Fixes: 90b509b39a ("net: mvpp2: cls: Add Classification offload support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The "rss_context" variable comes from the user via ethtool_get_rxfh().
It can be any u32 value except zero. Eventually it gets passed to
mvpp22_rss_ctx() and if it is over MVPP22_N_RSS_TABLES (8) then it
results in an array overflow.
Fixes: 895586d5dc ("net: mvpp2: cls: Use RSS contexts to handle RSS tables")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
flow_action_hw_stats_types_check() helper takes one of the
FLOW_ACTION_HW_STATS_*_BIT values as input. If we align
the arguments to the opening bracket of the helper there
is no way to call this helper and stay under 80 characters.
Remove the "types" part from the new flow_action helpers
and enum values.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.
This driver did not previously reject unsupported parameters.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce flow_action_basic_hw_stats_types_check() helper and use it
in drivers. That sanitizes the drivers which do not have support
for action HW stats types.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert the Marvell mvpp2 ethernet driver to use the finalised link
parameters in mac_link_up() rather than the parameters in mac_config().
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Propagate the resolved link parameters via the mac_link_up() call for
MACs that do not automatically track their PCS state. We propagate the
link parameters via function arguments so that inappropriate members
of struct phylink_link_state can't be accessed, and creating a new
structure just for this adds needless complexity to the API.
Tested-by: Andre Przywara <andre.przywara@arm.com>
Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Switch network drivers, phy drivers, and SFP/phylink over to use the
more correct 10GBASE-R, rather than 10GBASE-KR. 10GBASE-KR is backplane
ethernet, which is 10GBASE-R with autonegotiation on top, which our
current usage on the affected platforms does not have.
The only remaining user of PHY_INTERFACE_MODE_10GKR is the Aquantia
PHY, which has a separate mode for 10GBASE-KR.
For Marvell mvpp2, we detect 10GBASE-KR, and rewrite it to 10GBASE-R
for compatibility with existing DT - this is the only network driver
at present that makes use of PHY_INTERFACE_MODE_10GKR.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Presently, at boot time, the comphys are enabled. For firmware
compatibility reasons, the comphy driver does not power down the
comphys at boot. Consequently, the ethernet comphys are left active
until the network interfaces are brought through an up/down cycle.
If the port is never used, the port wastes power needlessly. Arrange
for the ethernet comphys to be cycled by the mvpp2 driver as if the
interface went through an up/down cycle during driver probe, thereby
powering them down.
This saves:
270mW per 10G SFP+ port on the Macchiatobin Single Shot (eth0/eth1)
370mW per 10G PHY port on the Macchiatobin Double Shot (eth0/eth1)
160mW on the SFP port on either Macchiatobin flavour (eth3)
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix up the mvpp2 validate implementation to adopt the same behaviour as
mvneta:
- only allow the link modes that the specified PHY interface mode
supports with the exception of 1000base-X and 2500base-X.
- use the basex helper to deal with SFP modules that can be switched
between 1000base-X vs 2500base-X.
This gives consistent behaviour between mvneta and mvpp2.
This commit depends on "net: phylink: extend clause 45 PHY validation
workaround" so is not marked for backporting to stable kernels.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
phylink requires the MAC to report when its link status changes when
operating in inband modes. Failure to report link status changes
means that phylink has no idea when the link events happen, which
results in either the network interface's carrier remaining up or
remaining permanently down.
For example, with a fiber module, if the interface is brought up and
link is initially established, taking the link down at the far end
will cut the optical power. The SFP module's LOS asserts, we
deactivate the link, and the network interface reports no carrier.
When the far end is brought back up, the SFP module's LOS deasserts,
but the MAC may be slower to establish link. If this happens (which
in my tests is a certainty) then phylink never hears that the MAC
has established link with the far end, and the network interface is
stuck reporting no carrier. This means the interface is
non-functional.
Avoiding the link interrupt when we have phylink is basically not
an option, so remove the !port->phylink from the test.
Fixes: 4bb0432628 ("net: mvpp2: phylink support")
Tested-by: Sven Auhagen <sven.auhagen@voleatech.de>
Tested-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Rename the mac_link_state() method to mac_pcs_get_state() to make it
clear that it should be returning the MACs PCS current state, which
is used for inband negotiation rather than just reading back what the
MAC has been configured for. Update the documentation to explicitly
mention that this is for inband.
We drop the return value as well; most of phylink doesn't check the
return value and it is not clear what it should do on error - instead
arrange for state->link to be false.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
When receiving traffic, eth_type_trans() is high up on the perf top list,
because it's the first function which access the packet data.
Move the DMA unmap a bit higher, and put a prefetch just after it, so we
have more time to load the data into the cache.
The packet rate increase is about 14% with a tc drop test: 1620 => 1853 kpps
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the RX path we always sync against the maximum frame size for that pool.
Do the DMA sync and the unmap separately, so we can only sync by the
size of the received frame.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recycling in mvpp2 has gone long time ago, but two comment still refers
to it. Remove those two misleading comments as they generate confusion.
Fixes: 7ef7e1d949 ("net: mvpp2: drop useless fields in mvpp2_bm_pool and related code")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Every mvpp2 unit can use up to 8 buffers mapped by the BM (the HW buffer
manager). The HW will place the frames in the buffer pool depending on the
frame size: short (< 128 bytes), long (< 1664) or jumbo (up to 9856).
As any unit can have up to 4 ports, the driver allocates only 2 pools,
one for small and one long frames, and share them between ports.
When the first port MTU is set higher than 1664 bytes, a third pool is
allocated for jumbo frames.
This shared allocation makes impossible to use percpu allocators,
and creates contention between HW queues.
If possible, i.e. if the number of possible CPU are less than 8 and jumbo
frames are not used, switch to a new scheme: allocate 8 per-cpu pools for
short and long frames and bind every pool to an RXQ.
When the first port MTU is set higher than 1664 bytes, the allocation
scheme is reverted to the old behaviour (3 shared pools), and when all
ports MTU are lowered, the per-cpu buffers are allocated again.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Refactor mvpp2_bm_pool_create(), mvpp2_bm_pool_destroy() and
mvpp2_bm_pools_init() so that they accept a struct device instead
of a struct platform_device, as they just need platform_device->dev.
Removing such dependency makes the BM code more reusable in context
where we don't have a pointer to the platform_device.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tx_done_tasklet tasklet is used in invoke the hrtimer
(mvpp2_hr_timer_cb) in softirq context. This can be also achieved without
the tasklet but with HRTIMER_MODE_SOFT as hrtimer mode.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Maxime Chevallier <maxime.chevallier@bootlin.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nathan Huckleberry <nhuck@google.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The hardware can only offload checksum calculation on first port due to
the Tx FIFO size limitation, and has a maximum L3 offset of 128 bytes.
Document this in a comment and move duplicated code in a function.
Fixes: 576193f2d5 ("net: mvpp2: jumbo frames support")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MTU change code can call napi_disable() with the device already down,
leading to a deadlock. Also, lot of code is duplicated unnecessarily.
Rework mvpp2_change_mtu() to avoid the deadlock and remove duplicated code.
Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for unifying the skb_frag and bio_vec, use the fine
accessors which already exist and use skb_frag_t instead of
struct skb_frag_struct.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
PPv2's XLGMAC can wait for 3 idle frames before triggering a link up
event. This can cause the link to be stuck low when there's traffic on
the interface, so disable this feature.
Fixes: 4bb0432628 ("net: mvpp2: phylink support")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Users can specify classification actions based on the 'ether' flow type.
In that case, this will apply to all ethernet traffic, superseeding
flows such as 'udp4' or 'tcp6'.
Add support for this flow type in the PPv2 classifier, by mapping the
ETHER_FLOW value to the corresponding entries in the classifier.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a missing check to detect flow types that we don't support, so that
user can be informed of this.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Header Parser allows identifying various fields in the packet
headers, used for various kind of filtering and classification
steps.
This is a re-entrant process, where the offset in the packet header
depends on the previous lookup results. This offset is represented in
the SRAM results of the TCAM, as a shift to be operated.
This shift can be negative in some cases, such as in IPv6 parsing.
This commit prevents overriding the sign bit when setting the shift
value, which could cause instabilities when parsing IPv6 flows.
Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Suggested-by: Alan Winkowski <walan@marvell.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There was an unused variable 'mvpp2_dbgfs_prs_pmap_fops'
Added a usage consistent with other fops to dump pmap
to userspace.
Cc: clang-built-linux@googlegroups.com
Link: https://github.com/ClangBuiltLinux/linux/issues/529
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit allows using the vlan Id and priority as parts of the key
for classification offload. These fields are extracted from the
outermost tag, if multiple tags are present.
Vlan Id and priority are considered as 2 different fields by the
classifier, however the fields are both appended in the Header Extracted
Key in the same layout as they are found in the tags. This means that
when steering only based on the prio, a 16-bit slot is still taken in
the HEK.
The classifier doesn't allow extracting the DEI bit from the tag, so we
explicitly prevent user from using this bit in the key.
This commit adds the vlan priotity as a compatible HEK field for
tagged traffic, meaning that we limit the possibility of extracting this
field only to the flows that contain tagged traffic.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The C2 TCAM used for classification uses a key (Header Extracted Key)
built by concatenating several fields extracted from the packet header.
After a lot of trial-and-error and some guess work, it seems the HEK is
right justified, with the first fields being stored in the MSB, then
concatenated up until the LSB.
Until now, this doesn't cause any issue since all HEK fields we use are
full bytes. However this is an issue for the upcoming VLAN id and pri
extraction, which aren't full bytes.
Rework the way we built that TCAM key, by changing the order in which we
append the fields.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The way we currently handle classification offload and RSS is by having
dedicated lookup sequences in the flow table, each being selected
depending on several fields being present in the packet header.
We need to make sure the classification operation we want to perform can
be done in each flow we want to insert it into. As an example,
classifying on VLAN tag can only be done on flows used for tagged
traffic.
This commit makes sure we don't insert rules in flows we aren't
compatible with.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When performing a TCAM lookup in the C2 engine, it's possible that
multiple entries match the packet. To make sure the correct entry match
when performing a lookup, the Flow Table can set a lookup type, which
will be used in the TCAM lookup, thus preventing such false-positives.
We need to make sure the RSS match doesn't interfere with other
classification lookups, hence we use a dedicated lookup_type for it.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When removing all VID filters, the mvpp2_prs_vid_entry_remove would be
called with the TCAM id incorrectly used as a VID, causing the wrong
TCAM entries to be invalidated.
Fix this by directly invalidating entries in the VID range.
Fixes: 56beda3db6 ("net: mvpp2: Add hardware offloading for VLAN filtering")
Suggested-by: Yuri Chipchev <yuric@marvell.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VID filtering is implemented in the Header Parser, with one range of 11
vids being assigned for each no-loopback port.
Make sure we use the per-port range when looking for existing entries in
the Parser.
Since we used a global range instead of a per-port one, this causes VIDs
to be removed from the whitelist from all ports of the same PPv2
instance.
Fixes: 56beda3db6 ("net: mvpp2: Add hardware offloading for VLAN filtering")
Suggested-by: Yuri Chipchev <yuric@marvell.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Besides the MIB counters, some other useful counters can be exposed to
the user. This commit adds support for :
- Per-port counters, that indicate FIFO drops and classifier drops,
- Per-rxq counters,
- Per-txq counters
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since we'll be adding support for other kind of internal counters, make
clear that the currently supported counters are the MIB counters.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When first configuring a port on PPv2, we want to clear the internal
counters so that we don't get values from previous boot stages.
However, we can't really clear these counters when resetting the MAC,
since there are valid reasons to do so while the port is being used,
such as when reconfiguring the interface mode with the PHY.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some ISDN files that got removed in net-next had some changes
done in mainline, take the removals.
Signed-off-by: David S. Miller <davem@davemloft.net>
Use a safe strscpy call to copy the ethtool stat strings into the
relevant buffers, instead of a memcpy that will be accessing
out-of-bound data.
Fixes: 118d6298f6 ("net: mvpp2: add ethtool GOP statistics")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The phylink conflict was between a bug fix by Russell King
to make sure we have a consistent PHY interface mode, and
a change in net-next to pull some code in phylink_resolve()
into the helper functions phylink_mac_link_{up,down}()
On the dp83867 side it's mostly overlapping changes, with
the 'net' side removing a condition that was supposed to
trigger for RGMII but because of how it was coded never
actually could trigger.
Signed-off-by: David S. Miller <davem@davemloft.net>
MVPP2_TXQ_SCHED_TOKEN_CNTR_REG() expects the logical queue id but
the current code is passing the global tx queue offset, so it ends
up writing to unknown registers (between 0x8280 and 0x82fc, which
seemed to be unused by the hardware). This fixes the issue by using
the logical queue id instead.
Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix smatch warning:
drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c:1236
mvpp2_ethtool_cls_rule_ins() warn: unsigned 'info->fs.location' is never less than zero.
'info->fs.location' is u32 type, never less than zero.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The phylink_config structure will encapsulate a pointer to a struct
device and the operation type requested for this instance of PHYLINK.
This patch does not make any functional changes, it just transitions the
PHYLINK internals and all its users to the new API.
A pointer to a phylink_config structure will be passed to
phylink_create() instead of the net_device directly. Also, the same
phylink_config pointer will be passed back to all phylink_mac_ops
callbacks instead of the net_device. Using this mechanism, a PHYLINK
user can get the original net_device using a structure such as
'to_net_dev(config->dev)' or directly the structure containing the
phylink_config using a container_of call.
At the moment, only the PHYLINK_NETDEV is defined as a valid operation
type for PHYLINK. In this mode, a valid reference to a struct device
linked to the original net_device should be passed to PHYLINK through
the phylink_config structure.
This API changes is mainly driven by the necessity of adding a new
operation type in PHYLINK that disconnects the phy_device from the
net_device and also works when the net_device is lacking.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure we don't use an out-of-bound index for the per-port RSS
context array.
As of today, the global context creation in mvpp22_rss_context_create
will prevent us from reaching this case, but we should still make sure
we are using a sane value anyway.
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When steering to an RXQ, we can perform an extra RSS step to assign a
queue from an RSS table.
This is done by setting the RSS_EN attribute in the C2 engine. In that
case, the RXQ that is assigned is the global RSS context id, that is
then translated to an RSS table using the RXQ2RSS table.
An example using ethtool to steer to RXQ 2 and 3 would be :
ethtool -X eth0 weight 0 0 1 1 context new
(This would print the allocated context id, let's say it's 1)
ethtool -N eth0 flow-type udp4 dst-port 1234 context 1 loc 0
The hash parameters are the ones that are globally configured for RSS :
ethtool -N eth0 rx-flow-hash udp4 sdfn
When an RSS context is removed while there are active classification
rules using this context, these rules are removed.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ethtool_rx_flow_rule_create takes into parameter the ethtool flow spec,
which doesn't contain the rss context id. We therefore need to extract
it ourself before parsing the ethtool rule.
The FLOW_RSS flag is only set in info->fs.flow_type, and not
info->flow_type.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PPv2 controller has 8 RSS tables that are shared across all ports on
a given PPv2 instance. The previous implementation allocated one table
per port, leaving others unused.
By using RSS contexts, we can make use of multiple RSS tables per
port, one being the default table (always id 0), the other ones being
used as destinations for flow steering, in the same way as rx rings.
This commit introduces RSS contexts management in the PPv2 driver. We
always reserve one table per port, allocated when the port is probed.
The global table list is stored in the struct mvpp2, as it's a global
resource. Each port then maintains a list of indices in that global
table, that way each port can have it's own numbering scheme starting
from 0.
One limitation that seems unavoidable is that the hashing parameters are
shared across all RSS contexts for a given port. Hashing parameters for
ctx 0 will be applied to all contexts.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The C2 TCAM has internal FIFOs that are only useful for the built-in
self-tests. Disable these FIFOS at init, as recommended in the
functionnal specs.
Suggested-by: Alan Winkowski <walan@marvell.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As of today, the classification offload implementation only supports 4
different rules to be offloaded. This number has been hardcoded in the
rule insertion function, and the wrong define is being used elsewhere.
Use the correct #define everywhere to make sure we always check for the
correct number of rules.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The flow_rule is only used when configuring the classification tables,
and should be free'd once we're done using it. The current code only
frees it in the error path.
Fixes: 90b509b39a ("net: mvpp2: cls: Add Classification offload support")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>