mirror of
https://git.proxmox.com/git/mirror_ubuntu-kernels.git
synced 2025-12-10 09:20:49 +00:00
Core
----
- Allow live renaming when an interface is up
- Add retpoline wrappers for tc, considerably improving the
performance of complex queue discipline configurations.
- Add inet drop monitor support.
- A few GRO performance improvements.
- Add infrastructure for atomic dev stats, addressing long standing
data races.
- De-duplicate common code between OVS and conntrack offloading
infrastructure.
- A bunch of UBSAN_BOUNDS/FORTIFY_SOURCE improvements.
- Netfilter: introduce packet parser for tunneled packets
- Replace IPVS timer-based estimators with kthreads to scale up
the workload with the number of available CPUs.
- Add the helper support for connection-tracking OVS offload.
BPF
---
- Support for user defined BPF objects: the use case is to allocate
own objects, build own object hierarchies and use the building
blocks to build own data structures flexibly, for example, linked
lists in BPF.
- Make cgroup local storage available to non-cgroup attached BPF
programs.
- Avoid unnecessary deadlock detection and failures wrt BPF task
storage helpers.
- A sizable batch of BPF verifier fixes and improvements.
- Veristat tool improvements to support custom filtering, sorting,
and replay of results.
- Add LLVM disassembler as default library for dumping JITed code.
- Lots of new BPF documentation for various BPF maps.
- Add bpf_rcu_read_{,un}lock() support for sleepable programs.
- Add RCU grace period chaining to BPF to wait for the completion
of access from both sleepable and non-sleepable BPF programs.
- Add support for storing struct task_struct objects as kptrs in maps.
- Improve helper UAPI by explicitly defining BPF_FUNC_xxx integer
values.
- Add libbpf *_opts API-variants for bpf_*_get_fd_by_id() functions.
Protocols
---------
- TCP: implement Protective Load Balancing across switch links.
- TCP: allow dynamically disabling TCP-MD5 static key, reverting
back to fast[er]-path.
- UDP: Introduce optional per-netns hash lookup table.
- IPv6: simplify and clean up socket disposal.
- Netlink: support different type policies for each generic
netlink operation.
- MPTCP: add MSG_FASTOPEN and FastOpen listener side support.
- MPTCP: add netlink notification support for listener socket
events.
- SCTP: add VRF support, allowing SCTP sockets to bind to VRF
devices.
- Add bridging MAC Authentication Bypass (MAB) support.
- Extensions for Ethernet VPN bridging implementation to better
support multicast scenarios.
- More work for Wi-Fi 7 support, comprising conversion of all
the existing drivers to internal TX queue usage.
- IPSec: introduce a new offload type (packet offload) allowing
complete header processing and crypto offloading.
- IPSec: extended ack support for more descriptive XFRM error
reporting.
- RXRPC: increase SACK table size and move processing into a
per-local endpoint kernel thread, considerably reducing the
required locking.
- IEEE 802.15.4: synchronous send frame and extended filtering
support, initial support for scanning available 15.4 networks.
- Tun: bump the link speed from 10Mbps to 10Gbps.
- Tun/VirtioNet: implement UDP segmentation offload support.
Driver API
----------
- PHY/SFP: improve power level switching between standard
level 1 and the higher power levels.
- New API for netdev <-> devlink_port linkage.
- PTP: convert existing drivers to new frequency adjustment
implementation.
- DSA: add support for rx offloading.
- Autoload DSA tagging driver when dynamically changing protocol.
- Add new PCP and APPTRUST attributes to Data Center Bridging.
- Add configuration support for 800Gbps link speed.
- Add devlink port function attribute to enable/disable RoCE and
migratable.
- Extend devlink-rate to support strict priority and weighted fair
queuing.
- Add devlink support for reading directly from region memory.
- New device tree helper to fetch MAC address from nvmem.
- New big TCP helper to simplify temporary header stripping.
New hardware / drivers
----------------------
- Ethernet:
  - Marvell Octeon CNF95N and CN10KB Ethernet Switches.
  - Marvell Prestera AC5X Ethernet Switch.
  - WangXun 10 Gigabit NIC.
  - Motorcomm yt8521 Gigabit Ethernet.
  - Microchip ksz9563 Gigabit Ethernet Switch.
  - Microsoft Azure Network Adapter.
  - Linux Automation 10Base-T1L adapter.
- PHY:
  - Aquantia AQR112 and AQR412.
  - Motorcomm YT8531S.
- PTP:
  - Orolia ART-CARD.
- WiFi:
  - MediaTek Wi-Fi 7 (802.11be) devices.
  - RealTek rtw8821cu, rtw8822bu, rtw8822cu and rtw8723du USB
    devices.
- Bluetooth:
  - Broadcom BCM4377/4378/4387 Bluetooth chipsets.
  - Realtek RTL8852BE and RTL8723DS.
  - Cypress CYW4373A0 WiFi + Bluetooth combo device.
Drivers
-------
- CAN:
  - gs_usb: bus error reporting support.
  - kvaser_usb: listen only and bus error reporting support.
- Ethernet NICs:
  - Intel (100G):
    - extend action skbedit to RX queue mapping.
    - implement devlink-rate support.
    - support direct read from memory.
  - nVidia/Mellanox (mlx5):
    - SW steering improvements, increasing rules update rate.
    - Support for enhanced events compression.
    - extend H/W offload packet manipulation capabilities.
    - implement IPSec packet offload mode.
  - nVidia/Mellanox (mlx4):
    - better big TCP support.
  - Netronome Ethernet NICs (nfp):
    - IPsec offload support.
    - add support for multicast filter.
  - Broadcom:
    - RSS and PTP support improvements.
  - AMD/SolarFlare:
    - netlink extended ack improvements.
    - add basic flower matches to offload, and related stats.
  - Virtual NICs:
    - ibmvnic: introduce affinity hint support.
  - small / embedded:
    - FreeScale fec: add initial XDP support.
    - Marvell mv643xx_eth: support MII/GMII/RGMII modes for Kirkwood.
    - TI am65-cpsw: add suspend/resume support.
    - Mediatek MT7986: add RX wireless ethernet dispatch support.
    - Realtek 8169: enable GRO software interrupt coalescing by
      default.
- Ethernet high-speed switches:
  - Microchip (sparx5):
    - add support for Sparx5 TC/flower H/W offload via VCAP.
  - Mellanox mlxsw:
    - add 802.1X and MAC Authentication Bypass offload support.
    - add ip6gre support.
- Embedded Ethernet switches:
  - Mediatek (mtk_eth_soc):
    - improve PCS implementation, add DSA untag support.
    - enable flow offload support.
  - Renesas:
    - add rswitch R-Car Gen4 gPTP support.
  - Microchip (lan966x):
    - add full XDP support.
    - add TC H/W offload via VCAP.
    - enable PTP on bridge interfaces.
  - Microchip (ksz8):
    - add MTU support for KSZ8 series.
- Qualcomm 802.11ax WiFi (ath11k):
  - support configuring channel dwell time during scan.
- MediaTek WiFi (mt76):
  - enable Wireless Ethernet Dispatch (WED) offload support.
  - add ack signal support.
  - enable coredump support.
  - remain_on_channel support.
- Intel WiFi (iwlwifi):
  - enable Wi-Fi 7 Extremely High Throughput (EHT) PHY capabilities.
  - 320 MHz channels support.
- RealTek WiFi (rtw89):
  - new dynamic header firmware format support.
  - wake-over-WLAN support.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Merge tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
* tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2002 commits)
ipvs: fix type warning in do_div() on 32 bit
net: lan966x: Remove a useless test in lan966x_ptp_add_trap()
net: ipa: add IPA v4.7 support
dt-bindings: net: qcom,ipa: Add SM6350 compatible
bnxt: Use generic HBH removal helper in tx path
IPv6/GRO: generic helper to remove temporary HBH/jumbo header in driver
selftests: forwarding: Add bridge MDB test
selftests: forwarding: Rename bridge_mdb test
bridge: mcast: Support replacement of MDB port group entries
bridge: mcast: Allow user space to specify MDB entry routing protocol
bridge: mcast: Allow user space to add (*, G) with a source list and filter mode
bridge: mcast: Add support for (*, G) with a source list and filter mode
bridge: mcast: Avoid arming group timer when (S, G) corresponds to a source
bridge: mcast: Add a flag for user installed source entries
bridge: mcast: Expose __br_multicast_del_group_src()
bridge: mcast: Expose br_multicast_new_group_src()
bridge: mcast: Add a centralized error path
bridge: mcast: Place netlink policy before validation functions
bridge: mcast: Split (*, G) and (S, G) addition into different functions
bridge: mcast: Do not derive entry type from its filter mode
...
360 lines
8.9 KiB
C
// SPDX-License-Identifier: GPL-2.0-only
/*
 * net/core/dst.c	Protocol independent destination cache.
 *
 * Authors:		Alexey Kuznetsov, <kuznet@ms2.inr.ac.ru>
 *
 */

#include <linux/bitops.h>
#include <linux/errno.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/workqueue.h>
#include <linux/mm.h>
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h>
#include <linux/string.h>
#include <linux/types.h>
#include <net/net_namespace.h>
#include <linux/sched.h>
#include <linux/prefetch.h>
#include <net/lwtunnel.h>
#include <net/xfrm.h>

#include <net/dst.h>
#include <net/dst_metadata.h>

int dst_discard_out(struct net *net, struct sock *sk, struct sk_buff *skb)
{
	kfree_skb(skb);
	return 0;
}
EXPORT_SYMBOL(dst_discard_out);

const struct dst_metrics dst_default_metrics = {
	/* This initializer is needed to force linker to place this variable
	 * into const section. Otherwise it might end into bss section.
	 * We really want to avoid false sharing on this variable, and catch
	 * any writes on it.
	 */
	.refcnt = REFCOUNT_INIT(1),
};
EXPORT_SYMBOL(dst_default_metrics);

void dst_init(struct dst_entry *dst, struct dst_ops *ops,
	      struct net_device *dev, int initial_ref, int initial_obsolete,
	      unsigned short flags)
{
	dst->dev = dev;
	netdev_hold(dev, &dst->dev_tracker, GFP_ATOMIC);
	dst->ops = ops;
	dst_init_metrics(dst, dst_default_metrics.metrics, true);
	dst->expires = 0UL;
#ifdef CONFIG_XFRM
	dst->xfrm = NULL;
#endif
	dst->input = dst_discard;
	dst->output = dst_discard_out;
	dst->error = 0;
	dst->obsolete = initial_obsolete;
	dst->header_len = 0;
	dst->trailer_len = 0;
#ifdef CONFIG_IP_ROUTE_CLASSID
	dst->tclassid = 0;
#endif
	dst->lwtstate = NULL;
	atomic_set(&dst->__refcnt, initial_ref);
	dst->__use = 0;
	dst->lastuse = jiffies;
	dst->flags = flags;
	if (!(flags & DST_NOCOUNT))
		dst_entries_add(ops, 1);
}
EXPORT_SYMBOL(dst_init);

void *dst_alloc(struct dst_ops *ops, struct net_device *dev,
		int initial_ref, int initial_obsolete, unsigned short flags)
{
	struct dst_entry *dst;

	if (ops->gc &&
	    !(flags & DST_NOCOUNT) &&
	    dst_entries_get_fast(ops) > ops->gc_thresh) {
		if (ops->gc(ops)) {
			pr_notice_ratelimited("Route cache is full: consider increasing sysctl net.ipv6.route.max_size.\n");
			return NULL;
		}
	}

	dst = kmem_cache_alloc(ops->kmem_cachep, GFP_ATOMIC);
	if (!dst)
		return NULL;

	dst_init(dst, ops, dev, initial_ref, initial_obsolete, flags);

	return dst;
}
EXPORT_SYMBOL(dst_alloc);

struct dst_entry *dst_destroy(struct dst_entry * dst)
{
	struct dst_entry *child = NULL;

	smp_rmb();

#ifdef CONFIG_XFRM
	if (dst->xfrm) {
		struct xfrm_dst *xdst = (struct xfrm_dst *) dst;

		child = xdst->child;
	}
#endif
	if (!(dst->flags & DST_NOCOUNT))
		dst_entries_add(dst->ops, -1);

	if (dst->ops->destroy)
		dst->ops->destroy(dst);
	netdev_put(dst->dev, &dst->dev_tracker);

	lwtstate_put(dst->lwtstate);

	if (dst->flags & DST_METADATA)
		metadata_dst_free((struct metadata_dst *)dst);
	else
		kmem_cache_free(dst->ops->kmem_cachep, dst);

	dst = child;
	if (dst)
		dst_release_immediate(dst);
	return NULL;
}
EXPORT_SYMBOL(dst_destroy);

static void dst_destroy_rcu(struct rcu_head *head)
{
	struct dst_entry *dst = container_of(head, struct dst_entry, rcu_head);

	dst = dst_destroy(dst);
}

/* Operations to mark dst as DEAD and clean up the net device referenced
 * by dst:
 * 1. put the dst under blackhole interface and discard all tx/rx packets
 *    on this route.
 * 2. release the net_device
 * This function should be called when removing routes from the fib tree
 * in preparation for a NETDEV_DOWN/NETDEV_UNREGISTER event and also to
 * make the next dst_ops->check() fail.
 */
void dst_dev_put(struct dst_entry *dst)
{
	struct net_device *dev = dst->dev;

	dst->obsolete = DST_OBSOLETE_DEAD;
	if (dst->ops->ifdown)
		dst->ops->ifdown(dst, dev, true);
	dst->input = dst_discard;
	dst->output = dst_discard_out;
	dst->dev = blackhole_netdev;
	netdev_ref_replace(dev, blackhole_netdev, &dst->dev_tracker,
			   GFP_ATOMIC);
}
EXPORT_SYMBOL(dst_dev_put);

void dst_release(struct dst_entry *dst)
{
	if (dst) {
		int newrefcnt;

		newrefcnt = atomic_dec_return(&dst->__refcnt);
		if (WARN_ONCE(newrefcnt < 0, "dst_release underflow"))
			net_warn_ratelimited("%s: dst:%p refcnt:%d\n",
					     __func__, dst, newrefcnt);
		if (!newrefcnt)
			call_rcu_hurry(&dst->rcu_head, dst_destroy_rcu);
	}
}
EXPORT_SYMBOL(dst_release);

void dst_release_immediate(struct dst_entry *dst)
{
	if (dst) {
		int newrefcnt;

		newrefcnt = atomic_dec_return(&dst->__refcnt);
		if (WARN_ONCE(newrefcnt < 0, "dst_release_immediate underflow"))
			net_warn_ratelimited("%s: dst:%p refcnt:%d\n",
					     __func__, dst, newrefcnt);
		if (!newrefcnt)
			dst_destroy(dst);
	}
}
EXPORT_SYMBOL(dst_release_immediate);

u32 *dst_cow_metrics_generic(struct dst_entry *dst, unsigned long old)
{
	struct dst_metrics *p = kmalloc(sizeof(*p), GFP_ATOMIC);

	if (p) {
		struct dst_metrics *old_p = (struct dst_metrics *)__DST_METRICS_PTR(old);
		unsigned long prev, new;

		refcount_set(&p->refcnt, 1);
		memcpy(p->metrics, old_p->metrics, sizeof(p->metrics));

		new = (unsigned long) p;
		prev = cmpxchg(&dst->_metrics, old, new);

		if (prev != old) {
			kfree(p);
			p = (struct dst_metrics *)__DST_METRICS_PTR(prev);
			if (prev & DST_METRICS_READ_ONLY)
				p = NULL;
		} else if (prev & DST_METRICS_REFCOUNTED) {
			if (refcount_dec_and_test(&old_p->refcnt))
				kfree(old_p);
		}
	}
	BUILD_BUG_ON(offsetof(struct dst_metrics, metrics) != 0);
	return (u32 *)p;
}
EXPORT_SYMBOL(dst_cow_metrics_generic);

/* Caller asserts that dst_metrics_read_only(dst) is false.  */
void __dst_destroy_metrics_generic(struct dst_entry *dst, unsigned long old)
{
	unsigned long prev, new;

	new = ((unsigned long) &dst_default_metrics) | DST_METRICS_READ_ONLY;
	prev = cmpxchg(&dst->_metrics, old, new);
	if (prev == old)
		kfree(__DST_METRICS_PTR(old));
}
EXPORT_SYMBOL(__dst_destroy_metrics_generic);

struct dst_entry *dst_blackhole_check(struct dst_entry *dst, u32 cookie)
{
	return NULL;
}

u32 *dst_blackhole_cow_metrics(struct dst_entry *dst, unsigned long old)
{
	return NULL;
}

struct neighbour *dst_blackhole_neigh_lookup(const struct dst_entry *dst,
					     struct sk_buff *skb,
					     const void *daddr)
{
	return NULL;
}

void dst_blackhole_update_pmtu(struct dst_entry *dst, struct sock *sk,
			       struct sk_buff *skb, u32 mtu,
			       bool confirm_neigh)
{
}
EXPORT_SYMBOL_GPL(dst_blackhole_update_pmtu);

void dst_blackhole_redirect(struct dst_entry *dst, struct sock *sk,
			    struct sk_buff *skb)
{
}
EXPORT_SYMBOL_GPL(dst_blackhole_redirect);

unsigned int dst_blackhole_mtu(const struct dst_entry *dst)
{
	unsigned int mtu = dst_metric_raw(dst, RTAX_MTU);

	return mtu ? : dst->dev->mtu;
}
EXPORT_SYMBOL_GPL(dst_blackhole_mtu);

static struct dst_ops dst_blackhole_ops = {
	.family		= AF_UNSPEC,
	.neigh_lookup	= dst_blackhole_neigh_lookup,
	.check		= dst_blackhole_check,
	.cow_metrics	= dst_blackhole_cow_metrics,
	.update_pmtu	= dst_blackhole_update_pmtu,
	.redirect	= dst_blackhole_redirect,
	.mtu		= dst_blackhole_mtu,
};

static void __metadata_dst_init(struct metadata_dst *md_dst,
				enum metadata_type type, u8 optslen)
{
	struct dst_entry *dst;

	dst = &md_dst->dst;
	dst_init(dst, &dst_blackhole_ops, NULL, 1, DST_OBSOLETE_NONE,
		 DST_METADATA | DST_NOCOUNT);
	memset(dst + 1, 0, sizeof(*md_dst) + optslen - sizeof(*dst));
	md_dst->type = type;
}

struct metadata_dst *metadata_dst_alloc(u8 optslen, enum metadata_type type,
					gfp_t flags)
{
	struct metadata_dst *md_dst;

	md_dst = kmalloc(sizeof(*md_dst) + optslen, flags);
	if (!md_dst)
		return NULL;

	__metadata_dst_init(md_dst, type, optslen);

	return md_dst;
}
EXPORT_SYMBOL_GPL(metadata_dst_alloc);

void metadata_dst_free(struct metadata_dst *md_dst)
{
#ifdef CONFIG_DST_CACHE
	if (md_dst->type == METADATA_IP_TUNNEL)
		dst_cache_destroy(&md_dst->u.tun_info.dst_cache);
#endif
	if (md_dst->type == METADATA_XFRM)
		dst_release(md_dst->u.xfrm_info.dst_orig);
	kfree(md_dst);
}
EXPORT_SYMBOL_GPL(metadata_dst_free);

struct metadata_dst __percpu *
metadata_dst_alloc_percpu(u8 optslen, enum metadata_type type, gfp_t flags)
{
	int cpu;
	struct metadata_dst __percpu *md_dst;

	md_dst = __alloc_percpu_gfp(sizeof(struct metadata_dst) + optslen,
				    __alignof__(struct metadata_dst), flags);
	if (!md_dst)
		return NULL;

	for_each_possible_cpu(cpu)
		__metadata_dst_init(per_cpu_ptr(md_dst, cpu), type, optslen);

	return md_dst;
}
EXPORT_SYMBOL_GPL(metadata_dst_alloc_percpu);

void metadata_dst_free_percpu(struct metadata_dst __percpu *md_dst)
{
	int cpu;

	for_each_possible_cpu(cpu) {
		struct metadata_dst *one_md_dst = per_cpu_ptr(md_dst, cpu);

#ifdef CONFIG_DST_CACHE
		if (one_md_dst->type == METADATA_IP_TUNNEL)
			dst_cache_destroy(&one_md_dst->u.tun_info.dst_cache);
#endif
		if (one_md_dst->type == METADATA_XFRM)
			dst_release(one_md_dst->u.xfrm_info.dst_orig);
	}
	free_percpu(md_dst);
}
EXPORT_SYMBOL_GPL(metadata_dst_free_percpu);
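
The release path in dst_release_immediate() follows a common pattern: atomically drop one reference, warn on underflow, and tear the object down when the count reaches zero. The sketch below restates that pattern in user-space C11 so it can be compiled outside the kernel. It is not kernel code: `struct demo_entry`, `demo_alloc()`, `demo_hold()` and `demo_release()` are hypothetical names invented for this illustration, and `<stdatomic.h>` stands in for the kernel's atomic_t helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct dst_entry: only the reference count
 * and a flag the caller can inspect after teardown.
 */
struct demo_entry {
	atomic_int refcnt;
	int *destroyed;		/* set to 1 when the entry is freed */
};

/* Allocate an entry with an initial reference count, loosely like the
 * initial_ref argument of dst_alloc().
 */
static struct demo_entry *demo_alloc(int initial_ref, int *destroyed)
{
	struct demo_entry *e = malloc(sizeof(*e));

	if (!e)
		return NULL;
	atomic_init(&e->refcnt, initial_ref);
	e->destroyed = destroyed;
	return e;
}

/* Take one additional reference. */
static void demo_hold(struct demo_entry *e)
{
	atomic_fetch_add(&e->refcnt, 1);
}

/* Drop one reference; free the entry synchronously when the count hits
 * zero. Returns the new count (0 for a NULL entry).
 */
static int demo_release(struct demo_entry *e)
{
	int newrefcnt;

	if (!e)
		return 0;

	/* atomic_fetch_sub() returns the old value, so subtract 1 again. */
	newrefcnt = atomic_fetch_sub(&e->refcnt, 1) - 1;
	if (newrefcnt < 0)
		fprintf(stderr, "%s: underflow, refcnt:%d\n",
			__func__, newrefcnt);
	if (!newrefcnt) {
		*e->destroyed = 1;
		free(e);
	}
	return newrefcnt;
}
```

The sketch mirrors the immediate variant only. dst_release() in the file above does not destroy the entry directly; it hands it to call_rcu_hurry() so that dst_destroy() runs after an RCU grace period, which has no direct user-space equivalent here.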