Customer has this valgrind trace:
Direct leak of 2829120 byte(s) in 70728 object(s) allocated from:
0 in community_new ../bgpd/bgp_community.c:39
1 in community_uniq_sort ../bgpd/bgp_community.c:170
2 in route_set_community ../bgpd/bgp_routemap.c:2342
3 in route_map_apply_ext ../lib/routemap.c:2673
4 in subgroup_announce_check ../bgpd/bgp_route.c:2367
5 in subgroup_process_announce_selected ../bgpd/bgp_route.c:2914
6 in group_announce_route_walkcb ../bgpd/bgp_updgrp_adv.c:199
7 in hash_walk ../lib/hash.c:285
8 in update_group_af_walk ../bgpd/bgp_updgrp.c:2061
9 in group_announce_route ../bgpd/bgp_updgrp_adv.c:1059
10 in bgp_process_main_one ../bgpd/bgp_route.c:3221
11 in bgp_process_wq ../bgpd/bgp_route.c:3221
12 in work_queue_run ../lib/workqueue.c:282
The above leak detected by valgrind was from a screenshot so I copied it
by hand. Any mistakes in line numbers are purely from my transcription.
Additionally this is against a slightly modified 8.5.1 version of FRR.
Code inspection of 8.5.1 -vs- latest master shows the same problem
exists. Code should be able to be followed from there to here.
What is happening:
There is a route-map being applied that modifes the outgoing community
to a peer. This is saved in the attr copy created in
subgroup_process_announce_selected. This community pointer is not
interned. So the community->refcount is still 0. Normally when
a prefix is announced, the attr and the prefix are placed on a
adjency out structure where the attribute is interned. This will
cause the community to be saved in the community hash list as well.
In a non-normal operation when the decision to send is aborted after
the route-map application, the attribute is just dropped and the
pointer to the community is just dropped too, leading to situations
where the memory is leaked. The usage of bgp suppress-fib would
would be a case where the community is caused to be leaked.
Additionally the previous commit where an unsuppress-map is used
to modify the outgoing attribute but since unsuppress-map was
not considered part of outgoing policy the attribute would be dropped as
well. This pointer drop also extends to any dynamically allocated
memory saved by the attribute pointer that was not interned yet as well.
So let's modify the return case where the decision is made to
not send the prefix to the peer to always just flush the attribute
to ensure memory is not leaked.
Fixes: #15459
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit addff17a55)
If unsuppress-map is setup for outgoing peers, consider that
policy is being applied as for RFC 8212.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 6814401c47)
Currently in subgroup_default_originate the attr.aspath
is set in bgp_attr_default_set, which hashs the aspath
and creates a refcount for it. If this is a withdraw
the subgroup_announce_check and bgp_adj_out_set_subgroup
is called which will intern the attribute. This will
cause the the attr.aspath to be set to a new value
finally at the bottom of the function it intentionally
uninterns the aspath which is not the one that was
created for this function. This reduces the other
aspath's refcount by 1 and if a clear bgp * is issued
fast enough the aspath for that will be removed
and the system will crash.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit e613e12f12)
A route and its nexthop might belong to different VRFs. Therefore, we need
both the bgp and bgp_nexthop pointers.
Fixes: 8d51fafdcb ("bgpd: Drop bgp_static_update_safi() function")
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 778357e9ef)
Alpine Linux gets this with 3.19:
This is already installed with `pytest` via apk package manager.
```
15 78.20 error: externally-managed-environment
15 78.20
15 78.20 × This environment is externally managed
15 78.20 ╰─>
15 78.20 The system-wide python installation should be maintained using the system
15 78.20 package manager (apk) only.
15 78.20
15 78.20 If the package in question is not packaged already (and hence installable via
15 78.20 "apk add py3-somepackage"), please consider installing it inside a virtual
15 78.20 environment, e.g.:
15 78.20
15 78.20 python3 -m venv /path/to/venv
15 78.20 . /path/to/venv/bin/activate
15 78.20 pip install mypackage
15 78.20
15 78.20 To exit the virtual environment, run:
15 78.20
15 78.20 deactivate
15 78.20
15 78.20 The virtual environment is not deleted, and can be re-entered by re-sourcing
15 78.20 the activate file.
15 78.20
15 78.20 To automatically manage virtual environments, consider using pipx (from the
15 78.20 pipx package).
15 78.20
15 78.20 note: If you believe this is a mistake, please contact your Python installation or OS distribution provider. You can override this, at the risk of breaking your Python installation or OS, by passing --break-system-packages.
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 3f7cc3b7f5)
This is happening for Alpine Linux.
```
26 64.59 ./lib/sigevent.h:23:18: error: unknown type name 'sig_atomic_t'
26 64.59 23 | volatile sig_atomic_t caught; /* private member */
26 64.59 | ^~~~~~~~~~~~
26 64.60 In file included from ./lib/libfrr.h:12,
26 64.60 from ./lib/vty.h:28,
26 64.60 from ./lib/command.h:11,
26 64.60 from ./lib/debug.h:11,
26 64.60 from ./mgmtd/mgmt.h:12,
26 64.60 from mgmtd/mgmt_history.c:14:
26 64.60 ./lib/sigevent.h:23:18: error: unknown type name 'sig_atomic_t'
26 64.60 23 | volatile sig_atomic_t caught; /* private member */
26 64.60 | ^~~~~~~~~~~~
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit f03b0bfaa4)
FRR needs to properly include the FreeBSD headers for
compilation on FreeBSD. I have setup v6 as well
but I have not even tested it. Since I know
that the form is the same I think this is ok
at the moment. This is a step forward.
Because of this change *clearly* no-one is even
using pim on FreeBSD. <look at the MRT_XXX values
to prove to yourself>. In any event this is a step
in the direction of getting that working again.
Signed-off-by: Donald Sharp <sharpd@freebsd.network>
(cherry picked from commit a5389154a1)
If we enter `bgp graceful-restart-disable`, make sure we disable the capabilities.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 78757362f2)
When using dynamic capabilities, do not forget to unset advertised capabilities.
Otherwise, it's kept as advertised.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 77102e853e)
If `bgp default software-version-capability` is enabled, allow unsetting this
for a single neighbor also.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 2038fad33e)
mgmtd doesn't support YANG RPCs yet, so this command must go directly to
ripngd.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
(cherry picked from commit c544b9e8e7)
mgmtd doesn't support YANG RPCs yet, so this command must go directly to
ripd.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
(cherry picked from commit 1ba97510e2)
Add missing cli_cmp callback. Without it, interfaces are not sorted and
printed in order they were created.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
(cherry picked from commit 18da736949)
When a node is top-level, we shouldn't stop the whole processing, we
should just skip this single node.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
(cherry picked from commit 8287fbe453)
`darr_avail` returns the available capacity excluding the already
existing terminating NULL byte. Take this into account when using
`darr_avail`. Otherwise, if the error length is a power of 2, the
capacity is never enough and the function stucks in an infinite loop.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
(cherry picked from commit cb6032d6b3)
If the initial darr capacity is not enough for the output, the `ap` is
reused multiple times, which is wrong, because it may be altered by
`vsnprintf`. Make a copy of `ap` each time instead of reusing.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
(cherry picked from commit ee0c1cc1e4)
./configure [...] --disable-ripngd
could lead to:
mgmtd/mgmt_vty.c:614:5: warning: "HAVE_RIPNGD" is not defined, evaluates to 0 [-Wundef]
614 | #if HAVE_RIPNGD
| ^~~~~~~~~~~
Signed-off-by: Vincent Jardin <vjardin@free.fr>
(cherry picked from commit 717b3350bb)
6vPE enables the announcement of IPv6 VPN prefixes through an IPv4 BGP
session. In this scenario, the next hop addresses for these prefixes are
represented in an IPv4-mapped IPv6 format, noted as ::ffff:[IPv4]. This
format indicates to the peer that it should route these IPv6 addresses
using information from the IPv4 nexthop. For example:
> Path Attribute - MP_REACH_NLRI
> [...]
> Address family identifier (AFI): IPv6 (2)
> Subsequent address family identifier (SAFI): Labeled VPN Unicast (128)
> Next hop: RD=0:0 IPv6=::ffff:192.0.2.5 RD=0:0 Link-local=fe80::501d:42ff:feef:b021
> Number of Subnetwork points of attachment (SNPA): 0
This rule is set out in RFC4798:
> The IPv4 address of the egress 6PE router MUST be encoded as an
> IPv4-mapped IPv6 address in the BGP Next Hop field.
However, in some situations, bgpd sends a standard nexthop IPv6 address
instead of an IPv4-mapped IPv6 address because the outgoing interface for
the BGP session has a valid IPv6 address. This is problematic because
the peer router may not be able to route the nexthop IPv6 address (ie.
if the outgoing interface has not IPv6).
Fix the issue by always sending a IPv4-mapped IPv6 address as nexthop
when the BGP session is on IPv4 and address family IPv6.
Link: https://datatracker.ietf.org/doc/html/rfc4798#section-2
Fixes: 92d6f76 ("lib,zebra,bgpd: Fix for nexthop as IPv4 mapped IPv6 address")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 0325116a27)
This test uses the connected ipv4 mapped ipv6 prefix
to resolve the received BGP routes.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: François Dumontet <francois.dumontet@6wind.com>
(cherry picked from commit 4d7df91752)
A macvlan interface can have its underlying link-interface in another
namespace (aka. netns). However, by default, zebra does not know the
interface from the other namespaces. It results in a crash the pointer
to the link interface is NULL.
> 6 0x0000559d77a329d3 in zebra_vxlan_macvlan_up (ifp=0x559d798b8e00) at /root/frr/zebra/zebra_vxlan.c:4676
> 4676 link_zif = link_ifp->info;
> (gdb) list
> 4671 struct interface *link_ifp, *link_if;
> 4672
> 4673 zif = ifp->info;
> 4674 assert(zif);
> 4675 link_ifp = zif->link;
> 4676 link_zif = link_ifp->info;
> 4677 assert(link_zif);
> 4678
> (gdb) p zif->link
> $2 = (struct interface *) 0x0
> (gdb) p zif->link_ifindex
> $3 = 15
Fix the crash by returning when the macvlan link-interface is in another
namespace. No need to go further because any vxlan under the macvlan
interface would not be accessible by zebra.
Link: https://github.com/FRRouting/frr/issues/15370
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 44e6e3868d)