98efa5bc6b ("bgpd: bgp_path_info_extra memory optimization") has removed
SID info from the extra structure.
Do not test for extra presence.
Fixes: 98efa5bc6b ("bgpd: bgp_path_info_extra memory optimization")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Not necessarily the correct place for this but there
is no other place and it needs to be called out and I
would rather have some documentation in place. Long
term I would like to add a bunch of frr_pthread documentation
but at this point in time it's not there. We can
re-arrange when that happens.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
There was a question in regards to how the update-source
choose the ip address for the source when using the `update-source`
command in BGP. Upon looking at the code, I was a but surprised,
so I decided to document this behavior.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Since we have a knob to disable sending FQDN capability, it MUST be checked
before sending it using dynamic capabilities.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When a static route with a gateway nexthop is created, the nexthop is
sent to zebra for NHT, and added to a local hash. When the nexthop's VRF
is deleted from kernel, nexthop still stays in the hash. This is a
memory leak, because it is never deleted from there. Even if the VRF is
recreated in kernel, it is assigned with a new `vrf_id` so the old hash
entry is not reused, and a new one is created. To fix the issue, remove
all nexthops from the hash when the corresponding VRF is deleted, do not
add nexthops to the hash if their corresponding VRF doesn't exist in
kernel and don't add the same nexthop to the hash multiple times.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When processing an NHT update, we should go though all `static_vrf`
structures instead of regualar `vrf`, because some of `static_vrf` may
not have corresponding `vrf` and will miss the update.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
We don't need to manually load built-in modules. This fixes the
following warning in mgmtd:
```
YANG model "ietf-yang-metadata@*" "*@*"not embedded, trying external file
```
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
mgmtd reads config files on its own, it doesn't need libfrr to do that.
The code is already skipped, because mgmtd uses `di->read_in` thread for
config reading and libfrr doesn't reschedule the thread, so this commit
just removes the dead code.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
mgmtd is supposed to only register CLI callbacks. If configuration
callbacks are registered, they are getting called on startup when mgmtd
reads config files, and they can use infrastructure that is not
initialized on mgmtd, or allocate some memory that is never freed.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
mgmtd is supposed to only register CLI callbacks. If configuration
callbacks are registered, they are getting called on startup when mgmtd
reads config files, and they can use infrastructure that is not
initialized on mgmtd, or allocate some memory that is never freed.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
As far as I can tell, the rpki code creates a pthread that
is used to handle the i/o associated with talking to the
remote rpki server. The problem that we are having is that
the rpki code in FRR wants to behave like FRR code and use
the zlog_XXX functions. These functions all depend on
the RCU code. Which is a bit picky( and rightly so!!! )
about being started up properly and shut down properly.
This commit is fixing the problem of shutdown. From
playing with the rpki code, I was able to experimentally
determine that the rpki_create_socket callback function
can be called multiple times per pthread. Additionally
I was able to clearly see multiple *different* pthreads
actually be created. This leaves the possiblity
that each time it is called it might be hooking into the
RCU code. Which makes the rcu code unhappy on shutdown.
Let's address the issue by checking to see if this pthread
has already hooked into the RCU code or not. If so
then don't do this again.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In a non-controlled startup, the rcu data structures were
not being created until after logging could happen. This
is bad. Move it so that the rcu data structures are
created first, before logging( HA! ) can happen.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
cisco routers are not dealing fairly whith unsupported capabilities.
When a cisco router receive an unsupported capabilities it reset the
negociation without notifying the unmatching capability as described in
RFC2842.
Cisco suggest the use of
neighbor x.x.x.x capability fqdn
to avoid the use of fqdn in open message.
this new command is to remove the use of fqdn capability in the
open message with the peer "x.x.x.x".
Link: https://www.cisco.com/c/en/us/support/docs/ip/border-gateway-protocol-bgp/116189-problemsolution-technology-00.pdf
Signed-off-by: Francois Dumontet <francois.dumontet@6wind.com>
Fix the following crash when logging from rpki_create_socket():
> #0 raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1 0x00007f6e21723798 in core_handler (signo=6, siginfo=0x7f6e1e502ef0, context=0x7f6e1e502dc0) at lib/sigevent.c:248
> #2 <signal handler called>
> #3 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> #4 0x00007f6e2144e537 in __GI_abort () at abort.c:79
> #5 0x00007f6e2176348e in _zlog_assert_failed (xref=0x7f6e2180c920 <_xref.16>, extra=0x0) at lib/zlog.c:670
> #6 0x00007f6e216b1eda in rcu_read_lock () at lib/frrcu.c:294
> #7 0x00007f6e21762da8 in vzlog_notls (xref=0x0, prio=2, fmt=0x7f6e217afe50 "%s:%d: %s(): assertion (%s) failed", ap=0x7f6e1e504248) at lib/zlog.c:425
> #8 0x00007f6e217632fb in vzlogx (xref=0x0, prio=2, fmt=0x7f6e217afe50 "%s:%d: %s(): assertion (%s) failed", ap=0x7f6e1e504248) at lib/zlog.c:627
> #9 0x00007f6e217621f5 in zlog (prio=2, fmt=0x7f6e217afe50 "%s:%d: %s(): assertion (%s) failed") at lib/zlog.h:73
> #10 0x00007f6e21763596 in _zlog_assert_failed (xref=0x7f6e2180c920 <_xref.16>, extra=0x0) at lib/zlog.c:687
> #11 0x00007f6e216b1eda in rcu_read_lock () at lib/frrcu.c:294
> #12 0x00007f6e21762da8 in vzlog_notls (xref=0x7f6e21a50040 <_xref.68>, prio=4, fmt=0x7f6e21a4999f "getaddrinfo: debug", ap=0x7f6e1e504878) at lib/zlog.c:425
> #13 0x00007f6e217632fb in vzlogx (xref=0x7f6e21a50040 <_xref.68>, prio=4, fmt=0x7f6e21a4999f "getaddrinfo: debug", ap=0x7f6e1e504878) at lib/zlog.c:627
> #14 0x00007f6e21a3f774 in zlog_ref (xref=0x7f6e21a50040 <_xref.68>, fmt=0x7f6e21a4999f "getaddrinfo: debug") at ./lib/zlog.h:84
> #15 0x00007f6e21a451b2 in rpki_create_socket (_cache=0x55729149cc30) at bgpd/bgp_rpki.c:1337
> #16 0x00007f6e2120e7b7 in tr_tcp_open (tr_socket=0x5572914d1520) at rtrlib/rtrlib/transport/tcp/tcp_transport.c:111
> #17 0x00007f6e2120e212 in tr_open (socket=0x5572914b5e00) at rtrlib/rtrlib/transport/transport.c:16
> #18 0x00007f6e2120faa2 in rtr_fsm_start (rtr_socket=0x557290e17180) at rtrlib/rtrlib/rtr/rtr.c:130
> #19 0x00007f6e218b7ea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
> #20 0x00007f6e21527a2f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
rpki_create_socket() is a hook function called from the rtrlib library.
The issue arises because rtrlib initiates its own separate pthread in which
it runs the hook, which does not establish an FRR RCU context. Consequently,
this leads to failures in the logging mechanism that relies on RCU.
Initialize a new FRR pthread context from the rtrlib pthread with a
valid RCU context to allow logging from the rpki_create_socket() and
dependent functions.
Link: https://github.com/FRRouting/frr/issues/15260
Fixes: a951752d4a ("bgpd: create cache server socket in vrf")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>