When you have suppress-fib-pending turned on it is possible
to end up in a situation where the prefix is not withdrawn
from downstream peers.
Here is the timing that I believe is happening:
a) have 2 paths to a peer.
b) receive a withdrawal from 1 path, set BGP_NODE_FIB_INSTALL_PENDING
and send the route install to zebra.
c) receive a withdrawal from the other path.
d) At this point we have a dest->flags set BGP_NODE_FIB_INSTALL_PENDING
old_select the path_info going away, new_select is NULL
e) A bit further down we call group_announce_route() which calls
the code to see if we should advertise the path. It sees the
BGP_NODE_FIB_INSTALL_PENDING flag and says, nope.
f) the route is sent to zebra to withdraw, which unsets the
BGP_NODE_FIB_INSTALL_PENDING.
g) This function winds up and deletes the path_info. Dest now
has no path infos.
h) BGP receives the route install(from step b) and unsets the
BGP_NODE_FIB_INSTALL_PENDING flag
i) BGP receives the route removed from zebra (from step f) and
unsets the flag again.
We know if there is no new_select, let's go ahead and just
unset the PENDING flag to allow the withdrawal to go out
at the time when the second withdrawal is received.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When looping through the dplane providers, the worklist was
being populated with items from the last provider and then
the event system was checked to see if we should stop processing.
If the event system says `yes` then the dplane code would stop
and send the worklist to the master zebra pthread for collection.
This obviously skipped the next dplane provider on the list
which is double plus not good.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The mutex that wraps access to the output buffer
is being held for the entire time the data is
being generated to send down the pipe. Since
the generation has absolutely nothing to do
with the obuf, let's limit the mutex holding some.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The fpm_testing_topo1 didn't turn on the fpm_listener
sending the routes back to zebra to set the asic offload.
Modify the test to tell the fpm_listener to set the offloaded
flag and reflect the route back to the dplane_fpm_nl.c code.
Also modify zebra to expect a response to the underlying fpm listener.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In fpm_listener, when a error is detected it would
stop listening and not recover. Modify the code
to close the socket and allow the connection to
recover.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
A recent code change 29122bc9b8
changed the passing of data up the fpm from passing the
tableid and vrf to the sonic expected tableid contains
the vrfid. This violates the assumptions in the code
that the netlink message passes up the tableid as the
tableid. Additionally this code change did not modify
the rib_find_rn_from_ctx to actually properly decode
what could be passed up. Let's just fix this and let
Sonic carry the patch as appropriate for themselves
since they are not the only users of dplane_fpm_nl.c
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When called without caps/privs, just return from "change_caps"
instead of exiting - it's possible that a process may not need
privs, but a lib (for example) may use the api.
Signed-off-by: Mark Stapp <mjs@cisco.com>
When looping through the dplane providers, the worklist was
being populated with items from the last provider and then
the event system was checked to see if we should stop processing.
If the event system says `yes` then the dplane code would stop
and send the worklist to the master zebra pthread for collection.
This obviously skipped the next dplane provider on the list
which is double plus not good.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The SRv6 SID Manager does not allow allocating an SRv6 End/uN function
even though it is already supported by staticd.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
If we have IPv6-only network and no IPv4 addresses at all, then by default
0.0.0.0 is created which is treated as malformed according to RFC 6286.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
We were hashing 4 bytes of the address. Even for IPv6 addresses.
Oops.
The reason this was done was to try to make it faster, but made a
complex maze out of everything. Time for a refactor.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
When reading in a nexthop from ZAPI, only set the fields that actually
have meaning. While it shouldn't happen to begin with, we can otherwise
carry padding garbage into the unused leftover union bytes.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
rmap_src wasn't initialized, so for IPv4 the unused 12 bytes would
contain whatever junk is on the stack on function entry. Also move
the IPv4 parse before the IPv6 parse so if it's successful we can be
sure the other bytes haven't been touched.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Doesn't seem to break anything but really poor style to pass potentially
uninitialized data to hash_lookup.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
While the loop is currently exited in all cases after using nexthop, it
is a footgun to have "nh" around to be reused in another iteration of
the loop. This would leave nexthop with partial data from the previous
use. Make it local where needed instead.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
In ebgp-multihop, there is a difference in reload behavior when TTL is
unspecified (meaning default 255) and when 255 is explicitly specified.
For example, when reloading with 'neighbor <neighbor> ebgp-multihop
255' in the config, the following difference is created. This commit
fixes that.
Lines To Delete
===============
router bgp 65001
no neighbor 10.0.0.4 ebgp-multihop
exit
Lines To Add
============
router bgp 65001
neighbor 10.0.0.4 ebgp-multihop 255
exit
The commit 767aaa3a80 is not sufficient and frr-reload needs to be
fixed to handle both unspecified and specified cases.
Signed-off-by: Nobuhiro MIKI <nob@bobuhiro11.net>
Currently FRR does not handle v6 recurisive resolution properly
when the route being recursed through changes and the most
significant bits of the route are not changed.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Freeing any item here means freeing someone's `event->hist`, leaving a
dangling pointer there. Which will immediately be written to because
we're executing in a CLI function under the `vty_read` event, whose
`event->hist` is then updated.
Deallocating `event->hist` anywhere other than shutting down the whole
event loop is a bad idea to begin with, just zero out the stats instead.
Fixes: FRRouting/frr#16419
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
When VRF instance is going to be deleted inside bgp_vrf_disable(), it uses
a helper method that skips auto created VRF instances and that leads to STALE
issue.
When creating a VNI for a particular VRF vrfX with e.g. `advertise-all-vni`,
auto VRF instance is created, and then we do `router bgp ASN vrf vrfX`.
But when we do a reload bgp_vrf_disable() is called, and we miss previously
created auto instance.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The "static_simple" test has code for testing IPv6 routes, but it wasn't
even being run (duh.) Enable it, and also test IPv6 dst-src routes.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
When a daemon wants to know about its routes, make it possible to have
that work for dst-src routes.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The staticd YANG conversion completely f*cked up dst-src routes.
Stupidly enough, the correct thing is much simpler as seen by the amount
of deletes in this commit.
This does, unfortunately, involve a rather annoying YANG edge case with
what should reasonably be an optional leaf as part of a list key, which
is not possible. It uses `::/0` as unconditional filler instead, since
that is semantically correct.
The `test_yang_mgmt` topotest needed to be adjusted after this to add
`src-prefix='::/0'`.
Fixes: 88fa5104a0 ("staticd : Configuration northbound implementation")
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The Linux kernel doesn't support dst-src routes with NHGs as nexthop,
for some (rather dubious) caching reasons.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>