Commit Graph

30591 Commits

Author SHA1 Message Date
Donald Sharp
ab06d92642
Merge pull request #10748 from opensourcerouting/unused-20220307
zebra: remove unused variable
2022-03-07 19:33:26 -05:00
Jafar Al-Gharaibeh
7816130cca
Merge pull request #10747 from donaldsharp/reported_code_indentation
bgpd: Fix continue/break change from old commit
2022-03-07 13:21:33 -06:00
David Lamparter
a4af82ee2b lib, vtysh: report lost messages on live log
The vtysh live logs don't try to buffer messages when vtysh isn't
reading them fast enough.  Either the kernel has space and can accept
messages without delay, or it doesn't and we continue on.

While this is intentional (otherwise slow vtysh could block a routing
daemon), at least give the user an indication if messages were dropped.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-07 18:03:16 +01:00
David Lamparter
3bcdae106e lib: add monitor:<fd> command line log target
This provides direct raw log output with full metadata directly at
startup regardless of configuration details.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-07 18:03:16 +01:00
David Lamparter
834585bdb9 lib: add a few more bits to live log header
... and add some comments explaining the individual fields.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-07 18:03:16 +01:00
David Lamparter
da2fc19187 lib: support multiple --log options
Allow simultaneously enabling syslog, stdout and/or file logs.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-07 18:03:15 +01:00
David Lamparter
2eda953a2a lib: make live log sockets non-blocking
This was the intent here to begin with, not sure where I managed to
forget this along the way...

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-07 18:03:15 +01:00
David Lamparter
1609a9d636 lib: fix live log fields for crashlog
The timestamps used for the live log are wallclock, not monotonic.  Also
some fields were left uninitialized.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-07 18:03:13 +01:00
Donald Sharp
db0a45d0d6
Merge pull request #10741 from LabNConsulting/chopps/critfixgrpc
critical fixes for grpc
2022-03-07 11:49:48 -05:00
David Lamparter
378260fb65 zebra: remove unused variable
clang complains "variable 'curr_length' set but not used".

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-07 17:37:27 +01:00
Donald Sharp
e3015d915b bgpd: Fix continue/break change from old commit
Commit: ea47320b1d

Modified the bgp_clear_stale_route function to have
better indentation, but in the process changed some
`continue;` statements to `break;` which modified
the looping and caused stale paths to not always be
removed upon an update.

To reproduce:  A ---- B, setup with addpath and GR
One side has a prefix with nhop1 and nhop2, kill one
side and then resend the same prefix with nhop3,
paths nhop1 and 2 become stale and never removed.

Code inspection clearly shows that that `continue`
statements became `break` statements causing the
loop over all paths to stop prematurely.

The fix is to change the break back to continue
statements so the loop can continue instead of
stopping.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-07 11:28:28 -05:00
David Lamparter
d03440cab7 lib: fix log target removal when singlethreaded
While running singlethreaded, the RCU code is "dormant" and rcu_free is
an immediate operation.  This results in the log target loop accessing
free'd memory if a log target removes itself while a message is printed
(which is likely to happen on e.g. error conditions.)

Just use frr_each_safe to avoid this issue.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-07 17:23:12 +01:00
Donatas Abraitis
404e53da28
Merge pull request #10745 from mjstapp/fix_doc_star_again
doc: fix another doc typo
2022-03-07 17:37:35 +02:00
Mark Stapp
bdbdd2a75c doc: fix another doc typo
Fix another typo that mis-uses the asterisk character.

Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-03-07 08:34:09 -05:00
David Schweizer
a533bb8cac
pimd/pim_cmd.c: fix typo in IGMP iface stats JSON
Changes correct typo in JSON output for IGMP interface statistics show
command from "leaveV3" to "leaveV2".

Signed-off-by: David Schweizer <dschweizer@opensourcerouting.org>
2022-03-07 14:18:09 +01:00
Donald Sharp
75ba864c81 bgpd: Warn user when an interface has no v6 LL address associated with it
When BGP detects that a peering is using a global address but no v6 LL
address has been created for the interface that the global address is
on warn the user that something is amiss and they need to fix it.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-07 08:00:26 -05:00
Donatas Abraitis
583ba572b7
Merge pull request #10732 from anlancs/zebra-minor-move
zebra: Delay the usage of one variable
2022-03-07 12:15:43 +02:00
Christian Hopps
fe095adc24 lib: grpc: fix handling of "empty" yang type
- rather than coerce `const char *` to std:string&, just pass the
C ptr, as that's what is used anyway.

fixes #10578

Signed-off-by: Christian Hopps <chopps@labn.net>
2022-03-06 12:00:22 -05:00
Christian Hopps
83f6fce7d2 lib: grpc: fix shutdown code
fixes #9732

Signed-off-by: Christian Hopps <chopps@labn.net>
2022-03-06 12:00:17 -05:00
Christian Hopps
c85ecd6405 lib: grpc: initialize uninitialized member variables
fixes #9732, fixes #10578

Signed-off-by: Christian Hopps <chopps@labn.net>
2022-03-06 07:39:58 -05:00
Christian Hopps
96d434f853 lib: grpc: do not remove candidate entry too soon
Fix from Rafael Zalamena <rzalamena@opensourcerouting.org>

Signed-off-by: Christian Hopps <chopps@labn.net>
2022-03-06 07:37:52 -05:00
Rafael Zalamena
673e440770 lib: tweak northbound gRPC default timeout
Don't let open sockets hang for too long. This will fix an issue where a
improperly coded client (e.g. socat) could exaust the amount of open
file descriptors.

Documentation:
https://grpc.github.io/grpc/cpp/md_doc_keepalive.html

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2022-03-06 07:37:52 -05:00
Jafar Al-Gharaibeh
5e5cd2784f
Merge pull request #10734 from ton31337/fix/rpki_read
bgpd: Fix while(read()) for RPKI sync callback
2022-03-05 01:55:53 -06:00
anlan_cs
38eda16a24 zebra: Delay the usage of one variable until need
In the loop, local variable `ip` is always set even if the check condition
is not satisfied.

Avoid the redundant set, move this set exactly after the check condition is
satisfied. Set `ip` only if the check condition is met, otherwise needn't.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-03-05 06:57:35 +08:00
Donatas Abraitis
bfc30f1687 bgpd: Fix while(read()) for RPKI sync callback
Bad formatting applied and it worked with small amount of prefixes (lurking).

With full BGP feed and full RPKI table, this causes infinity loop.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-03-05 00:14:37 +02:00
Donald Sharp
808fc9db0a
Merge pull request #10730 from mobash-rasool/fixes
lib: Route-map failed for OSPF routes even for matching prefixes
2022-03-04 08:27:11 -05:00
Mobashshera Rasool
a34ce5c5e4 lib: Route-map failed for OSPF routes even for matching prefixes
This issue is applicable to other protocols as well.
When user has used route-map, even though the prefixes are falling
under the permit rule, the prefixes were denied and were shown
as inactive route in zebra.

Reason being the parameter which is of type enum was passed to the api
route_map_get_index and was typecasted to uint8_t *.
This problem is visible in case of Big Endian systems because we are
accessing the most significant byte.

'match_ret' field is an enum in the caller and so it is of 4 bytes,
the typecasting it to 1 byte and passing it to the api made
the api to put the value in the most significant byte
which was already zero previously. Therefore the actual value
RMAP_NOMATCH which was 1 never gets reset in this case.
Therefore the api always returns 'RMAP_NOMATCH' and hence
the prefixes are always denied.

Fixes: #9782
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
2022-03-04 03:57:09 -08:00
Ville Skyttä
79cf42e53a debian: include frr@.service in deb
Signed-off-by: Ville Skyttä <ville.skytta@upcloud.com>
2022-03-04 13:10:08 +02:00
Donatas Abraitis
cd16c68376 doc: Drop unnecessary sentence about index.html changes for releases
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-03-04 09:53:16 +02:00
Donatas Abraitis
5c1d285012 doc: Add a couple of handy commands to get some info for release notes
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-03-04 09:46:25 +02:00
Sai Gomathi
6d733f0dbc pimd: Added new option "detail" in the "debug pim nht" CLI
Problem Statement:
=================
PIM Logs are coming at interval of 1 minute although pim
is not enabled.

Root Cause Analysis:
====================
By default, RCPM configures the PIM debugs when router comes up
via script. The product cannot disable PIM even though it is not required.
Hence moving these logs under a new debug option which will not be
enabled by default.

Fix:
====
Added a new option "detail" in the cli:
debug pim nht detail

Co-author: Mobashshera Rasool <mrasool@vmware.com>
Signed-off-by: Sai Gomathi <nsaigomathi@vmware.com>
2022-03-03 22:33:20 -08:00
Jafar Al-Gharaibeh
7469f084f3
Merge pull request #10723 from opensourcerouting/bgpd-show-flow-detail
bgpd: fix 'show bgp detail json' output
2022-03-03 23:11:32 -06:00
Jafar Al-Gharaibeh
f73b0cc825 tests: add a topotest for ospf, mutli vrf, and route leaking
Test ospf running with 3 vrfs: default, neno, ray
Route leaking is setup via bgp between default and neno vrfs
Leaked routes include connected and ospf

Included test:
 1- OSPF convergnce
 2- zebra/kernel routes

Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
2022-03-03 23:09:15 -06:00
Chirag Shah
d5fdae8f45 zebra: fix crash in evpn neigh cleanup all
zebra crash is seen during shutdown (frr restart).

During shutdown, remote neigh and remote mac clean up
is triggered first, followed by per vni all neigh
(including local) and macs cleanup is triggered.

The crash occurs when a remote mac is cleaned up first
and its reference is remained in local neigh.
When local neigh attempt removes itself from its associated
mac's neigh_list it triggers inaccessible memory crash.

The fix is during mac deletion if its neigh_list is non-empty
then retain the MAC in AUTO state.
This can arise when MAC and neigh duo are in different state
(remote/local). Otherwise, the order of cleanup operation
is neighs followed by macs.

The auto mac will be cleaned up when per vni all neighs and macs
are cleaned up.

Ticket:CM-29826
Reviewed By:CCR-10369
Testing Done:

Configure evpn symmetric config where
MAC is in remote state and neigh is in local state.
Perform frr restart then crash is not seen.

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2022-03-03 14:59:56 -08:00
Rafael Zalamena
3c1f92018b lib: rotate log file supplied by command line
Call `zlog_file_rotate` for command file lines as well otherwise on
`SIGUSR1` the old descriptor will still be used and no new log file will
be created for the rotation.

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2022-03-03 18:28:08 -03:00
Rafael Zalamena
5be6fa9bf0 bgpd: fix 'show bgp detail json' output
Include the BGP_SHOW_OPT_DETAIL flag in the 'detail' version of the
command.

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2022-03-03 18:26:03 -03:00
Donald Sharp
8b48cdb913 zebra: Prevent installation of connected multiple times
With recent changes to interface up mechanics in if_netlink.c
FRR was receiving as many as 4 up events for an interface
on ifdown/ifup events.  This was causing timing issues
in FRR based upon some fun timings.  Remove this from
happening.

Ticket: CM-31623
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-02 18:34:32 -08:00
Chirag Shah
d78fa57195 zebra: protodown-up event trigger interface up
Vxlan interfaces flap (protodown/up) event,
non ptm operative interfaces do not come up
as protodown up event do not trigger "if_up()"
event.

Ticket:CM-30477
Reviewed By:CCR-10681
Testing Done:

validated interfaces flaps, ip link down, ifdown
and protodown followed by UP event. all Vxlan interfaces
come up in bgpd post flap.

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2022-03-02 18:16:56 -08:00
Chirag Shah
2d04bd98ac zebra: handle protodown netlink for vxlan device
Frr need to handle protocol down event for vxlan
interface.
In MLAG scenario, one of the pair switch can put
vxlan port to protodown state, followed by
tunnel-ip change from anycast IP to individual IP.

In absence of protodown handling, evpn end up
advertising locally learn EVPN (MAC-IP) routes
with individual IP as nexthop.
This leads an issue of overwriting locally learn
entries as remote on MLAG pair.

Ticket:CM-24545
Reviewed By:CCR-10310
Testing Done:

In EVPN deployment, restart one of the MLAG
daemon, which puts vxlan interfaces in protodown state.
FRR treats protodown as oper down for vxlan interfaces.
VNI down cleans up/withdraws locally learn routes.
Followed by vxlan device UP event, re-advertise
locally learn routes.

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2022-03-02 17:40:44 -08:00
Jafar Al-Gharaibeh
4190587a3f
Merge pull request #10713 from donaldsharp/enum_fixup
lib, ospfd: Enum fixup
2022-03-02 16:29:46 -06:00
Donald Sharp
5f503e5f5a lib: Fix corruption when routemap delete/add sequence happens
If a operator issues a series of route-map deletions and
then re-adds, *and* this triggers the hash table to realloc
to grow to a larger size, then subsuquent route-map operations
will be against a corrupted hash table.

Why?

Effectively the route-map code was inserting each
route-map <NAME> into a hash for storage.  Upon
deletion there is this concept of delayed processing
so the routemap code sets a bit `to-be-processed`
and marks the route-map for deletion.  This is
1 entry in the hash table.  Then if the operator
recreates the hash, FRR would add another hash
entry.  If another deletion happens then there
now are 2 deletion entries that are indistinguishable
from a hash perspective.

FRR stores the deleted name of the route-map so that
any delayed processing can lookup the name and only process
those peers that are related to that route-map name.
This is good as that if in say BGP, we do not want
to reprocess all the peers that don't use the route-map.

Solution:
The whole purpose of the delay of deletion and the
storage of the route-map is to allow the using protocol
the ability to process the route-map at a later time
while still retaining the route-map name( for more efficient
reprocessing ).  The problem exists because we are keeping
multiple copies of deletion events that are indistinguishable
from each other causing hash havoc.

The truth is that we only need to keep 1 copy of the
routemap in the table.  If the series of events is:
a) delete ( schedule processing )
b) add ( reschedule processing )

Current code ends up processing the route-map two times
and in this event we really just need to reprocess everything
with the new route-map.

If the series of events is:
a) delete (schedule processing )
b) add (reschedule)
c) delete (reschedule)
d) add (reschedule)

All this really points to is that FRR just needs to keep the last
in the series of maps and ensuring that FRR knows that we need
to continue processing the route-map.  So in the creation processing
if the hash has an entry for this map, the routemap code knows that
this is a deletion event.  Mark this route-map for later processing
if it was marked so.  Also in the lookup function do not return
a map if the map found was deleted.

Fixes: #10708
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-02 15:41:54 -05:00
Rafael Zalamena
54aeec5ef0 lib,vtysh: show operational data with config
Add option to merge configuration data in the operational data show
command.

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2022-03-02 16:37:43 -03:00
Renato Westphal
9b40fa1eae lib: fix iteration over YANG presence containers
State-only and configuration presence-containers need to be treated
differently when iterating over YANG operational data. Currently the
get_elem() callback is used to know when a state-only p-container
exists or not, and configuration p-containers are assumed to always
exist, which is clearly wrong. Fix this by checking the running
configuration to know whether a rw p-container exists or not.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
2022-03-02 16:32:05 -03:00
Rafael Zalamena
852dfb3013 lib: fix show yang operational state output
Don't show default configuration values for state nodes.

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2022-03-02 16:31:47 -03:00
Javier Garcia
432f143212 topotest: Add test for isis json cmds.
Signed-off-by: Javier Garcia <javier.martin.garcia@ibm.com>
2022-03-02 16:21:29 +01:00
Javier Garcia
a2cac12a63 isisd: Add json to show isis database command.
Signed-off-by: Javier Garcia <javier.martin.garcia@ibm.com>
2022-03-02 16:20:44 +01:00
Donald Sharp
91f2045d53 lib: Fix zclient.c enum event to enum zclient_event
zclient.c is using `enum event` let's rename it to a better
named data structure `enum zclient_event`.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-02 09:19:05 -05:00
Donald Sharp
14416b1f33 ospfd: Convert enum event to enum ospf_apiserver_event
Let's name this something more appropriate to
what is being done.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-02 09:17:57 -05:00
Donald Sharp
55a70ffb78 lib: Rename enum event to enum vty_event
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-02 09:17:47 -05:00
David Lamparter
0455229c5d pim6d: reenable NHT code
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-02 11:01:47 +01:00