22036 Commits

Author SHA1 Message Date
Anuradha Karuppiah
46bf266c1c zebra: debug logs to detect incorrect mac deletions
A MAC entry cannot be deleted while a neigh is referencing it. It seems
there is some race condition where this may be happening. The log is
to help identify those cases.
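
A minimal sketch of the invariant described above, assuming a hypothetical MAC table that keeps a per-entry neigh reference count. The names `try_delete_mac`, `mac_table` and `neigh_refs` are illustrative and are not taken from zebra; the sketch refuses the delete, whereas the commit itself only adds the log.

```python
import logging

log = logging.getLogger("mac_demo")  # hypothetical logger name


def try_delete_mac(mac_table, mac):
    """Delete a MAC entry only if no neigh references it.

    mac_table maps a MAC string to a dict with a 'neigh_refs' counter
    (an assumed layout, used here only for illustration).
    """
    entry = mac_table.get(mac)
    if entry is None:
        return False
    if entry["neigh_refs"] > 0:
        # This is the suspicious case the debug log is meant to expose.
        log.debug("delete of MAC %s requested with %d neigh ref(s)",
                  mac, entry["neigh_refs"])
        return False
    del mac_table[mac]
    return True
```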

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-12-01 09:46:28 -08:00
Anuradha Karuppiah
4f9bb78eca zebra: change the L2 NHG id format to co-exist with the L3NHG ids
It is now 4 bits of type and 28 bits of value -
1. type=0 is for L3 NHG
2. type=1 is for L2 NH
3. type=2 is for L2 NHG
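
The layout above can be sketched as follows, assuming the type occupies the top 4 bits of a 32-bit id. That placement is an assumption, although it is consistent with the `ip nexthop` ids shown in the next commit. The constant and function names are hypothetical.

```python
NHG_TYPE_BITS = 4
NHG_VALUE_BITS = 28
NHG_VALUE_MASK = (1 << NHG_VALUE_BITS) - 1

# Type values as listed in the commit message.
NHG_TYPE_L3_NHG = 0
NHG_TYPE_L2_NH = 1
NHG_TYPE_L2_NHG = 2


def encode_nhg_id(id_type, value):
    """Pack a 4-bit type and a 28-bit value into one 32-bit id."""
    if not 0 <= id_type < (1 << NHG_TYPE_BITS):
        raise ValueError("type does not fit in 4 bits")
    if not 0 <= value <= NHG_VALUE_MASK:
        raise ValueError("value does not fit in 28 bits")
    return (id_type << NHG_VALUE_BITS) | value


def decode_nhg_id(nhg_id):
    """Split a 32-bit id back into (type, value)."""
    return nhg_id >> NHG_VALUE_BITS, nhg_id & NHG_VALUE_MASK
```

Under this assumption an id such as 268435461 decodes to type 1 (L2 NH) with value 5.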

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-12-01 09:46:28 -08:00
Anuradha Karuppiah
5de10c3705 zebra: allocate one nexthop id per-VTEP instead of one per-ES-VTEP
This is an optimization to reduce the number of L2 nexthops. An
L2 or fdb nexthop simply provides the dataplane with a nexthop ip -
torm-12:mgmt:~# ip nexthop
id 268435461 via 27.0.0.20 scope link fdb
id 268435463 via 27.0.0.20 scope link fdb
id 268435465 via 27.0.0.20 scope link fdb

So there is no need to allocate a nexthop per-ES/per-VTEP. There
can be 100+ ESs per-VTEP so this change cuts the scale down by a
factor of 100.
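
The scale reduction can be illustrated with a small sketch that counts distinct allocation keys under both schemes. The function name and the ES naming are hypothetical; only the VTEP address comes from the output above.

```python
def count_nexthops(es_vtep_pairs, per_es):
    """Count nexthop ids needed for a list of (es_id, vtep_ip) pairs.

    per_es=True  models the old scheme: one id per (ES, VTEP) pair.
    per_es=False models the new scheme: one id per VTEP.
    """
    keys = set()
    for es_id, vtep_ip in es_vtep_pairs:
        keys.add((es_id, vtep_ip) if per_es else vtep_ip)
    return len(keys)


# 100 hypothetical ESs that all share one remote VTEP.
pairs = [("es-%d" % i, "27.0.0.20") for i in range(100)]
old_count = count_nexthops(pairs, per_es=True)
new_count = count_nexthops(pairs, per_es=False)
```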

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-12-01 09:46:28 -08:00
Anuradha Karuppiah
15400f95b7 zebra: support for slow-failover of local MACs on an ES
When a local ES flaps there are two modes in which the local
MACs are failed over -
1. Fast failover - A backup NHG (ES-peer group) is programmed in the
dataplane per-access port. When a local ES flaps the MAC entries
are left unaltered i.e. pointing to the down access port. And the
dataplane redirects traffic destined to the oper-down access port
via the backup NHG.
2. Slow failover - This mode needs to be turned on to allow dataplanes
not capable of re-directing traffic. In this mode local MAC entries
on a down local ES are re-programmed to point to the ES-peers'
NHG. And vice-versa i.e. when the ES comes up the MAC entries
are re-programmed with the access port as dest.

Fast failover is on by default. Slow failover can be enabled via the
following config -
evpn mh redirect-off
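
The two modes can be sketched as a single decision on where a local MAC points. This is an illustration under assumed names (`local_mac_dest` and its parameters are hypothetical), not zebra's code.

```python
def local_mac_dest(es_oper_up, redirect_off, access_port, es_peer_nhg):
    """Return the destination programmed for a local MAC on an ES.

    redirect_off models the 'evpn mh redirect-off' config, i.e. slow
    failover. With it unset, fast failover applies.
    """
    if es_oper_up:
        # ES is up: the MAC always points at the access port.
        return access_port
    if redirect_off:
        # Slow failover: re-program the MAC to the ES-peers' NHG.
        return es_peer_nhg
    # Fast failover: leave the MAC unaltered; the dataplane redirects
    # traffic via the backup NHG attached to the access port.
    return access_port
```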

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-12-01 09:46:26 -08:00
Anuradha Karuppiah
69711b3f83 zebra: on local mac add from the dplane a re-install may be needed as static
As a part of extended MM handling a MAC can be updated from local
to remote while being referenced by SYNC neighs (this is really a
temporary/small window). During this window if the MAC transitions
back to local again we need to re-inforce the previous SYNC flags
(based on the sync-neigh count) as subsequent SYNC updates to the
MAC will be de-duped and ignored.

Ticket: CM-29636

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-12-01 09:44:37 -08:00
Anuradha Karuppiah
1a4f9efd54 zebra: set inactive bit when zebra re-installs the MAC on dplane del
When a local mac is deleted by the dataplane zebra can re-install it
if the MAC is a SYNC MAC (learned from ES peers). The "local_inactive"
bit must be set as a part of the re-install to prevent zebra turning
around and advertising the MAC as locally active.
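
A rough sketch of that rule, using a hypothetical dict-based MAC entry. The field names `local`, `sync` and `local_inactive`, and both function names, are assumptions made for illustration.

```python
def on_dplane_mac_delete(mac):
    """Handle a dataplane delete for a local MAC entry (a dict).

    Returns 'reinstall' for SYNC MACs and 'delete' otherwise.
    """
    if mac.get("sync"):
        # Mark the entry inactive before re-installing it so that it
        # is not treated as locally active afterwards.
        mac["local_inactive"] = True
        return "reinstall"
    return "delete"


def advertise_as_local(mac):
    """A MAC is advertised as locally active only if it is local and
    not flagged local_inactive."""
    return bool(mac.get("local")) and not mac.get("local_inactive", False)
```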

Also fixed up some debug logs in the slow-fail path to include the VNI.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-12-01 09:44:37 -08:00
Anuradha Karuppiah
80e19eb71f zebra: skip NDA_DST attr if NHG is present
NHG and DST (VTEP-IP) are mutually exclusive attributes. If DST is
present the kernel ignores NHG.
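
A minimal sketch of the attribute selection, with attributes modelled as (name, value) tuples rather than real netlink encoding. The helper name is hypothetical, and using `NDA_NH_ID` as the name of the attribute that carries the NHG id is an assumption here.

```python
def build_fdb_attrs(mac, vtep_ip=None, nhg_id=None):
    """Build a symbolic attribute list for an FDB entry.

    If an NHG id is given, the DST (VTEP-IP) attribute is skipped,
    because the two are mutually exclusive.
    """
    attrs = [("NDA_LLADDR", mac)]
    if nhg_id is not None:
        attrs.append(("NDA_NH_ID", nhg_id))  # assumed attribute name
    elif vtep_ip is not None:
        attrs.append(("NDA_DST", vtep_ip))
    return attrs
```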

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-12-01 09:44:37 -08:00
Anuradha Karuppiah
de86cc5bb1 zebra: free up the L2 NHG bitmap as a part of shutdown
Fix for a shutdown time memory leak found during review.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-12-01 09:44:37 -08:00
Anuradha Karuppiah
f3722826a4 zebra: remove FDB entries before de-activating a L2-NHG
NHG is activated i.e. programmed in the dataplane only if there
are active-VTEPs associated with it. When a NHG is de-activated
all the remote-mac entries associated with it need to be removed
before the NHG is removed.
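
The ordering can be sketched as below. The class, the function names and the list standing in for dataplane operations are all hypothetical.

```python
class L2Nhg:
    """Hypothetical model of an L2 NHG and the state tied to it."""

    def __init__(self, nhg_id):
        self.nhg_id = nhg_id
        self.active_vteps = set()
        self.remote_macs = set()
        self.programmed = False


def deactivate(nhg, dplane_ops):
    """Remove dependent FDB entries first, then the NHG itself."""
    for mac in sorted(nhg.remote_macs):
        dplane_ops.append(("del_fdb", mac))
    dplane_ops.append(("del_nhg", nhg.nhg_id))
    nhg.programmed = False


def vtep_down(nhg, vtep_ip, dplane_ops):
    """De-activate the NHG once its last active VTEP goes away."""
    nhg.active_vteps.discard(vtep_ip)
    if nhg.programmed and not nhg.active_vteps:
        deactivate(nhg, dplane_ops)
```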

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-12-01 09:44:37 -08:00
Donald Sharp
9f3bcd52e1
Merge pull request #7641 from rampxxxx/fix_run_folder
Fix run folder permissions
2020-12-01 12:40:40 -05:00
Patrick Ruddy
0091461961
Merge pull request #7483 from AnuradhaKaruppiah/evpn-mh-dad
bgpd, zebra: Keep DAD disabled if EVPN MH is turned on
2020-12-01 17:37:32 +00:00
Donald Sharp
5c1a899432
Merge pull request #7632 from idryzhov/vtysh-memory-fixes
vtysh memory fixes
2020-12-01 12:08:52 -05:00
Mark Stapp
958c62b712
Merge pull request #7642 from donaldsharp/remove_pimd_version
pimd: Remove pim_version.c it is never used
2020-12-01 11:59:31 -05:00
Donald Sharp
9171f28204
Merge pull request #7640 from opensourcerouting/bfd-echo-minttl-check
bfdd: session specific command type checks
2020-12-01 09:31:57 -05:00
Donald Sharp
91f20e5cd5 pimd: Remove pim_version.c it is never used
The pim_version.[c|h] files are never used and we are getting
warnings about PIM_VERSION changing pointer sizes from
newer versions of the compiler. I see no reason to keep them.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-12-01 07:57:45 -05:00
Donald Sharp
fc54da0966
Merge pull request #7404 from vishaldhingra/pim
pimd: (*,G) Prune processing doesn't remove SGRpt ifchannel
2020-12-01 07:48:54 -05:00
Donald Sharp
b8a9f6c6a2
Merge pull request #7578 from mjstapp/fix_pim_subdir_am
pimd: fix build and compilation errors
2020-12-01 07:41:33 -05:00
Javier Garcia
d3a3e6253b tools: Fix run folder permissions
On some Linux distros the /var/run dir is mounted as tmpfs,
so it is removed on every reboot.
frrcommon.sh then recreates it without the 'x' permission,
so no pid file can be created in /var/run/frr.

Signed-off-by: Javier Garcia <rampxxxx@gmail.com>
2020-12-01 12:37:51 +01:00
Rafael Zalamena
7d2de131ce bfdd: session specific command type checks
Replace the unclear error message:

```
% Failed to edit configuration.

YANG error(s):
 Schema node not found.
 YANG path: /frr-bfdd:bfdd/bfd/sessions/single-hop[dest-addr='192.168.253.6'][interface=''][vrf='default']/minimum-ttl
```

With:

```
frr(config-bfd-peer)# minimum-ttl 250
% Minimum TTL is only available for multi hop sessions.

! or

frr(config-bfd-peer)# echo
% Echo mode is only available for single hop sessions.
frr(config-bfd-peer)# echo-interval 300
% Echo mode is only available for single hop sessions.
```

Reported-by: Trae Santiago
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2020-12-01 08:01:37 -03:00
Igor Ryzhov
fdac05fd9a doc: add description of the new memory macro
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2020-12-01 11:39:40 +03:00
Donatas Abraitis
6b1ca21086
Merge pull request #7630 from donaldsharp/snmp_clarity
doc: Explicitly call out need to add snmp module
2020-12-01 10:32:06 +02:00
Donald Sharp
d5ecf80558
Merge pull request #7631 from mjstapp/fix_pw_ctx_leak
zebra: free dplane ctx after pw update
2020-11-30 13:48:38 -05:00
Igor Ryzhov
6df43392d8 vtysh: fix incorrect memory statistics
As the code comment states, 1 count of MTYPE_COMPLETION is leaked for each
autocompleted token. Let's manually decrement the counter before passing
the pointer to readline.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2020-11-30 18:55:40 +03:00
Igor Ryzhov
40ab41115d vtysh: fix memory leak
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2020-11-30 18:55:40 +03:00
Mark Stapp
a20e6c32a2 zebra: free dplane ctx after pw update
Free the dplane contexts used for pseudowire updates; we were
leaking these.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-11-30 10:02:40 -05:00
Donald Sharp
7815b994c3 doc: Explicitly call out need to add snmp module
The documentation implied how snmp works.  Explicitly call
it out a bit more for future users.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-11-30 08:39:29 -05:00
Rafael Zalamena
78695ce3a4
Merge pull request #7621 from idryzhov/fix-cisco-access-list
yang: fix cisco access list source value
2020-11-30 09:16:33 -03:00
Rafael Zalamena
9173163369
Merge pull request #7620 from ckishimo/cosmetic2
ospfd: fix a couple of typos
2020-11-30 08:53:33 -03:00
Donald Sharp
2734c7471c
Merge pull request #7584 from opensourcerouting/topotest-asan-fix
tests: Fix Topotest runs with newer version of Address Sanitizer
2020-11-27 17:35:08 -05:00
Igor Ryzhov
de4132bfe5 yang: fix cisco access list source value
Source value must be a choice between host, network and any, not a set
of all three.
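
The difference between a choice and a set can be sketched with a small validator over a dict of configured leaves. The function name and the dict layout are hypothetical, and an empty choice is accepted here as an assumption.

```python
def selected_source(source):
    """Return the single selected case of the source choice, or None.

    Raises ValueError when more than one of host/network/any is set,
    which a YANG choice does not allow.
    """
    cases = ("host", "network", "any")
    present = [name for name in cases if name in source]
    if len(present) > 1:
        raise ValueError("only one of host, network, any may be set: %s"
                         % ", ".join(present))
    return present[0] if present else None
```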

Fixes #7599.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2020-11-27 21:53:25 +03:00
Martin Winter
960c3f25f5
tests: Ignore YANG stderr messages in test_all_protocol_startup test
Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
2020-11-27 19:45:15 +01:00
Martin Winter
8972e2710c
tests: Fix logging output directory for older tests
Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
2020-11-27 19:45:15 +01:00
Martin Winter
1a31ada871
tests: Fix FRR process shutdown in failed topotest teardown phase
Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
2020-11-27 19:45:14 +01:00
Martin Winter
be2656eda2
tests: Fix Topotest runs with newer version of Address Sanitizer
Fix Address Sanitizer issue detection with newer ASAN versions

Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
2020-11-27 19:45:07 +01:00
Renato Westphal
8e418e8e3d
Merge pull request #7614 from donaldsharp/more_use_after_free
ldpd: Prevent usage after free
2020-11-27 08:51:24 -03:00
Donald Sharp
8fd47beb8b
Merge pull request #7593 from opensourcerouting/bgp_features_ospf_fix
tests: Make ospf in bgp_features testsuite predictable
2020-11-26 18:36:54 -05:00
Donald Sharp
903bd636ca
Merge pull request #7611 from opensourcerouting/docker_key_update
tests: Update topotest Dockerfile to pick up keys from deb repo
2020-11-26 18:33:12 -05:00
Donald Sharp
91191fa233 ldpd: Prevent usage after free
error	26-Nov-2020 14:35:02	ERROR: AddressSanitizer: heap-use-after-free on address 0x631000024838 at pc 0x55cefae977e9 bp 0x7ffdd3546860 sp 0x7ffdd3546850
error	26-Nov-2020 14:35:02	READ of size 4 at 0x631000024838 thread T0
error	26-Nov-2020 14:35:02	    #0 0x55cefae977e8 in ldpe_imsg_compose_parent_sync ldpd/ldpe.c:256
error	26-Nov-2020 14:35:02	    #1 0x55cefae9ab13 in vlog ldpd/log.c:53
error	26-Nov-2020 14:35:02	    #2 0x55cefae9b21f in log_info ldpd/log.c:102
error	26-Nov-2020 14:35:02	    #3 0x55cefae96eae in ldpe_shutdown ldpd/ldpe.c:237
error	26-Nov-2020 14:35:02	    #4 0x55cefae99254 in ldpe_dispatch_main ldpd/ldpe.c:585
error	26-Nov-2020 14:35:02	    #5 0x55cefaf93875 in thread_call lib/thread.c:1681
error	26-Nov-2020 14:35:02	    #6 0x55cefae97304 in ldpe ldpd/ldpe.c:136
error	26-Nov-2020 14:35:02	    #7 0x55cefae5a2e2 in main ldpd/ldpd.c:322
error	26-Nov-2020 14:35:02	    #8 0x7f4ef0c33b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
error	26-Nov-2020 14:35:02	    #9 0x55cefae525e9 in _start (/usr/lib/frr/ldpd+0xb35e9)
error	26-Nov-2020 14:35:02
error	26-Nov-2020 14:35:02	0x631000024838 is located 65592 bytes inside of 65632-byte region [0x631000014800,0x631000024860)
error	26-Nov-2020 14:35:02	freed by thread T0 here:
error	26-Nov-2020 14:35:02	    #0 0x7f4ef21e37a8 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xde7a8)
error	26-Nov-2020 14:35:02	    #1 0x55cefae96e91 in ldpe_shutdown ldpd/ldpe.c:234
error	26-Nov-2020 14:35:02	    #2 0x55cefae99254 in ldpe_dispatch_main ldpd/ldpe.c:585
error	26-Nov-2020 14:35:02	    #3 0x55cefaf93875 in thread_call lib/thread.c:1681
error	26-Nov-2020 14:35:02	    #4 0x55cefae97304 in ldpe ldpd/ldpe.c:136
error	26-Nov-2020 14:35:02	    #5 0x55cefae5a2e2 in main ldpd/ldpd.c:322
error	26-Nov-2020 14:35:02	    #6 0x7f4ef0c33b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
error	26-Nov-2020 14:35:02
error	26-Nov-2020 14:35:02	previously allocated by thread T0 here:
error	26-Nov-2020 14:35:02	    #0 0x7f4ef21e3d28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
error	26-Nov-2020 14:35:02	    #1 0x55cefae9725d in ldpe ldpd/ldpe.c:127
error	26-Nov-2020 14:35:02	    #2 0x55cefae5a2e2 in main ldpd/ldpd.c:322
error	26-Nov-2020 14:35:02	    #3 0x7f4ef0c33b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)

Clean this problem up in the same way as the previous commit

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-11-26 18:28:48 -05:00
Martin Winter
57d89808bc
tests: Update topotest Dockerfile to pick up keys from deb repo
Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
2020-11-26 18:08:53 +01:00
Donatas Abraitis
4cfb2ae8c0
Merge pull request #7603 from donaldsharp/ospf_crash_fix
ospfd: Prevent crash by accessing memory not owned.
2020-11-26 14:53:18 +02:00
Donatas Abraitis
3200009046
Merge pull request #7586 from kuldeepkash/bgp_multi_vrf
tests: Add tests to bgp_multi_vrf_topo2
2020-11-26 09:48:54 +02:00
Martin Winter
3636b6c158
tests: Make ospf convergence predictable by setting if priority
Added OSPF priorities to force a predictable DR/Backup router selection
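
Why priorities make the outcome predictable can be seen in a simplified sketch of the DR choice: highest priority wins, ties fall back to the highest router id, and priority 0 is ineligible. This ignores the rest of the election procedure (for example, an existing DR is not preempted), and the function name is hypothetical.

```python
import ipaddress


def elect_dr(routers):
    """Pick a DR from a list of (router_id, priority) tuples.

    Simplified: highest priority first, then highest router id.
    Routers with priority 0 never become DR.
    """
    eligible = [r for r in routers if r[1] > 0]
    if not eligible:
        return None
    best = max(eligible,
               key=lambda r: (r[1], int(ipaddress.IPv4Address(r[0]))))
    return best[0]
```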

Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
2020-11-26 04:10:05 +01:00
Donald Sharp
142d6a1e61
Merge pull request #7600 from pjdruddy/evpn-mh-sa
bgpd: correctly store allocated ES struct
2020-11-25 16:30:29 -05:00
Mark Stapp
30ac1cdab8
Merge pull request #7608 from qlyoung/fix-missing-sockunion-init
bgpd: remove unused, uninitialized sockunion
2020-11-25 16:07:08 -05:00
Mark Stapp
3abf06b722
Merge pull request #7607 from pguibert6WIND/topo_python3_preparation
Topo python3 preparation
2020-11-25 14:22:10 -05:00
Pat Ruddy
f61fbf216b bgpd: correctly store allocated ES struct
In the rare situation where we allocate the ES during the path link,
we fail to check/store the allocated ES pointer, thus leading to a
NULL dereference later in the function.

Signed-off-by: Pat Ruddy <pat@voltanet.io>
2020-11-25 18:23:13 +00:00
Renato Westphal
68cd847ee6
Merge pull request #7602 from donaldsharp/ldp_use_after_free
ldpd: Prevent usage after free
2020-11-25 14:51:53 -03:00
Quentin Young
8395c1f865 bgpd: remove unused, uninitialized sockunion
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
2020-11-25 12:51:52 -05:00
Philippe Guibert
10870bbc20 topotests: specify the import folder
The import folder for topolog must be specified; otherwise the following
error message appears:

root@dut-vm:~/topotests/bgp_flowspec# python3 test_bgp_flowspec_topo.py
Traceback (most recent call last):
  File "test_bgp_flowspec_topo.py", line 96, in <module>
    from lib.lutil import lUtil
  File "/root/topotests/bgp_flowspec/../lib/lutil.py", line 25, in <module>
    from topolog import logger
ImportError: No module named 'topolog'
root@dut-vm:~/topotests/bgp_flowspec#

The same error occurs with lutil and bgprib, which are 2 libraries
located under the lib/ folder. Some further clarifications are added too.
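
The failure mode can be reproduced in isolation. Python 3 has no implicit relative imports, so a module inside a package is only found through its package-qualified name. The sketch below builds a throwaway package in a temporary directory; `demo_lib` and `demo_topolog` are hypothetical stand-ins for the real lib/ folder and topolog module.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical layout: <root>/demo_lib/__init__.py and demo_topolog.py
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_lib")
os.mkdir(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "demo_topolog.py"), "w") as f:
    f.write("logger = 'demo-logger'\n")

# Only the parent of the package is on the search path.
sys.path.insert(0, root)
importlib.invalidate_caches()


def can_import(name):
    """Return True if the named module can be imported."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False
```

Here `can_import("demo_lib.demo_topolog")` succeeds while `can_import("demo_topolog")` does not.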

PR=71290
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2020-11-25 16:31:45 +00:00
Philippe Guibert
ecff3c7a0c topotests: python3, replace iteritems with items
replace iteritems() calls with items()
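
For context, `dict.iteritems()` exists only in Python 2, while `items()` works in both major versions. A small illustration with made-up data:

```python
# Hypothetical router table used only for this example.
routers = {"r1": "10.0.0.1", "r2": "10.0.0.2"}

# items() is available on Python 2 and Python 3 alike.
pairs = sorted(routers.items())

# On Python 3 the old method is gone, so calling it would raise
# AttributeError.
has_iteritems = hasattr(routers, "iteritems")
```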

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2020-11-25 16:31:45 +00:00