1. Consider a established L2VPN EVPN BGP peer with soft-reconfiguartion
inbound configured
2. When the interface of this directly connected BGP peer is shutdown,
bgp_soft_reconfig_table_update() is called, which memsets the evpn buffer
and calls bgp_update() with received attributes stored in ain table(ain->attr).
In bgp_update(), evpn_overlay attribute in ain->attr (which is an interned
attr) was modified by doing a memcpy
3. Above action causes 2 attributes in the attrhash (which were previously different)
to match!
4. Later during fsm change event of the peer, bgp_adj_in_remove() is called
to clean up the ain->attr. But, because 2 attrs in attrhash match, it causes
BGP to assert in bgp_attr_unintern()
Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
(cherry picked from commit 6e076ba523)
EVPN MH ES reduendant VTEPs need to install
sync MAC as notify inactive and generate
ND:Proxy stamped extended community on Type-2
route.
Ticket:#3436621
Issue:3436621
Testing Done:
tor-11 originates type-2 MAC route:
tor-11# bridge -d fdb show | grep 00:65:00:00:00:01
00:65:00:00:00:01 dev hostbond1 vlan 1000 notify master bridge static
tor-12 receives sync MAC route:
Before fix:
----------
tor-12:/# bridge -d fdb show | grep 00:65:00:00:00:01
00:65:00:00:00:01 dev hostbond1 vlan 1000 notify master bridge static
After fix: inactive is set to MAC entry
----------
tor-12:/#bridge -d fdb show | grep 00:65:00:00:00:01
00:65:00:00:00:01 dev hostbond1 vlan 1000 notify inactive master bridge
static
Notice the difference in `inactive` post notify on tor-12
with the fix.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Signed-off-by: Chirag Shah <chirag@nvidia.com>
(cherry picked from commit 4a1f91a366)
The check flag of `found_pg_cmd` is already there, but not used.
So, make it really work for reload.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
(cherry picked from commit 9ea17a5d49)
From commit `411d1a2`, `bgp_delete_nbr_remote_as_line()` is added to
remove some specific bgp neighbors. But, when reloading the following
configuration, it will wrongly remove some good ones:
`neighbor 66.66.66.6 remote-as internal`:
```
router bgp 66
bgp router-id 172.16.204.6
neighbor ANLAN peer-group
neighbor ANLAN remote-as internal
neighbor 66.66.66.6 remote-as internal <- LOST
neighbor 66.66.66.60 peer-group ANLAN
```
The reason is that "66.66.66.6" is included in "66.66.66.60" literally,
then it is mistakenly thought to be a match. Just fix it with
excat match.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
(cherry picked from commit a70895e83a)
When ospf6 is started up and SPF is run depending on which route is
selected as the parent route we could miss adding a NH. If one
possible parent route has two equal cost paths and the second possible
parent route has only one depending on which one is selected first
determines if we have have one or two NHs.
In the network below when creating a route 2001:db8:3:4::/64 in R2.
When SPF is run there are two possible parent routes R3 and R4.
2001:db8:1:2 +-----+ 2001:db8:2:5
+--------------+ 2 +---------------+
| ::2 | | ::2 |
| +-----+ |
| |
::1| |
+-----+ |::5
| 1 |2001:db8:1:3+-----+2001:db8:3:5+-----+2001:db8:5:6+-----+
| +------------+ 3 +------------+ 5 +------------+ 6 |
+-----+ ::1 ::3 | |::3 ::5 | |::5 ::6| |
::1| +-----+ +-----+ +-----+
| |::3
| | 2001:db8:3:4
| |
| |::4
| 2001:db8:1:4 +-----+
+--------------+ 4 |
::4 | |
+-----+
The problem was if we first created the route to 2001:db8:3:4::/64 with R3
as the parent route all is fine. The code was merging the NH from the parent
route and R3 has 2 NH, one pointing to R1 and one to R5. But if route
2001:db8:3:4::/64 was first created with parent as R4, it has only one NH
pointing to R1, and then later a new vertex was created pointing to R3 the
code would only copy the nhs from the vertex not from the parent route. The
vertex always has just one NH. But the parent route could have more. So
when we would bringup this setup one time we would see one NH for
2001:db8:3:4::/64 and the next time we would see two NHs. The code has been
modified so that it behaves the same when the route is first created, or when
a vertex is created, it selects the NHs from the parent route.
Signed-off-by: Lynne Morrison <lynne@voltanet.io>
(cherry picked from commit 637e6f293b)
After doing "no ipv6 pim" and "ipv6 pim" on receiver interface in LHR,
mroutes were not created.
When we enable pim after disabling it, we refresh the membership on pim interface.
First we clear the membership on the interface, then we add it back.
For PIMv6 we were only clearing it, membership was not added back. So mroutes were not created.
Now added code to fetch all the local membership information into PIM.
Fixes: #12005, #12820
Signed-off-by: Abhishek N R <abnr@vmware.com>
(cherry picked from commit 00fed6ed3b)
Topology Used:
=============
Cisco---FRR4----FRR2
Initially PIM nbr is down between FRR4----FRR2 from FRR2 side
Cisco is sending BSR packet to FRR4.
Problem Statement:
=================
No shutdown the PIM neighbor on FRR2 towards FRR4.
FRR2, receives BSR packet immediately as the new neighbor
comes up. This BSR packet is having no-forward bit set.
FRR2 is not able to process the BSR packet, and drop the
BSR packet.
Root Cause:
==========
When PIMD comes up, we start BSM timer for 60 seconds.
Here, the value accept_nofwd_bsm is setting to false.
FRR2, when receives no-forward BSR packet, it is getting
accept_nofwd_bsm value as false.
So, it drops, the no-forward BSM packet.
Fix:
===
Set accept_nofwd_bsm as false after first BSM packet received.
Signed-off-by: Sarita Patra <saritap@vmware.com>
(cherry picked from commit 8b462d5579)
Issue:
After vlan flap, zebra was not marking the selected/best route as installed.
As a result, when a static route was configured with nexthop as directly
connected interface's(vlan) IP, the static route was not being installed
in the kernel since its nexthop was unresolved. The nexthop was marked
unresolved because zebra failed to mark the best route as installed after
interface flap.
This was happening because, in dplane_route_update_internal() if the old and
new context type, and nexthop group id are the same, then zebra doesn't send
down a route replace request to kernel. But, the installed (ROUTE_ENTRY_INSTALLED)
flag is set when zebra receives a response from kernel. Since the
request to kernel was being skipped for the route entry, installed flag
was not being set
Fix:
In dplane_route_update_internal() if the old and new context type, and
nexthop group id are the same, then before returning, installed flag will
be set on the route-entry if it's not set already.
Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
(cherry picked from commit e25a0b138a)
Create VRF and interfaces:
ip netns add vrf1
ip link add veth1 index 100 type veth
ip link add link veth1 veth1.200 type vlan id 200
ip link set veth1.200 netns vrf1
ip -n vrf1 link add veth2 index 100 type veth
After reloading zebra, "show interface veth1.200" shows wrong parent
interface:
test# show interface veth1.200
Interface veth1.200 is down
...
Parent interface: veth2
This is because veth1.200 and veth1 are in different netns, and veth2
happens to have the same ifindex as veth1, in the same netns of
veth1.200.
When looking for parent, link-ifindex 100 should be looked up within
link-netns, rather than that of the child interface.
Add link_nsid to zebra interface, so that the <link_nsid, link_ifindex>
pair can uniquely identify the link interface.
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
(cherry picked from commit af19624b00)
During shutdown, the main pthread stops the dplane pthread
before exiting. Don't try to clean up any events scheduled
to the dplane pthread at that point - just let the thread
exit and clean up. This is the 8.5 version.
Signed-off-by: Mark Stapp <mjs@labn.net>
When cleaning `ripd`, it should free `ctx->name` of `struct if_rmap_ctx`,
not `ctx` itself. Otherwise, it will lead to memory leak.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
(cherry picked from commit d3ec0066e5)
update_type1_routes_for_evi() is called from
L3VNI/L2VNI up event, if ESI is not UP then
do not advertise EAD-ES Type-1 route.
Just like from multiple places EAD-ES route
origination checks for its oper status.
Ticket:#3413454
Issue:3413454
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Signed-off-by: Chirag Shah <chirag@nvidia.com>
(cherry picked from commit c683b7baad)
When we have these sequence of events causing a crash in
evpn_type5_test_topo1:
(A) no router bgp vrf RED 100
this schedules for deletion the vrf RED instance
(B) a l3vni change event from zebra
this creates a bgp instance for VRF RED in some cases
additionally it auto imports evpn routes into VRF RED
Please note this is desired behavior to allow for the
auto importation of evpn vrf routes
(C) no router bgp 100
The code was allowing the deletion of the default
instance and causing tests to crash.
Effectively the test in bgp_vty to allow/dissallow
the removal of the default instance was not correct
for the case when (B) happens.
Let's just not allow the command to succeed in this case as that
the test was wrong.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 746e0522f3)
When a `no router bgp XXX` is issued and the bgp instance
is in the process of shutting down, do not allow a l3vni
change coming up from zebra to do anything. We can just
safely ignore it at this point in time.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 5a7c43c77e)
Let's cut to the chase, we know the pointer type and
it allows us to not to some gyrations.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit ef96e3753f)
All other parsing functions done from bgp_nlri_parse() assume
no attributes == an implicit withdrawal. Let's move
bgp_nlri_parse_flowspec() into the same alignment.
Reported-by: Matteo Memelli <mmemelli@amazon.it>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit cfd04dcb3e)
FRR's standards state that function declarations should
have actual variable names for parameters passed in.
Let's make this so for bgp_packet.h
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 78745b8700)
two things:
On shutdown cleanup any events associated with the update walker.
Also do not allow new events to be created.
Fixes this mem-leak:
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790:Direct leak of 8 byte(s) in 1 object(s) allocated from:
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #0 0x7f0dd0b08037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #1 0x7f0dd06c19f9 in qcalloc lib/memory.c:105
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #2 0x55b42fb605bc in rib_update_ctx_init zebra/zebra_rib.c:4383
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #3 0x55b42fb6088f in rib_update zebra/zebra_rib.c:4421
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #4 0x55b42fa00344 in netlink_link_change zebra/if_netlink.c:2221
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #5 0x55b42fa24622 in netlink_information_fetch zebra/kernel_netlink.c:399
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #6 0x55b42fa28c02 in netlink_parse_info zebra/kernel_netlink.c:1183
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #7 0x55b42fa24951 in kernel_read zebra/kernel_netlink.c:493
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #8 0x7f0dd0797f0c in event_call lib/event.c:1995
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #9 0x7f0dd0684fd9 in frr_run lib/libfrr.c:1185
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #10 0x55b42fa30caa in main zebra/main.c:465
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #11 0x7f0dd01b5d09 in __libc_start_main ../csu/libc-start.c:308
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-SUMMARY: AddressSanitizer: 8 byte(s) leaked in 1 allocation(s).
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 3cd0accb50)
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The parser for extended communities was incorrectly disallowing an
operator from configuring "Route Origin" extended communities
(e.g. RD/RT/SoO) with a 4-byte value matching BGP_AS4_MAX (UINT32_MAX)
and allowed the user to overflow UINT32_MAX. This updates the parser to
read the value as a uint64_t so that we can do proper checks on the
upper bounds (> BGP_AS4_MAX || errno).
before:
```
TORC11(config-router-af)# neighbor uplink-1 soo 4294967296:65
TORC11(config-router-af)# do sh run | include soo
neighbor uplink-1 soo 0:65
TORC11(config-router-af)# neighbor uplink-1 soo 4294967295:65
% Malformed SoO extended community
TORC11(config-router-af)#
```
after:
```
TORC11(config-router-af)# neighbor uplink-1 soo 4294967296:65
% Malformed SoO extended community
TORC11(config-router-af)# neighbor uplink-1 soo 4294967295:65
TORC11(config-router-af)# do sh run | include soo
neighbor uplink-1 soo 4294967295:65
TORC11(config-router-af)#
```
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
(cherry picked from commit b571d79d64)
Currently the process of the `route-map` configuration for `per-vrf-rip`
is wrong.
There are two problems:
1. `ctx->name` for `if_rmap_ctx` is not initialized in `if_rmap_ctx_create()`.
2. The global `if_rmap_ctx_list` is wrongly used for `per-vrf-rip`.
So, two changes for it:
1. Correctly initializes `ctx->name`.
2. Use specific `if_rmap_ctx` for `per-vrf-rip`, not global one.
Note, this related implementation for `route-map` is only for `ripd`.
Before:
```
anlan(config)# route rip vrf vrf1
anlan(config-router)# route-map aa in lan
anlan(config-router)# do show run
!
router rip
route-map aa in lan
exit
!
```
After:
```
anlan(config)# route rip vrf vrf1
anlan(config-router)# route-map aa in lan
anlan(config-router)# do show run
!
router rip vrf vrf1
route-map aa in lan
exit
!
```
Signed-off-by: anlan_cs <vic.lan@pica8.com>
(cherry picked from commit ff0fa00c7d)
Fix for missing neighbor description in "show bgp summary [wide]"
when its length exceeds 20[64] chars and it doesn't contain
withespaces.
Existing behavior remains if description contains whitespaces
before size limit.
Signed-off-by: rbarroetavena <rbarroetavena@gmail.com>
(cherry picked from commit 420ac3d24c)
Test failed time to time, let's try this way:
```
$ for x in $(seq 1 20); do cp test_bgp_labeled_unicast_addpath.py test_$x.py; done
$ sudo pytest -s -n 20
```
Ran 10 times using this pattern, no failure 🤷
Before this change, we checked advertised routes, and at some point `=` was
missing from the output, but advertised correctly. Receiving router gets as
much routes as expected to receive.
I reversed checking received routes, not advertised.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 9db7ed2fc9)