Check `show ip route` for specific kernel routes after
the interface as their nexthop changes vrf.
After moving interface's vrf, there should be no kernel
route in old vrf.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
When changing one interface's vrf, the kernel routes are wrongly kept
in old vrf. Finally, the forwarding table in that old vrf can't forward
traffic correctly for those residual entries.
Follow these steps to make this problem happen:
( Firstly, "x1" interface of default vrf is with address of "6.6.6.6/24". )
```
anlan# ip route add 4.4.4.0/24 via 6.6.6.8 dev x1
anlan# ip link add vrf1 type vrf table 1
anlan# ip link set vrf1 up
anlan# ip link set x1 master vrf1
```
Then check `show ip route`, the route of "4.4.4.0/24" is still selected
in default vrf.
If the interface goes down, the kernel routes will be reevaluated. Those
kernel routes with active interface of nexthop can be kept no change, it
is a fast path. Otherwise, it enters into slow path to do careful examination
on this nexthop.
After the interface's vrf had been changed into new vrf, the down message of
this interface came. It means the interface is not in old vrf although it
still exists during that checking, so the kernel routes should be dropped
after this nexthop matching against a default route in slow path. But, in
current code they are wrongly kept in fast path for not checking vrf.
So, modified the checking active nexthop with vrf comparision for the interface
during reevaluation.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
There are relaxed nexthop requirements for kernel routes because we
trust kernel routes.
Two minor changes for kernel routes:
1. `if_is_up()` is one of the necessary conditions for `if_is_operative()`.
Here, we can remove this unnecessary check for clarity.
2. Since `nexthop_active()` doesn't distinguish whether it is kernel route,
modified the corresponding comment in it.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
The loading_done event needs a event pointer to prevent
use after free's. Testing found this:
ERROR: AddressSanitizer: heap-use-after-free on address 0x613000035130 at pc 0x55ad42d54e5f bp 0x7ffff1e942a0 sp 0x7ffff1e94290
READ of size 1 at 0x613000035130 thread T0
#0 0x55ad42d54e5e in loading_done ospf6d/ospf6_neighbor.c:447
#1 0x55ad42ed7be4 in event_call lib/event.c:1995
#2 0x55ad42e1df75 in frr_run lib/libfrr.c:1213
#3 0x55ad42cf332e in main ospf6d/ospf6_main.c:250
#4 0x7f5798133c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
#5 0x55ad42cf2b19 in _start (/usr/lib/frr/ospf6d+0x248b19)
0x613000035130 is located 48 bytes inside of 384-byte region [0x613000035100,0x613000035280)
freed by thread T0 here:
#0 0x7f57998d77a8 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xde7a8)
#1 0x55ad42e3b4b6 in qfree lib/memory.c:130
#2 0x55ad42d5d049 in ospf6_neighbor_delete ospf6d/ospf6_neighbor.c:180
#3 0x55ad42d1e1ea in interface_down ospf6d/ospf6_interface.c:930
#4 0x55ad42ed7be4 in event_call lib/event.c:1995
#5 0x55ad42ed84fe in _event_execute lib/event.c:2086
#6 0x55ad42d26d7b in ospf6_interface_clear ospf6d/ospf6_interface.c:2847
#7 0x55ad42d73f16 in ospf6_process_reset ospf6d/ospf6_top.c:755
#8 0x55ad42d7e98c in clear_router_ospf6_magic ospf6d/ospf6_top.c:778
#9 0x55ad42d7e98c in clear_router_ospf6 ospf6d/ospf6_top_clippy.c:42
#10 0x55ad42dc2665 in cmd_execute_command_real lib/command.c:994
#11 0x55ad42dc2b32 in cmd_execute_command lib/command.c:1053
#12 0x55ad42dc2fa9 in cmd_execute lib/command.c:1221
#13 0x55ad42ee3cd6 in vty_command lib/vty.c:591
#14 0x55ad42ee4170 in vty_execute lib/vty.c:1354
#15 0x55ad42eec94f in vtysh_read lib/vty.c:2362
#16 0x55ad42ed7be4 in event_call lib/event.c:1995
#17 0x55ad42e1df75 in frr_run lib/libfrr.c:1213
#18 0x55ad42cf332e in main ospf6d/ospf6_main.c:250
#19 0x7f5798133c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
previously allocated by thread T0 here:
#0 0x7f57998d7d28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
#1 0x55ad42e3ab22 in qcalloc lib/memory.c:105
#2 0x55ad42d5c8ff in ospf6_neighbor_create ospf6d/ospf6_neighbor.c:119
#3 0x55ad42d4c86a in ospf6_hello_recv ospf6d/ospf6_message.c:464
#4 0x55ad42d4c86a in ospf6_read_helper ospf6d/ospf6_message.c:1884
#5 0x55ad42d4c86a in ospf6_receive ospf6d/ospf6_message.c:1925
#6 0x55ad42ed7be4 in event_call lib/event.c:1995
#7 0x55ad42e1df75 in frr_run lib/libfrr.c:1213
#8 0x55ad42cf332e in main ospf6d/ospf6_main.c:250
#9 0x7f5798133c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
Add an actual event pointer and just track it appropriately.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
I'm seeing crashes in ospf6_write on the `assert(node)`. The only
sequence of events that I see that could possibly cause this to happen
is this:
a) Someone has scheduled a outgoing write to the ospf6->t_write and
placed item(s) on the ospf6->oi_write_q
b) A decision is made in ospf6_send_lsupdate() to send an immediate
packet via a event_execute(..., ospf6_write,....).
c) ospf6_write is called and the oi_write_q is cleaned out.
d) the t_write event is now popped and the oi_write_q is empty
and FRR asserts on the `assert(node)` <crash>
When event_execute is called for ospf6_write, just cancel the t_write
event. If ospf6_write has more data to send at the end of the function
it will reschedule itself. I've only seen this crash one time and am
unable to reliably reproduce this at all. But this is the only mechanism
that I can see that could make this happen, given how little the oi_write_q
is actually touched in code.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When we pass an unknown/wrong command and do `systemctl reload frr`, all processes
are killed, and not started up.
Like doing with frr-reload.py, all good:
```
$ /usr/lib/frr/frr-reload.py --reload /etc/frr/frr.conf
vtysh failed to process new configuration: vtysh (mark file) exited with status 2:
b'line 20: % Unknown command: neighbor 192.168.10.123 bfd 300 300\n\n'
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When no ip pim is performed subsequent pim related
configs under the interface also implicitly deleted.
When doing this via frr-reload requires to remove any
explicit no ip pim <blah> lines so delete list.
Testing Done:
running-config:
interface lo
ip pim
ip pim use-source 6.0.0.1
exit
frr.conf:
remove two pim config lines.
interface lo
exit
Before fix:
2023-06-29 23:44:26,062 INFO: Failed to execute interface lo no ip pim use-source 6.0.0.1
2023-06-29 23:44:26,142 INFO: Failed to execute interface lo no ip pim use-source
2023-06-29 23:44:26,221 INFO: Executed "interface lo no ip pim"
After fix:
Only no ip pim executed and rest of the other lines removed from delete
list.
2023-06-30 01:07:32,618 INFO: Executed "interface lo no ip pim"
Signed-off-by: Chirag Shah <chirag@nvidia.com>
When using asic_offload with an asynchronous notification the
rib_route_match_ctx function is testing for distance and tag
being correct against the re.
Normal route notification for static routes is this(well really all routes):
a) zebra dplane generates a ctx to send to the dplane for route install
b) dplane installs it in the kernel
c) if the dplane_fpm_nl.c module is being used it installs it.
d) The context's success code is set to it worked and passes the context
back up to zebra for processing.
e) Zebra master receives this and checks the distance and tag are correct
for static routes and accepts the route and marks it installed.
If the operator is using a wait for install mechansim where the dplane
is asynchronously sending the result back up at a future time *and*
it is using the dplane_fpm_nl.c code where it uses the rt_netlink.c
route parsing code, then there is no way to set distance as that we
do not pass distance to the kernel.
As such static routes were never being properly handled since the re and
context would not match and the route would still be marked as queued.
Modify the code such that the asynchronous path notification for static
routes ignores the distance and tag's as that there is no way to test
for this data from that path at this point in time.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
There is no route-map set action to replace any ASN,
or a part of an ASN, with a configured ASN.
The current commit adds a new command to use a configured
ASN as replacement, instead of using the local as number.
> set as-path replace any 65500
Update the 'bgp_set_aspath_replace' test.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In case the full label stack is used, there may be
a table overrun happening. Avoid it by increasing the
size of the table.
Fixes: 27f4deed0a ("bgpd: update the mpls entry to handle return traffic")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The bmnc pointer is never null. Do not keep the test
on the pointer.
Fixes: 1069425868 ("bgpd: allocate label bound to received mpls vpn routes")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add support for "[no] ip ospf capbility opaque" at the interface
level with the default being capability opaque enabled. The command
"no ip ospf capability opaque" will disable opaque LSA database
exchange and flooding on the interface. A change in configuration
will result in the interface being flapped to update our options
for neighbors but no attempt will be made to purge existing LSAs
as in dense topologies, these may received by neighbors through
different interfaces.
Topotests are added to test both the configuration and the LSA
opaque flooding suppression.
Signed-off-by: Acee <aceelindem@gmail.com>
Crash with empty `ip-protocol`:
```
anlan(config-pbr-map)# match ip-protocol
vtysh: error reading from pbrd: Resource temporarily unavailable (11)Warning: closing connection to pbrd because of an I/O error!
```
So, give warning for empty `ip-protocol`.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
With these changes,
the code ensures that the peer data-structures are accessed
only after it knows that BGPD is not terminating.
Authored-by: Naveen Thanikachalam <nthanikachal@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
Imagine the following scenario:
when a neighbor has an inbound policy set to modify the next hop, but no outbound route-map is configured.
In this case, if(!post_attr && (ROUTE_MAP_OUT_NAME(filter) || bgp_path_suppressed(pi))) returns false, causing rmap_in_change_flag to not be correctly cleared, and mistakenly identified as rmap_out_change_flag, leading to the failure of the subsequent neighbor-nexthop-self command.
Signed-off-by: Jack.zhang <hanyu.zly@alibaba-inc.com>