Go to file
Donald Sharp c8453cd77e zebra: Fix v6 route replace failure turned into success
Currently when we have a route replace operation for v6 routes
with a new nexthop group the order of kernel installation is this:

a) New nexthop group insertion seq  1
b) Route delete operation seq 3
c) Route insertion operation seq 2

Currently the code in nl_batch_read_resp is attempting
to handle this situation by skipping the delete operation.
*BUT* it is enqueuing the context into the zebra dplane
queue before we read the response.  Since we create the ctx
with an implied success, success is being reported to the
upper level dplane and the zebra rib thinks the route has
been properly handled.

This is showing up in the zebra_seg6_route test code because
the test code is installing a seg6 route w/ sharpd and it
is failing to install because the route's nexthop is rejected:

First installation:

2021/10/29 09:28:10.218 ZEBRA: [JGWSB-SMNVE] dplane: incoming new work counter: 2
2021/10/29 09:28:10.218 ZEBRA: [Q52A7-211QJ] dplane enqueues 2 new work to provider 'Kernel'
2021/10/29 09:28:10.218 ZEBRA: [JVY1P-93VFY] dplane provider 'Kernel': processing
2021/10/29 09:28:10.218 ZEBRA: [TX9N0-9JKDF] ID (9) Dplane nexthop update ctx 0x56125390a820 op NH_INSTALL
2021/10/29 09:28:10.218 ZEBRA: [PM9ZJ-07RCP] 0:1::1/128 Dplane route update ctx 0x56125390add0 op ROUTE_INSTALL
2021/10/29 09:28:10.218 ZEBRA: [TJ327-ET8HE] netlink_send_msg: >> netlink message dump [sent]
2021/10/29 09:28:10.218 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=104 type=(104) NEWNEXTHOP flags=(0x0501) {REQUEST,DUMP,(ROOT|REPLACE|CAPPED),(ATOMIC|CREATE)} seq=9 pid=3539131282]
2021/10/29 09:28:10.218 ZEBRA: [WCX94-SW894]   nhm [family=(10) AF_INET6 scope=(0) UNIVERSE protocol=(11) ZEBRA flags=0x00000000 {}]
2021/10/29 09:28:10.218 ZEBRA: [KFBSR-XYJV1]     rta [len=8 (payload=4) type=(1) ID]
2021/10/29 09:28:10.218 ZEBRA: [Z4E9C-GD9EP]       9
2021/10/29 09:28:10.218 ZEBRA: [KFBSR-XYJV1]     rta [len=20 (payload=16) type=(6) GATEWAY]
2021/10/29 09:28:10.218 ZEBRA: [STTSM-27M81]       2001::1
2021/10/29 09:28:10.218 ZEBRA: [KFBSR-XYJV1]     rta [len=8 (payload=4) type=(5) OIF]
2021/10/29 09:28:10.218 ZEBRA: [JR4EA-BKPTA]       6
2021/10/29 09:28:10.218 ZEBRA: [KFBSR-XYJV1]     rta [len=6 (payload=2) type=(7) ENCAP_TYPE]
2021/10/29 09:28:10.218 ZEBRA: [JR4EA-BKPTA]       5
2021/10/29 09:28:10.218 ZEBRA: [KFBSR-XYJV1]     rta [len=36 (payload=32) type=(32776) UNKNOWN]
2021/10/29 09:28:10.218 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=64 type=(24) NEWROUTE flags=(0x0401) {REQUEST,(ATOMIC|CREATE)} seq=10 pid=3539131282]
2021/10/29 09:28:10.218 ZEBRA: [GCEGC-W8YBF]   rtmsg [family=(10) AF_INET6 dstlen=128 srclen=0 tos=0 table=254 protocol=(194) UNKNOWN scope=(0) UNIVERSE type=(1) UNICAST flags=0x0000 {}]
2021/10/29 09:28:10.218 ZEBRA: [KFBSR-XYJV1]     rta [len=20 (payload=16) type=(1) DST]
2021/10/29 09:28:10.218 ZEBRA: [STTSM-27M81]       1::1
2021/10/29 09:28:10.218 ZEBRA: [KFBSR-XYJV1]     rta [len=8 (payload=4) type=(6) PRIORITY]
2021/10/29 09:28:10.218 ZEBRA: [Z4E9C-GD9EP]       20
2021/10/29 09:28:10.218 ZEBRA: [KFBSR-XYJV1]     rta [len=8 (payload=4) type=(30) NH_ID]
2021/10/29 09:28:10.218 ZEBRA: [Z4E9C-GD9EP]       9
2021/10/29 09:28:10.218 ZEBRA: [V8KNF-8EXH8] netlink_recv_msg: << netlink message dump [recv]
2021/10/29 09:28:10.218 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=76 type=(2) ERROR flags=(0x0300) {DUMP,(ROOT|REPLACE|CAPPED),(MATCH|EXCLUDE|ACK_TLVS)} seq=9 pid=3539131282]
2021/10/29 09:28:10.218 ZEBRA: [KWP1C-6CSXF]   nlmsgerr [error=(-22) Invalid argument]
2021/10/29 09:28:10.218 ZEBRA: [HSYZM-HV7HF] Extended Error: Gateway can not be a local address
2021/10/29 09:28:10.218 ZEBRA: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Invalid argument, type=RTM_NEWNEXTHOP(104), seq=9, pid=3539131282
2021/10/29 09:28:10.218 ZEBRA: [V8KNF-8EXH8] netlink_recv_msg: << netlink message dump [recv]
2021/10/29 09:28:10.218 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=68 type=(2) ERROR flags=(0x0300) {DUMP,(ROOT|REPLACE|CAPPED),(MATCH|EXCLUDE|ACK_TLVS)} seq=10 pid=3539131282]
2021/10/29 09:28:10.218 ZEBRA: [KWP1C-6CSXF]   nlmsgerr [error=(-22) Invalid argument]
2021/10/29 09:28:10.218 ZEBRA: [HSYZM-HV7HF] Extended Error: Nexthop id does not exist
2021/10/29 09:28:10.218 ZEBRA: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Invalid argument, type=RTM_NEWROUTE(24), seq=10, pid=3539131282
2021/10/29 09:28:10.218 ZEBRA: [VCDW6-A7ZF1] dplane dequeues 2 completed work from provider Kernel
2021/10/29 09:28:10.218 ZEBRA: [JTWAB-1MH4Y] dplane has 2 completed, 0 errors, for zebra main
2021/10/29 09:28:10.218 ZEBRA: [J7K9Z-9M7DT] Nexthop dplane ctx 0x56125390a820, op NH_INSTALL, nexthop ID (9), result FAILURE
2021/10/29 09:28:10.218 ZEBRA: [P2XBZ-RAFQ5][EC 4043309074] Failed to install Nexthop ID (9) into the kernel
2021/10/29 09:28:10.218 ZEBRA: [RMK34-61HV5] default(0:254):1::1/128 Processing dplane result ctx 0x56125390add0, op ROUTE_INSTALL result FAILURE

Note the last line `op ROUTE_INSTALL result FAILURE` because we are attempting to use a
a gw nexthop that is local.  This is the result.

Then the test code was installing the route again:

2021/10/29 09:30:00.493 ZEBRA: [JGWSB-SMNVE] dplane: incoming new work counter: 2
2021/10/29 09:30:00.493 ZEBRA: [Q52A7-211QJ] dplane enqueues 2 new work to provider 'Kernel'
2021/10/29 09:30:00.493 ZEBRA: [JVY1P-93VFY] dplane provider 'Kernel': processing
2021/10/29 09:30:00.493 ZEBRA: [TX9N0-9JKDF] ID (9) Dplane nexthop update ctx 0x561253916a00 op NH_INSTALL
2021/10/29 09:30:00.493 ZEBRA: [PM9ZJ-07RCP] 0:1::1/128 Dplane route update ctx 0x561253915f40 op ROUTE_UPDATE
2021/10/29 09:30:00.493 ZEBRA: [TJ327-ET8HE] netlink_send_msg: >> netlink message dump [sent]
2021/10/29 09:30:00.493 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=104 type=(104) NEWNEXTHOP flags=(0x0501) {REQUEST,DUMP,(ROOT|REPLACE|CAPPED),(ATOMIC|CREATE)} seq=11 pid=3539131282]
2021/10/29 09:30:00.493 ZEBRA: [WCX94-SW894]   nhm [family=(10) AF_INET6 scope=(0) UNIVERSE protocol=(11) ZEBRA flags=0x00000000 {}]
2021/10/29 09:30:00.493 ZEBRA: [KFBSR-XYJV1]     rta [len=8 (payload=4) type=(1) ID]
2021/10/29 09:30:00.493 ZEBRA: [Z4E9C-GD9EP]       9
2021/10/29 09:30:00.493 ZEBRA: [KFBSR-XYJV1]     rta [len=20 (payload=16) type=(6) GATEWAY]
2021/10/29 09:30:00.493 ZEBRA: [STTSM-27M81]       2001::1
2021/10/29 09:30:00.493 ZEBRA: [KFBSR-XYJV1]     rta [len=8 (payload=4) type=(5) OIF]
2021/10/29 09:30:00.493 ZEBRA: [JR4EA-BKPTA]       6
2021/10/29 09:30:00.493 ZEBRA: [KFBSR-XYJV1]     rta [len=6 (payload=2) type=(7) ENCAP_TYPE]
2021/10/29 09:30:00.493 ZEBRA: [JR4EA-BKPTA]       5
2021/10/29 09:30:00.493 ZEBRA: [KFBSR-XYJV1]     rta [len=36 (payload=32) type=(32776) UNKNOWN]
2021/10/29 09:30:00.493 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=56 type=(25) DELROUTE flags=(0x0401) {REQUEST,(ATOMIC|CREATE)} seq=13 pid=3539131282]
2021/10/29 09:30:00.493 ZEBRA: [GCEGC-W8YBF]   rtmsg [family=(10) AF_INET6 dstlen=128 srclen=0 tos=0 table=254 protocol=(194) UNKNOWN scope=(0) UNIVERSE type=(0) UNSPEC flags=0x0000 {}]
2021/10/29 09:30:00.493 ZEBRA: [KFBSR-XYJV1]     rta [len=20 (payload=16) type=(1) DST]
2021/10/29 09:30:00.493 ZEBRA: [STTSM-27M81]       1::1
2021/10/29 09:30:00.493 ZEBRA: [KFBSR-XYJV1]     rta [len=8 (payload=4) type=(6) PRIORITY]
2021/10/29 09:30:00.493 ZEBRA: [Z4E9C-GD9EP]       20
2021/10/29 09:30:00.493 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=64 type=(24) NEWROUTE flags=(0x0401) {REQUEST,(ATOMIC|CREATE)} seq=12 pid=3539131282]
2021/10/29 09:30:00.493 ZEBRA: [GCEGC-W8YBF]   rtmsg [family=(10) AF_INET6 dstlen=128 srclen=0 tos=0 table=254 protocol=(194) UNKNOWN scope=(0) UNIVERSE type=(1) UNICAST flags=0x0000 {}]
2021/10/29 09:30:00.493 ZEBRA: [KFBSR-XYJV1]     rta [len=20 (payload=16) type=(1) DST]
2021/10/29 09:30:00.493 ZEBRA: [STTSM-27M81]       1::1
2021/10/29 09:30:00.493 ZEBRA: [KFBSR-XYJV1]     rta [len=8 (payload=4) type=(6) PRIORITY]
2021/10/29 09:30:00.493 ZEBRA: [Z4E9C-GD9EP]       20
2021/10/29 09:30:00.493 ZEBRA: [KFBSR-XYJV1]     rta [len=8 (payload=4) type=(30) NH_ID]
2021/10/29 09:30:00.493 ZEBRA: [Z4E9C-GD9EP]       9
2021/10/29 09:30:00.493 ZEBRA: [V8KNF-8EXH8] netlink_recv_msg: << netlink message dump [recv]
2021/10/29 09:30:00.493 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=76 type=(2) ERROR flags=(0x0300) {DUMP,(ROOT|REPLACE|CAPPED),(MATCH|EXCLUDE|ACK_TLVS)} seq=11 pid=3539131282]
2021/10/29 09:30:00.493 ZEBRA: [KWP1C-6CSXF]   nlmsgerr [error=(-22) Invalid argument]
2021/10/29 09:30:00.493 ZEBRA: [HSYZM-HV7HF] Extended Error: Gateway can not be a local address
2021/10/29 09:30:00.493 ZEBRA: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Invalid argument, type=RTM_NEWNEXTHOP(104), seq=11, pid=3539131282
2021/10/29 09:30:00.493 ZEBRA: [V8KNF-8EXH8] netlink_recv_msg: << netlink message dump [recv]
2021/10/29 09:30:00.493 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=36 type=(2) ERROR flags=(0x0100) {DUMP,(ROOT|REPLACE|CAPPED)} seq=13 pid=3539131282]
2021/10/29 09:30:00.493 ZEBRA: [KWP1C-6CSXF]   nlmsgerr [error=(-3) No such process]
2021/10/29 09:30:00.493 ZEBRA: [V8KNF-8EXH8] netlink_recv_msg: << netlink message dump [recv]
2021/10/29 09:30:00.493 ZEBRA: [JAS4D-NCWGP] nlmsghdr [len=68 type=(2) ERROR flags=(0x0300) {DUMP,(ROOT|REPLACE|CAPPED),(MATCH|EXCLUDE|ACK_TLVS)} seq=12 pid=3539131282]
2021/10/29 09:30:00.493 ZEBRA: [KWP1C-6CSXF]   nlmsgerr [error=(-22) Invalid argument]
2021/10/29 09:30:00.493 ZEBRA: [VCDW6-A7ZF1] dplane dequeues 2 completed work from provider Kernel
2021/10/29 09:30:00.493 ZEBRA: [JTWAB-1MH4Y] dplane has 2 completed, 0 errors, for zebra main
2021/10/29 09:30:00.493 ZEBRA: [J7K9Z-9M7DT] Nexthop dplane ctx 0x561253916a00, op NH_INSTALL, nexthop ID (9), result FAILURE
2021/10/29 09:30:00.493 ZEBRA: [P2XBZ-RAFQ5][EC 4043309074] Failed to install Nexthop ID (9) into the kernel
2021/10/29 09:30:00.493 ZEBRA: [RMK34-61HV5] default(0:254):1::1/128 Processing dplane result ctx 0x561253915f40, op ROUTE_UPDATE result SUCCESS

Note that this time we do these three operations

a) nexthop installation seq 11
b) route delete seq 13
c) route add seq 12

Note the last line, we report the install as a success but it clearly failed from the seq=12 decode.
When we look at the v6 rib it thinks it is installed:

unet> r1 show ipv6 route
Codes: K - kernel route, C - connected, S - static, R - RIPng,
       O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
       v - VNC, V - VNC-Direct, A - Babel, D - SHARP, F - PBR,
       f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

D>* 1::1/128 [150/0] via 2001::1, dum0, seg6local unspec unknown(seg6local_context2str), seg6 a::, weight 1, 00:00:17

So let's modify nl_batch_read_resp to not dequeue/enqueue the context until we are sure we have
the right one.  This fixes the test code to do the right thing on the second installation.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-04 15:33:58 -05:00
.github .github: improve bug report template 2020-10-20 16:12:03 -04:00
alpine alpine: fix path for daemons file install 2021-08-30 15:21:59 -04:00
babeld *: do not print vrf name for interface config when using vrf-lite 2022-01-24 14:44:05 +03:00
bfdd Merge pull request #10388 from anlancs/bfd-fsm-passive 2022-02-02 13:06:11 +03:00
bgpd bgpd: strncmp -> strcmp in community hash foo 2022-02-02 16:34:03 -05:00
debian debian: adjust the changelog for the dev branch 2021-11-08 11:54:42 -06:00
doc Merge pull request #10495 from anlancs/doc-ospf-range 2022-02-04 15:28:38 +03:00
docker docker: update alpine build enable set own version 2022-01-04 13:14:51 -05:00
eigrpd *: rework renaming the default VRF 2021-12-21 22:09:29 +03:00
fpm build: fix AM_LDFLAGS usage (and gcov) 2021-07-21 17:10:08 +02:00
gdb *: Cleanup some documentation from quagga->frr 2021-11-11 14:41:27 -05:00
grpc build: fix AM_LDFLAGS usage (and gcov) 2021-07-21 17:10:08 +02:00
include include, zebra: Add recent nexthop.h 2021-10-25 14:11:37 -04:00
isisd bgpd,pimd,isisd,nhrpd: Convert to vty_json() 2022-01-31 21:20:41 +02:00
ldpd *: rework renaming the default VRF 2021-12-21 22:09:29 +03:00
lib bgpd: Convert bgp_addpath_encode_[tr]x() to bool from int 2022-02-01 13:31:16 +02:00
m4 grpc: improve checks for GRPC C++ requirements 2021-05-22 00:01:06 +00:00
mlag build: fix AM_LDFLAGS usage (and gcov) 2021-07-21 17:10:08 +02:00
nhrpd bgpd,pimd,isisd,nhrpd: Convert to vty_json() 2022-01-31 21:20:41 +02:00
ospf6d Merge pull request #10373 from anlancs/ospf-add-asbr 2022-02-01 19:04:33 +03:00
ospfclient build: fix AM_LDFLAGS usage (and gcov) 2021-07-21 17:10:08 +02:00
ospfd ospfd: Core in ospf_if_down during shutdown. 2022-02-04 10:26:54 +02:00
pathd pathd: fix typo in pathd/path_ted.c 2021-12-19 11:25:15 +00:00
pbrd *: do not print vrf name for interface config when using vrf-lite 2022-01-24 14:44:05 +03:00
pceplib pceplib: fix style issues 2021-12-06 00:09:13 -05:00
pimd bgpd,pimd,isisd,nhrpd: Convert to vty_json() 2022-01-31 21:20:41 +02:00
pkgsrc *: cleanup .gitignore files 2018-09-08 21:30:42 +02:00
python python: pass conditionals through clippy for DEFPY 2022-01-13 16:01:53 +01:00
qpb build: fix AM_LDFLAGS usage (and gcov) 2021-07-21 17:10:08 +02:00
redhat redhat: logrotate file has typo for staticd 2022-01-24 15:05:48 -05:00
ripd *: rework renaming the default VRF 2021-12-21 22:09:29 +03:00
ripngd *: rework renaming the default VRF 2021-12-21 22:09:29 +03:00
sharpd *: rework renaming the default VRF 2021-12-21 22:09:29 +03:00
snapcraft snapcraft: add missing dependency 2021-08-23 15:08:05 +03:00
staticd staticd: small cleanup 2022-01-31 18:44:17 +08:00
tests Merge pull request #10464 from pguibert6WIND/negogiate 2022-02-01 13:00:54 -05:00
tools tools: Handle new lines for json_object_to_json_string_ext() 2022-01-31 15:34:24 +02:00
vrrpd vrrpd: use ipaddr_is_zero when needed 2022-01-27 21:05:40 +03:00
vtysh zebra: fix segment-routing command not found error with --disable-pathd 2022-01-16 21:46:22 +09:00
watchfrr *: Convert quagga_signal_X to frr_signal_X 2021-11-11 14:41:27 -05:00
yang bfdd,yang: optimize nb with YANG 2022-01-25 04:00:49 -05:00
zebra zebra: Fix v6 route replace failure turned into success 2022-02-04 15:33:58 -05:00
.clang-format *: Add FOREACH_AFI_SAFI_NSF(afi, safi) macro to reduce nesting 2022-01-13 14:29:54 +02:00
.dir-locals.el tests: remove python format block from dir-locals 2021-09-13 10:04:29 -04:00
.dockerignore docker: Make docker image on CentOS 7 2019-11-26 19:29:30 +00:00
.git-blame-ignore-revs tools: Ignore mass renaming of topotests for git blame 2021-05-11 14:14:26 +03:00
.gitignore *: Add some missed make check generated files in .gitignore 2021-09-16 08:13:17 -04:00
.pylintrc tests: micronet: update infra 2021-09-04 09:04:46 -04:00
.travis.yml lib: libyang2 add missed conversion 2021-05-17 22:13:59 -04:00
bootstrap.sh build: turn on automake warnings (& symlinks) 2021-04-21 15:42:37 +02:00
buildtest.sh build: remove --enable-exampledir 2021-06-24 16:42:58 +02:00
config.version.in build: carry --with-pkg-extra-version into tarballs 2018-10-24 15:11:50 +02:00
configure.ac build: FRR 8.3 development version 2022-02-01 11:02:03 +02:00
COPYING *: make consistent & update GPLv2 file headers 2017-05-15 16:37:41 +02:00
COPYING-LGPLv2.1 build: remove LGPL v2.0, add LGPL v2.1 2016-11-15 17:19:38 +09:00
Makefile.am build: fix AM_LDFLAGS usage (and gcov) 2021-07-21 17:10:08 +02:00
README.md doc: Update Documentation to note Solaris Unsupported status 2020-09-21 10:02:20 -04:00
stamp-h.in Initial revision 2002-12-13 20:15:29 +00:00
version.h build: make builddir include path consistent 2021-04-21 15:42:33 +02:00

Icon

FRRouting

FRR is free software that implements and manages various IPv4 and IPv6 routing protocols. It runs on nearly all distributions of Linux and BSD and supports all modern CPU architectures.

FRR currently supports the following protocols:

  • BGP
  • OSPFv2
  • OSPFv3
  • RIPv1
  • RIPv2
  • RIPng
  • IS-IS
  • PIM-SM/MSDP
  • LDP
  • BFD
  • Babel
  • PBR
  • OpenFabric
  • VRRP
  • EIGRP (alpha)
  • NHRP (alpha)

Installation & Use

For source tarballs, see the releases page.

For Debian and its derivatives, use the APT repository at https://deb.frrouting.org/.

Instructions on building and installing from source for supported platforms may be found in the developer docs.

Once installed, please refer to the user guide for instructions on use.

Community

The FRRouting email list server is located here and offers the following public lists:

Topic List
Development dev@lists.frrouting.org
Users & Operators frog@lists.frrouting.org
Announcements announce@lists.frrouting.org

For chat, we currently use Slack. You can join by clicking the "Slack" link under the Participate section of our website.

Contributing

FRR maintains developer's documentation which contains the project workflow and expectations for contributors. Some technical documentation on project internals is also available.

We welcome and appreciate all contributions, no matter how small!

Security

To report security issues, please use our security mailing list:

security [at] lists.frrouting.org