When pim creates an upstream for an S,G that it has received
*but* it has no route to the S yet, it cannot find the correct
iif to use, and the oil is not scanned to see if it should
inherit anything from a *,G that may be present.
When nexthop tracking later
resolves the route, the oil is never rescanned, so the
S,G stream is left without a correct oil list, leading
to absolute mayhem in the network.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When a pim vxlan S,G is created, the code attempts to send out a NULL
register. This is used to build the S,G tree from the RP to the
FHR. Upon initial startup it is not unusual for the pim vxlan state
to be fully ready to go while the RP is still not reachable. Let's add
a bit of a pump prime that allows the vxlan code to re-attempt to
send the null register for vxlan S,Gs whose RP outgoing
interface changed from unknown to an actual interface.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
As per RFC7606 section 3g,
g. If the MP_REACH_NLRI attribute or the MP_UNREACH_NLRI [RFC4760]
attribute appears more than once in the UPDATE message, then a
NOTIFICATION message MUST be sent with the Error Subcode
"Malformed Attribute List". If any other attribute (whether
recognized or unrecognized) appears more than once in an UPDATE
message, then all the occurrences of the attribute other than the
first one SHALL be discarded and the UPDATE message will continue
to be processed.
However, a notification is currently sent out in all of these cases.
Fix:
For attributes other than MP_REACH_NLRI & MP_UNREACH_NLRI, handling has been updated
to discard the occurrences other than the first one and proceed with further parsing.
Again, the handling is relaxed only for the EBGP case.
Also, since on error the attribute is discarded and the
stream pointer is adjusted based on its length,
the total attribute length sanity check has been moved up in the function
so that it runs before this case.
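The rule quoted from RFC 7606 can be sketched as below. This is only an illustration in Python, not the bgpd parser: the function name, the tuple-based attribute list, and the returned action strings are hypothetical. It assumes, based on the note that the handling is relaxed only for EBGP, that non-EBGP sessions keep sending the NOTIFICATION for any duplicate.

```python
MP_REACH_NLRI = 14    # attribute type codes from RFC 4760
MP_UNREACH_NLRI = 15


def filter_duplicate_attrs(attrs, is_ebgp=True):
    """Return (action, kept_attrs) for a list of (type_code, value) tuples.

    action is "notify" when a NOTIFICATION ("Malformed Attribute List")
    has to be sent, otherwise "proceed". Hypothetical helper.
    """
    seen = set()
    kept = []
    for type_code, value in attrs:
        if type_code in seen:
            if type_code in (MP_REACH_NLRI, MP_UNREACH_NLRI) or not is_ebgp:
                return "notify", []
            # Any other duplicate: discard it and keep parsing.
            continue
        seen.add(type_code)
        kept.append((type_code, value))
    return "proceed", kept
```

Only the first occurrence of a repeated attribute survives, which matches the "other than the first one SHALL be discarded" wording.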
Signed-off-by: Samanvitha B Bhargav <bsamanvitha@vmware.com>
As per RFC 7606 section 4,
when the total attribute length value is in conflict with the
enclosed attribute length, the treat-as-withdraw approach must be followed.
However, a notification is currently sent out for this case,
which leads to a session reset.
Fix:
The handling has been updated to conform to the treat-as-withdraw
approach for the EBGP case only. For IBGP, since we do not follow the
treat-as-withdraw approach for any of the error handling cases,
the existing behavior is retained.
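A rough sketch of the decision described above, in Python. It is not the bgpd code: the function name, the (header_len, value_len) representation, and the action strings are assumptions made for illustration, and only the "enclosed attribute runs past the total length" form of the conflict is modeled.

```python
def check_attr_lengths(total_len, attrs, is_ebgp):
    """Walk (header_len, value_len) pairs and detect a length conflict.

    Returns "ok" when every attribute fits in total_len; otherwise the
    error action for the session type. Hypothetical helper.
    """
    consumed = 0
    for header_len, value_len in attrs:
        consumed += header_len + value_len
        if consumed > total_len:
            # EBGP: treat-as-withdraw. IBGP: existing behavior is kept.
            return "treat-as-withdraw" if is_ebgp else "notification"
    return "ok"
```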
Signed-off-by: Samanvitha B Bhargav <bsamanvitha@vmware.com>
Upon startup the pim vxlan code initiates a pim null register
send for the S,G and sends a *,G join towards the RP at the same
time. Since a S,G upstream is created in the vxlan code with
the appropriate flags, the *,G join has the embedded S,G RPT
Prune. When an intermediate router receives this *,G RPT Prune
it creates a blackhole S,G route since this particular intermediate
router has not received a join from the RP yet (say the packet is
lost, or that part of the network is slower coming up).
Let's try to intelligently decide that the S,G RPT Prune
should not be sent as part of the *,G join until the actual
S,G join from the RP reaches this box. Then we can make
intelligent decisions about whether or not to send it
out.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The tests were originally tor --- spine.
Let's add a tor -- leaf -- spine. At this
point this change is to allow me to test
some funkiness I am seeing in pim vxlan setups
when the leaf is acting as the intermediate router.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
See the documentation update, but system() calls and
their ilk block the processing of SIGINT, so the signal is
not properly handled, leading to shutdown
issues where one or more daemons never stop.
See aa530b627d as an example
of system call usage removed from the system.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Before this patch:
```
no service cputime-warning
no service cputime-warning
no ipv6 forwarding
no service cputime-warning
no service cputime-warning
no service cputime-warning
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The zebra_rmap_obj was storing the re->metric and allowing
matches against it, but in most cases it was just using 0.
Use the route entry's metric instead. This should fix
some bugs where a match metric never worked.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In all cases the instance is derived from the re pointer
and since the re pointer is already stored, let's just
remove it from the game and cut to the chase.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Replace the source_protocol with just saving a pointer to the re
in the `struct zebra_rmap_obj` data structure.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
With a negative form we get:
```
Internal CLI error [walltime_warning_str]
Internal CLI error [cputime_warning_str]
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The nexthop that is stored already carries its own vrf_id, and
in all cases the vrf id is derived from nexthop->vrf_id.
Let's just cut to the chase and not store it separately.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
If an import table route-map is trying to match against
a particular interface, the code is matching against
the actual vrf the route entry is in -vs- the vrf
the nexthop entry is in. Let's modify the code
to actually allow the import table entry to match
against the nexthop's vrf.
Not working:
ip import-table 91
ip import-table 93 route-map FOO
no service integrated-vtysh-config
!
debug zebra events
!
interface green
ip address 192.168.4.3/24
exit
!
route-map FOO permit 10
match interface green
exit
eva# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp13s0, 1d10h07m
T[91]>* 1.2.3.5/32 [15/0] via 192.168.119.1, enp13s0, 00:00:05
K>* 169.254.0.0/16 [0/1000] is directly connected, virbr0 linkdown, 1d16h34m
C>* 192.168.44.0/24 is directly connected, virbr1, 01:30:51
C>* 192.168.45.0/24 is directly connected, virbr2, 01:30:51
C>* 192.168.119.0/24 is directly connected, enp13s0, 1d16h34m
C>* 192.168.122.0/24 is directly connected, virbr0 linkdown, 01:30:51
eva# show ip route table 91
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
VRF default table 91:
K>* 1.2.3.5/32 [0/0] via 192.168.119.1, enp13s0, 00:00:15
eva# show ip route table 93
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
VRF default table 93:
K * 1.2.3.4/32 [0/0] via 192.168.4.5, green (vrf green), 00:03:05
Working:
eva# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp13s0, 00:03:09
T[93]>* 1.2.3.4/32 [15/0] via 192.168.4.5, green (vrf green), 00:02:21
T[91]>* 1.2.3.5/32 [15/0] via 192.168.119.1, enp13s0, 00:02:26
K>* 169.254.0.0/16 [0/1000] is directly connected, virbr0, 00:03:09
C>* 192.168.44.0/24 is directly connected, virbr1, 00:03:09
C>* 192.168.45.0/24 is directly connected, virbr2, 00:03:09
C>* 192.168.119.0/24 is directly connected, enp13s0, 00:03:09
C>* 192.168.122.0/24 is directly connected, virbr0, 00:03:09
eva# show ip route table 91
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
VRF default table 91:
K * 1.2.3.5/32 [0/0] via 192.168.119.1, enp13s0, 00:03:12
eva# show ip route table 93
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
VRF default table 93:
K * 1.2.3.4/32 [0/0] via 192.168.4.5, green (vrf green), 00:03:14
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This structure is really the generic route map object for
handling routemaps in zebra. Let's name it appropriately.
Future commits will consolidate the data to using the
struct route_entry as part of this data instead of copying
bits and bobs of it. This will allow future work to
set/control the route_entry more directly.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Found some code where bgp was not unlocking the dest
and rd_dest when walking the tree attempting to
find something to install.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
RFC 3032 defines:
A value of 2 represents the "IPv6 Explicit NULL Label".
This label value is only legal at the bottom of the label
stack. It indicates that the label stack must be popped,
and the forwarding of the packet must then be based on the
IPv6 header.
Before this patch we set 128, which was even more wrong because it was sent
in host byte order, not network byte order.
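To illustrate why byte order matters here, the sketch below packs a generic RFC 3032 label stack entry (20-bit label, 3-bit TC, 1-bit bottom-of-stack, 8-bit TTL) in Python. The commit does not say which field or message carried the value, so the function name and the choice of a full 32-bit entry are assumptions for illustration only.

```python
import struct

IPV6_EXPLICIT_NULL = 2  # reserved label value defined by RFC 3032


def encode_label_stack_entry(label, tc=0, bottom=True, ttl=0):
    """Pack one 32-bit MPLS label stack entry in network byte order."""
    entry = (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", entry)


wire = encode_label_stack_entry(IPV6_EXPLICIT_NULL, ttl=64)
# The same integer packed little-endian, as a typical x86 host stores it.
host_order = struct.pack("<I", int.from_bytes(wire, "big"))
```

A receiver that reads `host_order` as if it were network byte order would decode a completely different label, which is the class of bug the message describes.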
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
There is no test that checks for the mpls interface
configuration.
The new test checks that mpls configuration per
interface works when value is enabled or disabled.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The yang NB API does not handle the mpls configuration
on its leaf.
Add an mpls leaf that mirrors the mpls configuration:
- true or false means mpls is explicitly configured
- not defined means no config.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The 'no mpls' command wrongly assumes the user wants to disable
the mpls handling on the interface whereas this is just a config
knob that should mean 'I don't care about mpls'.
Fix this by adding a 'disable' option to the mpls command.
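The resulting tri-state can be pictured with a small Python sketch. The function name is hypothetical, and the exact command strings are an assumption derived from the `mpls enable` command named in the Fixes line and the 'disable' option added here.

```python
from typing import Optional


def mpls_cli_to_config(command: str) -> Optional[bool]:
    """Map an interface-level mpls command to a tri-state value."""
    if command == "mpls enable":
        return True
    if command == "mpls disable":
        return False
    if command == "no mpls":
        return None  # no config: "I don't care about mpls"
    raise ValueError(f"unknown command: {command}")
```

Keeping `None` distinct from `False` is the point: removing the knob is not the same as explicitly disabling mpls.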
Fixes: 39ffa8e8e8 ("zebra: Add a `mpls enable` interface node command")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The bug is reproduced when switching interfaces between VRFs.
ospf6d is enabled and configured in each VRF.
'dest' can be removed from the route node while the same
route node is waiting to be processed in another sub-queue.
A route node must only be in one sub-queue at a time.
Details:
1. Config:
interface if0
ipv6 address 2001:db8:cafe:2::2/64
ipv6 nat inside
ipv6 ospf6 area 0.0.0.51
ipv6 ospf6 cost 10
vrf test2
exit
!
interface if1
ipv6 address 2001:db8:cafe:4::1/64
ipv6 nat outside
ipv6 ospf6 area 0.0.0.0
ipv6 ospf6 cost 10
vrf test2
exit
!
router ospf6
ospf6 router-id 2.2.2.2
exit
!
router ospf6 vrf test1
ospf6 router-id 2.2.2.2
exit
!
router ospf6 vrf test2
ospf6 router-id 2.2.2.2
exit
I just quickly switched interfaces between different VRFs (default/test1/test2).
2. Log messages:
Aug 02 16:51:56 ubuntu zebra[386985]: [MFYWV-KH3MC] process_subq_early_route_add: (0:?):2001:db8:cafe:2::/64: Inserting route rn 0x56267593de90, re 0x56267595ae40 (connected) existing 0x0, same_count 0
Aug 02 16:51:56 ubuntu zebra[386985]: [Q4T2G-E2SQF] process_subq_early_route_add: dumping RE entry 0x56267595ae40 for 2001:db8:cafe:2::/64 vrf default(0)
Aug 02 16:51:56 ubuntu zebra[386985]: [GCGMT-SQR82] rib_link: (0:?):2001:db8:cafe:2::/64: rn 0x56267593de90 adding dest
Aug 02 16:51:56 ubuntu zebra[386985]: [JF0K0-DVHWH] rib_meta_queue_add: (0:254):2001:db8:cafe:2::/64: queued rn 0x56267593de90 into sub-queue Connected Routes
Aug 02 16:51:56 ubuntu zebra[386985]: [QE6V0-J8BG5] rib_delnode: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90, re 0x56267595ae40, removing
Aug 02 16:51:56 ubuntu zebra[386985]: [KMPGN-JBRKW] rib_meta_queue_add: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90 is already queued in sub-queue Connected Routes
Aug 02 16:51:56 ubuntu zebra[386985]: [MFYWV-KH3MC] process_subq_early_route_add: (0:254):2001:db8:cafe:2::/64: Inserting route rn 0x56267593de90, re 0x56267595abf0 (ospf6) existing 0x0, same_count 1
Aug 02 16:51:56 ubuntu zebra[386985]: [Q4T2G-E2SQF] process_subq_early_route_add: dumping RE entry 0x56267595abf0 for 2001:db8:cafe:2::/64 vrf default(0)
Aug 02 16:51:56 ubuntu zebra[386985]: [KMPGN-JBRKW] rib_meta_queue_add: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90 is already queued in sub-queue Connected Routes
Aug 02 16:51:56 ubuntu zebra[386985]: [YEYFX-TDSC2] process_subq_early_route_add: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90, removing unneeded re 0x56267595ae40
Aug 02 16:51:56 ubuntu zebra[386985]: [Y53JX-CBC5H] rib_unlink: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90, re 0x56267595ae40
Aug 02 16:51:56 ubuntu zebra[386985]: [QE6V0-J8BG5] rib_delnode: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90, re 0x56267595abf0, removing
Aug 02 16:51:56 ubuntu zebra[386985]: [JF0K0-DVHWH] rib_meta_queue_add: (0:254):2001:db8:cafe:2::/64: queued rn 0x56267593de90 into sub-queue RIP/OSPF/ISIS/EIGRP/NHRP Routes
Aug 02 16:51:56 ubuntu zebra[386985]: [NZNZ4-7P54Y] default(0:254):2001:db8:cafe:2::/64: Processing rn 0x56267593de90
Aug 02 16:51:56 ubuntu zebra[386985]: [ZJVZ4-XEGPF] default(0:254):2001:db8:cafe:2::/64: Examine re 0x56267595abf0 (ospf6) status: Removed Changed flags: None dist 110 metric 10
Aug 02 16:51:56 ubuntu zebra[386985]: [NM15X-X83N9] rib_process: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90, removing re 0x56267595abf0
Aug 02 16:51:56 ubuntu zebra[386985]: [Y53JX-CBC5H] rib_unlink: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90, re 0x56267595abf0
Aug 02 16:51:56 ubuntu zebra[386985]: [KT8QQ-45WQ0] rib_gc_dest: (0:?):2001:db8:cafe:2::/64: removing dest from table
Aug 02 16:51:56 ubuntu zebra[386985]: [HH6N2-PDCJS] default(0:0):2001:db8:cafe:2::/64 rn 0x56267593de90 dequeued from sub-queue Connected Routes
3. ...and then assert:
(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140662163115136) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=140662163115136) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=140662163115136, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007fee76753476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x00007fee767397f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x00007fee76a420fd in _zlog_assert_failed () from target:/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#6 0x0000562674efe0f0 in process_subq_route (qindex=7 '\a', lnode=0x562675940c60) at zebra/zebra_rib.c:2540
#7 process_subq (qindex=META_QUEUE_NOTBGP, subq=0x562675574580) at zebra/zebra_rib.c:3055
#8 meta_queue_process (dummy=<optimized out>, data=0x56267556d430) at zebra/zebra_rib.c:3091
#9 0x00007fee76a386e8 in work_queue_run () from target:/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#10 0x00007fee76a31c91 in thread_call () from target:/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#11 0x00007fee769ee528 in frr_run () from target:/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#12 0x0000562674e97ec5 in main (argc=5, argv=0x7ffd1e275958) at zebra/main.c:478
(gdb) print lnode->data
$10 = (void *) 0x56267593de90
(gdb) p/x *(struct route_node *)0x56267593de90
$11 = {
p = {
family = 0xa,
prefixlen = 0x40,
u = {
prefix = 0x20,
prefix4 = {
s_addr = 0xb80d0120
},
prefix6 = {
__in6_u = {
__u6_addr8 = {0x20, 0x1, 0xd, 0xb8, 0xca, 0xfe, 0x0, 0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
__u6_addr16 = {0x120, 0xb80d, 0xfeca, 0x200, 0x0, 0x0, 0x0, 0x0},
__u6_addr32 = {0xb80d0120, 0x200feca, 0x0, 0x0}
}
},
...
table = 0x5626755ae010,
parent = 0x5626755ae070,
link = {0x0, 0x0},
lock = 0x4,
nodehash = {
hi = {
next = 0x5626755ae0d0,
hashval = 0xebe8bdbf
}
},
info = 0x0
4. What happens:
We removed the unneeded re 0x56267595ae40 while adding re 0x56267595abf0. It was the last connected re,
but rn 0x56267593de90 is still in the connected sub-queue.
Then rib_delnode was called for 0x56267595abf0 (rn 0x56267593de90 is still in the connected sub-queue).
rib_delnode called rib_meta_queue_add, which checked that rn was absent from sub-queue RIP/OSPF/ISIS/EIGRP/NHRP
and added rn to this second sub-queue.
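The invariant "a route node must only be in one sub-queue at a time" can be modeled with a toy Python class. This is not the zebra code and not necessarily how the fix is implemented; the class, its fields, and the choice to reject the second add are assumptions made to show the idea of checking membership across all sub-queues.

```python
from collections import deque


class MetaQueue:
    """Toy model: a route node may sit in at most one sub-queue."""

    def __init__(self, num_subqueues):
        self.subq = [deque() for _ in range(num_subqueues)]
        self.queued_in = {}  # route node -> index of the sub-queue holding it

    def add(self, rn, qindex):
        # Check membership across ALL sub-queues, not just the target one.
        if rn in self.queued_in:
            return False
        self.subq[qindex].append(rn)
        self.queued_in[rn] = qindex
        return True

    def pop(self, qindex):
        rn = self.subq[qindex].popleft()
        del self.queued_in[rn]
        return rn
```

With a per-sub-queue check instead, the second `add()` in the scenario above would succeed and the node would be processed twice, once after its dest was already gone.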
Fixes: d7ac4c4d88 ("zebra: Introduce early route processing on the MetaQ")
Signed-off-by: Pavel Ivashchenko <pivashchenko@nfware.com>
Before now, PBRD used non-zero values to imply that a rule's
match or action field was active. This approach was getting
cumbersome for fields where 0 is a valid active value and
various field-specific magic values had to be used.
This commit changes PBRD to use a flag bit per field to
indicate that the field is active.
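A minimal sketch of the flag-bit-per-field idea, in Python rather than the C used by PBRD. The class and field names are hypothetical and only show why a separate "active" bit removes the need for magic values when 0 is a valid setting.

```python
from dataclasses import dataclass
from enum import IntFlag


class FilterField(IntFlag):
    SRC_PORT = 1 << 0
    DSCP = 1 << 1


@dataclass
class Filter:
    active: FilterField = FilterField(0)  # which fields are in use
    src_port: int = 0
    dscp: int = 0

    def set_dscp(self, value):
        self.dscp = value
        self.active |= FilterField.DSCP

    def dscp_is_active(self):
        return bool(self.active & FilterField.DSCP)
```

Here `set_dscp(0)` marks the field active even though its value is zero, which a "non-zero means active" scheme cannot express.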
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
In the netlink-mediated kernel dataplane, each rule is stored
in either an IPv4-specific database or an IPv6-specific database.
PBRD opportunistically gleans each rule's address family value
from its source or destination IP address match value (if either
exists), or from its nexthop or nexthop-group (if it exists).
The 'family' value is particularly needed for netlink during
incremental rule deletion when none of the above fields remain set.
Before now, this address family has been encoded by occult means
in the (possibly otherwise unset) source/destination IP match
fields in ZAPI and zebra.
This commit documents the reasons for maintaining the 'family'
field in the PBRD rule structure, adds a 'family' field in the
common lib/pbr.h rule structure, and carries it explicitly in ZAPI.
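The gleaning order described above might look like the following Python sketch. The function and parameter names are hypothetical, nexthop-groups are left out, and the fallback to a previously remembered family is an assumption that stands in for the explicit 'family' field.

```python
import ipaddress
import socket


def glean_family(src=None, dst=None, nexthop=None, previous=None):
    """Return AF_INET/AF_INET6 from the first field that is set."""
    for value in (src, dst, nexthop):
        if value is not None:
            version = ipaddress.ip_network(value, strict=False).version
            return socket.AF_INET if version == 4 else socket.AF_INET6
    # Nothing left to glean from: keep the family remembered earlier.
    return previous
```

The last branch is the incremental-deletion case: once every address field is unset, only a stored family can tell netlink which database to touch.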
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
DSCP and ECN matching are configured independently. Maintain
these values in independent fields in pbrd, zapi, and zebra.
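For reference, DSCP and ECN share one octet in the IP header (DSCP in the upper six bits, ECN in the lower two), which is why they are easy to conflate. The helper names below are hypothetical and do not reflect how pbrd, zapi, or zebra store the values.

```python
def split_tos(tos):
    """Split the 8-bit TOS / Traffic Class octet into (dscp, ecn)."""
    return tos >> 2, tos & 0x3


def join_tos(dscp, ecn):
    """Combine independent DSCP and ECN values back into one octet."""
    return (dscp << 2) | ecn
```

Keeping the two values in separate fields means a rule can match on one without implying anything about the other.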
Signed-off-by: G. Paul Ziemba <paulz@labn.net>