BGP tracks connections based upon the peer. But the problem
with this is that the doppelganger structure for it is being
created. This has introduced a bunch of fragileness in that
the peer exists independently of the connections to it.
The whole point of the doppelganger structure was to allow
BGP to both accept and initiate tcp connections and then
when we get one to a `good` state we collapse into the
appropriate one. The problem with this is that having
2 peer structures for this creates a situation where
we have to make sure we are configing the `right` one
and also make sure that we collapse the two independent
peer structures into 1 acting peer. This makes no sense
let's abstract out the peer into having 2 connection
one for incoming connections and one for outgoing connections
then we can easily collapse down without having to do crazy
stuff. In addition people adding new features don't need
to have to go touch a million places in the code.
This is the start of this abstraction. In this commit
we'll just pull out the fd and input/output buffers
into a connection data structure. Future commits
will abstract further.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Properly free the dynamically allocated memory held by `str` after its use.
The change also maintains the return value of `nb_cli_apply_changes` by using 'ret' variable.
The ASan leak log for reference:
```
***********************************************************************************
Address Sanitizer Error detected in bgp_set_aspath_replace.test_bgp_set_aspath_replace/r1.asan.bgpd.11586
=================================================================
==11586==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 92 byte(s) in 3 object(s) allocated from:
#0 0x7f4e2951db40 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb40)
#1 0x7f4e28f19ea2 in qmalloc lib/memory.c:100
#2 0x7f4e28edbb08 in frrstr_join lib/frrstr.c:89
#3 0x7f4e28e9a601 in argv_concat lib/command.c:183
#4 0x56519adf8413 in set_aspath_replace_access_list_magic bgpd/bgp_routemap.c:6174
#5 0x56519adf8942 in set_aspath_replace_access_list bgpd/bgp_routemap_clippy.c:683
#6 0x7f4e28e9d548 in cmd_execute_command_real lib/command.c:993
#7 0x7f4e28e9da0c in cmd_execute_command lib/command.c:1051
#8 0x7f4e28e9de8b in cmd_execute lib/command.c:1218
#9 0x7f4e28fc4f1c in vty_command lib/vty.c:591
#10 0x7f4e28fc53c7 in vty_execute lib/vty.c:1354
#11 0x7f4e28fcdc8d in vtysh_read lib/vty.c:2362
#12 0x7f4e28fb8c8b in event_call lib/event.c:1979
#13 0x7f4e28efd445 in frr_run lib/libfrr.c:1213
#14 0x56519ac85d81 in main bgpd/bgp_main.c:510
#15 0x7f4e27f40c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
SUMMARY: AddressSanitizer: 92 byte(s) leaked in 3 allocation(s).
***********************************************************************************
***********************************************************************************
Address Sanitizer Error detected in bgp_set_aspath_exclude.test_bgp_set_aspath_exclude/r1.asan.bgpd.10385
=================================================================
==10385==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 55 byte(s) in 2 object(s) allocated from:
#0 0x7f6814fdab40 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb40)
#1 0x7f68149d6ea2 in qmalloc lib/memory.c:100
#2 0x7f6814998b08 in frrstr_join lib/frrstr.c:89
#3 0x7f6814957601 in argv_concat lib/command.c:183
#4 0x5570e05117a1 in set_aspath_exclude_access_list_magic bgpd/bgp_routemap.c:6327
#5 0x5570e05119da in set_aspath_exclude_access_list bgpd/bgp_routemap_clippy.c:836
#6 0x7f681495a548 in cmd_execute_command_real lib/command.c:993
#7 0x7f681495aa0c in cmd_execute_command lib/command.c:1051
#8 0x7f681495ae8b in cmd_execute lib/command.c:1218
#9 0x7f6814a81f1c in vty_command lib/vty.c:591
#10 0x7f6814a823c7 in vty_execute lib/vty.c:1354
#11 0x7f6814a8ac8d in vtysh_read lib/vty.c:2362
#12 0x7f6814a75c8b in event_call lib/event.c:1979
#13 0x7f68149ba445 in frr_run lib/libfrr.c:1213
#14 0x5570e03a0d81 in main bgpd/bgp_main.c:510
#15 0x7f68139fdc86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
SUMMARY: AddressSanitizer: 55 byte(s) leaked in 2 allocation(s).
***********************************************************************************
```
Signed-off-by: Keelan Cannoo <keelan.cannoo@icloud.com>
Upon some internal testing some crashes were found. This fixes
the several crashes and normalizes the code to be closer in
it's execution pre and post changes to use the data plane.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Problem:
On GR helper, paths learnt from an interface based peer were linked
to bnc with ifindex=0. During restart of GR peer, BGP (unnumbered)
session (with GR restarter peer) goes down on GR helper but the routes
are retained. Later, when BGP receives an interface up event, it
will process all the paths associated with BNC whose ifindex matches the
ifindex of the interface for which UP event is received. However, paths
associated with bnc that has ifindex=0 were not being reinstalled since
ifindex=0 doesn't match ifindex of any interfaces. This results in
BGP routes not being reinstalled in zebra and kernel.
Fix:
For paths learnt from an interface based peer, set the
ifindex to peer's interface ifindex so that correct
peer based nexthop can be found and linked to the path.
Signed-off-by: Donald Sharp sharpd@nvidia.com
Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
Create a constant OSPFV3_CRYPTO_PROTO_ID to replace the hard-coded
Cryptographic Protocol ID in the OSPFv3 authentication trailer
code. This enhances code clarity and maintainability.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Ensure the OSPFv3 Cryptographic Protocol ID is encoded in network
byte order when appending it to the authentication key. This solves
interoperability issues with other implementations such as BIRD
and IOS-XR.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
PBR configuration may specify "set nexthop blackhole" which,
for linux dataplanes, is implemented as a table with a blackhole
route.
Other dataplanes might implement this action as an explicit
packet-filtering "drop" action instead of a route. This new flag
PBR_ACTION_DROP is now set when a rule has "set nexthop blackhole"
as an aid to other dataplanes.
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
Relax this handling (RFC 7606) only for eBGP peers.
More details: https://datatracker.ietf.org/doc/html/rfc7606#section-4
There are two error cases in which the Total Attribute Length value
can be in conflict with the enclosed path attributes, which
themselves carry length values:
* In the first case, the length of the last encountered path
attribute would cause the Total Attribute Length to be exceeded
when parsing the enclosed path attributes.
* In the second case, fewer than three octets remain (or fewer than
four octets, if the Attribute Flags field has the Extended Length
bit set) when beginning to parse the attribute. That is, this
case exists if there remains unconsumed data in the path
attributes but yet insufficient data to encode a single minimum-
sized path attribute. <<<< HANDLING THIS CASE IN THIS COMMIT >>>>
In either of these cases, an error condition exists and the "treat-
as-withdraw" approach MUST be used (unless some other, more severe
error is encountered dictating a stronger approach), and the Total
Attribute Length MUST be relied upon to enable the beginning of the
NLRI field to be located.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Before the patch:
```
donatas-laptop(config-router)# bgp confederation
identifier AS number in plain <1-4294967295> or dotted <0-65535>.<0-65535> format
peers Peer ASs in BGP confederation
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When pim is creating an upstream for a S,G that it has received
*but* it has not received a route to the S, the oil is not
scanned to see if it should inherit anything from the *,G
that may be present when it cannot find the correct iif to
use. When the nexthop tracking actually
resolves the route, the oil is never rescanned and the
S,G stream will be missing a correct oil list leading
to absolute mayhem in the network.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When a pim vxlan S,G is created, the code attempts to send out a NULL
register. This is used to build the S,G tree from the RP to the
FHR. Upon initial startup it is not unusual for the pim vxlan state
be fully ready to go but the RP is still not reachable. Let's add
a bit of a pump prime that allows the vxlan code to re-attempt to
send the null register for vxlan S,G's that the RP's outgoing
interface changed from unknown to an actual interface.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
As per RFC7606 section 3g,
g. If the MP_REACH_NLRI attribute or the MP_UNREACH_NLRI [RFC4760]
attribute appears more than once in the UPDATE message, then a
NOTIFICATION message MUST be sent with the Error Subcode
"Malformed Attribute List". If any other attribute (whether
recognized or unrecognized) appears more than once in an UPDATE
message, then all the occurrences of the attribute other than the
first one SHALL be discarded and the UPDATE message will continue
to be processed.
However, notification is sent out currently for all the cases.
Fix:
For cases other than MP_REACH_NLRI & MP_UNREACH_NLRI, handling has been updated
to discard the occurrences other than the first one and proceed with further parsing.
Again, the handling is relaxed only for the EBGP case.
Also, since in case of error, the attribute is discarded &
stream pointer is being adjusted accordingly based on length,
the total attribute length sanity check case has been moved up in the function
to be checked before this case.
Signed-off-by: Samanvitha B Bhargav <bsamanvitha@vmware.com>
As per RFC7606 section 4,
when the total attribute length value is in conflict with the
enclosed attribute length, treat-as-withdraw approach must be followed.
However, notification is being sent out for this case currently,
that leads to session reset.
Fix:
The handling has been updated to conform to treat-as-withdraw
approach only for EBGP case. For IBGP, since we are not following
treat-as-withdraw approach for any of the error handling cases,
the existing behavior is retained for the IBGP.
Signed-off-by: Samanvitha B Bhargav <bsamanvitha@vmware.com>
Upon startup the pim vxlan code initiates a pim null register
send for the S,G and sends a *,G join towards the RP at the same
time. Since a S,G upstream is created in the vxlan code with
the appropriate flags, the *,G join has the embedded S,G RPT
Prune. When an intermediate route receives this *,G RPT Prune
it creates a blackhole S,G route since this particular intermediate
router has not received a join from the RP yet( say the packet is
lost, or that part of the network is slower coming up ).
Let's try to intelligently decide that the S,G RPT Prune
should not be sent as part of the *,G join until the actual
S,G join from the RP reaches this box. Then we can make
intelligent decisions about whether or not to send it
out.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The tests were originally tor --- spine
lets add a tor -- leaf -- spine. At this
point this change was to allow me to test
some funkiness I am seeing in pim vxlan setups
when the leaf is acting as the intermediate routers.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
See the documentation update, but system() calls and
it's ilk block the processing of SIGINT and they are
not properly handled as a result leading to shutdown
issues where one or more daemons never stop.
See aa530b627d as an example
of system call usage removed from the system.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Before this patch:
```
no service cputime-warning
no service cputime-warning
no ipv6 forwarding
no service cputime-warning
no service cputime-warning
no service cputime-warning
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The zebra_rmap_obj was storing the re->metric and allowing
matches against it, but in most cases it was just using 0.
Use the Route entries metric instead. This should fix
some bugs where a match metric never worked.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In all cases the instance is derived from the re pointer
and since the re pointer is already stored, let's just
remove it from the game and cut to the chase.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Replace the source_protocol with just saving a pointer to the re
in the `struct zebra_rmap_obj` data structure.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
With a negative form we get:
```
Internal CLI error [walltime_warning_str]
Internal CLI error [cputime_warning_str]
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The nexthop that is stored already knows it's nexthop and
in all cases the vrf id is derived from the nexthop->vrf_id
let's just cut to the chase and not do this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
If an import table route-map is trying to match against
a particular interface, The code is matching against
the actual vrf the route entry is in -vs- the vrf
the nexthop entry is in. Let's modify the code
to actually allow the import table entry to match
against the nexthops vrf.
Not working:
ip import-table 91
ip import-table 93 route-map FOO
no service integrated-vtysh-config
!
debug zebra events
!
interface green
ip address 192.168.4.3/24
exit
!
route-map FOO permit 10
match interface green
exit
eva# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp13s0, 1d10h07m
T[91]>* 1.2.3.5/32 [15/0] via 192.168.119.1, enp13s0, 00:00:05
K>* 169.254.0.0/16 [0/1000] is directly connected, virbr0 linkdown, 1d16h34m
C>* 192.168.44.0/24 is directly connected, virbr1, 01:30:51
C>* 192.168.45.0/24 is directly connected, virbr2, 01:30:51
C>* 192.168.119.0/24 is directly connected, enp13s0, 1d16h34m
C>* 192.168.122.0/24 is directly connected, virbr0 linkdown, 01:30:51
eva# show ip route table 91
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
VRF default table 91:
K>* 1.2.3.5/32 [0/0] via 192.168.119.1, enp13s0, 00:00:15
eva# show ip route table 93
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
VRF default table 93:
K * 1.2.3.4/32 [0/0] via 192.168.4.5, green (vrf green), 00:03:05
Working:
eva# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp13s0, 00:03:09
T[93]>* 1.2.3.4/32 [15/0] via 192.168.4.5, green (vrf green), 00:02:21
T[91]>* 1.2.3.5/32 [15/0] via 192.168.119.1, enp13s0, 00:02:26
K>* 169.254.0.0/16 [0/1000] is directly connected, virbr0, 00:03:09
C>* 192.168.44.0/24 is directly connected, virbr1, 00:03:09
C>* 192.168.45.0/24 is directly connected, virbr2, 00:03:09
C>* 192.168.119.0/24 is directly connected, enp13s0, 00:03:09
C>* 192.168.122.0/24 is directly connected, virbr0, 00:03:09
eva# show ip route table 91
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
VRF default table 91:
K * 1.2.3.5/32 [0/0] via 192.168.119.1, enp13s0, 00:03:12
eva# show ip route table 93
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
VRF default table 93:
K * 1.2.3.4/32 [0/0] via 192.168.4.5, green (vrf green), 00:03:14
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This structure is really the generic route map object for
handling routemaps in zebra. Let's name it appropriately.
Future commits will consolidate the data to using the
struct route_entry as part of this data instead of copying
bits and bobs of it. This will allow future work to
set/control the route_entry more directly.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Found some code where bgp was not unlocking the dest
and rd_dest when walking the tree attempting to
find something to install.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
RFC 3032 defines:
A value of 2 represents the "IPv6 Explicit NULL Label".
This label value is only legal at the bottom of the label
stack. It indicates that the label stack must be popped,
and the forwarding of the packet must then be based on the
IPv6 header.
Before this patch we set 128, but it was even more wrong, because it was sent
in host-byte order, not the network-byte.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>