mirror_frr

mirror of https://git.proxmox.com/git/mirror_frr synced 2025-12-31 15:23:54 +00:00

Author	SHA1	Message	Date
Donald Sharp	bdb832b489	bgpd: t_holdtime move to peer_connection The t_holdtime event pointer belongs in the peer connection Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	904c98c4d9	bgpd: move t_start into peer_connection The t_start event pointer belongs on the peer_connection Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	b8f3b2cd4a	bgpd: move t_delayopen from peer to peer_connection This belongs in peer_connection let's move it. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	a8888edd42	bgpd: t_connect conversion from peer to peer_connect Move t_connect into struct peer_connect Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donatas Abraitis	e89fd723ee	Merge pull request #14118 from GaladrielZhao/master bgpd: Convert from struct bgp_node to struct bgp_dest	2023-08-30 17:43:29 +03:00
Donatas Abraitis	7c4ed2a719	bgpd: Add a warning for the operator that keepalive was changed ``` donatas-pc(config-router)# timers bgp 8 12 % keeplive value 8 is larger than 1/3 of the holdtime, setting to 4 donatas-pc(config-router)# do sh run \| include timers bgp timers bgp 4 12 donatas-pc(config-router)# ``` Closes https://github.com/FRRouting/frr/issues/14287 Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-08-29 15:14:07 +03:00
Yuqing Zhao	6e7f305e54	bgpd: Convert from struct bgp_node to struct bgp_dest This is based on @donaldsharp's work The current code base is the struct bgp_node data structure. The problem with this is that it creates a bunch of extra data per route_node. The table structure generates ‘holder’ nodes that are never going to receive bgp routes, and now the memory of those nodes is allocated as if they are a full bgp_node. After splitting up the bgp_node into bgp_dest and route_node, the memory of ‘holder’ node which does not have any bgp data will be allocated as the route_node, not the bgp_node, and the memory usage is reduced. The memory usage of BGP node will be reduced from 200B to 96B. The total memory usage optimization of this part is ~16.00%. Signed-off-by: Donald Sharp <sharpd@nvidia.com> Signed-off-by: Yuqing Zhao <xiaopanghu99@163.com>	2023-08-22 09:35:46 +08:00
Donald Sharp	3e5a31b24e	bgpd: Convert `struct peer_connection` to dynamically allocated As part of the conversion to a `struct peer_connection` it will be desirable to have 2 pointers one for when we open a connection and one for when we receive a connection. Start this actual conversion over to this in `struct peer`. If this sounds confusing take a look at the bgp state machine for connections and how it resolves the processing of this router opening -vs- this router receiving an open. At some point in time the state machine decides that we are keeping one of the two connections. Future commits will allow us to untangle the peer/doppelganger duality with this abstraction. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	5d52756735	bgpd: Move t_process_packet and t_process_packet_error to connection The t_process_packet thread events should be managed by the connection. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	e20c23fa5b	bgpd: Move status and ostatus to `struct peer_connection` The status and ostatus are a function of the `struct peer_connection` move it into that data structure. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	71d72c4998	bgpd: READ and WRITE flags are a part of the connection Move PEER_THREAD_WRITES_ON and PEER_THREAD_READS_ON to be a part of the `struct peer_connection` since this is a connection oriented bit of data. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	c528b3b153	bgpd: Move t_write and t_read into `struct peer_connection` Move the peer->t_write and peer->t_read into `struct peer_connection` as these are properties of the connection. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	84d1abd3d9	bgpd: Add peer backpointer to `struct peer_connection` We will need the peer backpointer for a `struct peer_connection` Let's add it in. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	3b2d89b0a3	bgpd: Create destructor function for `struct peer_connection` Create a destructor function to free up memory associated with the io buffers. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	1f32eb30d9	bgpd: Start abstraction of `struct peer_connection` BGP tracks connections based upon the peer. But the problem with this is that the doppelganger structure for it is being created. This has introduced a bunch of fragility in that the peer exists independently of the connections to it. The whole point of the doppelganger structure was to allow BGP to both accept and initiate tcp connections and then when we get one to a `good` state we collapse into the appropriate one. The problem with this is that having 2 peer structures for this creates a situation where we have to make sure we are configuring the `right` one and also make sure that we collapse the two independent peer structures into 1 acting peer. This makes no sense; let's abstract out the peer into having 2 connections, one for incoming connections and one for outgoing connections, so that we can easily collapse down without having to do crazy stuff. In addition, people adding new features don't have to go touch a million places in the code. This is the start of this abstraction. In this commit we'll just pull out the fd and input/output buffers into a connection data structure. Future commits will abstract further. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	052debc3ee	bgpd: Have bgp notice the zebra ability to use v6_with_v4_nexthops Store the data. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-03 08:25:20 -04:00
Donald Sharp	73b66bed83	bgpd: The last_reset_cause in the peer structure is too large The last_reset_cause is a plain old BGP_MAX_PACKET_SIZE buffer that is really enlarging the peer data structure. Let's just copy the stream that failed and only allocate how ever much the packet size actually was. While it's likely that we have a reset reason, the packet typically is not going to be 65k in size. Let's save space. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-07-24 22:41:14 -04:00
Donald Sharp	bdc1762405	bgpd: Replace peer->ibuf_scratch The peer->ibuf_scratch was allocating 65535 * 10 bytes for scratch space to hold data incoming from a read from a peer. When you have 4k peers this is 2,621,400,000 bytes, or about 2.6 GB of data, which is crazy large, especially since the i/o pthread is reading per peer without any chance of having the data interfere with other reads. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-07-21 13:10:03 -04:00
Donald Sharp	c81d6d4d5f	bgpd: Remove peer->sync array It is never used. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-07-21 12:41:35 -04:00
Donald Sharp	acf4defcd8	bgpd: Remove peer->obuf_work This is never used. Free up another 65k of stream data never used per peer. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-07-21 12:30:20 -04:00
Donald Sharp	b157af0ac1	bgpd: Remove peer->scratch This was only ever being allocated and de-allocated. Let's save 65k per peer Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-07-21 12:14:59 -04:00
Donatas Abraitis	30db544508	bgpd: Send software-version capability by default Useful to have it for datacenter profile only, disabled for traditional. If the peer is not established or established, but has no description set, we will show the FRR version instead, which is kinda handy to have instead of nothing. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-07-18 09:42:48 +03:00
Donatas Abraitis	c76f6146ab	bgpd: Deprecate Prestandard Outbound Route Filtering capability https://www.rfc-editor.org/rfc/rfc8810.html Not relevant anymore. Use RFC'd version of ORF. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-07-07 23:41:43 +03:00
Donatas Abraitis	04dfcb14ff	bgpd: Deprecate Prestandard Route Refresh capability (128) More details: https://www.rfc-editor.org/rfc/rfc8810.html Not sure if we want to maintain the old code more. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-07-07 16:19:54 +03:00
Donatas Abraitis	2b768c5295	bgpd: Retry connecting to synchronous label manager if not ready Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-06-20 20:50:38 +03:00
Donatas Abraitis	0043ebab99	bgpd: Use synchronous way to get labels from Zebra Both the label manager and table manager zapi code send data requests via zapi to zebra and then immediately listen for a response from zebra. The problem here is of course that the listen part is throwing away any zapi command that is not the one it is looking for. ISIS/OSPF and PIM all have synchronous abilities via zapi, which they all do through a special zapi connection to zebra. BGP needs to follow this model as well. Additionally, once the new zclient_sync connection is created, a once-a-second timer should wake up and read any data on the socket to prevent too much data from accumulating in it. ``` r3# sh bgp labelpool summary Labelpool Summary ----------------- Ledger: 3 InUse: 3 Requests: 0 LabelChunks: 1 Pending: 128 Reconnects: 1 r3# sh bgp labelpool inuse Prefix Label --------------------------- 10.0.0.1/32 16 192.168.31.0/24 17 192.168.32.0/24 18 r3# ``` Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-06-20 20:50:10 +03:00
Russ White	4d9fb376c8	Merge pull request #13728 from opensourcerouting/fix/addpath_drop_non_best_addpaths bgpd: Implement neighbor X addpath-tx-best-selected command	2023-06-20 09:20:36 -04:00
Russ White	68da3eab07	Merge pull request #13524 from pguibert6WIND/mpls_vpn_lsr_redistribute MPLS vpn LSR redistribute	2023-06-20 09:13:33 -04:00
Russ White	56a10caa03	Merge pull request #12971 from taspelund/trey/mac_vrf_soo_upstream bgpd: Add MAC-VRF Site-of-Origin support	2023-06-20 09:08:28 -04:00
Louis Scalbert	29b49f67eb	bgpd: add mpls vpn nh label bind cache struct and apis In the context of the ASBR facing an EBGP neighbor, or facing an IBGP neighbor where the BGP updates received are re-advertised with a modified next-hop, a new local label will be re-advertised too, to replace the original one. Create a binding table, in the form of a hash list, from the original labels to the new labels. Since labels can be the same on several routers, set the next-hop and the label as the keys. Add the needed API functions to manage the hash list. Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com> Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2023-06-16 10:54:58 +02:00
Louis Scalbert	ef1fc25431	bgpd: add 'mpls bgp l3vpn-multi-domain-switching' command When acting as intermediate device for BGP signaling, and as transit device for data traffic, the device is not able to modify the label value from incoming MPLS VPN updates: - as BGP device, modifying the label value is necessary when redistributing VPN prefixes with its own next-hop. - as transit device that connects two ethernet segments on separate interfaces, the return MPLS traffic must be handled: the modified label value must be swapped with the original label value and sent back to the original next-hop. The border router use case can be taken as example, when it acts both as transit and as BGP device: - When receiving updates from a border router peer, and where interior traffic is expected to transit through the local border router. - When receiving updates from interior devices, and where exterior traffic will transit through the local border router. In those two situations, a new label is bound to the received entry, and the entry is advertised to a new peer with the new label. In the same time, an MPLS entry is created to handle return traffic with the new mpls label: the traffic would be swapped to the original MPLS label and the original next-hop. This is the first commit of a series of patches, that address the above mentioned issue. The first commit introduces a new per-interface command: > interface eth0 > [no] mpls bgp l3vpn-multi-domain-switching > exit This command will authorise mpls vpn updates to have a new label value bound to the mpls vpn routes received over that interface. Link: https://www.rfc-editor.org/rfc/rfc3107.html#section-3 > When a BGP speaker redistributes a route, the label(s) assigned to > that route must not be changed (except by omission), unless the > speaker changes the value of the Next Hop attribute of the route. Link: https://www.rfc-editor.org/rfc/rfc3031.html#section-4.6 Link: https://www.rfc-editor.org/rfc/rfc4364.html#section-10 sub-chapter b. Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com> Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2023-06-16 10:54:58 +02:00
Donatas Abraitis	78981a80c7	bgpd: Implement `neighbor X addpath-tx-best-selected` command When using `addpath-tx-all` BGP announces all known paths instead of announcing only an arbitrary number of best paths. With this new command we can send N best paths to the neighbor. That means, we send the best path, then send the second best path excluding the previous one, and so on. In other words, we run best path selection algorithm N times before we finish. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-06-07 22:27:29 +03:00
Donatas Abraitis	d49700dd2f	bgpd: Add an ability to control default-originate route-map timer By default it's 5 seconds. That means, every 5 seconds it iterates over the whole BGP table and checks if a route-map is kicked in (if route-map is defined). Having a full feed with many neighbors, this is a huge CPU-killer, and takes a lot of time. Thread statistics for bgpd: Showing statistics for pthread default -------------------------------------- CPU (user+system): Real (wall-clock): Active Runtime(ms) Invoked Avg uSec Max uSecs Avg uSec Max uSecs CPU_Warn Wall_Warn Starv_Warn Type Thread 0 0.487 10 48 84 49 85 0 0 0 T (bgp_connect_timer) 0 0.000 1 0 0 1 1 0 0 0 T bgp_startup_timer_expire 2 3.991 276 14 1032 14 1031 0 0 0 R zclient_read 0 0.010 4 2 6 3 6 0 0 0 E _bfd_sess_send 0 0.057 11 5 26 6 26 0 0 0 W vtysh_write 0 65.054 136 478 28907 484 28914 0 0 0 E bgp_event 0 11233.040 24 468043 2772209 1341293 7781145 0 3 0 T subgroup_coalesce_timer 2 3.649 33 110 394 111 395 0 0 0 R bgp_accept 0 468.837 5 93767 178929 93799 178960 0 0 0 T (bgp_graceful_stale_timer_expire) 0 0.462 9 51 77 51 78 0 0 0 T (bgp_start_timer) 1 415.825 14200 29 414 29 415 0 0 0 R vtysh_accept 0 0.052 3 17 47 18 49 0 0 1 T bgp_config_finish 0 0.011 1 11 11 12 12 0 0 0 E frr_config_read_in 0 0.022 4 5 8 6 9 0 0 0 E bgp_nht_ifp_initial 0 0.121 44 2 64 3 65 0 0 0 T (bgp_routeadv_timer) 0 34194.454 3 11398151 21874014 27937411 52641827 2 0 1 T bgp_route_map_update_timer 0 13246.820 8 1655852 3065476 4589606 8454782 0 4 1 T bgp_announce_route_timer_expired 0 0.035 2 17 26 18 27 0 0 0 E zclient_connect 0 279624.026 318778 877 571779 2808 1639624 0 0 5 T work_queue_run 0 0.097 32 3 21 3 23 0 0 0 RW bgp_connect_check 2 6005.738 43560 137 680012 138 680446 0 0 0 R vtysh_read 0 1605.840 1116298 1 1331 2 10152 0 0 133 T (bgp_generate_updgrp_packets) 0 1073.162 17 63127 222065 63175 222087 0 0 0 E bgp_packet_process_error 1 16744058.262 10691 1566182 1807248 1566900 1808301 0 0 5 T update_group_refresh_default_originate_route_map 0 0.000 11 0 0 0 1 0 0 0 T update_subgroup_merge_check_thread_cb 0 94544.034 1898726 49 225054 69 225156 0 0 0 E bgp_process_packet Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-05-31 22:58:30 +03:00
Trey Aspelund	67b493a5b3	bgpd: generalize EVPN martian nexthop changes Currently we have a handler function that will walk the global EVPN rib and unimport/remove routes matching a local IP/TIP. This generalizes this function so that it can be re-used for other BGP Martian entry types. Now this can be used to unimport routes when the MAC-VRF SoO is reconfigured. Signed-off-by: Trey Aspelund <taspelund@nvidia.com>	2023-05-30 15:20:35 +00:00
Donatas Abraitis	bdf8b8dda9	bgpd: Show the real table version for a decent peer subgroup Without the patch: ``` Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc 192.168.1.2 4 65002 4 5 2 0 0 00:00:45 1 1 N/A 192.168.1.3 4 65003 5 5 2 0 0 00:00:45 0 2 N/A 192.168.1.4 4 65004 5 5 2 0 0 00:00:45 0 2 N/A ``` With the patch: ``` Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc 192.168.1.2 4 65002 6 6 2 0 0 00:01:05 0 1 N/A 192.168.1.3 4 65003 7 7 3 0 0 00:01:05 0 1 N/A 192.168.1.4 4 65004 7 7 3 0 0 00:01:05 0 1 N/A ``` JSON output is also fixed: ``` munet> r1 shi vtysh -c 'sh ip bgp sum json' \| grep version -i "tableVersion":3, "version":4, "tableVersion":2, "version":4, "tableVersion":3, "version":4, "tableVersion":3, munet> ``` Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-05-15 21:46:41 +03:00
Philippe Guibert	546d58702e	bgpd: add the bgp_label_per_nexthop_cache struct and apis This commit introduces the necessary structs and apis to create the cache entries that store the label information associated to a given nexthop. A hash table is created in each BGP instance for all the AFIs: IPv4 and IPv6. That hash table is initialised. An API to look and/or create an entry based on a given nexthop. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2023-05-09 21:00:57 +02:00
Philippe Guibert	d4cdcee5bf	bgpd: add vty command to select label allocation per nexthop A new VTY command is introduced in ipv4 unicast and ipv6 unicast address family, under a BGP instance. > r1# label vpn export allocation-mode per-nexthop\|per-vrf This command will update the label values associated for each BGP update to export to the global instance. Two modes are available: per-nexthop and per-vrf. The latter is the default one. With this commit only, configuring label allocation per nexthop will only reset the BGP updates, and the per-vrf mode label allocation will be chosen. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2023-05-09 21:00:57 +02:00
Donatas Abraitis	786e2b8bdb	Revert "MPLS allocation mode per next hop" Broken tests, let's revert now. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-05-03 13:52:46 +03:00
Donatas Abraitis	99a1ab0b21	Merge pull request #12646 from pguibert6WIND/mpls_alloc_per_nh MPLS allocation mode per next hop	2023-05-02 18:36:45 +03:00
Philippe Guibert	cf1c7e309e	bgpd: configure explicit-null for local paths per address family Until now, the bgp local paths were using the default null label defined. It was not possible to select the null label for the ipv4 or the ipv6 address families. This commit addresses this issue by adding two extra parameters to the BGP labeled-unicast command. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2023-04-27 17:05:35 +02:00
Philippe Guibert	7ee70320d3	bgpd: add cli command to control explicit-null label usage In BGP labeled unicast address-family, it is not possible to send explicit-null label values with redistributed or network declared prefixes. A new CLI command is introduced: > [no] bgp labeled-unicast explicit-null When used, the explicit-null value for IPv4 ('0' value) or IPv6 ('2' value) will be used. It is necessary to reconfigure the networks or the redistribution in order to inherit this new behaviour. Add the documentation. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2023-04-11 16:08:09 +02:00
Donald Sharp	cd9d053741	*: Convert `struct event_master` to `struct event_loop` Let's find a better name for it. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-03-24 08:32:17 -04:00
Donald Sharp	2453d15dbf	*: Convert struct thread_master to struct event_master and its ilk Convert the `struct thread_master` to `struct event_master` across the code base. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-03-24 08:32:17 -04:00
Donald Sharp	e6685141aa	*: Rename `struct thread` to `struct event` Effectively a massive search and replace of `struct thread` to `struct event`. Using the term `thread` gives people the thought that this event system is a pthread when it is not Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-03-24 08:32:17 -04:00
Philippe Guibert	4a3243116b	bgpd: add the bgp_label_per_nexthop_cache struct and apis This commit introduces the necessary structs and apis to create the cache entries that store the label information associated to a given nexthop. A hash table is created in each BGP instance for all the AFIs: IPv4 and IPv6. That hash table is initialised. An API to look and/or create an entry based on a given nexthop. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2023-03-22 12:06:29 +01:00
Philippe Guibert	bbae0bb042	bgpd: add vty command to select label allocation per nexthop A new VTY command is introduced in ipv4 unicast and ipv6 unicast address family, under a BGP instance. > r1# label vpn export allocation-mode per-nexthop\|per-vrf This command will update the label values associated for each BGP update to export to the global instance. Two modes are available: per-nexthop and per-vrf. The latter is the default one. With this commit only, configuring label allocation per nexthop will only reset the BGP updates, and the per-vrf mode label allocation will be chosen. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2023-03-22 12:06:29 +01:00
Donatas Abraitis	5acfd822be	tests: Check if peer->af_flags can be higher than uint32_t Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-02-24 00:24:20 +02:00
Donatas Abraitis	47017b846f	bgpd: Renumber peer->af_flags to be without any gaps Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-02-23 23:52:08 +02:00
Donatas Abraitis	2c722516c3	bgpd: Convert peer_af_flag_check() to bool Since we increased peer->af_flags from uint32_t to uint64_t, peer_af_flag_check() was historically returning an integer, and not a bool as it should. The bug was that if we have af_flags higher than uint32_t it would never return the right value. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-02-23 22:54:12 +02:00
Russ White	ba755d35e5	Merge pull request #12248 from pguibert6WIND/bgpasdot lib, bgp: add initial support for asdot format	2023-02-21 08:01:03 -05:00
Donatas Abraitis	5cb8497795	bgpd: Convert flags_invert/flags_override to uint64_t peer->af_flags got this correctly. peer->flags were already converted a time ago, but these were missed... Let's fix this. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-02-19 12:28:54 +02:00
Donald Sharp	8383d53e43	Merge pull request #12780 from opensourcerouting/spdx-license-id *: convert to SPDX License identifiers	2023-02-17 09:43:05 -05:00
Donatas Abraitis	234f6fd4f4	bgpd: Add BGP Software Version Capability Implement: https://datatracker.ietf.org/doc/html/draft-abraitis-bgp-version-capability Tested with GoBGP: ``` % ./gobgp neighbor 192.168.10.124 BGP neighbor is 192.168.10.124, remote AS 65001 BGP version 4, remote router ID 200.200.200.202 BGP state = ESTABLISHED, up for 00:01:49 BGP OutQ = 0, Flops = 0 Hold time is 3, keepalive interval is 1 seconds Configured hold time is 90, keepalive interval is 30 seconds Neighbor capabilities: multiprotocol: ipv4-unicast: advertised and received ipv6-unicast: advertised route-refresh: advertised and received extended-nexthop: advertised Local: nlri: ipv4-unicast, nexthop: ipv6 UnknownCapability(6): received UnknownCapability(9): received graceful-restart: advertised and received Local: restart time 10 sec ipv6-unicast ipv4-unicast Remote: restart time 120 sec, notification flag set ipv4-unicast, forward flag set 4-octet-as: advertised and received add-path: received Remote: ipv4-unicast: receive enhanced-route-refresh: received long-lived-graceful-restart: advertised and received Local: ipv6-unicast, restart time 10 sec ipv4-unicast, restart time 20 sec Remote: ipv4-unicast, restart time 0 sec, forward flag set fqdn: advertised and received Local: name: donatas-pc, domain: Remote: name: spine1-debian-11, domain: software-version: advertised and received Local: GoBGP/3.10.0 Remote: FRRouting/8.5-dev-MyOwnFRRVersion-gdc92f44a45-dirt cisco-route-refresh: received Message statistics: ``` FRR side: ``` root@spine1-debian-11:~# vtysh -c 'show bgp neighbor 192.168.10.17 json' \| \ > jq '."192.168.10.17".neighborCapabilities.softwareVersion.receivedSoftwareVersion' "GoBGP/3.10.0" root@spine1-debian-11:~# ``` Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-02-15 23:14:48 +02:00
Russ White	423c803580	Merge pull request #12728 from opensourcerouting/feature/bgp_neighbor_path-attribute_treat_as_withdraw bgpd: Add neighbor path-attribute treat-as-withdraw command	2023-02-14 11:22:16 -05:00
Philippe Guibert	fa566a94af	bgpd: store the route-distinguisher from config as a string The route-distinguisher string can be expressed in different ways when the AS number is part of the RD. And the configured string value has to be kept intact. The following vty commands store the string value internally: - router bgp / address-family ipv4 unicast / rd vpn export <> - router bgp / address-family l2vpn evpn / rd <> - router bgp / address-family l2vpn evpn / vni <> / rd <> The vty commands where RD is configured in the below places is not considered: - router bgp / rfapi related commands - router bgp / address-family xxx xxx / network .. rd <> Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2023-02-10 10:27:23 +01:00
Philippe Guibert	7e14d0fab2	bgpd: store the confederation as identifier as a string The confederation peers as and the confederation identifier as are stored as a string to preserve the output in the running configuration. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2023-02-10 10:27:23 +01:00
Philippe Guibert	de76ed8a0e	bgpd: store the neighbor as identifier as a string This identifier is used to display the peer configuration in the running-config, like it has been configured. The following commands are using a specific string attribute: - neighbor .. remote-as ASN - neighbor .. local-as ASN Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2023-02-10 10:27:23 +01:00
Philippe Guibert	17571c4ae7	bgpd: aspath list format binds on as-notation format Each BGP prefix may have an as-path list attached. A forged string is stored in the BGP attribute and shows the as-path list output. Before this commit, the as-path list output was expressed as a list of AS values in plain format. Now, if a given BGP instance uses a specific asnotation, then the output is changed: new output: router bgp 1.1 asnotation dot ! address-family ipv4 unicast network 10.200.0.0/24 route-map rmap network 10.201.0.0/24 route-map rmap redistribute connected route-map rmap exit-address-family exit ! route-map rmap permit 1 set as-path prepend 1.1 5433.55 264564564 exit ubuntu2004# do show bgp ipv4 BGP table version is 2, local router ID is 10.0.2.15, vrf id 0 Default local pref 100, local AS 1.1 Status codes: s suppressed, d damped, h history, * valid, > best, = multipath, i internal, r RIB-failure, S Stale, R Removed Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self Origin codes: i - IGP, e - EGP, ? - incomplete RPKI validation codes: V valid, I invalid, N Not found Network Next Hop Metric LocPrf Weight Path > 4.4.4.4/32 0.0.0.0 0 32768 1.1 5433.55 4036.61268 ? > 10.0.2.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 ? 10.200.0.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 i 10.201.0.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 i The changes include: - the aspath structure has a new field: asnotation type The ashash list will differentiate 2 aspaths using a different asnotation. - 3 new printf extensions display the as number in the wished format: pASP, pASD, pASE for plain, dot, or dot+ format (extended). Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2023-02-10 10:27:23 +01:00
Philippe Guibert	e55b088399	bgpd: add as-notation keyword to 'router bgp' vty command A new keyword permits changing the BGP as-notation output: - [no] router bgp <> [vrf BLABLA] [as-notation [<dot\|plain\|dot+>]] At the BGP instance creation, the output will inherit the way the BGP instance is declared. For instance, the 'router bgp 1.1' command will configure the output in the dot format. However, if the client wants to choose an alternate output, he will have to add the extra command: 'router bgp 1.1 as-notation dot+'. Also, if the user wants to have plain format, even if the BGP instance is declared in dot format, the keyword can also be used for that. The as-notation output is only taken into account at the BGP instance creation. In the case where VPN instances are used, a separate instance may be dynamically created. In that case, the real as-notation format will be taken into account at the first configuration. Linking the as-notation format with the BGP instance makes sense, as the operators want to keep consistency of what they configure. One technical reason why to link the as-notation output with the BGP instance creation is that the as-path segment lists stored in the BGP updates use a string representation to handle aspath operations (by using regexp for instance). Changing on the fly the output needs to regenerate this string representation to the correct format. Linking the configuration to the BGP instance creation avoids refreshing the BGP updates. A similar mechanism is put in place in junos too. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2023-02-10 10:27:23 +01:00
Philippe Guibert	8079a4138d	lib, bgp: add initial support for asdot format AS number can be defined as an unsigned long number, or two uint16 values separated by a period (.). The possible values are: - usual 32 bit values : [1;2^32 -1] - <1.65535>.<0.65535> for dot notation - <0.65535>.<0.65535> for dot+ notation. The 0.0 value is forbidden when configuring BGP instances or peer configurations. A new ASN type is added for parsing in the vty. The following commands use that new identifier: - router bgp .. - bgp confederation .. - neighbor <> remote-as <> - neighbor <> local-as <> - clear ip bgp <> - route-map / set as-path <> An asn library is available in lib/ and provides some services: - convert an as string into an as number. - parse an as path list string and extract a number. - convert an as number into a string. Also, the bgp tests forge an as_zero_path, and to do that, an API to relax the possibility to have a 0 as value is specifically called from the tests. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2023-02-10 10:27:17 +01:00
Philippe Guibert	9eb1199710	bgpd: store the bgp as identifier in the configured as-notation This is a preliminary work to handle various ways to configure a BGP Autonomous System. When creating a BGP instance, the user may want to define the AS number as a dotted value, instead of using an integer value. To handle both cases, an as_pretty char attribute will store the as number as it has been given to the vtysh command: router bgp <as number> Whenever the as integer of the BGP instance was dumped, the as_pretty original format is used. The json output reuses the integer value to keep backward compatibility with old displays. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2023-02-10 10:19:06 +01:00
David Lamparter	acddc0ed3c	*: auto-convert to SPDX License IDs Done with a combination of regex'ing and banging my head against a wall. Signed-off-by: David Lamparter <equinox@opensourcerouting.org>	2023-02-09 14:09:11 +01:00
Donatas Abraitis	e2863b4ff5	bgpd: Add `neighbor path-attribute treat-as-withdraw` command To filter out routes with unwanted prefixes. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-02-01 22:57:34 +02:00
Donald Sharp	58cf0823bf	bgpd: Add missing enum's to case statement Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-01-31 12:29:08 -05:00
Donatas Abraitis	e9dbc60ee2	Merge pull request #12666 from donaldsharp/bgp_outq_limit Bgp outq limit	2023-01-20 11:59:34 +02:00
Donald Sharp	963b7ee448	bgpd: Limit peer output queue length like input queue length Consider this scenario: Lots of peers with a bunch of route information that is changing fast. One of the peers happens to be really slow for whatever reason. The way the output queue is filled is that bgpd puts 64 packets at a time and then reschedules itself to send more in the future. Now suppose that peer has hit its input Queue limit and is slow. As such bgp will continue to add data to the output Queue, regardless of whether the other side is receiving this data. Let's limit the Output Queue to the same limit as the Input Queue. This should prevent bgp eating up large amounts of memory as stream data when under severe network trauma. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-01-19 11:48:01 -05:00
Donatas Abraitis	cfd01fc0ac	Revert "bgpd: optimal router reflection cli and fsm changes" This reverts commit `70cd87ca02`. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-01-17 18:15:28 +02:00
Russ White	c542606e56	Merge pull request #12603 from opensourcerouting/fix/deprecate_bgp_stuff_some bgpd: Deprecate some stuff	2023-01-17 09:12:39 -05:00
Donatas Abraitis	db3f8f3199	bgpd: Deprecate some unused BGP stuff * BGP optional parameter type (Authentication) * BGP UPDATE message error subcode for AS loop Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-01-14 21:30:35 +02:00
Donatas Abraitis	a5c6a9b18e	bgpd: Add `neighbor path-attribute discard` command The idea is to drop unwanted attributes from the BGP UPDATE messages and continue by just ignoring them. This improves the security, flexibility, etc. This is the command that Cisco has also. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-01-14 21:29:41 +02:00
Donatas Abraitis	f5540d6d41	bgpd: Drop deprecated BGP_ATTR_AS_PATHLIMIT path attribute Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-01-06 14:40:49 +02:00
Donald Sharp	534db980a2	bgpd: When creating peer convey if it is a CONFIG_NODE or not When actually creating a peer in BGP, tell the creation if it is a config node or not. There were cases where the CONFIG_NODE was being set after being placed into the bgp->peerhash, thus causing collisions between the doppelganger and the peer and eventually use after free's. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2022-12-05 09:11:22 -05:00
Donatas Abraitis	4f770cf1d2	bgpd: Implement graceful-shutdown command per neighbor We already have a global knob for graceful-shutdown, but it's handy having per neighbor knob as well. Especially when a single neighbor needs to be restarted/shutdown gracefully. We can do this with route-maps, but this is a faster/cleaner way of doing the same for an operator. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2022-11-16 21:42:21 +02:00
Donald Sharp	b36156760b	Merge pull request #12259 from opensourcerouting/fix/show_rtt_always bgpd: Shutdown RTT improvements	2022-11-16 10:28:23 -05:00
Donald Sharp	7f1f931447	bgpd: Break up rpki prefix revalidation by bgp structure RPKI revalidation is a possibly expensive operation. Break up revalidation on a prefix basis by the `struct bgp` pointer. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2022-11-08 08:11:52 -05:00
Donald Sharp	7651f27751	bgpd: Make rpki soft_reconfig calling events An end operator is showing cases with multiple bgp feeds and a rpki table that calling the revalidation functions is extremely expensive and they are seeing lots of thread WARNS about timers being late and eventually the whole thing gets unresponsive. Let's break up soft reconfiguration in to a series of events per peer so that all the work for this is not done at the same exact time. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2022-11-08 08:11:52 -05:00
Donatas Abraitis	5597214ccb	bgpd: Show the reason when the session is killed due to RTT Simulated latency with: ``` tc qdisc add dev eth3 root netem delay 100ms ``` ``` donatas-laptop# sh ip bgp summary failed IPv4 Unicast Summary (VRF default): BGP router identifier 192.0.2.252, local AS number 65000 vrf-id 0 BGP table version 28 RIB entries 0, using 0 bytes of memory Peers 1, using 724 KiB of memory Neighbor EstdCnt DropCnt ResetTime Reason 192.168.10.65 2 2 00:00:17 Admin. shutdown (RTT) Displayed neighbors 1 Total number of neighbors 1 donatas-laptop# ``` Another end received: ``` %NOTIFICATION: received from neighbor 192.168.10.17 6/2 (Cease/Administrative Shutdown) "shutdown due to high round-trip-time (104ms > 5ms, hit 21 times)" ``` Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2022-11-04 15:56:23 +02:00
Russ White	a5dac02901	Merge pull request #12114 from opensourcerouting/feature/bgp_aigp_attribute bgpd: Implement AIGP	2022-10-31 11:24:43 -04:00
Donatas Abraitis	97a52c82a5	bgpd: Implement Accumulated IGP Metric Attribute for BGP https://www.rfc-editor.org/rfc/rfc7311.html Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2022-10-26 11:26:57 +03:00
Stephen Worley	a0b937de42	bgpd,doc: limit InQ buf to allow for back pressure Add a default limit to the InQ for messages off the bgp peer socket. Make the limit configurable via cli. Adding in this limit causes the messages to be retained in the tcp socket and allow for tcp back pressure and congestion control to kick in. Before this change, we allow the InQ to grow indefinitely just taking messages off the socket and adding them to the fifo queue, never letting the kernel know we need to slow down. We were seeing under high loads of messages and large perf-heavy routemaps (regex matching) this queue would cause a memory spike and BGP would get OOM killed. Modifying this leaves the messages in the socket and distributes that load where it should be in the socket buffers on both send/recv while we handle the messages. Also, changes were made to allow the ringbuffer to hold messages and continue to be filled by the IO pthread while we wait for the Main pthread to handle the work on the InQ. Memory spike seen with large numbers of routes flapping and route-maps with dozens of regex matching: ``` Memory statistics for bgpd: System allocator statistics: Total heap allocated: > 2GB Holding block headers: 516 KiB Used small blocks: 0 bytes Used ordinary blocks: 160 MiB Free small blocks: 3680 bytes Free ordinary blocks: > 2GB Ordinary blocks: 121244 Small blocks: 83 Holding blocks: 1 ``` With most of it being held by the inQ (seen from the stream datastructure info here): ``` Type : Current# Size Total Max# MaxBytes ... ...
Stream : 115543 variable 26963208 15970740 3571708768 ``` With this change that memory is capped and load is left in the sockets: RECV Side: ``` State Recv-Q Send-Q Local Address:Port Peer Address:Port Process ESTAB 265350 0 [fe80::4080:30ff:feb0:cee3]%veth1:36950 [fe80::4c14:9cff:fe1d:5bfd]:179 users:(("bgpd",pid=1393334,fd=26)) skmem:(r403688,rb425984,t0,tb425984,f1816,w0,o0,bl0,d61) ``` SEND Side: ``` State Recv-Q Send-Q Local Address:Port Peer Address:Port Process ESTAB 0 1275012 [fe80::4c14:9cff:fe1d:5bfd]%veth1:179 [fe80::4080:30ff:feb0:cee3]:36950 users:(("bgpd",pid=1393443,fd=27)) skmem:(r0,rb131072,t0,tb1453568,f1916,w1300612,o0,bl0,d0) ``` Signed-off-by: Stephen Worley <sworley@nvidia.com>	2022-10-24 18:23:29 -04:00
Carmine Scarpitta	527588aa78	bgpd: add support for per-VRF SRv6 SID In the current implementation of bgpd, SRv6 SIDs can be configured only under the address-family. This enables bgpd to leak IPv6 routes using an SRv6 End.DT6 behavior and IPv4 routes using an SRv6 End.DT4 behavior. It is not possible to leak both IPv6 and IPv4 routes using a single SRv6 SID. This commit adds a new CLI command "sid vpn per-vrf export <sid_idx\|auto>" that enables bgpd to leak both IPv6 and IPv4 routes using a single SRv6 SID (End.DT46 behavior). Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>	2022-10-18 16:08:23 +02:00
Donatas Abraitis	272c6d5db1	Merge pull request #8647 from sworleys/DVNI-Config-Changes bgpd: EVPN D-VNI L3 RT Config Enhancements	2022-10-18 14:17:04 +03:00
Donatas Abraitis	46dbf9d0c0	bgpd: Implement ACCEPT_OWN extended community TL;DR: rfc7611. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2022-10-12 17:48:43 +03:00
Russ White	984eb32b58	Merge pull request #11159 from maduri111/bgpd-orr bgpd: optimal route reflection	2022-10-12 09:30:36 -04:00
Russ White	b6aa61ba3c	Merge pull request #11981 from proelbtn/add-support-to-change-function-length bgpd: Add support to change Segment Routing function length	2022-10-12 08:44:29 -04:00
Madhuri Kuruganti	70cd87ca02	bgpd: optimal router reflection cli and fsm changes Signed-off-by: Madhuri Kuruganti <maduri111@gmail.com>	2022-10-12 13:43:55 +05:30
Donatas Abraitis	eb53128367	Merge pull request #9998 from pguibert6WIND/bgp_tcp_keepalive Bgp tcp keepalive	2022-10-10 15:46:30 +03:00
Ryoga Saito	bee2e7d08f	bgpd: save srv6_locator_chunk in vpn_policy In order to send correct SRv6 L3VPN advertisement, we need to save srv6_locator_chunk in vpn_policy. With this information, we can construct correct SRv6 L3VPN advertisement packets. Signed-off-by: Ryoga Saito <ryoga.saito@linecorp.com>	2022-10-07 18:26:48 +09:00
Russ White	a8ef436639	Merge pull request #12040 from opensourcerouting/fix/bgp_local_as_remote_as bgpd: Allow using remote-as the same as local-as	2022-10-06 10:03:26 -04:00
Madhuri Kuruganti	e85e4a8d16	bgpd: conditional advertisement code cleanup Signed-off-by: Madhuri Kuruganti <maduri111@gmail.com>	2022-10-06 12:43:05 +05:30
Donatas Abraitis	d6b0327c35	bgpd: Allow using remote-as the same as local-as As an example, Arista EOS allows this behavior. Configuration something like: ``` neighbor PG peer-group neighbor PG remote-as 65001 neighbor PG local-as 65001 neighbor 192.168.10.124 peer-group PG ``` Or without peer-group. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2022-09-29 21:13:40 +03:00
Xiao Liang	a783cc05f0	bgpd: Handle route-refresh request received before EoR See the BGP message sequence: R1 R2 \| updates \| \|------------------>\| \| \| \| refresh request \| x<------------------\| \| \| \| updates cont. \| \|------------------>\| \| \| \| end-of-rib \| \|------------------>\| \| \| When R1 and R2 establish BGP session, R1 begins to send initial updates. If R2 sends a route-refresh request before EoR, it's silently ignored by R1, and routes received earlier have no chance to be processed again. RFC7313 says, "for a BGP speaker that supports the BGP Graceful Restart, it MUST NOT send a BoRR for an <AFI, SAFI> to a neighbor before it sends the EoR for the <AFI, SAFI> to the neighbor." But it doesn't forbid route-refresh request to be sent before receiving EoR. To handle this scenario, postpone response to refresh request until EoR is sent. Signed-off-by: Xiao Liang <shaw.leon@gmail.com>	2022-09-16 18:26:21 +08:00
Rafael Zalamena	340ed5f9e2	Merge pull request #11823 from pguibert6WIND/bgp_vpnv4_gre_ebgp Bgp vpnv4 convey without transport label	2022-09-06 13:37:19 -03:00
Philippe Guibert	4cd690ae4d	bgpd: add 'mpls bgp forwarding' to ease mpls vpn ebgp peering RFC4364 describes peerings between multiple AS domains, to ease the continuity of VPN services across multiple SPs. This commit implements a sub-set of IETF option b) described in chapter 10 b. The ASBR to ASBR approach is taken, with an EBGP peering between the two routers. The EBGP peering must be directly connected to the outgoing interface used. In those conditions, the next hop is directly connected, and there is no need to have a transport label to convey the VPN label. A new vty command is added on a per interface basis: This command if enabled, will permit to convey BGP VPN labels without any transport labels (i.e. with implicit-null label). restriction: this command is used only for EBGP directly connected peerings. Other use cases are not covered. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2022-09-05 22:26:33 +02:00
Donatas Abraitis	b6a3df6b48	bgpd: Drop useless comments for peer af flags Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2022-08-31 14:39:38 +03:00
Donatas Abraitis	da5e1a58e9	bgpd: Increase peer af_flags to uint64_t Increasing in advance, as we already hitting the current limit. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2022-08-31 14:35:55 +03:00
Russ White	d72c279d08	Merge pull request #11833 from opensourcerouting/feature/bgp_neighbor_soo bgpd: Add `neighbor soo` command	2022-08-30 11:17:53 -04:00
Philippe Guibert	d1adb44843	bgpd: support TCP keepalive for BGP connection TCP keepalive is enabled once BGP connection is established. New vty commands: bgp tcp-keepalive <1-65535> <1-65535> <1-30> no bgp tcp-keepalive Signed-off-by: Xiaofeng Liu <xiaofeng.liu@6wind.com> Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2022-08-30 15:09:28 +02:00
Donald Sharp	083ec940ab	bgpd: Convert from bgp_clock() to monotime() Let's convert to our actual library call instead of using yet another abstraction that makes it fun for people to switch daemons. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2022-08-24 08:23:40 -04:00
Stephen Worley	58d8948cf4	bgpd: evpn L3 RT auto config and wildcard implementation Implement forcing L3 auto derivation via configs even when RTs are manually set. This will allow both to coexist in BGP RTs. Without using auto config command, it will remove auto derived RTs when you manually configure your own. To allow both, use the auto command on import/export/both. Implement '*' wildcard import L3 RTs so we can import a route into any AS. This is necessary to avoid a user from having to configure an L3 RT for every AS they care to import evpn route from. Signed-off-by: Stephen Worley <sworley@nvidia.com>	2022-08-23 12:41:25 -04:00
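
Note on the asdot commits (8079a4138d, e55b088399): the value ranges listed in those messages correspond to the asplain, asdot and asdot+ notations of RFC 5396. The Python sketch below illustrates the conversions under those rules. It is not the FRR lib/ code, which is written in C; the names `parse_asn` and `format_asn` are hypothetical and chosen only for this illustration.

```python
# Illustrative sketch only; not taken from FRR. Function names are hypothetical.

MAX_ASN = 0xFFFFFFFF  # largest 32-bit AS number (2^32 - 1)


def _is_decimal(text: str) -> bool:
    """True if text is a non-empty run of ASCII digits."""
    return text.isascii() and text.isdigit()


def parse_asn(text: str) -> int:
    """Parse 'N' (plain) or 'HIGH.LOW' (dot / dot+) into a 32-bit integer.

    The value 0 (written '0' or '0.0') is rejected, as the commit message
    says it is forbidden when configuring BGP instances or peers.
    """
    if "." in text:
        high_s, low_s = text.split(".", 1)
        if not (_is_decimal(high_s) and _is_decimal(low_s)):
            raise ValueError(f"malformed dotted AS number: {text!r}")
        high, low = int(high_s), int(low_s)
        if high > 0xFFFF or low > 0xFFFF:
            raise ValueError(f"each half must fit in 16 bits: {text!r}")
        value = (high << 16) | low
    else:
        if not _is_decimal(text):
            raise ValueError(f"malformed AS number: {text!r}")
        value = int(text)
    if not 1 <= value <= MAX_ASN:
        raise ValueError(f"AS number out of range: {text!r}")
    return value


def format_asn(value: int, notation: str = "plain") -> str:
    """Render an AS number as 'plain', 'dot' or 'dot+' text."""
    if not 0 <= value <= MAX_ASN:
        raise ValueError(f"AS number out of range: {value}")
    high, low = value >> 16, value & 0xFFFF
    if notation == "plain":
        return str(value)
    if notation == "dot+":
        return f"{high}.{low}"  # always two parts
    if notation == "dot":
        # 16-bit values stay plain; larger values use HIGH.LOW
        return f"{high}.{low}" if high else str(value)
    raise ValueError(f"unknown notation: {notation!r}")
```

For example, `parse_asn("1.1")` gives 65537, and `format_asn(65000, "dot+")` gives `0.65000`, while the same value in `dot` notation stays `65000`.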

685 Commits