Key BGP 'show' commands have been expanded to support 'vrf all':
show ip bgp vrf all summary
show ip bgp vrf all neighbors
show ip bgp vrf all nexthop
show ip bgp vrf all update-group
show ip bgp vrf all
show bgp vrf all summary
show bgp vrf all update-group
show bgp vrf all
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Ticket: CM-10402
Reviewed By: CCR-4466
Testing Done: Manual
When creating a 'struct peer' add in the ability to set the peer group
associated with that peer.
Ticket: CM-10184
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
Fix and enhance the entire hierarchy of clear commands in BGP to work
for VRFs.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Ticket: CM-9945
Reviewed By: CCR-4360
Testing Done: Manual (brief)
Updates to routemaps and delete of the routemap were not working properly
for VRFs. This was because while routemaps are global, the routemap update
processing timer and the processing were at the per-instance level. This
approach was unable to handle processing for multiple instances as the
routemap has no tracking of which instances are still pending processing.
This lead to the processing happening correctly only for the first instance
- which could be the default instance or some other instance. It could also
result in reference to freed memory for an instance.
The fix done is to make the update/delete processing also global and not per
instance. This means that the route-map delay timer will be global and a global
thread will handle the change (or delete) for all instances instead of spawning
a separate thread for each instance. To support this, a global BGP command
"bgp route-map delay-timer <value>" has been implemented. The existing command
per-instance is not deleted but will update the global timer.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-6970, CM-9918
Reviewed By: CCR-4320
Testing Done: Manual, bgpsmoke
Ensure commands dealing with update-groups and peer-groups support VRFs.
Also implement a new command "show bgp vrfs" to show summary information of
all configured VRFs. Some additional code cleanup in this area.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Ticket: CM-9247
Reviewed By: CCR-4267
Testing Done: Manual
'show bgp ipv4 uni summ' is counting each prefix sent
to each neighbor. At scale this is an expensive
operation.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
Various changes and fixes related to VRF registration, deletion,
BGP exit etc.
- Define instance type
- Ensure proper handling upon instance create, delete and
VRF add/delete from zebra
- Cleanup upon bgp_exit()
- Ensure messages are not sent to zebra for unknown VRFs
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-9128, CM-7203
Reviewed By: CCR-4098
Testing Done: Manual
The work-quanta that a user can specify is ~4billion. If a user
specifies such a large value this translates into processing 4billion
outgoing packets before moving onto the next interface. This makes
no sense. Reduce the value of allowed work quanta's to be between
1 and 10000.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Ticket: CM-8788
Modify the various maxpath commands to use MULTIPATH_NUM
as the upper limit of allowed max paths in BGP. There
is no point in allowing a number of maximum paths greater
than what Quagga is compiled for.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
BGP uses a second #define that is equal to MULTIPATH_NUM. There
is no point in having a different #define. Just consolidate.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
configuration
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
Ticket: CM-8321
This allows the user to configure the peer-group as an option for the
"neighbor swpX interface" command.
{code}
!
router bgp 100
neighbor FOO peer-group
neighbor FOO remote-as external
neighbor swp1 interface
neighbor swp2 interface v6only
neighbor swp3 interface peer-group FOO
neighbor swp4 interface v6only peer-group FOO
!
address-family ipv4 unicast
neighbor FOO activate
neighbor swp1 activate
neighbor swp2 activate
exit-address-family
!
{code}
Note that if the user configures
{code}
neighbor swp5 interface
neighbor swp5 peer-group FOO
{code}
We will display that as "neighbor swp5 interface peer-group FOO". It
did not seem worth tracking that the peer-group was entered via two
lines instead of one.
During CR for nexthop upstream it was noticed that usage
of prefix2str was not consistent. This fixes this problem
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The json keyword was being read incorrectly.
Basically some commands read a variable # of arguments
and in ospf the command values were being placed into
argc and argv. With a variable # of arguments their
existed a possibility that less arguments would be read
from the cli than were being tested for in the command function
handler. This caused core dumps in some situations.
All code to read to decide to use the json keyword has
been centralized through a function and all code
converted to use it, irrelevant if it exhibited the bug
Ticket: CM-8278
Reviewed by: CCR-3830
Testing: OSPF no longer crashes and all other test suites still run
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
BGP is currently displaying the in-use router-id in the config. This is
conditional on a CONFIG flag, however, that flag is set even when there
is no configured router-id and the router-id learnt from Zebra is in-use.
The CONFIG flag for router-id is redundant since there is a separate
variable for the configured value, so use that and deprecate the CONFIG
flag. This also makes BGP behave like OSPF.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
Ticket: CM-8077, CM-8220
Reviewed By: CCR-3793
Testing Done: Manual verification (in 2.5-br)
Note: Imported from 2.5-br patch bgpd-fix-router-id-config-display.patch
The code has tests to see if the MULTIPATH_NUM == 0 and to
treat it like the user has entered 'Maximum PATHS'.
This 0 is treated as 64 internally. Remove this dependency
and setup MULTIPATH_NUM to 64 when --enable-multipath=0 from
the configure cli.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
will be deprecated in the future
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Ticket: CM-8144
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-8122
per draft-ietf-idr-ix-bgp-route-server-09:
2.3.2.2.2. BGP ADD-PATH Approach
The [I-D.ietf-idr-add-paths] Internet draft proposes a different
approach to multiple path propagation, by allowing a BGP speaker to
forward multiple paths for the same prefix on a single BGP session.
As [RFC4271] specifies that a BGP listener must implement an implicit
withdraw when it receives an UPDATE message for a prefix which
already exists in its Adj-RIB-In, this approach requires explicit
support for the feature both on the route server and on its clients.
If the ADD-PATH capability is negotiated bidirectionally between the
route server and a route server client, and the route server client
propagates multiple paths for the same prefix to the route server,
then this could potentially cause the propagation of inactive,
invalid or suboptimal paths to the route server, thereby causing loss
of reachability to other route server clients. For this reason, ADD-
PATH implementations on a route server should enforce send-only mode
with the route server clients, which would result in negotiating
receive-only mode from the client to the route server.
This allows us to delete all of the following code:
- All XXXX_rsclient() functions
- peer->rib
- BGP_TABLE_MAIN and BGP_TABLE_RSCLIENT
- RMAP_IMPORT and RMAP_EXPORT
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Vivek Venkataraman <vivek@cumulusnetworks.com
Ticket: CM-8014
This implements addpath TX with the first feature to use it
being "neighbor x.x.x.x addpath-tx-all-paths".
One change to show output is 'show ip bgp x.x.x.x'. If no addpath-tx
features are configured for any peers then everything looks the same
as it is today in that "Advertised to" is at the top and refers to
which peers the bestpath was advertise to.
root@superm-redxp-05[quagga-stash5]# vtysh -c 'show ip bgp 1.1.1.1'
BGP routing table entry for 1.1.1.1/32
Paths: (6 available, best #6, table Default-IP-Routing-Table)
Advertised to non peer-group peers:
r1(10.0.0.1) r2(10.0.0.2) r3(10.0.0.3) r4(10.0.0.4) r5(10.0.0.5) r6(10.0.0.6) r8(10.0.0.8)
Local, (Received from a RR-client)
12.12.12.12 (metric 20) from r2(10.0.0.2) (10.0.0.2)
Origin IGP, metric 0, localpref 100, valid, internal
AddPath ID: RX 0, TX 8
Last update: Fri Oct 30 18:26:44 2015
[snip]
but once you enable an addpath feature we must display "Advertised to" on a path-by-path basis:
superm-redxp-05# show ip bgp 1.1.1.1/32
BGP routing table entry for 1.1.1.1/32
Paths: (6 available, best #6, table Default-IP-Routing-Table)
Local, (Received from a RR-client)
12.12.12.12 (metric 20) from r2(10.0.0.2) (10.0.0.2)
Origin IGP, metric 0, localpref 100, valid, internal
AddPath ID: RX 0, TX 8
Advertised to: r8(10.0.0.8)
Last update: Fri Oct 30 18:26:44 2015
Local, (Received from a RR-client)
34.34.34.34 (metric 20) from r3(10.0.0.3) (10.0.0.3)
Origin IGP, metric 0, localpref 100, valid, internal
AddPath ID: RX 0, TX 7
Advertised to: r8(10.0.0.8)
Last update: Fri Oct 30 18:26:39 2015
Local, (Received from a RR-client)
56.56.56.56 (metric 20) from r6(10.0.0.6) (10.0.0.6)
Origin IGP, metric 0, localpref 100, valid, internal
AddPath ID: RX 0, TX 6
Advertised to: r8(10.0.0.8)
Last update: Fri Oct 30 18:26:39 2015
Local, (Received from a RR-client)
56.56.56.56 (metric 20) from r5(10.0.0.5) (10.0.0.5)
Origin IGP, metric 0, localpref 100, valid, internal
AddPath ID: RX 0, TX 5
Advertised to: r8(10.0.0.8)
Last update: Fri Oct 30 18:26:39 2015
Local, (Received from a RR-client)
34.34.34.34 (metric 20) from r4(10.0.0.4) (10.0.0.4)
Origin IGP, metric 0, localpref 100, valid, internal
AddPath ID: RX 0, TX 4
Advertised to: r8(10.0.0.8)
Last update: Fri Oct 30 18:26:39 2015
Local, (Received from a RR-client)
12.12.12.12 (metric 20) from r1(10.0.0.1) (10.0.0.1)
Origin IGP, metric 0, localpref 100, valid, internal, best
AddPath ID: RX 0, TX 3
Advertised to: r1(10.0.0.1) r2(10.0.0.2) r3(10.0.0.3) r4(10.0.0.4) r5(10.0.0.5) r6(10.0.0.6) r8(10.0.0.8)
Last update: Fri Oct 30 18:26:34 2015
superm-redxp-05#
When configuring listen ranges for allowing dynamic BGP neighbors,
ensure that there are no duplicate or overlapping ones. This is
necessary because at the time of handling an incoming connection,
the first range that matches the source of the connection (and hence,
its peer-group parameters) will be used.
Signed-off-by: Vivek Venkataraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Atul Patel <atul@cumulusnetworks.com>
Ticket: CM-5153
Reviewed By: CCR-3714
Testing Done: Manual verification
very easy to blackhole routes.
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-6649
bgp is using both bm->master and master pointers interchangebly
for thread manipulation. Since they are the same thing consolidate
to one pointer.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-6680
Reviewed-by: CCR-3486
Testing: See bug
In these situations:
(A) user enters under bgp more 'maximum-paths' than zebra is compiled with
warn the user that there is a problem
(B) Zebra receives more maximum paths than what it can handle log the fact
that this happened
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-7439
Reviewed By: Donald Sharp
Testing Done:
If a session was reset due to a NOTIFICATION the "show ip bgp
neighbor" output would not display details on what the
notification actually was. This patch changes that. Example:
superm-redxp-05# show ip bgp neighbors 20.1.2.2
BGP neighbor is 20.1.2.2, remote AS 21, local AS 10, external link
[snip]
Last reset 01:05:07, due to NOTIFICATION sent (OPEN Message Error/Bad Peer AS)
This patch adds a hostname capability. The node's hostname and
domainname are exchanged in the new capability and used in show command
outputs based on a knob enabled by the user. The hostname and domainname
can be a maximum of 64 chars long, each.
Signed-off-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by: Vivek Venkataraman <vivek@cumulusnetworks.com>
Ticket: CM-5660
Reviewed By: CCR-2563
Testing Done:
Ticket: CM-7012
Reviwed by: CCR-3451
Testing: See bug
When you specify a neighbor <interface> <something>
and don't specify a remote-as the neighbor relationship
will still come up with ipv6 unnumbered if you have
RA configured on the interface.
Ticket: CM-7177
Reviewed-by: CCR-3396
Testing: See bug
This code change does several small things:
(A) Fix a couple detected memory leaks
(B) Fix all malloc operations to use the correct XMALLOC operation in bgpd and parts of lib
(C) Adds a few new memory types to make it easier to detect issues
Ticket: CM-6789
Reviewed By: CCR-3263
Testing Done: Manual Testing and smoke tests
Whenever some sort of output is encountered, added a json version with
proper logic as well.
Ticket: CM-6048
Reviewed-By: CCR-3251
Tested: See bug
When a redistribute metric is changed, the new metric
was not being used. Modify the code to look for existing
redistributed routes and fix their metric.
BGP: Determine peer's IP address if interface has /30, /31
Allow interface-based session config for IPv4 numbered links
if the link address is either /30 or /31. This is not RFC5549,
but can be deployed now, and independent of whether the peer
supports RFC5549 or not.
Signed-off-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
Reviewed-By: Vivek Venkataram <vivek@cumulusnetworks.com>
This patch also adds BFD multihop support for BGP. Whether a peer is multi-hop or single hop is determined internally. All IGP peers are considered as multi-hop peers. EBGP peers are considered as single hop unless configured as multi-hop.
BGP BFD command enhancement to configure BFD parameters (detect multiplier, min rx and min tx).
router bgp <as-number>
neighbor <name/ip-address> bfd <detect mult> <min rx> <min tx>
Signed-off-by: Radhika Mahankali <radhika@cumulusnetworks.com>
Reviewed-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
Reviewed-by: Vipin Kumar <vipin@cumulusnetworks.com>
Reviewed-by: Kanna Rajagopal <kanna@cumulusnetworks.com>
"bgp confederation id X" are the same value.
router bgp 1
bgp router-id 10.1.1.1
bgp confederation identifier 1
bgp confederation peers 24 35
neighbor 10.1.1.2 remote-as 24
neighbor 10.1.1.2 update-source lo
neighbor 10.1.1.3 remote-as 1
neighbor 10.1.1.3 update-source lo
The customer does this because they want to peer to 10.1.1.2 as a
confed-external peer but peer with 10.1.1.3 as a normal iBGP peer.
The bug was that we thought 10.1.1.3 was an EBGP peer so we did not send him
LOCALPREF which caused the Juniper to send us a NOTIFICATION. I confirmed
that quagga also sends a NOTIFICATION in this scenario.
The fix is to add a check to see if router bgp X and bgp confederation
identifier X are equal because that is a factor in determining if a peer is
EBGP or IBGP
Additional issues fixed in the this patch:
We were not properly removing all AS_CONFED_SEQUENCEs/SETs from the aspath
when advertising a route to an ebgp peer. This was due to two issues:
We only called aspath_delete_confed_seq() if confederations were
configured. We can RX as aspath with CONFED segments even if
confederations are not configured.
aspath_delete_confed_seq() was implemented based on the original confed
RFC 3065 which basically said "remove all of the leading
AS_CONFED_SEQUENCEs/SETs" where the new confed RFC 5065 says "remove ALL
of the AS_CONFED_SEQUENCEs/SETs"
peer-groups did not work for confed-external peers. peer_calc_sort() always
returned BGP_PEER_EBGP for a confederations where the remote-as was not
specified. The reason was the peer->as_type was AS_UNSPECIFIED but we checked
if (peer->as_type != AS_SPECIFIED)
return (peer->as_type == AS_INTERNAL ? BGP_PEER_IBGP : BGP_PEER_EBGP);
After fixing that I found that when we got to the else where we checked for
peer1 we could only possibly return BGP_PEER_IBGP or BGP_PEER_EBGP, we need
to also be able to return BGP_PEER_CONFED. I changed this to return
peer1->sort.
"show ip bgp x.x.x.x" would always display "Local" for the aspath. This is
because we were calling aspath_counts_hop() to determine if the aspath was
empty. This is wrong though because CONFED segments do not count towards
aspath hopcount. The fix is to null check aspath->segments to determine if
the aspath is actually empty.
"show ip bgp x.x.x.x" and "show ip bgp neighbor" always displayed
"internal" or "external" and never "confed-internal" or "confed-external".
This made troubleshooting difficult because I couldn't tell exactly what
kind of peer I was dealing with. I added the confed-internal and
confed-external output...also added a "peer-type" field in the json output
for 'show ip bgp x.x.x.x'
"show ip bgp peer-group" did not list the peer-group name if we hadn't
determined the "type" (internal, external, etc) for the peer-group
This adds support for BGP RFC 5549 (Extended Next Hop Encoding capability)
* send and receive of the capability
* processing of IPv4->IPv6 next-hops
* for resolving these IPv6 next-hops, itsworks with the current
next-hop-tracking support
* added a new message type between BGP and Zebra for such route
install/uninstall
* zserv side of changes to process IPv4 prefix ->IPv6 next-hops
* required show command changes for IPv4 prefix having IPv6 next-hops
Few points to note about the implementation:
* It does an implicit next-hop-self when a [IPv4 prefix -> IPv6 LL next-hop]
is to be considered for advertisement to IPv4 peering (or IPv6 peering
without Extended next-hop capability negotiated)
* Currently feature is off by default, enable it by configuring
'neighbor <> capability extended-nexthop'
* Current support is for IPv4 Unicast prefixes only.
IMPORTANT NOTE:
This patch alone isn't enough to have IPv4->IPv6 routes installed into
the kernel. A separate patch is needed for that to work for the netlink
interface.
Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
Vivek Venkatraman <vivek@cumulusnetworks.com>
Donald Sharp <sharpd@cumulusnetworks.com>
Table-id argument support wasnt complete, used the [proto, instance]
combination changes that were done for OSPF multi-instance. In this case
its 'table <table-id>' just like it was 'ospf <instance-id>'
bgpd: Exchange hostname capability and display hostnames in outputs
This patch adds a hostname capability. The node's hostname and
domainname are exchanged in the new capability and used in show command
outputs based on a knob enabled by the user. The hostname and domainname
can be a maximum of 64 chars long, each.
Signed-off-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by: Vivek Venkataraman <vivek@cumulusnetworks.com>
bgp: Fixup of the remote-as command to allow user to not have to enter an actual as number
Signed-off-by: Donald Sharp<sharpd@cumulusnetworks.com>
Reviewed-by:
BGP: Fix network import check use with NHT instead of scanner
When next hop tracking was implemented and the bgp scanner was eliminated,
the "network import-check" command got broken. This patch fixes that
issue. NHT is used to not just track nexthops, but also the static routes
that are announced as part of BGP's network command. The routes are
registered only when import-check is enabled. To optimize performance,
we register static routes only when import-check is enabled.
Signed-off-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
show update-groups summary was mislabeled. What it displays is not a summary
at all, but the detailed info about all update-groups. Furthermore, there
was no way to get detailed info about a specific subgroup.
This patch renames "show * update-groups summary" to "show * update-groups"
and adds an option to see the info specific to a subgroup only. It also
validates the subgroup-id.
show * update-groups summary will be added separately.
sockunion_same() and bgp_peer_conf_if_to_su_update() need to use the scope_id
field of the ipv6 address to uniquify/identify the address.
This allows sessions based on link local address when that address is not
unique across peers.
In the data center, in conjunction with next hop propagation for features
such as announcing VIP routes to load balancers and such, it is desired to
disable the connected route check even on ebgp peers with TTL of 1. This
patch is used to disable the check for all peers instead of the peer by
peer check that is currently supported. Furthermore, the existing
disable-connected-check is different from how Cisco implements this feature.
So, we add this new flag to avoid reliance on the existing flag.
Signed-off-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
Reviewed-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
sessions dynamically. The operator configures a range of neighbor addresses
to which peering is allowed. The ranges are configured as subnets and
multiple ranges are allowed. Each range is associated with a peer-group
so that additional parameters can be configured.
BGP neighbor sessions are dynamically created when connections are initiated
by remote neighbors whose addresses fall within a configured range. The
sessions are deleted when the BGP connection terminates.
A limit on the number of neighbors allowed from each range of addresses
can be specified.
IPv4 and IPv6 peering is supported. Over the peering, any of the address
families configured for the peer-group can be negotiated.
This patch implements the 'update-groups' functionality in BGP. This is a
function that can significantly improve BGP performance for Update generation
and resultant network convergence. BGP Updates are formed for "groups" of
peers and then replicated and sent out to each peer rather than being formed
for each peer. Thus major BGP operations related to outbound policy
application, adj-out maintenance and actual Update packet formation
are optimized.
BGP update-groups dynamically groups peers together based on configuration
as well as run-time criteria. Thus, it is more flexible than update-formation
based on peer-groups, which relies on operator configuration.
[Note that peer-group based update formation has been introduced into BGP by
Cumulus but is currently intended only for specific releases.]
From 11098af65b2b8f9535484703e7f40330a71cbae4 Mon Sep 17 00:00:00 2001
Subject: [PATCH] updgrp commits
The problem is that zclient->redist[ZEBRA_ROUTE_MAX] used for storing a
client’s redist state, has no address-family qualification. This means
a client can only store its interest in a protocol (connected, static etc.),
but cant choose IPv4 or ipv6 with that. This hindered implementation on
client sides to manage redistribution of ipv4 and ipv6 both.
BGP's redistribution of protocols like connected/static is one such place.
One fix could be to overload this and flap the redist connection each time
any new afi is added for redist, but that may have side-effects on the
existing afi redist.
The cleaner way is to modify redist data-structure to also take AFI, and adjust
routines that deal with it, so that a client can register for a protocol
redistribution based on the AFI. BGP already maintains redistribution state
based on afi and protocol (bgp->redist[AFI_MAX][ZEBRA_ROUTE_MAX]). This patch
takes care of filling up the gap in zclient/zserv redistribution state to
also use AFI qualification.
Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
Zebra: Redistribute routes from non-main kernel table to main.
This can be the basis for many interesting features such as variations
of redistribute ARP, using zebra as the RIB in the presence of multiple
routing protocol stacks etc. The code only supports IPv4 for now, but
the infrastructure is in place for IPv6.
Usage:
There is a new route type introduced by this model: TABLE. Routes
imported from alternate kernel tables will have their protocol type set to
TABLE.
Routes from alternate kernel tables MUST be first imported into the main
table via "ip import-table <table id>". They can then be redistributed via
a routing protocol via the "redistribute table" command. Each imported table
can an optional administrative distance specified. In Zebra, a route with a
lower distance is chosen over routes with a higher distance. So, distance
is how the user can choose to prioritize routes from a particular table over
routes from other tables or routes learnt another way in zebra.
Route maps for imported tables are specified via "ip protocol" command in
zebra. Route maps for redistributed routes within a routing protocol are
subject to the route map options supported by the protocol. The
"match source-protocol" option in route maps can match against "table"
to filter routes learnt from alternate kernel routing tables.
Signed-off-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
——————————————-------------
- etc/init.d/quagga is modified to support creating separate ospf daemon
process for each instance. Each individual instance is monitored by
watchquagga just like any protocol daemons.(requires initd-mi.patch).
- Vtysh is modified to able to connect to multiple daemons of the same
protocol (supported for OSPF only for now).
- ospfd is modified to remember the Instance-ID that its invoked with. For
the entire life of the process it caters to any command request that
matches that instance-ID (unless its a non instance specific command).
Routes/messages to zebra are tagged with instance-ID.
- zebra route/redistribute mechanisms are modified to work with
[protocol type + instance-id]
- bgpd now has ability to have multiple instance specific redistribution
for a protocol (OSPF only supported/tested for now).
- zlog ability to display instance-id besides the protocol/daemon name.
- Changes in other daemons are to because of the needed integration with
some of the modified APIs/routines. (Didn’t prefer replicating too many
separate instance specific APIs.)
- config/show/debug commands are modified to take instance-id argument
as appropriate.
Guidelines to start using multi-instance ospf
---------------------------------------------
The patch is backward compatible, i.e for any previous way of single ospf
deamon(router ospf <cr>) will continue to work as is, including all the
show commands etc.
To enable multiple instances, do the following:
1. service quagga stop
2. Modify /etc/quagga/daemons to add instance-ids of each desired
instance in the following format:
ospfd=“yes"
ospfd_instances="1,2,3"
assuming you want to enable 3 instances with those instance ids.
3. Create corresponding ospfd config files as ospfd-1.conf, ospfd-2.conf
and ospfd-3.conf.
4. service quagga start/restart
5. Verify that the deamons are started as expected. You should see
ospfd started with -n <instance-id> option.
ps –ef | grep quagga
With that /var/run/quagga/ should have ospfd-<instance-id>.pid and
ospfd-<instance-id>/vty to each instance.
6. vtysh to work with instances as you would with any other deamons.
7. Overall most quagga semantics are the same working with the instance
deamon, like it is for any other daemon.
NOTE:
To safeguard against errors leading to too many processes getting invoked,
a hard limit on number of instance-ids is in place, currently its 5.
Allowed instance-id range is <1-65535>
Once daemons are up, show running from vtysh should show the instance-id
of each daemon as 'router ospf <instance-id>’ (without needing explicit
configuration)
Instance-id can not be changed via vtysh, other router ospf configuration
is allowed as before.
Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
Summary of changes
- added an option to enable keepalive debugs for a specific peer
- added an option to enable inbound and/or outbound updates debugs for a specific peer
- added an option to enable update debugs for a specific prefix
- added an option to enable zebra debugs for a specific prefix
- combined "deb bgp", "deb bgp events" and "deb bgp fsm" into "deb bgp neighbor-events". "deb bgp neighbor-events" can be enabled for a specific peer.
- merged "deb bgp filters" into "deb bgp update"
- moved the per-peer logging to one central log file. We now have the ability to filter all verbose debugs on a per-peer and per-prefix basis so we no longer need to keep log files per-peer. This simplifies troubleshooting by keeping all BGP logs in one location. The use
r can then grep for the peer IP they are interested in if they wish to see the logs for a specific peer.
- Changed "show debugging" in isis to "show debugging isis" to be consistent with all other protocols. This was very confusing for the user because they would type "show debug" and expect to see a list of debugs enabled across all protocols.
- Removed "undebug" from the parser for BGP. Again this was to be consisten with all other protocols.
- Removed the "all" keyword from the BGP debug parser. The user can now do "no debug bgp" to disable all BGP debugs, before you had to type "no deb all bgp" which was confusing.
The new parse tree for BGP debugging is:
deb bgp as4
deb bgp as4 segment
deb bgp keepalives [A.B.C.D|WORD|X:X::X:X]
deb bgp neighbor-events [A.B.C.D|WORD|X:X::X:X]
deb bgp nht
deb bgp updates [in|out] [A.B.C.D|WORD|X:X::X:X]
deb bgp updates prefix [A.B.C.D/M|X:X::X:X/M]
deb bgp zebra
deb bgp zebra prefix [A.B.C.D/M|X:X::X:X/M]
protocols. BGP and OSPF are integrated to respond this BFD session down message
originated in Zebra via ptmd.
BGP and OSPF now have a bfd command, which tells OSPF/BGP to respond to the
BFD session down message.
OSPF:
interface <>
ip ospf bfd
BGP:
router bgp <>
neighbor <> bfd
Please note that these commands don't enable BFD as a protocol. BFD configuration
and paramter tuning are via BFD applicable UI.
Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com>
Reviewed-by: Shrijeet Mukherjee <shm@cumulusnetworks.com>
BGP: Reprocess the trigger points when an attached route map changes
Currently, modifications to route maps do not affect already processed
routes; they only affect new route updates. This patch addresses this
limitation.
Signed-off-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
COMMAND:
Possible forms of the command configuration:
[no] bgp max-med administrative
[no] bgp max-med administrative <max-med-value>
[no] bgp max-med on-startup <period>
[no] bgp max-med on-startup <period> <max-med-value>
DESCRIPTION:
'administrative' takes effect from the time of the config until the config is
removed.
'on-startup' is effective only at the startup time for the given '<period>'
after the first peer is established.
'<max-med-value>' is used as the MED value to be sent out when the max-med
is effective. Default max-med value is 4294967294.
NOTE:
When max-med is active, MED is changed only in the outgoing attributes to the
peers, it doesn't modify any MED specific state of the attributes in BGP on
the local node.
Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
ISSUE:
During startup, BGP update prefix packing wasnt optimal and route installation
was found to be spread over.
SOLUTION:
With this patch, update-delay post processing is serialized to achieve:
a. better peer update packing
(which helps in reducing total number of BGP update packets)
b. installation of the resulting routes in zebra as close to each others
as possible.
(which can help zebra batch its processing and updates to Kernel better)
BGPd: Allow route-map policy modifications to also affect route reflectors.
By default, attribute modification via route-map policy out is ignored on
reflected routes. This patch provides an option to allow this modification
to occur. Once enabled, it affects all reflected routes.
Signed-off-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
BGP: Fix FSM to handle active/passive connections better
The existing code didn't work well when dual connections resulted between
peers during session bringup. This patch fixes that.
Signed-off-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
BGP: Event-driven route announcement taking into account min route advertisement interval
ISSUE
BGP starts the routeadv timer (peer->t_routeadv) to expire in 1 sec
when a peer is established. From then on, the timer expires
periodically based on the configured MRAI value (default: 30sec for
EBGP, 5sec for IBGP). At the expiry, the write thread is triggered
that takes the routes from peer's sync FIFO (adj-rib-out) and sends
UPDATEs. This has a few drawbacks:
(1) Delay in new route announcement: Even when the last UPDATE message
was sent a while back, the next route change will necessarily have
to wait for routeadv expiry
(2) CPU usage: The timer is always armed. If the operator chooses to
configure a lower value of MRAI (zero second is a preferred choice
in many deployments) for better convergence, it leads to high CPU
usage for BGP process, even at the times of no network churn.
PATCH
Make the route advertisement event-driven - When routes are added to
peer's sync FIFO, check if the routeadv timer needs to be adjusted (or
started). Conversely, do not arm the routeadv timer unconditionally.
The patch also addresses route announcements during read-only mode
(update-delay). During read-only mode operation, the routeadv timer
is not started. When BGP comes out of read-only mode and all the
routes are processed, the timer is started for all peers with zero
expiry, so that the UPDATEs can be sent all at once. This leads to
(near-)optimal UPDATE packing.
Finally, the patch makes the "max # packets to write to peer socket at
a time" configurable. Currently it is hard-coded to 10. The command is
at the top router-bgp mode and is called "write-quanta <number>". It
is a useful convergence parameter to tweak.
Signed-off-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
BGP: Show more meaningful outq value in 'show ip bgp summary' output.
'outq' field in 'show ip bgp sum' displays the number of formatted packets
to a peer. Since the route announcement follows an input-buffered pattern
(i.e. adj-rib-out is a separate queue of routes per peer and packets are
formatted from the routes at the time of TCP write), the outq field doesn't
show any interesting data worth watching.
The patch is to display the adj-rib-out queue depth instead.
signed-off-by: pmohapat@cumulusnetworks.com
reviewed-by: dwalton@cumulusnetworks.com
COMMAND:
table-map <route-map-name>
DESCRIPTION:
This feature is used to apply a route-map on route updates from BGP to Zebra.
All the applicable match operations are allowed, such as match on prefix,
next-hop, communities, etc. Set operations for this attach-point are limited
to metric and next-hop only. Any operation of this feature does not affect
BGPs internal RIB.
Supported for ipv4 and ipv6 address families. It works on multi-paths as well,
however, metric setting is based on the best-path only.
IMPLEMENTATION NOTES:
The route-map application at this point is not supposed to modify any of BGP
route's attributes (anything in bgp_info for that matter). To achieve that,
creating a copy of the bgp_attr was inevitable. Implementation tries to keep
the memory footprint low, code comments do point out the rationale behind a
few choices made.
bgp_zebra_announce() was already a big routine, adding this feature would
extend it further. Patch has created a few smaller routines/macros whereever
possible to keep the size of the routine in check without compromising on the
readability of the code/flow inside this routine.
For updating a partially filtered route (with its nexthops), BGP to Zebra
replacement semantic of the next-hops serves the purpose well. However, with
this patch there could be some redundant withdraws each time BGP announces a
route thats (all the nexthops) gets denied by the route-map application.
Handling of this case could be optimized by keeping state with the prefix and
the nexthops in BGP. The patch doesn't optimizing that case, as even with the
redundant withdraws the total number of updates to zebra are still be capped
by the total number of routes in the table.
Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com>
Reviewed-by: Pradosh Mohapatra <pmohapat@cumulusnetworks.com>