Commit Graph

2266 Commits

Author SHA1 Message Date
Donald Sharp
bdb6f268f3
Merge pull request #1595 from dwiesner/pmsi-tunnel
bgpd: add PMSI_TUNNEL_ATTRIBUTE to EVPN IMET routes
2018-01-05 14:08:29 -05:00
Dario Wiesner
a21bd7a3b9 bgpd: add PMSI_TUNNEL_ATTRIBUTE to EVPN IMET routes
Signed-off-by: Dario Wiesner <dario.wiesner@gmail.com>
2018-01-04 13:51:15 +01:00
Donald Sharp
8356e9b7b0 bgpd: fix failing to compile on 32 bit systems
-Werror=sign-compare is failing with signed/unsigned usage
in the conditional expression.

Fixes: #1593
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-01-04 05:45:28 -05:00
Quentin Young
74ffbfe6fe
bgpd: use ring buffer for network input
The multithreading code has a comment that reads:
"XXX: Heavy abuse of stream API. This needs a ring buffer."

This patch makes the relevant code use a ring buffer.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2018-01-03 14:35:11 -05:00
Donald Sharp
d3c7efede7 bgpd: Allow for deprecation of json bgpTimerUp
The bgpTimerUp value was incorrectly named, add
a correct name bgpTimerUpMsec and add some
code to allow for deprecation.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-01-03 11:19:20 -05:00
Donald Sharp
e033027459 bgpd: Fix peer uptime display in milliseconds
For some reason bgp is calculating the peer uptime
in miliseconds incorrectly.  Additionally we have
the peer_uptime function call that should be doing this!
But since we've choosen different names for the json output
we cannot fix it at this point.

uptime contains the number of seconds of uptime here.  Just
multiply by 1k and display that( as peer_uptime does )

Fixes: #1585
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-01-03 11:19:20 -05:00
mitesh
523cafc418 bgpd, lib, zebra: fix style problems
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-27 11:47:10 -08:00
Don Slice
e2a86ad9b4 bgpd: convert network statements from DEFUN to DEFPY
Problems reported with inconsistent use of parameters for bgp network
statements.  Converted 12 DEFUNs to 2 DEFPY statements, making the
parameter use consistent with the exception of keeping the "backdoor"
keywork ipv4 only.  Also verified that if a route-map or label-index
is specified in the "no" case it matches what had been previously
defined. Manual testing looks good and bgp-smoke will be performed
before pushing.

Ticket: CM-16860
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Reviewed-by: CCR-7056
2017-12-21 19:07:56 +00:00
Donald Sharp
ff99c5b2bb
Merge pull request #1551 from LabNConsulting/working/master/minor-perf
bgpd: minor performance enhancement
2017-12-19 13:21:09 -05:00
Rafael Zalamena
e492e668b6
Merge pull request #1553 from donaldsharp/bgp_json_routes
bgpd: Speedup vtysh handling of 'show bgp afi safi json' display
2017-12-19 14:49:59 -02:00
Donald Sharp
449feb8ef9 bgpd: Fix double free
The json code was freeing json_paths and then
turning around and free'ing it again.  Newer
versions of json-c have started to assert
this bad behavior.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2017-12-19 08:20:30 -05:00
Raymond P. Burkholder
fb8f41e455 bgpd: fixed '-Werror=maybe-uninitialized' warnings
- used @sharpd's slack patch as a starting point
- fixes compile time issue, but code path not tested

Signed-off-by: Raymond P. Burkholder <github@oneunified.net>
2017-12-19 08:15:44 -05:00
Mitesh Kanjariya
3d57c99404 bgpd: rd_idspace should be freed in bgp_exit
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-15 13:15:47 -08:00
Mitesh Kanjariya
c383080176 bgpd: solve valgrind issues in bgp_evpn_cleanup
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-15 09:23:13 -08:00
Rafael Zalamena
62b05982c6 Merge pull request #1535 from qlyoung/fix-coalesce-time-display
bgpd: fix config display of coalesce-time
2017-12-15 11:46:28 -02:00
Mitesh Kanjariya
877702e757 bgpd: solve SA issue in bgp_evpn_unconfigure_export_rt_for_vrf
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-15 02:09:04 -08:00
Quentin Young
37a333fef1
bgpd: fix configuration of 0 for coalesce-time
Was using 0 as a sentinel value, so user couldn't configure 0 as the
value of the coalesce timer.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-12-14 16:43:31 -05:00
mitesh
c3004bc483 bgpd: resolve memory leak in bm_master_init
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:08 -08:00
Mitesh Kanjariya
655b04d1c2 zebra/bgpd: cleanup l3vni on no advertise-all-vni
EVPN is only enabled when user configures advertise-all-vni.
All VNIs (L2 and L3) should be cleared upon removal of this config.

Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:08 -08:00
Mitesh Kanjariya
42cb44f282 bgpd: uninstall type-5 routes from vrf
When we receive an MP_UNREACH,
we try to uninstall routes from the VRF and the VNI.
The route-node in the VRF corresponds
to the ip prefix formed from EVPN prefix.
We should correctly form the prefix based on the EVPN route-type.

Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:08 -08:00
Mitesh Kanjariya
bf48830b01 bgpd: move vrf rd command under address-family l2vpn evpn
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:08 -08:00
Mitesh Kanjariya
5424b7bad8 bgpd: advertise/withdraw new added/deleted type-5 routes
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:08 -08:00
Mitesh Kanjariya
90264d64ef bgpd: process evpn type-5 routes received from peers
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:08 -08:00
Mitesh Kanjariya
e9fc2840f7 bgpd: distinguish between frr prefixlen and packet prefixlen for EVPN type-5 routes
for EVPN routes prefixlen filed in struct prefix
represents the sizeof of the struct rather than the actual prefix len.
This is later used in looking up route node in RIB.

Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:08 -08:00
Mitesh Kanjariya
408b00c4d7 bgpd: only advertise valid subnet routes as evpn type-5 routes
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:08 -08:00
Mitesh Kanjariya
6ee86383bb bgpd: write advertise <ipv4|ipv6> unicast under bgp vrf config
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:08 -08:00
Mitesh Kanjariya
053905d2e3 bgpd: follow AFI/SAFI style for advertising/withdrawing type-5 routes
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:08 -08:00
Mitesh Kanjariya
4992b4aee2 bgpd: update type-5 routes upon vrf export-rt change
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:08 -08:00
Mitesh Kanjariya
06d2e8f3ba bgpd: update/withdraw type-5 routes upon l3-vni add/del
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:08 -08:00
Mitesh Kanjariya
94c2f693a4 bgpd: evpn enabled should only be checked for default instance
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:07 -08:00
Mitesh Kanjariya
80b140af22 bgpd: update type-5 routes when RD changes
when router-id/RD changes for a bgp vrf instance,
we need to update all type-5 routes.

Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:07 -08:00
mitesh
342dd0c623 bgpd: advertise/withdraw type-5 routes upon user config
CLI config for enabling/disabling type-5 routes

router bgp <as> vrf <vrf>
  address-family l2vpn evpn
    [no] advertise <ipv4|ipv6|both>

loop through all the routes in VRF instance and advertise/withdraw
all ip routes as type-5 routes in default instance.

Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:07 -08:00
mitesh
b67a60d2cf bgpd: set vrf originator ip to kernels local-ip
For EVPN type-5 route the NH in the NLRI is set to the local tunnel ip.
This information has to be obtained from kernel notification.
We need to pass this info from zebra to bgp in l3vni call flow.
This patch doesn't handle the tunnel-ip change.

Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:07 -08:00
mitesh
676f83b991 bgpd: RD derivation for VRF
1. VRF RD can be auto-derived (simillar to RD for a VNI)
2. VRF RD can be configured manually through a config

Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:07 -08:00
mitesh
e9eb5f63ed bgpd: move rd id bitfield to bgp_master
currently, we have a rd_id bitfield
to assign an unique index for auto RD.
This bitfield currently resides under struct bgp which seems wrong.
We need to shift this to a global space
as this ID space is really global per box.
One more reason to keep it at a global data structure is,
the ID space could be used by both VNIs and VRFs.

Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:07 -08:00
Mitesh Kanjariya
0b5131c951 bgpd: handle different sequence of bgp vrf create/delete
BGP VRF can be created/deleted either via config or via l3vni add/del.
We need to handle various sequences.

1. If user config is presented, an l3vni del should not delete the vrf instance
2. do not write bgp config in show running for auto created vrf
2. If l3vni present, disallow the cli for deleting bgp vrf instance
3. If l3vni is added and vrf config is present set the flags properly
4. if bgp vrf is configured unset the AUTO flag

Ticket: CM-18630
Review: CCR-6906
Testing: Manual

Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:07 -08:00
Mitesh Kanjariya
c4edf7086b bgpd: properly initialize ret variable
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:07 -08:00
Mitesh Kanjariya
df399d1cb1 bgpd: move vrf RT command under address-family l2vpn evpn
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:07 -08:00
Mitesh Kanjariya
7ec156a9c4 bgpd: do not advertise ipv6 host routes with l3-vni related ext-comm
Currently, kernel doesn't support ipv6 routes with ipv4 nexthops.
To avoid the crash,
we will only attach l3-vni related
RTs/ecommunities only to ipv4 host routes.

Ticket: CM-18743
Review: ccr-6912
Testing: Manaul

Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:07 -08:00
mitesh
bb7a24aba9 zebra: use list_delete_and_null instead of list_delete
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:07 -08:00
Mitesh Kanjariya
6be9a2087f bgpd: adjust show bgp l2vpn evpn vni command to avoid sanity breakages
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:07 -08:00
Mitesh Kanjariya
41d6d607b7 bgpd: write vrf import/export RT config to frr.conf
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:07 -08:00
Mitesh Kanjariya
e92bd2a28b bgpd: update all routes when vrf changes on a VNI
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:07 -08:00
Mitesh Kanjariya
ceb9a9210b bgpd: json support for show bgp vrf <> l3vni info
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:06 -08:00
Mitesh Kanjariya
181c08c6fc bgpd: json support for show bgp l2vpn evpn vrf-import-rt
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:06 -08:00
Mitesh Kanjariya
01740ff487 bgpd: set type-2 route flag if necessary in bgp_zebra_witgdraw
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:06 -08:00
Mitesh Kanjariya
a89b49cc4e bgpd: do not send label to zebra route add for evpn routes
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:06 -08:00
Mitesh Kanjariya
30a30f5727 bgpd: only install mac_ip routes in vrf
Signed-off-by: Mitesh Kanjariya<mitesh@cumulusnetworks.com>
2017-12-14 10:57:06 -08:00
Mitesh Kanjariya
1eb8800250 bgpd: use bgp_process while processing evpn routes in vrf
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:06 -08:00
Mitesh Kanjariya
5ba238b74a bgpd: import/unimport vrf routes upon l3vni change
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:05 -08:00
mitesh
2dbad57fc6 bgpd: program nh/rmac entries
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:05 -08:00
mitesh
d3135ba31d bgpd: program mac-ip routes in matching vrfs
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:05 -08:00
Mitesh Kanjariya
10ebe1ab54 bgpd: import rt to vrf mapping
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:05 -08:00
Mitesh Kanjariya
bc59a6720c bgpd: rmac ext comm
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:05 -08:00
Mitesh Kanjariya
23a06e1170 zebra: don't get rmac in remote macip delete
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:05 -08:00
Mitesh Kanjariya
f1f8b53c51 bgpd: handle export rt change for vrf
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:05 -08:00
Mitesh Kanjariya
7a3e76f119 bgpd: add VRF export RTs to mac-ip routes
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:05 -08:00
Mitesh Kanjariya
6d8c58b7e1 bgpd: bgpevpn APIs to get l3vni/rmac and import/export RT list
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:05 -08:00
Mitesh Kanjariya
6a8657d0f0 bgpd: link l2vnis(bgpevpn) to l3vni(vrf)
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:05 -08:00
Mitesh Kanjariya
c581d8b0f4 bgpd: import/export rt for BGP vrf
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:05 -08:00
Mitesh Kanjariya
fe1dc5a374 bgpd: l3vni/rmac association with bgp vrf
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:05 -08:00
Mitesh Kanjariya
29c539221f bgpd: Bgpevpn tenant vrf association
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
2017-12-14 10:57:05 -08:00
Rafael Zalamena
1ad057aed6 bgpd: handle argv_find_and_parse_afi return value
Handle the return value of argv_find_and_parse_afi() to avoid passing
along bad values.

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2017-12-14 14:04:01 -02:00
Rafael Zalamena
a90b8cb58a bgpd: use buffer size instead of hardcoded value
This is a possible buffer overflow.

We should always use the buffer size (whenever possible) to tell
functions what the size of the buffer is, instead of a hardcoded value.

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2017-12-14 14:03:14 -02:00
Donald Sharp
23b2a7ef52 bgpd: Speedup vtysh handling of 'show bgp afi safi json' display
When bgp has a metric butt-load of routes w/ ecmp, this command
can take an inordinate amount of time to run and complete via
vtysh.

Converting the bgp route output in this case back to not
using the json pretty-print drops ~2 minutes of runtime
off.

It is assumed that if users would like pretty output they
can run it through an appropriate tool via a pipe command.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2017-12-14 09:31:26 -05:00
Lou Berger
0d0268a6ba bgpd: minor performance enhancement
Signed-off-by: Lou Berger <lberger@labn.net>
2017-12-14 08:32:50 -05:00
Donald Sharp
06b9f471ab bgpd: Fixup buffer sizes for prefix_rd2str
The prefix_rd2str uses a buffer size of RD_ADDRSTRLEN.
Modify the code to consistently use this for all of BGP.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2017-12-11 12:39:16 -05:00
Donald Sharp
02d3243970 bgpd: Modify prefix_rd2str to return "Unknown" when unknown
Make prefix_rd2str return an "Unknown" string when something
goes wrong.  This will allow for simplification of the
code that uses prefix_rd2str.

Additionally ensure that size is big enough with an assert.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2017-12-11 12:39:16 -05:00
Donald Sharp
ff6566f3ef bgpd: Cleanup SA error in ignoring return from function
Ignoring the return from argv_find_and_parse_afi
makes the SA system assume that you could pass a AFI_MAX
value to bgp_show_route.  Which in turn would cause
an array out of bounds read.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2017-12-11 08:42:51 -05:00
Donald Sharp
872ed4c793 bgpd: Fix prefix2str using BUFSIZ to PREFIX_STRLEN
PREFIX_STRLEN is the correct length for buffers needed to output
a prefix2str.  Additionally cleanup some setting of the last
value to a `\0` this is handled by prefix2str.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2017-12-11 08:42:51 -05:00
Rafael Zalamena
35f897c3b9
Merge pull request #1520 from donaldsharp/vrf_leaking
Cleanup a bunch of code
2017-12-07 11:43:38 -02:00
Quentin Young
14b8641a5e
bgpd: fix config display of coalesce-time
Since coalesce time is now heuristically adjusted based on peer count,
we need to separate out specific configuration by the user from the
current value. Behavior established is to not adjust if the user has a
value set.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-12-06 17:38:24 -05:00
Martin Winter
176b305080 bgpd: bgp_attr.c GCC 7.0 with --werror needs explicit fall-thru comment
Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
2017-12-05 00:30:13 -08:00
Donald Sharp
8cea954775 bgpd: Cleanup unneeded NULL checks.
All the NULL checks come after previous dereferences.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2017-12-04 21:28:19 -05:00
Donald Sharp
116e176d99 bgpd, zebra: Use sscanf return value
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2017-12-04 21:28:19 -05:00
Donald Sharp
82417ac28e bgpd: Tell the compiler we don't care about the return code for peer_sort
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2017-12-04 21:28:19 -05:00
Donald Sharp
2ad7c8ecb7 bgpd: Reorder assignment and assertion.
If we ever turn off assertion for production builds
this code as written will cause a crash in that
the assignment will not happen.

Modify the code such that this erroneous assumption
cannot happen.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2017-12-04 21:28:19 -05:00
Renato Westphal
2e4c229616 *: make clippy usage more consistent
Fixes #1511.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
2017-12-04 19:46:38 -02:00
Renato Westphal
ded104de0c
Merge pull request #1507 from donaldsharp/bgp_af_open
bgpd: Allow Address-Family activation to work in certain states
2017-12-04 17:34:19 -02:00
Rafael Zalamena
6bbda63df7
Merge pull request #1508 from qlyoung/bgpd-fix-lock
bgpd: fix potential deadlock
2017-12-04 11:16:45 -02:00
Quentin Young
2d34fb80b8
*: don't use deprecated stream.h macros
Some of the deprecated stream.h macros see such little use that we may
as well just remove them and use the non-deprecated macros.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-12-01 13:51:06 -05:00
Quentin Young
8ec586b01b
bgpd: fix potential deadlock
With the way things are set up, this bit of code would never actually
cause a deadlock, but would be highly likely in the future.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-12-01 13:41:27 -05:00
Donald Sharp
64c1a9b59b bgpd: Allow Address-Family activation to work in certain states
If we are in OpenSent or OpenConfirm peer state and we receive a new
address-family activation, we would end up ignoring the new activation
and not tell our peer about it.  You could notice this by seeing
the fact that a 'show bgp neighbor' command returns a 'Not in
any update group' for a particular family.

This modifies the code such that we now notice that we are in
either OpenSent or OpenConfirm state and reset the peer to
allow us to send them the new capability.

Ticket: CM-19021
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2017-12-01 11:49:13 -05:00
Quentin Young
6ec98a2f37
bgpd: small optimization with UPDATE generation
After a batch of generated UPDATEs, call bgp_writes_on() once instead of
after generating each packet.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 17:17:16 -05:00
Quentin Young
c58b0f46dd
bgpd: use FOREACH_AFI_SAFI()
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:58:37 -05:00
Quentin Young
4961a5a2eb
bgpd: intelligently adjust coalesce timer
The subgroup coalesce timer controls how long updates to a particular
subgroup are delayed in order to allow additional peers to join the
subgroup. Presently the timer value is 200 ms. Increase it to 1 second
and adjust up as peers are configured, with an upper cap at 10s.

This cuts convergence time by a factor of 3 at large scale (300+ peers,
1000+ prefixes per peer).

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:47:51 -05:00
Quentin Young
934af4587f
bgpd: turn off keepalives when sending NOTIFY
This is necessary because otherwise between the time we wipe the output
buffer and the time we push the NOTIFY onto it, the KA generation thread
could have pushed a KEEPALIVE in the middle.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:18:07 -05:00
Quentin Young
d0ad6d8e5f
bgpd: yield more when generating UPDATEs
In the same vein as the round-robin input commit, this re-adds logic for
limiting the amount of time spent generating UPDATEs per generation
cycle. Missed this when shifting around wpkt_quanta; prior to MT it
limited both calls to write() as well as UPDATE generation.
2017-11-30 16:18:07 -05:00
Quentin Young
b785b7adda
bgpd: schedule UPDATE generation smarter
No need to schedule a job to generate more packets until we're done with
the ones we've got. Shaves a few percent off convergence time.
2017-11-30 16:18:06 -05:00
Quentin Young
9773a576bd
bgpd: restore packet input limit
Unfortunately, batching input processing severely impacts BGP initial
convergence times. As a consequence of the way update-groups were
implemented, advancing the state of the routing table based on prefixes
learned from one peer prior to all (or at least most) peers establishing
connections will cause us to start generating outbound UPDATEs, which is
a very expensive operation at present. This intensive processing starves
out bgp_accept(), delaying connection of additional peers. When
additional peers do connect the problem gets worse and worse, yielding
approximately exponential growth in convergence time dependent on both
peering and prefix counts. This behavior is present pre-multithreading
as well, but batched input exacerbates it.

Round-robin input processing marginally harms convergence times for
small topologies but should allow much larger topologies to function
within reasonable performance thresholds.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:18:06 -05:00
Quentin Young
4af766600a
bgpd: schedule process packet as timer
Different places scheduling the same thread should use the same
semantics and thread type. Additionally providing the back reference
here makes sure we only schedule the job once and avoids flooding the
event queue with jobs to process an empty buffer.
2017-11-30 16:18:06 -05:00
Quentin Young
af1e1dc69e
bgpd: re-add write trigger logic
Apparently I didn't fully understand how subgroup packets make their way
out to individual peers. Turns out (on the base branch) we just busy
poll while waiting for packets to make their way onto subgroup queues.
While this needs to be fixed in the future, for now readding this logic
fixes performance issues with convergence.
2017-11-30 16:18:06 -05:00
Quentin Young
5c075a907d
bgpd: properly set peer->last_update
Instead of checking whether the post-write number of updates sent was
greater than the pre-write number of updates sent, it was comparing post
to zero. In effect this meant every time we wrote a packet it was
counted as an update for route advertisement timer purposes.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:18:06 -05:00
Quentin Young
7a86aa5a0a
bgpd: schedule packet job after connection xfer
During initial session establishment, bgpd performs a "connection
transfer" to a new peer struct if the connection was initiated passively
(i.e. by the remote peer). With the addition of buffered input and a
reorganized packet processor, the following race condition manifests:

1. Remote peer initiates a connection. After exchanging OPEN messages,
   we send them a KEEPALIVE. They send us a KEEPALIVE followed by
   10,000 UPDATE messages. The I/O thread pushes these onto our local
   peer's input buffer and schedules a packet processing job on the
   main thread.
2. The packet job runs and processes the KEEPALIVE, which completes the
   handshake on our end. As part of transferring to ESTABLISHED we
   transfer all peer state to a new struct, as mentioned. Upon returning
   from the KEEPALIVE processing routing, the peer context we had has
   now been destroyed. We notice this and stop processing. Meanwhile
   10k UPDATE messages are sitting on the input buffer.
3. N seconds later, the remote peer sends us a KEEPALIVE. The I/O thread
   schedules another process job, which finds 10k UPDATEs waiting for
   it. Convergence is achieved, but has been delayed by the value of the
   KEEPALIVE timer.

The racey part is that if the remote peer takes a little bit of time to
send UPDATEs after KEEPALIVEs -- somewhere on the order of a few hundred
milliseconds -- we complete the transfer successfully and the packet
processing job is scheduled on the new peer upon arrival of the UPDATE
messages. Yuck.

The solution is to schedule a packet processing job on the new peer
struct after transferring state.

Lengthy commit message in case someone has to debug similar problems in
the future...

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:18:05 -05:00
Quentin Young
7db44ec8fa
bgpd: transfer raw input buffer to new peer
During initial session establishment, bgpd performs a "connection
transfer" to a new peer struct if the connection was initiated passively
(i.e. by the remote peer). With the addition of buffered input, I forgot
to transfer the raw input buffer to the new peer. This resulted in
infrequent failures during session handshaking whereby half of a packet
would be thrown away in the middle of a read causing us to send a NOTIFY
for an unsynchronized header. Usually the transfer coincided with a
clean input buffer, hence why it only showed up once in a while.
2017-11-30 16:18:05 -05:00
Quentin Young
387f984e58
bgpd: fix bgp active open
At some point when rearranging FSM code, bgpd lost the ability to
perform active opens because it was only paying attention to POLLIN and
not POLLOUT, when the latter is used to signify a successful connection
in the active case.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:18:05 -05:00
Quentin Young
3fe63c291d
bgpd: use correct byte order for notify data
Broke this when rewriting header validation.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:18:05 -05:00
Quentin Young
becedef6c3
bgpd, tests: comment formatting
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:18:05 -05:00
Quentin Young
85145b6264
bgpd: fix some formatting in bgp_io.c
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:18:04 -05:00
Quentin Young
1588f6f441
bgpd: update atomic memory orders
Use best-performing memory orders where appropriate.
Also update some style and add missing comments.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:18:04 -05:00