This flag needs to be set by default for l2vpn evpn address-family.
We needed to find a place in the code which gets called by all peers
at somepoint in the statemachine and before the routes are advertised.
peer_new seems like the right place for this
as we are setting other default af_flags here as well.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
In the FRR implementation of EVPN,
eBGP leaf-spine peering for EVPN is fully supported by allowing
the next hop to be propagated and not rewritten at each hop.
There are other changes also related to route import to facilitate this.
However, propagating the next hop is not correct in some cases.
Specifically, if the DC is comprised of multiple PODs
with distinct intra-POD and inter-POD VxLAN tunnels,
EVPN routes received from an adjacent POD by a border/exit leaf
must be propagated into the local POD with the next hop rewritten (to self).
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Attribute set on peer was being overridden when set on the peer-group.
This commit also adds a parallel flags array that indicates whether a
particular flag is sourced from the peer-group or is peer-specific. It
assumes the default state of all flags is unset. This looks to be true
except in the case of PEER_FLAG_SEND_COMMUNITY,
PEER_FLAG_SEND_EXT_COMMUNITY, and PEER_FLAG_SEND_LARGE_COMMUNITY; these
flags are set by default except when the user specifies to use
config-type = cisco. However the flag field can merely be flipped to
mean the negation of those options in a future commit.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Upon BGP destroy, the hash list related to PBR are removed.
The pbr_match entries, as well as the contained pbr_match_entries
entries.
Then the pbr_action entries. The order is important, since the former
are referencing pbr_action. So the references must be removed, prior to
remove pbr action.
Also, the zebra associated contexts are removed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
bgp structure is being extended with hash sets that will be used by
flowspec to give policy routing facilities.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Setup a per-VRF identifier to use along with the Router Id to build the
RD. Define a function to encode the RD. Code is brought over from EVPN
and EVPN code has been modified to use the generic function.
Ticket: CM-20256
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
add the `import vrf XXXX` command
router bgp 4 vrf DONNA
<config>
!
router bgp 4 vrf EVA
<config>
address-family ipv4 uni
import vrf DONNA
!
!
This command will allow for vrf EVA to specify that it would like
to receive the routes from vrf DONNA into it's table.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add support for CLI "auto" keyword in vrf->vpn export label:
router bgp NNN vrf FOO
address-family ipv4 unicast
label vpn export auto
exit-address-family
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
MPLS label pool backed by allocations from the zebra label manager.
A caller requests a label (e.g., in support of an "auto" label
specification in the CLI) via lp_get(), supplying a unique ID and
a callback function. The callback function is invoked at a later
time with the unique ID and a label value to inform the requestor
of the assigned label.
Requestors may release their labels back to the pool via lp_release().
The label pool is stocked with labels allocated by the zebra label
manager. The interaction with zebra is asynchronous so that bgpd
is not blocked while awaiting a label allocation from zebra.
The label pool implementation allows for bgpd operation before (or
without) zebra, and gracefully handles loss and reconnection of
zebra. Of course, before initial connection with zebra, no labels
are assigned to requestors. If the zebra connection is lost and
regained, callbacks to requestors will invalidate old assignments
and then assign new labels.
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
Routes that have labels must be sent via a nexthop that also has labels.
This change notes whether any path in a nexthop update from zebra contains
labels. If so, then the nexthop is valid for routes that have labels.
If a nexthop update has no labeled paths, then any labeled routes
referencing the nexthop are marked not valid.
Add a route flag BGP_INFO_ANNC_NH_SELF that means "advertise myself
as nexthop when announcing" so that we can track our notion of the
nexthop without revealing it to peers.
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
This work is derived from a work done by China-Telecom.
That initial work can be found in [0].
As the gap between frr and quagga is important, a reworks has been
done in the meantime.
The initial work consists of bringing the following:
- Bringing the client side of flowspec.
- the enhancement of address-family ipv4/ipv6 flowspec
- partial data path handling at reception has been prepared
- the support for ipv4 flowspec or ipv6 flowspec in BGP open messages,
and the internals of BGP has been done.
- the memory contexts necessary for flowspec has been provisioned
In addition to this work, the following has been done:
- the complement of adaptation for FS safi in bgp code
- the code checkstyle has been reworked so as to match frr checkstyle
- the processing of IPv6 FS NLRI is prevented
- the processing of FS NLRI is stopped ( temporary)
[0] https://github.com/chinatelecom-sdn-group/quagga_flowspec/
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: jaydom <chinatelecom-sdn-group@github.com>
In BGP, doing policy-routing requires to use table identifiers.
Flowspec protocol will need to have that. 1 API from bgp zebra has been
done to get the table chunk.
Internally, onec flowspec is enabled, the BGP engine will try to
connect smoothly to the table manager. If zebra is not connected, it
will try to connect 10 seconds later. If zebra is connected, and it is
success, then a polling mechanism each 60 seconds is put in place. All
the internal mechanism has no impact on the BGP process.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The following types are nonstandard:
- u_char
- u_short
- u_int
- u_long
- u_int8_t
- u_int16_t
- u_int32_t
Replace them with the C99 standard types:
- uint8_t
- unsigned short
- unsigned int
- unsigned long
- uint8_t
- uint16_t
- uint32_t
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This commit is relying on bgp vpn-policy. It is needed to configure
several bgp vrf instances, and in each of the bgp instance, configure
the following command under address-family ipv4 unicast node:
[no] rt redirect import RTLIST
Then, a function is provided, that will parse the BGP instances.
The incoming ecommunity will be compared with the configured rt redirect
import ecommunity list, and return the VRF first instance of the matching
route target.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Previous patches to suppress display of automatically calculated
coalesce-time did not fully work because the flag indicating whether the
value was automatically calculated was not set properly upon creation.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
PR #1739 added code to leak routes between (default VRF) VPN safi and unicast RIBs in any VRF. That set of changes included temporary CLI including vpn-policy blocks to specify RD/RT/label/&c. After considerable discussion, we arrived at a consensus CLI shown below.
The code of this PR implements the vpn-specific parts of this syntax:
router bgp <as> [vrf <FOO>]
address-family <afi> unicast
rd (vpn|evpn) export (AS:NN | IP:nn)
label (vpn|evpn) export (0..1048575)
rt (vpn|evpn) (import|export|both) RTLIST...
nexthop vpn (import|export) (A.B.C.D | X:X::X:X)
route-map (vpn|evpn|vrf NAME) (import|export) MAP
[no] import|export [vpn|evpn|evpn8]
[no] import|export vrf NAME
User documentation of the vpn-specific parts of the above syntax is in PR #1937
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
- add "debug bgp vpn label" CLI
- improved debug messages for "debug bgp bestpath"
- send vrf label to zebra after zebra informs bgpd of vrf_id
- withdraw vrf_label from zebra if zebra informs bgpd that vrf_id is disabled
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
The work_queue_free function free'd up the wq pointer but
did not set it too NULL. This of course causes situations
where we may use the work_queue after it is freed. Let's
modify the work_queue to set the pointer for you.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When VRF is not yet available at startup, the check for main socket
presence must be done. As the main socket creation is made in a separate
place from vrf socket for netns, ths main socket creation must not be
prevented when a BGP VRF relies on vrf lite mechanism.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Upon creation of BGP instances, server socket may or may not be created.
In the case of VRF instances, if the VRF backend relies on NETNS, then
a new server socket will be created for each BGP VRF instance. If the
VRF backend relies on VRF LITE, then only one server socket will be
enough. Moreover, At startup, with BGP VRF configuration, a server
socket may not be created if VRF is not the default one or VRF is not
recognized yet.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The change contained in this commit does the following:
- discovery of vrf id from zebra daemon, and adaptation of bgp contexts
with BGP.
The list of network addresses contain a reference to the bgp context
supporting the vrf.
The bgp context contains a vrf pointer that gives information about
the netns path in case the vrf is a netns path.
Only some contexts are impacted, namely socket creation, and retrieval
of local IP settings. ( this requires vrf identifier).
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
peer->ifindex was only used in two places but it was never populated so
neither of them worked as they should. 'struct peer' also has a 'struct
interface' pointer which we can use to get the ifindex.
Use the new threading facilities provided in lib/ to streamline the
threads used in bgpd. In particular, all of the lifecycle code has been
removed from the I/O thread and replaced with the default loop. Did not
do the same to the keepalives thread as it is much smaller (doesn't need
the event system).
Also cleaned up some comments to match the style guide.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Adds ability to specify that peers should be administratively shutdown
when first configured.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
When a peer configured with administrative shutdown is added to a peer
group, the administrative shutdown status is discarded and the peer will
enter the BGP FSM. This is not what we want. Preserve the flag instead.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The constant to limit # of allowed cli tokens on any one line was
defined in multiple places, all inconsistent with each other. Fix.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The BGP IO thread must be running before other threads
can start using it. So at startup check to see
that it running once, instead of before every
function call into.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The multithreading code has a comment that reads:
"XXX: Heavy abuse of stream API. This needs a ring buffer."
This patch makes the relevant code use a ring buffer.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Was using 0 as a sentinel value, so user couldn't configure 0 as the
value of the coalesce timer.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
currently, we have a rd_id bitfield
to assign an unique index for auto RD.
This bitfield currently resides under struct bgp which seems wrong.
We need to shift this to a global space
as this ID space is really global per box.
One more reason to keep it at a global data structure is,
the ID space could be used by both VNIs and VRFs.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
BGP VRF can be created/deleted either via config or via l3vni add/del.
We need to handle various sequences.
1. If user config is presented, an l3vni del should not delete the vrf instance
2. do not write bgp config in show running for auto created vrf
2. If l3vni present, disallow the cli for deleting bgp vrf instance
3. If l3vni is added and vrf config is present set the flags properly
4. if bgp vrf is configured unset the AUTO flag
Ticket: CM-18630
Review: CCR-6906
Testing: Manual
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Since coalesce time is now heuristically adjusted based on peer count,
we need to separate out specific configuration by the user from the
current value. Behavior established is to not adjust if the user has a
value set.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
If we are in OpenSent or OpenConfirm peer state and we receive a new
address-family activation, we would end up ignoring the new activation
and not tell our peer about it. You could notice this by seeing
the fact that a 'show bgp neighbor' command returns a 'Not in
any update group' for a particular family.
This modifies the code such that we now notice that we are in
either OpenSent or OpenConfirm state and reset the peer to
allow us to send them the new capability.
Ticket: CM-19021
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The subgroup coalesce timer controls how long updates to a particular
subgroup are delayed in order to allow additional peers to join the
subgroup. Presently the timer value is 200 ms. Increase it to 1 second
and adjust up as peers are configured, with an upper cap at 10s.
This cuts convergence time by a factor of 3 at large scale (300+ peers,
1000+ prefixes per peer).
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
bgpd supports setting a write-quanta that serves as a hint on how many
packets to write per I/O cycle. Now that input is buffered, it makes
sense to add the equivalent parameter for how many packets are processed
per cycle. This is *not* how many packets are read off the wire per I/O
cycle; rather it is how many packets are processed from the input buffer
in a given cycle after having been read off the wire and sanitized.
Since these values must be used from multiple threads, they have also
been made atomic.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Instead of reading a packet header and the rest of the packet in two
separate i/o cycles, instead read a chunk of data at one time and then
parse as many packets as possible out of the chunk.
Also changes bgp_packet.c to batch process packets.
To avoid thrashing on useless mutex locks, the scheduling call for
bgp_process_packet has been changed to always succeed at the cost of no
longer being cancel-able. In this case this is acceptable; following the
pattern of other event-based callbacks, an additional check in
bgp_process_packet to ignore stray events is sufficient. Before deleting
the peer all events are cleared which provides the requisite ordering.
XXX: chunk hardcoded to 5, should use something similar to wpkt_quanta
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Move and modify all network input related code to bgp_io.c
* Add a real input buffer to `struct peer`
* Move connection initialization to its own thread.c task instead of
piggybacking off of bgp_read()
* Tons of little fixups
Primary changes are in bgp_packet.[ch], bgp_io.[ch], bgp_fsm.[ch].
Changes made elsewhere are almost exclusively refactoring peer->ibuf to
peer->curr since peer->ibuf is now the true FIFO packet input buffer
while peer->curr represents the packet currently being processed by the
main pthread.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
After implement threading, bgp_packet.c was serving the double purpose
of consolidating packet parsing functionality and handling actual I/O
operations. This is somewhat messy and difficult to understand. I've
thus moved all code and data structures for handling threaded packet
writes to bgp_io.[ch].
Although bgp_io.[ch] only handles writes at the moment to keep the noise
on this commit series down, for organization purposes, it's probably
best to move bgp_read() and its trappings into here as well and
restructure that code so that read()'s happen in the pthread and packet
processing happens on the main thread.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Changes all synchronization primitives to be dynamically allocated. This
should help catch any subtle errors in pthread lifecycles.
This change also pre-initializes synchronization primitives before
threads begin to run, eliminating a potential race condition that
probably would have caused a segfault on startup on a very fast box.
Also changes mutex and condition variable allocations to use
MTYPE_PTHREAD and updates tests to do the proper initializations.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Removes the WiP shim and implements proper thread lifecycle management.
* Declare necessary pthread_t's in bgp_master
* Define new MTYPE in lib/thread.c for pthreads
* Allocate and free BGP's pthreads appropriately
* Terminate and join threads appropriately
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Problem reported that we weren't adjusting the keepalive timer
correctly when we negotiated a lower hold time learned from a
peer. While working on this, found we didn't do inheritance
correctly at all. This fix solves the first problem and also
ensures that the timers are configured correctly based on this
priority order - peer defined > peer-group defined > global config.
This fix also displays the timers as "configured" regardless of
which of the three locations above is used.
Ticket: CM-18408
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Reviewed-by: CCR-6807
Testing-performed: Manual testing successful, fix tested by
submitter, bgp-smoke completed successfully
This improves code readability and also future-proofs our codebase
against new changes in the data structure used to store interfaces.
The FOR_ALL_INTERFACES_ADDRESSES macro was also moved to lib/ but
for now only babeld is using it.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
This is an important optimization for users running FRR on systems with
a large number of interfaces (e.g. thousands of tunnels). Red-black
trees scale much better than sorted linked-lists and also store the
elements in an ordered way (contrary to hash tables).
This is a big patch but the interesting bits are all in lib/if.[ch].
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
list_free is occassionally being used to delete the
list and accidently not deleting all the nodes.
We keep running across this usage pattern. Let's
remove the temptation and only allow list_delete
to handle list deletion.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Convert the list_delete(struct list *) function to use
struct list **. This is to allow the list pointer to be nulled.
I keep running into uses of this list_delete function where we
forget to set the returned pointer to NULL and attempt to use
it and then experience a crash, usually after the developer
has long since left the building.
Let's make the api explicit in it setting the list pointer
to null.
Cynical Prediction: This code will expose a attempt
to use the NULL'ed list pointer in some obscure bit
of code.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
If we are configuring a peer in multiple address families
and assigning the peer group valid configuration. If you
delay the non-automatically address family activation you
will not copy the peer group data into that peer.
Suppose we enter this:
router bgp 65001
bgp router-id 6.0.0.17
neighbor ISL peer-group
neighbor ISL advertisement-interval 0
neighbor ISL timers connect 5
neighbor ISL timers 3 10
address-family ipv4 unicast
neighbor ISL allowas-in 1
neighbor swp31s0 interface
neighbor swp31s0 peer-group ISL
address-family ipv6 unicast
neighbor ISL allowas-in 1
We've assigned allowas-in to the ISL peer group. Now suppose
we have a peer start connection to swp31s0. We startup and
auto copy the v4 peer group information onto the peer. We
do not copy the v6 peer group information because it has
not started yet.
Now at a later time if we enter:
address-family ipv6 unicast
neighbor ISL activate
We start the swp31s0 peer in v6, but we are not copying the
peer group data into the v6 swp31s0 peer data structure. As
such we are not respecting the v6 peer group config.
This Change modifies and renames the non_peergroup_activate_af
function to peer_activate_af. We also call the function
peer_group2peer_config_copy_af if the peer is part of a peer
group.
The static function peer_group2peer_config_copy_af I have moved
to higher up so we don't have to add a static function declaration
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This fixes the broken indentation of several foreach loops throughout
the code.
From clang's documentation[1]:
ForEachMacros: A vector of macros that should be interpreted as foreach
loops instead of as function calls.
[1] http://clang.llvm.org/docs/ClangFormatStyleOptions.html
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
1) Add hash names to all hash_create calls
2) Fix community_hash, ecommunity_hash and lcommunity_hash key
creation
3) Fix output of community and lcommunity iterators( why would
we want to see the memory location of the backet? ).
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
afi_header_vty_out() is easily replaced with vty_frame(), which means we
can drop a whole batch of "int *write" args as well as the entirety of
bgp_config_write_family_header().
=> AFI/SAFI config writing is now a lot simpler.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
1. Change hostname_get to cmd_hostname_get
2. Change domainname_get to cmd_domainname_get
3. New API to set domainname
3. Provide a CLI command to set domainname
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Bug introduced by commit 37d361e7. Removing the call to bgp_close()
from bgp_delete() was a mistake.
Reported-by: Don Slice <dslice@cumulusnetworks.com>
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
When the underlying VRF is deleted, ensure that state for the
next hops that BGP registers with zebra for tracking purposes is
properly updated. Otherwise BGP will not re-register the next hop
when the VRF is re-created, resulting in the next hop staying
unresolved.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Ticket: CM-17456
Reviewed By: CCR-6587
Testing Done: Manual, bgp-min, vrf
There are two parts to this commit:
1. create a database of self tunnel-ip for used in martian nexthop check
In a CLAG setup, the tunnel-ip (VNI UP) notification comes before the clag-anycast-ip comes up in the system.
This was causing our self next hop check to fail and we were instaling routes with martian nexthop in zebra.
We need to keep this info in a seperate database for all local tunnel-ip.
This database will be used in parallel with the self next hop database to martian nexthop checks.
2. When a local VNI comes up, update the tunnel-ip database and filter routes in the RD table if necessary
In case of EVPN we might receive routes from clag peer before the clag-anycast ip and VNI is up on the system.
We will store the routes in the RD table for later processing.
When VNI comes UP, we loop thorugh all the routes and install them in zebra if required.
However, we were missing the martian nexthop check in this code path.
From now onwards, when a VNI comes UP,
we will first update the tunnel-ip database
We then loop through all the routes in RD table and apply martian next hop filter if required.
Things not covered in this commit but are required:
This processing is needed in general when an address becomes a connected address.
We need to loop through all the routes in BGP and apply martian nexthop filter if necessary.
This will be taken care in a seperate bug
Ticket:CM-17271/CM-16911
Reviewed By: ccr-6542
Testing Done: Manual
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
When displaying the config, bgpd only checked for the existance of a peer-group prefix-list before
deciding to not display the outbound prefix-list. This commit updates the outbound prefix-list
logic to match the inbound.
afi_header_vty_out is sidestepping the vty code, writing straight to the
output (either stdout or the obuf), which results in newline translation
not being performed.
Easiest fix is replacing it with a macro. Longer-term, I have some old
code to add "prefaces" to the vty output, planning to dig that up.
Fixes: #949 ("bgpd show running doesn't show new lines")
Reported-by: Lou Berger <lberger@labn.net>
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The defines:
ONE_DAY_SECOND
ONE_WEEK_SECOND
ONE_YEAR_SECOND
were being defined all over the system, move the
define to a central location.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
SAFI values have been a major source of confusion over the last few
years. That's because each SAFI needs to be represented in two different
ways:
* IANA's value used to send/receive packets over the network;
* Internal value used for array indexing.
In the second case, defining reserved values makes no sense because we
don't want to index SAFIs that simply don't exist. The sole purpose of
the internal SAFI values is to remove the gaps we have among the IANA
values, which would represent wasted memory in C arrays. With that said,
remove these reserved SAFIs to avoid further confusion in the future.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
swpX peers all start out with the same sockunion so initially they all
go into the same hash bucket. Once IPv6 ND has worked its magic they
will have different sockunions and will go in different buckets...life
is good.
Until then though, we are in a phase where all swpX peers have the same
socknunion. Once we have HASH_THRESHOLD (10) swpX peers and call
hash_get for a new swpX peer the hash code calls hash_expand(). This
happens because there are more than HASH_THRESHOLD entries in a single
bucket so the logic is "expand the hash to spread things out"...in our
case expanding doesn't spread out the swpX peers because all of their
sockunions are the same.
I looked at having peer_hash_make and peer_hash_same consider the ifname
of the swpX peer but that is a large change that we don't want to make
at the moment. So the fix is to put a cap on how large we are
willing to let the hash table get. By default there is no limit but if
max_size is set we will not allow the hash to expand above that.
This reverts commit c14777c6bf.
clang 5 is not widely available enough for people to indent with. This
is particularly problematic when rebasing/adjusting branches.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
This allows frr-reload.py (or anything else that scripts via vtysh)
to know if the vtysh command worked or hit an error.
When the BGP router-id changes, EVPN routes need to be processed due
to potential change to their RD.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Implement configuration options for EVPN. The configuration options include
VNI configuration with RD and Import and Export Route Targets. Also, display
the EVPN configuration.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Define the EVPN (EVI) hash table and related structures and initialize
and cleanup.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
json-c does not (yet) offer support for unsigned integer types, and
furthermore, the docs state that all integers are stored internally as
64-bit. So there's never a case in which we would want to limit,
implicitly or otherwise, the range of an integer when adding it to a
json object.
Among other things this fixes the display of ASN values greater than
(1/2) * (2^32 - 1)
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
- All ipv4 labeled-unicast routes are now installed in the ipv4 unicast
table. This allows us to do things like take routes from an ipv4
unicast peer, allocate a label for them and TX them to a ipv4
labeled-unicast peer. We can do the opposite where we take routes from
a labeled-unicast peer, remove the label and advertise them to an ipv4
unicast peer.
- Multipath over a labeled route and non-labeled route is not allowed.
- You cannot activate a peer for both 'ipv4 unicast' and 'ipv4
labeled-unicast'
- The 'tag' variable was overloaded for zebra's route tag feature as
well as the mpls label. I added a 'mpls_label_t mpls' variable to
avoid this. This is much cleaner but resulted in touching a lot of
code.
* Ranges for some MED were 2^32 - 2 instead of 2^32 - 1
* Use correct printf specifiers for unsigned values
* Some drive-by CLI collapsing and simplification
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
We would only diplay one "no neighbor 2.2.2.2 send-community XYZ" but
there might be multiple that need to be displayed.
Add checks related to AFI_L2VPN/SAFI_EVPN that were missing in some parts
of the code. Fix incorrect check skipping EVPN when sending End of RIB.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
For certain (sub) address families such as EVPN or L3VPN, the routing
table is organized as a 2-level tree. Ensure that code walking the
routing table does the correct handling in such cases.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Basically if we are reading in a cli with a extended-nexthop
and we have not received from zebra the interface we are working
on I believe we have a race condition where we are not
propagating the PEER_FLAG_CAPABILITY_ENHE in this case.
Modify the code to propagate even if we haven't found the
interface yet.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Basically if we are reading in a cli with a extended-nexthop
and we have not received from zebra the interface we are working
on I believe we have a race condition where we are not
propagating the PEER_FLAG_CAPABILITY_ENHE in this case.
Modify the code to propagate even if we haven't found the
interface yet.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
vrf_iflist_create -> By the time this is called in enable, the vrf's iflist
is already created. Additionally this code should be a properly of the vrf
to init/destroy not someone else.
vrf_iflist_terminate -> This function should be a property of vrf deletion
and does not need to be exposed.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The FSF's address changed, and we had a mixture of comment styles for
the GPL file header. (The style with * at the beginning won out with
580 to 141 in existing files.)
Note: I've intentionally left intact other "variations" of the copyright
header, e.g. whether it says "Zebra", "Quagga", "FRR", or nothing.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The way thread.c is written, a caller who wishes to be able to cancel a
thread or avoid scheduling it twice must keep a reference to the thread.
Typically this is done with a long lived pointer whose value is checked
for null in order to know if the thread is currently scheduled. The
check-and-schedule idiom is so common that several wrapper macros in
thread.h existed solely to provide it.
This patch removes those macros and adds a new parameter to all
thread_add_* functions which is a pointer to the struct thread * to
store the result of a scheduling call. If the value passed is non-null,
the thread will only be scheduled if the value is null. This helps with
consistency.
A Coccinelle spatch has been used to transform code of the form:
if (t == NULL)
t = thread_add_* (...)
to the form
thread_add_* (..., &t)
The THREAD_ON macros have also been transformed to the underlying
thread.c calls.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Implement support for negotiating IPv4 or IPv6 labeled-unicast address
family, exchanging prefixes and installing them in the routing table, as
well as interactions with Zebra for FEC registration. This is the
implementation of RFC 3107.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Implement support for activating the labeled-unicast address family in
BGP and relevant configuration for this address family.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
This means there are no ties into the SNMP code anymore other than the
init call at startup.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Start centralising startup & option parsing into the library.
FRR_DAEMON_INFO is a bit weird, but it will become useful later (e.g.
for killing the ZLOG_* enum, and having the daemon name available)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
When starting up bgp and zebra now, you can specify
-e <number> or --ecmp <number>
and that number will be used as the maximum ecmp
that can be used.
The <number specified must be >= 1 and <= MULTIPATH_NUM
that Quagga is compiled with.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When configuring l2vpn evpn address-family, the show running indicates
that the address-family l2vpn evpn address-family has been configured,
and not evpn, as it was done before this commit. This is a bug fix.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This patch introduces show show bgp evpn commands to dump
NLRI entries configured or received on BGP, related to EVPN
New command introduced is the following:
show [ip] bgp l2vpn evpn [all | rd <rd name> ] [overlay]
Like for MPLS, similar set of commands is added for EVPN:
show [ip] bgp l2vpn evpn [all|rd <RDNAME>]
show [ip] bgp l2vpn evpn all neighbor <NEIGHBOR> routes
show [ip] bgp l2vpn evpn all neighbor <NEIGHBOR> advertised-routes
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This patch introduces code to receive a NLRI message with route type
5, as defined in draft-ietf-bess-evpn-prefix-advertisement-02. It
It increases the number of parameters to extract from the NLRI and
to store into bgp extra information structure. Those parameters are
the ESI (ethernet segment identifier), the gateway IP Address (which
acts like nexthop attribute but is contained inside the NLRI itself)
and the ethernet tag identifier ( that acts for the VXLan Identifier)
This patch updates bgp_update() and bgp_withdraw() api, and then does the
necessary adapations for rfapi.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
To handle BGP NLRI EVPN messages, bgp is modified to handle AFI_L2VPN
and SAFI_EVPN values.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
BGP Large Communities are a novel way to signal information between
networks. An example of a Large Community is: "2914:65400:38016". Large
BGP Communities are composed of three 4-byte integers, separated by a
colon. This is easy to remember and accommodates advanced routing
policies in relation to 4-Byte ASNs.
This feature was developed by:
Keyur Patel <keyur@arrcus.com> (Arrcus, Inc.),
Job Snijders <job@ntt.net> (NTT Communications),
David Lamparter <equinox@opensourcerouting.org>
and Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: Job Snijders <job@ntt.net>
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Introduce internal and IANA defintions for AFI/SAFI and mapping
functions and modify code to use these. This refactoring will
facilitate adding support for other AFI/SAFI whose IANA values
won't be suitable for internal data structure definitions (e.g.,
they are not contiguous).
The commit adds some fixes related to afi/safi testing with 'make check
' command.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Ticket: CM-11416
Reviewed By: CCR-3594 (mpls branch)
Testing Done: Not tested now, tested earlier on mpls branch
Preserve forwarding state bit can be set in OPEN message with the use of
new vty commands
bgp graceful-restart preserve-fw-state
no bgp graceful-restart preserve-fw-state
This must be set before activating the connection to a peer, since it is
for forging graceful restart capability of OPEN messages.
Signed-off-by: Julien Courtat <julien.courtat@6wind.com>
This patch enable the support of graceful restart for routes sets with
vpnv4 address family format. In this specific case, data model is
slightly different and some additional processing must be done when
accessing bgp tables and nodes.
The clearing stale algorithm takes into account the specificity where
the 2 node level for MPLS has to be reached.
Signed-off-by: Julien Courtat <julien.courtat@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
bgp_master_init is called first thing in main(), so we need to wedge a
qobj_init() call in there... this needs some improvement...
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This contains bgp memory leak fixes as well as cleanups to VRF/namespace
handling and has been run through extended testing in Cumulus' testbed:
Tested-by: Donald Sharp <sharpd@cumulusnetworks.com>
During connection establishment, there is a separate peer structure created
for the doppelganger (for incoming connection). When this is deleted after
the connection has established, take care to ensure that the nexthop entry
for the peer is not deleted.
Fixes: f9164b1d74
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Ticket: CM-13875
Reviewed By: None
Testing Done: Manual
(cherry picked from commit 4f2bc892cbddbf36bd5e1b2f36c33260af614b33)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
We shoult not call bgp_unlock() before calling
bgp_delete_connected_nexthop() in the peer_free() function. Otherwise,
if bgp->lock reaches zero, bgp_free() is called and peer->bgp becomes
an invalid pointer in the bgp_delete_connected_nexthop() function.
To fix this, move the call to bgp_unlock() to the end of peer_free().
Bug exposed by commit 37d361e ("bgpd: plug several memleaks").
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
We shoult not call bgp_unlock() before calling
bgp_delete_connected_nexthop() in the peer_free() function. Otherwise,
if bgp->lock reaches zero, bgp_free() is called and peer->bgp becomes
an invalid pointer in the bgp_delete_connected_nexthop() function.
To fix this, move the call to bgp_unlock() to the end of peer_free().
Bug exposed by commit 37d361e ("bgpd: plug several memleaks").
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Since VRFs can be searched by vrf_id or name, make this explicit in the
helper functions.
s/vrf_lookup/vrf_lookup_by_id/
s/zebra_vrf_lookup/zebra_vrf_lookup_by_id/
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
It seems these two were at some point copied in from rsync; replace with
more recent versions that will hopefully become available in glibc as
well.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
bgpd/bgpd.c had a typo
zebra/zebra_mpls_netlink.c was derived from rt_netlink.c
isisd/include-netbsd/* are not needed (2 constants moved over)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Ticket: CM-13239
Until today the admin distance cannot be configured for any IPv6
routing protocol. This patch implements it for bgp.
Signed-off-by: Maitane Zotes <maz@open.ch>
Signed-off-by: Roman Hoog Antink <rha@open.ch>
* Fix mild leak, bgp_nexthop_caches were not deleted when their peer was.
Not a huge one, but makes valgrinding for other leaks noisier.
Credit to Lou Berger <lberger@labn.net> for doing the hard work of
debugging and pinning down the leak, and supplying an initial fix.
That one didn't quite get the refcounting right, it seemed, hence
this version.
This version also keeps bncs pinned so long as the peer is defined, where
Lou's tried to delete whenever the peer went through bgp_stop. That causes
lots of zebra traffic if down peers go Active->Connect->Active, etc., so
leaving bnc's in place until peer_delete seemed better.
* bgp_nht.c: (bgp_unlink_nexthop_by_peer) similar to bgp_unlink_nexthop, but
by peer.
* bgp_nht.c: (bgp_unlink_nexthop_check) helper to consolidate checking
if a bnc should be deleted.
(bgp_unlink_nexthop_by_peer) ensure the bnc->nht_info peer reference
is removed, and hence allow bncs to be removed by previous.
* bgpd.c: (peer_delete) cleanup the peer's bnc.
Ticket: CM-13053
Reviewed By: dslice@cumulusnetworks.com
'neighbor x.x.x.x weight' was implemented as a per-peer knob instead of
a per-peer per-afi-safi option. This makes it configurable per-peer
per-afi-safi so that we can do things like soft clear that afi/safi when
weight is modified.
This feature adds an L3 & L2 VPN application that makes use of the VPN
and Encap SAFIs. This code is currently used to support IETF NVO3 style
operation. In NVO3 terminology it provides the Network Virtualization
Authority (NVA) and the ability to import/export IP prefixes and MAC
addresses from Network Virtualization Edges (NVEs). The code supports
per-NVE tables.
The NVE-NVA protocol used to communicate routing and Ethernet / Layer 2
(L2) forwarding information between NVAs and NVEs is referred to as the
Remote Forwarder Protocol (RFP). OpenFlow is an example RFP. For
general background on NVO3 and RFP concepts see [1]. For information on
Openflow see [2].
RFPs are integrated with BGP via the RF API contained in the new "rfapi"
BGP sub-directory. Currently, only a simple example RFP is included in
Quagga. Developers may use this example as a starting point to integrate
Quagga with an RFP of their choosing, e.g., OpenFlow. The RFAPI code
also supports the ability import/export of routing information between
VNC and customer edge routers (CEs) operating within a virtual
network. Import/export may take place between BGP views or to the
default zebera VRF.
BGP, with IP VPNs and Tunnel Encapsulation, is used to distribute VPN
information between NVAs. BGP based IP VPN support is defined in
RFC4364, BGP/MPLS IP Virtual Private Networks (VPNs), and RFC4659,
BGP-MPLS IP Virtual Private Network (VPN) Extension for IPv6 VPN . Use
of both the Encapsulation Subsequent Address Family Identifier (SAFI)
and the Tunnel Encapsulation Attribute, RFC5512, The BGP Encapsulation
Subsequent Address Family Identifier (SAFI) and the BGP Tunnel
Encapsulation Attribute, are supported. MAC address distribution does
not follow any standard BGB encoding, although it was inspired by the
early IETF EVPN concepts.
The feature is conditionally compiled and disabled by default.
Use the --enable-bgp-vnc configure option to enable.
The majority of this code was authored by G. Paul Ziemba
<paulz@labn.net>.
[1] http://tools.ietf.org/html/draft-ietf-nvo3-nve-nva-cp-req
[2] https://www.opennetworking.org/sdn-resources/technical-library
Now includes changes needed to merge with cmaster-next.
This is a rather large mechanical commit that splits up the memory types
defined in lib/memtypes.c and distributes them into *_memory.[ch] files
in the individual daemons.
The zebra change is slightly annoying because there is no nice place to
put the #include "zebra_memory.h" statement.
bgpd, ospf6d, isisd and some tests were reusing MTYPEs defined in the
library for its own use. This is bad practice and would break when the
memtype are made static.
Acked-by: Vincent JARDIN <vincent.jardin@6wind.com>
Acked-by: Donald Sharp <sharpd@cumulusnetworks.com>
[CF: rebased for cmaster-next]
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Signed-off-by: Christian Franke <chris@opensourcerouting.org>
Since the default for ipv4 unicast is to now assume
that the neighbor is activated, print out the
no neighbor 192.168.33.44 activate
line when it is explicitly turned off.
Ticket: CM-12809
Reported-by: Lou Berger <lberger@labn.net>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by:
There is support to configure graceful restart timer. This is the
time to wait to delete stale routes before a BGP open message is
received.
bgp graceful-restart restart-time <1-3600>
no bgp graceful-restart [<1-255>]
* bgpd/bgp_vty.c
* Define command strings for above CLI
* bgpd/bgpd.c
* bgp_config_write(): Output graceful restart-time configuration
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Tested-by: NetDEF CI System <cisystem@netdef.org>
bgp_address_destroy became per-bgp instance. Moved the
call to the bgp_address_destroy function to the bgp delete.
Signed-off-by: Lou Berger <lberger@labn.net>
(cherry picked from commit 637035710a2f8e1e5944ee714135b7f88ac15ac4)
* Solaris doesn't have u_int64_t, so use uint64_t instead. C99-style
fixed-width integers should always be preferred to improve portability;
* 's_addr' is a macro on Solaris, so we can't use it as a variable name.
Rename the 's_addr' variable to 'addr' in the
bgp_peer_conf_if_to_su_update_v4() function.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Logic for determining the router-id was spread out over bgp_zebra.c and
bgp_vty.c. Move to bgpd/bgpd.c and have these two call more properly
encapsulated functions.
Significant work by Christian Franke <chris@opensourcerouting.org>.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Adds "const" on:
- peer_update_source_addr_set()
- peer_description_set()
Adds parameter names on:
- bgp_timers_set()
(really confusing, this one, with 2 unexplained args of same type)
Adds new setter:
- peer_afc_set(), calling peer_activate/peer_deactivate.
(intended for API consumers, matches peer->afc)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
There was an exit added at the end of the BGP commands after we pulled the code from upstream. This was causing the reload scripts to fail. Removed this exit.
Ticket: CM-11464 CM-11924
Reviewed By: CCR-4995
Testing Done: Manual
<DETAILED DESCRIPTION (REPLACE)>
To make BGP configuration as simple as possible, assume the capability
extended-nexthop to be default for interface neighbors. Also allow the
ability to specify remote-as on the same line as neighbor interface to
make BGP unnumbered configuration a single line.
One corner case. This is the first feature for which the default for a
member is different from the default for a peer-group. Since advertising
the capability is only done for interface neighbors, the capability is
not set for the peer-group, but is automatically set for interface
neighbors that belong to that peer-group. So, if you want to disable the
advertisement of this capability for an interface neighbor, you must
do it per each interface neighbor.
The patch is more complicated than it needs to be due to the handling
of quagga reload and appropriate updates to the show running output.
Ticket: CM-11830
Reviewed By: CCR-4958
Testing Done: Usual coterie, including manual
(cherry picked from commit 347914a0a7)
Ticket: CM-11460
Reviewed By: CCR-4927
Testing Done:
Quagga's default "show running" model is to only print the non-default config.
Historically, IPv4 unicast has always had a default 'activate' model unless
its been configured otherwise. In 3.0, we introduced a print of the 'activate'
statement for IPv4 unicast independent of whether it was the default or not.
This causes quagga reload to break as the user doesn't configure 'activate' for
IPv4 unicast, and so any config changes will also not have it. However 'show
running' will display it, causing quagga reload to think that the AFI/SAFI has
been deactivated and bounce the sessions incorrectly.
This patch reverts to the original quagga behavior/model of not printing the
'activate' line for IPv4 unicast if its the default.
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-11444
The bgp table may contain nodes without an 'info' (these nodes are used
for balancing the tree, they are created by route_common() in lib/table.c).
When we call bgp_recalculate_all_bestpaths() we should avoid calling
bgp_process() for these nodes. bgp_recalculate_all_bestpaths() is only
called when knobs are configured that could have an impact on which
routes are selected as best.
VPNv6 changes picked from upstream needed fixes and updates due to some
fundamental changes implemented by Cumulus (BGP update-groups, RFC 5549
and nexthop setting etc.) which aren't present upstream.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Updates: 945c8fe, 8ecd326, bb86c60, 93b73df, f4c8985
Ticket: CM-11327
Signed-off-by: Don Slice
Reviewed-by: Donald Sharp
Testing Done: Manual testing, bgp-min, vrf-min, bgp-smoke, vrf-smoke all successful
When bgp was configured in a vrf and then deleted, the vrf->iflist
was being deleted from the vrf. Since the vrf itself was not deleted,
it was assumed in later calls that the vrf->iflist was still there
and when it was referenced, the crash occurred.
Signed-off-by: Radhika Mahankali <radhika@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Kanna Rajagopal <kanna@cumulusnetworks.com>
Ticket: CM-11055
Reviewed By: CCR-4773
Testing Done: Unit, PTM smoke, BGP neighbor smoke
Issue: bgpd is not replaying the BFD registrations to PTM after quagga restart.
Root Cause: This problem happens when BFD configuration is part of the peer group template. Currently, the BFD configuration is being copied to the peer from template as part of the AF (address family) configuration. But, when the saved config is used after the quagga restart the peer group template is applied to the peer before the AF configuration is configured for the template. Due to this the BFD configuration never gets copied from the template to the peer and the BGP peers have no BFD configuration after the restart
Sample config which failed:
router bgp 100
bgp router-id 10.10.0.1
no bgp default ipv4-unicast
bgp bestpath as-path multipath-relax
neighbor dpeergrp_2 peer-group
neighbor dpeergrp_2 remote-as 100
neighbor dpeergrp_2 bfd
neighbor dpeergrp_2 advertisement-interval 1
neighbor dpeergrp_2 timers connect 1
neighbor dpeergrp_4 peer-group
neighbor dpeergrp_4 remote-as 400
neighbor dpeergrp_4 bfd
neighbor dpeergrp_4 advertisement-interval 1
neighbor dpeergrp_4 timers connect 1
neighbor swp2s0.1 interface peer-group dpeergrp_2
neighbor swp18s3.1 interface peer-group dpeergrp_4
!
address-family ipv4 unicast
redistribute connected route-map redist
neighbor dpeergrp_2 activate
neighbor dpeergrp_2 next-hop-self
neighbor dpeergrp_2 default-originate
neighbor dpeergrp_2 soft-reconfiguration inbound
neighbor dpeergrp_4 activate
neighbor dpeergrp_4 next-hop-self
neighbor dpeergrp_4 default-originate
neighbor dpeergrp_4 soft-reconfiguration inbound
maximum-paths 14
exit-address-family
Fix: Moved the BFD config copy from the peer group AF config copy function to the main peer group config copy function.
Ticket: CM-11327
Signed-off-by: Don Slice
Reviewed-by: Donald Sharp
Testing Done: Manual testing, bgp-min, vrf-min, bgp-smoke, vrf-smoke all successful
When bgp was configured in a vrf and then deleted, the vrf->iflist
was being deleted from the vrf. Since the vrf itself was not deleted,
it was assumed in later calls that the vrf->iflist was still there
and when it was referenced, the crash occurred.
Avoids a dynamic allocation which is usually freed immediate afterwards.
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
During instance cleanup, an earlier patch walked the workqueue in order
to process queued routes of the instance. However, since the workqueue
is not per instance, the code walks and immediately processes all routes
across all instances.
This may not be ideal in the presence of VRFs, when multiple instances
will be a fact. Revert that part of the change from earlier patch. This
needs to be revisited later for a better solution.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Updates: bb86c6017e
There wasn't much missing for VPNv6 to begin with; just a few bits of
de- & encoding and a few lists to be updated.
Signed-off-by: Lou Berger <lberger@labn.net>
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
[Editorial note: Signed-off-by may imply an authorship claim, but need not]
Edited-by: Paul Jakma <paul.jakma@hpe.com> / <paul@jakma.org>
(cherry picked from commit 9da04bca0e994ec92b9242159bf27d89c6743354)
Conflicts:
bgpd/bgp_attr.c
bgpd/bgp_mplsvpn.c
bgpd/bgpd.c
rtt is calculated dynamically by the kernel. Refresh it on
soft clear.
Fixes: ef757700d0 "bgpd: allow using rtt in route-map's set metric"
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
(cherry picked from commit 5a2a1ec18c89daec5de6690a9b0f47c0d11a0f2d)
Conflicts:
bgpd/bgpd.c
Fix lots of warnings. Some const and type-pun breaks strict-aliasing
warnings left but much reduced.
* bgp_advertise.h: (struct bgp_advertise_fifo) is functionally identical to
(struct fifo), so just use that. Makes it clearer the beginning of
(struct bgp_advertise) is compatible with with (struct fifo), which seems
to be enough for gcc.
Add a BGP_ADV_FIFO_HEAD macro to contain the right cast to try shut up
type-punning breaks strict aliasing warnings.
* bgp_packet.c: Use BGP_ADV_FIFO_HEAD.
(bgp_route_refresh_receive) fix an interesting logic error in
(!ok || (ret != BLAH)) where ret is only well-defined if ok.
* bgp_vty.c: Peer commands should use bgp_vty_return to set their return.
* jhash.{c,h}: Can take const on * args without adding issues & fix warnings.
* libospf.h: LSA sequence numbers use the unsigned range of values, and
constants need to be set to unsigned, or it causes warnings in ospf6d.
* md5.h: signedness of caddr_t is implementation specific, change to an
explicit (uint_8 *), fix sign/unsigned comparison warnings.
* vty.c: (vty_log_fixed) const on level is well-intentioned, but not going
to fly given iov_base.
* workqueue.c: ALL_LIST_ELEMENTS_RO tests for null pointer, which is always
true for address of static variable. Correct but pointless warning in
this case, but use a 2nd pointer to shut it up.
* ospf6_route.h: Add a comment about the use of (struct prefix) to stuff 2
different 32 bit IDs into in (struct ospf6_route), and the resulting
type-pun strict-alias breakage warnings this causes. Need to use 2
different fields to fix that warning?
general:
* remove unused variables, other than a few cases where they serve a
sufficiently useful documentary purpose (e.g. for code that needs
fixing), or they're required dummies. In those cases, try mark them as
unused.
* Remove dead code that can't be reached.
* Quite a few 'no ...' forms of vty commands take arguments, but do not
check the argument matches the command being negated. E.g., should
'distance X <prefix>' succeed if previously 'distance Y <prefix>' was set?
Or should it be required that the distance match the previously configured
distance for the prefix?
Ultimately, probably better to be strict about this. However, changing
from slack to strict might expose problems in command aliases and tools.
* Fix uninitialised use of variables.
* Fix sign/unsigned comparison warnings by making signedness of types consistent.
* Mark functions as static where their use is restricted to the same compilation
unit.
* Add required headers
* Move constants defined in headers into code.
* remove dead, unused functions that have no debug purpose.
(cherry picked from commit 7aa9dcef80b2ce50ecaa77653d87c8b84e009c49)
Conflicts:
bgpd/bgp_advertise.h
bgpd/bgp_mplsvpn.c
bgpd/bgp_nexthop.c
bgpd/bgp_packet.c
bgpd/bgp_route.c
bgpd/bgp_routemap.c
bgpd/bgp_vty.c
lib/command.c
lib/if.c
lib/jhash.c
lib/workqueue.c
ospf6d/ospf6_lsa.c
ospf6d/ospf6_neighbor.h
ospf6d/ospf6_spf.c
ospf6d/ospf6_top.c
ospfd/ospf_api.c
zebra/router-id.c
zebra/rt_netlink.c
zebra/rt_netlink.h
Signed-off-by: Radhika Mahankali <radhika@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Kanna Rajagopal <kanna@cumulusnetworks.com>
Ticket: CM-11055
Reviewed By: CCR-4773
Testing Done: Unit, PTM smoke, BGP neighbor smoke
Issue: bgpd is not replaying the BFD registrations to PTM after quagga restart.
Root Cause: This problem happens when BFD configuration is part of the peer group template. Currently, the BFD configuration is being copied to the peer from template as part of the AF (address family) configuration. But, when the saved config is used after the quagga restart the peer group template is applied to the peer before the AF configuration is configured for the template. Due to this the BFD configuration never gets copied from the template to the peer and the BGP peers have no BFD configuration after the restart
Sample config which failed:
router bgp 100
bgp router-id 10.10.0.1
no bgp default ipv4-unicast
bgp bestpath as-path multipath-relax
neighbor dpeergrp_2 peer-group
neighbor dpeergrp_2 remote-as 100
neighbor dpeergrp_2 bfd
neighbor dpeergrp_2 advertisement-interval 1
neighbor dpeergrp_2 timers connect 1
neighbor dpeergrp_4 peer-group
neighbor dpeergrp_4 remote-as 400
neighbor dpeergrp_4 bfd
neighbor dpeergrp_4 advertisement-interval 1
neighbor dpeergrp_4 timers connect 1
neighbor swp2s0.1 interface peer-group dpeergrp_2
neighbor swp18s3.1 interface peer-group dpeergrp_4
!
address-family ipv4 unicast
redistribute connected route-map redist
neighbor dpeergrp_2 activate
neighbor dpeergrp_2 next-hop-self
neighbor dpeergrp_2 default-originate
neighbor dpeergrp_2 soft-reconfiguration inbound
neighbor dpeergrp_4 activate
neighbor dpeergrp_4 next-hop-self
neighbor dpeergrp_4 default-originate
neighbor dpeergrp_4 soft-reconfiguration inbound
maximum-paths 14
exit-address-family
Fix: Moved the BFD config copy from the peer group AF config copy function to the main peer group config copy function.
* bgpd.c: (peer_uptime) Wraps after 1 year, and doesn't indicate years.
Fix. Assume a year is 365 days, for an easy life.
Fixes: Bug #836
Reported-by: Rolf Hanßen
Acked-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
peer_uptime was doing the same work unnecessarily.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
When a BGP instance including the default instance is deleted, it needs to be
unlinked from the corresponding VRF structure. However, instance deletion does
not happen in one shot but needs a lot of threads to run - peer event handling,
route processing etc. - before it can complete. Premature unlinking of the
instance from underlying VRF would result in BGP routes not being deleted from
the zebra RIB.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-10930
Reviewed By: CCR-4717
Testing Done: Manual, bgp-smoke
Instead of turning on IPv6 RA on every interface as soon as it has an IPv6
address, only enable it upon configuration of BGP neighbor. When the BGP
neighbor is deleted, signal that RAs can be turned off.
To support this, introduce new message interaction between BGP and Zebra.
Also, take appropriate actions in BGP upon interface add/del since the
unnumbered neighbor could exist prior to interface creation etc.
Only unnumbered IPv6 neighbors require RA, the /30 or /31 based neighbors
don't. However, to keep the interaction simple and not have to deal with
too many dynamic conditions (e.g., address deletes or neighbor change to/from
'v6only'), RAs on the interface are triggered upon any unnumbered neighbor
configuration.
BGP-triggered RAs will cause RAs to be initiated on the interface; however,
if BGP asks that RAs be stopped (upon delete of unnumbered neighbor), RAs
will continue to be exchanged if the operator has explicitly enabled.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-10640
Reviewed By: CCR-4589
Testing Done: Various manual and automated (refer to defect)
With the addition of RA being turned on by default.
Spewing this error message when unable to connect
doesn't make much sense anymore.
Ticket: CM-10494
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by: Radhika Mahankali <radhika@cumulusnetworks.com>
Modified the configuration code to properly allow a peer-group member
to have a route-map out applied when one does not exist on the peer-group
itself. This capability already existed for route-map in.
Ticket: CM-10058
Signed-off-by: Don Slice
Reviewed-by: Donald Sharp
Quagga BGP needed a config 'bgp multiple-instance' in order to be able
to configure and use VRFs. Since this support is intrinsic to the
implementation, make this configuration on by default. Corresponding
change to 'show running-config' (and write) to display only if "no" is
configured.
This change will eliminate one unnecessary step in the configuration.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-10070
Reviewed By: CCR-4383
Testing Done: Manual
When displaying the output of a 'show run',
display the neighbor information in an ordered
manner.
Ticket: CM-10184
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
When creating a 'struct peer' add in the ability to set the peer group
associated with that peer.
Ticket: CM-10184
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
Upon receipt of incoming connection, a peer structure (doppelganger) is
created internally and the connection processed for it. The problem is
that in the case of BGP unnumbered, the sockunion structure within BGP was
being updated (in peer_create()) prior to the peer's flags being updated,
so it didn't take into account the 'v6only' configuration. This results
in subsequent problems when bgp_bind() is done - the socket ends up being
bound to the BGP instance instead of the interface.
In the case of an incoming connection, we should just use the addresses
on which the connection was setup/accepted, there is no need to attempt to
derive it again. Further, there is no need to attempt to update addresses
at the time of peer_create() since that is done when the connection is
attempted in bgp_start().
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Ticket: CM-10028
Reviewed By: CCR-4373
Testing Done: Manual, bgpsmoke
Updates to routemaps and delete of the routemap were not working properly
for VRFs. This was because while routemaps are global, the routemap update
processing timer and the processing were at the per-instance level. This
approach was unable to handle processing for multiple instances as the
routemap has no tracking of which instances are still pending processing.
This lead to the processing happening correctly only for the first instance
- which could be the default instance or some other instance. It could also
result in reference to freed memory for an instance.
The fix done is to make the update/delete processing also global and not per
instance. This means that the route-map delay timer will be global and a global
thread will handle the change (or delete) for all instances instead of spawning
a separate thread for each instance. To support this, a global BGP command
"bgp route-map delay-timer <value>" has been implemented. The existing command
per-instance is not deleted but will update the global timer.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-6970, CM-9918
Reviewed By: CCR-4320
Testing Done: Manual, bgpsmoke
Code changes to make unnumbered peering work in a VRF. The changes have
to do with locating the interface in the correct VRF (in order to look for
neighbor address) in the case of outgoing connections and when specifying
source address as well as fetching the correct instance for an incoming
connection based on reading the device the socket is bound to (the multi-vrf
socket option in the kernel).
Additionally, for IPv4 unnumbered peering in a VRF (based on /30 or /31
addresses), bind to the VRF rather than the interface.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-9311
Reviewed By: CCR-4192
Testing Done: Manual, bgpsmoke
The BGP instance cleanup was deleting interfaces in that instance after
prior fixes, but this ended up deleting the interface list header which
was not being re-created. Added code to re-create this at the time an
instance is created.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Ticket: CM-9466
Reviewed By: CCR-4164
Testing Done: Manual and verified failed test
Perform interface cleanup as an instance is deleted. This takes care of the
scenario when BGP exits (or is stopped/restarted) too as instances undergo
deletion and the interface cleanup is done as the last step in that.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Fixes: 46abd3e3e6
Ticket: CM-9410
Reviewed By: CCR-4143
Testing Done: Reran failed test
Link BGP instance (Default or VRF) to the corresponding VRF structure and
modify lookup to use this. The logic is very similar to what is implemented
in zebra - the 'struct zebra_vrf' there is essentially 'struct bgp' in BGP.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-9122
Reviewed By: CCR-4102
Testing Done: Manual
With VRF support, certain objects are now maintained per BGP instance. At
exit, the list of BGP instances has to be freed only after processing the
per-instance objects.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Fixes: ad4cbda1a3
Ticket: CM-9340
Reviewed By:
Testing Done:
Various changes and fixes related to VRF registration, deletion,
BGP exit etc.
- Define instance type
- Ensure proper handling upon instance create, delete and
VRF add/delete from zebra
- Cleanup upon bgp_exit()
- Ensure messages are not sent to zebra for unknown VRFs
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-9128, CM-7203
Reviewed By: CCR-4098
Testing Done: Manual
When a BGP instance is deleted through 'no router bgp', the required
cleanup was not being performed. This is after VRF-related changes.
Fix to ensure this is taken care of.
Note: Further changes needed in this area for VRFs.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-9597
Reviewed By: CCR-4097
Testing Done: Verified failed test
While processing references to the macro PEERAF_FOREACH(), aggressive loop
optimization by gcc 4.9.x (probably 4.8 and greater) was resulting in the
generated code not checking on the index as well as eliminating some code.
This was leading to a dereference of invalid memory when a BGP peer came up.
The fix is to scrap this convoluted macro. Two other changes done are to
eliminate overloading of "afindex" and make the loop iterator an integer.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Dave Olson <olson@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
Ticket: CM-8889
Reviewed By: CCR-4018
Testing Done: Verified failure scenario
Note: This code was added as part of update-groups implementation; when
upstreaming update-groups, this patch should also be included.
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Ticket: CM-8788
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-8472
The first as-path is what the original as-path for the test, the second
is what the as-path should look like once all confed SETs and SEQs have
been removed. At the time the test was written quagga did not correctly
remove confed SETs and SEQs but whoever wrote the test didn't notice
this and assumed that the behavior they were seeing was correct so used
that output to populate the second as-path.
Now that we do correctly remove the confed parts these tests fail. So
the fix is to update the second as-path for these two tests so that they
no longer contain any confed SETs/SEQs.
BGP uses a second #define that is equal to MULTIPATH_NUM. There
is no point in having a different #define. Just consolidate.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
configuration
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
Ticket: CM-8321
This allows the user to configure the peer-group as an option for the
"neighbor swpX interface" command.
{code}
!
router bgp 100
neighbor FOO peer-group
neighbor FOO remote-as external
neighbor swp1 interface
neighbor swp2 interface v6only
neighbor swp3 interface peer-group FOO
neighbor swp4 interface v6only peer-group FOO
!
address-family ipv4 unicast
neighbor FOO activate
neighbor swp1 activate
neighbor swp2 activate
exit-address-family
!
{code}
Note that if the user configures
{code}
neighbor swp5 interface
neighbor swp5 peer-group FOO
{code}
We will display that as "neighbor swp5 interface peer-group FOO". It
did not seem worth tracking that the peer-group was entered via two
lines instead of one.