This commit adds the same peer-group override capabilites as d122d7cf7
for all filter/map options that can be enabled/disabled on each
address-family of a BGP peer.
All currently existing filter/map options are being supported:
filter-list, distribute-list, prefix-list, route-map and unsuppress-map
To implement this behavior, a new peer attribute 'filter_override' has
been added together with various PEER_FT_ (filter type) constants for
tracking the state of each filter in the same way as it is being done
with 'af_flags_override'.
Signed-off-by: Pascal Mathis <mail@pascalmathis.com>
The current implementation for overriding peer-group configuration on a
peer member consists of several bandaids, which introduce more issues
than they fix. A generic approach for implementing peer-group overrides
for address-family flags is clearly missing.
This commit implements a generic and sane approach to overriding
peer-group configuration on a peer-member. A separate peer attribute
called 'af_flags_override' which was introduced in 04e1c5b is being used
to keep track of all address-family flags, storing whether the
configuration is being inherited from the parent-group or overridden.
All address-family flags are being supported by this implementation
(note: flags, not filters/maps) except 'send-community', which currently
breaks due to having the three flags enabled by default, which is not
being properly handled within this commit; all flags are supposed to
have an 'off'/'false' state by default.
In the interest of readability and comprehensibility, the flag
'send-community' is being fixed in a separate commit.
The following rules apply when looking at the new peer-group override
implementation this commit provides:
- Each peer-group can enable every flag (except the limitations noted
above), which gets automatically inherited to all members.
- Each peer can enable each flag independently and/or modify their
value, if available. (e.g.: weight <value>)
- Each command executed on a neighbor/peer gets explicitely set as an
override, so even when the peer-group has the same kind of
configuration, both will show up in 'show running-configuration'.
- Executing 'no <command>' on a peer will remove the peer-specific
configuration and make the peer inherit the configuration from the
peer-group again.
- Executing 'no <command>' on a peer-group will only remove the flag
from the peer-group, however not from peers explicitely setting that
flag.
This guarantees a clean implementation which does not break, even when
constantly messing with the flags of a peer-group. The same behavior is
present in Cisco devices, so people familiar with those should feel safe
when dealing with FRRs peer-groups.
The only restriction that now applies is that single peer cannot
disable a flag which was set by a peer-group, because 'no <command>' is
already being used for disabling a peer-specific override. This is not
supported by any known vendor though, would require many specific
edge-cases and magic comparisons and will most likely only end up
confusing the user. Additionally, peer-groups should only contain flags
which are being used by all peer members.
Signed-off-by: Pascal Mathis <mail@pascalmathis.com>
policy routing is configurable via address-family ipv4 flowspec
subfamily node. This is then possible to restrict flowspec operation
through the BGP instance, to a single or some interfaces, but not all.
Two commands available:
[no] local-install [IFNAME]
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This commit moves the command 'bgp enforce-first-as' from global BGP
instance configuration to peer/neighbor configuration, which can now be
changed by executing '[no] neighbor <neighbor> enforce-first-as'.
End users can now enforce sane first-AS checking on regular sessions
while e.g. disabling the checks on routeserver sessions, which usually
strip away their own AS number from the path.
To ensure backwards-compatibility, a migration routine was added which
automatically sets the 'enforce-first-as' flag on all configured
neighbors if the old global setting was activated. The old global
command immediately disappears after running the migration routine once.
Signed-off-by: Pascal Mathis <mail@pascalmathis.com>
Attribute set on peer was being overridden when set on the peer-group.
This commit also adds a parallel flags array that indicates whether a
particular flag is sourced from the peer-group or is peer-specific. It
assumes the default state of all flags is unset. This looks to be true
except in the case of PEER_FLAG_SEND_COMMUNITY,
PEER_FLAG_SEND_EXT_COMMUNITY, and PEER_FLAG_SEND_LARGE_COMMUNITY; these
flags are set by default except when the user specifies to use
config-type = cisco. However the flag field can merely be flipped to
mean the negation of those options in a future commit.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
bgp structure is being extended with hash sets that will be used by
flowspec to give policy routing facilities.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Note that when we are importing vrf EVA into vrf DONNA
we must keep track of all the vrfs EVA is being
exported into and we must also keep track of all the vrf's
that DONNA is receiving data from as well.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Setup a per-VRF identifier to use along with the Router Id to build the
RD. Define a function to encode the RD. Code is brought over from EVPN
and EVPN code has been modified to use the generic function.
Ticket: CM-20256
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
add the `import vrf XXXX` command
router bgp 4 vrf DONNA
<config>
!
router bgp 4 vrf EVA
<config>
address-family ipv4 uni
import vrf DONNA
!
!
This command will allow for vrf EVA to specify that it would like
to receive the routes from vrf DONNA into it's table.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
RFC 8635 explains how RT auto-derivation should be done in section
5.1.2.1 [1]. In addition to encoding the VNI in the lowest bytes, a
3-bit field is used to encode a namespace. For VXLAN, we have to put 1
in this field. This is needed for proper interoperability with RT
auto-derivation in JunOS. Since this would break existing setup, an
additional option, "autort rfc8365-compatible" is used.
[1]: https://tools.ietf.org/html/rfc8365#section-5.1.2.1
Signed-off-by: Vincent Bernat <vincent@bernat.im>
Add support for CLI "auto" keyword in vrf->vpn export label:
router bgp NNN vrf FOO
address-family ipv4 unicast
label vpn export auto
exit-address-family
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
MPLS label pool backed by allocations from the zebra label manager.
A caller requests a label (e.g., in support of an "auto" label
specification in the CLI) via lp_get(), supplying a unique ID and
a callback function. The callback function is invoked at a later
time with the unique ID and a label value to inform the requestor
of the assigned label.
Requestors may release their labels back to the pool via lp_release().
The label pool is stocked with labels allocated by the zebra label
manager. The interaction with zebra is asynchronous so that bgpd
is not blocked while awaiting a label allocation from zebra.
The label pool implementation allows for bgpd operation before (or
without) zebra, and gracefully handles loss and reconnection of
zebra. Of course, before initial connection with zebra, no labels
are assigned to requestors. If the zebra connection is lost and
regained, callbacks to requestors will invalidate old assignments
and then assign new labels.
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
This work is derived from a work done by China-Telecom.
That initial work can be found in [0].
As the gap between frr and quagga is important, a reworks has been
done in the meantime.
The initial work consists of bringing the following:
- Bringing the client side of flowspec.
- the enhancement of address-family ipv4/ipv6 flowspec
- partial data path handling at reception has been prepared
- the support for ipv4 flowspec or ipv6 flowspec in BGP open messages,
and the internals of BGP has been done.
- the memory contexts necessary for flowspec has been provisioned
In addition to this work, the following has been done:
- the complement of adaptation for FS safi in bgp code
- the code checkstyle has been reworked so as to match frr checkstyle
- the processing of IPv6 FS NLRI is prevented
- the processing of FS NLRI is stopped ( temporary)
[0] https://github.com/chinatelecom-sdn-group/quagga_flowspec/
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: jaydom <chinatelecom-sdn-group@github.com>
The following types are nonstandard:
- u_char
- u_short
- u_int
- u_long
- u_int8_t
- u_int16_t
- u_int32_t
Replace them with the C99 standard types:
- uint8_t
- unsigned short
- unsigned int
- unsigned long
- uint8_t
- uint16_t
- uint32_t
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This commit is relying on bgp vpn-policy. It is needed to configure
several bgp vrf instances, and in each of the bgp instance, configure
the following command under address-family ipv4 unicast node:
[no] rt redirect import RTLIST
Then, a function is provided, that will parse the BGP instances.
The incoming ecommunity will be compared with the configured rt redirect
import ecommunity list, and return the VRF first instance of the matching
route target.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
PR #1739 added code to leak routes between (default VRF) VPN safi and unicast RIBs in any VRF. That set of changes included temporary CLI including vpn-policy blocks to specify RD/RT/label/&c. After considerable discussion, we arrived at a consensus CLI shown below.
The code of this PR implements the vpn-specific parts of this syntax:
router bgp <as> [vrf <FOO>]
address-family <afi> unicast
rd (vpn|evpn) export (AS:NN | IP:nn)
label (vpn|evpn) export (0..1048575)
rt (vpn|evpn) (import|export|both) RTLIST...
nexthop vpn (import|export) (A.B.C.D | X:X::X:X)
route-map (vpn|evpn|vrf NAME) (import|export) MAP
[no] import|export [vpn|evpn|evpn8]
[no] import|export vrf NAME
User documentation of the vpn-specific parts of the above syntax is in PR #1937
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
- add "debug bgp vpn label" CLI
- improved debug messages for "debug bgp bestpath"
- send vrf label to zebra after zebra informs bgpd of vrf_id
- withdraw vrf_label from zebra if zebra informs bgpd that vrf_id is disabled
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
We have af_flags in struct bgp which holds address family related flags.
Seems like we had a conflict between two flags.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Upon creation of BGP instances, server socket may or may not be created.
In the case of VRF instances, if the VRF backend relies on NETNS, then
a new server socket will be created for each BGP VRF instance. If the
VRF backend relies on VRF LITE, then only one server socket will be
enough. Moreover, At startup, with BGP VRF configuration, a server
socket may not be created if VRF is not the default one or VRF is not
recognized yet.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
peer->ifindex was only used in two places but it was never populated so
neither of them worked as they should. 'struct peer' also has a 'struct
interface' pointer which we can use to get the ifindex.
Implement support for 'default-originate' for L2VPN/EVPN address family.
This is needed for the case where external routing within a POD,
will follow the default route to the border/exit leaf.
The border leaf has more than one next hop to forward the packet on to,
depending on the destination IP.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
We have af_flags in struct bgp to hold address family related flags,
l2vpn evpn flags to indicate advertise ipvX unicast should be moved there.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
FRR/CL provides the means for injecting regular (IPv4) routes
from the BGP RIB into EVPN as type-5 routes.
This needs to be enhanced to allow selective injection.
This can be achieved by adding a route-map option
for the "advertise ipv4/ipv6 unicast" command.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Asymmetric routing is an ideal choice when all VLANs are cfged on all leafs.
It simplifies the routing configuration and
eliminates potential need for advertising subnet routes.
However, we need to reach the Internet or global destinations
or to do subnet-based routing between PODs or DCs.
This requires EVPN type-5 routes but those routes require L3 VNI configuration.
This task is to support EVPN type-5 routes for prefix-based routing in
conjunction with asymmetric routing within the POD/DC.
It is done by providing an option to use the L3 VNI only for prefix routes,
so that type-2 routes (host routes) will only use the L2 VNI.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Adds ability to specify that peers should be administratively shutdown
when first configured.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The multithreading code has a comment that reads:
"XXX: Heavy abuse of stream API. This needs a ring buffer."
This patch makes the relevant code use a ring buffer.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Was using 0 as a sentinel value, so user couldn't configure 0 as the
value of the coalesce timer.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
CLI config for enabling/disabling type-5 routes
router bgp <as> vrf <vrf>
address-family l2vpn evpn
[no] advertise <ipv4|ipv6|both>
loop through all the routes in VRF instance and advertise/withdraw
all ip routes as type-5 routes in default instance.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
For EVPN type-5 route the NH in the NLRI is set to the local tunnel ip.
This information has to be obtained from kernel notification.
We need to pass this info from zebra to bgp in l3vni call flow.
This patch doesn't handle the tunnel-ip change.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
1. VRF RD can be auto-derived (simillar to RD for a VNI)
2. VRF RD can be configured manually through a config
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
currently, we have a rd_id bitfield
to assign an unique index for auto RD.
This bitfield currently resides under struct bgp which seems wrong.
We need to shift this to a global space
as this ID space is really global per box.
One more reason to keep it at a global data structure is,
the ID space could be used by both VNIs and VRFs.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Since coalesce time is now heuristically adjusted based on peer count,
we need to separate out specific configuration by the user from the
current value. Behavior established is to not adjust if the user has a
value set.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Some of the deprecated stream.h macros see such little use that we may
as well just remove them and use the non-deprecated macros.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
At some point when rearranging FSM code, bgpd lost the ability to
perform active opens because it was only paying attention to POLLIN and
not POLLOUT, when the latter is used to signify a successful connection
in the active case.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Use best-performing memory orders where appropriate.
Also update some style and add missing comments.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Start bit flags at 1, not 2
* Make run-flags atomic for i/o thread
* Remove work_cond mutex, it should no longer be necessary
* Add asserts to ensure proper ordering in bgp_connect()
* Use true/false with booleans, not 1/0
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
bgpd supports setting a write-quanta that serves as a hint on how many
packets to write per I/O cycle. Now that input is buffered, it makes
sense to add the equivalent parameter for how many packets are processed
per cycle. This is *not* how many packets are read off the wire per I/O
cycle; rather it is how many packets are processed from the input buffer
in a given cycle after having been read off the wire and sanitized.
Since these values must be used from multiple threads, they have also
been made atomic.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Move and modify all network input related code to bgp_io.c
* Add a real input buffer to `struct peer`
* Move connection initialization to its own thread.c task instead of
piggybacking off of bgp_read()
* Tons of little fixups
Primary changes are in bgp_packet.[ch], bgp_io.[ch], bgp_fsm.[ch].
Changes made elsewhere are almost exclusively refactoring peer->ibuf to
peer->curr since peer->ibuf is now the true FIFO packet input buffer
while peer->curr represents the packet currently being processed by the
main pthread.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
After implement threading, bgp_packet.c was serving the double purpose
of consolidating packet parsing functionality and handling actual I/O
operations. This is somewhat messy and difficult to understand. I've
thus moved all code and data structures for handling threaded packet
writes to bgp_io.[ch].
Although bgp_io.[ch] only handles writes at the moment to keep the noise
on this commit series down, for organization purposes, it's probably
best to move bgp_read() and its trappings into here as well and
restructure that code so that read()'s happen in the pthread and packet
processing happens on the main thread.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Changes all synchronization primitives to be dynamically allocated. This
should help catch any subtle errors in pthread lifecycles.
This change also pre-initializes synchronization primitives before
threads begin to run, eliminating a potential race condition that
probably would have caused a segfault on startup on a very fast box.
Also changes mutex and condition variable allocations to use
MTYPE_PTHREAD and updates tests to do the proper initializations.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Remove t_write
* Remove t_keepalive
These have been replaced by pthreads and are no longer needed. Since
some code looks at these values to determine if the threads are
scheduled, also add a new bitfield to store the same information.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Removes the WiP shim and implements proper thread lifecycle management.
* Declare necessary pthread_t's in bgp_master
* Define new MTYPE in lib/thread.c for pthreads
* Allocate and free BGP's pthreads appropriately
* Terminate and join threads appropriately
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Problem reported that we weren't adjusting the keepalive timer
correctly when we negotiated a lower hold time learned from a
peer. While working on this, found we didn't do inheritance
correctly at all. This fix solves the first problem and also
ensures that the timers are configured correctly based on this
priority order - peer defined > peer-group defined > global config.
This fix also displays the timers as "configured" regardless of
which of the three locations above is used.
Ticket: CM-18408
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Reviewed-by: CCR-6807
Testing-performed: Manual testing successful, fix tested by
submitter, bgp-smoke completed successfully
afi_header_vty_out() is easily replaced with vty_frame(), which means we
can drop a whole batch of "int *write" args as well as the entirety of
bgp_config_write_family_header().
=> AFI/SAFI config writing is now a lot simpler.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
There are two parts to this commit:
1. create a database of self tunnel-ip for used in martian nexthop check
In a CLAG setup, the tunnel-ip (VNI UP) notification comes before the clag-anycast-ip comes up in the system.
This was causing our self next hop check to fail and we were instaling routes with martian nexthop in zebra.
We need to keep this info in a seperate database for all local tunnel-ip.
This database will be used in parallel with the self next hop database to martian nexthop checks.
2. When a local VNI comes up, update the tunnel-ip database and filter routes in the RD table if necessary
In case of EVPN we might receive routes from clag peer before the clag-anycast ip and VNI is up on the system.
We will store the routes in the RD table for later processing.
When VNI comes UP, we loop thorugh all the routes and install them in zebra if required.
However, we were missing the martian nexthop check in this code path.
From now onwards, when a VNI comes UP,
we will first update the tunnel-ip database
We then loop through all the routes in RD table and apply martian next hop filter if required.
Things not covered in this commit but are required:
This processing is needed in general when an address becomes a connected address.
We need to loop through all the routes in BGP and apply martian nexthop filter if necessary.
This will be taken care in a seperate bug
Ticket:CM-17271/CM-16911
Reviewed By: ccr-6542
Testing Done: Manual
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
The size of an enum is compiler dependent and thus we shouldn't use
enums inside structures that represent fields of a packet.
Problem detected by the 'test_capability' unit test.
The problem was not apparent before because the 'iana_safi_t' enum didn't
exist and 'safi_t' was a typedef to uint8_t. Now we have two different
enums, 'iana_afi_t' and 'iana_safi_t', and both need to be encoded in
different ways on the wire (2 bytes vs 1 byte).
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
swpX peers all start out with the same sockunion so initially they all
go into the same hash bucket. Once IPv6 ND has worked its magic they
will have different sockunions and will go in different buckets...life
is good.
Until then though, we are in a phase where all swpX peers have the same
socknunion. Once we have HASH_THRESHOLD (10) swpX peers and call
hash_get for a new swpX peer the hash code calls hash_expand(). This
happens because there are more than HASH_THRESHOLD entries in a single
bucket so the logic is "expand the hash to spread things out"...in our
case expanding doesn't spread out the swpX peers because all of their
sockunions are the same.
I looked at having peer_hash_make and peer_hash_same consider the ifname
of the swpX peer but that is a large change that we don't want to make
at the moment. So the fix is to put a cap on how large we are
willing to let the hash table get. By default there is no limit but if
max_size is set we will not allow the hash to expand above that.
This reverts commit c14777c6bf.
clang 5 is not widely available enough for people to indent with. This
is particularly problematic when rebasing/adjusting branches.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Define the EVPN (EVI) hash table and related structures and initialize
and cleanup.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
log.c provides functionality for associating a constant (typically a
protocol constant) with a string and finding the string given the
constant. However this is highly delicate code that is extremely prone
to stack overflows and off-by-one's due to requiring the developer to
always remember to update the array size constant and to do so correctly
which, as shown by example, is never a good idea.b
The original goal of this code was to try to implement lookups in O(1)
time without a linear search through the message array. Since this code
is used 99% of the time for debugs, it's worth the 5-6 additional cmp's
worst case if it means we avoid explitable bugs due to oversights...
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
- All ipv4 labeled-unicast routes are now installed in the ipv4 unicast
table. This allows us to do things like take routes from an ipv4
unicast peer, allocate a label for them and TX them to a ipv4
labeled-unicast peer. We can do the opposite where we take routes from
a labeled-unicast peer, remove the label and advertise them to an ipv4
unicast peer.
- Multipath over a labeled route and non-labeled route is not allowed.
- You cannot activate a peer for both 'ipv4 unicast' and 'ipv4
labeled-unicast'
- The 'tag' variable was overloaded for zebra's route tag feature as
well as the mpls label. I added a 'mpls_label_t mpls' variable to
avoid this. This is much cleaner but resulted in touching a lot of
code.
Add checks related to AFI_L2VPN/SAFI_EVPN that were missing in some parts
of the code. Fix incorrect check skipping EVPN when sending End of RIB.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
The FSF's address changed, and we had a mixture of comment styles for
the GPL file header. (The style with * at the beginning won out with
580 to 141 in existing files.)
Note: I've intentionally left intact other "variations" of the copyright
header, e.g. whether it says "Zebra", "Quagga", "FRR", or nothing.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
The initial implementation was against draft-keyupate-idr-bgp-prefix-sid-02
This updates our label-index implementation up to draft-ietf-idr-bgp-prefix-sid-05
- changed BGP_ATTR_LABEL_INDEX to BGP_ATTR_PREFIX_SID
- since there are multiple TLVs in BGP_ATTR_PREFIX_SID you can no longer
rely on that flag to know if there is a label-index for the path. I
changed bgp_attr_extra_new() to init the label_index to
BGP_INVALID_LABEL_INDEX
- put some placeholder code in for the other two TLVs (IPv6 and
Originator SRGB)
Implement BGP Prefix-SID IETF draft to be able to signal a labeled-unicast
prefix with a label index (segment ID). This makes it easier to deploy
global MPLS labels with BGP, even without other aspects of Segment Routing
implemented.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Implement support for negotiating IPv4 or IPv6 labeled-unicast address
family, exchanging prefixes and installing them in the routing table, as
well as interactions with Zebra for FEC registration. This is the
implementation of RFC 3107.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Start centralising startup & option parsing into the library.
FRR_DAEMON_INFO is a bit weird, but it will become useful later (e.g.
for killing the ZLOG_* enum, and having the daemon name available)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
When starting up bgp and zebra now, you can specify
-e <number> or --ecmp <number>
and that number will be used as the maximum ecmp
that can be used.
The <number specified must be >= 1 and <= MULTIPATH_NUM
that Quagga is compiled with.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
BGP Large Communities are a novel way to signal information between
networks. An example of a Large Community is: "2914:65400:38016". Large
BGP Communities are composed of three 4-byte integers, separated by a
colon. This is easy to remember and accommodates advanced routing
policies in relation to 4-Byte ASNs.
This feature was developed by:
Keyur Patel <keyur@arrcus.com> (Arrcus, Inc.),
Job Snijders <job@ntt.net> (NTT Communications),
David Lamparter <equinox@opensourcerouting.org>
and Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: Job Snijders <job@ntt.net>
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Introduce internal and IANA defintions for AFI/SAFI and mapping
functions and modify code to use these. This refactoring will
facilitate adding support for other AFI/SAFI whose IANA values
won't be suitable for internal data structure definitions (e.g.,
they are not contiguous).
The commit adds some fixes related to afi/safi testing with 'make check
' command.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Ticket: CM-11416
Reviewed By: CCR-3594 (mpls branch)
Testing Done: Not tested now, tested earlier on mpls branch
thread.c fails to build properly on systems that do
not have a CLOCK_MONOTONIC. Therefore there is
no need for bgp to have knowledge of it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Enhance struct bgp to add a new flag BGP_FLAG_GR_PRESERVE_FWD, which
allow to set the Preserve State F bit of Graceful Restart capability in
OPEN messages.
Signed-off-by: Julien Courtat <julien.courtat@6wind.com>
Since VRFs can be searched by vrf_id or name, make this explicit in the
helper functions.
s/vrf_lookup/vrf_lookup_by_id/
s/zebra_vrf_lookup/zebra_vrf_lookup_by_id/
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Until today the admin distance cannot be configured for any IPv6
routing protocol. This patch implements it for bgp.
Signed-off-by: Maitane Zotes <maz@open.ch>
Signed-off-by: Roman Hoog Antink <rha@open.ch>
Ticket: CM-13053
Reviewed By: dslice@cumulusnetworks.com
'neighbor x.x.x.x weight' was implemented as a per-peer knob instead of
a per-peer per-afi-safi option. This makes it configurable per-peer
per-afi-safi so that we can do things like soft clear that afi/safi when
weight is modified.
This feature adds an L3 & L2 VPN application that makes use of the VPN
and Encap SAFIs. This code is currently used to support IETF NVO3 style
operation. In NVO3 terminology it provides the Network Virtualization
Authority (NVA) and the ability to import/export IP prefixes and MAC
addresses from Network Virtualization Edges (NVEs). The code supports
per-NVE tables.
The NVE-NVA protocol used to communicate routing and Ethernet / Layer 2
(L2) forwarding information between NVAs and NVEs is referred to as the
Remote Forwarder Protocol (RFP). OpenFlow is an example RFP. For
general background on NVO3 and RFP concepts see [1]. For information on
Openflow see [2].
RFPs are integrated with BGP via the RF API contained in the new "rfapi"
BGP sub-directory. Currently, only a simple example RFP is included in
Quagga. Developers may use this example as a starting point to integrate
Quagga with an RFP of their choosing, e.g., OpenFlow. The RFAPI code
also supports the ability import/export of routing information between
VNC and customer edge routers (CEs) operating within a virtual
network. Import/export may take place between BGP views or to the
default zebera VRF.
BGP, with IP VPNs and Tunnel Encapsulation, is used to distribute VPN
information between NVAs. BGP based IP VPN support is defined in
RFC4364, BGP/MPLS IP Virtual Private Networks (VPNs), and RFC4659,
BGP-MPLS IP Virtual Private Network (VPN) Extension for IPv6 VPN . Use
of both the Encapsulation Subsequent Address Family Identifier (SAFI)
and the Tunnel Encapsulation Attribute, RFC5512, The BGP Encapsulation
Subsequent Address Family Identifier (SAFI) and the BGP Tunnel
Encapsulation Attribute, are supported. MAC address distribution does
not follow any standard BGB encoding, although it was inspired by the
early IETF EVPN concepts.
The feature is conditionally compiled and disabled by default.
Use the --enable-bgp-vnc configure option to enable.
The majority of this code was authored by G. Paul Ziemba
<paulz@labn.net>.
[1] http://tools.ietf.org/html/draft-ietf-nvo3-nve-nva-cp-req
[2] https://www.opennetworking.org/sdn-resources/technical-library
Now includes changes needed to merge with cmaster-next.
This is a rather large mechanical commit that splits up the memory types
defined in lib/memtypes.c and distributes them into *_memory.[ch] files
in the individual daemons.
The zebra change is slightly annoying because there is no nice place to
put the #include "zebra_memory.h" statement.
bgpd, ospf6d, isisd and some tests were reusing MTYPEs defined in the
library for its own use. This is bad practice and would break when the
memtype are made static.
Acked-by: Vincent JARDIN <vincent.jardin@6wind.com>
Acked-by: Donald Sharp <sharpd@cumulusnetworks.com>
[CF: rebased for cmaster-next]
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Signed-off-by: Christian Franke <chris@opensourcerouting.org>
* Solaris doesn't have u_int64_t, so use uint64_t instead. C99-style
fixed-width integers should always be preferred to improve portability;
* 's_addr' is a macro on Solaris, so we can't use it as a variable name.
Rename the 's_addr' variable to 'addr' in the
bgp_peer_conf_if_to_su_update_v4() function.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Made fix to update the redistribute vrf bitmap when vrf goes down and comes up.
Ticket: CM-11982
Reviewed By: CCR-5032
Testing Done: bgp-min passed, manual
Logic for determining the router-id was spread out over bgp_zebra.c and
bgp_vty.c. Move to bgpd/bgpd.c and have these two call more properly
encapsulated functions.
Significant work by Christian Franke <chris@opensourcerouting.org>.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Adds "const" on:
- peer_update_source_addr_set()
- peer_description_set()
Adds parameter names on:
- bgp_timers_set()
(really confusing, this one, with 2 unexplained args of same type)
Adds new setter:
- peer_afc_set(), calling peer_activate/peer_deactivate.
(intended for API consumers, matches peer->afc)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
VPNv6 changes picked from upstream needed fixes and updates due to some
fundamental changes implemented by Cumulus (BGP update-groups, RFC 5549
and nexthop setting etc.) which aren't present upstream.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Updates: 945c8fe, 8ecd326, bb86c60, 93b73df, f4c8985
Useful when the BGP neighbors are over tunnels that have large
differences in geographic distances and RTTs. Especially useful
for DMVPN setups to allow preferring closes hub.
The parameter is added as new alias command as otherwise it seems
the command parser is not able to match it properly (it seems
merging is done for the various 'set metric' route-map objects in
different routing engines). For same reason also they are listed
as three separate options: optional +/- seems not possibly easily.
Related research papers:
http://www.pps.univ-paris-diderot.fr/~jch/research/delay-based.pdfhttp://arxiv.org/pdf/1309.0632.pdf
Paper on similar extension to Babel:
http://www.pps.univ-paris-diderot.fr/~jch/research/rapport-jonglez-2013.pdf
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit ef757700d0fd51dc0b46df9d3631208919f9b779)
Fix lots of warnings. Some const and type-pun breaks strict-aliasing
warnings left but much reduced.
* bgp_advertise.h: (struct bgp_advertise_fifo) is functionally identical to
(struct fifo), so just use that. Makes it clearer the beginning of
(struct bgp_advertise) is compatible with with (struct fifo), which seems
to be enough for gcc.
Add a BGP_ADV_FIFO_HEAD macro to contain the right cast to try shut up
type-punning breaks strict aliasing warnings.
* bgp_packet.c: Use BGP_ADV_FIFO_HEAD.
(bgp_route_refresh_receive) fix an interesting logic error in
(!ok || (ret != BLAH)) where ret is only well-defined if ok.
* bgp_vty.c: Peer commands should use bgp_vty_return to set their return.
* jhash.{c,h}: Can take const on * args without adding issues & fix warnings.
* libospf.h: LSA sequence numbers use the unsigned range of values, and
constants need to be set to unsigned, or it causes warnings in ospf6d.
* md5.h: signedness of caddr_t is implementation specific, change to an
explicit (uint_8 *), fix sign/unsigned comparison warnings.
* vty.c: (vty_log_fixed) const on level is well-intentioned, but not going
to fly given iov_base.
* workqueue.c: ALL_LIST_ELEMENTS_RO tests for null pointer, which is always
true for address of static variable. Correct but pointless warning in
this case, but use a 2nd pointer to shut it up.
* ospf6_route.h: Add a comment about the use of (struct prefix) to stuff 2
different 32 bit IDs into in (struct ospf6_route), and the resulting
type-pun strict-alias breakage warnings this causes. Need to use 2
different fields to fix that warning?
general:
* remove unused variables, other than a few cases where they serve a
sufficiently useful documentary purpose (e.g. for code that needs
fixing), or they're required dummies. In those cases, try mark them as
unused.
* Remove dead code that can't be reached.
* Quite a few 'no ...' forms of vty commands take arguments, but do not
check the argument matches the command being negated. E.g., should
'distance X <prefix>' succeed if previously 'distance Y <prefix>' was set?
Or should it be required that the distance match the previously configured
distance for the prefix?
Ultimately, probably better to be strict about this. However, changing
from slack to strict might expose problems in command aliases and tools.
* Fix uninitialised use of variables.
* Fix sign/unsigned comparison warnings by making signedness of types consistent.
* Mark functions as static where their use is restricted to the same compilation
unit.
* Add required headers
* Move constants defined in headers into code.
* remove dead, unused functions that have no debug purpose.
(cherry picked from commit 7aa9dcef80b2ce50ecaa77653d87c8b84e009c49)
Conflicts:
bgpd/bgp_advertise.h
bgpd/bgp_mplsvpn.c
bgpd/bgp_nexthop.c
bgpd/bgp_packet.c
bgpd/bgp_route.c
bgpd/bgp_routemap.c
bgpd/bgp_vty.c
lib/command.c
lib/if.c
lib/jhash.c
lib/workqueue.c
ospf6d/ospf6_lsa.c
ospf6d/ospf6_neighbor.h
ospf6d/ospf6_spf.c
ospf6d/ospf6_top.c
ospfd/ospf_api.c
zebra/router-id.c
zebra/rt_netlink.c
zebra/rt_netlink.h
This change extends the earlier change which added the ability in BGP to
trigger IPv6 Router Advertisements when an unnumbered neighbor is configured.
In addition to triggering the RAs, the advertisement interval is also set to
10 seconds. This is needed to handle the scenario where the peer may start
later.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-10896
Reviewed By: CCR-4693
Testing Done: Manual, bgp-min, bgp-smoke
Instead of turning on IPv6 RA on every interface as soon as it has an IPv6
address, only enable it upon configuration of BGP neighbor. When the BGP
neighbor is deleted, signal that RAs can be turned off.
To support this, introduce new message interaction between BGP and Zebra.
Also, take appropriate actions in BGP upon interface add/del since the
unnumbered neighbor could exist prior to interface creation etc.
Only unnumbered IPv6 neighbors require RA, the /30 or /31 based neighbors
don't. However, to keep the interaction simple and not have to deal with
too many dynamic conditions (e.g., address deletes or neighbor change to/from
'v6only'), RAs on the interface are triggered upon any unnumbered neighbor
configuration.
BGP-triggered RAs will cause RAs to be initiated on the interface; however,
if BGP asks that RAs be stopped (upon delete of unnumbered neighbor), RAs
will continue to be exchanged if the operator has explicitly enabled.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-10640
Reviewed By: CCR-4589
Testing Done: Various manual and automated (refer to defect)
Ticket:
Reviewed By:
Testing Done:
For interface-based peering, we don't update the reset reason to be
interface down. Similarly, we don't update the reason to be loss of
neighbor address (maybe due to RA loss). This patch addresses these
limitations.
When creating a 'struct peer' add in the ability to set the peer group
associated with that peer.
Ticket: CM-10184
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
Updates to routemaps and delete of the routemap were not working properly
for VRFs. This was because while routemaps are global, the routemap update
processing timer and the processing were at the per-instance level. This
approach was unable to handle processing for multiple instances as the
routemap has no tracking of which instances are still pending processing.
This lead to the processing happening correctly only for the first instance
- which could be the default instance or some other instance. It could also
result in reference to freed memory for an instance.
The fix done is to make the update/delete processing also global and not per
instance. This means that the route-map delay timer will be global and a global
thread will handle the change (or delete) for all instances instead of spawning
a separate thread for each instance. To support this, a global BGP command
"bgp route-map delay-timer <value>" has been implemented. The existing command
per-instance is not deleted but will update the global timer.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-6970, CM-9918
Reviewed By: CCR-4320
Testing Done: Manual, bgpsmoke
The issue here has to do with the fact that VRFs (like interfaces) are not
actually getting deleted when they are removed - they remain present. This
leads to situations in which BGP may try to unlink more than once, which
messes up the reference count (lock) in the BGP instance.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-9419
Reviewed By: CCR-4302
Testing Done: Manual, also verified by Atul
<DETAILED DESCRIPTION (REPLACE)>