Add ESI as an inline attribute field along with the other EVPN
attributes. This may be re-worked when the rest of the EVPN
attributes find a new home.
Some cleanup has been done to get rid of stale/unused references
to ESI and to consolidate duplicate definitions of ES ID
types.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Remove mid-string line breaks, cf. workflow doc:
.. [#tool_style_conflicts] For example, lines over 80 characters are allowed
for text strings to make it possible to search the code for them: please
see `Linux kernel style (breaking long lines and strings)
<https://www.kernel.org/doc/html/v4.10/process/coding-style.html#breaking-long-lines-and-strings>`_
and `Issue #1794 <https://github.com/FRRouting/frr/issues/1794>`_.
Scripted commit, idempotent to running:
```
python3 tools/stringmangle.py --unwrap `git ls-files | egrep '\.[ch]$'`
```
Signed-off-by: David Lamparter <equinox@diac24.net>
This attribute is not involved in path selection, and per RFC 7606 it should
just be ignored.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Multiple different issues causing mostly UAFs but maybe other more
subtle things.
- Cluster lists were the only attributes whose pointers were not being
NULL'd when freed, resulting in heap UAF
- When performing an insert into the cluster hash, our temporary struct
used for hash_get() was inconsistent with our hash keying and
comparison functions. In the case of a zero length cluster list, the
->length field is 0 and the ->list field is NULL. When performing an
insert, we set the ->list field regardless of whether the length is 0.
This resulted in the two cluster lists hashing equal but not comparing
equal. Later, when removing one of them from the hash before freeing
it, because the key matched and the comparison succeeded (because it
was set to NULL *after* the search but *before* inserting into the
hash) we would sometimes release the duplicated copy of the struct,
and then free the one that remained in the hash table. Later accesses
constitute UAF. This is fixed by making sure the fields used for the
existence check match what is actually inserted into the hash when
that check fails.
This patch also makes cluster_unintern static, because it should be.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
With a full feed, the source of this warning is unknown: you can't tell which
prefix or aspath is malformed.
Printing just `[EC 33554434] AGGREGATOR attribute is BGP_AS_ZERO(0)` isn't
enough to pinpoint where the problem is.
Additionally print at least the aspath here:
```
[EC 33554434] AGGREGATOR AS number is 0 for aspath: 65000 65031
```
Overall the full table has only 6 such malformed prefixes:
```
aspath: 64539 15096 6939 45430 45458
aspath: 64539 15096 6939 1299 3257 34984 34984 34984 34984 34984 51174
aspath: 64539 15096 6939 286 34984 16135 16135 {16135}
aspath: 64539 15096 6939 7545 7545 136001
aspath: 64539 15096 6939 6762 3269 20746
aspath: 64539 15096 6939 7018 3379
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Modify more code to use `const struct prefix` throughout
bgp. This is all prep work for adding an accessor function
for bgp_node to get the prefix and reduce all the places that
code needs to be touched when we get that work done.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Some were converted to bool, where a true/false status is needed.
Only those whose return status was always false or always true were converted to void.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
We were using XMALLOC for these, and only initializing the refcount to 0
on one of them. Let's just use XCALLOC instead...
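A minimal sketch of the idea, with libc `calloc()` standing in for `XCALLOC` and a hypothetical `struct refobj` that is not the real bgpd structure:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object; not the real bgpd structure. */
struct refobj {
    unsigned long refcnt;
    size_t length;
};

/* calloc() zero-fills the allocation, so every field (refcnt included)
 * starts at 0 without a per-field assignment that is easy to forget. */
struct refobj *refobj_new(void)
{
    return calloc(1, sizeof(struct refobj));
}

/* Take a reference; relies on refcnt having started from 0. */
void refobj_ref(struct refobj *obj)
{
    assert(obj != NULL);
    obj->refcnt++;
}
```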
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The Prefix-SID path attribute Label-Index TLV (type 1) is
used by SR-MPLS, and the Label-Index TLV MUST be ignored
if that path attribute is attached to a non-labeled-unicast
UPDATE message, as described in [ref1].
There is a problem case in the current implementation.
This commit fixes it.
Before this commit,
setting the Label-Index value was unfortunately skipped in some cases,
because the Label-Index TLV implementation checks the AFI/SAFI pair
via the mp_update variable, which is set by the bgp_mp_reach_parse function.
If MP_REACH_NLRI is present after PREFIX_SID, the bgp_attr_psid_sub
function cannot know the AFI/SAFI pair, and the order of
path attributes is never deterministic for the receiver [ref2].
In this commit,
I relocated the AFI/SAFI check to after the path-attribute loop.
[ref1](https://tools.ietf.org/html/draft-ietf-idr-bgp-prefix-sid-27#section-3.2)
> The Originator SRGB TLV may only appear in a BGP Prefix-SID attribute
> attached to IPv4/IPv6 Labeled Unicast prefixes ([RFC8277]). It MUST
> be ignored when received for other BGP AFI/SAFI combinations.
[ref2](https://tools.ietf.org/html/rfc4271#section-5)
> The sender of an UPDATE message SHOULD order path attributes within
> the UPDATE message in ascending order of attribute type. The
> receiver of an UPDATE message MUST be prepared to handle path
> attributes within UPDATE messages that are out of order.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
Prefix-SID is designed to carry an array of TLVs.
That behaviour is important for supporting the SR-MPLS feature
and was added by the previous PR #5418.
In that implementation, however, if some additional data
(such as the next BGP update message or the next path attributes)
was present after the Prefix-SID path attribute,
bgpd would parse that additional data as a Prefix-SID TLV.
This commit fixes that. Before this commit, the loop condition
was determined by whether the stream was readable. In a more correct
implementation, the Prefix-SID boundary should be checked
as well. The length of the Prefix-SID path attribute can
be obtained from bgp_attr_parse_args.
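A rough sketch of a loop bounded by the attribute length rather than by what is left in the stream. `count_tlvs` and its parameters are hypothetical and work on a plain byte buffer, not on bgpd's stream API; the TLV layout assumed is a 1-byte type followed by a 2-byte length:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: count the TLVs inside one attribute.
 * buf/buf_len describe everything readable, attr_len is the length
 * announced for this attribute only. Returns -1 on malformed input.
 */
int count_tlvs(const uint8_t *buf, size_t buf_len, size_t attr_len)
{
    size_t off = 0;
    int count = 0;

    assert(buf != NULL);

    if (attr_len > buf_len)
        return -1;

    /* Stop at the attribute boundary, not at the end of the buffer. */
    while (off < attr_len) {
        uint16_t tlv_len;

        /* A TLV header needs 1 byte of type plus 2 bytes of length. */
        if (attr_len - off < 3)
            return -1;
        tlv_len = (uint16_t)((buf[off + 1] << 8) | buf[off + 2]);
        off += 3;

        /* The value must also end inside the attribute. */
        if (tlv_len > attr_len - off)
            return -1;
        off += tlv_len;
        count++;
    }
    return count;
}
```

With a buffer that holds one 5-byte TLV followed by unrelated bytes, passing `attr_len = 5` yields one TLV, while treating the whole buffer as the attribute would misread the trailing bytes as a second TLV.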
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
According to https://tools.ietf.org/html/rfc7606 some of the attributes
MUST be handled using the "treat-as-withdraw" approach.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
bgp flowspec packets are being forged correctly. There is no need to
check for bgp length, as the bgp nlri length is checked at reception.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
bgpd already supports the BGP Prefix-SID path attribute, and
there are several sub-types of the Prefix-SID path attribute.
This commit makes bgpd support additional sub-types:
sub-Type-4 and sub-Type-5, used to construct the VPNv4 SRv6 backend
with the vpnv4-unicast address family.
These path attributes are already supported by Cisco's IOS-XR and NX-OS.
Prefix-SID sub-Type-4 and sub-Type-5 are defined in the following
IETF drafts.
Supports(A-part-of):
- https://tools.ietf.org/html/draft-dawra-idr-srv6-vpn-04
- https://tools.ietf.org/html/draft-dawra-idr-srv6-vpn-05
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
Guess what - for a bounds check to work, it has to happen *before* you
read the data. We were trusting the attribute field received in a prefix
SID attribute and then checking if it was correct afterwards, but if it was
wrong we'd crash before that.
This fixes the problem, and adds additional paranoid bounds checks.
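A minimal sketch of the check-then-read order. `struct cursor` and `cursor_get_u32` are hypothetical and stand in for the real stream accessors:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical read cursor over received bytes; not FRR's struct stream. */
struct cursor {
    const uint8_t *data;
    size_t len; /* total bytes available */
    size_t pos; /* next byte to read */
};

/*
 * Read a big-endian 32-bit value. The remaining length is checked first;
 * on failure nothing is read and the cursor does not move.
 */
bool cursor_get_u32(struct cursor *c, uint32_t *out)
{
    assert(c != NULL && out != NULL && c->pos <= c->len);

    if (c->len - c->pos < 4)
        return false;

    *out = ((uint32_t)c->data[c->pos] << 24)
           | ((uint32_t)c->data[c->pos + 1] << 16)
           | ((uint32_t)c->data[c->pos + 2] << 8)
           | (uint32_t)c->data[c->pos + 3];
    c->pos += 4;
    return true;
}
```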
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
When passing a v4 multicast route to a peer, send
the v4 nexthop as the preferred method.
Fixes: #5582
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* Move VNC interning to the appropriate spot
* Use existing bgp_attr_flush_encap to free encap sets
* Assert that refcounts are correct before exiting to keep the demons
contained in their fiery prison
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Early exits without appropriate cleanup were causing obscure double
frees and other issues later on in the attribute parsing code. If we
return anything except a hard attribute parse error, we have cleanup and
refcounts to manage.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This commit makes bgpd skip and ignore unsupported
sub-types of PREFIX_SID (especially newly defined sub-types).
Currently bgpd can't parse unsupported sub-types of PREFIX_SID.
PREFIX_SID is specified in draft-ietf-idr-bgp-prefix-sid-27.
New sub-types are already drafted in
draft-dawra-idr-srv6-vpn-05. (Type5,6 are newly defined.)
This commit fixes the problem reported as #5277 on GitHub.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
This table identifier can be used for policy routing. Incoming entries
are locally exported to that local table identifier.
Note that for the new table identifier to apply to all
entries, the user should flush local tables first.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
RFC 4271 sec 6.3 p33: in the case of a BGP_NEXTHOP attribute with an
incorrect value, FRR is supposed to send a notification
and include the corresponding type, length and value of the NEXT_HOP
attribute in the notification data.
Fixes: #4997
Signed-off-by: Nikos <ntriantafillis@gmail.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Allow bgp to set a local Administrative distance to use
for installing routes into the rib.
Example:
!
router bgp 9323
bgp router-id 1.2.3.4
neighbor enp0s8 interface remote-as external
!
address-family ipv4 unicast
neighbor enp0s8 route-map DISTANCE in
exit-address-family
!
route-map DISTANCE permit 10
set distance 153
!
line vty
!
end
eva# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
B 0.0.0.0/0 [153/0] via fe80::a00:27ff:fe84:c2d6, enp0s8, 00:00:06
K>* 0.0.0.0/0 [0/100] via 10.0.2.2, enp0s3, 00:06:31
B>* 1.1.1.1/32 [153/0] via fe80::a00:27ff:fe84:c2d6, enp0s8, 00:00:06
B>* 1.1.1.2/32 [153/0] via fe80::a00:27ff:fe84:c2d6, enp0s8, 00:00:06
B>* 1.1.1.3/32 [153/0] via fe80::a00:27ff:fe84:c2d6, enp0s8, 00:00:06
C>* 10.0.2.0/24 is directly connected, enp0s3, 00:06:31
K>* 169.254.0.0/16 [0/1000] is directly connected, enp0s3, 00:06:31
eva#
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This change addresses the following:
1) Ensures logs under DEBUG macro checks are categorized
as zlog_debug instead of zlog_info.
2) Error logs are categorized as zlog_err instead of zlog_info.
3) Rephrasing certain logs to make them appear more intuitive.
Signed-off-by: NaveenThanikachalam <nthanikachal@vmware.com>
A problem was reported where, when vrf route-leaking from an unnumbered
peer in one vrf to a numbered peer in another vrf, the nexthop
attribute was missing from the update, causing the session to fail.
We determined that we needed to expand the mechanism for verifying whether
the route has been learned in the other vrf without an ipv4 nexthop.
Ticket: CM-25610
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
bgp update messages were not correctly calculating the size
for a labeled-unicast prefix, as they were not accounting
for the label. If the update message was large enough to
overflow the maximum packet size (4096 bytes) this could
cause bgpd to send a malformed update packet.
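As a sketch of the size accounting, assuming a single 3-byte label per prefix as in the RFC 8277 encoding. `nlri_size` and `nlri_fits` are hypothetical helpers, not the functions touched by this commit:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LABEL_LEN 3 /* one label entry on the wire (RFC 8277) */

/* Bytes for one NLRI: length octet + optional label + prefix bytes. */
size_t nlri_size(unsigned int prefix_bits, bool has_label)
{
    size_t size = 1 + (prefix_bits + 7) / 8;

    if (has_label)
        size += LABEL_LEN;
    return size;
}

/* True if one more NLRI still fits into a packet of max_size bytes. */
bool nlri_fits(size_t used, size_t max_size, unsigned int prefix_bits,
               bool has_label)
{
    assert(used <= max_size);
    return nlri_size(prefix_bits, has_label) <= max_size - used;
}
```

With 4090 bytes already used out of 4096, an unlabeled /24 (4 bytes) still fits, while the same prefix with a label (7 bytes) does not; leaving the label out of the calculation is what let the packet overflow.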
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Modify the code such that we can automatically turn the IANA values of afi
and safi into pleasant-to-read strings.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The bgp_attr_extcom_tunnel_type does not properly
compile with warnings turned on due to recent change.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
It doesn't make much sense for a hash function to modify its argument,
so const the hash input.
BGP does it in a couple places, those cast away the const. Not great but
not any worse than it was.
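A small sketch of a key function with a const input. `struct blob` and `blob_hash_key` are hypothetical, and FNV-1a is used here only for brevity; it is not the hash bgpd uses:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hashed object. */
struct blob {
    size_t length;
    const uint8_t *data;
};

/* The key function only reads its argument, so it takes const void *. */
uint32_t blob_hash_key(const void *arg)
{
    const struct blob *b = arg;
    uint32_t h = 2166136261u; /* FNV-1a 32-bit offset basis */
    size_t i;

    assert(b != NULL);
    for (i = 0; i < b->length; i++) {
        h ^= b->data[i];
        h *= 16777619u; /* FNV-1a 32-bit prime */
    }
    return h;
}
```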
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This diff contains 2 parts:
1. Extract the tunnel type info from bgp extended communities.
2. Make rfapi use this common tunnel type ap
Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
This is causing interop issues with vendors. According to the RFC,
receiver should ignore the NEXT_HOP attribute with MP_REACH_NLRI
present.
Signed-off-by: nikos <ntriantafillis@gmail.com>
This is causing interop issues with vendors. According to the RFC,
receiver should ignore the NEXT_HOP attribute with MP_REACH_NLRI
present.
Signed-off-by: nikos ntriantafillis@gmail.com
When using remove-private-AS together with local-as
aspath_remove_private_asns() is called before bgp_packet_attribute().
In this case, private AS will always appear in front of change_local_as.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
We have the same warn message in 3 spots, which makes it extremely
hard to figure out which of the 3 has gone terribly wrong.
Add a bit of code to disambiguate the 3 situations.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Any EVPN BGP update message comes with a router MAC extended
community, which can potentially contain the same MAC address
as one of the local SVIs' (L3VNI) MAC addresses.
Flag that the router MAC exists locally and, during route processing in
bgp_update(), filter the route.
Ticket:CM-23674
Reviewed By:CCR-8336
Testing Done:
Configure L3vni mac on TORS1 which is similar to TORC11
L3vni MAC. When TORC11 received the EVPN update with
Router mac extended community, this check rejected the
BGP update message.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Consider the following topo VTEP1->SPINE1->VTEP2. ebgp is being used
for evpn route exchange with SPINE just acting as a pass through.
1. VTEP1 was building the type-3 IMET route with the correct PMSI
tunnel type (ingress-replication) and label (VNI)
2. Spine1 was however only parsing the tunnel-type in the attr (was
skipping parsing of the label field altogether) -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
root@MSP1:~# net show bgp l2vpn evpn route rd 27.0.0.15:4 type multicast
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[ESI]:[EthTag]:[IPlen]:[IP]
BGP routing table entry for 27.0.0.15:4:[3]:[0]:[32]:[27.0.0.15]
Paths: (1 available, best #1)
Advertised to non peer-group peers:
TORC11(downlink-1) TORC12(downlink-2) TORC21(downlink-3) TORC22(downlink-4) TORS1(downlink-5) TORS2(downlink-6)
Route [3]:[0]:[32]:[27.0.0.15]
5550
27.0.0.15 from TORS1(downlink-5) (27.0.0.15)
Origin IGP, valid, external, bestpath-from-AS 5550, best
Extended Community: RT:5550:1003 ET:8
AddPath ID: RX 0, TX 227
Last update: Thu Feb 7 15:44:22 2019
PMSI Tunnel Type: Ingress Replication, label: 16777213 >>>>>>>
Displayed 1 prefixes (1 paths) with this RD (of requested type)
root@MSP1:~#
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
3. So VTEP2 didn't rx the correct label.
In an all-FRR setup this doesn't have any functional consequence, but some
vendors validate the content of the label field as well and ignore
the IMET route from FRR (say VTEP1 is FRR and VTEP2 is 3rd-party). The
functional consequence is that VTEP2 ignores VTEP1's IMET route and doesn't
add VTEP1 to the corresponding l2-vni flood list.
This commit fixes up the PMSI attr parsing on spine-1 -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
root@MSP1:~# net show bgp l2vpn evpn route rd 27.0.0.15:4 type multicast
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[ESI]:[EthTag]:[IPlen]:[IP]
BGP routing table entry for 27.0.0.15:4:[3]:[0]:[32]:[27.0.0.15]
Paths: (1 available, best #1)
Advertised to non peer-group peers:
TORC11(downlink-1) TORC12(downlink-2) TORC21(downlink-3) TORC22(downlink-4) TORS1(downlink-5) TORS2(downlink-6)
Route [3]:[0]:[32]:[27.0.0.15]
5550
27.0.0.15 from TORS1(downlink-5) (27.0.0.15)
Origin IGP, valid, external, bestpath-from-AS 5550, best
Extended Community: RT:5550:1003 ET:8
AddPath ID: RX 0, TX 278
Last update: Thu Feb 7 00:17:40 2019
PMSI Tunnel Type: Ingress Replication, label: 1003 >>>>>>>>>>>
Displayed 1 prefixes (1 paths) with this RD (of requested type)
root@MSP1:~#
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Ticket: CM-23790
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Currently we are hardcoding it at the time of attr building to
ingress-replication. This is just a code clean-up and has no
functional impact.
Ticket: CM-23790
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
The `struct bgp_route_evpn` and `struct overlay_index` data
structures are exactly the same. Reduce to 1.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
community_free, lcommunity_free and ecommunity_free are similar functions,
and in most places the three are called together. The signature of
community_free differed from the other two, so the community_free API
signature has been modified to align with them and avoid confusion.
There is no functional impact.
Testing: manual testing and show commands
Signed-off-by: Sri Mohana Singamsetty msingamsetty@vmware.com
The ->hash_cmp and linked list ->cmp functions were sometimes
being used interchangeably and this really is not a good
thing. So let's modify the hash_cmp function pointer to return
a boolean and convert everything to use the new syntax.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add the ability to aggregate routes to handle
extended communities. Make the actions similar
to what we do for normal communities.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The peer->nexthop.ifp pointer must be set when parsing the
attributes in bgp_mp_reach_parse; notice when it is not
and fail gracefully.
Rework bgp_nexthop_set to remove the HAVE_CUMULUS and to
fail the nexthop_set when we have a zebra connection and
no ifp pointer, since not having a zebra connection and
no ifp pointer is legal.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The EVPN ND extended community supports the NA flag R-bit, to allow proxy ND.
Set the R-bit in the EVPN NA if a given router is the default gateway or there is a
local router attached, which can be determined based on the local neighbor entry.
Implement the BGP extended community attribute to generate and parse the R-bit and
pass it along to zebra to program the neigh entry in the kernel.
Upon receiving a MAC/IP update with community type 0x06 and sub_type 0x08,
pass the R-bit to zebra to program the neigh entry.
Set NTF_ROUTER in the neigh entry and inform the kernel to do proxy NA for EVPN.
Ref:
https://tools.ietf.org/html/draft-ietf-bess-evpn-na-flags-01
Ticket:CM-21712, CM-21711
Reviewed By:
Testing Done:
Configured a local-VNI-enabled L3 gateway, which acts as a router, and
checked 'show evpn arp-cache vni x ip <ip of svi>' on the originating and remote VTEPs.
The "Router" flag is set.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
In the case where an mp_unreach attribute is received without an
mp_reach attribute, it is not necessary to check for missing
attributes.
Fixes: 67495ddb2e ("bgpd: Fixes for recent well-known-attr check patch.")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This commit moves the command 'bgp enforce-first-as' from global BGP
instance configuration to peer/neighbor configuration, which can now be
changed by executing '[no] neighbor <neighbor> enforce-first-as'.
End users can now enforce sane first-AS checking on regular sessions
while e.g. disabling the checks on routeserver sessions, which usually
strip away their own AS number from the path.
To ensure backwards-compatibility, a migration routine was added which
automatically sets the 'enforce-first-as' flag on all configured
neighbors if the old global setting was activated. The old global
command immediately disappears after running the migration routine once.
Signed-off-by: Pascal Mathis <mail@pascalmathis.com>
Handle multiple PREFIX_SID's at the same time. The draft clearly
states that multiple should be handled and we have an actual pcap
file that clearly has multiple PREFIX_SID's at the same time.
Fixes: #2153
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The vrf-to-vrf route leaking auto-derives the RD and RT and
installs the routes into the appropriate vpn table.
When an operator configured ipv[4|6] vpn neighbors, these
routes were showing up off box. The RD and RT
values chosen are locally significant but globally
useless and may cause confusion.
Put a special bit of code in to notice that we
should not be advertising these routes off box.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Connected routes redistributed into BGP as well as IPv4 routes with IPv6
link-local next hops (RFC 5549) need information about the associated
interface in BGP if they are candidates to be leaked into another VRF. In
the absence of route leaking, this was not necessary. Introduce the
appropriate mechanism and ensure this is used during route install (in
the target VRF).
Ticket: CM-20343, CM-20382
Testing done:
1. Manually verified failed scenarios and some additional ones - logs
in the tickets.
2. Ran bgp-min and evpn-min - results are good.
3. Ran vrf smoke - has some failures, but none which look new
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
FS UNREACH message with 0 NLRI inside is sent after each peer
establishment. FS can send NLRI messages with no nexthop.
The commit fixes a message that was triggered by mistake:
if FS is about to be sent, that message is no longer output.
Also it fixes a typo.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This work is derived from a work done by China-Telecom.
That initial work can be found in [0].
As the gap between frr and quagga is significant, some rework has been
done in the meantime.
The initial work consists of the following:
- the client side of flowspec
- the enhancement of address-family ipv4/ipv6 flowspec
- partial data path handling at reception has been prepared
- the support for ipv4 flowspec or ipv6 flowspec in BGP open messages,
and the internals of BGP has been done.
- the memory contexts necessary for flowspec has been provisioned
In addition to this work, the following has been done:
- the complement of adaptation for FS safi in bgp code
- the code checkstyle has been reworked so as to match frr checkstyle
- the processing of IPv6 FS NLRI is prevented
- the processing of FS NLRI is stopped (temporarily)
[0] https://github.com/chinatelecom-sdn-group/quagga_flowspec/
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: jaydom <chinatelecom-sdn-group@github.com>
The following types are nonstandard:
- u_char
- u_short
- u_int
- u_long
- u_int8_t
- u_int16_t
- u_int32_t
Replace them with the C99 standard types:
- uint8_t
- unsigned short
- unsigned int
- unsigned long
- uint8_t
- uint16_t
- uint32_t
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Received PMSI tunnel attributes (in EVPN type-3 route) were not recognized.
Parse them and display the tunnel type when looking at routes. Note that
the only tunnel type currently supported is ingress replication (IR). A
warning message will be logged if the received tunnel type is something
else, but the attribute is otherwise ignored.
Updates: a21bd7a (bgpd: add PMSI_TUNNEL_ATTRIBUTE to EVPN IMET routes)
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
When doing symmetric routing,
EVPN type-2 (MACIP) routes need to be advertised with two labels (VNIs)
the first being the L2 VNI (identifying the VLAN) and
the second being the L3 VNI (identifying the VRF).
The receive processing needs to handle one or two labels too.
Ticket: CM-18489
Review: CCR-6949
Testing: manual and bgp/evpn/mpls smoke
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
1. Added default gw extended community
2. code modification to handle sticky-mac/default-gw-mac as they go together
3. show command support for newly added extended community
4. State in zebra to reflect if a mac/neigh is default gateway
5. show command enhancement to reflect the same in zebra commands
Ticket: CM-17428
Review: CCR-6580
Testing: Manual
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Some of the deprecated stream.h macros see such little use that we may
as well just remove them and use the non-deprecated macros.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Move and modify all network input related code to bgp_io.c
* Add a real input buffer to `struct peer`
* Move connection initialization to its own thread.c task instead of
piggybacking off of bgp_read()
* Tons of little fixups
Primary changes are in bgp_packet.[ch], bgp_io.[ch], bgp_fsm.[ch].
Changes made elsewhere are almost exclusively refactoring peer->ibuf to
peer->curr since peer->ibuf is now the true FIFO packet input buffer
while peer->curr represents the packet currently being processed by the
main pthread.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
So we have the ability to apply speculative route-maps to
neighbor display to see what the changes would look like
via some show commands. When we do this we make a
shallow copy of the attr data structure and then pass
it around for applying the routemap. After we've applied
this route-map and displayed it we really need to clean
up memory that the route-map application applied.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
A crafted BGP UPDATE with a malformed path attribute length field causes
bgpd to dump up to 65535 bytes of application memory and send it as the
data field in a BGP NOTIFY message, which is truncated to 4075 bytes
after accounting for protocol headers. After reading a malformed length
field, a NOTIFY is generated that is supposed to contain the problematic
data, but the malformed length field is inadvertently used to compute
how much data we send.
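The essence of the fix can be sketched as clamping the copy length to what was actually received and to the destination size. `notify_copy_value` and its parameters are hypothetical and operate on plain buffers, not on the bgpd stream:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy the offending attribute value into the NOTIFY data buffer.
 * claimed_len comes from the (untrusted) attribute header, so it is
 * clamped to the bytes really available and to the buffer size.
 * Returns the number of bytes copied.
 */
size_t notify_copy_value(uint8_t *dst, size_t dst_size, const uint8_t *src,
                         size_t src_available, size_t claimed_len)
{
    size_t n = claimed_len;

    assert(dst != NULL && src != NULL);

    if (n > src_available)
        n = src_available;
    if (n > dst_size)
        n = dst_size;
    if (n > 0)
        memcpy(dst, src, n);
    return n;
}
```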
CVE-2017-15865
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This issue was discovered on a live session with an extremely
old cisco 7206VXR router running 12.2(33)SRE4. The sending router
is sending us an empty NLRI that is MP_REACH. From RFC
exploration(thanks Russ!) it appears that this was
considered a 'valid' way to send EOR.
Following discussion, we decided that we should treat
this situation as an EOR marker instead of bringing
down the session.
Applying this fix on the FRR router seeing this issue
allows it to continue its peering relationship with
the ASR. Since this is a point fix I do not see
a high likelihood of further fallout.
Fixes: #1258
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Commit 8c9cc7bbf6 changed the
`struct bgp_attr_encap_subtlv` type to end in a zero-length
array instead of a 1-byte array. All memory allocations
for this were subsequently off by 1 byte since those were not
adjusted either.
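A sketch of the allocation arithmetic with a trailing array, using a hypothetical `struct subtlv` and libc `calloc()` rather than the real structure and `XCALLOC`:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sub-TLV with a flexible array member as its last field. */
struct subtlv {
    struct subtlv *next;
    uint16_t type;
    uint16_t length;
    uint8_t value[]; /* contributes nothing to sizeof(struct subtlv) */
};

struct subtlv *subtlv_new(uint16_t type, const uint8_t *value,
                          uint16_t length)
{
    struct subtlv *tlv;

    assert(value != NULL || length == 0);

    /* sizeof() excludes the array, so add the full payload length. */
    tlv = calloc(1, sizeof(*tlv) + length);
    if (tlv == NULL)
        return NULL;

    tlv->type = type;
    tlv->length = length;
    if (length > 0)
        memcpy(tlv->value, value, length);
    return tlv;
}
```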
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
stlv_last is initialized with the loops. No need to reset it.
Its scope is local to the use with the loops.
Signed-off-by: Vincent Jardin <vincent.jardin@6wind.com>
These are now unused. route-maps can't modify these attributes, so
there is no need for _dup functions.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
bgp_attr_deep_dup is based on a misunderstanding of how route-maps work.
They never change actual data, just pointers & fields in "struct attr".
The correct thing to do is copy struct attr and call bgp_attr_flush()
afterwards.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This attempt at optimization has cost us more than a week's worth of
time on several people hunting down the subtle bug that it was missing
an increment on attr->lcommunity.
This is absolutely not worth the maintenance cost.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
1) Add hash names to all hash_create calls
2) Fix community_hash, ecommunity_hash and lcommunity_hash key
creation
3) Fix output of community and lcommunity iterators (why would
we want to see the memory location of the bucket?).
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
A prior change broke the nexthop setting for labeled-unicast
address-family in a RFC-5549 scenario (IPv4 prefixes exchanged
with IPv6 next hops). This commit fixes the issue.
Fixes: "bgpd: Fix next hop setting for EVPN"
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Commit c8e7b895 ("bgpd: use Jenkins hash for BGP transit, cluster and
attr hashes") changed attrhash_key_make() to use Jenkins hash, commit
c8f3fe30 ("bgpd: Remove AS Path limit/TTL functionality") introduced
a bogus change with a snippet of code that was deleted in the first
one.
Signed-off-by: Jorge Boncompte <jbonor@gmail.com>
This reverts commit c14777c6bf.
clang 5 is not widely available enough for people to indent with. This
is particularly problematic when rebasing/adjusting branches.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Most of the attributes in 'struct attr_extra' allow for
the more interesting cases of using bgp. The extra
overhead of managing it will induce errors as we add
more attributes and the extra memory overhead is
negligible on anything but full bgp feeds.
Additionally this greatly simplifies the code for
the handling of data.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
bgpd: Fix missing label set
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Implement support for sticky (static) MACs. This includes the following:
- Recognize MAC is static (using NUD_NOARP flag) and inform BGP
- Construct MAC mobility extended community for sticky MACs as per
RFC 7432 section 15.2
- Inform to zebra that remote MAC is sticky, where appropriate
- Install sticky MACs into the kernel with the right flag
- Appropriate handling in route selection
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Core EVPN route handling functionality. This includes support for the
following:
- interface with zebra to learn about local VNIs and MACIPs as well as
to install remote VTEPs (per VNI) and remote MACIPs
- create/update/delete EVPN type-2 and type-3 routes
- attribute creation, route selection and install
- route handling per VNI and for the global routing table
- parsing of received EVPN routes and handling by route type
- encoding attributes for EVPN routes and EVPN prefix creation (for
Updates)
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
The next hop for EVPN routes must be an IPv4 or IPv6 address as per
RFC 7432. Ensure this is correctly handled. Also, ensure there
are correct checks for AFI_L2VPN and nexthop AFI is not AFI_L2VPN.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
log.c provides functionality for associating a constant (typically a
protocol constant) with a string and finding the string given the
constant. However this is highly delicate code that is extremely prone
to stack overflows and off-by-one's due to requiring the developer to
always remember to update the array size constant and to do so correctly
which, as shown by example, is never a good idea.
The original goal of this code was to try to implement lookups in O(1)
time without a linear search through the message array. Since this code
is used 99% of the time for debugs, it's worth the 5-6 additional cmp's
worst case if it means we avoid exploitable bugs due to oversights...
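A minimal sketch of a sentinel-terminated table with a linear search, so that no separate array-size constant has to be kept in sync. `struct message`, `origin_msgs` and `msg_lookup` are hypothetical names for illustration, not the log.c API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constant-to-string entry. */
struct message {
    int key;
    const char *str;
};

/* The table ends with a NULL-string sentinel instead of a size constant. */
static const struct message origin_msgs[] = {
    {0, "IGP"},
    {1, "EGP"},
    {2, "INCOMPLETE"},
    {0, NULL}, /* sentinel */
};

/* Linear search; returns fallback when the key is not in the table. */
const char *msg_lookup(const struct message *tbl, int key,
                       const char *fallback)
{
    assert(tbl != NULL);

    for (; tbl->str != NULL; tbl++)
        if (tbl->key == key)
            return tbl->str;
    return fallback;
}
```

A call such as `msg_lookup(origin_msgs, 1, "unknown")` walks at most the whole table, which for tables of this size costs only a handful of comparisons.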
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
- All ipv4 labeled-unicast routes are now installed in the ipv4 unicast
table. This allows us to do things like take routes from an ipv4
unicast peer, allocate a label for them and TX them to a ipv4
labeled-unicast peer. We can do the opposite where we take routes from
a labeled-unicast peer, remove the label and advertise them to an ipv4
unicast peer.
- Multipath over a labeled route and non-labeled route is not allowed.
- You cannot activate a peer for both 'ipv4 unicast' and 'ipv4
labeled-unicast'
- The 'tag' variable was overloaded for zebra's route tag feature as
well as the mpls label. I added a 'mpls_label_t mpls' variable to
avoid this. This is much cleaner but resulted in touching a lot of
code.
The bpacket_reformat_for_peer() function rewrites the nexthop of outgoing
route updates on a per-peer basis in order to handle route-maps ("set
ip next-hop") and locally-originated routes missing a nexthop.
In the latter case, RFC 4271 says the following: "When announcing a
locally-originated route to an internal peer, the BGP speaker SHOULD use
the interface address of the router through which the announced network
is reachable for the speaker as the NEXT_HOP".
We were doing this for regular IPv4/IPv6 routes, but not for
VPN/EVPN/ENCAP routes, which were being announced with invalid nexthops
(0.0.0.0 or ::).
This patch fixes this problem.
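
The idea can be sketched as follows, assuming the nexthop is held as a
raw byte buffer of 4 (IPv4) or 16 (IPv6) bytes. The helper names are
hypothetical and this is not the code in bpacket_reformat_for_peer().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return 1 if every byte is zero, i.e. the address is 0.0.0.0 or ::. */
static int demo_nexthop_is_unspecified(const uint8_t *addr, size_t len)
{
	assert(addr != NULL);
	for (size_t i = 0; i < len; i++) {
		if (addr[i] != 0)
			return 0;
	}
	return 1;
}

/*
 * If the outgoing nexthop is unspecified, overwrite it with the local
 * address used towards the peer; otherwise leave it untouched.
 */
static void demo_nexthop_fixup(uint8_t *nh, const uint8_t *local, size_t len)
{
	assert(local != NULL);
	if (demo_nexthop_is_unspecified(nh, len))
		memcpy(nh, local, len);
}
```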
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Added bgp_nexthop_afi() to have one place that determines what the
Nexthop AFI is for bgp_packet_mpattr_start()
The FSF's address changed, and we had a mixture of comment styles for
the GPL file header. (The style with * at the beginning won out with
580 to 141 in existing files.)
Note: I've intentionally left intact other "variations" of the copyright
header, e.g. whether it says "Zebra", "Quagga", "FRR", or nothing.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
The initial implementation was against draft-keyupate-idr-bgp-prefix-sid-02
This updates our label-index implementation up to draft-ietf-idr-bgp-prefix-sid-05
- changed BGP_ATTR_LABEL_INDEX to BGP_ATTR_PREFIX_SID
- since there are multiple TLVs in BGP_ATTR_PREFIX_SID you can no longer
rely on that flag to know if there is a label-index for the path. I
changed bgp_attr_extra_new() to init the label_index to
BGP_INVALID_LABEL_INDEX
- put some placeholder code in for the other two TLVs (IPv6 and
Originator SRGB)
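
A rough sketch of why the label index has to start out as "invalid":
the attribute is walked TLV by TLV and the value is only set when a
Label-Index TLV is actually found. This assumes a 1-byte type and
2-byte length header, and a Label-Index TLV (type 1, length 7) laid out
as reserved(1), flags(2), label index(4). The DEMO_ names are
hypothetical stand-ins, not the bgpd definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_TLV_LABEL_INDEX     1
#define DEMO_INVALID_LABEL_INDEX 0xFFFFFFFFu

/* Walk the TLVs in buf; return the label index or the invalid marker. */
static uint32_t demo_prefix_sid_label_index(const uint8_t *buf, size_t len)
{
	uint32_t label_index = DEMO_INVALID_LABEL_INDEX;
	size_t off = 0;

	assert(buf != NULL || len == 0);
	while (len - off >= 3) {
		uint8_t type = buf[off];
		size_t tlv_len = ((size_t)buf[off + 1] << 8) | buf[off + 2];

		off += 3;
		if (tlv_len > len - off)
			break; /* truncated TLV, stop parsing */
		if (type == DEMO_TLV_LABEL_INDEX && tlv_len == 7) {
			/* skip reserved (1 byte) and flags (2 bytes) */
			const uint8_t *v = buf + off + 3;

			label_index = ((uint32_t)v[0] << 24) | ((uint32_t)v[1] << 16)
				      | ((uint32_t)v[2] << 8) | (uint32_t)v[3];
		}
		off += tlv_len; /* other TLV types are skipped */
	}
	return label_index;
}
```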
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
- cleaned up the "show bgp ipv4 labeled-unicast x.x.x.x" output
- fixed some json keys to use camelCase
- bgp_attr_label_index() was clearing BGP_ATTR_LABEL_INDEX because it
was comparing mp_update->afi against SAFI_LABELED_UNICAST instead of
mp_update->safi
- added BGP_ATTR_LABEL_INDEX to attr_str
Labeled-unicast updates were being sent with an ipv6 nexthop due to
not setting the mp_nexthop_len or nh_afi. On the receive side, the
prefix length was being incorrectly determined and has been fixed.
Also the stream for bgp_label_buf was not created. All resolved.
Ticket: CM-15260
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Reviewed-by:
Implement BGP Prefix-SID IETF draft to be able to signal a labeled-unicast
prefix with a label index (segment ID). This makes it easier to deploy
global MPLS labels with BGP, even without other aspects of Segment Routing
implemented.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Implement support for negotiating IPv4 or IPv6 labeled-unicast address
family, exchanging prefixes and installing them in the routing table, as
well as interactions with Zebra for FEC registration. This is the
implementation of RFC 3107.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
To determine whether two route entries carry the same overlay index,
their overlay index fields are compared. If both fields are NULL, the
comparison should treat them as equal and return true, which was not
being done.
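
A minimal sketch of the intended comparison, assuming the overlay index
is reached through a pointer that may be NULL. The struct layout and
function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical overlay index: byte arrays only, so no padding. */
struct demo_overlay_index {
	uint8_t esi[10];
	uint8_t gw_ip[16];
};

static_assert(sizeof(struct demo_overlay_index) == 26,
	      "memcmp below relies on the struct having no padding");

/* Two absent overlay indexes are equal; one absent and one present are not. */
static bool demo_overlay_index_same(const struct demo_overlay_index *a,
				    const struct demo_overlay_index *b)
{
	if (a == NULL && b == NULL)
		return true;
	if (a == NULL || b == NULL)
		return false;
	return memcmp(a, b, sizeof(*a)) == 0;
}
```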
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
As per draft-ietf-idr-tunnel-encaps-02, section 3.2.1, BGP Encap
attribute supports vxlan tunnel type. A new tunnel attribute has been
appended to subtlv list, describing the vxlan network identifier to
be used for the routing information of the BGP update message.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This patch introduces the ability to build route type 5 messages
when EVPN is enabled. The parameters are taken from the bgp extra
attribute structure: the ESI and the ethernet tag information. In
addition to this, the nexthop attribute is collected too.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This patch appends the nexthop attribute to the EVPN message, in
addition to appending the gateway IP in the RT-5 NLRI itself. On
reception, if both pieces of information are present, the GW IP
information supersedes the NLRI nexthop attribute.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The commit introduces the changes to be done to carry route type 5 EVPN
information in bgp extra attribute information. The commit also handles
the update processing for route type 5 information, including ESI,
gatewayIP and label information.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(to match surrounding code)
"git diff -w" should be almost empty.
Copyright edited to say FRR, this is not GNU Zebra :)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
BGP Large Communities are a novel way to signal information between
networks. An example of a Large Community is: "2914:65400:38016". Large
BGP Communities are composed of three 4-byte integers, separated by a
colon. This is easy to remember and accommodates advanced routing
policies in relation to 4-Byte ASNs.
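
As an illustration of the textual format, here is a small sketch that
parses a string such as "2914:65400:38016" into its three 32-bit
parts. The struct and function names are made up for this example and
are not the ones introduced by this feature; the field names follow the
terms used in RFC 8092.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for one Large Community value. */
struct demo_lcommunity {
	uint32_t global_admin;
	uint32_t local1;
	uint32_t local2;
};

/* Parse a run of decimal digits into a uint32_t; advance *p past it. */
static int demo_parse_u32(const char **p, uint32_t *out)
{
	uint64_t v = 0;
	int digits = 0;

	while (**p >= '0' && **p <= '9') {
		v = v * 10 + (uint64_t)(**p - '0');
		if (v > UINT32_MAX)
			return -1; /* does not fit in 4 bytes */
		(*p)++;
		digits++;
	}
	if (digits == 0)
		return -1;
	*out = (uint32_t)v;
	return 0;
}

/* Returns 0 on success, -1 if s is not of the form "A:B:C". */
static int demo_lcommunity_parse(const char *s, struct demo_lcommunity *out)
{
	assert(s != NULL && out != NULL);
	if (demo_parse_u32(&s, &out->global_admin) != 0 || *s++ != ':')
		return -1;
	if (demo_parse_u32(&s, &out->local1) != 0 || *s++ != ':')
		return -1;
	if (demo_parse_u32(&s, &out->local2) != 0 || *s != '\0')
		return -1;
	return 0;
}
```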
This feature was developed by:
Keyur Patel <keyur@arrcus.com> (Arrcus, Inc.),
Job Snijders <job@ntt.net> (NTT Communications),
David Lamparter <equinox@opensourcerouting.org>
and Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: Job Snijders <job@ntt.net>
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Introduce internal and IANA definitions for AFI/SAFI and mapping
functions and modify code to use these. This refactoring will
facilitate adding support for other AFI/SAFI whose IANA values
won't be suitable for internal data structure definitions (e.g.,
they are not contiguous).
The commit adds some fixes related to afi/safi testing with the
'make check' command.
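
A minimal sketch of the internal/IANA split for AFI, assuming
contiguous internal values that can index arrays and a pair of mapping
functions. The IANA numbers (1, 2, 25) are the assigned ones; the
demo_ names are hypothetical and differ from the definitions added by
this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* IANA-assigned AFI numbers: not contiguous. */
#define DEMO_IANA_AFI_IPV4  1
#define DEMO_IANA_AFI_IPV6  2
#define DEMO_IANA_AFI_L2VPN 25

/* Hypothetical internal values: contiguous, usable as array indices. */
enum demo_afi {
	DEMO_AFI_IP = 0,
	DEMO_AFI_IP6,
	DEMO_AFI_L2VPN,
	DEMO_AFI_MAX
};

/* Wire value to internal value; returns -1 for unsupported AFIs. */
static int demo_afi_iana2int(uint16_t iana, enum demo_afi *out)
{
	assert(out != NULL);
	switch (iana) {
	case DEMO_IANA_AFI_IPV4:
		*out = DEMO_AFI_IP;
		return 0;
	case DEMO_IANA_AFI_IPV6:
		*out = DEMO_AFI_IP6;
		return 0;
	case DEMO_IANA_AFI_L2VPN:
		*out = DEMO_AFI_L2VPN;
		return 0;
	default:
		return -1;
	}
}

/* Internal value to wire value; returns 0 (reserved) if out of range. */
static uint16_t demo_afi_int2iana(enum demo_afi afi)
{
	static const uint16_t map[DEMO_AFI_MAX] = {
		[DEMO_AFI_IP] = DEMO_IANA_AFI_IPV4,
		[DEMO_AFI_IP6] = DEMO_IANA_AFI_IPV6,
		[DEMO_AFI_L2VPN] = DEMO_IANA_AFI_L2VPN,
	};

	return (unsigned int)afi < DEMO_AFI_MAX ? map[afi] : 0;
}
```

With this split, per-AFI tables can be sized by DEMO_AFI_MAX rather
than by the largest IANA value.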
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Ticket: CM-11416
Reviewed By: CCR-3594 (mpls branch)
Testing Done: Not tested now, tested earlier on mpls branch