DF (Designated Forwarder) election is used for picking a single
BUM-traffic forwarder per-ES. RFC7432 specifies a mechanism called
service carving for DF election. However that mechanism has many
disadvantages -
1. Load-balances poorly.
2. Doesn't allow for a controlled failover needed in upgrade
scenarios.
3. Not easy to hw accelerate.
To fix the poor performance of service carving alternate DF mechanisms
have been proposed via the following drafts -
draft-ietf-bess-evpn-df-election-framework
draft-ietf-bess-evpn-pref-df
This commit adds support for the pref-df election mechanism which
is used as the default. Other mechanisms including service-carving
may be added later.
In this mechanism one switch on an ES is elected as DF based on the
preference value; higher preference wins with IP address acting
as the tie-breaker (lower-IP wins if pref value is the same).
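As an illustration of the rule above, here is a minimal Python sketch of
the comparison. It is not the bgpd implementation; `elect_df` and the
(vtep_ip, df_pref) tuples are hypothetical names chosen for this example.

```python
import ipaddress


def elect_df(candidates):
    """Pick the DF from a list of (vtep_ip, df_pref) pairs.

    Higher preference wins; on a tie the numerically lower IP wins.
    """
    return min(
        candidates,
        # Negate the preference so that min() prefers the highest value,
        # then fall back to the IP address as the tie-breaker.
        key=lambda c: (-c[1], int(ipaddress.ip_address(c[0]))),
    )[0]
```

With the values from the sample output below, the peer with preference
32767 would be picked over the local switch with preference 100.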
Sample output
=============
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
torm-11# sh bgp l2vpn evpn es 03:00:00:00:00:01:11:00:00:01
ESI: 03:00:00:00:00:01:11:00:00:01
Type: LR
RD: 27.0.0.15:6
Originator-IP: 27.0.0.15
Local ES DF preference: 100
VNI Count: 10
Remote VNI Count: 10
Inconsistent VNI VTEP Count: 0
Inconsistencies: -
VTEPs:
27.0.0.16 flags: EA df_alg: preference df_pref: 32767
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
torm-11# sh bgp l2vpn evpn route esi 03:00:00:00:00:01:11:00:00:01
*> [4]:[03:00:00:00:00:01:11:00:00:01]:[32]:[27.0.0.15]
27.0.0.15 32768 i
ET:8 ES-Import-Rt:00:00:00:00:01:11 DF: (alg: 2, pref: 100)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Attribute may not be long enough to contain a localpref value, resulting
in an assert on stream size. Gracefully handle this case instead.
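A minimal Python sketch of the idea, assuming the attribute value has
already been sliced out of the packet. `parse_local_pref` is a
hypothetical helper, not a bgpd function.

```python
import struct


def parse_local_pref(value):
    """Return LOCAL_PREF from the attribute value, or None if malformed."""
    if len(value) != 4:
        # Not exactly one 32-bit value: report it to the caller
        # instead of reading past the data and asserting.
        return None
    return struct.unpack("!I", value)[0]
```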
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Example configuration:
route-map SET_SR_POLICY permit 10
set sr-te color 1
!
router bgp 1
bgp router-id 1.1.1.1
neighbor 2.2.2.2 remote-as 1
neighbor 2.2.2.2 update-source lo
address-family ipv4 unicast
neighbor 2.2.2.2 next-hop-self
neighbor 2.2.2.2 route-map SET_SR_POLICY in
exit-address-family
!
!
BGP routes learned from 2.2.2.2 are mapped to the SR-TE Policy
which is uniquely determined by the BGP nexthop (2.2.2.2 in this
case) and the SR-TE color in the route-map.
Co-authored-by: Renato Westphal <renato@opensourcerouting.org>
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Co-authored-by: Sebastien Merle <sebastien@netdef.org>
Signed-off-by: Sebastien Merle <sebastien@netdef.org>
RFC 5701 is supported. Under bgp vpn, it is possible to configure a list
of route targets with IPv6 extended communities to import. Note that this
IPv6 extended community has been developed only for matching a bgp
flowspec update carrying the same IPv6 ext community.
In addition, draft-ietf-idr-flow-spec-v6-09 is implemented regarding
the redirect ipv6 option.
Practically, under bgp vpn, under ipv6 unicast, it is possible to
configure : [no] rt6 redirect import <IPV6>:<AS> values.
An incoming bgp update with fs ipv6 and that option matching a bgp vrf,
will be imported in that bgp vrf.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This is handy for situations when a notification was sent, but it's
not at all clear what triggered it.
Dumping all attributes under debug mode helps find
the _bad_ attribute.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
A new proxy flag has been added to the already existing NA extended
community to allow proxy advertisement of a local host by a VTEP that is
yet to independently establish local reachability.
Reference: draft-rbickhart-evpn-ip-mac-proxy-adv
The extended mac-mobility sequence number needs to be synced across
the ES peers. However we cannot let an ES-peer path win over a local
path on the same ES. To accomplish that some parameters such as the
MM seq number are bubbled up from the non-best path to the local path.
This mechanism is explained further in the path-selection patch.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Add ESI as an inline attribute field along with the other EVPN
attributes. This may be re-worked when the rest of the EVPN
attributes find a new home.
Some cleanup has been done to get rid of stale/unused references
to ESI. And also to consolidate duplicate definitions of ES ID
types.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Remove mid-string line breaks, cf. workflow doc:
.. [#tool_style_conflicts] For example, lines over 80 characters are allowed
for text strings to make it possible to search the code for them: please
see `Linux kernel style (breaking long lines and strings)
<https://www.kernel.org/doc/html/v4.10/process/coding-style.html#breaking-long-lines-and-strings>`_
and `Issue #1794 <https://github.com/FRRouting/frr/issues/1794>`_.
Scripted commit, idempotent to running:
```
python3 tools/stringmangle.py --unwrap `git ls-files | egrep '\.[ch]$'`
```
Signed-off-by: David Lamparter <equinox@diac24.net>
This is not the attribute involved in path selection and by rfc7606 it should
be just ignored.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Multiple different issues causing mostly UAFs but maybe other more
subtle things.
- Cluster lists were the only attributes whose pointers were not being
NULL'd when freed, resulting in heap UAF
- When performing an insert into the cluster hash, our temporary struct
used for hash_get() was inconsistent with our hash keying and
comparison functions. In the case of a zero length cluster list, the
->length field is 0 and the ->list field is NULL. When performing an
insert, we set the ->list field regardless of whether the length is 0.
This resulted in the two cluster lists hashing equal but not comparing
equal. Later, when removing one of them from the hash before freeing
it, because the key matched and the comparison succeeded (because it
was set to NULL *after* the search but *before* inserting into the
hash) we would sometimes release the duplicated copy of the struct,
and then free the one that remained in the hash table. Later accesses
constitute UAF. This is fixed by making sure the fields used for the
existence check match what is actually inserted into the hash when
that check fails.
This patch also makes cluster_unintern static, because it should be.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
With a full feed it is impossible to tell which prefix or
aspath is malformed.
Printing just `[EC 33554434] AGGREGATOR attribute is BGP_AS_ZERO(0)` isn't
enough; you can't directly pin-point where the problem is.
Additionally print at least the aspath here:
```
[EC 33554434] AGGREGATOR AS number is 0 for aspath: 65000 65031
```
Overall the full table has only 6 such malformed prefixes:
```
aspath: 64539 15096 6939 45430 45458
aspath: 64539 15096 6939 1299 3257 34984 34984 34984 34984 34984 51174
aspath: 64539 15096 6939 286 34984 16135 16135 {16135}
aspath: 64539 15096 6939 7545 7545 136001
aspath: 64539 15096 6939 6762 3269 20746
aspath: 64539 15096 6939 7018 3379
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Modify more code to use `const struct prefix` throughout
bgp. This is all prep work for adding an accessor function
for bgp_node to get the prefix and reduce all the places that
code needs to be touched when we get that work done.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Some were converted to bool, where a true/false status is needed.
Only those whose return status was always false or always true were
converted to void.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
We were using XMALLOC for these, and only initializing the refcount to 0
on one of them. Let's just use XCALLOC instead...
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Prefix-SID path attribute Label-Index TLV (type-1) is
used by SR-MPLS. The Label-Index TLV MUST be ignored
if that path attribute is attached to a non-Labeled-unicast
UPDATE message, as described in [ref1].
There is a problem case around this implementation.
This commit fixes that.
Before this commit,
setting the Label-Index value was unfortunately skipped in some cases,
because the Label-Index TLV implementation checks the AFI/SAFI pair
via the mp_update variable that is set by the bgp_mp_reach_parse function.
If MP_REACH_NLRI is present after PREFIX_SID, the bgp_attr_psid_sub
function can't know the AFI/SAFI pair, and the order of the
path attributes is not deterministic for the receiver.[ref2]
In this commit,
I relocated the AFI/SAFI check to after the path-attr loop.
[ref1](https://tools.ietf.org/html/draft-ietf-idr-bgp-prefix-sid-27#section-3.2)
> The Originator SRGB TLV may only appear in a BGP Prefix-SID attribute
> attached to IPv4/IPv6 Labeled Unicast prefixes ([RFC8277]). It MUST
> be ignored when received for other BGP AFI/SAFI combinations.
[ref2](https://tools.ietf.org/html/rfc4271#section-5)
> The sender of an UPDATE message SHOULD order path attributes within
> the UPDATE message in ascending order of attribute type. The
> receiver of an UPDATE message MUST be prepared to handle path
> attributes within UPDATE messages that are out of order.
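The reordering described in this commit can be sketched in Python as
follows. This is an illustration only: `label_index_after_loop` and the
(attr_type, payload) tuples are hypothetical, and the payload is
simplified to the SAFI for MP_REACH_NLRI and the label index for
PREFIX_SID.

```python
ATTR_MP_REACH_NLRI = 14
ATTR_PREFIX_SID = 40
SAFI_LABELED_UNICAST = 4


def label_index_after_loop(attrs):
    """attrs: list of (attr_type, payload) in the order received."""
    safi = None
    label_index = None
    for attr_type, payload in attrs:
        if attr_type == ATTR_MP_REACH_NLRI:
            safi = payload
        elif attr_type == ATTR_PREFIX_SID:
            # Record the value unconditionally; the SAFI may not be
            # known yet if MP_REACH_NLRI comes later in the message.
            label_index = payload
    # Only after every attribute has been seen, decide whether to keep it.
    if safi != SAFI_LABELED_UNICAST:
        label_index = None
    return label_index
```

Both attribute orderings give the same result, which is the point of
moving the check out of the loop.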
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
Prefix-SID is designed to carry an array of TLVs.
That behaviour is important to support the SR-MPLS feature
and it was added by the previous PR #5418.
However, in that implementation, if some additional data
(such as the next BGP update message or the next path attributes)
was present after the Prefix-SID path attribute,
bgpd would parse that additional data as a Prefix-SID TLV.
This commit fixes that. Before this commit, the loop condition
was determined only by whether the stream was readable. In a more correct
implementation, the Prefix-SID boundary must be checked
as well. The length of the Prefix-SID path attribute can
be obtained from bgp_attr_parse_args.
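A rough Python sketch of a TLV loop bounded by the attribute length
rather than by the readable stream. `parse_prefix_sid_tlvs` is a
hypothetical helper; it assumes the 1-octet type and 2-octet length TLV
header used by the Prefix-SID attribute.

```python
import struct


def parse_prefix_sid_tlvs(stream, offset, attr_len):
    """Return (tlvs, new_offset), reading at most attr_len bytes."""
    end = offset + attr_len
    if end > len(stream):
        raise ValueError("attribute length exceeds available data")
    tlvs = []
    # Stop at the end of this attribute, not at the end of the stream.
    while offset < end:
        if end - offset < 3:
            raise ValueError("truncated TLV header")
        tlv_type, tlv_len = struct.unpack_from("!BH", stream, offset)
        offset += 3
        # Check the length before touching the TLV value.
        if tlv_len > end - offset:
            raise ValueError("TLV overruns the attribute")
        tlvs.append((tlv_type, stream[offset:offset + tlv_len]))
        offset += tlv_len
    return tlvs, offset
```

Data that follows the attribute in the same stream is left untouched
for whoever parses the next attribute.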
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
According to https://tools.ietf.org/html/rfc7606 some of the attributes
MUST be handled using the "treat-as-withdraw" approach.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
bgp flowspec packets are being forged correctly. There is no need to
check for bgp length, as the bgp nlri length is checked at reception.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
bgpd already supports the BGP Prefix-SID path attribute, and
there are several sub-types of the Prefix-SID path attribute.
This commit makes bgpd support additional sub-types,
sub-Type-4 and sub-Type-5, to construct the VPNv4 SRv6 backend
with the vpnv4-unicast address family.
These path attributes are already supported by Cisco's IOS-XR and NX-OS.
Prefix-SID sub-Type-4 and sub-Type-5 are defined in the following
IETF drafts.
Supports(A-part-of):
- https://tools.ietf.org/html/draft-dawra-idr-srv6-vpn-04
- https://tools.ietf.org/html/draft-dawra-idr-srv6-vpn-05
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
Guess what - for a bounds check to work, it has to happen *before* you
read the data. We were trusting the attribute field received in a prefix
SID attribute and then checking if it was correct afterwards, but if it was
wrong we'd crash before that.
This fixes the problem, and adds additional paranoid bounds checks.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
When passing a v4 multicast route to a peer send
the v4 nexthop as a preferred methodology.
Fixes: #5582
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* Move VNC interning to the appropriate spot
* Use existing bgp_attr_flush_encap to free encap sets
* Assert that refcounts are correct before exiting to keep the demons
contained in their fiery prison
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Early exits without appropriate cleanup were causing obscure double
frees and other issues later on in the attribute parsing code. If we
return anything except a hard attribute parse error, we have cleanup and
refcounts to manage.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This commit makes bgpd skip and ignore unsupported
sub-types of PREFIX_SID (especially newly defined sub-types).
Current bgpd can't parse unsupported sub-types of PREFIX_SID.
PREFIX_SID is specified in draft-ietf-idr-bgp-prefix-sid-27.
New sub-types are already drafted in
draft-dawra-idr-srv6-vpn-05 (Type 5 and 6 are newly defined).
This commit fixes the problem reported as #5277 on GitHub.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
This table identifier can be used for policy routing. Incoming entries
are locally exported to that local table identifier.
Note that for the new table identifier to apply to all
entries, the user should flush local tables first.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
RFC 4271 sec 6.3 p33: in the case of a BGP_NEXTHOP attribute with an
incorrect value, FRR is supposed to send a notification
and include the corresponding type, length and value of the NEXT_HOP
attribute in the notification data.
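For illustration, the notification data for an IPv4 NEXT_HOP could be
assembled as in the Python sketch below. `next_hop_notify` and the
constant names are hypothetical; the numeric codes are the RFC 4271
UPDATE Message Error (3) and Invalid NEXT_HOP Attribute (8) values.

```python
import ipaddress
import struct

NOTIFY_UPDATE_ERR = 3         # UPDATE Message Error
NOTIFY_INVALID_NEXT_HOP = 8   # Invalid NEXT_HOP Attribute
ATTR_NEXT_HOP = 3             # NEXT_HOP attribute type code


def next_hop_notify(flags, next_hop):
    """Return (code, subcode, data) for a bad IPv4 NEXT_HOP attribute."""
    value = ipaddress.IPv4Address(next_hop).packed
    # Data field: attribute flags, type, length and value, as received.
    data = struct.pack("!BBB", flags, ATTR_NEXT_HOP, len(value)) + value
    return NOTIFY_UPDATE_ERR, NOTIFY_INVALID_NEXT_HOP, data
```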
Fixes: #4997
Signed-off-by: Nikos <ntriantafillis@gmail.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Allow bgp to set a local Administrative distance to use
for installing routes into the rib.
Example:
!
router bgp 9323
bgp router-id 1.2.3.4
neighbor enp0s8 interface remote-as external
!
address-family ipv4 unicast
neighbor enp0s8 route-map DISTANCE in
exit-address-family
!
route-map DISTANCE permit 10
set distance 153
!
line vty
!
end
eva# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
B 0.0.0.0/0 [153/0] via fe80::a00:27ff:fe84:c2d6, enp0s8, 00:00:06
K>* 0.0.0.0/0 [0/100] via 10.0.2.2, enp0s3, 00:06:31
B>* 1.1.1.1/32 [153/0] via fe80::a00:27ff:fe84:c2d6, enp0s8, 00:00:06
B>* 1.1.1.2/32 [153/0] via fe80::a00:27ff:fe84:c2d6, enp0s8, 00:00:06
B>* 1.1.1.3/32 [153/0] via fe80::a00:27ff:fe84:c2d6, enp0s8, 00:00:06
C>* 10.0.2.0/24 is directly connected, enp0s3, 00:06:31
K>* 169.254.0.0/16 [0/1000] is directly connected, enp0s3, 00:06:31
eva#
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This change addresses the following:
1) Ensures logs under DEBUG macro checks are categorized
as zlog_debug instead of zlog_info.
2) Error logs are categorized as zlog_err instead of zlog_info.
3) Rephrasing certain logs to make them appear more intuitive.
Signed-off-by: NaveenThanikachalam <nthanikachal@vmware.com>
A problem was reported that when vrf route-leaking between an unnumbered
peer in one vrf and a numbered peer in another vrf, the nexthop
attribute was missing from the update, causing the session to fail.
Determined that we needed to expand the mechanism for verifying if
the route has been learned in the other vrf without an ipv4 nexthop.
Ticket: CM-25610
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
bgp update messages were not correctly calculating the size
for a labeled-unicast prefix, as they were not accounting
for the label. If the update message was large enough to
overflow the maximum packet size (4096 bytes) this could
cause bgpd to send a malformed update packet.
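A small Python sketch of the size accounting, assuming a single
3-octet label per prefix. `nlri_size` and `fits` are hypothetical
helpers, not the bgpd functions.

```python
def nlri_size(prefix_len, labeled):
    """Bytes one NLRI entry takes on the wire."""
    size = 1 + (prefix_len + 7) // 8  # length octet + prefix octets
    if labeled:
        size += 3                     # one 3-octet label
    return size


def fits(stream_used, prefix_len, labeled, max_packet=4096):
    """True if the next prefix still fits in the update packet."""
    return stream_used + nlri_size(prefix_len, labeled) <= max_packet
```

Near the packet limit the three label octets decide whether the prefix
still fits, which is the case the unlabeled calculation got wrong.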
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Modify the code such that we can auto turn the iana values of afi
and safi to pleasant to read strings.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The bgp_attr_extcom_tunnel_type does not properly
compile with warnings turned on due to recent change.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
It doesn't make much sense for a hash function to modify its argument,
so const the hash input.
BGP does it in a couple places, those cast away the const. Not great but
not any worse than it was.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This diff contains 2 parts:
1. Extract the tunnel type info from bgp extended communities.
2. Make rfapi use this common tunnel type ap
Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
This is causing interop issues with vendors. According to the RFC,
receiver should ignore the NEXT_HOP attribute with MP_REACH_NLRI
present.
Signed-off-by: nikos <ntriantafillis@gmail.com>
When using remove-private-AS together with local-as
aspath_remove_private_asns() is called before bgp_packet_attribute().
In this case, private AS will always appear in front of change_local_as.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
We have the same warn message in 3 spots, which makes it extremely
hard to figure out which of the 3 has gone terribly wrong.
Add a bit of code to disambiguate the 3 situations.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Any evpn bgp update message comes with a router mac extended
community, which can potentially contain a MAC address that is the
same as the MAC address of one of the local SVIs (L3VNI).
Mark such a route (route mac exist) and filter it during route
processing in bgp_update().
Ticket:CM-23674
Reviewed By:CCR-8336
Testing Done:
Configure L3vni mac on TORS1 which is similar to TORC11
L3vni MAC. When TORC11 received the EVPN update with
Router mac extended community, this check rejected the
BGP update message.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Consider the following topo VTEP1->SPINE1->VTEP2. ebgp is being used
for evpn route exchange with SPINE just acting as a pass through.
1. VTEP1 was building the type-3 IMET route with the correct PMSI
tunnel type (ingress-replication) and label (VNI)
2. Spine1 was however only parsing the tunnel-type in the attr (was
skipping parsing of the label field altogether) -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
root@MSP1:~# net show bgp l2vpn evpn route rd 27.0.0.15:4 type multicast
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[ESI]:[EthTag]:[IPlen]:[IP]
BGP routing table entry for 27.0.0.15:4:[3]:[0]:[32]:[27.0.0.15]
Paths: (1 available, best #1)
Advertised to non peer-group peers:
TORC11(downlink-1) TORC12(downlink-2) TORC21(downlink-3) TORC22(downlink-4) TORS1(downlink-5) TORS2(downlink-6)
Route [3]:[0]:[32]:[27.0.0.15]
5550
27.0.0.15 from TORS1(downlink-5) (27.0.0.15)
Origin IGP, valid, external, bestpath-from-AS 5550, best
Extended Community: RT:5550:1003 ET:8
AddPath ID: RX 0, TX 227
Last update: Thu Feb 7 15:44:22 2019
PMSI Tunnel Type: Ingress Replication, label: 16777213 >>>>>>>
Displayed 1 prefixes (1 paths) with this RD (of requested type)
root@MSP1:~#
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
3. So VTEP2 didn't rx the correct label.
In an all FRR setup this doesn't have any functional consequence but some
vendors are validating the content of the label field as well and ignoring
the IMET route from FRR (say VTEP1 is FRR and VTEP2 is 3rd-party). The
functional consequence of this is that VTEP2 ignores VTEP1's IMET route and
doesn't add VTEP1 to the corresponding l2-vni flood list.
This commit fixes up the PMSI attr parsing on spine-1 -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
root@MSP1:~# net show bgp l2vpn evpn route rd 27.0.0.15:4 type multicast
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[ESI]:[EthTag]:[IPlen]:[IP]
BGP routing table entry for 27.0.0.15:4:[3]:[0]:[32]:[27.0.0.15]
Paths: (1 available, best #1)
Advertised to non peer-group peers:
TORC11(downlink-1) TORC12(downlink-2) TORC21(downlink-3) TORC22(downlink-4) TORS1(downlink-5) TORS2(downlink-6)
Route [3]:[0]:[32]:[27.0.0.15]
5550
27.0.0.15 from TORS1(downlink-5) (27.0.0.15)
Origin IGP, valid, external, bestpath-from-AS 5550, best
Extended Community: RT:5550:1003 ET:8
AddPath ID: RX 0, TX 278
Last update: Thu Feb 7 00:17:40 2019
PMSI Tunnel Type: Ingress Replication, label: 1003 >>>>>>>>>>>
Displayed 1 prefixes (1 paths) with this RD (of requested type)
root@MSP1:~#
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
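For illustration, a parser that reads the label field along with the
tunnel type could look like the Python sketch below. `parse_pmsi` is a
hypothetical helper, and treating the 3-octet label field as a 24-bit
VNI is an assumption that matches the VXLAN case shown above.

```python
def parse_pmsi(value):
    """Split a PMSI tunnel attribute value into its fields."""
    if len(value) < 5:
        raise ValueError("PMSI tunnel attribute too short")
    flags = value[0]
    tunnel_type = value[1]
    # Assumption: the 3-octet label field carries the VNI as-is.
    label = int.from_bytes(value[2:5], "big")
    tunnel_id = value[5:]
    return flags, tunnel_type, label, tunnel_id
```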
Ticket: CM-23790
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Currently we are hardcoding it at the time of attr building to
ingress-replication. This is just a code clean-up and has no
functional impact.
Ticket: CM-23790
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
The `struct bgp_route_evpn` and `struct overlay_index` data
structures are exactly the same. Reduce to 1.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
community_free, lcommunity_free and ecommunity_free are similar
functions, and in most places the three are called together. The
signature of community_free differed from the other two. Modify the
community_free API signature to align with the other two functions.
There is no functional impact; this is just to avoid confusion.
Testing: manual testing and show commands
Signed-off-by: Sri Mohana Singamsetty msingamsetty@vmware.com
The ->hash_cmp and linked list ->cmp functions were sometimes
being used interchangeably and this really is not a good
thing. So let's modify the hash_cmp function pointer to return
a boolean and convert everything to use the new syntax.
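The hazard can be shown with a short Python sketch. Both function names
here are hypothetical stand-ins for the two callback styles.

```python
def list_cmp(a, b):
    """Ordering comparison: negative, zero or positive; zero means equal."""
    return (a > b) - (a < b)


def hash_cmp(a, b):
    """Equality comparison: True means equal."""
    return a == b
```

For equal inputs `list_cmp` returns 0, which is false when used as a
truth value, while `hash_cmp` returns True. Using one where the other is
expected inverts the result.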
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add the ability to aggregate routes to handle
extended communities. Make the actions similar
to what we do for normal communities.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The peer->nexthop.ifp pointer must be set when parsing the
attributes in bgp_mp_reach_parse, notice this
and fail gracefully.
Rework bgp_nexthop_set to remove the HAVE_CUMULUS and to
fail the nexthop_set when we have a zebra connection and
no ifp pointer, since having neither a zebra connection nor an
ifp pointer is legal.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The EVPN ND ext community supports the NA flag R-bit, to allow proxy ND.
Set the R-bit in EVPN NA if a given router is the default gateway or
there is a local router attached, which can be determined based on the
local neighbor entry.
Implement the BGP ext community attribute to generate and parse the
R-bit and pass it along to zebra to program the neigh entry in the kernel.
Upon receiving MAC/IP update with community type 0x06 and sub_type 0x08,
pass the R-bit to zebra to program neigh entry.
Set NTF_ROUTER in neigh entry and inform kernel to do proxy NA for EVPN.
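A minimal Python sketch of the receive-side check. `nd_router_flag` is a
hypothetical helper, and placing the R-bit at 0x01 of the flags octet is
an assumption made for this example.

```python
EVPN_TYPE = 0x06
ND_SUBTYPE = 0x08
ROUTER_FLAG = 0x01  # assumed position of the R-bit in the flags octet


def nd_router_flag(ecom):
    """Return the R-bit of an 8-octet ext community, or None if other type."""
    if len(ecom) != 8:
        raise ValueError("extended community must be 8 octets")
    if ecom[0] != EVPN_TYPE or ecom[1] != ND_SUBTYPE:
        return None
    return bool(ecom[2] & ROUTER_FLAG)
```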
Ref:
https://tools.ietf.org/html/draft-ietf-bess-evpn-na-flags-01
Ticket:CM-21712, CM-21711
Reviewed By:
Testing Done:
Configure Local vni enabled L3 Gateway, which would act as router,
checked show evpn arp-cache vni x ip <ip of svi> on originated and
remote VTEPs. "Router" flag is set.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
In the case where an mp_unreach attribute is received and there is no
mp_reach attribute, it is not necessary to check for missing
attributes.
Fixes: 67495ddb2e ("bgpd: Fixes for recent well-known-attr check patch.")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This commit moves the command 'bgp enforce-first-as' from global BGP
instance configuration to peer/neighbor configuration, which can now be
changed by executing '[no] neighbor <neighbor> enforce-first-as'.
End users can now enforce sane first-AS checking on regular sessions
while e.g. disabling the checks on routeserver sessions, which usually
strip away their own AS number from the path.
To ensure backwards-compatibility, a migration routine was added which
automatically sets the 'enforce-first-as' flag on all configured
neighbors if the old global setting was activated. The old global
command immediately disappears after running the migration routine once.
Signed-off-by: Pascal Mathis <mail@pascalmathis.com>
Handle multiple PREFIX_SID's at the same time. The draft clearly
states that multiple should be handled and we have an actual pcap
file that clearly has multiple PREFIX_SID's at the same time.
Fixes: #2153
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The vrf 2 vrf route leaking auto-derives RD and RT and
installs the routes into the appropriate vpn table.
When an operator configured ipv[4|6] vpn neighbors, these
routes were showing up off box. The RD and RT
values chosen are locally significant but globally
useless and may cause confusion.
Put a special bit of code in to notice that we
should not be advertising these routes off box.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Connected routes redistributed into BGP as well as IPv4 routes with IPv6
link-local next hops (RFC 5549) need information about the associated
interface in BGP if they are candidates to be leaked into another VRF. In
the absence of route leaking, this was not necessary. Introduce the
appropriate mechanism and ensure this is used during route install (in
the target VRF).
Ticket: CM-20343, CM-20382
Testing done:
1. Manually verified failed scenarios and some additional ones - logs
in the tickets.
2. Ran bgp-min and evpn-min - results are good.
3. Ran vrf smoke - has some failures, but none which look new
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
An FS UNREACH message with 0 NLRI inside is sent after each peer
establishment. FS can send NLRI messages with no nexthop.
This commit fixes a message that was triggered by mistake:
if FS is about to be sent, that message is no longer output.
It also fixes a typo.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This work is derived from a work done by China-Telecom.
That initial work can be found in [0].
As the gap between frr and quagga is significant, a rework has been
done in the meantime.
The initial work consists of the following:
- the client side of flowspec
- the enhancement of address-family ipv4/ipv6 flowspec
- partial data path handling at reception has been prepared
- the support for ipv4 flowspec or ipv6 flowspec in BGP open messages,
and the internals of BGP has been done.
- the memory contexts necessary for flowspec has been provisioned
In addition to this work, the following has been done:
- the complement of adaptation for FS safi in bgp code
- the code checkstyle has been reworked so as to match frr checkstyle
- the processing of IPv6 FS NLRI is prevented
- the processing of FS NLRI is stopped ( temporary)
[0] https://github.com/chinatelecom-sdn-group/quagga_flowspec/
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: jaydom <chinatelecom-sdn-group@github.com>
The following types are nonstandard:
- u_char
- u_short
- u_int
- u_long
- u_int8_t
- u_int16_t
- u_int32_t
Replace them with the C99 standard types:
- uint8_t
- unsigned short
- unsigned int
- unsigned long
- uint8_t
- uint16_t
- uint32_t
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Received PMSI tunnel attributes (in EVPN type-3 route) were not recognized.
Parse them and display the tunnel type when looking at routes. Note that
the only tunnel type currently supported is ingress replication (IR). A
warning message will be logged if the received tunnel type is something
else, but the attribute is otherwise ignored.
Updates: a21bd7a (bgpd: add PMSI_TUNNEL_ATTRIBUTE to EVPN IMET routes)
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
When doing symmetric routing,
EVPN type-2 (MACIP) routes need to be advertised with two labels (VNIs)
the first being the L2 VNI (identifying the VLAN) and
the second being the L3 VNI (identifying the VRF).
The receive processing needs to handle one or two labels too.
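On the receive side, handling one or two labels can be sketched in
Python as below. `parse_macip_labels` is a hypothetical helper, and
reading each 3-octet label as a 24-bit VNI is an assumption for the
VXLAN case.

```python
def parse_macip_labels(field):
    """Return [l2_vni] or [l2_vni, l3_vni] from the label field."""
    if len(field) not in (3, 6):
        raise ValueError("expected one or two 3-octet labels")
    # Assumption: each 3-octet label carries the VNI as-is.
    return [
        int.from_bytes(field[i:i + 3], "big")
        for i in range(0, len(field), 3)
    ]
```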
Ticket: CM-18489
Review: CCR-6949
Testing: manual and bgp/evpn/mpls smoke
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
1. Added default gw extended community
2. code modification to handle sticky-mac/default-gw-mac as they go together
3. show command support for newly added extended community
4. State in zebra to reflect if a mac/neigh is default gateway
5. show command enhancement to reflect the same in zebra commands
Ticket: CM-17428
Review: CCR-6580
Testing: Manual
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Some of the deprecated stream.h macros see such little use that we may
as well just remove them and use the non-deprecated macros.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Move and modify all network input related code to bgp_io.c
* Add a real input buffer to `struct peer`
* Move connection initialization to its own thread.c task instead of
piggybacking off of bgp_read()
* Tons of little fixups
Primary changes are in bgp_packet.[ch], bgp_io.[ch], bgp_fsm.[ch].
Changes made elsewhere are almost exclusively refactoring peer->ibuf to
peer->curr since peer->ibuf is now the true FIFO packet input buffer
while peer->curr represents the packet currently being processed by the
main pthread.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
So we have the ability to apply speculative route-maps to
neighbor display to see what the changes would look like
via some show commands. When we do this we make a
shallow copy of the attr data structure and then pass
it around for applying the routemap. After we've applied
this route-map and displayed it we really need to clean
up memory that the route-map application applied.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
A crafted BGP UPDATE with a malformed path attribute length field causes
bgpd to dump up to 65535 bytes of application memory and send it as the
data field in a BGP NOTIFY message, which is truncated to 4075 bytes
after accounting for protocol headers. After reading a malformed length
field, a NOTIFY is generated that is supposed to contain the problematic
data, but the malformed length field is inadvertently used to compute
how much data we send.
CVE-2017-15865
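The general shape of the fix can be sketched in Python: clamp the
amount of data echoed in the NOTIFY to what was actually received.
`notify_data` is a hypothetical helper, not the bgpd code.

```python
def notify_data(packet, attr_start, claimed_len):
    """Return the bytes of the offending attribute to echo in a NOTIFY."""
    # Never trust the length taken from the malformed attribute itself.
    available = max(len(packet) - attr_start, 0)
    send_len = min(claimed_len, available)
    return packet[attr_start:attr_start + send_len]
```

Python slices cannot read past the end of a buffer, so the explicit
clamp mainly documents the check that is needed in C.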
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This issue was discovered on a live session with an extremely
old cisco 7206VXR router running 12.2(33)SRE4. The sending router
is sending us an empty NLRI that is MP_REACH. From RFC
exploration(thanks Russ!) it appears that this was
considered a 'valid' way to send EOR.
Following discussion, we decided that we should treat
this situation as an EOR marker instead of bringing
down the session.
Applying this fix on the FRR router seeing this issue
allows it to continue its peering relationship with
the ASR. Since this is a point fix I do not see
a high likelihood of further fallout.
Fixes: #1258
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Commit 8c9cc7bbf6 changed the `struct bgp_attr_encap_subtlv` type
to end in a zero length array instead of a 1 byte array. All memory
allocations for this were subsequently off by 1 byte since those were
not adjusted either.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
stlv_last is initialized with the loops. No need to reset it.
Its scope is local to the use with the loops.
Signed-off-by: Vincent Jardin <vincent.jardin@6wind.com>
These are now unused. route-maps can't modify these attributes, so
there is no need for _dup functions.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
bgp_attr_deep_dup is based on a misunderstanding of how route-maps work.
They never change actual data, just pointers & fields in "struct attr".
The correct thing to do is copy struct attr and call bgp_attr_flush()
afterwards.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This attempt at optimization has cost us more than a week's worth of
time on several people hunting down the subtle bug that it was missing
an increment on attr->lcommunity.
This is absolutely not worth the maintenance cost.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
1) Add hash names to all hash_create calls
2) Fix community_hash, ecommunity_hash and lcommunity_hash key
creation
3) Fix output of community and lcommunity iterators (why would
we want to see the memory location of the bucket?).
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>