Suppose two routers are connected to each other, both have IPv6
addresses, they establish a BGP session with the extended-nexthop
capability, and one router tries to advertise locally generated
IPv4-VPN routes to the other router.
In this situation, bgpd on the router that advertises the IPv4-VPN
routes crashes with "invalid MP nexthop length (AFI IP6)".
This happens because the MP_REACH_NLRI path attribute is not generated
correctly when IPv4-VPN routes are advertised to an IPv6 peer.
When IPv4 routes are leaked from a VRF RIB, the nexthop of these routes
is also an IPv4 address (0.0.0.0/0 or a specific address). However,
bgp_packet_mpattr_start only covers the case of an IPv6 nexthop (for an
IPv6 peer).
ipv4-unicast routes are not affected by this issue because the case of
an IPv4 nexthop is covered in the `else` block.
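The nexthop lengths involved can be sketched as follows. This is only an
illustration, not the code in bgp_packet_mpattr_start: the helper name
mp_nexthop_len and its parameters are hypothetical, and it assumes the usual
encoding where a VPN nexthop is prefixed by an 8-byte zeroed RD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RD_LEN 8    /* zeroed route distinguisher that prefixes a VPN nexthop */
#define IPV4_LEN 4
#define IPV6_LEN 16

/* Hypothetical helper: MP_REACH_NLRI nexthop length for one route.
 * The length is derived from the nexthop the route actually carries,
 * not from the address family of the peering session. */
static size_t mp_nexthop_len(bool is_vpn, bool nh_is_ipv6, bool with_link_local)
{
	size_t one = (is_vpn ? RD_LEN : 0) + (nh_is_ipv6 ? IPV6_LEN : IPV4_LEN);

	/* A link-local address can only accompany an IPv6 global nexthop. */
	assert(nh_is_ipv6 || !with_link_local);

	return with_link_local ? 2 * one : one;
}
```

Under these assumptions an IPv4-VPN route with an IPv4 nexthop needs a 12-byte
nexthop even on an IPv6 session, while an IPv6 nexthop needs 24 or 48 bytes.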
Signed-off-by: Ryoga Saito <ryoga.saito@linecorp.com>
Currently bgpd uses the opaque codepoint (0xFFFF) in the BGP
advertisement. In this commit, we update bgpd to use the SRv6 codepoints
defined in the IANA SRv6 Endpoint Behaviors Registry
(https://www.iana.org/assignments/segment-routing/segment-routing.xhtml)
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
For now, this applies only if the knob is enabled. Later the knob will
(most likely) be removed and routes with AS_SET / AS_CONFED_SET will be
denied by default.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
According to RFC9234:
An UPDATE message with a malformed OTC Attribute SHALL be handled
using the approach of "treat-as-withdraw".
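A minimal sketch of the check, assuming the RFC9234 rule that an OTC attribute
is malformed when its length is not 4. The enum and function names are
hypothetical and are not the identifiers used in bgpd.

```c
#include <stddef.h>

/* Hypothetical result codes for attribute parsing. */
enum attr_action {
	ATTR_OK,
	ATTR_TREAT_AS_WITHDRAW, /* keep the session, withdraw the NLRI */
};

#define OTC_LEN 4 /* the OTC value is a single 4-octet AS number */

/* Any length other than 4 makes the OTC attribute malformed. */
static enum attr_action check_otc_length(size_t attr_len)
{
	return attr_len == OTC_LEN ? ATTR_OK : ATTR_TREAT_AS_WITHDRAW;
}
```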
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The command `debug bgp allow-martian` is not actually
a debug command. When entered, it allows bgp to not
reset a peering when a martian nexthop is passed in
the nlri.
Add the `bgp allow-martian-nexthop` command and allow it to be
used.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
RFC9234 defines a way to establish correct connection roles
(Customer/Provider, Peer or with RS) between bgp speakers. This patch:
- Add a new configuration/terminal option to set the appropriate local
role;
- Add a mechanism for checking used roles, implemented by exchanging
the corresponding capabilities in OPEN messages;
- Add strict mode to force other party to use this feature;
- Add basic support for a new transitive optional bgp attribute - OTC
(Only to Customer);
- Add logic for default setting OTC attribute and filtering routes with
this attribute by the edge speakers, if the appropriate conditions are
met;
- Add two test stands to check role negotiation and route filtering
during role usage.
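The role check can be sketched as below. The role values and allowed pairs
follow RFC9234; the enum and function names are hypothetical and strict mode
is not modelled.

```c
#include <stdbool.h>

/* Role values as carried in the BGP Role capability (RFC9234). */
enum bgp_role {
	ROLE_PROVIDER = 0,
	ROLE_RS = 1,
	ROLE_RS_CLIENT = 2,
	ROLE_CUSTOMER = 3,
	ROLE_PEER = 4,
};

/* Return true when the local and remote roles form an allowed pair. */
static bool roles_compatible(enum bgp_role local, enum bgp_role remote)
{
	switch (local) {
	case ROLE_PROVIDER:
		return remote == ROLE_CUSTOMER;
	case ROLE_CUSTOMER:
		return remote == ROLE_PROVIDER;
	case ROLE_RS:
		return remote == ROLE_RS_CLIENT;
	case ROLE_RS_CLIENT:
		return remote == ROLE_RS;
	case ROLE_PEER:
		return remote == ROLE_PEER;
	}
	return false;
}
```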
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
When implementing bgp_packet_mpunreach_prefix, a uint8_t array
of 3 bytes was created and then assigned to a label type, which
is 4 bytes, and then various pointer work is done on it. Coverity
complains that 3 bytes is not enough to properly dereference a
4-byte type. Just make the uint8_t array 4 bytes and be done
with it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The compiler is, rightly, pointing out that in some cases it is
possible that the pkt_afi and pkt_safi values are not properly
set, which could result in a use before initialization. I do not
actually believe that this is possible, but let's make the compiler
happy.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
https://datatracker.ietf.org/doc/html/rfc7947#section-2.2
Optional recognized and unrecognized BGP attributes,
whether transitive or non-transitive, SHOULD NOT be updated by the
route server (unless enforced by local IXP operator configuration)
and SHOULD be passed on to other route server clients.
By default LB ext-community works with iBGP peers. When we receive a route
from eBGP peer, we can send LB ext-community to iBGP peers.
With this patch, allow sending LB ext-community to iBGP/eBGP peers if they
are set as RS clients.
FRR does not send non-transitive ext-communities to eBGP peers, but GoBGP,
for example, does. If such a peer is set as an RS client, we should pass
those attributes on to the other RS clients.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When reading the BGP_PREFIX_SID_SRV6_L3_SERVICE_SID_STRUCTURE
it is possible that the length read from the packet is not large
enough to hold a BGP_PREFIX_SID_SRV6_L3_SERVICE_SID_STRUCTURE.
Let's ensure that it is.
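A sketch of such a length check, assuming the six one-octet fields of the SID
Structure Sub-Sub-TLV. The struct and function names are hypothetical and the
real code reads from a stream rather than a raw buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SID_STRUCTURE_LEN 6 /* six one-octet fields */

struct sid_structure {
	uint8_t loc_block_len;
	uint8_t loc_node_len;
	uint8_t func_len;
	uint8_t arg_len;
	uint8_t transposition_len;
	uint8_t transposition_offset;
};

/* Refuse to read when the advertised length cannot hold the structure. */
static bool parse_sid_structure(const uint8_t *buf, size_t len,
				struct sid_structure *out)
{
	if (len < SID_STRUCTURE_LEN)
		return false;

	out->loc_block_len = buf[0];
	out->loc_node_len = buf[1];
	out->func_len = buf[2];
	out->arg_len = buf[3];
	out->transposition_len = buf[4];
	out->transposition_offset = buf[5];
	return true;
}
```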
Fixes: #10860
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
bgp_attr_undup does the same thing as bgp_attr_flush – frees the
temporary data that might be allocated when applying a route-map. There
is no need to have two separate functions for that.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Avoid use-after-free situation. Flush attr_extra structure only when flushing
all attributes, not just for unintern.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Description:
Replace memcmp in certain places
to avoid the coverity issues caused by it.
Co-authored-by: Kantesh Mundargi <kmundaragi@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
This is the initial work to move all non IPv4/IPv6 AFI related
attributes/structs to attr->extra to avoid unnecessary allocations.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Always free the locally allocated attribute not the one we are using for
return. This fixes a memory leak and a crash when AS Path is set with
route-map.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
If soft-reconfiguration is enabled, bgp_adj_in_set is called
from bgp_update, and bgp_adj_in_set calls bgp_attr_intern to intern
the attr pointer. If the given attr isn't found in attrhash, hash_get
calls bgp_attr_hash_alloc to allocate a new attr structure. In
bgp_attr_hash_alloc, NULL is assigned to the srv6_vpn and srv6_l3vpn
fields of the original attr pointer. attr->srv6_vpn and
attr->srv6_l3vpn are interned in bgp_attr_intern, so the NULL
assignment isn't needed.
These fields are also used later in bgp_update to set SRv6 information
on bgp_path_info. If bgp_attr_hash_alloc assigns NULL to these fields,
SRv6 information is lost and incorrect routes are inserted into the
data-plane.
Signed-off-by: Ryoga Saito <contact@proelbtn.com>
Description:
This change fixes the following issues related to vrf route leaking:
Routes with special nexthops, i.e. blackhole/sink routes, are not programmed
into the FIB when imported; the corresponding nexthop is set as 'inactive'
and the nexthop interface as 'unknown'.
While importing/leaking routes between VRFs with a special nexthop (ipv4/ipv6),
once bgp announces the route(s) to zebra, the nexthop type is incorrectly set as
NEXTHOP_TYPE_IPV6_IFINDEX/NEXTHOP_TYPE_IFINDEX,
i.e. directly connected, even though we are not able to resolve it through an interface.
This leads to nexthop_active_check marking the nexthop !NEXTHOP_FLAG_ACTIVE.
With no active nexthop found, the route is not programmed into the FIB.
Whenever BGP leaks routes, set the correct nexthop type so that the route gets resolved
and correctly programmed into the FIB in the imported vrf.
Co-authored-by: Kantesh Mundaragi <kmundaragi@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
draft-ietf-bess-srv6-services-07 defines a new SID Structure Sub-Sub-TLV.
This patch adds SID structure information to bgp_attr_srv6_l3vpn. This
patch also defines the default SID structure used by the following patches.
Signed-off-by: Ryoga Saito <contact@proelbtn.com>
This is to avoid breaking changes between existing deployments of
extended community for bandwidth encoding. By default FRR uses uint32
to encode bandwidth, which is not as the draft requires (IEEE floating-point).
This switch enables the required encoding per-peer.
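The two encodings can be contrasted with a small sketch. Function names are
hypothetical; it assumes `float` is an IEEE 754 single-precision value with
the same byte order as integers on the host, and it is not the bgpd code.

```c
#include <stdint.h>
#include <string.h>

/* Legacy form: the bandwidth as a plain 32-bit integer, network byte order. */
static void encode_bw_uint32(uint32_t bw, uint8_t out[4])
{
	out[0] = (uint8_t)(bw >> 24);
	out[1] = (uint8_t)(bw >> 16);
	out[2] = (uint8_t)(bw >> 8);
	out[3] = (uint8_t)bw;
}

/* Draft form: the IEEE 754 bit pattern of the float, network byte order. */
static void encode_bw_ieee(float bw, uint8_t out[4])
{
	uint32_t bits;

	/* Copy the bit pattern; a cast would convert the value instead. */
	memcpy(&bits, &bw, sizeof(bits));
	encode_bw_uint32(bits, out);
}
```

The same numeric bandwidth produces different bytes on the wire in the two
forms, which is why the encoding has to be selected per peer.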
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
When bgp receives the admin distance from a redistribution statement
let's store that distance for later usage.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
From the realloc man page:
If ptr is NULL, then the call is equivalent to malloc(size)
This should be sufficient for our needs, so we do not need
both XMALLOC and XREALLOC
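A small sketch of the idea using plain realloc instead of the FRR XREALLOC
wrapper. The struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

struct int_buf {
	int *items; /* NULL until the first element is pushed */
	size_t count;
};

/* Append one value; the first call works because realloc(NULL, n)
 * behaves like malloc(n), so no separate allocation path is needed. */
static int int_buf_push(struct int_buf *b, int value)
{
	int *p = realloc(b->items, (b->count + 1) * sizeof(*p));

	if (!p)
		return -1; /* b->items is still valid and unchanged */

	b->items = p;
	b->items[b->count++] = value;
	return 0;
}
```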
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Process this a bit later instead of in bgp_attr_parse(), which causes
the session to be shut down upon receiving a prefix with AS number 0 inside.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
This commit makes bgpd support VPN SID advertisement
as BGP Prefix-SID when route-leaking from a BGP-vrf instance
to a BGP-vpn instance.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
This commit fixes bgpd's prefix-sid type 4/5 feature, whose
implementation has been incomplete since https://github.com/FRRouting/frr/pull/5653
was merged, because some necessary lines were missing.
When bgpd receives multiple update messages with the same service-sid in
the prefix-sid type-5 attribute, bgpd crashes around the reference
counting of the path-attribute value objects.
This commit also adds a topotest to check that the feature works.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
Just for more debug information regarding malformed aggregator_as.
```
bgpd[5589]: [EC 33554434] 192.168.10.25: AGGREGATOR AS number is 0 for aspath: 65030
bgpd[5589]: bgp_attr_aggregator: attributes: nexthop 192.168.10.25, origin i, path 65030
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Avoid mangling packet size which is expected to be the same as received.
Stream pointer advancing is necessary to avoid changing the packet and
resetting BGP sessions.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
An UPDATE message that contains the AS number of zero in the AS_PATH
or AGGREGATOR attribute MUST be considered as malformed and be
handled by the procedures specified in [RFC7606].
An UPDATE message with a malformed AGGREGATOR attribute SHALL be
handled using the approach of "attribute discard".
Attribute discard: In this approach, the malformed attribute MUST
be discarded and the UPDATE message continues to be processed.
This approach MUST NOT be used except in the case of an attribute
that has no effect on route selection or installation.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
The `struct ecommunity` structure is using an int for a size value.
Let's switch it over to a uint32_t for size values since a size
value for data can never be negative.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Convert usage of the attr->evpn_overlay to get/set functionality.
Future commits will allow us to abstract this data to when
we actually need it for the `struct attr`.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Abstract the access of `attr->cluster` to appropriate
accessor/set functionality.
Future commits will allow us to move this data around
to make `struct attr` smaller.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Convert the `struct attr`->ipv6_ecommunity to use
accessor functions. We'll be able to reduce memory
usage in the `struct bgp_attr` by doing this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add an accessor for the bgp_attr.pmsi_tnl_type to allow
us to abstract where it is. Every attribute is paying
the price of this bit of data as part of `struct bgp_attr`.
In the future we'll move it elsewhere.
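The accessor pattern can be sketched like this. The struct layout and function
names are hypothetical; it only shows how callers stay unchanged if the field
later moves into an optional, separately allocated block.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical optional block for rarely used fields. */
struct attr_extra {
	uint8_t pmsi_tnl_type;
};

struct attr {
	struct attr_extra *extra; /* NULL when no optional field is set */
};

/* Callers use the accessor and never touch the storage directly. */
static uint8_t attr_get_pmsi_tnl_type(const struct attr *attr)
{
	return attr->extra ? attr->extra->pmsi_tnl_type : 0;
}

/* Allocate the optional block only when a non-default value is stored. */
static int attr_set_pmsi_tnl_type(struct attr *attr, uint8_t type)
{
	if (!attr->extra) {
		if (type == 0)
			return 0;
		attr->extra = calloc(1, sizeof(*attr->extra));
		if (!attr->extra)
			return -1;
	}
	attr->extra->pmsi_tnl_type = type;
	return 0;
}
```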
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The route_map_object_t was being used to track what protocol we were
being called against. But each protocol was only ever calling itself.
So we had a variable, only ever passed in from route_map_apply,
that had to be carried along, and everyone was testing whether that
variable was for their own stack.
Clean up this route_map_object_t from the entire system. We should
speed some stuff up. Yes I know not a bunch but this will add up.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
DF (Designated Forwarder) election is used for picking a single
BUM-traffic forwarder per-ES. RFC7432 specifies a mechanism called
service carving for DF election. However that mechanism has many
disadvantages -
1. Load-balances poorly.
2. Doesn't allow for a controlled failover needed in upgrade
scenarios.
3. Not easy to hw accelerate.
To fix the poor performance of service carving, alternate DF mechanisms
have been proposed via the following drafts -
draft-ietf-bess-evpn-df-election-framework
draft-ietf-bess-evpn-pref-df
This commit adds support for the pref-df election mechanism which
is used as the default. Other mechanisms including service-carving
may be added later.
In this mechanism one switch on an ES is elected as DF based on the
preference value; higher preference wins with IP address acting
as the tie-breaker (lower-IP wins if pref value is the same).
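The selection rule can be sketched as follows. The struct and function names
are hypothetical, IPv4 addresses are assumed to be held in host byte order so
that they compare numerically, and other inputs to the election are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct df_candidate {
	uint32_t ip;   /* originator IP, host byte order */
	uint16_t pref; /* DF preference */
};

/* Return the index of the elected DF among n candidates on one ES:
 * the highest preference wins, the lowest IP breaks a tie. */
static size_t df_elect(const struct df_candidate *c, size_t n)
{
	size_t best = 0;

	assert(n > 0);

	for (size_t i = 1; i < n; i++) {
		if (c[i].pref > c[best].pref ||
		    (c[i].pref == c[best].pref && c[i].ip < c[best].ip))
			best = i;
	}
	return best;
}
```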
Sample output
=============
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
torm-11# sh bgp l2vpn evpn es 03:00:00:00:00:01:11:00:00:01
ESI: 03:00:00:00:00:01:11:00:00:01
Type: LR
RD: 27.0.0.15:6
Originator-IP: 27.0.0.15
Local ES DF preference: 100
VNI Count: 10
Remote VNI Count: 10
Inconsistent VNI VTEP Count: 0
Inconsistencies: -
VTEPs:
27.0.0.16 flags: EA df_alg: preference df_pref: 32767
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
torm-11# sh bgp l2vpn evpn route esi 03:00:00:00:00:01:11:00:00:01
*> [4]:[03:00:00:00:00:01:11:00:00:01]:[32]:[27.0.0.15]
27.0.0.15 32768 i
ET:8 ES-Import-Rt:00:00:00:00:01:11 DF: (alg: 2, pref: 100)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Attribute may not be long enough to contain a localpref value, resulting
in an assert on stream size. Gracefully handle this case instead.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Example configuration:
route-map SET_SR_POLICY permit 10
set sr-te color 1
!
router bgp 1
bgp router-id 1.1.1.1
neighbor 2.2.2.2 remote-as 1
neighbor 2.2.2.2 update-source lo
address-family ipv4 unicast
neighbor 2.2.2.2 next-hop-self
neighbor 2.2.2.2 route-map SET_SR_POLICY in
exit-address-family
!
!
Learned BGP routes from 2.2.2.2 are mapped to the SR-TE Policy
which is uniquely determined by the BGP nexthop (2.2.2.2 in this
case) and the SR-TE color in the route-map.
Co-authored-by: Renato Westphal <renato@opensourcerouting.org>
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Co-authored-by: Sebastien Merle <sebastien@netdef.org>
Signed-off-by: Sebastien Merle <sebastien@netdef.org>
RFC 5701 is supported. It is possible to configure, in bgp vpn, a list of
route targets with IPv6 extended communities to import. Note that this
IPv6 extended community has been developed only for matching a
bgp flowspec update with the same IPv6 ext community.
In addition, draft-ietf-idr-flow-spec-v6-09 is implemented regarding
the redirect ipv6 option.
Practically, under bgp vpn, under ipv6 unicast, it is possible to
configure: [no] rt6 redirect import <IPV6>:<AS> values.
An incoming bgp update with fs ipv6 and that option matching a bgp vrf
will be imported into that bgp vrf.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This is handy in situations where a notification was sent but it's
not at all clear what triggered it.
Dumping all attributes under debug mode helps in finding
the _bad_ attribute.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
A new proxy flag has been added to the already existing NA extended
community to allow proxy advertisement of a local host by a VTEP that is
yet to independently establish local reachability.
Reference: draft-rbickhart-evpn-ip-mac-proxy-adv
The extended mac-mobility sequence number needs to be synced across
the ES peers. However we cannot let an ES-peer path win over a local
path on the same ES. To accomplish that, some parameters such as the
MM seq number are bubbled up from the non-best path to the local path.
This mechanism is explained further in the path-selection patch.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Add ESI as an inline attribute field along with the other EVPN
attributes. This may be re-worked when the rest of the EVPN
attributes find a new home.
Some cleanup has been done to get rid of stale/unused references
to ESI and to consolidate duplicate definitions of ES ID
types.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Remove mid-string line breaks, cf. workflow doc:
.. [#tool_style_conflicts] For example, lines over 80 characters are allowed
for text strings to make it possible to search the code for them: please
see `Linux kernel style (breaking long lines and strings)
<https://www.kernel.org/doc/html/v4.10/process/coding-style.html#breaking-long-lines-and-strings>`_
and `Issue #1794 <https://github.com/FRRouting/frr/issues/1794>`_.
Scripted commit, idempotent to running:
```
python3 tools/stringmangle.py --unwrap `git ls-files | egrep '\.[ch]$'`
```
Signed-off-by: David Lamparter <equinox@diac24.net>
This attribute is not involved in path selection, and according to rfc7606
it should just be ignored.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>