When the remote peer is neither EBGP nor confed, aspath is the
shadow copy of attr->aspath in bgp_packet_attribute(). Striping
AS4_PATH should not be done on the aspath directly, since
that would lead to bgpd core dump when unintern the attr.
Signed-off-by: Yuan Yuan <yyuanam@amazon.com>
Based on RFC-4760, if NEXT_HOP attribute is not
suppose to be set if MP_REACH_NLRI NLRI is used.
for IPv4 aggregate route only NEXT_HOP attribute
with ipv4 prefixlen needs to be set.
Testing Done:
Before fix:
----------
aggregate route:
*> 184.123.0.0/16 ::(TORC11) 0 32768 i
After fix:
---------
aggregate route:
*> 184.123.0.0/16 0.0.0.0(TORC11) 0 32768 i
* i peerlink-3 0 100 0 i
* uplink1 0 4435 5546 i
184.123.1.0/24 0.0.0.0(TORC11) 0 32768 i
s> 184.123.8.0/22 0.0.0.0(TORC11) 0 32768 i
Signed-off-by: Chirag Shah <chirag@nvidia.com>
BGP_PREFIX_SID_SRV6_L3_SERVICE attributes must not
fully trust the length value specified in the nlri.
Always ensure that the amount of data we need to read
can be fullfilled.
Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a hash_clean_and_free() function as well as convert
the code to use it. This function also takes a double
pointer to the hash to set it NULL. Also it cleanly
does nothing if the pointer is NULL( as a bunch of
code tested for ).
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When we specify remote-as as external/internal, we need to set local_as to
bgp->as, instead of bgp->confed_id. Before this patch, (bgp->as != *as) is
always valid for such a case because *as is always 0.
Also, append peer->local_as as CONFED_SEQ to avoid other side withdrawing
the routes due to confederation own AS received and/or malformed as-path.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Because mp_nexthop_len attribute value stands for the length
to encode in the stream, simplify the way the nexthop is
forged.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The peer pointer theorically have a NULL bgp pointer. This triggers
a SA issue. So let us fix it.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Each BGP prefix may have an as-path list attached. A forged
string is stored in the BGP attribute and shows the as-path
list output.
Before this commit, the as-path list output was expressed as
a list of AS values in plain format. Now, if a given BGP instance
uses a specific asnotation, then the output is changed:
new output:
router bgp 1.1 asnotation dot
!
address-family ipv4 unicast
network 10.200.0.0/24 route-map rmap
network 10.201.0.0/24 route-map rmap
redistribute connected route-map rmap
exit-address-family
exit
!
route-map rmap permit 1
set as-path prepend 1.1 5433.55 264564564
exit
ubuntu2004# do show bgp ipv4
BGP table version is 2, local router ID is 10.0.2.15, vrf id 0
Default local pref 100, local AS 1.1
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 4.4.4.4/32 0.0.0.0 0 32768 1.1 5433.55 4036.61268 ?
*> 10.0.2.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 ?
10.200.0.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 i
10.201.0.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 i
The changes include:
- the aspath structure has a new field: asnotation type
The ashash list will differentiate 2 aspaths using a different
asnotation.
- 3 new printf extensions display the as number in the wished
format: pASP, pASD, pASE for plain, dot, or dot+ format (extended).
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In ipv6 vpn, when the global and the local ipv6 address are received,
when re-transmitting the bgp ipv6 update, the nexthop attribute
length must still be 48 bytes.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The idea is to drop unwanted attributes from the BGP UPDATE messages and
continue by just ignoring them. This improves the security, flexiblity, etc.
This is the command that Cisco has also.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
At bgpd startup, VRF instances are sent from zebra before the
interfaces. When importing a l3vpn prefix from another local VRF
instance, the interfaces are not known yet. The prefix nexthop interface
cannot be set to the loopback or the VRF interface, which causes setting
invalid routes in zebra.
Update route leaking when the loopback or a VRF interface is received
from zebra.
At a VRF interface deletion, zebra voluntarily sends a
ZEBRA_INTERFACE_ADD message to move it to VRF_DEFAULT. Do not update if
such a message is received. VRF destruction will destroy all the related
routes without adding codes.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
If 'network import-check' is defined on the source BGP session, prefixes
that are stated in the network command cannot be leaked to the other
VRFs BGP table even if they are present in the origin VRF RIB if the
'rt import' statement is defined after the 'network <prefix>' ones.
When a prefix nexthop is updated, update the prefix route leaking. The
current state of nexthop validation is now stored in the attributes of
the bgp path info. Attributes are compared with the previous ones at
route leaking update so that a nexthop validation change now triggers
the update of destination VRF BGP table.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
The function bgp_packet_mpattr_prefix was using an if statement
to encode packets to the peer. Change it to a switch and make
it handle all the cases and fail appropriately when something
has gone wrong. Hopefully in the future when a new afi/safi
is added we can catch it by compilation breaking instead of
weird runtime errors
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This function was just using default: case statements for
the encoding of nlri's to a peer. Lay out all the different
cases and make things fail hard when a dev escape is found.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The function bgp_packet_mpattr_prefix_size had an if/else
body that allowed people to add encoding types to bgpd
such that we could build the wrong size packets. This
was exposed recently in commit:
0a9705a1e0
Where it was discovered flowspec was causing bgp update
messages to exceed the maximum size and the peer to
drop the connection. Let's be proscriptive about this
and hopefully make it so that things don't work when
someone adds a new safi to the system ( and they'll have
to update this function ).
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently, bgp_packet_mpattr_prefix_size (bgpd/bgp_attr.c:3978) always returns zero for Flowspec prefixes.
This is because, for flowspec prefixes, the prefixlen attribute of the prefix struct is always set to 0, and the actual length is bytes is set inside the flowspec_prefix struct instead (see lib/prefix.h:293 and lib/prefix.h:178).
Because of this, with a large number of flowspec NLRIs, bgpd ends up building update messages that exceed the maximum size and cause the peer to drop the connection (bgpd/bgp_updgrp_packet.c:L719).
The proposed change allows the bgp_packet_mpattr_prefix_size to return correct result for flowspec prefixes.
Signed-off-by: Stephane Poignant <stephane.poignant@proton.ch>
It's possible to send less data then the length you say you are.
Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Use %pI4/%pI6 where possible, otherwise at least atjust stack buffer sizes
for inet_ntop() calls.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Given that two routers are connected each other and they have IPv6
addresses and they establish BGP peer with extended-nexthop capability
and one router tries to advertise locally-generated IPv4-VPN routes to
other router.
In this situation, bgpd on the router that tries to advertise IPv4-VPN
routes will be crashed with "invalid MP nexthop length (AFI IP6)".
This issue is happened because MP_REACH_NLRI path attribute is not
generated correctly when ipv4-vpn routes are advertised to IPv6 peer.
When IPv4 routes are leaked from VRF RIB, the nexthop of these routes
are also IPv4 address (0.0.0.0/0 or specific addresses). However,
bgp_packet_mpattr_start only covers the case of IPv6 nexthop (for IPv6
peer).
ipv4-unicast routes were not affected by this issue because the case of
IPv4 nexthop is covered in `else` block.
Signed-off-by: Ryoga Saito <ryoga.saito@linecorp.com>
Currently bgpd uses the opaque codepoint (0xFFFF) in the BGP
advertisement. In this commit, we update bgpd to use the SRv6 codepoints
defined in the IANA SRv6 Endpoint Behaviors Registry
(https://www.iana.org/assignments/segment-routing/segment-routing.xhtml)
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
For now, only if the knob is enabled. Later this gonna be (most likely) removed
and routes with AS_SET / AS_CONFED_SET will be denied by default.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
According to RFC9234:
An UPDATE message with a malformed OTC Attribute SHALL be handled
using the approach of "treat-as-withdraw".
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The command `debug bgp allow-martian` is not actually
a debug command it's a command that when entered allows
bgp to not reset a peering when a martian nexthop is
passed in the nlri.
Add the `bgp allow-martian-nexthop` command and allow it to be
used.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
RFC9234 is a way to establish correct connection roles (Customer/
Provider, Peer or with RS) between bgp speakers. This patch:
- Add a new configuration/terminal option to set the appropriate local
role;
- Add a mechanism for checking used roles, implemented by exchanging
the corresponding capabilities in OPEN messages;
- Add strict mode to force other party to use this feature;
- Add basic support for a new transitive optional bgp attribute - OTC
(Only to Customer);
- Add logic for default setting OTC attribute and filtering routes with
this attribute by the edge speakers, if the appropriate conditions are
met;
- Add two test stands to check role negotiation and route filtering
during role usage.
Signed-off-by: Eugene Bogomazov <eb@qrator.net>