https://datatracker.ietf.org/doc/html/rfc6514#page-10 states:
A router that supports the PMSI Tunnel attribute considers this
attribute to be malformed if either (a) it contains an undefined
tunnel type in the Tunnel Type field of the attribute, or (b) the
router cannot parse the Tunnel Identifier field of the attribute as a
tunnel identifier of the tunnel types specified in the Tunnel Type
field of the attribute.
When a router that receives a BGP Update that contains the PMSI
Tunnel attribute with its Partial bit set determines that the
attribute is malformed, the router SHOULD treat this Update as though
all the routes contained in this Update had been withdrawn.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Before this path we used session reset method, which is discouraged by rfc7606.
Handle this as rfc requires.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Found when fuzzing:
```
==3470861==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xffff77801ef7 at pc 0xaaaaba7b3dbc bp 0xffffcff0e760 sp 0xffffcff0df50
READ of size 2 at 0xffff77801ef7 thread T0
0 0xaaaaba7b3db8 in __asan_memcpy (/home/ubuntu/frr_8_5_2/frr_8_5_2_fuzz_clang/bgpd/bgpd+0x363db8) (BuildId: cc710a2356e31c7f4e4a17595b54de82145a6e21)
1 0xaaaaba81a8ac in ptr_get_be16 /home/ubuntu/frr_8_5_2/frr_8_5_2_fuzz_clang/./lib/stream.h:399:2
2 0xaaaaba819f2c in bgp_attr_aigp_valid /home/ubuntu/frr_8_5_2/frr_8_5_2_fuzz_clang/bgpd/bgp_attr.c:504:3
3 0xaaaaba808c20 in bgp_attr_aigp /home/ubuntu/frr_8_5_2/frr_8_5_2_fuzz_clang/bgpd/bgp_attr.c:3275:7
4 0xaaaaba7ff4e0 in bgp_attr_parse /home/ubuntu/frr_8_5_2/frr_8_5_2_fuzz_clang/bgpd/bgp_attr.c:3678:10
```
Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Relax this handling (RFC 7606) only for eBGP peers.
More details: https://datatracker.ietf.org/doc/html/rfc7606#section-4
There are two error cases in which the Total Attribute Length value
can be in conflict with the enclosed path attributes, which
themselves carry length values:
* In the first case, the length of the last encountered path
attribute would cause the Total Attribute Length to be exceeded
when parsing the enclosed path attributes.
* In the second case, fewer than three octets remain (or fewer than
four octets, if the Attribute Flags field has the Extended Length
bit set) when beginning to parse the attribute. That is, this
case exists if there remains unconsumed data in the path
attributes but yet insufficient data to encode a single minimum-
sized path attribute. <<<< HANDLING THIS CASE IN THIS COMMIT >>>>
In either of these cases, an error condition exists and the "treat-
as-withdraw" approach MUST be used (unless some other, more severe
error is encountered dictating a stronger approach), and the Total
Attribute Length MUST be relied upon to enable the beginning of the
NLRI field to be located.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
As per RFC7606 section 3g,
g. If the MP_REACH_NLRI attribute or the MP_UNREACH_NLRI [RFC4760]
attribute appears more than once in the UPDATE message, then a
NOTIFICATION message MUST be sent with the Error Subcode
"Malformed Attribute List". If any other attribute (whether
recognized or unrecognized) appears more than once in an UPDATE
message, then all the occurrences of the attribute other than the
first one SHALL be discarded and the UPDATE message will continue
to be processed.
However, notification is sent out currently for all the cases.
Fix:
For cases other than MP_REACH_NLRI & MP_UNREACH_NLRI, handling has been updated
to discard the occurrences other than the first one and proceed with further parsing.
Again, the handling is relaxed only for the EBGP case.
Also, since in case of error, the attribute is discarded &
stream pointer is being adjusted accordingly based on length,
the total attribute length sanity check case has been moved up in the function
to be checked before this case.
Signed-off-by: Samanvitha B Bhargav <bsamanvitha@vmware.com>
As per RFC7606 section 4,
when the total attribute length value is in conflict with the
enclosed attribute length, treat-as-withdraw approach must be followed.
However, notification is being sent out for this case currently,
that leads to session reset.
Fix:
The handling has been updated to conform to treat-as-withdraw
approach only for EBGP case. For IBGP, since we are not following
treat-as-withdraw approach for any of the error handling cases,
the existing behavior is retained for the IBGP.
Signed-off-by: Samanvitha B Bhargav <bsamanvitha@vmware.com>
RCA:
On encountering any attribute error for core attributes in update message,
the error handling is set to 'treat as withdraw' and
further parsing of the remaining attributes is skipped.
But the stream pointer is not being correctly adjusted to
point to the next NLRI field skipping the rest of the attributes.
This leads to incorrect parsing of the NLRI field,
which causes BGP session to reset.
Fix:
The stream pointer offset is rightly adjusted to point to the NLRI field correctly
when the malformed attribute is encountered and remaining attribute parsing is skipped.
Signed-off-by: Samanvitha B Bhargav <bsamanvitha@vmware.com>
The total number of octets of the Sub-TLV Value field. The Sub-TLV Length field
contains 1 octet if the Sub-TLV Type field contains a value in the range from
0-127. The Sub-TLV Length field contains two octets if the Sub-TLV Type field
contains a value in the range from 128-255.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When the remote peer is neither EBGP nor confed, aspath is the
shadow copy of attr->aspath in bgp_packet_attribute(). Striping
AS4_PATH should not be done on the aspath directly, since
that would lead to bgpd core dump when unintern the attr.
Signed-off-by: Yuan Yuan <yyuanam@amazon.com>
Adds a generalized martian reimport function used for triggering a
relearn/reimport of EVPN routes that were previously filtered/deleted
as a result of a "self" check (either during import or by a martian
change handler). The MAC-VRF SoO is the first consumer of this function,
but can be expanded for use with Martian Tunnel-IPs, Interface-IPs,
Interface-MACs, and RMACs.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Based on RFC-4760, if NEXT_HOP attribute is not
suppose to be set if MP_REACH_NLRI NLRI is used.
for IPv4 aggregate route only NEXT_HOP attribute
with ipv4 prefixlen needs to be set.
Testing Done:
Before fix:
----------
aggregate route:
*> 184.123.0.0/16 ::(TORC11) 0 32768 i
After fix:
---------
aggregate route:
*> 184.123.0.0/16 0.0.0.0(TORC11) 0 32768 i
* i peerlink-3 0 100 0 i
* uplink1 0 4435 5546 i
184.123.1.0/24 0.0.0.0(TORC11) 0 32768 i
s> 184.123.8.0/22 0.0.0.0(TORC11) 0 32768 i
Signed-off-by: Chirag Shah <chirag@nvidia.com>
BGP_PREFIX_SID_SRV6_L3_SERVICE attributes must not
fully trust the length value specified in the nlri.
Always ensure that the amount of data we need to read
can be fullfilled.
Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a hash_clean_and_free() function as well as convert
the code to use it. This function also takes a double
pointer to the hash to set it NULL. Also it cleanly
does nothing if the pointer is NULL( as a bunch of
code tested for ).
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When we specify remote-as as external/internal, we need to set local_as to
bgp->as, instead of bgp->confed_id. Before this patch, (bgp->as != *as) is
always valid for such a case because *as is always 0.
Also, append peer->local_as as CONFED_SEQ to avoid other side withdrawing
the routes due to confederation own AS received and/or malformed as-path.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Because mp_nexthop_len attribute value stands for the length
to encode in the stream, simplify the way the nexthop is
forged.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The peer pointer theorically have a NULL bgp pointer. This triggers
a SA issue. So let us fix it.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Each BGP prefix may have an as-path list attached. A forged
string is stored in the BGP attribute and shows the as-path
list output.
Before this commit, the as-path list output was expressed as
a list of AS values in plain format. Now, if a given BGP instance
uses a specific asnotation, then the output is changed:
new output:
router bgp 1.1 asnotation dot
!
address-family ipv4 unicast
network 10.200.0.0/24 route-map rmap
network 10.201.0.0/24 route-map rmap
redistribute connected route-map rmap
exit-address-family
exit
!
route-map rmap permit 1
set as-path prepend 1.1 5433.55 264564564
exit
ubuntu2004# do show bgp ipv4
BGP table version is 2, local router ID is 10.0.2.15, vrf id 0
Default local pref 100, local AS 1.1
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 4.4.4.4/32 0.0.0.0 0 32768 1.1 5433.55 4036.61268 ?
*> 10.0.2.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 ?
10.200.0.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 i
10.201.0.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 i
The changes include:
- the aspath structure has a new field: asnotation type
The ashash list will differentiate 2 aspaths using a different
asnotation.
- 3 new printf extensions display the as number in the wished
format: pASP, pASD, pASE for plain, dot, or dot+ format (extended).
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In ipv6 vpn, when the global and the local ipv6 address are received,
when re-transmitting the bgp ipv6 update, the nexthop attribute
length must still be 48 bytes.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The idea is to drop unwanted attributes from the BGP UPDATE messages and
continue by just ignoring them. This improves the security, flexiblity, etc.
This is the command that Cisco has also.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
At bgpd startup, VRF instances are sent from zebra before the
interfaces. When importing a l3vpn prefix from another local VRF
instance, the interfaces are not known yet. The prefix nexthop interface
cannot be set to the loopback or the VRF interface, which causes setting
invalid routes in zebra.
Update route leaking when the loopback or a VRF interface is received
from zebra.
At a VRF interface deletion, zebra voluntarily sends a
ZEBRA_INTERFACE_ADD message to move it to VRF_DEFAULT. Do not update if
such a message is received. VRF destruction will destroy all the related
routes without adding codes.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
If 'network import-check' is defined on the source BGP session, prefixes
that are stated in the network command cannot be leaked to the other
VRFs BGP table even if they are present in the origin VRF RIB if the
'rt import' statement is defined after the 'network <prefix>' ones.
When a prefix nexthop is updated, update the prefix route leaking. The
current state of nexthop validation is now stored in the attributes of
the bgp path info. Attributes are compared with the previous ones at
route leaking update so that a nexthop validation change now triggers
the update of destination VRF BGP table.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
The function bgp_packet_mpattr_prefix was using an if statement
to encode packets to the peer. Change it to a switch and make
it handle all the cases and fail appropriately when something
has gone wrong. Hopefully in the future when a new afi/safi
is added we can catch it by compilation breaking instead of
weird runtime errors
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This function was just using default: case statements for
the encoding of nlri's to a peer. Lay out all the different
cases and make things fail hard when a dev escape is found.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The function bgp_packet_mpattr_prefix_size had an if/else
body that allowed people to add encoding types to bgpd
such that we could build the wrong size packets. This
was exposed recently in commit:
0a9705a1e0
Where it was discovered flowspec was causing bgp update
messages to exceed the maximum size and the peer to
drop the connection. Let's be proscriptive about this
and hopefully make it so that things don't work when
someone adds a new safi to the system ( and they'll have
to update this function ).
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently, bgp_packet_mpattr_prefix_size (bgpd/bgp_attr.c:3978) always returns zero for Flowspec prefixes.
This is because, for flowspec prefixes, the prefixlen attribute of the prefix struct is always set to 0, and the actual length is bytes is set inside the flowspec_prefix struct instead (see lib/prefix.h:293 and lib/prefix.h:178).
Because of this, with a large number of flowspec NLRIs, bgpd ends up building update messages that exceed the maximum size and cause the peer to drop the connection (bgpd/bgp_updgrp_packet.c:L719).
The proposed change allows the bgp_packet_mpattr_prefix_size to return correct result for flowspec prefixes.
Signed-off-by: Stephane Poignant <stephane.poignant@proton.ch>
It's possible to send less data then the length you say you are.
Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Use %pI4/%pI6 where possible, otherwise at least atjust stack buffer sizes
for inet_ntop() calls.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Given that two routers are connected each other and they have IPv6
addresses and they establish BGP peer with extended-nexthop capability
and one router tries to advertise locally-generated IPv4-VPN routes to
other router.
In this situation, bgpd on the router that tries to advertise IPv4-VPN
routes will be crashed with "invalid MP nexthop length (AFI IP6)".
This issue is happened because MP_REACH_NLRI path attribute is not
generated correctly when ipv4-vpn routes are advertised to IPv6 peer.
When IPv4 routes are leaked from VRF RIB, the nexthop of these routes
are also IPv4 address (0.0.0.0/0 or specific addresses). However,
bgp_packet_mpattr_start only covers the case of IPv6 nexthop (for IPv6
peer).
ipv4-unicast routes were not affected by this issue because the case of
IPv4 nexthop is covered in `else` block.
Signed-off-by: Ryoga Saito <ryoga.saito@linecorp.com>
Currently bgpd uses the opaque codepoint (0xFFFF) in the BGP
advertisement. In this commit, we update bgpd to use the SRv6 codepoints
defined in the IANA SRv6 Endpoint Behaviors Registry
(https://www.iana.org/assignments/segment-routing/segment-routing.xhtml)
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
For now, only if the knob is enabled. Later this gonna be (most likely) removed
and routes with AS_SET / AS_CONFED_SET will be denied by default.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
According to RFC9234:
An UPDATE message with a malformed OTC Attribute SHALL be handled
using the approach of "treat-as-withdraw".
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The command `debug bgp allow-martian` is not actually
a debug command it's a command that when entered allows
bgp to not reset a peering when a martian nexthop is
passed in the nlri.
Add the `bgp allow-martian-nexthop` command and allow it to be
used.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
RFC9234 is a way to establish correct connection roles (Customer/
Provider, Peer or with RS) between bgp speakers. This patch:
- Add a new configuration/terminal option to set the appropriate local
role;
- Add a mechanism for checking used roles, implemented by exchanging
the corresponding capabilities in OPEN messages;
- Add strict mode to force other party to use this feature;
- Add basic support for a new transitive optional bgp attribute - OTC
(Only to Customer);
- Add logic for default setting OTC attribute and filtering routes with
this attribute by the edge speakers, if the appropriate conditions are
met;
- Add two test stands to check role negotiation and route filtering
during role usage.
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
When implementing the bgp_packet_mpunreach_prefix a uint8_t array
of 3 bytes was created and then assigned to a label type, which
is 4 bytes and then various pointer work is done on it. Eventually
coverity is complaining that the 3 -vs- 4 bytes is not enough
to properly dereference it. Just make the uint8_t 4 bytes
and be done with it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The compiler is, rightly, pointing out that in some cases it is
possible that the pkt_afi and pkt_safi values are not properly
set and could result in a use before initialized. I do not
actually belive that this is possible, but let's make the compiler
happy.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
https://datatracker.ietf.org/doc/html/rfc7947#section-2.2
Optional recognized and unrecognized BGP attributes,
whether transitive or non-transitive, SHOULD NOT be updated by the
route server (unless enforced by local IXP operator configuration)
and SHOULD be passed on to other route server clients.
By default LB ext-community works with iBGP peers. When we receive a route
from eBGP peer, we can send LB ext-community to iBGP peers.
With this patch, allow sending LB ext-community to iBGP/eBGP peers if they
are set as RS clients.
FRR does not send non-transitive ext-communities to eBGP peers, but for
example GoBGP sends and if it's set as RS client, we should pass those attributes
towards another RS client.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When reading the BGP_PREFIX_SID_SRV6_L3_SERVICE_SID_STRUCTURE
it is possible that the length read in the packet is insufficiently
large enough to read a BGP_PREFIX_SID_SRV6_L3_SERVICE_SID_STRUCTURE.
Let's ensure that it is.
Fixes: #10860
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
bgp_attr_undup does the same thing as bgp_attr_flush – frees the
temporary data that might be allocated when applying a route-map. There
is no need to have two separate functions for that.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Avoid use-after-free situation. Flush attr_extra structure only when flushing
all attributes, not just for unintern.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>