bgp_attr_intern(attr) takes an attribute, duplicates it, and inserts it
into the attribute hash table, returning the inserted attr. This is done
when processing a bgp update. We store the returned attribute in the
path info struct. However, later on we modify one of the fields of the
attribute. This field is inspected by attrhash_cmp, the function that
allows the hash table to select the correct item from the hash chain for
a given key when doing a lookup on an item. By modifying the field after
it's been inserted, we open the possibility that two items in the same
chain that at insertion time were differential by attrhash_cmp becomes
equal according to that function. When performing subsequent hash
lookups, it is then indeterminate which of the equivalent items the hash
table will select from the chain (in practice it is the first one but
this may not be the one we want). Thus, it is illegal to modify
data used by a hash comparison function after inserting that data into
a hash table.
In fact this is occurring for attributes. We insert two attributes that
hash to the same key and thus end up in the same hash chain. Then we
modify one of them such that the two items now compare equal. Later one
we want to release the second item from the chain before XFREE()'ing it,
but since the two items compare equal we get the first item back, then
free the second one, which constitutes two bugs, the first being the
wrong attribute removed from the hash table and the second being a
dangling pointer stored in the hash table.
To rectify this we need to perform any modifications to an attr before
it is inserted into the table, i.e., before calling bgp_attr_intern().
This patch does that by moving the sole modification to the attr that
occurs after the insert (that I have seen) before that call.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
This will check route-maps as well, not only prefix-lists, access-lists, and
filter-lists.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
until now, the assumption was done in bgp flowspec code that the
information contained was an ipv4 flowspec prefix. now that it is
possible to handle ipv4 or ipv6 flowspec prefixes, that information is
stored in prefix_flowspec attribute. Also, some unlocking is done in
order to process ipv4 and ipv6 flowspec entries.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When a SYNC route i.e. a route with a local ES as destination is
rxed on a switch (say L11) from an ES peer (say L12) a local
MAC/neigh entry is created on L11 with the local access port
as dest port.
Creation of the local entry triggers a local path advertisement from
L11. This could be a "locally-active" path or a "locally-inactive"
path. Inactive paths are advertised with the proxy bit.
To ensure that the local entry is not deleted by a SYNC route it is
given absolute precedence over peer-paths.
If there are two non-local paths with the same dest ES and same MM
seq number the non-proxy path is preferred. This is done to ensure
that we don't lose track of the peer-activity.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
1. Sample ES display
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
torm-11# sh bgp l2vpn evpn es
ES Flags: L local, R remote, I inconsistent
VTEP Flags: E ESR/Type-4, A active nexthop
ESI Flags RD #VNIs VTEPs
03:00:00:00:00:01:11:00:00:01 LR 27.0.0.15:15 10 27.0.0.16(EA)
03:00:00:00:00:01:22:00:00:02 LR 27.0.0.15:16 10 27.0.0.16(EA)
03:00:00:00:00:01:22:00:00:03 LR 27.0.0.15:17 10 27.0.0.16(EA)
03:00:00:00:00:02:11:00:00:01 R - 10 27.0.0.17(A),27.0.0.18(A)
03:00:00:00:00:02:22:00:00:02 R - 10 27.0.0.17(A),27.0.0.18(A)
03:00:00:00:00:02:22:00:00:03 R - 10 27.0.0.17(A),27.0.0.18(A)
torm-11#
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2. Sample ES-EVI display
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
torm-11# sh bgp l2vpn evpn es-evi
Flags: L local, R remote, I inconsistent
VTEP-Flags: E EAD-per-ES, V EAD-per-EVI
VNI ESI Flags VTEPs
1005 03:00:00:00:00:01:11:00:00:01 LR 27.0.0.16(EV)
1005 03:00:00:00:00:01:22:00:00:02 LR 27.0.0.16(EV)
1005 03:00:00:00:00:01:22:00:00:03 LR 27.0.0.16(EV)
1005 03:00:00:00:00:02:11:00:00:01 R 27.0.0.17(EV),27.0.0.18(EV)
1005 03:00:00:00:00:02:22:00:00:02 R 27.0.0.17(EV),27.0.0.18(EV)
1005 03:00:00:00:00:02:22:00:00:03 R 27.0.0.17(EV),27.0.0.18(EV)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
3. Sample EAD route display
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
torm-11# sh bgp l2vpn evpn route type ead
BGP table version is 19, local router ID is 27.0.0.15
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-1 prefix: [4]:[ESI]:[EthTag]:[IPlen]:[VTEP-IP]
EVPN type-2 prefix: [2]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-4 prefix: [4]:[ESI]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[EthTag]:[IPlen]:[IP]
Network Next Hop Metric LocPrf Weight Path
Extended Community
Route Distinguisher: 27.0.0.15:5
*> [1]:[0]:[03:00:00:00:00:01:11:00:00:01]:[128]:[0.0.0.0]
27.0.0.15 32768 i
ET:8 RT:5550:1009
*> [1]:[0]:[03:00:00:00:00:01:22:00:00:02]:[128]:[0.0.0.0]
27.0.0.15 32768 i
ET:8 RT:5550:1009
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Add ESI as an inline attribute field along with the other EVPN
attributes. This may be re-worked when the rest of the EVPN
attributes find a new home.
Some cleanup has been done to get rid of stale/unused references
to ESI. And also to consolidate duplicate definitions of ES ID
types.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
When we have a prefix that has been selected, note that that
particular flag has been set and give that information to the
end user.
eva# show bgp ipv4 uni neighbors 192.168.161.131 prefix-counts
Prefix counts for 192.168.161.131, IPv4 Unicast
PfxCt: 814246
Counts from RIB table walk:
Adj-in: 0
Damped: 0
Removed: 0
History: 0
Stale: 0
Valid: 814246
All RIB: 814246
PfxCt counted: 814246
PfxCt Best Selected: 0
Useable: 814246
eva# show bgp ipv4 uni neighbors 192.168.161.2 prefix-counts
Prefix counts for 192.168.161.2, IPv4 Unicast
PfxCt: 814070
Counts from RIB table walk:
Adj-in: 0
Damped: 0
Removed: 0
History: 0
Stale: 0
Valid: 814070
All RIB: 814070
PfxCt counted: 814070
PfxCt Best Selected: 814070
Useable: 814070
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
If _force_ is set, then ALL prefixes are counted for maximum instead of
accepted only. This is useful for cases where an inbound filter is applied,
but you want maximum-prefix to act on ALL (including filtered) prefixes.
For instance, we have a configuration like:
neighbor r1 maximum-prefix 10
neighbor r1 prefix-list custom in
!
ip prefix-list custom seq 1 permit 10.0.0.0/24
ip prefix-list custom seq 2 permit 10.0.1.0/24
This will accept only 2 prefixes and discard all others instead of
shutting down the session when 10 is reached.
With this new knob (force), we will count all received prefixes and shutdown
the session when 10 is reached.
The bigger problem is when you have lots of peers with full feed and such a
configuration like in an example.
This is kinda re-ordering of how to treat filter vs. maximum-prefix.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
```
(gdb) bt
0 0x00007f45a6f0a781 in raise () from /lib/x86_64-linux-gnu/libc.so.6
1 0x00007f45a6ef455b in abort () from /lib/x86_64-linux-gnu/libc.so.6
2 0x00007f45a7781920 in core_handler (signo=11, siginfo=0x7fffac7b84b0, context=<optimized out>) at lib/sigevent.c:228
3 <signal handler called>
4 0x000055a4133c0f32 in bgp_table_stats (vty=vty@entry=0x55a415acb240, bgp=0x0, afi=AFI_IP, safi=SAFI_UNICAST, json_array=json_array@entry=0x0) at bgpd/bgp_route.c:11412
5 0x000055a4133c13fb in show_ip_bgp_afi_safi_statistics (self=<optimized out>, vty=0x55a415acb240, argc=6, argv=<optimized out>) at bgpd/bgp_route.c:10749
6 0x00007f45a773917d in cmd_execute_command_real (vline=vline@entry=0x55a415ab7e10, vty=vty@entry=0x55a415acb240, cmd=cmd@entry=0x0, filter=FILTER_RELAXED)
at lib/command.c:909
7 0x00007f45a773afdf in cmd_execute_command (vline=vline@entry=0x55a415ab7e10, vty=vty@entry=0x55a415acb240, cmd=0x0, vtysh=vtysh@entry=0) at lib/command.c:968
8 0x00007f45a773b135 in cmd_execute (vty=vty@entry=0x55a415acb240, cmd=cmd@entry=0x55a415ace950 "show ip bgp vrf all statistics", matched=matched@entry=0x0,
vtysh=vtysh@entry=0) at lib/command.c:1122
9 0x00007f45a7794d62 in vty_command (vty=vty@entry=0x55a415acb240, buf=0x55a415ace950 "show ip bgp vrf all statistics") at lib/vty.c:526
10 0x00007f45a7794fb6 in vty_execute (vty=vty@entry=0x55a415acb240) at lib/vty.c:1293
11 0x00007f45a7797804 in vtysh_read (thread=<optimized out>) at lib/vty.c:2126
12 0x00007f45a778f641 in thread_call (thread=thread@entry=0x7fffac7bb040) at lib/thread.c:1550
13 0x00007f45a775b6d8 in frr_run (master=0x55a415542820) at lib/libfrr.c:1098
14 0x000055a4133815d6 in main (argc=10, argv=0x7fffac7bb2a8) at bgpd/bgp_main.c:509
```
"show ip bgp vrf all statistics" should show statistics for all VRFs if "all"
is specified.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Remove mid-string line breaks, cf. workflow doc:
.. [#tool_style_conflicts] For example, lines over 80 characters are allowed
for text strings to make it possible to search the code for them: please
see `Linux kernel style (breaking long lines and strings)
<https://www.kernel.org/doc/html/v4.10/process/coding-style.html#breaking-long-lines-and-strings>`_
and `Issue #1794 <https://github.com/FRRouting/frr/issues/1794>`_.
Scripted commit, idempotent to running:
```
python3 tools/stringmangle.py --unwrap `git ls-files | egrep '\.[ch]$'`
```
Signed-off-by: David Lamparter <equinox@diac24.net>
It's hard to cope with cases when next-hop is changed/unchanged or
peers are non-direct.
It would be better to show the hostname and nexthop IP address (both)
under `show bgp` to quickly identify the source and the real next-hop
of the route.
If `bgp default show-nexthop-hostname` is toggled the output looks like:
```
spine1-debian-9# show bgp
BGP table version is 1, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 65002
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
* 2a02:4780::/64 fe80::a00:27ff:fe09:f8a3(exit1-debian-9)
0 0 65001 ?
spine1-debian-9# show ip bgp
BGP table version is 5, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 65002
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 10.255.255.0/24 192.168.0.1(exit1-debian-9)
0 0 65001 ?
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
If the RT changes on a L3VPN route then any leak of this route into
a VRF should be withdrawn.
Extend existing EVPN check for RT change to cover L3VPN routes.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
This is the bulk part extracted from "bgpd: Convert from `struct
bgp_node` to `struct bgp_dest`". It should not result in any functional
change.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
If we have something like:
```
ip route 1.1.1.0/24 Null0
!
router bgp 100
no bgp ebgp-requires-policy
neighbor 192.168.0.2 remote-as 200
!
address-family ipv4 unicast
network 1.1.1.0/24
redistribute connected
exit-address-family
!
line vty
!
```
1.1.1.0/24 is not advertised due to martian nexthop (0.0.0.0). It starts
working only when we use `redistribute static`.
By checking if it's a BGP static route we able to announce
1.1.1.0/24 with `network 1.1.1.0/24` without redistribute even when
`bgp import-check` is enabled.
Disabling `bgp import-check` works as well, but it's enabled by default
since 7.4.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Non-best paths (path info structures) also need to be freed during
table cleanup not only to release their memory but to also ensure
any linkages are updated correctly. One such example is for EVPN
where there is a link between the imported path info (in a L2 or
L3 vrf instance) and its parent path info.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Without specifying a default afi/safi we get a segfault:
```
(gdb) frame 4
bgp_table_stats (..., afi=32724, safi=SAFI_UNICAST, ...
11349 if (!bgp->rib[afi][safi]) {
(gdb)
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
unicast and labeled-unicast share the same table, but configuration should
be visible for both independently. Without this fix it confuses a bit
because when you enter `network 10.0.0.0/24` under labeled-unicast it's
written in unicast family block.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Replace sprintf with snprintf where straightforward to do so.
- sprintf's into local scope buffers of known size are replaced with the
equivalent snprintf call
- snprintf's into local scope buffers of known size that use the buffer
size expression now use sizeof(buffer)
- sprintf(buf + strlen(buf), ...) replaced with snprintf() into temp
buffer followed by strlcat
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
When we receive an UPDATE with MP_NEXTHOP len as 32 bytes, we shouldn't
check if the global (1st) nexthop is unspecified.
Peering between bird and FRRouting we receive from Bird something like:
```
rcvd UPDATE w/ attr: , origin i, mp_nexthop ::(fe80::a00:27ff:fe09:f8a3)
```
The link-local (2nd) nexthop is valid and validated later in the code.
Before it was marked:
```
IPv6 unicast -- DENIED due to: martian or self next-hop;
```
After it's a valid prefix:
```
spine1-debian-9# show bgp
BGP table version is 0, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 65002
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
2a02:4780::/64 fe80::a00:27ff:fe09:f8a3
0 65001 i
Displayed 1 routes and 1 total paths
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
In real world sometimes happens that bgp_nexthop_cache is NULL. Avoid
segfaulting when using `show [ip] bgp ...` CLI commands.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
The problem is when using kinda such topologies:
(192.168.1.1/32) r1 <-- eBGP --> r2 <-- iBGP --> r3
Looking at r3's nexthop for 192.168.1.1/32 we have it as r2, but really
it MUST be r1.
Checking if the nexthop is connected solves the problem even for cases
when route-reflectors are used.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Some competitive vendors like Cisco, Bird, OpenBGPD,
Nokia already have this by default enabled.
The list is here: https://github.com/bgp/RFC8212
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>