Redistributing routes from a specific routing table to a particular routing
protocol necessitates copying route entries to the main routing table using the
"ip import-table" command. Once copied, these routes are assigned a distinct
"table" route type, which the "redistribute table" command of the routing
protocol then picks up.
For illustration, here is a configuration that showcases the use of
"import-table" and "redistribute":
> # show running-config
> [..]
> ip route 172.31.0.10/32 172.31.1.10 table 100
> router bgp 65500
> address-family ipv4 unicast
> redistribute table 100
> exit-address-family
> exit
> ip import-table 100
>
> # show ip route vrf default
> [..]
> T[100]>* 172.31.0.10/32 [15/0] via 172.31.1.10, r2-eth1, weight 1, 00:00:05
However, this method has inherent constraints:
- The 'import-table' parameter only handles route table id up to 252. The
253/254/255 table ids are reserved in the linux system, and adding other table
IDs above 255 leads to a design issue, where the size of some tables is directly
related to the maximum number of table ids to support.
- Duplicated route entries might interfere with original default table routes,
leading to potential conflicts. There is no guarantee that the zebra RIB will
favor these duplicated entries during redistribution.
- There are cases where the table ID can be checked independently of the default
routing table, as seen in Linux where the "ip rule" command is able to divert
traffic to that routing table. In that case, there is no need to duplicate route
entries in the default routing table.
To overcome these issues, a new redistribution type is proposed to redistribute
route entries directly from a specified routing table, eliminating the need for
an initial import into the default table.
Add a 'ZEBRA_ROUTE_TABLE_DIRECT' type to the 'REDISTRIBUTE' ZAPI messages. It
allows sending routes from a given non default table ID from zebra to a routing
daemon. The destination routing protocol table must be the default table.
The redistributed route inherit from the default distance value of 14: this is
the distance value reserved for routes redistributed via ROUTE_TABLE_DIRECT.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The weight scale value might be useful to have it
change it's behavior at a later time or controlled
by something depending on how FRR is compiled/ran.
Let's start that process
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently underlying asics get into a bit of trouble when the
nexthop weight passed down varies wildly between the different
numbers. Let's normalize the weight values between 1 and 255
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently when one interface changes its VRF, zebra will send these messages to
all daemons in *order*:
1) `ZEBRA_INTERFACE_DELETE` ( notify them delete from old VRF )
2) `ZEBRA_INTERFACE_VRF_UPDATE` ( notify them move from old to new VRF )
3) `ZEBRA_INTERFACE_ADD` ( notify them added into new VRF )
When daemons deal with `VRF_UPDATE`, they use
`zebra_interface_vrf_update_read()->if_lookup_by_name()`
to check the interface exist or not in old VRF. This check will always return
*NULL* because `DELETE` ( deleted from old VRF ) is already done, so can't
find this interface in old VRF.
Send `VRF_UPDATE` is redundant and unuseful. `DELETE` and `ADD` are enough,
they will deal with RB tree, so don't send this `VRF_UPDATE` message when
vrf changes.
Since all daemons have good mechanism to deal with changing vrf, and don't
use this `VRF_UPDATE` mechanism. So, it is safe to completely remove
all the code with `VRF_UPDATE`.
Signed-off-by: anlan_cs <anlan_cs@tom.com>
The iprule/pbr rule object has a vrf id, and zebra uses
that internally, but the vrf id isn't returned to clients
who install rules and are waiting for results. Include the
vrf_id sent by the client in the zapi result notification
message; update the existing clients so they decode the id.
Signed-off-by: Mark Stapp <mjs@labn.net>
Append zebra and lib to use muliple SRv6 segs SIDs, and keep one
seg SID for bgpd and sharpd.
Note: bgpd and sharpd compilation relies on the lib and zebra files,
i.e if we separate this: lib or zebra or bgpd or sharpd in different
commits - this will not compile.
Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
Add the txqlen attribute to the common interface struct. Capture
the value in zebra, and distribute it through the interface lib
module's zapi messaging.
Signed-off-by: Mark Stapp <mjs@labn.net>
Before now, PBRD used non-zero values to imply that a rule's
match or action field was active. This approach was getting
cumbersome for fields where 0 is a valid active value and
various field-specific magic values had to be used.
This commit changes PBRD to use a flag bit per field to
indicate that the field is active.
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
After Zebra knows it's capability surrounding v6 with v4 nexthops
have it send this ability up to interested parties.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
bgpd, pbrd: use common pbr encoder
zebra: use common pbr decoder
tests: pbr_topo1: check more filter fields
Purpose:
1. Reduce likelihood of zapi format mismatches when adding
PBR fields due to multiple parallel encoder implementations
2. Encourage common PBR structure usage among various daemons
3. Reduce coding errors via explicit per-field enable flags
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
Subset: ZAPI changes to send the new data
Also adds filter_bm field; currently for PBR_FILTER_PCP, but in the
future to be used for all of the filter fields.
Changes by:
Josh Werner <joshuawerner@mitre.org>
Eli Baum <ebaum@mitre.org>
G. Paul Ziemba <paulz@labn.net>
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
The default vrf is generally non-NULL, except when shutdown. So, most
of the time it is not necessary to check if it is NULL, we should
remove the useless checks for it.
Searched them with exact match:
```
grep -rI "zebra_vrf_lookup_by_id(VRF_DEFAULT)" | wc -l
31
```
Signed-off-by: anlan_cs <vic.lan@pica8.com>
When running all daemons with config for most of them, FRR has
sharpd@janelle:~/frr$ vtysh -c "show debug hashtable" | grep "VRF BIT HASH" | wc -l
3570
3570 hashes for bitmaps associated with the vrf. This is a very
large number of hashes. Let's do two things:
a) Reduce the created size of the actually created hashes to 2
instead of 32.
b) Delay generation of the hash *until* a set operation happens.
As that no hash directly implies a unset value if/when checked.
This reduces the number of hashes to 61 in my setup for normal
operation.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Allow zapi clients to register to be notified when a server
for an opaque message type is present. Zebra maintains these
notification registrations in the same data structures that it
uses for opaque message handling.
Signed-off-by: Mark Stapp <mjs@labn.net>
When using a context to send route notifications to upper
level protocols, the code was using a locking function to
get the route node. There is no need for this to be locked
as such FRR should free it up.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Once RP/BSR address is learned in PIMD, PIMD does nexthop tracking
in Zebra.
For IPV6 address, the nexthop type is either NEXTHOP_TYPE_IPV6
or NEXTHOP_TYPE_IPV6_IFINDEX.
Zebra should send nexthop ifindex information along with nexthop address
to the client (PIMD).
Issue: #11526
Issue: #11957
Signed-off-by: Sarita Patra <saritap@vmware.com>
a) Consolidate v4 and v6 versions of rib_match_multicast
b) Improve debug to show what we matched against as well.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add the ability to specify the label type along with the labels
you are passing to zebra in zapi_nexthop. This is needed as we
abstract the label code to be re-used by evpn as well as mpls.
Protocols need to be able to set the type of label they have attached.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Use the already existing mpls label code to store VNI
info for vxlan. VNI's are defined as labels just like mpls,
we should be using the same code for both.
This patch is the first part of that. Next we will need to
abstract the label code to not be so mpls specific. Currently
in this, we are just treating VXLAN as a label type and storing
it that way.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
This commit adds ZAPI encoders & decoders for traffic control operations, which
include tc_qdisc, tc_class and tc_filter.
Signed-off-by: Siger Yang <siger.yang@outlook.com>
At this point add abilty for the encode/decode of the
resilience down ZAPI to zebra. Just hookup sharpd
at this point in time.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In this commit, we extend the ZAPI to support encoding and decoding the
locator flags contained in the messages exchanged between zebra and the
routing daemons.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
Daemons like isisd continue to use the previous link-params after they
are removed from zebra.
For example,
>r0# sh run zebra
> (...)
> interface eth-rt1
> link-params
> enable
> metric 100
> exit-link-params
> r0# conf
> r0(config)# interface eth-rt1
> r0(config-if)# link-params
> r0(config-link-params)# no enable
After "no enable", "sh run zebra" displays no more link-params context.
The "no enable" causes the release of the "link_params" pointer within
the "interface" structure. The zebra function to update daemons with
a ZEBRA_INTERFACE_LINK_PARAMS zapi message is called but the function
returns without doing anything because the "link_params" pointer is
NULL. Therefore, the "link_params" pointers are kept in daemons.
When the zebra "link_params" pointer is NULL:
- Send a zapi link param message that contains no link parameters
instead of sending no message.
- At reception in daemons, the absence of link parameters causes the
release of the "link_params" pointer.
Fixes: 16f1b9e ("Update Traffic Engineering Support for OSPFD")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Currently if an operator does this operation:
sharpd@eva ~/frr8> sudo ip nexthop add id 5000 via 192.168.119.44 dev enp39s0 ; sudo ip route add 10.0.0.1 nhid 5000
2022/06/30 08:52:40 ZEBRA: [ZHQK5-J9M1R] proto2zebra: Please add this protocol(0) to proper rt_netlink.c handling
2022/06/30 08:52:40 ZEBRA: [PS16P-365FK][EC 4043309076] Zebra failed to find the nexthop hash entry for id=5000 in a route entry
sharpd@eva ~/frr8> vtysh -c "show ip route 10.0.0.1"
Routing entry for 0.0.0.0/0
Known via "kernel", distance 0, metric 100, best
Last update 00:01:58 ago
* 192.168.119.1, via enp39s0
The route is dropped by zebra with no warnings. This is not good,
but unlikely to happen at this point in time. In order to fix
this issue route processing from inputs needs to happen after nexthop
group processing from inputs. This was not possible because
nexthop groups are placed on the metaQ. As such the above
nexthop group creation is placed on the metaQ for processing
in META_QUEUE_NHG. Then the route is read in and processed
immediately. The nexthop group is not found ( not processed yet!)
and the route is dropped in zebra.
Modify the code to have early route processing of validity
on the MetaQ. This preserves the order of operations.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Multipath route may have mixed nexthops of EVPN and IP unicast. Move
EVPN flag to nexthop to support such cases.
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
The usage of zebra dplane makes the job asyncronous which implies
that a given job will try to add an iptable, while the second job
will not know that its iptable is the same as the former one.
The below exabgp rules stand for two bgp flowspec rules sent to
the bgp device:
flow {
route {match {
source 185.228.172.73/32;
destination 0.0.0.0/0;
source-port >=49156&<=49159;
}then {redirect 213.242.114.113;}}
route {match {
source 185.228.172.73/32;
destination 0.0.0.0/0;
source-port >=49160&<=49163;
}then {redirect 213.242.114.113;}}
}
This rule creates a single iptable, but in fact, the same iptable
name is appended twice. This results in duplicated entries in the
iptables context. This also results in contexts not flushed, when
BGP session or 'flush' operation is performed.
iptables-save:
[..]
-A PREROUTING -m set --match-set match0x55baf4c25cb0 src,src -g match0x55baf4c25cb0
-A PREROUTING -m set --match-set match0x55baf4c25cb0 src,src -g match0x55baf4c25cb0
-A match0x55baf4c25cb0 -j MARK --set-xmark 0x100/0xffffffff
-A match0x55baf4c25cb0 -j ACCEPT
-A match0x55baf4c25cb0 -j MARK --set-xmark 0x100/0xffffffff
-A match0x55baf4c25cb0 -j ACCEPT
[..]
This commit addresses this issue, by checking that an iptable
context is not already being processed. A flag is added in the
original iptable context, and a check is done if the iptable
context is not already being processed for install or uinstall.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Don't rely on the OS interface name length definition and use the FRR
definition instead.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
By changing this API call to use a `struct ipaddr`, which encodes the
type of IP address with it. (And rename/remove the `IPV4` from the
command name.)
Also add a comment explaining that this function call is going to be
obsolete in the long run since pimd needs to move to proper MRIB NHT.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Add support for setting the protodown reason code.
829eb208e8
These patches handle all our netlink code for setting the reason.
For protodown reason we only set `frr` as the reason externally
but internally we have more descriptive reasoning available via
`show interface IFNAME`. The kernel only provides a bitwidth of 32
that all userspace programs have to share so this makes the most sense.
Since this is new functionality, it needs to be added to the dplane
pthread instead. So these patches, also move the protodown setting we
were doing before into the dplane pthread. For this, we abstract it a
bit more to make it a general interface LINK update dplane API. This
API can be expanded to support gernal link creation/updating when/if
someone ever adds that code.
We also move a more common entrypoint for evpn-mh and from zapi clients
like vrrpd. They both call common code now to set our internal flags
for protodown and protodown reason.
Also add debugging code for dumping netlink packets with
protodown/protodown_reason.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
When using wait for install there exists situations where
zebra will issue several route change operations to the kernel
but end up in a state where we shouldn't be at the end
due to extra data being received. Example:
a) zebra receives from bgp a route change, installs sends the
route to the kernel.
b) zebra receives a route deletion from bgp, removes the
struct route entry and then sends to the kernel a deletion.
c) zebra receives an asynchronous notification that (a) succeeded
but we treat this as a new route.
This is the ships in the night problem. In this case if we receive
notification from the kernel about a route that we know nothing
about and we are not in startup and we are doing asic offload
then we can ignore this update.
Ticket: #2563300
Signed-off-by: Donald Sharp <sharpd@nvidia.com>