The imported BFD code had some logic to ignore the source address when
using single hop IPv4. The BFD peer socket function should allow the
source to be selected so we can:
1. Select the source address in the outgoing packets
2. Only receive packets from that specific source
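As a rough illustration of source selection, the sketch below binds a UDP socket to one local IPv4 address. The helper name `udp_socket_bound_to_source` is hypothetical and is not taken from the BFD code; it only shows the general `bind()` technique, under the assumption that the caller passes a valid local address.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not from bfdd): open a UDP socket bound to one
 * local IPv4 address. Returns the descriptor, or -1 on failure. */
int udp_socket_bound_to_source(const char *local_addr, unsigned short port)
{
	struct sockaddr_in sin;
	int sd;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	/* Reject strings that are not valid IPv4 addresses. */
	if (inet_pton(AF_INET, local_addr, &sin.sin_addr) != 1)
		return -1;

	sd = socket(AF_INET, SOCK_DGRAM, 0);
	if (sd == -1)
		return -1;

	/* Binding fixes the source address of outgoing packets and limits
	 * reception to packets addressed to that local address. */
	if (bind(sd, (struct sockaddr *)&sin, sizeof(sin)) == -1) {
		close(sd);
		return -1;
	}
	return sd;
}
```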
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
When an interface does not have a MAC address, don't
try to retrieve the MAC address (only for it to fail).
Example interface:
sharpd@eva [2]> ip link show tun100
21: tun100@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/ipip 192.168.119.224 peer 192.168.119.120
Let's just notice that there is a NOARP flag and abort the call.
Fixes: #11733
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a send time to the BFD Echo packet. When the BFD Echo
packet is received back, store the time it took in usec. When
the user issues a show bfd peer(s) command, calculate and display
the minimum, average, and maximum time it took for the BFD Echo packet
to be looped back.
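The bookkeeping could look roughly like the sketch below. The structure and function names are assumptions made for illustration, not the ones used in bfdd; timestamps are taken to be microsecond counters from the same clock.

```c
#include <stdint.h>

/* Hypothetical per-session echo round-trip statistics, in usec. */
struct echo_rtt_stats {
	uint64_t count;
	uint64_t total_usec;
	uint64_t min_usec;
	uint64_t max_usec;
};

/* Record one looped-back echo packet given its send and receive times. */
void echo_rtt_record(struct echo_rtt_stats *st, uint64_t sent_usec,
		     uint64_t recv_usec)
{
	uint64_t rtt = recv_usec - sent_usec;

	if (st->count == 0 || rtt < st->min_usec)
		st->min_usec = rtt;
	if (rtt > st->max_usec)
		st->max_usec = rtt;
	st->total_usec += rtt;
	st->count++;
}

/* Average round-trip time, or 0 when nothing has been recorded yet. */
uint64_t echo_rtt_avg(const struct echo_rtt_stats *st)
{
	return st->count ? st->total_usec / st->count : 0;
}
```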
Signed-off-by: Lynne Morrison <lynne.morrison@ibm.com>
Until now, when in vrf-lite mode, the BFD implementation
created a single UDP socket and relied on the following
sysctl being set to 1:
echo 1 > /proc/sys/net/ipv4/udp_l3mdev_accept
With this setting, the incoming BFD packets from a given
vrf would leak to the default vrf and would match the
UDP socket.
The drawback of this solution is that UDP packets received
on a given vrf may leak to another vrf. This may be a
security concern.
The commit addresses this issue by avoiding this leak
mechanism. A UDP socket is created for each vrf, and each
socket uses new setsockopt options: SO_REUSEADDR + SO_REUSEPORT.
With these options, the incoming UDP packets are distributed among
the available sockets. The impact of those options with l3mdev
devices is unknown. It has been observed that these options are not
needed until the default vrf sockets are created.
To ensure the BFD packets are correctly routed to the appropriate
socket, a BPF filter has been put in place and attached to the
sockets: SO_ATTACH_REUSEPORT_CBPF. This option adds a criterion
to force the packet to choose a given socket. If the initial criteria
from the default distribution algorithm were not good, at least
two sockets would be available, and the CBPF would force the
selection to the same socket. This would lead to a situation
where an incoming packet would be processed on a different vrf.
The BPF code is as follows:
    struct sock_filter code[] = {
        { BPF_RET | BPF_K, 0, 0, 0 },
    };
    struct sock_fprog p = {
        .len = sizeof(code)/sizeof(struct sock_filter),
        .filter = code,
    };
    if (setsockopt(sd, SOL_SOCKET, SO_ATTACH_REUSEPORT_CBPF, &p, sizeof(p))) {
        zlog_warn("unable to set SO_ATTACH_REUSEPORT_CBPF on socket: %s",
                  strerror(errno));
        return -1;
    }
Some tests have been done by creating vrf contexts, and by using
the below vtysh configuration:
ip route 2.2.2.2/32 10.126.0.2
vrf vrf2
ip route 2.2.2.2/32 10.126.0.2
!
interface ntfp2
ip address 10.126.0.1/24
!
interface ntfp3 vrf vrf4
ip address 10.126.0.1/24
!
interface ntfp2 vrf vrf1
ip address 10.126.0.1/24
!
interface ntfp2.100 vrf vrf2
ip address 10.126.0.1/24
!
interface ntfp2.200 vrf vrf3
ip address 10.126.0.1/24
!
line vty
!
bfd
peer 10.126.0.2 vrf vrf2
!
peer 10.126.0.2 vrf vrf3
!
peer 10.126.0.2
!
peer 10.126.0.2 vrf vrf4
!
peer 2.2.2.2 multihop local-address 1.1.1.1
!
peer 2.2.2.2 multihop local-address 1.1.1.1 vrf vrf2
transmit-interval 1500
receive-interval 1500
!
The results showed no issue related to packets received by
the wrong vrf. Even changing the udp_l3mdev_accept flag to
1 did not change the test results.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Use the destination for the operator `sizeof()` instead of the source,
which could be (and is) bigger than the destination.
We are not truncating any data here; it just happens that the zebra
interface data structure hardware address can be bigger due to the
different types of interface.
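The pattern can be illustrated with the following sketch. The structure names and the 20-byte source size are assumptions chosen for the example and do not reflect the actual zebra or bfdd definitions.

```c
#include <stdint.h>
#include <string.h>

#define SRC_HW_ADDR_MAX 20 /* assumed size of the larger source buffer */
#define ETH_MAC_LEN 6

/* Hypothetical stand-ins for the source and destination structures. */
struct iface_info {
	uint8_t hw_addr[SRC_HW_ADDR_MAX];
};

struct peer_info {
	uint8_t mac[ETH_MAC_LEN];
};

/* Size the copy by the destination so it can never be overrun, even
 * though the source buffer is larger. */
void copy_mac(struct peer_info *dst, const struct iface_info *src)
{
	memcpy(dst->mac, src->hw_addr, sizeof(dst->mac));
}
```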
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Modify the existing BFD Echo code to send an Echo message that will
be looped in the peer's forwarding plane. The existing Echo code
only works with other FRR implementations because the Echo packet
must go up to BFD to be turned around and forwarded back to the
local router. The new BFD Echo code sets the src/dst IP of the
packet to be the local router's IP and sets the dest MAC to be the
peer's MAC address. The peer receives the packet and, because the
destination is not its own IP address, forwards it back to the local router.
Signed-off-by: Lynne Morrison <lynne.morrison@ibm.com>
After two single-hop sessions (*no local addresses are configured*) on two
interfaces are UP, removing one address from one interface brings both
sessions DOWN, not just one, even though they are quite independent.
Consider two boxes: A with addresses `a1` and `a2` on two interfaces,
and B with `b1` and `b2`.
Two sessions are set up and ok: `s1` with <a1,b1> and `s2` with <a2,b2>.
After `a1` of A is removed, there is an unhappy coincidence:
1) On A: `s1` changes its local address and sends <a2,b1> packets with the
help of the route.
2) On B: <a2,b1> packets with a non-zero remote discriminator are wrongly
regarded as part of `s2` and are dropped for a mismatched remote discriminator.
3) On A: `s1` sends <a2,b1> packets with a zero remote discriminator to
initialize this session.
4) On B: <a2,b1> packets with a zero remote discriminator are wrongly regarded
as part of `s2`. Then `s2` flaps.
So the good sessions are overridden.
In this case, the <a2,b1> packets with a zero remote discriminator won't take
effect until the current good sessions become bad.
Since single-hop sessions are allowed to be set up without a bound interface in
the current code, this commit adds one check in `bfd_recv_cb()` to avoid the
wrong override.
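The kind of check described here might be sketched as below. This is an illustration under assumptions, not the code added to `bfd_recv_cb()`: the enum and the function name are hypothetical, and "good session" is taken to mean any state other than Down or AdminDown.

```c
#include <stdbool.h>
#include <stdint.h>

/* Session states, numbered as in RFC 5880. */
enum session_state {
	STATE_ADMIN_DOWN = 0,
	STATE_DOWN = 1,
	STATE_INIT = 2,
	STATE_UP = 3,
};

/* Hypothetical predicate: a packet carrying a zero remote discriminator
 * may only (re)initialize a session that is not currently good. Packets
 * with a non-zero discriminator are left to the other existing checks. */
bool zero_discr_packet_allowed(enum session_state state,
			       uint32_t pkt_remote_discr)
{
	if (pkt_remote_discr != 0)
		return true;
	return state == STATE_DOWN || state == STATE_ADMIN_DOWN;
}
```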
Signed-off-by: anlan_cs <vic.lan@pica8.com>
The "local_address" of bfd is only used in `show bfd peers brief`
for single hop sessions which are configured without "local address".
Since it is set from the destination address of the received packet, it is
not completely correct, so remove it.
Signed-off-by: ewlumpkin <ewlumpkin@gmail.com>
Signed-off-by: anlan_cs <vic.lan@pica8.com>
Since control packets may be dropped by the TTL check, the counter
operation should be placed after all checks, including the TTL check.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
Per RFC 5880 section 6.8.12, the use of a Poll Sequence is not necessary
when the Detect Multiplier is changed. Currently, we update the Detection
Timeout only when a Poll Sequence is terminated, therefore we ignore the
Detect Multiplier change if it's not accompanied by an RX/TX timer change.
To fix the problem, we should update the Detection Timeout on every
received packet.
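For context, the asynchronous-mode Detection Time from RFC 5880 section 6.8.4 is the remote Detect Mult multiplied by the greater of the local Required Min RX Interval and the last received Desired Min TX Interval. A small sketch of that calculation follows; the function and parameter names are chosen for this example only.

```c
#include <stdint.h>

/* Detection Time in usec (RFC 5880 section 6.8.4, asynchronous mode).
 * Recomputing this on every received packet picks up a changed
 * Detect Mult even when no Poll Sequence takes place. */
uint64_t detection_time_usec(uint8_t remote_detect_mult,
			     uint32_t local_required_min_rx_usec,
			     uint32_t remote_desired_min_tx_usec)
{
	uint32_t interval = local_required_min_rx_usec;

	if (remote_desired_min_tx_usec > interval)
		interval = remote_desired_min_tx_usec;

	return (uint64_t)remote_detect_mult * interval;
}
```

With 300 ms intervals on both sides, raising the remote multiplier from 3 to 5 moves the timeout from 900 ms to 1500 ms.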
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Since f60a1188 we store a pointer to the VRF in the interface structure.
There's no need anymore to store a separate vrf_id field.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
We get the pointer to the interface on which the packet was received
right at the beginning of bfd_recv_cb. So let's use this pointer and
not perform additional interface lookups.
Also explain in more detail how we process VRF id with different
backends.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Currently there is a single interval for both RX and TX echo functions.
This commit introduces separate RX and TX timers for echo packets.
The main advantage is to be able to set the receive interval to zero
when we don't want to receive echo packets from the remote system.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Valgrind reports:
2052866-==2052866==
2052866-==2052866== Syscall param sendmsg(msg.msg_name) points to uninitialised byte(s)
2052866:==2052866== at 0x49C8E13: sendmsg (sendmsg.c:28)
2052866-==2052866== by 0x11DC08: bp_udp_send (bfd_packet.c:823)
2052866-==2052866== by 0x11DD76: ptm_bfd_echo_snd (bfd_packet.c:179)
2052866-==2052866== by 0x114C2D: ptm_bfd_echo_xmt_TO (bfd.c:469)
2052866-==2052866== by 0x114C2D: ptm_bfd_echo_start (bfd.c:498)
2052866-==2052866== by 0x114C2D: bs_echo_timer_handler (bfd.c:1199)
2052866-==2052866== by 0x11E478: bfd_recv_cb (bfd_packet.c:702)
2052866-==2052866== by 0x4904846: thread_call (thread.c:1681)
2052866-==2052866== by 0x48CB4DF: frr_run (libfrr.c:1126)
2052866-==2052866== by 0x113044: main (bfdd.c:403)
2052866-==2052866== Address 0x1ffefff3e8 is on thread 1's stack
In ptm_bfd_echo_snd, for the v4 case we were memsetting the v6 memory
then setting the v4 memory. Just fix it.
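A sketch of the corrected pattern, assuming a helper that prepares the IPv4 destination before `sendmsg()`. The name `fill_v4_dest` is hypothetical; the point is that the structure being zeroed is the same one that is then filled in and passed as `msg_name`.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: zero and fill the IPv4 sockaddr that will be used
 * as msg_name, and return the length to use as msg_namelen. */
socklen_t fill_v4_dest(struct sockaddr_in *sin, struct in_addr addr,
		       unsigned short port)
{
	/* Zero the v4 structure itself, not an unrelated v6 one. */
	memset(sin, 0, sizeof(*sin));
	sin->sin_family = AF_INET;
	sin->sin_addr = addr;
	sin->sin_port = htons(port);
	return sizeof(*sin);
}
```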
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In a vrf-lite environment, all incoming BFD packets are received by the
same socket in the default namespace. The vrfid is not relevant and
needs to be updated based on the incoming interface where the traffic has
been received. If the traffic is received from an interface belonging to
a separate vrf, update the vrfid value accordingly.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When running in vrf-lite mode, the socket used in a vrf environment
should be bound to an interface belonging to the vrf. If none is
selected, then the socket should be bound to the vrf interface itself,
so that the routing rules for that vrf are applied to outgoing packets.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Experimental patch to allow us to discuss if we should
allow bfdd to work when v6 is turned off in the kernel.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The initial BFD protocol implementation had a hard-coded maximum of 5
hops; now the hop count is configurable, with a safe default of 1
hop.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Move most of the log messages to debug guards so they only get activated
if the user configured the proper debug level.
Current debug levels:
- Peer events.
- Zebra events.
- Network layer debugs.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Let's avoid garbage data on packets by zeroing the packet before setting
the fields/flags.
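A minimal sketch of the idea, using a hypothetical structure that follows the 24-byte mandatory section of the RFC 5880 control packet. The field and function names are not taken from bfdd, and byte-order conversion is left out for brevity.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout of the mandatory control packet section. */
struct ctrl_pkt {
	uint8_t vers_diag; /* version (3 bits) and diagnostic (5 bits) */
	uint8_t flags;
	uint8_t detect_mult;
	uint8_t length;
	uint32_t my_discr;
	uint32_t your_discr;
	uint32_t desired_min_tx;
	uint32_t required_min_rx;
	uint32_t required_min_echo_rx;
};

/* Zero the whole packet first so that any field not set explicitly is
 * sent as 0 instead of leftover stack or heap contents. */
void ctrl_pkt_init(struct ctrl_pkt *cp, uint8_t detect_mult,
		   uint32_t my_discr)
{
	memset(cp, 0, sizeof(*cp));
	cp->vers_diag = 1 << 5; /* version 1, diagnostic 0 */
	cp->detect_mult = detect_mult;
	cp->length = sizeof(*cp);
	cp->my_discr = my_discr; /* host byte order in this sketch */
}
```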
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Always bind the created sockets to their respective VRF devices. With
this it should be possible to run BFD on VRFs without needing to weaken
the security setting `net.ipv4.udp_l3mdev_accept=1`.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
The BFD cbit is a value carried in BFD messages that signals whether or
not independence between control plane and dataplane is kept. In other
words, while in most cases entries are flushed when BFD goes
down, there are some cases where that BFD event should be ignored. This
is the case with non-stop forwarding mechanisms, where entries may be
kept, for example with BGP when the graceful restart capability is
used. If a BFD down event happens while BGP is in graceful restart mode,
the BFD event should be ignored while waiting for the remote router to
restart.
The changes take into account the following:
- add a config flag across zebra layer so that daemon can set or not the
cbit capability.
- ability for daemons to read the remote bfd capability associated to a bfd
notification.
- in bfdd, according to the value, the cbit value is set
- in bfdd, the received value is retrieved and stored in the bfd session
context.
- by default, the local cbit announced to remote is set to 1 while
preservation of the local path is not set.
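In the RFC 5880 control packet, the Control Plane Independent (C) bit is bit 0x08 of the flags byte that also carries the session state in its two top bits. Setting and reading it can be sketched as follows; the macro and function names are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* C bit position in the flags byte (Sta:2, P, F, C, A, D, M). */
#define BFD_FLAG_CBIT 0x08

/* Return the flags byte with the C bit set or cleared. */
uint8_t flags_set_cbit(uint8_t flags, bool enabled)
{
	if (enabled)
		return (uint8_t)(flags | BFD_FLAG_CBIT);
	return (uint8_t)(flags & ~BFD_FLAG_CBIT);
}

/* Read the C bit announced by the remote end. */
bool flags_get_cbit(uint8_t flags)
{
	return (flags & BFD_FLAG_CBIT) != 0;
}
```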
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This structure contains the bfdd_privs structure in charge of the
privilege settings. The initialisation has moved slightly so that
the preinit settings are done.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When vrf-lite is used, it is possible to set SO_BINDTODEVICE by
using the vrf_socket() call.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add the address family to the sockaddr structure otherwise `sendmsg`
will fail with `EAFNOSUPPORT`.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Use a simpler data structure key to avoid having to do complex and
error-prone key building (e.g. avoid expecting caller to know IPv6
scope id, interface index, vrf index etc...).
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>