The client socket value can only be modified by the main thread.
Modifying the client socket from within the client I/O pthread
introduces race conditions.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Socket should be closed in zserv_client_free() and nowhere else.
Credit to Mark Stapp for catching this one.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Rename some things to be less confusing
* Convert client close function to take a client struct rather than a
task
* Extern client close function and use it when handling SIGTERM
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The fuzzing code was calling zebra_client_create which was refactored to zserv_client_create.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
While ZAPI I/O threads make a best effort to kill any scheduled tasks on
their threadmasters, after death another pthread can continue to
schedule onto the threadmaster. This isn't a problem per se since the
tasks will never run, but it also means that asserting that it hasn't
happened is pointless.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
When in a dev build add a bit of code to track max
depth of a fifo and to allow zebra to report on it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Cleanup the zebra code to test for failure for reading
from stream once instead of once to see if we should
debug and once for the actual failure.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
I mistakenly used an external mechanism to cause a pthread to shut
itself down instead of using the one built into frr_pthread.[ch]. This
created a race condition whereby a pthread could schedule work onto a
dead pthread and cause it to reanimate.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Coalesce multiple write() syscalls into one
* Write larger chunks
* Decrease default read limit to 1000
* Remove unnecessary operations from hot loop (zserv_write)
* Move cross-schedule out of obuf lock
* Use atomic ops to update atomic variable
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Cancelling threads is nice but they can potentially be scheduled again
after cancellation without an explicit check.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Only one I/O task can be scheduled per file descriptor. Having two
separate tasks for buffer filling and buffer flushing was breaking that
invariant and causing messages to never be written.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Separate flush task from write task, so we can continue adding to the
write buffer while it's waiting to flush
* Handle write errors sooner rather than later
* Only schedule a process job if we have packets to process
* Tweak zserv_process_messages to not reschedule itself and rely on
zserv_read() to do so in all proper cases
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Label manager reaches its hands into session / IO code for zserv for
whatever reason, gotta handle that.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Simplify zapi_msg <-> zserv interaction
* Remove header validity checks, as they're already performed before the
packet ever makes it here
* Perform the same kind of batch processing done in zserv_write by
copying multiple inbound packets under lock instead of doing serial
locking
* Perform self-scheduling under the same lock
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Dequeue all pending messages when writing and push them all into the
write buffer. This removes the necessity to self-schedule, avoiding a
mutex lock, and should also maximize throughput by not writing 1 packet
per job.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Increase the maximum number of packets to read per read job
* Store read packets in a local cached buffer to avoid mutex overhead
* Only update last-read time / last-command if we actually read a packet
* Add missing log line for corrupt header case
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Add centralized thread scheduling dispatchers for client threads and
the main thread
* Rename everything in zserv.c to stop using a combination of:
- zebra_server_*
- zebra_*
- zserv_*
Everything in zserv.c now begins with zserv_*.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Since it is already quite difficult to understand the various pieces
going on here, I reorganized the file to make it much cleaner and easier
to understand. The organization is now:
zserv.c:
,---------------------------------.
/ include statements |
| ... |
| ... |
| -------------------------------- |
| Client pthread server functions |
| ... |
| ... |
| -------------------------------- |
| Main pthread server functions |
| ... |
| ... |
| -------------------------------- |
| CLI commands, other |
| ... |
| ... |
\_________________________________/
No code has been changed; the functions have merely been moved around.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Time counters need to use atomic access between threads
* After a client disconnects, we properly kill the thread but need to
free its frr_pthread as well
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Add doc comments explaining hairy bits of thread lifecycle
* Remove t_suicide as it no longer makes sense
* Remove client double-free
* Remove unnecessary THREAD_OFF being used in incorrect pthread context
* Eliminate unnecessary racey access to client's obuf_fifo
* Ensure zserv_process_messages() reschedules itself if it has not
finished its work
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Rename client_connect and client_close hooks to zapi_client_connect
and zapi_client_close
* Remove some more unnecessary headers
* Fix a copy-paste error in zapi_msg.[ch] header comments
* Fix an inclusion comment in zserv.c
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
zserv.c was using hardcoded callbacks to clean up various components
when a client disconnected. Ergo zserv.c had to know about all these
unrelated components that it should not care about. We have hooks now,
let's use the proper thing instead.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
zserv.c has become something of a dumping ground for everything vaguely
related to ZAPI and really needs some love. This change splits out the
code fo building and consuming ZAPI messages into a separate source
file, leaving the actual session and client lifecycle code in zserv.c.
Unfortunately since the #include situation in Zebra has not been paid
much attention I was forced to fix the headers in a lot of other source
files. This is a net improvement overall though.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Zebra is starting to have some run-time capabilites that would be
useful to pass up to the higher level protocols so that they
can act in an appropriate manner when needed.
Send the ecmp value zebra is being run with and whether or not
we believe mpls is enabled in the kernel or not.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In order to avoid duplicates functions, the zebra_pbr_rule structure
used by zebra to decode the zapi message, and send netlink messages, is
slightly modified. the structure is derived from pbr_rule, but it also
includes sock identifier that is used to send back information to the
daemon that did the request. Also, the ifp pointer is stored in that
structure.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Those messages permit a remote daemon to configure an iptable entry. A
structure is defined that maps to an iptable entry. More specifically,
this structure proposes to associate fwmark, and a table ID.
Adding to the configuration, the initialisation of iptables hash list is
done into zebra netnamespace. Also a hook for notifying the sender that
the iptables has been correctly set is done.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
PBR rule is being added a 32 bit value that can be used to record a rule
in the kernel, by using a fwmark information.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Once ipset entries are injected in the kernel, the relevant daemon is
informed with a zebra message sent back.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
ZEBRA IPSET defines are added for creating/deleting ipset contexts.
Ans also create ipset hash sets.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The header length needs to be subtracted from the handling
side of the zapi in zebra. This is because we refigure the
header data structure. The receive side doesn't care
about the total header length so no need to subtract there.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In BGP, doing policy-routing requires to use table identifiers.
Flowspec protocol will need to have that. 1 API from bgp zebra has been
done to get the table chunk.
Internally, onec flowspec is enabled, the BGP engine will try to
connect smoothly to the table manager. If zebra is not connected, it
will try to connect 10 seconds later. If zebra is connected, and it is
success, then a polling mechanism each 60 seconds is put in place. All
the internal mechanism has no impact on the BGP process.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This commit is connecting the table manager with remote daemons by
handling the queries.
As the function is similar in many points with label allocator, a
function has been renamed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The following types are nonstandard:
- u_char
- u_short
- u_int
- u_long
- u_int8_t
- u_int16_t
- u_int32_t
Replace them with the C99 standard types:
- uint8_t
- unsigned short
- unsigned int
- unsigned long
- uint8_t
- uint16_t
- uint32_t
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
When a route_delete is received allow the deletion
to occur in the passed in tableid if the vrf is VRF_DEFAULT.
This now matches route_add behavior in rib_add_multipath
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ensure that we have properly decoded the zapi_route sent to us
and if we cannot decode, log and move on.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When figuring out whom to call and if we actually can legally
call into the handler array actually use the number of elements
in the array instead of the size of the array.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When zebra detects that the originator has dissapeared
delete all rules associated with that client.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There were several places where when I am attempting
to debug zebra functionality that I would really
like to have the ability to know what vrf I think
I am operating on.
Add the vrf_id to a bunch of zlog_debug messages
to help figure out issues when they happen.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Group send and receive functions together, change handlers to take a
message instead of looking at ->ibuf and ->obuf, allow zebra to read
multiple packets off the wire at a time.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
A lot of the handler functions that are called directly from the ZAPI
input processing code take different argument sets where they don't need
to. These functions are called from only one place and all have the same
fundamental information available to them to do their work. There is no
need to specialize what information is passed to them; it is cleaner and
easier to understand when they all accept the same base set of
information and extract what they need inline.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Formalize the ZAPI header by documenting it in code and providing it to
message handlers free of charge to reduce complexity.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
All of the ZAPI message handlers return an integer that means different
things to each of them, but nobody ever reads these integers, so this is
technical debt that we can just eliminate outright.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Every place we need to pass around the rule structure
we need to pass around the ifp as well. Move it into
the structure. This will also allow us to notify up
to higher level protocols that this worked properly
or not better too.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Keep track of rules written into the kernel. This will
allow us to delete them on shutdown if we are not cleaned
up properly.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Allow the add/delete to go through a intermediary function in
zebra_pbr.c instead of directly to the underlying os call. This
will allow future refinements to track the data a bit better
so that on shutdown we can delete the rules.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
1) use uint32_t instead of u_int32_t as we are supposed to
2) Consolidate priority into the rule.
3) Cleanup the api from this.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Also modify `struct route_entry` to use nexthop_groups.
Move ALL_NEXTHOPS loop to nexthop_group.h
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Allow the calling daemon to pass down what table-id we
want to use to install the route. Useful for PBR.
The vrf id passed must be the VRF_DEFAULT else this
value is ignored.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When decoding and creating the appropriate data structures
for a nexthop, use the passed in vrf.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Implement support for EVPN symmetric routing for IPv6 routes. The next hop
for EVPN routes is the IP address of the remote VTEP which is only an IPv4
address. This means that for IPv6 symmetric routing, there will be IPv6
destinations with IPv4 next hops. To make this work, the IPv4 next hops are
converted into IPv4-mapped IPv6 addresses.
As part of support, ensure that "L3" route-targets are not announced with
IPv6 link-local addresses so that they won't be installed in the routing
table.
Signed-off-by: Vivek Venkatraman vivek@cumulusnetworks.com
Reviewed-by: Mitesh Kanjariya mitesh@cumulusnetworks.com
Reviewed-by: Donald Sharp sharpd@cumulusnetworks.com
The addition of the name of the netns in the vrf message introduces also
a limitation when the size of the netns is bigger than 15 bytes. Then
the netns are ignored by the library.
In addition to this, some sanity checks have been introduced. some
functions to create the netns from a call not coming from the vty is
being added with traces.
Also, the ns vty function is reentrant, if the context is already
created.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Add a bit more detail to tell us what we are sending
up to a protocol so we can debug it better in the
future.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
If you were to configure a v4 and v6 vrf pop and forward label
that both happened to be the same, unconfiguring one would
remove them both.
This fixes that issue by noticing if we should remove it or
not based upon v4 or v6 having the same label or not.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add the ability to pass in an afi to zebra. zebra_vrf keeps
track of the afi/label tuple and then does the right thing
before we call down. AF_MPLS does not care about v4 or v6
it just knows label and what device to use for lookup.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Modify mpls.h to rename MPLS_LABEL_ILLEGAL to be MPLS_LABEL_NONE.
Fix all pre-existing code that used MPLS_LABEL_ILLEGAL.
Modify the zapi vrf label message to use MPLS_LABEL_NONE as the
signal to remove label associated with a vrf.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add the ability to pass the lsp owner type through the zapi
and in addition add a new label type for the sharp protocol
for testing.
Finally modify zebra_mpls.h to not have defaults specified
for the enum. That way when we add a new LSP type the
compile fails and the person doing the addition knows
where he has to touch shit.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Turns out we had 3 different ways to define labels
all of them overlapping with the same meanings.
Consolidate to 1. This one choosen is consistent
naming wise with what the *bsd and linux kernels
use.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
For L3VPN's we need to create a label associated with the specified
vrf to be installed into the kernel to allow a pop and lookup
operation.
The new api is:
zclient_send_vrf_label(struct zclient *zclient, vrf_id_t vrf_id,
mpls_label_t label);
For the specified vrf_id associate the specified label for
a pop and lookup operation for forwarding.
To setup a POP and Forward use MPLS_LABEL_IMPLICIT_NULL
If the same label is passed in we ignore the call.
If the label is different we update entry.
If the label is MPLS_LABEL_NONE we remove
the entry.
This sets up the api. Future commits will have the functionality
to actually install into the kernel.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In EVPN symmetric routing, not all subnets are presents everywhere.
We have multiple scenarios where a host might not get learned locally.
1. GARP miss
2. SVI down/up
3. Silent host
We need a mechanism to resolve such hosts. In order to achieve this,
we will be advertising a subnet route from a box and that box will help
in resolving the ARP to such hosts.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
The function zserv_create_header was exactly the same
as zclient_create_header. Let's just have one in the
system.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This is a preparatory work for configuring vrf/frr over netns
vrf structure is being changed to 32 bit, and the VRF will have the
possibility to have a backend made up of NETNS.
Let's put some history.
Initially the 32 bit was because one wanted to map on vrf_id both the
VRFLITE and the NSID.
Initially, one would have liked to make zebra configure at the same time
both vrf lite and vrf from netns in a flat way. From the show
running perspective, one would have had both kind of vrfs, thatone
would configure on the same way.
however, it leads to inconsistencies in concepts, because it mixes vrf
vrf with vrf, and vrf is not always mapped with netns.
For instance, logical-router could also be used with netns. In that
case, it would not be possible to map vrf with netns.
There was an other reason why 32 bit is proposed. this is because
some systems handle NSID to 32 bits. As vrf lite exists only on
Linux, there are other systems that would like to use an other vrf
backend than vrf lite. The netns backend for vrf will be used for that
too. for instance, for windows or freebsd, some similar
netns concept exists; so it will be easier to reuse netns
backend for vrf, than reusing vrflite backend for vrf.
This commit is here to extend vrf_id to 32 bits. Following commits in a
second step will help in enable a VRF backend.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
With VRF route-leaking we need to know what vrf
the nexthops are in compared to this vrf. This
code adds the nh_vrf_id to the route entry and
sets it up correctly for the non-route-leaking
case.
The assumption here is that future commits
will make the nh_vrf_id *different* than
the vrf_id.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>