The way thread.c is written, a caller who wishes to be able to cancel a
thread or avoid scheduling it twice must keep a reference to the thread.
Typically this is done with a long lived pointer whose value is checked
for null in order to know if the thread is currently scheduled. The
check-and-schedule idiom is so common that several wrapper macros in
thread.h existed solely to provide it.
This patch removes those macros and adds a new parameter to all
thread_add_* functions which is a pointer to the struct thread * to
store the result of a scheduling call. If the value passed is non-null,
the thread will only be scheduled if the value is null. This helps with
consistency.
A Coccinelle spatch has been used to transform code of the form:
if (t == NULL)
t = thread_add_* (...)
to the form
thread_add_* (..., &t)
The THREAD_ON macros have also been transformed to the underlying
thread.c calls.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Rename HAVE_POLL to HAVE_POLL_CALL, when compiling with
snmp and poll enabled this was causing issues.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This change adds three fields to thread_master and associated code to
use them. The fields are:
* long selectpoll_timeout
This is a millisecond value that, if nonzero, will override the
internally calculated timeout for select()/poll(). -1 indicates
nonblocking while a positive value indicates the desired timeout in
milliseconds.
* bool spin
This indicates whether a call to thread_fetch() should result in a loop
until work is available. By default this is set to true, in order to
keep the default behavior. In this case a return value of NULL indicates
that a fatal signal was received in select() or poll(). If it is set to
false, thread_fetch() will return immediately. NULL is then an
acceptable return value if there is no work to be done.
* bool handle_signals
This indicates whether or not the pthread that owns the thread master
is responsible for handling signals (since this is an MT-unsafe
operation, it is best to have just the root thread do it). It is set to
true by default. Non-root pthreads should set this to false.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Fixes a few insufficient critical sections. Adds back locking for
thread_cancel(), since while thread_cancel() is only safe to call from
the pthread which owns the thread master due to races involving
thread_fetch() modifying thread master's ready queue, we still need
mutual exclusion here for all of the other public thread.c functions to
maintain their MT-safety.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This change introduces synchronization mechanisms to thread.c in order
to allow safe concurrent use.
Thread.c should now be threadstafe with respect to:
* struct thread
* struct thread_master
Calls into thread.c for operations upon data of this type should not
require external synchronization.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
In the case where we are using select as
the operator *and* we call
funcname_thread_add_read_write *and* the
fd is already set, we would overwrite
the read/write direction to always be READ.
Clearly this was a bad idea.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Since time is no longer cached, if we schedule something with zero
timeout, it will automatically be negative by the time we reach the
event loop.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
monotime_since() does exactly the same thing.
... and timeval_elapsed is now private to lib/thread.c
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Fix the display of 'show thread cpu' to keep track
of the number of active threads and to display that
information.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
lib: Fix thread_execute_crash
This commit does these things:
1) Make thread_add_unuse own the setting of THREAD_UNUSED.
2) Move thread->hist finding to to thread_get.
We are storing the thread->hist even when the thread
is on the unused. This means that we check to see
if the funcname or func have changed and we get new
history. Else we've probably just retrieved the last
unused which has the same func/funcanme. This is
a common practice to do THREAD_OFF/THREAD_ON in
quick succession.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com.
This moves all install_element calls into the file where the DEFUNs are
located. This fixes several small related bugs:
- ospf6d wasn't installing a "no interface FOO" command
- zebra had a useless copy of "interface FOO"
- pimd's copy of "interface FOO" was not setting qobj_index, which means
"description LINE" commands would fail with an error
The next commit will do the actual act of making "foo_cmd" static.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Since we have autoconf results from a wide swath of target platforms, we
can go remove checks that have the same result on all systems.
This also removes several "fallback" implementations of functions that,
at some point in the history, weren't available on all target platforms.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
When displaying thread time for long running/busy
protocols, the space allocated may not be sufficient.
Allow the runtime to take a bit more space.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This is a rather large mechanical commit that splits up the memory types
defined in lib/memtypes.c and distributes them into *_memory.[ch] files
in the individual daemons.
The zebra change is slightly annoying because there is no nice place to
put the #include "zebra_memory.h" statement.
bgpd, ospf6d, isisd and some tests were reusing MTYPEs defined in the
library for its own use. This is bad practice and would break when the
memtype are made static.
Acked-by: Vincent JARDIN <vincent.jardin@6wind.com>
Acked-by: Donald Sharp <sharpd@cumulusnetworks.com>
[CF: rebased for cmaster-next]
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Signed-off-by: Christian Franke <chris@opensourcerouting.org>
The regular expression for finding DEFUN/ALIAS in
extract.pl looks for "DEFUN (" or "ALIAS (" if
the *.c file does not have this then it will just
silently ignore the cli.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
AgentX fd/timeout handling is rather hackishly monkeyed into thread.c.
Replace with code that uses plain thread_* functions.
NB: Net-SNMP's API rivals Quagga's in terms of age and absence of
documentation. netsnmp_check_outstanding_agent_requests() in particular
seems to be unused and is therefore untested.
The most useful documentation on this is actually the blog post Vincent
Bernat wrote when he originally integrated this into lldpd and Quagga:
https://vincent.bernat.im/en/blog/2012-snmp-event-loop.html
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Another zoo extension, this adds a timer scheduling function that takes
a struct timeval argument (which is actually what the wrappers boil down
to, yet it's not exposed...)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
QUAGGA_CLK_REALTIME and QUAGGA_CLK_REALTIME_STABILISED aren't used
anywhere in the code. Remove. The enum is kept to avoid having to
change the calls everywhere.
Same applies to the workaround code for systems that don't have a
monotonic clock. None of the systems Quagga works on fall into that
category; Linux, BSD and Solaris all do clock_gettime, for OSX we have
mach_absolute_time() - that covers everything.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
ospf->lsa_refresher_started is only used in relative timing to itself;
replace with monotonic clock which is appropriate for this.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
- HAVE_POLL is overloaded by net-snmp
- missing includes
- ospf6_snmp converted to vrf_iflist()
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Two Fixes:
1) When a fd has both read and write as a .events.
(POLLHUP | POLLIN | POLLOUT) and a
thread_cancel_read_write call is executed
from a protocol, the code was blindly removing
the fd from consideration at all.
2) POLLNVAL was being evaluated before POLLIN|POLLOUT
were being evaluated. While I didn't see a case
of POLLNVAL being included with other .revent flags
I decided to move the POLLNVAL and POLLHUP handling
to the same section of code.
Additionally the function thread_cancel_read_write
was poorly named and let me to poorly implement
the poll version of it. I've renamed the function
thread_cancel_read_or_write in an attempt to
make this problem moot in the future.
Ticket: CM-11027
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
(cherry picked from commit f6da66a913)
now that we know what thread we're currently executing, let's add that
information to SEGV / assert backtraces.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit 615f9f18fc025757a255f936748fc1e86e922783)
the library's thread scheduling functions keep track of the thread
function's name, so far so good. However, copying the compiler-provided
constant into a buffer inside the thread structure is plain useless.
Also, strip_funcname() was trying to support something that never
happens.
Instead, let's use some bytes here to track where threads are scheduled
from. Another commit will print that information on crashes.
Ripping out useless stuff: -64 bytes in the thread structure
Re-add as const ptr: +8 bytes
Extra debug info: +12 bytes
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
(cherry picked from commit 3493b7731b750cbc62f00be94b624a08ccccf0b2)
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
poll returns the number of revents that we need to handle
in the array. revent is a bit field of events that need
to be handled. thread.c was treating each sub item in the
bitfield as a separate item to handle.
As such the loop over the pollfds would quit early
sometimes.
Ticket: CM-10077
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
The pollfds was being resized if the # of fds grew to
be more than the original array size. Just size it
once.
Ticket: CM-10077
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>