build: fix several issues with building tests
- MAINTAINERCLEANFILES should not rely on conditionals
that could or could not clean files.
- EXTRA_DIST should not rely on conditionals that could
or could not add files to the final tarball.
sources should always ship.
- CLEANFILES should not rely on conditionals as
./configure can be done in between builds leaving
stray files around.
- (cosmetic) move distclean-local: target with clean-local.
- drop old ipc_sock.test, start.test and resources.test
shell files.
- fix make distcheck -j:
- stop shipping or not shipping libstat_wrapper.so.
libtool will only generate the .so when installing
a shared library (--enable-install-tests).
- make libstat_wrapper a module in a similar fashion
to _failure_injection.
- build ipc_sock.test in a similar fashion as ipc.test
and link as module _libstat_wrapper.la.
this solves multiple issues of having the binary
in the final test builddir, no need to detect if
libstat_wrapper.so is installed or not and workaround
libtool different linking methods for inst vs noinst
libraries.
- fix ipc.test linking with GLIB that should not be
dependent on HAVE_FAILURE_INJECTION.
Run tests in parallel with dependencies
Make sure the two IPC tests use different socket names
Shorten some names so they fit with the new ipc-names
remove ipc-test-name-sock
Fix resources.test now that ipc_sock is being run properly
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
* test: Clean /dev/shm a bit better
This isn't perfect, but it does tidy more of /dev/shm than
previously. Because some of the tests leave empty directories
we have no way of telling (in resources.test) whether they
belong to this test run, another test run, or a running
application.
* unix: Don't fail on FreeBSD running ZFS
ZFS doesn't support posix_fallocate() so libqb IPC or RB would
always fail with EINVAL.
As there seems to be no prospect of a more useful return code,
trap it in a QB_BSD #ifdef. That way if we do have actual errors
in the posix_fallocate() call the Linux tests should still find them.
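A minimal sketch of the trap described above, not the code from unix.c.
The helper name reserve_file_space() and the ftruncate() fallback are
assumptions made for illustration; only QB_BSD and the EINVAL behaviour
come from this message.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: reserve `len` bytes for fd.
 * Returns 0 on success or a negative errno value on failure. */
static int reserve_file_space(int fd, off_t len)
{
	/* posix_fallocate() returns the error number directly and does
	 * not set errno. */
	int res = posix_fallocate(fd, 0, len);

#ifdef QB_BSD
	/* ZFS on FreeBSD answers EINVAL for "not supported". Assumed
	 * fallback: give the file its size with ftruncate() instead. */
	if (res == EINVAL) {
		res = (ftruncate(fd, len) == 0) ? 0 : errno;
	}
#endif
	return -res;
}
```

Without QB_BSD defined, every posix_fallocate() error is still
reported, which is what lets the Linux tests catch real failures.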
Also, stick a small sleep in the test_ipc_disconnect_after_created
test to allow the server to shut down before killing it with SIGTERM
and causing a test failure. All the other uses of it seem to have this
sleep!
* Tidy some scripts
Errors reported by CentOS covscan
I changed %N to %s as BSD's date command doesn't support %N.
Seconds + PID should be enough ....
* Shrink the name of the dlock tests as they cause random failures
When the PID numbers get big, the socket name overflows the allowed
limit
* Increase timeout of thread check.
It's been seen to time out too early and fail the tests
* ipc: add qb_ipcc_auth_get() API call
We can't use SO_PEERCRED on the client fd when using socket IPC
because it's a DGRAM socket (pacemaker tries this). So provide
an API to get the server credentials that libqb has already
squirreled away for its own purposes.
Also, fix some unused-variable compiler warnings in unix.c
when building on systems without posix_fallocate().
On newer Fedora systems that can have 32 bit PIDs, these long test
names can get truncated in the libqb internal buffers and thus break the
tests, so I've shortened the names.
After much discussion on IRC it was decided that 70000 iterations
of the stress patch didn't achieve anything significant over a
reasonable but smaller number. So it has been reduced to 5000 on
all platforms.
This patch also fixes a bug where test_ipc_disconnect_after_created
committed a use-after-free which could cause a crash on FreeBSD-devel.
In particular, qb_ipcs_rate_limit() needs to be outside the
"#ifdef HAVE_GLIB" conditional, since it gets used regardless.
This should have been like this as of 28e7259.
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
Compared to the outer world, libqb brings a rather unintuitive approach
to priorities within a native event loop (qbloop.h) -- it doesn't do
an exhaustive high-to-low priorities in a batched (clean-the-level)
manner, but rather linearly adds a possibility to pick the handling
task from the higher priority level as opposed to lower priority ones.
This has the advantage of limiting the chances of starvation and
deadlock opportunities in the incorrectly constructed SW, on the other
hand, it means that libqb is not fulfilling the architected intentions
regarding what deserves a priority truthfully, so these priorities are
worth just a hint rather than urgency-based separation.
And consequently, a discovery of these deadlocks etc. is deferred to
the (as Murphy's laws have it) least convenient moment, e.g., when
said native event loop is exchanged for other (this time priority
truly abiding, like GLib) implementation, while retaining the same
basic notion and high-level handling of priorities on libqb
side, in IPC server (service handling) context.
Hence, demonstration of such a degenerate blocking is not trivial,
and we must defer to such other event loop implementation. After this
hassle, we are rewarded with a practical proof that said "high-level
handling [...] in IPC server (service handling) context" contains
a bug (which we are going to subsequently fix) -- this is contrasted
with libqb's native loop implementation that works just fine even
prior to that fix.
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
This way, this core part can be easily reused where needed.
Note that "ready_signaller" similarity with run_ipc_server is not
accidental, following commit will justify it.
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
Roles specifications are currently not applied and are rather
a preparation for the actual meaningful use to come.
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
Using i7-6820HQ CPU yields these results:
Before: ~2:54
After: ~2:26
Speedup: ~16%
The main optimization lies in how run_function_in_new_process helper is
constructed, since now, there's an actual synchronization between the
parent and its child (that needs to be prioritized here, which
furthermore helps with making the parent immediately give up its
processor possession) after the fork, so that a subsequent sleep is
completely omitted -- at worst (unlikely), additional sleep round(s)
will need to be undertaken as already arranged for (and now, just
400 ms is waited rather than excessive 1 second).
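To illustrate the idea of replacing a fixed sleep with an explicit
parent/child handshake, here is a rough sketch. It is not the code in
check_ipc.c: the pipe-based handshake is a swapped-in technique chosen
for brevity, and spawn_and_wait_ready() and example_child_work() are
hypothetical names.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical child payload used only for this sketch. */
static void example_child_work(void)
{
}

/* Fork, run fn() in the child, and return the child's PID only once
 * the child has reported that it is running. Returns -1 on error. */
static pid_t spawn_and_wait_ready(void (*fn)(void))
{
	int ready_pipe[2];
	char token = 'r';
	pid_t pid;

	if (pipe(ready_pipe) != 0) {
		return -1;
	}
	pid = fork();
	if (pid == -1) {
		close(ready_pipe[0]);
		close(ready_pipe[1]);
		return -1;
	}
	if (pid == 0) {
		/* Child: announce readiness, then do the real work. */
		close(ready_pipe[0]);
		if (write(ready_pipe[1], &token, 1) != 1) {
			_exit(1);
		}
		close(ready_pipe[1]);
		fn();
		_exit(0);
	}
	/* Parent: the blocking read gives up the CPU until the child has
	 * written its token, so no fixed sleep is needed. */
	close(ready_pipe[1]);
	if (read(ready_pipe[0], &token, 1) != 1) {
		close(ready_pipe[0]);
		waitpid(pid, NULL, 0);
		return -1;
	}
	close(ready_pipe[0]);
	return pid;
}
```

The caller still has to reap the child with waitpid() once it is done
with it.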
Another slight optimization is likewise in omission of sleep where
the control gets returned to once the waited for process has been
successfully examined post-mortem, without worries its previous
life is still resounding.
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
There's some slight reserve for when bigger PID ranges are in use.
The method to yield the limit on prefix string was derived from
practical experience (rather than based on exact calculations).
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
* ipc_shm: Don't truncate SHM files of an active server
I've put in an extra check so that clients don't truncate the
SHM file if the server still exists. Sadly on FreeBSD we can't
get the server PID for the client (unless someone has a patch handy!)
so we still do the truncate when disconnected. As a backstop (and also
to cover the BSD issue) I've added a SIGBUS trap to the server shutdown
so that it doesn't cause a server crash.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
* tests: Improve test isolation
Make all the IPC tests run with a common date/pid stamp name, so that
the final resources.test only fails if it finds one of OUR files left
lying around and not those from another test.
Falls back to old IPC naming style if we can't create the file.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Looks like these are not accepted with splint checker. Also fix some
other minor type -- print format specifier discrepancies.
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
examples/tests: make qb logging dispose the memory
A.k.a. "be a good example of using this very library".
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
* Low: log: prevent static vs. implicit non-static declaration clash
...of qb_log_callsites_dump_sect, that could happen when its usage
in qb_log_callsites_register was uncommented.
* Low: tests: fix duplicate "const" declaration specifier
This is a follow-up for d69cc7b (making the pointer itself constant was
meant as a self-defense, no-overwrite measure).
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
* IPC: Allow filesystem sockets to be chosen at run-time on Linux
Most of this patch came from Andrew Beekhof.
Keep a global variable that decides whether to use filesystem sockets
or abstract sockets for IPC connections. This variable is set by the
presence of a file (default /etc/libqb/force-filesystem-sockets).
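The run-time switch could look roughly like the sketch below. The
helper and variable names are made up for illustration and need not
match the patch; only the default path is taken from this message.

```c
#include <sys/stat.h>

/* Default marker path as given in the commit message. */
#define FORCE_FS_SOCKETS_FILE "/etc/libqb/force-filesystem-sockets"

/* Hypothetical helper: 1 if the marker file exists, 0 otherwise. */
static int marker_file_present(const char *path)
{
	struct stat st;

	return stat(path, &st) == 0;
}

/* Cached decision: -1 = not checked yet, 0 = abstract sockets,
 * 1 = filesystem sockets. */
static int fs_sockets_cached = -1;

/* Hypothetical accessor: checks the marker file once, then reuses
 * the cached answer for every later IPC connection. */
static int use_filesystem_sockets(void)
{
	if (fs_sockets_cached == -1) {
		fs_sockets_cached = marker_file_present(FORCE_FS_SOCKETS_FILE);
	}
	return fs_sockets_cached;
}
```

Caching the answer means that creating or removing the file only
affects processes started afterwards, at least in this sketch.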
* tests: Fix test_ipcc_truncate_when_unlink_fails_shm test using FS sockets
When using filesystem sockets, the
test_ipcc_truncate_when_unlink_fails_shm test always fails. This is
because the unlink() call is wrapped to fail, so it never cleans up
the old version of the socket.
The fix is to preemptively remove the file before unlink gets wrapped.
* doc: Explain the force-filesystem-sockets option
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
The sockets are named using a random() suffix in an attempt to isolate
concurrent tests. However random() always returns the same random number
by design ... unless pre-seeded with some value being unique enough for
the particular execution.
Borrowing most of the above message from the original "srandom" fix by
Chrissie who also discovered this issue (nice!), I thought it would be
more viable if we encoded such "unique enough" variables directly to
IPC name being generated, not relying on pseudorandom generators in any
way. Hence this other fix.
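A sketch of what "encoding unique enough variables directly" can mean
in practice. The helper name make_ipc_name() and the exact name layout
are assumptions, not the format used by the test suite.

```c
#include <stdio.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: writes "<prefix>-<seconds>-<pid>" into buf.
 * Returns 0 on success, -1 if the name would not fit into len bytes. */
static int make_ipc_name(char *buf, size_t len, const char *prefix)
{
	int n = snprintf(buf, len, "%s-%lld-%ld", prefix,
			 (long long) time(NULL), (long) getpid());

	/* snprintf() reports the length it wanted to write, so a value
	 * >= len means the name was truncated. */
	return (n < 0 || (size_t) n >= len) ? -1 : 0;
}
```

Checking the return value guards against a name that is silently cut
short when the PID or the prefix is long.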
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
There are not many ways to test alternate code paths having failure of
some function from standard library as a precondition.
For a starter, we need to test failing unlink{,at} functions in a
controlled manner to mimic client and server path of the IPC connection
having different privileges to validate the previous commit. But the
test suite cannot assume it has root privileges (so as to add artificial
user system-wide, which is a pretty stupid idea on its own), cannot
generally use stuff like chroot/namespacing (not to speak about
synergies of the former like docker). So what's left is to make our
own playground, or better yet, use existing playground but just to
modify the rules of the game a bit when it's desired -- a variation
of the good old LD_PRELOAD trick.
Note that this concept was already used in syslog tests (see commit
642f74d) and is now further extended using dlsym(RTLD_NEXT, "symbol")
to resolve the standard library symbol being shadowed by our little
"module". This hence yields a customized wrapping we use to either
inject a call failure or to increase an invocation counter so as to
assure something has indeed been called. As the mechanisms used are
not supposed to be available everywhere, the build system is
conditionalized respectively.
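The wrapping technique can be sketched as follows. This is only an
illustration of dlsym(RTLD_NEXT, ...) shadowing, assuming a glibc-like
system; the knob names fi_unlink_fail and fi_unlink_calls are
hypothetical and the real module may be organized differently.

```c
#define _GNU_SOURCE	/* for RTLD_NEXT on glibc */
#include <dlfcn.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical knobs, set by the test code. */
int fi_unlink_fail = 0;		/* non-zero: make unlink() fail */
int fi_unlink_calls = 0;	/* number of intercepted calls */

/* Shadows the C library's unlink(). */
int unlink(const char *pathname)
{
	static int (*real_unlink)(const char *) = NULL;

	fi_unlink_calls++;
	if (fi_unlink_fail) {
		/* Injected failure: pretend we lack the privileges. */
		errno = EACCES;
		return -1;
	}
	if (real_unlink == NULL) {
		/* Resolve the next definition in library search order,
		 * i.e. the one being shadowed. */
		real_unlink = (int (*)(const char *)) dlsym(RTLD_NEXT, "unlink");
		if (real_unlink == NULL) {
			errno = ENOSYS;
			return -1;
		}
	}
	return real_unlink(pathname);
}
```

Built as a shared object and listed in LD_PRELOAD, or linked into the
test as a module, such a definition takes precedence over the one in
libc. Older glibc versions additionally need -ldl at link time.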
Back to our test when unlink{,at} fails, with the help of the described
mechanism, it was actually easy to massage test_ipc_server_fail_shm
into test_ipcc_truncate_when_unlink_fails_shm desired addition, which
is also featured in this commit, together with a modification to
resources.test script so that it expects particular number of empty
file leftovers (see previous commit).
It's expected that the module for failure injections will keep growing
so as to enable better overall coverage of the code (on the platforms
where this provision is available).
This reduces repeated code significantly, and allows for easier
supervision of what's being grouped to the suites + possibly what
timeouts apply.
Note that some artificial test case identifiers (in check_array.c,
check_log.c, check_loop.c, check_rb.c, check_utils.c) got changed
so they now follow 1:1 the test (function) name that is being run
for the case at hand without the "test_" prefix (strict convention).
Exception to this are test_ipc_disp_* tests in check_ipc.c that got,
conversely, changed to test_ipc_dispatch_* to follow the test case
identifiers.
On *BSD and other platforms the stress_connections can time out and
fail the tests. I've increased the timeout here to an hour as it
takes nearly that long on my VM environment but it seems that's not
common, luckily.
This was also seen on mips/mipsel.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
On some platforms the check_ipc test can fail due to SIGTERM
being delivered to the exiting server process. There is a race
condition between the server main loop quitting and the
signal being delivered.
This patch closes that race in two places. Firstly, it makes
SIGTERM/SIGSTOP exit immediately rather than just signalling the
mainloop. Secondly, it calls exit() rather than return when the server
mainloop completes, so that the client code does not start executing!
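Both measures can be sketched in isolation like this. The function
names are hypothetical and the "mainloop" is a placeholder; the sketch
uses _exit() in the handler because it is async-signal-safe, which may
differ from what the test code actually calls.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Exit straight from the handler instead of asking the loop to quit. */
static void exit_on_sigterm(int sig)
{
	(void) sig;
	_exit(0);
}

/* Hypothetical stand-in for the server's main loop. */
static void run_mainloop(void)
{
	sleep(5);
}

/* Forks a "server", sends it SIGTERM and returns its exit status,
 * or -1 if it did not exit normally. */
static int run_server_and_terminate(void)
{
	int status = 0;
	pid_t pid;

	/* Install before fork() so the child never runs without it. */
	if (signal(SIGTERM, exit_on_sigterm) == SIG_ERR) {
		return -1;
	}
	pid = fork();
	if (pid == -1) {
		return -1;
	}
	if (pid == 0) {
		run_mainloop();
		/* exit(), not return: the child must never fall back
		 * into the caller's (client) code path. */
		exit(0);
	}
	if (kill(pid, SIGTERM) != 0 || waitpid(pid, &status, 0) != pid) {
		return -1;
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Whether the signal arrives during the loop or just after it, the child
ends with a normal exit status and never runs the parent's code.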
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Package builders that run multiple builds of libqb in parallel
will fail because the IPC unit tests stomp on each other's namespace.
We have to give each IPC server a randomized unique name during
'make check' to avoid this.