It's possible that cs->filename or cs->format could be read
in the 'fast' path while the 'slow' path is still constructing
the object. So we need to lock arr_next_lock before copying them
out for the caller.
Also wthread_should_exit was unprotected.
It's hard to predict the length of formatted output, so we'd better
notice (and abort) if the description is truncated. Incidentally,
mkdtemp() does this for us in the shared memory branch, but do an
explicit check there as well for consistency, and get rid of the wrongly
parametrized strncat() risking a buffer overflow (CONNECTION_DESCRIPTION
is not the length of the source "/qb").
Similar truncation checks should be added to qb_ipcs_{shm,us}_connect()
where they build the request/response names, and possibly to other
places using snprintf().
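The truncation check described above can be sketched as follows. This is a minimal illustration, not the libqb code: build_ipc_name() and the "qb-" name pattern are hypothetical; only the snprintf() return value convention is relied upon.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: format "<dir>/qb-<name>" into buf and report
 * truncation instead of silently using a shortened path. */
static int build_ipc_name(char *buf, size_t buflen,
                          const char *dir, const char *name)
{
    int n = snprintf(buf, buflen, "%s/qb-%s", dir, name);

    if (n < 0) {
        return -EINVAL;        /* output error reported by snprintf */
    }
    if ((size_t)n >= buflen) {
        return -ENAMETOOLONG;  /* output did not fit, buf holds a truncated string */
    }
    return 0;
}
```

snprintf() returns the length the full output would have had, so any value at or above the buffer size means the result was cut short.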
When qb_ipcs_connection_auth_set() has been used, the ownership of the
temp directory initially set by handle_new_connection() must be updated
as well.
The main and the most ABI-touching thing for the envisioned 2.0 branch
is the usage of the linker-build-time allocated callsite info, avoiding
the non-economic evaluations and, under some circumstances dangerous,
heap allocations in the run-time.
Considering that v1.9.0 release (libqb.so.20) was expressly marked as
tech-preview[1,2] (hence something that shall not make it to production
use), there should be no harm for master branch (that is headed towards
2.0 and beyond) to receive a noticeable SONAME bump (libqb.so.100) so as to
- leave enough space for a possible v1-compatible branch evolution
  (for use cases where recompile-everything is a no-go);
  in particular, when resuming with libqb.so.30, there would
  be room for 99-30 = 69 add-new-drop-nothing compatible
  changes for that branch (which is more than plentiful)
- indicate some big change is going on more clearly towards client space
This is supposed to be a reasonable trade-off solution that would still
leave enough wiggle space, and would represent responsible approach to the
development (like the original attempt to prevent ABI break in the first
place was), allowing for more than an enforced unanimity (rather
antagonistic in the free software realms).
[1] https://lists.clusterlabs.org/pipermail/users/2019-December/026690.html
[2] https://github.com/ClusterLabs/libqb/releases/tag/1.9.0
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
Now that splint is actually contradicting errors that come from
the compilers I think it's time to retire it. I could cope with it
being a minor nuisance on the argument that "another check can't
hurt", but contradicting the actual compilers is too much.
The CI has Coverity installed which is much more up-to-date anyway.
Splint hasn't been updated since 2010.
Response structure was not initialized completely,
when mkdtemp/chown failed, server was not accepting connection yet or
connect failed for some reason.
This is not an issue, but valgrind reports this
as a problem so it is easy to miss real problem then.
Solution is to initialize response before it is used.
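A minimal sketch of the "initialize before use" pattern, assuming a made-up response layout (struct conn_response and simulate_connect() are hypothetical, not the libqb types): every early-exit path then hands back fully defined memory, so valgrind has nothing to complain about.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical response layout, for illustration only. */
struct conn_response {
    int32_t error;
    uint32_t max_msg_size;
    char request[32];
};

/* Zero the whole response first, then fill in only what each path knows. */
static int simulate_connect(int setup_failed, struct conn_response *r)
{
    memset(r, 0, sizeof(*r));
    if (setup_failed) {
        r->error = -EACCES;             /* early exit, rest stays zeroed */
        return r->error;
    }
    r->max_msg_size = 4096;
    strcpy(r->request, "qb-request");   /* fits: 10 chars + NUL < 32 */
    return 0;
}
```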
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Previously, there were two separate logical issues:
- errno could be set negative in qb_rb_chunk_alloc when
  the "reclaim" notifier failed
- _rb_chunk_reclaim (note: local scoped, hence comfortable for changes)
  was already setting errno at a single place (coincidentally, in a correct
  way, but that'd be overwritten with the inverse because of the
  previous logical issue in qb_rb_chunk_alloc), so make it set errno
  at each failure path (now also when the internal integrity check in
  _rb_chunk_reclaim fails), sparing the callers from doubling on that task
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
The callers of qb_log_target_alloc() return -errno when it fails.
However, qb_log_target_alloc() wasn't setting errno.
The only failure case is when QB_LOG_TARGET_MAX (32) logs have been opened, so
it's unlikely to ever be a real-world problem. But in that case, now set errno
to EMFILE ("Too many open files").
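The convention at play can be sketched like this. The slot table below is a hypothetical stand-in for the log target array (SLOT_MAX, slot_alloc() and slot_open() are invented names); the point is only that the allocator sets errno so that a caller returning -errno reports something meaningful.

```c
#include <errno.h>
#include <stddef.h>

#define SLOT_MAX 32  /* mirrors the limit of 32 mentioned above */

struct slot { int in_use; };
static struct slot slots[SLOT_MAX];

/* Return a free slot, or NULL with errno set to EMFILE when all are taken. */
static struct slot *slot_alloc(void)
{
    for (size_t i = 0; i < SLOT_MAX; i++) {
        if (!slots[i].in_use) {
            slots[i].in_use = 1;
            return &slots[i];
        }
    }
    errno = EMFILE;
    return NULL;
}

/* Caller following the "return -errno on failure" convention. */
static int slot_open(void)
{
    struct slot *s = slot_alloc();

    if (s == NULL) {
        return -errno;
    }
    return (int)(s - slots);   /* index of the allocated slot */
}
```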
Terminating NUL on FreeBSD is not part of the sun_path.
Add it to use sun_path as a parameter of unlink.
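One way to make sun_path safe to pass to unlink() is sketched below. This is an assumption about the general technique, not necessarily how the fix is written: unlink_sun_path() is a hypothetical helper that copies the path into a buffer one byte larger and terminates it there.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Copy sun_path into a buffer one byte larger so that it is always
 * NUL-terminated, then unlink it. Returns the result of unlink(). */
static int unlink_sun_path(const struct sockaddr_un *addr)
{
    char path[sizeof(addr->sun_path) + 1];

    memcpy(path, addr->sun_path, sizeof(addr->sun_path));
    path[sizeof(addr->sun_path)] = '\0';
    return unlink(path);
}
```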
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
It's misleading towards a random code observer, at least,
hiding the fact that what failed is actually the queuing up
of some handling to perform asynchronously in the future,
rather than invoking it synchronously right away.
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
It turns out that while 7f56f58 allowed for less blocking (thus
throughput increasing) initial handling of connections from clients
within the abstract (out-of-libqb managed) event loop, it unfortunately
subscribes itself back to such polling mechanism for UNIX-socket-check
with a default priority, which can be lower than desired (via explicit
qb_ipcs_request_rate_limit() configuration) for particular channel
(amongst attention-competing siblings in the pool, the term here
refers to associated communication, that is, both server and
on-server abstraction for particular clients). And priority-based
discrepancies are not forgiven in true priority abiding systems
(that is, unlike with libqb's native event loop harness as detailed
in the previous commit, for which this would be soft-tolerated hence
the problem would not be spotted in the first place -- but that's
explicitly excluded from further discussion).
On top of that, it violates the natural assumption that once (single
threaded, which is imposed by libqb, at least between initial accept()
and after-said-UNIX-socket-check) server accepts the connection, it
shall rather take care of serving it (at least within stated initial
scope of client connection life cycle) rather than be rushing to accept
new ones -- which is exactly what used to happen previously once the
library user set the effective priority in the abstract poll
above the default one.
It's conceivable, just as with the former case of attention-competing
siblings with higher priority whereby they could _infinitely_ live on
at the expense of starving the client in the initial handling phase
(authentication) despite the library user's as-high-as-siblings
intention (for using the default priority for that unconditionally
instead, which we address here), the deadlock is imminent also in
this latter accept-to-client-authentication-handling case
if there's an _unlimited_ fast-paced arrival queue (well, limited
by the number of allowable open descriptors within the system,
but for the Linux built-in maximum of 1M, there may be no practical
difference, at least for time-sensitive applications).
The only hope then is that such deadlocks are rather theoretical,
since a "spontaneous" constant stream of either communication on
unrelated, higher-prio sibling channels, or of new connection arrivals
can as well testify the poor design of the libqb's IPC application.
That being said, unconditional default priority in the isolated
context of initial server-side client authentication is clearly
a bug, but such application shall apply appropriate rate-limiting
measures (exactly on priority basis) to handle unexpected flux
nonetheless.
The fix makes test_ipc_dispatch_*_glib_prio_deadlock_provoke tests pass.
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
Using mkdtemp makes sure that IPC files are only visible to the
owning (client) process and do not use predictable names outside
of that.
This is not meant to be the last word on the subject, it's mainly a
simple way of making the current libqb more secure. Importantly, it's
backwards compatible with an old server.
It calls rmdir on the directory created by mkdtemp way too often, but
it seems to be the only way to be sure that things get cleaned up on
the various types of server/client exit. I'm sure we can come up with
something tidier for master but I hope this, or something similar, will
be OK for 1.0.x.
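For illustration, creating such a private directory could look roughly like this. It is a sketch under assumptions: make_private_dir() and the "qb-XXXXXX" template are hypothetical, and the real code also has to handle ownership and cleanup as discussed above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Build "<base>/qb-XXXXXX" in path and let mkdtemp() replace the X's
 * with a unique suffix; the directory is created with mode 0700.
 * Returns 0 on success, -1 on truncation or mkdtemp() failure. */
static int make_private_dir(char *path, size_t len, const char *base)
{
    int n = snprintf(path, len, "%s/qb-XXXXXX", base);

    if (n < 0 || (size_t)n >= len) {
        return -1;
    }
    return mkdtemp(path) == NULL ? -1 : 0;
}
```

The caller owns the directory afterwards and is expected to rmdir() it once the files inside have been removed.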
Proper Libs.private enables linking applications statically against
libqb: static archives (.a) don't carry their own dependency
information, unlike shared libraries (.so). Modern libc versions
include socket and RT functions, so socket_LIBS and rt_LIBS will be
empty there, but we include them for strict correctness on older
platforms; basically, we're matching libqb_la_LIBADD here.
Consequently, nsl_LIBS and GLIB_LIBS don't enter this field, since they
are only used in the examples and tests, not in the library proper.
Cflags, on the other hand, is emitted all the time and (under GCC)
propagates the -pthread option (which also affects the preprocessing
stage) to all users of libqb even when compiling modules or linking
everything dynamically.
Signed-off-by: Ferenc Wágner <wferi@debian.org>
The last fix to skiplist never ran the code that patched up the level
list as it updated the current level before running the loop.
This now works.
Merges: https://github.com/ClusterLabs/libqb/pull/333
Reviewed-by: Jan Pokorný <jpokorny@redhat.com>
* log: Add high-resolution timestamp option for log files
This adds the %T option to the log format for millisecond timestamps. There's a feature test macro QB_FEATURE_LOG_HIRES_TIMESTAMPS so that applications know that they are available.
Because this changes the internal logging API, applications that use custom loggers will also need to change their custom logging destinations to take a struct timespec instead of a time_t. The above feature test macro will help in deciding which is appropriate.
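To show why a struct timespec is needed where a time_t used to suffice, here is a hedged sketch of millisecond formatting. format_hires() and the exact output layout are assumptions for illustration and need not match what %T prints.

```c
#include <stdio.h>
#include <time.h>

/* Format ts as "YYYY-MM-DD HH:MM:SS.mmm" in UTC.
 * Returns 0 on success, -1 on failure or if buf is too small. */
static int format_hires(const struct timespec *ts, char *buf, size_t len)
{
    struct tm tm;
    char date[32];
    int n;

    if (gmtime_r(&ts->tv_sec, &tm) == NULL) {
        return -1;
    }
    if (strftime(date, sizeof(date), "%Y-%m-%d %H:%M:%S", &tm) == 0) {
        return -1;
    }
    /* time_t only carries whole seconds; the milliseconds come from tv_nsec */
    n = snprintf(buf, len, "%s.%03ld", date, (long)(ts->tv_nsec / 1000000L));
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```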
* log: add systemd journal as a logging option
systemd journal can be configured as a logging option
at ./configure time (--enable-systemd-journal).
If libqb is built with this then the syslog target can be switched
to sending to the journal using
qb_log_ctl(QB_LOG_SYSLOG, QB_LOG_CONF_USE_JOURNAL, 1);
As this patch also brought up some locking issues with re-configuring the logging while threaded logging was enabled, it also includes locking around qb_log_ctl2() and conversion of in_logger to an atomic.
(Patch from poki, only committed under my name because github is being
weird)
This used to happen when an iterator contained a reference on the item
to continue with, which got outdated when such item had been removed in
the interim, though it's original memory would still be -- mistakenly --
accessed. Actually such a condition is exercised with an existing
"test_map_iter_safety(ordered=true)" test, though it likely never run
under valgrind's supervision and standard memory checking harness was
too coarse (perhaps because of low memory pressure or other "lucky"
coincidence). Thankfully, the default, paranoid approach towards dynamic
memory handling in OpenBSD (free(3) call makes small chunks "junked",
i.e., filled with 0xdf bytes, see malloc.conf(5)) resulted in the
explicit segmentation fault when tripping over the happens-to-be-freed
pointer in the assumed iteration chain.
We solve the "out-of-sync iterator" issue with a twist, inverting
the responsibility to carry (and more widely, to contribute in the
propagation of) the up-to-date "forward" pointers, as clearly,
iterating over and over through the items would not be very scalable
(and it was not done, which had resulted in the issue in the first place).
So now, when any skiplist item is to be removed, its preceding item
gets the "forward" pointers recomputed as before, but then, they are
copied into "forward" pointers for the item to be removed, original
area containing them is disposed, and this preceding item just points
to the area primarily managed by the to-be-removed item (procedure
dubbed "takeover-and-repoint" in the comment). This itself gets
a special mark so that this area won't be dropped when that item gets
disposed, which rather happens with the disposal of the preceding item
that points to the "forward" memory area at hand and is not marked so.
This is believed to be sufficient to address out-of-band (iterator
based) access versus interim future iteration chain mangling, as these
operate de facto on the non-sparse, linear level of the skiplist.
Alternative approaches include:
- turning pointers-to-arrays into pointers-to-pointers-to-arrays to
  allow for explicit setting to NULL after free, and sharing this
  additional indirection -- this straightforward extension was
  attempted first, but shortly after, it became apparent it would
  be a nightmare with the current interprocedural dependencies
- extra tagging of the structures and adding complexities around
  checking the eligibility, like every other manipulation with the
  skiplist
- completely split life-cycle of "node" and "node->forward", i.e.,
  separate reference-counting etc.
Also said test was extended to push the corner case to the limit:
when to-resume-with item in the chain is being figured out, the
predecessors may be consulted (it is in that test), but the very
first predecessor is now removed as well, for good measure, as
it makes for boundary condition ^ 2.
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
It is my (and several others') opinion that the linker 'magic' used to maintain callsite data in libqb is hugely overcomplicated and unnecessarily fragile. Its main purpose seems to have been to improve performance, but empirical testing shows the gain to be tiny at best. The overhead of sprintf makes minor optimisations in this code pointless.
With this code removed, libqb allocates callsites using a C static variable at run-time. This sounds bad but in actuality it merely moves the allocation from program load time to the first few milliseconds of program run-time. Applications like corosync and pacemaker spend most of their time in small loops doing the same work over and over again so the overhead doesn't apply and jitter does not occur.
We've tested this with corosync and pacemaker under valgrind and massif and the differences are minimal and even then only show up under artificial stress testing.
For this change I've bumped the soname up to 20 to indicate this is an incompatible change. I'm open to suggestions as to a release number but am currently thinking of 2.0.0
* doc: qbarray.h: fix garbled Doxygen markup
* build: follow-up for and fine-tuning of a rushed 6d62b64 commit
(It made a service as-was, but being afforded more time, this would
have accompanied that commit right away, for better understanding,
brevity and uniformity.)
* build: prune superfluous Makefile declarations within tests directory
There was a significant redundancy wrt. build flags and EXTRA_DIST
assignment (the latter became redundant as of f6e4042 at latest)
spread all over the place (vivat copy&paste). Also, in one instance,
CPPFLAGS (used) was confused with CFLAGS (meant).
* maint: check abi: fix two issues with abi-compliance-checker/libstdc++
1. ABICC >= 2 needs to be passed -cxx-incompatible switch because C is
no longer a default for this tool (used to be vice versa),
plus current version will stop choking on C vs. C++ (our C code with
C++ compatibility wrapping being viewed from C++ perspective for the
purpose of dumping the declared symbols, which somewhat conflicts
with internal masking of the C++ keywords being used as valid C
identifiers [yet some instances must not be masked here, see
https://github.com/lvc/abi-compliance-checker/issues/64]) only
if _also_ something like this is applied:
https://github.com/lvc/abi-compliance-checker/pull/70
2. since 20246f5, libqb.so no longer poses a symlink to the actual
version-qualified shared library, but rather a standalone linker
script, which confuses ABICC, so blacklist that file for the scanning
purposes explicitly, together with referring to the library through
its basic version qualification (which alone, sadly, is not
sufficient as ABICC proceeds to scan the whole containing directory
despite a particular file being specified)
* maint: check abi: switch to abi-dumper for creating "ABI dumps"
Beside avoiding issues with abi-compliance-checker in the role of ABI
dumps producer (see the preceding commit), it also seems to generate
more accurate picture (maybe because it expressly requires compiling
with debugging information requested).
* Low: qblist.h: fix incompatibility with C++ & check it regularly
* tests: check_list.c: start zeroing in on the gaps in tests' coverage
* tests: print_ver: make preprocessor emit "note" rather than warning
IIRC, Chrissie asked about this around inclusion of the test at
hand, and it seemed there was no way but to emit a warning to get
something output at all. Now that turns out to be wrong, and moreover, we
make the code not fixed on GCC specific pragmas, with a bit of
luck, "#pragma message" approach is adopted more widely by compilers.
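A sketch of the "#pragma message" approach, with a hypothetical version macro (DEMO_VERSION_STR and demo_version() are invented for the example); GCC reports the text as a note during compilation, other compilers may label it differently.

```c
/* Hypothetical version macro; a real build would take it from a header. */
#define DEMO_VERSION_STR "2.0.0"

/* Printed at compile time as an informational message; adjacent string
 * literals are concatenated just as in ordinary C code. */
#pragma message("compiled against demo version " DEMO_VERSION_STR)

/* Run-time access to the same string, unrelated to the pragma itself. */
static const char *demo_version(void)
{
    return DEMO_VERSION_STR;
}
```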
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
* Replace ck_assert_uint_eq() with ck_assert_int_eq()
it's not available in check 0.9
* Proper check for C++ compiler (from Fabio)
* add (c) to copyright dates
Uses dpkg-architecture, if present, to return
DEB_HOST_GNU_TYPE, and uses this appended to /usr/include
to form the path.
Signed-off-by: Daniel Black <daniel@linux.ibm.com>
Reviewed-by: Jan Pokorný <jpokorny@redhat.com>
On FreeBSD 11, calling dlopen on a shared library causes the constructors
to run again. As we're just getting symbols we don't need this to
happen.
Actually we don't WANT it to happen because it can cause qb_log_init to
be called twice (recursively) and the dlnames list gets corrupted. This
causes corosync (at least) to crash at startup.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
* log: Allow flexible size of logging buffer & ellipsis if it overflows.
Allow the logging line length to be changed. Any reasonable length is allowable, the default is 512 as before. Anything more than 512 incurs several mallocs.
Also add an option to set the last 3 characters as '...' if the line length overflows.
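The ellipsis behaviour can be illustrated with a small stand-alone sketch. format_with_ellipsis() is a hypothetical helper, not the libqb implementation; it only shows the idea of overwriting the last three visible characters when the formatted line does not fit.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Format into buf; if the output would not fit, overwrite the last three
 * visible characters with "...".
 * Returns 1 if truncated, 0 if not, -1 on a formatting error. */
static int format_with_ellipsis(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, len, fmt, ap);
    va_end(ap);

    if (n < 0) {
        return -1;
    }
    if ((size_t)n < len) {
        return 0;                          /* everything fit */
    }
    if (len >= 4) {
        memcpy(buf + len - 4, "...", 3);   /* keep the terminating NUL */
    }
    return 1;
}
```

With an 8-byte buffer, "hello world" ends up as "hell...".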
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
* ipc_shm: Don't truncate SHM files of an active server
I've put in an extra check so that clients don't truncate the
SHM file if the server still exists. Sadly on FreeBSD we can't
get the server PID for the client (unless someone has a patch handy!)
so we still do the truncate when disconnected. As a backstop (and also
to cover the BSD issue) I've added a SIGBUS trap to the server shutdown
so that it doesn't cause a server crash.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
For the former, a prototype and the final code got (hm, mysteriously)
intertwined. For the latter, I am clearly guilty of (rare, anyway)
testing of the out-of-tree builds only with libqb-already-system-wide
scenario, which is rather shortsighted.
Thanks Fabio and his ci.kronosnet.org project for spotting that.
X-mas-present-for: Fabio M. Di Nitto <fdinitto@redhat.com>
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
Looks like these are not accepted with splint checker. Also fix some
other minor type -- print format specifier discrepancies.
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
... also to make the documentation refer to the implementation limits
properly.
When at it, also document some nits on the implementation side, unify
qbarray.h with project's doxygen conventions a bit + fix a typo there.
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
In case of hdb.c, the problem is that `check` public API (qb_hdb_handle
member struct) item (which should not be exposed publicly like this
in the first place!) is typed as `int32_t`, whereas it was to be
compared to `uint32_t` implementation-possessed local variables
(presumably derived from the same source), which made the compiler
upset (even though there was no real reason, integer promotion to
unsigned type would happily occur, which is furthermore expected to
be fully defined as these values come from `random` that shall
return non-negative integers below `INT32_MAX`).
Hence:
- type these local variables to `int32_t` just as well, which allows to
- simplify `random` return value handling, since they are expected to be
zero-or-greater and the previously extra tested all-bits-on pattern
  undoubtedly makes for a negative numeric value in case of a signed
  integer with specified width (cf. 7.18.1.1/C99), hence falling into
  the complement of zero-or-greater; zero itself is also excluded for the
  reasons stated in the comment (which was pretty hazy and incorrect,
  so it gets an overhaul as well)
- also superfluous typecasts are removed
Similar situation is with loop_timerlist.c, where we are actually fully
in charge of the struct member (private API), but there are good reasons
to stay consistent with the former file as the same applies to the
source of that value -- it comes from `random` (equivalent comment
is added here for greater symmetry).
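The simplified handling can be pictured as below. This is a sketch, assuming only that zero must be avoided (the actual reasons live in the code comment mentioned above); draw_check_value() is a hypothetical name.

```c
#include <stdint.h>
#include <stdlib.h>

/* random() yields values from 0 to 2^31-1, so the result always fits
 * a signed 32-bit integer and can never be negative; the only value
 * left to reject is zero. */
static int32_t draw_check_value(void)
{
    int32_t check;

    do {
        check = (int32_t)random();
    } while (check == 0);
    return check;
}
```

With a signed type, the all-bits-on pattern would read as -1, which random() cannot produce, so no separate test for it is needed.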
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>