* log: lower IPC connection issues to info level
... in handle_new_connection(). The caller has better context for whether a
problem merits a warning or error, and the function's return code is
sufficiently descriptive to do so. Some problems may be expected or able to be
worked around.
For example, Pacemaker's crm_mon attempts to contact pacemakerd IPC. On a
Pacemaker Remote node, that IPC will be unavailable, and crm_mon can check the
libqb return code to detect and handle that situation gracefully.
* log: lower some ringbuffer debug messages to trace level
They're rather noisy, with every shm-based IPC connection generating multiple
obscure messages like:
debug: shm size:1048589; real_size:1052672; rb->word_size:263168
and every disconnect generating the rather unhelpful:
debug: qb_ipcc_disconnect()
along with multiple messages like:
debug: Closing ringbuffer: /dev/shm/qb-10986-11014-34-26VRvs/qb-request-cmap-header
All of these seem appropriate to trace level.
covscan complained we don't check the blackbox header when
reading it in. (quite reasonably)
Note that we still get a covscan error for ->shared_data, but that's
really impossible to verify in the read routine, so I'll leave the
covscan waiver to handle that.
Previously, there were two separate logical issues:
- errno could be set negative in qb_rb_chunk_alloc when
when "reclaim" notifier failed
- _rb_chunk_reclaim (note: local scoped, hence comfortable for changes)
was already setting errno at a single (coincidentally, in a correct
way, but that'd be overwritten with the inverse because of the
previous logical issue in qb_rb_chunk_alloc), so make it set errno
at each failure path (now also when internal integrity in
_rb_chunk_reclaim failed(), sparing the callers to double on that task
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
Use mkdtemp makes sure that IPC files are only visible to the
owning (client) process and do not use predictable names outside
of that.
This is not meant to be the last word on the subject, it's mainly a
simple way of making the current libqb more secure. Importantly, it's
backwards compatible with an old server.
It calls rmdir on the directory created by mkdtemp way too often, but
it seems to be the only way to be sure that things get cleaned up on
the various types of server/client exit. I'm sure we can come up with
something tidier for master but I hope this, or something similar, will
be OK for 1.0.x.
Looks like these are not accepted with splint checker. Also fix some
other minor type -- print format specifier discrepancies.
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
This should prevent libqb from looping in the server if the
ringbuffer gets corrupted. Instead the client will be disconnected.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
This changeset builds on previous 2-3 commits and represents the main
libqb's answer to the original question behind pacemaker's security
defect known as CVE-2016-7035.
Beside the helper partly unifying handling of qb_rb_force_close and
qb_rb_close, it provides the former with ability to use file truncating
as a fallback for when unlinking fails, e.g., because client (note that
mentioned is currently only relevant for the client side as normally
server is responsible for the lifecycle of the materialized files,
unless it crashes and only client is left to do its best) is not the
owner while they are placed at a directory with restricted deletion,
which enforces this very ownership condition.
In practice, this means that, at worst, just the zero-size files are
left behind, so not that much space exhaustion (usually "ramdisk"
like tmpfs is what backs default storage directory /dev/shm, so it
boils down to physical memory exhaustion, even if it can be just
for page cache and related overhead) can happen even on repeated
crashes as the memory mappings are cleared as much as possible.
Also openat/unlinkat functions (sported in qb_sys_unlink_or_truncate_at
as of the previous commit) are, when applicable, used so as to limit
possible race conditions between/during individual path traversals
(both files being got rid of presumably share the same directory).
Few words on which actions are attempted in which order for the
equivalent of qb_rb_force_close now:
There are subtle interactions between what's externally visible
(files) and what's not (memory mappings associated with such files),
and perhaps between memory pages management from the perspective of
the former (usually "ramdisk"/tmpfs) and the latter (mmap + munmap).
If the associated file is no longer publicly exposed by the means of
unlink (even if the object survives internally as refcounting is in
the game, with mmap holding a reference), memory mapping is not
affected. On the other hand, if it's just limited by truncation
to zero size, memory mapping is aware and generates SIGBUS in response
to accessing respective addresses. Similarly, accessing munmap'd
(no refcounting here) memory generates SIGSEGV. For delicacy,
the inputs for all of unlink, truncate, and munmap are stored
at the mmap'd location we are about to drop, but that's just a matter
of making copies ahead of time.
At Ken's suggestion, the scheme is: (unlink or truncate) then munmap,
which has a benefit that externally visible (and program's life span
otherwise surviving!) part is eliminated first, with memory mappings
(disposed at program termination automatically at latest) to follow.
(There was originally a paranoid expectation on my side that truncate
on tmpfs actually does silent munmap, so that our munmap could in fact
tear down the mapping added in the interim by the libraries, signal
handler or due to requirements of another thread, also because of
munmap on the range without any current mappings will not fail, and
thus there's likely no portable way to non-intrusively check the
status, but also due to documented SIGBUS vs. SIGSEGV differences
the whole assumption appears bogus on the second thought.)
Relevant unit tests that exercise client-side unlinking:
- check_ipc: test_ipc_server_fail_shm, test_ipc_exit_shm
- new test in a subsequent commit
libqb fails to build on the hppa architecture, because the built-in
testcases fail as can be seen here:
http://buildd.debian-ports.org/status/fetch.php?pkg=libqb&arch=hppa&ver=0.17.0-2&stamp=1409458262
I did analyzed why they fail, and the reason is that on hppa we have
somewhat more complicated requirements (e.g. alignments) which needs to
be followed in order to mmap shared pages between processes. It's
different than what can be done compared to ia64 and sparc.
The attached patch fixes libqb on the hppa architecture and with it all
testcases finish successful.
By the way, I fixed a small typo in configure.ac too where arm platforms
prints "ia64"...
Forwarded-From: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=760199
Signed-off-by: Christoph Berg <myon@debian.org>
qb_rb_open() expects the size field to represent the maximum chunk size.
It adds QB_RB_CHUNK_MARGIN + 1 and rounds up to the page size to
determine the ringbuffer's total size. When creating a ringbuffer from a
file we must compensate by subtracting this amount from the file's size.
In the future, if something changes in the file header we can
increment the version and split up the parsing into separate functions
for backwards compatibility.
The ringbuffer protocol uses the chunk magic number to indicate to the
other side what state a chunk is in. It's therefore important to use
strongly ordered memory writes to make sure that neither the compiler
nor the CPU change the apparent order of the writes, since that would
result in corrupted messages.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
This is to prevent a situation where a fast writer will
write their new chunk between setting the new read pointer
and clearing the header.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>