Commit Graph

368 Commits

Author SHA1 Message Date
Jan Friesse
9a50628fd1 main: Add support for libcgroup
When corosync is started in an environment where it ends up in a cgroup without
a properly set rt_runtime_us, it's impossible to get RT priority.

The already implemented workaround is to use a higher non-RT priority.

This patch implements another solution. It moves corosync into the root cpu
cgroup. The root cpu cgroup hopefully has enough RT budget.

Another solution was mentioned on the mailing list
https://lists.freedesktop.org/archives/systemd-devel/2017-July/039353.html
but it would mean generating some "random" values.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
(cherry picked from commit c56086c701)
2017-08-01 14:32:53 +02:00
Christine Caulfield
55c3dcb76d stats: Add map with on-demand statistics
Icmap is factored out so it's possible to add other
maps for cmap. An API call to switch maps from the application
end is added.

Corosync-cmapctl is enhanced with the -m option.

Stats contains all statistics previously found under the runtime.connections,
runtime.services and runtime.totem prefixes, together with new knet-related
ones. All stats are read-only.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2017-07-27 15:53:04 +02:00
Christine Caulfield
876910d8ff ipc: Check for the libraries sending invalid message IDs
If the library sent an invalid (i.e. too high) message ID to
corosync, it could cause the daemon to crash.

Now we check the message ID before indexing the function array.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2017-07-14 14:06:49 +01:00
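The bounds check described in 876910d8ff can be sketched as follows. This is a minimal Python illustration, not the daemon's C code: the handler table, the handler names and the `dispatch` function are all hypothetical, and returning `None` for a rejected ID is an assumption made for the example.

```python
# Hypothetical handler table. The real daemon indexes a C array of
# function pointers with the message ID sent by the client library.
def handle_join(payload):
    return f"join:{payload}"


def handle_leave(payload):
    return f"leave:{payload}"


HANDLERS = [handle_join, handle_leave]


def dispatch(msg_id, payload):
    """Validate msg_id before using it as an index into HANDLERS."""
    # Reject non-integers, negative IDs and IDs past the end of the table.
    if not isinstance(msg_id, int) or not 0 <= msg_id < len(HANDLERS):
        return None  # assumption: the caller drops invalid messages
    return HANDLERS[msg_id](payload)
```

In C the same idea is a comparison of the ID against the number of entries in the function array before the array is dereferenced.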
Jan Friesse
9627d7350b main: Add option to set priority
Option -P takes a numeric value with the same meaning
as nice, or the values max / min, meaning maximal / minimal priority (so
minimal / maximal nice value).

The scheduler / priority setting is moved in the code so that it is now executed
after logsys is configured, so errors are logged.

Setting maximal priority is also used as a fallback when realtime
scheduling is requested and sched_setscheduler fails.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
(cherry picked from commit a008448efb)
2017-07-10 16:40:39 +02:00
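A rough sketch of how a -P style argument could map to a nice value, reading "max" as maximal priority (lowest nice value) and "min" as minimal priority. The function name, the Linux nice bounds of -20 and 19, and the clamping of out-of-range numbers are assumptions for illustration; the commit message does not say how corosync treats out-of-range input.

```python
NICE_MIN = -20  # assumed Linux bound: most favourable scheduling
NICE_MAX = 19   # assumed Linux bound: least favourable scheduling


def parse_priority(arg):
    """Map a -P style argument to a nice value (hypothetical helper)."""
    if arg == "max":
        return NICE_MIN  # maximal priority means minimal nice value
    if arg == "min":
        return NICE_MAX  # minimal priority means maximal nice value
    value = int(arg)  # raises ValueError for non-numeric input
    # Assumption: clamp to the valid nice range instead of rejecting.
    return max(NICE_MIN, min(NICE_MAX, value))
```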
Jan Friesse
564b4bf7d4 totem: Propagate totem initialization failure
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2017-06-15 11:07:33 +02:00
Jan Friesse
95b91e4ae7 main: Display reason why cluster cannot be formed
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2017-05-18 17:15:55 +02:00
Andrew Price
86012ebb45 Main: Call mlockall after fork
The man page of mlockall is clear:
Memory locks are not inherited by a child created via fork(2) and are
automatically removed (unlocked) during an execve(2) or when the
process terminates.

So calling mlockall before corosync_tty_detach is a no-op when corosync is
executed as a daemon (corosync -f was not affected).

This regression was caused by ed7d054e55
(setprio for logsys/qblog was correct, mlockall was not).

The solution is to move the corosync_mlockall call to the correct place.

Signed-off-by: Andrew Price <anprice@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2017-04-25 14:50:04 +02:00
Bin Liu
0462b5e609 totemconfig: Prefer nodelist over bindnetaddr
In a two-node cluster, I have one node configured with Open vSwitch:
5: br-fixed: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    inet 192.168.124.88/24 scope global br-fixed
    inet 192.168.124.87/24 scope global secondary br-fixed
    inet 192.168.124.83/24 brd 192.168.124.255 scope global secondary tentative br-fixed
    inet 192.168.124.89/24 scope global secondary br-fixed

I use 192.168.124.83 in the node list of corosync.conf with udpu, and
bindnetaddr is 192.168.124.0. After upgrading corosync on this node,
it uses 192.168.124.88 instead of 192.168.124.83, as we can see:

corosync-cfgtool -s
Printing ring status.
Local node ID 1084783704

corosync-quorumtool -s
Membership information:
Nodeid Votes Name
1084783697 1 d52-54-77-77-01-02
1084783699 1 d52-54-77-77-01-01 (local)

The other node, meanwhile, can only see itself:
corosync-cfgtool -s
Printing ring status.
Local node ID 1084783697
RING ID 0
id = 192.168.124.81
status = ring 0 active with no faults

corosync-quorumtool -s
Membership information:
Nodeid Votes Name
1084783697 1 d52-54-77-77-01-02.virtual.cloud.suse.de (local)

This patch checks whether both nodelist and bindnetaddr are present and,
if so, displays a warning and uses the nodelist information.

Signed-off-by: Bin Liu <bliu@suse.com>
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2017-04-11 11:19:31 +02:00
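The decision made by 0462b5e609 can be sketched like this. It is a minimal Python illustration under stated assumptions: the function name, the boolean inputs and the warning text are hypothetical and do not reproduce corosync's actual messages or config parsing.

```python
def select_address_source(has_nodelist, has_bindnetaddr):
    """Return (source, warnings) for the local address lookup.

    Hypothetical helper: nodelist wins whenever it is present, and a
    warning is produced if bindnetaddr was configured as well.
    """
    warnings = []
    if has_nodelist and has_bindnetaddr:
        warnings.append("both nodelist and bindnetaddr are set; using nodelist")
    if has_nodelist:
        return "nodelist", warnings
    if has_bindnetaddr:
        return "bindnetaddr", warnings
    return None, warnings
```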
Christine Caulfield
30771a39a8 main: Don't ask libqb to handle segv, it doesn't work
segv should be handled by corosync; libqb is not the
place to be handling emergency signals.

This currently requires the head of libqb git tree to
generate a blackbox & coredump in the event of a segfault,
but it's better than the write() spin that currently happens.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2017-02-27 15:14:41 +00:00
Jan Friesse
8b6bd86a55 Logsys: Change logsys syslog_priority priority
LibQB adds a default "*" syslog filter, so we have to set syslog_priority
as low as possible so that filters applied later in
_logsys_config_apply_per_file take effect.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2017-02-24 16:23:50 +01:00
Takeshi MIZUTA
034553c080 man: Modify man-page according to command usage
Signed-off-by: Takeshi MIZUTA <miz.take4@gmail.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2016-12-01 16:32:42 +01:00
Michael Jones
b4c06e52f3 list: Replace uses of list.h with qblist.h
Signed-off-by: Michael Jones <jonesmz@jonesmz.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2016-10-27 14:56:52 +02:00
Christine Caulfield
268cde6ee4 totem: Add Kronosnet transport.
This is a big update that removes RRP & MRP from the codebase
and makes knet the default transport for corosync. UDP & UDPU
are still (currently) supported but are deprecated. Also, crypto
and multiple interfaces are only supported over knet.

To compile this codebase you will need to install libknet from
https://github.com/fabbione/kronosnet

The corosync.conf(5) man page has been updated with info on the new
options. Older config files should still work but many options
have changed because of the knet implementation so configs should
be checked carefully. In particular, any cluster using RRP
over UDP or UDPU will not start as RRP is no longer present. If you
need multiple interface support then you should be using the knet transport.

Knet brings many benefits to the corosync codebase: it provides support
for more interfaces than RRP (up to 8), will be more reliable in the event
of network outages and allows dynamic reconfiguration of interfaces.
It also fixes the ifup/ifdown and 127.0.0.1 binding problems that have
plagued corosync/openais from day 1.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
2016-10-11 10:09:42 +01:00
Ferenc Wágner
cf10a754e9 Fix various typos
occured -> occurred
parantheses -> parentheses
configuraton -> configuration
aquire -> acquire
retrive -> retrieve
prefered -> preferred

Signed-off-by: Ferenc Wágner <wferi@niif.hu>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2016-09-12 09:50:11 +02:00
Jan Friesse
f837f95dfe Config: Flag config uidgid entries
Uidgid entries parsed from configuration files now have a prefix
(uidgid.config.) so they are distinguishable from dynamically added
entries. Entries added from the config file are pruned on reload if they no
longer exist in the config file (dynamic ones stay unaffected). Also, the whole
uidgid.config. prefix is made read-only.

This makes PCMK work again after a configuration reload is called.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2016-08-04 16:13:48 +02:00
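The pruning rule from f837f95dfe can be sketched as a filter over a key/value map. This is a Python illustration only: the map is a plain dict standing in for cmap, the function name is hypothetical, and the example key names after the uidgid.config. prefix are made up for the sketch.

```python
CONFIG_PREFIX = "uidgid.config."  # prefix named in the commit message


def prune_on_reload(cmap, new_config_keys):
    """Drop config-derived uidgid keys that vanished from the config file.

    Keys without CONFIG_PREFIX are treated as dynamically added and are
    always kept. `cmap` is a plain dict standing in for the real map.
    """
    keep = set(new_config_keys)
    return {
        key: value
        for key, value in cmap.items()
        if not key.startswith(CONFIG_PREFIX) or key in keep
    }
```

For example, with entries `uidgid.config.uid.1000`, `uidgid.config.uid.1001` and a dynamic `uidgid.uid.2000`, a reload whose config file only lists `uidgid.config.uid.1000` keeps that key and the dynamic one, and drops `uidgid.config.uid.1001`.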
Ferenc Wágner
b1de8efd15 Fix typo: aquire -> acquire
Signed-off-by: Ferenc Wágner <wferi@niif.hu>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2016-06-22 14:26:28 +02:00
Christine Caulfield
571b1621e9 Add some more RO keys
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2016-05-24 12:33:55 +02:00
Christine Caulfield
1e2de52ef1 logging: Use our own version of basename
The basename() function has some potentially odd issues on
other platforms.

So, to be safe, here's an internal version.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2016-05-03 15:31:29 +02:00
Christine Caulfield
d245831d65 logsys: fix TOTEM logging when corosync built out of tree
If corosync is built out-of-tree (passing --srcdir to configure) then
TOTEM logging doesn't print anything.

This is caused by the source filenames (from __FILE__ at compilation
time) having the configured path in them - in this example
../corosync/exec/totemudp.c etc. The list of totem source filenames
passed to the libqb logging facility only has the basenames, so the filenames
never match up, as libqb does an exact string match.

I looked into fixing this in libqb but it causes a regression. We can't
simply basename() __FILE__ at the point of calling log_printf as it's
also common to use __FILE__ to generate the logging source, and
using basename() on both removes the distinction between similarly named
files from different directories, which could be a requirement.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
2016-04-26 09:49:53 +01:00
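The mismatch described in d245831d65 is easy to reproduce in miniature. The sketch below is Python, not libqb: the set of source names and both function names are hypothetical, and the second function shows only one possible way around the problem (comparing basenames), which the commit message does not confirm as the fix that was applied.

```python
import os.path

# Hypothetical subset of the totem source list registered with the logger.
TOTEM_SOURCES = {"totemudp.c", "totemsrp.c"}


def matches_exact(file_macro):
    """Exact string comparison, which is what the message says libqb does."""
    return file_macro in TOTEM_SOURCES


def matches_basename(file_macro):
    """Comparison after stripping the directory part of the path."""
    return os.path.basename(file_macro) in TOTEM_SOURCES
```

With an in-tree build `__FILE__` is `totemudp.c` and both functions match. With the out-of-tree value `../corosync/exec/totemudp.c` only `matches_basename` does, at the cost of no longer distinguishing same-named files in different directories.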
Jan Friesse
d77cec24d0 Handle adding and removing UDPU members atomically
When the config file is reloaded with a UDPU member removed, the internal icmap
index of nodelist.node can change. This can result in a node being removed and then
added back. This, with UDPU alive filtering (where a member is by
default considered not to be a member), makes corosync stop sending messages
to such members, resulting in the creation of a new membership.

The solution is to properly test which members were really deleted and added
(instead of relying on the internal and dynamic naming of the icmap hash table
key name).

Also, truly dynamic adding and removing of a node (via cmap) is now handled by
the same function, so totem_config->interfaces is now updated properly.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2015-01-21 16:37:26 +01:00
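The idea in d77cec24d0 of comparing members by identity rather than by list position can be sketched with set differences. This is a Python illustration, not corosync's implementation: the function name is hypothetical and members are represented simply by their address strings.

```python
def diff_members(old_addrs, new_addrs):
    """Return (removed, added) member addresses between two node lists.

    Members are compared by address, not by their position in the list,
    so a node whose index shifts after a reload is neither removed nor
    re-added.
    """
    old, new = set(old_addrs), set(new_addrs)
    return sorted(old - new), sorted(new - old)
```

Removing the middle entry of `["10.0.0.1", "10.0.0.2", "10.0.0.3"]` shifts `10.0.0.3` from index 2 to index 1, but the diff reports only `10.0.0.2` as removed and nothing as added.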
Jan Friesse
252b38ab8a corosync_ring_id_store: Use safer permissions
corosync_ring_id_store should use the same (safer) permissions as
corosync_ring_id_create_or_load for a (possibly) newly created ringid
file.

Credit to Sjerek for finding this problem.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2015-01-20 11:21:05 +01:00
Jan Friesse
177ef0e524 Set RR priority by default
Experience with larger production clusters showed that setting RR
priority for corosync is a viable way to prevent random fencing, ...

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2015-01-05 15:01:49 +01:00
Jan Friesse
bb52fc2774 Store configuration values used by totem to cmap
Some totem configuration values (like token, consensus, ...) are either
computed or a default value is used. It's hard to find out what
value is really used.

The solution is to store the values in cmap.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-10-13 11:59:06 +02:00
Vladislav Bogdanov
e3ffd4fedc Implement config file testing mode
Signed-off-by: Vladislav Bogdanov <bubble@hoster-ok.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2014-07-16 16:10:32 +02:00
Jan Friesse
dfaca4b10a Fix compiler warning introduced by previous patch
The QB loop signal handler prototype differs from the signal(2) prototype.
The solution is to create wrapper functions.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2014-07-09 15:57:35 +02:00
zouyu
384760cb67 Handle SIGSEGV and SIGABRT signals
SIGSEGV and SIGABRT signals are now correctly handled (blackbox is
dumped and logsys is finalized).

Signed-off-by: zouyu <hopkings2005@gmail.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2014-07-03 15:13:48 +02:00
zouyu
cc80c8567d fix memory leak produced by 'corosync -v'
Signed-off-by: zouyu <hopkings2005@gmail.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2014-07-03 14:54:05 +02:00
Jan Friesse
72cf15af27 votequorum: Do not process events during reload
During reload, local_node_pos is deleted and its reinstatement is handled in
totemconfig after the reload is finished. votequorum handles these events and
tries to reload its configuration. This led to logging slightly scary
messages (even though nothing bad is happening, because after local_node_pos
is reinstated everything is back to normal).

The solution is to stop processing events during reload. Sadly, simple
tracking of config.reload_in_progress doesn't work because the LibQB event
triggering order is undefined, so the votequorum reload handler can be called
before totemconfig (and before local_node_pos is reinstated).

So a new config.totemconfig_reload_in_progress key is defined with very
similar semantics to config.reload_in_progress, but set inside the
totem_reload_notify function. Votequorum then uses this new key.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-06-27 11:40:21 +02:00
Jan Friesse
c8e3f14fdb Make config.reload_in_progress key read only
It's not a very good idea to allow user apps to change the internal key
reload_in_progress.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-06-27 11:40:18 +02:00
Jan Friesse
da46ecfc30 Move ringid store and load from totem library
Functions for storing and loading the ring id were in the totem library. This
raises the question of what to do when it's impossible to load or store the ring
id. The easy solution seemed to be an assert, but sadly this makes it hard for
the user to find out what happened (because corosync was just aborted and
logsys didn't flush).

The solution is to move these functions to main.c, where it is much easier to
handle errors. This also makes libtotem free of any file system
operations.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-06-02 14:54:57 +02:00
Jan Friesse
d310b251c3 Introduce get_run_dir function
The run dir (LOCALSTATEDIR/lib/corosync) was hardcoded throughout the whole codebase.
Totemsrp was trying to create and chdir into it, but it also
took into account the environment variable COROSYNC_RUN_DIR, creating
an inconsistency.

get_run_dir correctly returns COROSYNC_RUN_DIR (when set) or
LOCALSTATEDIR/lib/corosync. This is now used by all functions instead of
a hardcoded string.

All occurrences of mkdir/chdir are removed from totemsrp and chdir is
now called in the main function. The mkdir call is completely removed, because
it was not used anyway (the check in main.c was called before totemsrp init,
so mkdir was never called) and also make install and/or the package system
should take care of creating this directory with correct
permissions/context.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-06-02 14:53:18 +02:00
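The lookup order described for get_run_dir in d310b251c3 can be sketched as below. This is a Python stand-in for the C function: the value of LOCALSTATEDIR is an assumed build-time setting, the `environ` parameter exists only to make the sketch testable, and treating an empty variable as unset is an assumption.

```python
import os

LOCALSTATEDIR = "/var"  # assumption: the real value is fixed at build time


def get_run_dir(environ=None):
    """Return COROSYNC_RUN_DIR when set, else LOCALSTATEDIR/lib/corosync."""
    env = os.environ if environ is None else environ
    override = env.get("COROSYNC_RUN_DIR")
    if override:  # assumption: an empty value counts as "not set"
        return override
    return LOCALSTATEDIR + "/lib/corosync"
```

Routing every caller through one function like this is what removes the inconsistency between code that honoured the environment variable and code that used the hardcoded path.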
Jan Friesse
19c5b63ff5 logsys: Log error if blackbox cannot be created
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-06-02 14:36:08 +02:00
Jan Friesse
45dd9861ff Properly check result of symlink
An error message is displayed when it's impossible to create the symlink to
the fdata file.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-01-14 11:24:31 +01:00
Jan Friesse
5c54f941ac Fix cppchecks warning
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-01-14 11:24:29 +01:00
Jan Friesse
178c0d82d9 Close devnull file handler
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-01-14 11:24:26 +01:00
Jan Friesse
b88c0766fe logsys: Make logging of totem work again
Because of a change in libqb (9abb686), logging of the TOTEM subsystem stopped
working.

Instead of relying on the previous behavior (implicit substring match), all
totem files are now explicitly given.

Also, the QB subsystem now uses a comma-separated file list instead of the previous
function call.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2013-11-04 12:32:35 +01:00
Christine Caulfield
bc47c583bd Reload: Make coroparse use a designated icmap hash table
Pass an icmap hashtable into coroparse so we can load it into
a temporary one during reload

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-09-12 16:09:06 +01:00
Christine Caulfield
8567887abb [PATCH] Replace freopen with open/dup2 when daemonizing
This patch replaces the existing freopen method of
forcing stdin/out/err to /dev/null with the more
usual system of open/dup2.

While I don't like posting patches I don't fully understand,
this patch seems to fix a problem where stdout/err get
assigned to a socket causing double logging output
on systemd.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-09-10 15:33:31 +01:00
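The open/dup2 pattern that 8567887abb switches to can be sketched with Python's `os` module, which wraps the same system calls. This is an illustration, not the corosync code: the function name and the `fds` parameter are hypothetical, and the default of descriptors 0, 1 and 2 corresponds to stdin, stdout and stderr.

```python
import os


def redirect_to_devnull(fds=(0, 1, 2)):
    """Point each descriptor in fds at /dev/null using open + dup2."""
    devnull = os.open(os.devnull, os.O_RDWR)
    try:
        for fd in fds:
            # dup2 atomically closes fd (if open) and makes it refer to
            # the same open file description as devnull.
            os.dup2(devnull, fd)
    finally:
        # Close the extra descriptor unless it is itself a target.
        if devnull not in fds:
            os.close(devnull)
```

Unlike freopen, this works on the descriptors directly and does not depend on what the C library's stream objects were previously attached to.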
Christine Caulfield
3663622576 Add log message to exit signal handler
I've seen a few instances where corosync has shut down for
apparently 'no reason'. In fact most of the time the shutdown
has been caused by an external source (often an init script)
but it's not been obvious what has happened and people
implicate the daemon.

This patch simply adds a log message to the signal handler
when it is called so that the cause of the shutdown is obvious.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-09-03 14:04:50 +01:00
Michael Chapman
2740cfd1ea Fix scheduler pause-detection timeout
qb_loop_timer_add expects the timeout to be in nanoseconds, but we were
passing the value in milliseconds. Scale the timeout appropriately.

Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-08-19 09:03:24 +02:00
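The unit fix in 2740cfd1ea amounts to one multiplication. A minimal Python sketch, with a hypothetical function name and constant name:

```python
NSEC_PER_MSEC = 1_000_000  # nanoseconds in one millisecond


def ms_to_ns(timeout_ms):
    """Scale a millisecond timeout to the nanoseconds a timer API expects."""
    return timeout_ms * NSEC_PER_MSEC
```

Passing the unscaled value would make the timer fire a million times sooner than intended, so a 1000 ms timeout would behave like 1 microsecond.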
Jan Friesse
615d7592fb Log: Output parse errors to syslog
When corosync was started in daemon mode and there was a parse error, there
was no way to find out what happened (this is the usual situation on
systemd-enabled systems). The solution seems to be to output to syslog by
default.

Also, a redundant line setting logsys is removed because it's no
longer needed: the FORK and THREADED mode options no longer have any
effect. FORK is handled by libqb by default and THREADED mode is forced
by calling logsys_thread_start.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2013-06-21 11:21:42 +02:00
Jan Friesse
8429d01389 Detect big scheduling pauses
Add a poll timer scheduler to be called 3 times per token timeout.
If the poll timer was not called for more than 0.8 * token timeout, it means
the corosync process was not scheduled and either token_timeout should be
increased or load should be reduced (useful for VMs, where the host is
overcommitted so the VM is not scheduled as expected).

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-04-08 09:58:42 +02:00
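The two numbers in 8429d01389 (three timer runs per token timeout, and a 0.8 * token timeout threshold) can be sketched as follows. This is a Python illustration with hypothetical function names; timestamps are plain millisecond integers supplied by the caller rather than read from a clock.

```python
def timer_interval_ms(token_timeout_ms):
    """Interval that runs the check three times per token timeout."""
    return token_timeout_ms / 3


def pause_detected(last_run_ms, now_ms, token_timeout_ms):
    """True if the timer was not serviced for more than 0.8 * token timeout."""
    elapsed = now_ms - last_run_ms
    # Integer form of "elapsed > 0.8 * token_timeout" to avoid float rounding.
    return 10 * elapsed > 8 * token_timeout_ms
```

With a 1000 ms token timeout the timer is due roughly every 333 ms, so a gap of more than 800 ms means at least two expected runs were missed.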
Kazunori INOUE
1ad21e384e log: move Corosync started log messages
The "Corosync Cluster Engine ... started" message is now shown after
logsys is fully configured.

Signed-off-by: Kazunori INOUE <inouekazu@intellilink.co.jp>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-01-14 11:52:26 +01:00
Fabio M. Di Nitto
ed6bca3293 crypto: drop < 2.3 protocols and onwire compat
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-01-14 11:49:32 +01:00
Jan Friesse
6127be1806 Move qb_loop creation after daemonization
Creating qb_loop before daemonization is not a problem for poll or epoll
type loops, but it is a problem for kqueue, because a kqueue is not shared
between parent and child after fork.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-12-12 11:47:42 +01:00
Jan Friesse
dd588d004e Add option to specify ip version
Default is ipv4.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-12-03 14:02:32 +01:00
Steven Dake
402638929e Fix problem with sync operations under very rare circumstances
This patch creates a special message queue for synchronization messages.
This prevents a situation in which messages are queued in the
new_message_queue but have not yet been originated from corrupting the
synchronization process.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-11-22 11:47:57 +01:00
Fabio M. Di Nitto
220d659b38 totemcrypto: implement crypto packet format 2.2 and crypto_compat: config opt
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-11-22 11:13:30 +01:00
Jan Friesse
3cd4f9a1f5 Add support for selecting IPC type
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-11-08 12:16:11 +01:00
Jan Friesse
b7635ab9f7 Return back "Totem is unable to form..." message
This patch brings back the functionality named in the subject. It relies on the fact that
sendmsg will return an error, and if such an error is returned for a long time,
it's probably because of a firewall.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-10-08 16:53:35 +02:00
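The heuristic in b7635ab9f7 (send errors that persist for a long time probably indicate a firewall) can be sketched as a small tracker. This is a Python illustration under stated assumptions: the class name, the threshold parameter and the millisecond timestamps are hypothetical, and the commit message does not say how long "a long time" is.

```python
class SendFailureTracker:
    """Track how long consecutive send failures have lasted (hypothetical)."""

    def __init__(self, threshold_ms):
        self.threshold_ms = threshold_ms
        self.first_failure_ms = None  # start of the current failure streak

    def record(self, ok, now_ms):
        """Record one send result; return True once failures persist too long."""
        if ok:
            self.first_failure_ms = None  # any success resets the streak
            return False
        if self.first_failure_ms is None:
            self.first_failure_ms = now_ms
        return now_ms - self.first_failure_ms >= self.threshold_ms
```

A caller would log the "Totem is unable to form..." warning when `record` returns True, for example after every failed send.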