Commit Graph

3416 Commits

Author SHA1 Message Date
Jan Friesse
57ff693b70 mon: Fix comparsion typo
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-02-25 14:57:13 +01:00
Jan Friesse
e1e2390b61 mon: Make mon compilable with libstatgrab ver 0.9
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-02-25 14:57:10 +01:00
Jan Friesse
fbe8768f1b cpg: Make sure left nodes are really removed
When node is paused and other nodes has in meantime exited cpg process,
paused node after resume doesn't update it's membership correctly so on
previously paused node exited cpg process is still visible.

Solution is to compare join list with cpd and remove all pids which are
not included in join list.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-02-19 10:59:14 +01:00
Jan Friesse
83c63b247f cpg: Make sure nodid is always logged as hex num
Also number is prefixed by 0x so it's easier to spot that number is
hexadecimal.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-02-19 10:59:10 +01:00
Jan Friesse
fcf26e0303 cpg: Refactor mh_req_exec_cpg_procleave
Most of functionality is moved to do_proc_leave function to make it
reusable.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-02-19 10:59:05 +01:00
Jan Friesse
38c04d9a66 totemsrp: Fix typo with cont gather
Patch f3ffd3da5c introduced named states
of state-machine, but sadly contains logical problem causing
stats.continuous_gather increasing even when it shouldn't. Problem is
not critical, because continuous_gather is set to 0 on successful
membership creation.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-02-18 16:12:57 +01:00
Christine Caulfield
90d448af3b votequorum: Add extended options to auto_tie_breaker
This patch adds more flexibility to the auto_tie_breaker feature of
votequorum. With this, not only can the lowest nodeid be used as
a tie breaker, but also the highest, or a node from a nominated list.

If there is a list of nodes, the first node in the list that was not
part of the previous partition is used. This allows the user to
specify a preferred set of nodes but prevents a split-brain if the
cluster divides evenly with a node in each half.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2014-02-17 16:29:45 +00:00
Masatake YAMATO
fa71067a93 Free object allocated at quorum_register_callback
Memory object allocated with malloc at quorum_register_callback
is not freed. The object is linked to internal_trackers_list.

The object is unlinked at quorum_unregister_callback. However,
it is not freed at the function.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2014-01-23 17:18:44 +01:00
Jan Friesse
45dd9861ff Properly check result of symlink
Error message is displayed when it's impossible to create symlink to
fdata file.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-01-14 11:24:31 +01:00
Jan Friesse
5c54f941ac Fix cppchecks warning
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-01-14 11:24:29 +01:00
Jan Friesse
178c0d82d9 Close devnull file handler
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-01-14 11:24:26 +01:00
Christine Caulfield
8020f1d8e8 votequorum: Add missing man pages
Man pages for votequorum_qdevice_update and votquorum_qdevice_master_wins
were missing from the last commit.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
2014-01-14 10:07:46 +00:00
Jason
cfbb021e13 totem: Drop invalid join msg in operational state
According to the totem paper, if a processor
receives a join message in the operational state and if the
receivers identifier is in the join messages fail list,
then join message should be ignored.

By applying this validation of join messages, we can avoid unnecessary
switching from operational state to gather state(or even lead to rings
can not be merged) like the following to happen.

1. Initially, there is only one ring contains three nodes, say
   ring(A,B,C).
2. A and B network partition, "in the same time", C is down.
3. Node A sends join message with proclist:A,B,C. faillist:NULL.
   Node B sends join message with proclist:A,B,C. faillist:NULL.
4. Both A and B consensus timeout due to network partition.
5. A and B network remerged.
6. Node A sends join message with proclist:A,B,C. faillist:B,C. and
   create ring(A).
   Node B sends join message with proclist:A,B,C. faillist:A,C. and
   create ring(B).
7. Say join message with proclist:A,B,C. faillist:A,C which sent
   by node B is received by node A because network remerged.
8. Node A shifts to gather state and send out a modified join message
   with proclist:A,B,C. faillist:B. Such join message will prevent
   both A and B from merging.
9. Node A consensus timeout (caused by waiting node C) and sends join
   message with proclist:A,B,C. faillist:B,C again.

Same thing happens on node B, so A and B will dead loop forever
in step 7, 8 and 9.

As the paper also said: "If a processor receives a join message in the
operational state and if the sender's identifier is in the receiver's
my_proclist and the join message's ring_seq is less than the receiver's
ring sequence number, then it ignores the join message too." So these
patch applying these validations of join messages altogether.

Signed-off-by: Jason <huzhijiang@gmail.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2014-01-13 14:46:13 +01:00
Jan Friesse
6cd1f67f1f systemd unit: Make sure network is really up
Change network.target to network-online.target to make sure network is
really up and running when starting corosync.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-01-13 14:40:28 +01:00
Christine Caulfield
fcba1652ba votequorum: Improve/add documentation for quorum device API
Improve the man pages for the votequorum qdevice API and include
them in the build. Also improve the testvotequorum2 test program.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2014-01-13 09:57:32 +00:00
Christine Caulfield
ff6a43edb3 votequorum: Add persistent expected_votes tracking.
This patch adds the option to store expected_votes to
persistent storage. This is needed to allow_downscale
to operate properly.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
2014-01-07 15:30:11 +00:00
Jan Friesse
7014f10123 cfgtool: return error on reload failure
If reload fails, return code is set to value >0 to indicate error.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2013-11-26 15:39:34 +01:00
Christine Caulfield
a91192cfe9 man pages: Note that votequorum's allow_downscale is unsupported
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-11-08 09:37:38 +00:00
Jan Friesse
b88c0766fe logsys: Make logging of totem work again
Because of change in libqb (9abb686) logging of TOTEM subsystem stopped
working.

Instead of rely on previous behavior (implicit substring match), all
totem files are now explicitly given.

Also QB subsystem now uses comma separated filelist instead of previous
function calling.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2013-11-04 12:32:35 +01:00
Masatake YAMATO
f3ffd3da5c totemsrp: Show English message when memb_state_gather_enter is called
The reason why memb_state_gather_enter is invoked was printed
in integer code. This patch introduces human readable English
messages for the code.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-10-24 16:46:17 +02:00
Yevheniy Demchenko
805b3423ee totemiba: Check if configured MTU is allowed by HW
Solution use aproximation of totem structures. This needs to be
rewritten in proper way. Also MTU checking should be implemented for IP
transports.

Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz>
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-09-20 11:27:08 +02:00
Yevheniy Demchenko
8f14a5788f totemiba: Fix parameters position for poll_add
Parameters in functions like mcast_cq_send_event_fn, ... were defined in
incorrect order. Also their names were weird.

Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz>
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-09-20 11:26:50 +02:00
Yevheniy Demchenko
c5d4a0762f totemiba: Del channel fd from poll before destroy
Corosync freezes after several peer node connects/disconnects. The
freeze happens in recv_token_cq_recv_event_fn in ibv_get_cq_event call.
The problems is in fact, that after each peer node connect,
recv_token_accept_destroy is called, which tries to call
poll_dispatch_delete _after_ freeing of completion_channel. As
completion_channel contains fd, handlers are not disconnected from
poller properly. This leads to complete inconsistency in subsequent
calls to handlers.

Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-09-20 11:26:04 +02:00
Yevheniy Demchenko
5046de387b totemiba: Properly allocate RDMA buffers
1. In UD mode receivnig side of RDMA application should have enough
space in buffer to hold data and GRH. Also, sge.length on the receiving
size should be set to max_msg_size + sizeof (struct ibv_grh). Current
corosync doesn't take grh in the account and does not work if mtu is set
to the real mtu of IB port (it works if netmtu is set to < 2048-40).
2. ibv_wc.byte_len is the actual lentgh of the received packet, i.e.
msg_len + GRH. GRH length should be substracted in further proceeding.
If not, it might cause problems when messages get retransmitted, as
their apparent size will constantly grow.
3. Current corosync will not work with rdma and mtus > 2048. Most modern
IB HW supports 4096 mtu.

Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-09-20 11:26:00 +02:00
Christine Caulfield
c6688c6e11 Reload: document config.reload_in_progress in man page
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
2013-09-12 16:10:35 +01:00
Christine Caulfield
1a046793cb Reload: Add atomic reload to log config
When a reload is in progress, wait until it has all finished
before re-reading all of the logging parameters

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-09-12 16:10:07 +01:00
Christine Caulfield
c0bfd48928 Reload: Add atomic reload to totemconfig
When a reload is in progress, wait until the whole thing has
finished before setting parameters

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-09-12 16:09:55 +01:00
Christine Caulfield
82fbffc34b Reload: Add reload code to cfg
Add the code to do the actual corosync.conf reload to cfg, along with
a corosync-cfgtool -R command to trigger it

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-09-12 16:09:41 +01:00
Christine Caulfield
bc47c583bd Reload: Make coroparse use a designated icmap hash table
Pass an icmap hashtable into coroparse so we can load it into
a temporary one during reload

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-09-12 16:09:06 +01:00
Jan Friesse
95133a5d77 icmap: Add func to test equality of two key values
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2013-09-10 17:02:12 +02:00
Christine Caulfield
8567887abb [PATCH] Replace freopen with open/dup2 when daemonizing
This patch replaces the existing freopen method of
forcing stdin/out/err to /dev/null with the more
usual system of open/dup2.

While I don't like posting patches I don't fully understand,
this patch seems to fix a problem where stdout/err get
assigned to a socket causing double logging output
on systemd.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-09-10 15:33:31 +01:00
Christine Caulfield
3663622576 Add log message to exit signal handler
I've seen a few instances where corosync has shut down for
apparently 'no reason'. In fact most of the time the shutdown
has been caused by an external source (often an init script)
but it's not been obvious what has happened and people
implicate the deamon

This patch simply adds a log message to the signal handler
when it is called so that the cause of the shutdown is obvious.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-09-03 14:04:50 +01:00
Jan Friesse
26ef8e15db icmap: Add map copy function
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2013-08-29 17:08:46 +02:00
Jan Friesse
e363f8b06d icmap: Add function to return item data pointer
icmap_get_r is now implemented using this function. Function is not very
safe tho defined as static.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2013-08-29 17:08:41 +02:00
Jan Friesse
624cd439aa icmap: Fix value len checking for strings
Implementation should allow pass only parts of string (shorten string)
and must prohibit reading of uninitialized memory.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2013-08-29 17:08:37 +02:00
Jan Friesse
04ddddd6d2 icmap: Add function to return global icmap
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2013-08-29 17:08:32 +02:00
Jan Friesse
e5a528c5cb icmap: Allow multiple icmap instances
Patch adds reentrant version of most of functions (with exception of RO
flags support and tracking) to allow multiple icmap instances existence
inside corosync.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2013-08-27 15:23:52 +02:00
Michael Chapman
2740cfd1ea Fix scheduler pause-detection timeout
qb_loop_timer_add expects the timeout to be in nanoseconds, but we were
passing the value in milliseconds. Scale the timeout appropriately.

Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-08-19 09:03:24 +02:00
Jan Friesse
e5f2ecab0e Remove dir pragma for xml2conf.xsl in specfile
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-07-10 12:17:17 +02:00
Jan Friesse
ab16f10ff7 cts: Update DC_IDLE pattern
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2013-07-09 11:00:40 +02:00
Kazunori INOUE
3551dc920b Use restart policy in the corosync-notifyd unit
Signed-off-by: Kazunori INOUE <inouekazu@intellilink.co.jp>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-07-08 11:01:41 +02:00
David Vossel
b424acc3a0 ipc_glue: proper ref counting during service connection iteration
Signed-off-by: David Vossel <dvossel@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-07-04 13:05:52 +02:00
David Vossel
aa8e56a0fe ipc_glue: Remove connection unref with no matching reference.
We don't reference the connection object on creation, so there
is on reason to dereference it on disconnect.

Signed-off-by: David Vossel <dvossel@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-07-04 13:05:36 +02:00
David Vossel
771b239603 ipc_glue: Fixes connection ref count leak
Signed-off-by: David Vossel <dvossel@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-07-04 13:05:02 +02:00
Kazunori INOUE
e425cd5805 systemd: fix typo in unit file
Signed-off-by: Kazunori INOUE <inouekazu@intellilink.co.jp>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-07-01 10:34:22 +02:00
Kazunori INOUE
12d0d76e0c notifyd: fix handle dispatch functions results
Signed-off-by: Kazunori INOUE <inouekazu@intellilink.co.jp>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2013-07-01 10:26:17 +02:00
Christine Caulfield
074e57910e The corosync message "A processor joined or left the membership" is
vague and unhelpful. People have to look for the following quorum
message and try to deduce which nodes have joined or left from that
and past membership messages, even though the routine printing the
message already has this information to hand.

This patch fixes that message so that it prints the nodeids of the nodes
that have joined/left the cluster.

Signed-Off-By: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-By: Jan Friesse <jfriesse@redhat.com>
2013-06-27 14:44:46 +01:00
Jan Friesse
615d7592fb Log: Output parse errors to syslog
When corosync was started in daemon mode and there was parse error, no
way existed how to find out what happened (this is usual situation with
systemd enabled systems). Solution seems to be output to syslog by
default.

Also redundant line with setting logsys is removed because it's no
longer needed, because FORK and THREADED mode options has no longer
effect. FORK is handled by libqb by default and THREADED mode is forced
by calling logsys_thread_start.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2013-06-21 11:21:42 +02:00
Jan Friesse
d6dd2e455d totemconfig: Prevent leak of cluster_name str
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2013-06-21 11:21:33 +02:00
Jan Friesse
7cba14fb61 service: Fix memleak in service_unlink_and_exit
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2013-06-21 11:21:29 +02:00