Commit Graph

64 Commits

Author SHA1 Message Date
Jan Friesse
23e17953fe cpg: Inform clients about left nodes during pause
Patch tries to fix incorrect behaviour during following test-case:
- 3 nodes
- Node 1 is paused
- Node 2 and 3 detects node 1 as failed and informs CPG clients
- Node 1 is unpaused
- Node 1 clients are informed about new membership, but not about Node 1
  being paused, so from Node 1 point-of-view, Node 2 and 3 failure

Solution is to:
- Remove downlist master choose and always choose local node downlist.
  For Node 1 in example above, downlist contains Node 2 and 3.
- Keep code which informs clients about left nodes
- Use joinlist as a authoritative source of nodes/clients which exists
  in membership

This patch doesn't break backwards compatibility.

I've walked thru all the patches which changed behavior of cpg to ensure
patch does not break CPG behavior. Most important were:
- 058f50314c - Base. Code was significantly
  changed to handle double free by split group_info into two structures
  cpg_pd (local node clients) and process_info (all clients). Joinlist
  was
- 97c28ea756 - This patch removed
  confchg_fn and made CPG sync correct
- feff0e8542 - I've tested described
  behavior without any issues
- 6bbbfcb6b4 - Added idea of using
  heuristics to choose same downlist on all nodes. Sadly this idea
  was beginning of the problems described in
  040fda8872,
  ac1d79ea7c,
  559d4083ed,
  02c5dffa5b,
  64d0e5ace0 and
  b55f32fe2e
- 02c5dffa5b - Made joinlist as
  authoritative source of nodes/clients but left downlist_master_choose
  as a source of information about left nodes

Long story made short. This patch basically reverts
idea of using heuristics to choose same downlist on all nodes.

(ported from needle 9c2a97f4f9)

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-04-30 14:37:20 +02:00
Christine Caulfield
fc8580bdbf totem: Use nodeid ONLY in srp_addr
This shrinks the srp_addr (and consequently every packet sent by
corosync) so that instead of containing loads of IP addresses to
identify a node, it just sends the nodeid.

This then allows us to make ring0 optional and replaceable when running
knet.

It also means that we need some other way of identifying the local
node in corosync.conf, so the nodelist.node.name entry is now mandatory
and is mapped to the local host using the same algorithm as used in
cman.

This code needs LOTS of testing as it touches a huge amount of totemsrp
and totemconfig.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-03-01 14:18:51 +01:00
Takeshi MIZUTA
4939c75629 Remove redundant header file inclusion
Signed-off-by: Takeshi MIZUTA <miz.take4@gmail.com>
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2016-12-05 09:59:08 +01:00
Jan Friesse
1f90c31ba7 list: Replace for_each by safe version where need
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2016-10-27 14:56:52 +02:00
Michael Jones
b4c06e52f3 list: Replace uses of list.h with qblist.h
Signed-off-by: Michael Jones <jonesmz@jonesmz.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2016-10-27 14:56:52 +02:00
Jan Friesse
1925074909 Fix few bugs found by coverity
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
2016-06-28 13:58:43 +02:00
Christine Caulfield
8cc8e51363 cpg: Add support for messages larger than 1Mb
If a cpg client sends a message larger than 1Mb (actually slightly
less to allow for internal buffers) cpg will now fragment that into
several corosync messages before sending it around the ring.

cpg_mcast_joined() can now return CS_ERR_INTERRUPT which means that the
cpg membership was disrupted during the send operation and the message
needs to be resent.

The new API call cpg_max_atomic_msgsize_get() returns the maximum size
of a message that will not be fragmented internally.

New test program cpghum was written to stress test this functionality,
it checks message integrity and order of receipt.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2015-03-05 16:45:15 +00:00
Jan Friesse
fbe8768f1b cpg: Make sure left nodes are really removed
When node is paused and other nodes has in meantime exited cpg process,
paused node after resume doesn't update it's membership correctly so on
previously paused node exited cpg process is still visible.

Solution is to compare join list with cpd and remove all pids which are
not included in join list.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-02-19 10:59:14 +01:00
Jan Friesse
83c63b247f cpg: Make sure nodid is always logged as hex num
Also number is prefixed by 0x so it's easier to spot that number is
hexadecimal.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-02-19 10:59:10 +01:00
Jan Friesse
fcf26e0303 cpg: Refactor mh_req_exec_cpg_procleave
Most of functionality is moved to do_proc_leave function to make it
reusable.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2014-02-19 10:59:05 +01:00
Jan Friesse
e684e4ca6f Remove unnecessary mmap in cpg
Code for zero-copy in cpg does following mmaps:
- Mmap anonymous, private memory to some address (-> malloc)
- Mmap shared memory of fd to address returned by first mmap
  (effectively shadows first mapping)

This is not necessary and only one mapping is needed.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2013-05-21 14:46:15 +02:00
Jan Friesse
5ce59f49ba Move some totem and cpg messages to trace level
Messages which are flow messages, rather then lifecycle are now logged
in trace level.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-09-19 11:03:16 +02:00
Angus Salkeld
0e86aa4ac6 Fix cpg_membership_get()
The wrong size was getting set in exec/cpg.c

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-08-31 14:48:35 +10:00
Fabio M. Di Nitto
6d28d51284 build: bring SOLARIS up to the same standard as other OSes
drop all SOLARIS specific ifdefs and replace them with feature checks

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-30 15:00:27 +02:00
Fabio M. Di Nitto
18929089d1 build: drop MAP_ANONYMOUS check from configure
define it only in case it's not there

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-30 15:00:27 +02:00
Fabio M. Di Nitto
a1c154e6fa build: use MADV_NOSYNC only when it's defined
so far only FreeBSD defines it.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-08-30 15:00:27 +02:00
Jan Friesse
2d10e2bbea cpg: Check input param name_t length
IPC is using buffer of CS_MAX_NAME_LENGTH for name. If user calls
function with longer string, such string can be passed to service
incomplete.

Solution is to not allow string larger then CS_MAX_NAME_LENGTH
and return error.

Same applies to cpg service.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-08-09 15:10:35 +02:00
Jan Friesse
537bf56fcc cpg: Be more verbose for procjoin message
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-07-30 10:22:16 +02:00
Jan Friesse
a966506c1e cpg: Enhance downlist selection algorithm
Let's say we have 2 nodes:
- node 2 is paused
- node 1 create membership (one node)
- node 2 is unpaused

Result is that node 1 downlist is selected, so it means that from node 2
point of view, node 1 was never down.

Patch solves situation by adding additional check for largest previous
membership.

So current tests are:
1) largest (previous #nodes - #nodes know to have left)
2) (then) largest previous membership
3) (and last as a tie-breaker) node with smallest nodeid

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-06-14 15:15:42 +02:00
Jan Friesse
f3457c5d49 cpg: Print cpg name to debug informations
In downlist and joinlist debug output group was printed in nonsense
format of integer to pointer to array.

Now it's printed by full name.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-06-14 15:15:39 +02:00
Jan Friesse
35446d6bcc cpg: Process join list after downlists
let's say following situation will happen:
- we have 3 nodes
- on wire messages looks like D1,J1,D2,J2,D3,J3 (D is downlist, J is
  joinlist)
- let's say, D1 and D3 contains node 2
- it means that J2 is applied, but right after that, D1 (or D3) is
  applied what means, node 2 is again considered down

It's solved by collecting joinlists and apply them after downlist, so
order is:
- apply best matching downlist
- apply all joinlists

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-06-14 15:15:35 +02:00
Jan Friesse
816d7687b0 cpg: Never choose downlist with localnode
Test scenario is follows:
- node 1, node 2
- node 1 is paused
- node 2 sees node 1 dead
- node 1 unpaused
- node 1 and 2 both choose same dowlist message which includes node 2 ->
node 2 is efectivelly disconnected

Patch includes additional test if left_node is localnode. If so, such
downlist is ignored.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-06-14 15:15:32 +02:00
Fabio M. Di Nitto
8f6e5ff530 sync: kill evil and syncv1 in one shot
this change breaks onwire compatibility.

cpg is the only user of sync_* interface and it's the only
service that will require extra testing.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-09 11:15:08 +01:00
Steven Dake
2ad0cdc832 Update copyright header dates in exec directory
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-02-13 17:05:04 -07:00
Steven Dake
4ee9550f80 Remove jhash.h since it is not used
We would use libqb for hashing now if we needed hashing.
cpg no longer uses jhash.h.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Fabio Di Nitto <fdinitto@redhat.com>
2012-02-13 17:05:04 -07:00
Jiaju Zhang
dd9e177af7 CPG: Send CPG_REASON_PROCDOWN when really needed
This patch fixes the issue that in some cases where cpg_finalize()
was called just after cpg_leave() was called, CPG_REASON_PROCDOWN
might also be sent while CPG_REASON_LEAVE had already been sent.
This behavior is not aligned with what the man page has described:
"CPG_REASON_PROCDOWN - the process left a group without calling
cpg_leave()."
And it will confuse CPG's clients in that one process left results
in two different reasons being sent.

The root cause of this issue is cpg_leave() will return after
adding the LEAVE message to the sending queue, but the cpg's group
name has not been cleared yet. Just at that time, cpg_finalize()
is being called, then it determines if there is the calling of
cpg_leave() happened only by the checking of cpg's group name, so
this method is not sufficient.

Signed-off-by: Jiaju Zhang <jjzhang@suse.de>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-02-06 08:07:54 -07:00
Steven Dake
007e5c9458 Honor exec_init_fn call
exec_init_fn now either returns NULL (success) or a string which indicates
the error that occured during service engine initialization.  If an error
occurs, corosync will exit.  This patch adds ykd and makes other suggestions
from Fabio Di Nitto.

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Fabio Di Nitto <fdinitto@redhat.com>
2012-01-30 14:05:09 -07:00
Angus Salkeld
3131601ce2 Remove all unneccessary "\n" from log messages
These look ugly, are inconsistently done and just have
to be removed later in libqb before calling syslog.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-01-23 13:08:23 +11:00
Steven Dake
f763d3ba4a Initial removal of plugins
Quorum is broken in this patch.
service.h needs to be cleaned up significantly

Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Fabio Di Nitto <fdinitto@redhat.com>
2012-01-16 09:30:26 -07:00
Steven Dake
46babc95ad Initial move of corosync and openais trees into seperate directories.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1582 fd59a12c-fef9-0310-b244-a6a79926bd2f
2008-07-21 07:59:08 +00:00
Patrick Caulfield
a53b222341 Add cpg_groups_get call to libcpg.
This call causes a complete list of active groups and their
membership lists to be sent to a callback function.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1571 fd59a12c-fef9-0310-b244-a6a79926bd2f
2008-07-02 07:19:50 +00:00
Steven Dake
9e2376fcc0 Remove totemip.h reference from file.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1561 fd59a12c-fef9-0310-b244-a6a79926bd2f
2008-06-24 04:44:45 +00:00
Steven Dake
f40e9a1283 Endian convert downlist messages from cpg.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1470 fd59a12c-fef9-0310-b244-a6a79926bd2f
2007-10-22 15:42:36 +00:00
Patrick Caulfield
94561626e6 Remove some includes from .h files so they can be installed.
Also install flow.h & ipc.h for external services.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1467 fd59a12c-fef9-0310-b244-a6a79926bd2f
2007-10-10 10:33:55 +00:00
Steven Dake
113a3c4f88 The logsys logging system. Read logsys_overview.8.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1428 fd59a12c-fef9-0310-b244-a6a79926bd2f
2007-09-09 06:38:10 +00:00
Patrick Caulfield
a245f25ac8 Clear pid when we leave a process group
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1399 fd59a12c-fef9-0310-b244-a6a79926bd2f
2007-06-25 12:34:44 +00:00
Steven Dake
39b3f0d5a6 Add cpg_local_get api to cpg service
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1391 fd59a12c-fef9-0310-b244-a6a79926bd2f
2007-06-25 03:04:35 +00:00
Steven Dake
0a19a21f1b Remove this_ip from the source tree and replace with accessor functions.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1379 fd59a12c-fef9-0310-b244-a6a79926bd2f
2007-06-05 08:55:44 +00:00
Steven Dake
cb154572a2 Patch from Renaud to report some broken Solaris porting from past.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1353 fd59a12c-fef9-0310-b244-a6a79926bd2f
2007-03-06 16:18:44 +00:00
Patrick Caulfield
2a12de36f2 Fix ordering of join messages
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1324 fd59a12c-fef9-0310-b244-a6a79926bd2f
2006-12-12 17:47:33 +00:00
Hans Feldt
97919b8d16 Cleaning up and preparing for later patch.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1310 fd59a12c-fef9-0310-b244-a6a79926bd2f
2006-11-17 06:57:00 +00:00
Steven Dake
336dc17daa Forward port of flow control work from whitetank branch.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1289 fd59a12c-fef9-0310-b244-a6a79926bd2f
2006-11-04 22:29:14 +00:00
Fabien Thomas
d4e4f10df1 do not include alloca.h under BSD; alloca is in stdlib.h
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1221 fd59a12c-fef9-0310-b244-a6a79926bd2f
2006-08-18 07:38:21 +00:00
Patrick Caulfield
1141aab7b3 fixe a bug in cpg where get_group() will return the wrong group
info structure if there is a hash collision.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1199 fd59a12c-fef9-0310-b244-a6a79926bd2f
2006-08-11 07:37:15 +00:00
Steven Dake
90ccff6bbc Solaris port for openais
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1175 fd59a12c-fef9-0310-b244-a6a79926bd2f
2006-08-05 02:22:12 +00:00
Steven Dake
6a00f63ff9 Patch so realloc reverts to old buffer if reallocation fails.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1170 fd59a12c-fef9-0310-b244-a6a79926bd2f
2006-07-28 23:34:28 +00:00
Steven Dake
8cb23508df Clean up endian swabbing for cpg service.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1162 fd59a12c-fef9-0310-b244-a6a79926bd2f
2006-07-26 07:07:22 +00:00
Patrick Caulfield
b776747e9a Send the new joinlists using the sync service, so it happens atomically.
This should fix some odd sequencing bugs.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1136 fd59a12c-fef9-0310-b244-a6a79926bd2f
2006-07-19 15:32:25 +00:00
Steven Dake
1f60232e88 Make cpg 32/64 userland safe and endian safe.
git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1085 fd59a12c-fef9-0310-b244-a6a79926bd2f
2006-06-23 18:38:25 +00:00
Patrick Caulfield
d84e890831 Fix message alignment in CPG.
we now unpack the message in the same way as we pack it.


git-svn-id: http://svn.fedorahosted.org/svn/corosync/trunk@1084 fd59a12c-fef9-0310-b244-a6a79926bd2f
2006-06-22 16:07:30 +00:00