mirror_corosync/exec
Jan Friesse 23e17953fe cpg: Inform clients about left nodes during pause
This patch tries to fix incorrect behaviour in the following test case:
- 3 nodes
- Node 1 is paused
- Nodes 2 and 3 detect Node 1 as failed and inform CPG clients
- Node 1 is unpaused
- Node 1 clients are informed about the new membership, but not about
  Node 1 having been paused, which from Node 1's point of view is the
  failure of Nodes 2 and 3

Solution is to:
- Remove the downlist master choice and always use the local node's
  downlist. For Node 1 in the example above, the downlist contains
  Nodes 2 and 3.
- Keep the code which informs clients about left nodes
- Use the joinlist as the authoritative source of nodes/clients which
  exist in the membership
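
The first point above can be illustrated with a minimal sketch. It is not
the code in cpg.c; the function names, parameters and array-based lists are
assumptions made only for this example. It assumes each node builds its
downlist purely from its own view: every node of the previous membership
that the local node no longer sees is reported as left.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when nodeid appears in list[0..len). */
static bool nodeid_in_list(unsigned int nodeid,
	const unsigned int *list, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		if (list[i] == nodeid) {
			return true;
		}
	}
	return false;
}

/*
 * Hypothetical sketch, not the corosync implementation: the local
 * downlist is every node of the previous membership that this node
 * does not see any more. No master node is chosen and no other node's
 * downlist is consulted.
 *
 * Writes the left node ids to left_out and returns how many there are.
 */
static size_t local_downlist(
	const unsigned int *old_members, size_t old_len,
	const unsigned int *still_seen, size_t seen_len,
	unsigned int *left_out, size_t left_cap)
{
	size_t n = 0;

	/* The result can never be longer than the previous membership. */
	assert(left_out != NULL && left_cap >= old_len);

	for (size_t i = 0; i < old_len; i++) {
		if (!nodeid_in_list(old_members[i], still_seen, seen_len)) {
			left_out[n++] = old_members[i];
		}
	}
	return n;
}
```

With a previous membership of nodes 1, 2 and 3, calling this on Node 1
(which still sees only itself) yields nodes 2 and 3, while calling it on
Node 2 (which still sees nodes 2 and 3) yields node 1. Each side therefore
reports the failure it actually observed.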

This patch doesn't break backwards compatibility.

I've walked through all the patches which changed the behavior of CPG to
ensure this patch does not break it. The most important were:
- 058f50314c - Base. Code was significantly changed to handle a double
  free by splitting group_info into two structures, cpg_pd (local node
  clients) and process_info (all clients). Joinlist was
- 97c28ea756 - This patch removed
  confchg_fn and made CPG sync correct
- feff0e8542 - I've tested described
  behavior without any issues
- 6bbbfcb6b4 - Added the idea of using heuristics to choose the same
  downlist on all nodes. Sadly, this idea was the beginning of the
  problems described in 040fda8872, ac1d79ea7c, 559d4083ed, 02c5dffa5b,
  64d0e5ace0 and b55f32fe2e
- 02c5dffa5b - Made the joinlist the authoritative source of
  nodes/clients but left downlist_master_choose as the source of
  information about left nodes

Long story short: this patch basically reverts the idea of using
heuristics to choose the same downlist on all nodes.

(ported from needle 9c2a97f4f9)

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-04-30 14:37:20 +02:00
..
.gitignore Add .gitignore files. 2010-10-21 07:43:46 -07:00
apidef.c CFG: Remove ring-reenable code 2017-08-03 14:32:02 +02:00
apidef.h Update copyright header dates in exec directory 2012-02-13 17:05:04 -07:00
cfg.c knet: Always use link0 for loopback 2018-03-01 14:23:20 +01:00
cmap.c cmap: Remove noop highest config version check 2017-10-11 17:11:33 +02:00
coroparse.c totem: Use nodeid ONLY in srp_addr 2018-03-01 14:18:51 +01:00
cpg.c cpg: Inform clients about left nodes during pause 2018-04-30 14:37:20 +02:00
cs_queue.h Update copyright header dates in exec directory 2012-02-13 17:05:04 -07:00
fsm.h Make logging of WD and MON service correct 2012-08-16 14:45:15 +02:00
icmap.c stats: Add map with on-demand statistics 2017-07-27 15:53:04 +02:00
ipc_glue.c stats: Add cmap key to clear the various stats. 2017-10-31 17:39:14 +01:00
ipcs_stats.h stats: Add cmap key to clear the various stats. 2017-10-31 17:39:14 +01:00
logconfig.c logging: Make blackbox configurable 2018-01-30 13:21:48 +01:00
logconfig.h list: Replace uses of list.h with qblist.h 2016-10-27 14:56:52 +02:00
logsys.c logging: Close before and open blackbox after fork 2018-01-30 13:21:52 +01:00
main.c Fix typo: sucesfully -> successfully 2018-04-20 12:04:49 +02:00
main.h Reload: Make coroparse use a designated icmap hash table 2013-09-12 16:09:06 +01:00
Makefile.am [build] fix build with non-standard knet location 2018-02-05 15:57:12 +01:00
mon.c list: Replace uses of list.h with qblist.h 2016-10-27 14:56:52 +02:00
pload.c build: bring SOLARIS up to the same standard as other OSes 2012-08-30 15:00:27 +02:00
quorum.c Remove redundant header file inclusion 2016-12-05 09:59:08 +01:00
quorum.h Update copyright header dates in exec directory 2012-02-13 17:05:04 -07:00
schedwrk.c schedwrk: Cleanup and make it work on PPC BE 2016-05-17 16:29:25 +02:00
schedwrk.h Update copyright header dates in exec directory 2012-02-13 17:05:04 -07:00
service.c service: Fix memleak in service_unlink_and_exit 2013-06-21 11:21:29 +02:00
service.h service: remove leftovers from mt corosync 2012-08-09 15:10:16 +02:00
stats.c stats: Add some missing knet stats 2017-11-16 08:35:50 +01:00
stats.h stats: Add map with on-demand statistics 2017-07-27 15:53:04 +02:00
sync.c sync: Call sync_init of all services at once 2017-11-16 15:22:19 +01:00
sync.h sync: kill evil and syncv1 in one shot 2012-03-09 11:15:08 +01:00
timer.c Update copyright header dates in exec directory 2012-02-13 17:05:04 -07:00
timer.h Update copyright header dates in exec directory 2012-02-13 17:05:04 -07:00
totemconfig.c config: Allow use of ring0_addr 2018-03-01 14:21:37 +01:00
totemconfig.h config: Allow links to have different ip_versions 2017-12-22 17:15:19 +01:00
totemip.c totem: Display IP of sender 2018-03-16 13:58:15 +01:00
totemknet.c totem: Display IP of sender 2018-03-16 13:58:15 +01:00
totemknet.h totem: Display IP of sender 2018-03-16 13:58:15 +01:00
totemnet.c totem: Display IP of sender 2018-03-16 13:58:15 +01:00
totemnet.h totem: Display IP of sender 2018-03-16 13:58:15 +01:00
totempg.c knet: Always use link0 for loopback 2018-03-01 14:23:20 +01:00
totemsrp.c totemsrp: Fix leave message regression 2018-04-23 17:46:05 +02:00
totemsrp.h knet: Always use link0 for loopback 2018-03-01 14:23:20 +01:00
totemudp.c totem: Display IP of sender 2018-03-16 13:58:15 +01:00
totemudp.h totem: Display IP of sender 2018-03-16 13:58:15 +01:00
totemudpu.c totem: Display IP of sender 2018-03-16 13:58:15 +01:00
totemudpu.h totem: Display IP of sender 2018-03-16 13:58:15 +01:00
util.c list: Replace uses of list.h with qblist.h 2016-10-27 14:56:52 +02:00
util.h stats: Add map with on-demand statistics 2017-07-27 15:53:04 +02:00
votequorum.c totem: Use nodeid ONLY in srp_addr 2018-03-01 14:18:51 +01:00
votequorum.h list: Replace uses of list.h with qblist.h 2016-10-27 14:56:52 +02:00
vsf_quorum.c Remove redundant header file inclusion 2016-12-05 09:59:08 +01:00
vsf_ykd.c YKD: Fix loading of YKD quorum module 2014-08-18 09:33:59 +01:00
vsf_ykd.h list: Replace uses of list.h with qblist.h 2016-10-27 14:56:52 +02:00
vsf.h Update copyright header dates in exec directory 2012-02-13 17:05:04 -07:00
wd.c wd: fix snprintf warnings 2017-12-01 17:23:54 +01:00