mirror_corosync

mirror of https://git.proxmox.com/git/mirror_corosync synced 2026-01-16 15:58:44 +00:00

Author	SHA1	Message	Date
Fabio M. Di Nitto	a0a14c68e3	totemip: clean up headers a lot more getifaddrs is always available if there is freeifaddr. all BSD and openindiana have it defined in ifaddr.h. drop a bunch of obsoleted headers. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-30 15:00:27 +02:00
Fabio M. Di Nitto	18929089d1	build: drop MAP_ANONYMOUS check from configure define it only in case it's not there Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-30 15:00:27 +02:00
Fabio M. Di Nitto	5c5db34e56	build: make libstatgrab the de facto default for monitoring service drop duplicate code and remove the last COROSYNC_LINUX ifdefs around Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-30 15:00:27 +02:00
Fabio M. Di Nitto	a1c154e6fa	build: use MADV_NOSYNC only when it's defined so far only FreeBSD defines it. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-30 15:00:27 +02:00
Fabio M. Di Nitto	6098ef2c14	build: make exec/totemip os detection free Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-30 15:00:27 +02:00
Jan Friesse	dbe0e9e382	Log: Use threaded mode for syslog and file log Syslog and file log can block, so it's good idea to use libqb threaded mode to prevent it. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-30 09:46:48 +02:00
Jan Friesse	9f6e6a990b	Use native IPC mechanism Instead of hardcoded SHM, we should use NATIVE, so libqb is able to find out what is best/available mechanism. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-30 09:45:46 +02:00
Fabio M. Di Nitto	427fdd4558	build: fix build on openindiana 151a openindiana toolchain is rather messy. This is the first cut only Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-28 15:14:49 +02:00
Fabio M. Di Nitto	9f7181b533	build: drop more dlopen leftovers from dinosaur era... Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-28 15:14:49 +02:00
Fabio M. Di Nitto	dd4d7f86e6	build: make monitoring optional in corosync exec Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-28 15:14:49 +02:00
Fabio M. Di Nitto	8f96347100	build: respect watchdog conditional when building corosync exec Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-28 15:14:49 +02:00
Fabio M. Di Nitto	76d18f964d	build: use libtool for linking Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-28 15:14:48 +02:00
Tim Beale	6129ce5b59	Remove redundant default-config code We were checking 'hold_timeout == 0' in 3 different places when setting up the default totem config. Signed-off-by: Tim Beale <tlbeale@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-21 14:26:50 +02:00
Tim Beale	77ea036c72	Remove unused structure Nowhere in the corosync codebase references this structure. Signed-off-by: Tim Beale <tlbeale@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-21 14:11:48 +02:00
Jan Friesse	397cc89f01	Make logging of WD and MON service correct MON and WD services are using fsm.h, which calls log function. Such messages were incorrectly logged as SERV (or random service) which made debugging hard. Solution is to add callback parameter to fsm functions and do actual logging there. Handling of failure states is also done in callback now. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-16 14:45:15 +02:00
Jan Friesse	e3cef955bf	IPC: Call lib function only when it's possible send_ok was incorrectly tested as boolean, even it's errno type variable. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-09 15:10:52 +02:00
Jan Friesse	8014b2facf	Close sockets after deleting from poll This will remove (non critical) debug message from QB about polling on closed FD. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-09 15:10:44 +02:00
Jan Friesse	2d10e2bbea	cpg: Check input param name_t length IPC is using buffer of CS_MAX_NAME_LENGTH for name. If user calls function with longer string, such string can be passed to service incomplete. Solution is to not allow string larger than CS_MAX_NAME_LENGTH and return error. Same applies to cpg service. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-09 15:10:35 +02:00
Jan Friesse	6f6988afff	Handle sync and service unload correctly When sync started and service is unloaded in meantime, it can happen that sync will call sync_* functions on unloaded service. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-09 15:10:26 +02:00
Jan Friesse	dfe34d330c	service: remove leftovers from mt corosync Multithreaded corosync used to use many ugly workarounds. One of them is shutdown process, where we had to solve problem with two locks. This was solved by scheduling jobs between service exit_fn call and actual service unload. Sadly this can cause to receive message from other node in that meantime causing corosync to segfault on exit. Because corosync is now single threaded, we don't need such hacks any longer. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-08-09 15:10:16 +02:00
Fabio M. Di Nitto	423e37b4ca	votequorum: change init/clean up to deal with exit races Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-08 09:03:57 +02:00
Fabio M. Di Nitto	50308cb08d	quorumtool: make output more meaningful there is really no point to have a per node view of (vote)quorum since all the info are always there. drop the -n option for status/display nodes and improve the output to provide a full cluster view at any given time. Old format: [root@fedora-master-node2 ~]# corosync-quorumtool -s Quorum information ------------------ Date: Mon Aug 6 10:22:27 2012 Quorum provider: corosync_votequorum Nodes: 2 Ring ID: 8 Quorate: Yes Votequorum information ---------------------- Node ID: 3254954176 Node state: Member Node votes: 1 Qdevice votes: 1 Expected votes: 3 Highest expected: 3 Total votes: 3 Quorum: 2 Flags: Quorate Qdevice Membership information ---------------------- Nodeid Votes Name 3238176960 1 fedora-master-node1.int.fabbione.net 3254954176 1 fedora-master-node2.int.fabbione.net 0 1 QDEVICE (Alive/Voting/NoMasterWins) New format: [root@fedora-master-node1 tools]# ./corosync-quorumtool -s Quorum information ------------------ Date: Mon Aug 6 15:50:03 2012 Quorum provider: corosync_votequorum Nodes: 2 Ring ID: 48 Quorate: Yes Votequorum information ---------------------- Expected votes: 3 Highest expected: 3 Total votes: 3 Quorum: 2 Flags: Quorate Qdevice Membership information ---------------------- Nodeid Votes Qdevice Name 3238176960 1 A,V,MW fedora-master-node1.int.fabbione.net 3254954176 1 NR fedora-master-node2.int.fabbione.net 0 1 QDEVICE Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	6b270c6cd1	votequorum: make the last QDEVICE define name consistent with everything else Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	302545e112	votequorum: add missing return call Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	379b203677	votequorum: make master_wins check stricter Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	9c50f33509	votequorum: add ENTER/LEAVE for consistency Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	2f369e7039	votequorum: delegate qdevice_master_wins setting to qdevice votequorum has no business to decide if master_wins setting is correct or not. only the qdevice can decide and should set the value for votequorum. Logic is: - user requests master_wins from config - corosync starts - qdevice starts - qdevice reads cmap values / register with votequorum - qdevice decides if the node can support master_wins or not and tells votequorum - at this point votequorum can check if an unquorate node is part of the master_wins partition it is the qdevice responsibility to keep that value up to date in votequorum and the value can be changed at runtime. this commit also exchange per node master_wins information to lay down the infrastructure to verify discrepancies in node config for master_wins (coming next on this channel). Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	cc7bfeb462	votequorum: drop votequorum_qdevice_getinfo and collapse data into getinfo it's really pointless to have basically a duplicated API call to transfer one value and one name. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	65a6c29a31	votequorum: external defines should all be prefixed with VOTEQUORUM_ Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	2a37b56c49	votequorum: drop _FLAG_ from defines those are all info flags.. it's redundant and inconsistent Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	3416eacbec	votequorum: fix define name to match reality Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	86dd11b28e	qdevice: implement master_wins partition in previous incarnation of qdisk + cman, master_wins was restricted to 2 node only. In this new version it is possible to use master_wins for any cluster size. Let's assume a 4 node cluster. Each node votes 1, qdevice votes 3. node 1 becomes qdevice master node 2/3/4 no In case of a split (let's assume 2/2): partition 1: {4, 1} partition 2: {1, 1} node 2 in partition 1 would normally be unquorate, leaving effectively only node 1 active. master_wins allows node 2 to recognize to be part of a quorate partition (since node1 is broadcasting that qdevice is voting) and retain quorum. node1 has never lost quorate status since qdevice is voting there. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	aa295be834	votequorum: fix flag check for qdevice votes propagation and cleanup similar code to make it more readable Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	2dae49e54a	votequorum: remove last instance of state and rename it to cast_vote also align naming of vote to cast_vote for info calls Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	3fed1af077	votequorum: several major bug fixes and code cleanup - add a protection check to avoid spurious messages on membership change - greatly simplify processing of nodeinfo, since the only data that we send for qdevice over nodeinfo is the number of votes - fix a flag check to trigger quorum calculation that would leave a cluster unquorate under certain conditions Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	62659dbb21	votequorum: move to the new flag structure simplify different code path as checks are simpler, separate ALIVE and CAST_VOTE Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	c9e207ec92	votequorum: simplify getinfo data and protect against call against quorum node Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	f2b25936e5	votequorum: use REGISTERED flag consistently Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	0bcb4cddcc	votequorum: simplify internal qdevice_getinfo function as data are moving around we can drop lots of special cases Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	43d1439600	votequorum: add qdevice CAST_VOTE status/flag this is a preparation commit for the next changes. right now it is no more than an alias to ALIVE. CAST_VOTE is required to support master/slave feature from qdevice. Effectively a quorum device can be: Not registered / registered (connected to API but nothing else is happening) if registered: Not alive / alive (quorum device is petting the API via poll and timer is running) if alive: Not voting (slave) / voting (master) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:16 +02:00
Fabio M. Di Nitto	987e26f8d1	votequorum: rename NODE_FLAGS_QDEVICE_STATE to NODE_FLAGS_QDEVICE_ALIVE STATE is confusing and overloaded term in votequorum as it's used for nodes and other bits. make the name unique and ALIVE means that the qdevice is heartbeating to votequorum. improve display of the status in tools and tests. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:16 +02:00
Fabio M. Di Nitto	4621a6cd02	votequorum: rename NODE_FLAGS_QDEVICE to NODE_FLAGS_QDEVICE_REGISTERED make the flag name explicit Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:16 +02:00
Jan Friesse	fed7fc23e1	Don't call sync_* funcs for unloaded services When service is unloaded, sync shouldn't call sync_init\|process\|activate and abort functions. It happens very rare, but in process of unloading all services, totem can recreate membership and bad things can happen (service is unloaded, so there may be access to already freed memory, ...) Solution is to fetch services sync handlers in every time when we are building service list instead of using precreated one. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-08-02 09:34:58 +02:00
Jan Friesse	9fb7979370	Introduce SERVICES_COUNT_MAX macro Sync/service was using maximal number of services in either numeric form (magic constant) or inconsistently, this means using SERVICE_HANDLER_MAXIMUM_COUNT which means maximal number of handlers. New macro solves this. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-08-02 09:32:05 +02:00
Jan Friesse	537bf56fcc	cpg: Be more verbose for procjoin message Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-07-30 10:22:16 +02:00
Jan Friesse	04dac3ff5d	Correctly free state string in wd Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-07-12 15:53:04 +02:00
Jan Friesse	e4d75d1ab3	Revert "Free state variable allocated in wd_resource_state_is_ok" This reverts commit `01c63ca17c`.	2012-07-11 17:04:41 +02:00
Jan Friesse	a966506c1e	cpg: Enhance downlist selection algorithm Let's say we have 2 nodes: - node 2 is paused - node 1 create membership (one node) - node 2 is unpaused Result is that node 1 downlist is selected, so it means that from node 2 point of view, node 1 was never down. Patch solves situation by adding additional check for largest previous membership. So current tests are: 1) largest (previous #nodes - #nodes known to have left) 2) (then) largest previous membership 3) (and last as a tie-breaker) node with smallest nodeid Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-06-14 15:15:42 +02:00
Jan Friesse	f3457c5d49	cpg: Print cpg name to debug informations In downlist and joinlist debug output group was printed in nonsense format of integer to pointer to array. Now it's printed by full name. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-06-14 15:15:39 +02:00
Jan Friesse	35446d6bcc	cpg: Process join list after downlists let's say following situation will happen: - we have 3 nodes - on wire messages looks like D1,J1,D2,J2,D3,J3 (D is downlist, J is joinlist) - let's say, D1 and D3 contains node 2 - it means that J2 is applied, but right after that, D1 (or D3) is applied what means, node 2 is again considered down It's solved by collecting joinlists and apply them after downlist, so order is: - apply best matching downlist - apply all joinlists Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-06-14 15:15:35 +02:00
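
The two cpg commits above (a966506c1e and 35446d6bcc) describe an ordering rule and a selection rule that may be easier to follow as code. The following is a minimal Python sketch of those rules as stated in the commit messages, not the corosync implementation (which is in C). The names Downlist, best_downlist and apply_round, the field names, and the ("D", ...) / ("J", ...) message tuples are all hypothetical, and joinlists are simplified here to plain node IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Downlist:
    """One node's view of who left (field names are illustrative)."""
    sender_nodeid: int
    old_members: int                     # size of the sender's previous membership
    left_nodes: frozenset = frozenset()  # node IDs the sender saw leave


def best_downlist(downlists):
    """Pick a downlist using the three tests listed in a966506c1e, in order."""
    return max(
        downlists,
        key=lambda d: (
            d.old_members - len(d.left_nodes),  # 1) previous #nodes - #nodes left
            d.old_members,                      # 2) largest previous membership
            -d.sender_nodeid,                   # 3) tie-breaker: smallest node ID
        ),
    )


def apply_round(members, wire_messages):
    """Apply one round of ("D", Downlist) / ("J", nodeid) messages.

    Joinlists are collected and applied only after the chosen downlist,
    regardless of how the messages were interleaved on the wire.
    """
    downlists = [body for kind, body in wire_messages if kind == "D"]
    joined = {body for kind, body in wire_messages if kind == "J"}
    result = set(members)
    if downlists:
        result -= best_downlist(downlists).left_nodes
    return result | joined
```

If the paused-node scenario from a966506c1e is modelled as Downlist(1, 1) for node 1 and Downlist(2, 2, frozenset({1})) for node 2 (an assumption about what each node would report), the two downlists tie on the first test and the second test selects node 2's downlist.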


1640 Commits