mirror_corosync

mirror of https://git.proxmox.com/git/mirror_corosync synced 2026-01-16 12:53:11 +00:00

Author	SHA1	Message	Date
Fabio M. Di Nitto	2dae49e54a	votequorum: remove last instance of state and rename it to cast_vote also align naming of vote to cast_vote for info calls Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	3fed1af077	votequorum: several major bug fixes and code cleanup - add a protection check to avoid spurious messages on membership change - greatly simplify processing of nodeinfo, since the only data that we send for qdevice over nodeinfo is the number of votes - fix a flag check to trigger quorum calculation that would leave a cluster unquorate under certain conditions Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	62659dbb21	votequorum: move to the new flag structure simplify different code path as checks are simpler, separate ALIVE and CAST_VOTE Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	c9e207ec92	votequorum: simplify getinfo data and protect against call against quorum node Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	f2b25936e5	votequorum: use REGISTERED flag consistently Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	0bcb4cddcc	votequorum: simplify internal qdevice_getinfo function as data are moving around we can drop lots of special cases Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:17 +02:00
Fabio M. Di Nitto	43d1439600	votequorum: add qdevice CAST_VOTE status/flag this is a preparation commit for the next changes. right now it is no more than an alias to ALIVE. CAST_VOTE is required to support master/slave feature from qdevice. Effectively a quorum device can be: Not registered / registered (connected to API but nothing else is happening) if registered: Not alive / alive (quorum device is petting the API via poll and timer is running) if alive: Not voting (slave) / voting (master) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:16 +02:00
Fabio M. Di Nitto	987e26f8d1	votequorum: rename NODE_FLAGS_QDEVICE_STATE to NODE_FLAGS_QDEVICE_ALIVE STATE is confusing and overloaded term in votequorum as it's used for nodes and other bits. make the name unique and ALIVE means that the qdevice is heartbeating to votequorum. improve display of the status in tools and tests. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:16 +02:00
Fabio M. Di Nitto	4621a6cd02	votequorum: rename NODE_FLAGS_QDEVICE to NODE_FLAGS_QDEVICE_REGISTERED make the flag name explicit Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-08-07 11:07:16 +02:00
Jan Friesse	fed7fc23e1	Don't call sync_* funcs for unloaded services When service is unloaded, sync shouldn't call sync_init\|process\|activate and abort functions. It happens very rare, but in process of unloading all services, totem can recreate membership and bad things can happen (service is unloaded, so there may be access to already freed memory, ...) Solution is to fetch services sync handlers in every time when we are building service list instead of using precreated one. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-08-02 09:34:58 +02:00
Jan Friesse	9fb7979370	Introduce SERVICES_COUNT_MAX macro Sync/service was using maximal number of services either in numeric form (magic constant) or inconsistently, this means using SERVICE_HANDLER_MAXIMUM_COUNT which means maximal number of handlers. New macro solves this. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-08-02 09:32:05 +02:00
Jan Friesse	537bf56fcc	cpg: Be more verbose for procjoin message Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-07-30 10:22:16 +02:00
Jan Friesse	04dac3ff5d	Correctly free state string in wd Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-07-12 15:53:04 +02:00
Jan Friesse	e4d75d1ab3	Revert "Free state variable allocated in wd_resource_state_is_ok" This reverts commit `01c63ca17c`.	2012-07-11 17:04:41 +02:00
Jan Friesse	a966506c1e	cpg: Enhance downlist selection algorithm Let's say we have 2 nodes: - node 2 is paused - node 1 create membership (one node) - node 2 is unpaused Result is that node 1 downlist is selected, so it means that from node 2 point of view, node 1 was never down. Patch solves situation by adding additional check for largest previous membership. So current tests are: 1) largest (previous #nodes - #nodes known to have left) 2) (then) largest previous membership 3) (and last as a tie-breaker) node with smallest nodeid Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-06-14 15:15:42 +02:00
Jan Friesse	f3457c5d49	cpg: Print cpg name to debug information In downlist and joinlist debug output group was printed in nonsense format of integer to pointer to array. Now it's printed by full name. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-06-14 15:15:39 +02:00
Jan Friesse	35446d6bcc	cpg: Process join list after downlists let's say following situation will happen: - we have 3 nodes - on wire messages looks like D1,J1,D2,J2,D3,J3 (D is downlist, J is joinlist) - let's say, D1 and D3 contains node 2 - it means that J2 is applied, but right after that, D1 (or D3) is applied what means, node 2 is again considered down It's solved by collecting joinlists and apply them after downlist, so order is: - apply best matching downlist - apply all joinlists Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-06-14 15:15:35 +02:00
Jan Friesse	816d7687b0	cpg: Never choose downlist with localnode Test scenario is as follows: - node 1, node 2 - node 1 is paused - node 2 sees node 1 dead - node 1 unpaused - node 1 and 2 both choose same downlist message which includes node 2 -> node 2 is effectively disconnected Patch includes additional test if left_node is localnode. If so, such downlist is ignored. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-06-14 15:15:32 +02:00
Jerome FLESCH	99faa3b864	When flushing, discard only memb_join messages Patch solves problem when 1 ring out of 2 went up/down quite often. The simplest setup to reproduce bug is following: - 2 VMs, connected by 2 network interfaces - OS: Linux - On one of the VMs, a test program sending some CPG messages (see the script "test_corosync.sh" joined to this mail for example) Here are the Corosync logs we get when we do this setup: Jun 06 16:23:40 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed. Jun 06 16:23:40 corosync [CPG ] chosen downlist: sender r(0) ip(192.168.56.104) r(1) ip(192.168.57.104) ; members(old:1 left:0) Jun 06 16:23:40 corosync [MAIN ] Completed service synchronization, ready to provide service. Jun 06 16:24:37 corosync [TOTEM ] Marking ringid 1 interface 192.168.57.105 FAULTY Jun 06 16:24:38 corosync [TOTEM ] Automatically recovered ring 1 Jun 06 16:25:33 corosync [TOTEM ] Marking ringid 1 interface 192.168.57.105 FAULTY Jun 06 16:25:34 corosync [TOTEM ] Automatically recovered ring 1 Jun 06 16:26:35 corosync [TOTEM ] Marking ringid 1 interface 192.168.57.105 FAULTY Jun 06 16:26:36 corosync [TOTEM ] Automatically recovered ring 1 (...) The second ring goes down about every 2 minutes and automatically back up right after. We spent some times looking for the commit that introduced this bug, and it appears it's due the following one: Corosync 1.3.3 -> 1.3.4: `e27a58d93d` Corosync 1.4.1 -> 1.4.2: `be608c0502` Commit message: Ignore memb_join messages during flush operations I had a look at this commit, and it seems to me it's dropping too many packets: Because of this commit, while totemrrp_recv_flush() is called, Corosync drops memb_join packets, but also ORF tokens. In the end, it seems that sometimes, we drop so many of them that Corosync marks the ring as faulty. To fix that, only memb_join messages are dropped now. Signed-off-by: Jerome FLESCH <jerome.flesch@netasq.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-06-11 10:59:30 +02:00
Jan Friesse	2766e57ce5	Store fdata with timestamp and pid in name This should allow easier handling of various blackbox dumps. Original fdata name is now symlink to latest created dump. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-06-05 12:19:42 +02:00
Jan Friesse	7ce332a713	totemudpu: Bind sending sockets to bindto address Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-05-31 09:28:52 +02:00
Fabio M. Di Nitto	f008cf442c	rename mainconfig to logconfig Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-05-29 09:36:00 +02:00
Fabio M. Di Nitto	b283ef8f12	mainconfig: allow mainconfig logic to be used both internally and externally corosync logging configuration logic is rather complex and in order to make it simpler to reuse (at least within corosync/ tree) we need to be able to use both icmap and cmap. the patch might seem controversial, but it reduces heaps of code around from qdevices (coming next). It might be useful to consider moving this to a common shared library but there aren't enough users yet and a shared lib would force corosync to link with cmap (that we do not want at all costs) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-05-29 09:04:03 +02:00
Angus Salkeld	5831136c87	LOG: make sure the log target is enabled. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-05-29 14:02:42 +10:00
Angus Salkeld	e6b35bdb7a	LOG: handle closing unused logfiles better This fixes a bug where having a second log file will close the previous one. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-05-29 14:02:42 +10:00
Angus Salkeld	e6afc761fe	LOG: be more explicit about the qb file names else we can get messages being put in the wrong subsys. Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-05-29 14:02:42 +10:00
Jan Friesse	2894f33c4f	totemip: Support bind to exact address Logic for binding now works in following way: - Try to find exact match - If not exact match is found, use first found network address This allows set concrete IP even if network settings contains two IPs on same network. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-05-24 14:01:12 +02:00
Jan Friesse	aaa575e091	totemip: insert items in correct order list_add_tail is used instead of list_add so ip addresses are inserted in same order as returned by getifaddrs. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-05-24 14:01:08 +02:00
Jan Friesse	0791f44c41	Include ringid in processor joined log message This should help correlate syslog entries with their blackbox counterparts. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Andrew Beekhof <andrew@beekhof.net>	2012-05-17 14:58:04 +02:00
Fabio M. Di Nitto	f2444effd0	icmap: don't leak memory when changing ro/rw status on a key Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-04-24 09:28:23 +02:00
Fabio M. Di Nitto	1dcb2d43d9	icmap: fix a valgrind errors (pass 1) clean up a lot of allocated blocks at exit. those changes has no runtime effects, but it makes valgrind output a bit more useful by dropping over 700 errors/warnings to skip over every single run. there are still a few icmap related valgrind errors but those need some more complex and timeconsuming investigation. pre patch: ==21844== HEAP SUMMARY: ==21844== in use at exit: 1,229,321 bytes in 1,516 blocks ==21844== total heap usage: 7,191 allocs, 5,675 frees, 3,819,853 bytes allocated ==21844== LEAK SUMMARY: ==21844== definitely lost: 3,617 bytes in 11 blocks ==21844== indirectly lost: 21,960 bytes in 11 blocks ==21844== possibly lost: 1,080,101 bytes in 131 blocks ==21844== still reachable: 123,643 bytes in 1,363 blocks ==21844== suppressed: 0 bytes in 0 blocks ==21844== ERROR SUMMARY: 136 errors from 136 contexts (suppressed: 0 from 0) post patch: ==25793== HEAP SUMMARY: ==25793== in use at exit: 1,185,870 bytes in 808 blocks ==25793== total heap usage: 9,427 allocs, 8,619 frees, 4,156,841 bytes allocated ==25793== LEAK SUMMARY: ==25793== definitely lost: 3,697 bytes in 12 blocks ==25793== indirectly lost: 22,248 bytes in 13 blocks ==25793== possibly lost: 1,079,655 bytes in 113 blocks ==25793== still reachable: 80,270 bytes in 670 blocks ==25793== suppressed: 0 bytes in 0 blocks ==25793== ERROR SUMMARY: 119 errors from 119 contexts (suppressed: 0 from 0) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-04-24 09:28:23 +02:00
Fabio M. Di Nitto	d2872aec70	crypto init: release *_slot resource after init Those are only used at init phase and we can free some memory for the system. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2012-04-20 10:57:16 +02:00
Fabio M. Di Nitto	b34c1e2870	ipcs: allow connections only after all services are ready this fixes a rather annoying race condition at startup where a client connects to corosync "too fast" before the service is ready to operate and client gets some random data during initialization phase. With this fix, we allow connections to ipc only after the main engine is operational and configured (and after the first totem transition). Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2012-04-16 13:39:03 +02:00
Jan Friesse	f89d7b715f	Always allocate totemrrp stats array This prevents segfault when rrp mode is set with only one ring. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-04-10 09:08:42 +02:00
Jan Friesse	92ead6106f	Properly parse uidgid files Full path to key is now tested rather than key name only. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-04-10 09:08:36 +02:00
Fabio M. Di Nitto	cde4468581	totemcrypt: fix build warning (unused variable) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-27 12:06:46 +02:00
Fabio M. Di Nitto	4378915a33	totemcrypto: major code cleanup (no functional or onwire changes) - cleanup include list - reorder code and functions (crypto then hash) - split crypt/decrypt/hash functions - some micro optimizations by dropping a few memcpy - make the code more readable (better var names and buffers mapping) - improve exit paths on error (return codes and free) - store crypto header size instead of recalculating it per packet Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-27 11:43:07 +02:00
Jan Friesse	e925f42165	Make ifaces_get work with dynamic no_rings Commit which added number of addresses to srp_address structure didn't count with totemsrp_ifaces_get where whole structure was copied instead of addresses only. This is now fixed. Also to make API totempg forward compatible, size of interfaces array must be passed to ifaces_get like functions to prevent memory overwrite. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-26 11:54:26 +02:00
Jan Friesse	124ff4339c	Add no_addrs field in srp_addr structure This should allow us future change to dynamic number of rings without breaking wire compatibility. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com>	2012-03-22 14:03:38 +01:00
Jan Friesse	7a0a39b949	Mark few more icmap keys as read only Also most of the key settings are now centralized in one function, so it's easier to audit. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-16 09:37:25 +01:00
Jan Friesse	e57b5b9e6d	crypto: Remove sha224 and add md5 hash SHA224 is not supported on RHEL6 and also it's kind of weird. Instead of that, md5 can now be configured. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-15 17:36:56 +01:00
Jan Friesse	3b7c2f0588	Update crypto_set API Also few leftovers from cfg is removed and version of totempg is increased to 5 to reflect all changes we made Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-15 17:33:53 +01:00
Fabio M. Di Nitto	c75153feb4	crypto: allocate padding in crypto_header while it might seem a waste of space by using 2 extra bytes in the crypto_config_header, it actually gives us the option to grow "unknown at this time" features without hopefully breaking onwire compat Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-15 12:55:11 +01:00
Fabio M. Di Nitto	4a2d503643	crypto: add new hashing methods and fix config defaults add support for sha224/256/384/512 change config defaults to match coroparse and totemconfig Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-15 10:55:32 +01:00
Fabio M. Di Nitto	737de4dbd4	crypto: change network packets and add dynamic crypto header/data The new network packet will look: struct crypto_config_header * that provides info on crypto/hashing hash_block[size based on hashing function] (if hash is selected) salt[SALT_SIZE] (if crypto is selected) ...data... and we kill the concept of crypto_security_header completely since values are now dynamic for hash_block_size. the reason why hash_block needs to be there, is because we do hash salt in case both hashing and crypto are selected. the crypto_config_header is totally transparent to totem and to any underlaying crypto functions. as we go cleaning, also use HASH_BLOCK_SIZE to generate hash_block. the input buffer and output buffer size are dependent on the algo used to hash. we can now determine the real header size and adjust net_mtu properly at startup. This will allow in future to use any algorithm since size is dynamic. some part of the code still needs some polishing to make it more readable (specially the mapping of pointers into the packet is still a bit obscure). Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-14 15:57:01 +01:00
Fabio M. Di Nitto	c3f7d0ef3e	totem: don't send garbage onwire if we fail to crypt Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-14 15:30:40 +01:00
Fabio M. Di Nitto	452800c958	crypto: add crypto config to network data this add 2 bytes at the end of the each packet to propagate config info. in case there is a config mismatch packet must be rejected. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-14 12:32:10 +01:00
Fabio M. Di Nitto	0a6a6bbcfa	crypto: drop secauth and make crypto none work again keep totem.secauth config key for compatibility if the key is NOT set, crypto will default to aes256/sha1 if the key is set to "off", crypto is disabled. this reflects pretty much old behavior keywords totem.crypto_cipher and totem.crypto_hash can override secauth individually. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-03-14 11:28:36 +01:00
Jan Friesse	ab1675f0fe	Parse and use hash and crypto from config file Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-13 17:38:59 +01:00
Jan Friesse	cb97ed186a	Rename totemcrypto Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-03-13 17:38:46 +01:00
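
The downlist selection order described in commits a966506c1e and 816d7687b0 can be sketched roughly as follows. This is an illustrative Python model only, not the corosync C code: the `Downlist` record, its field names, and the `choose_downlist` helper are hypothetical names introduced here. It assumes a downlist that reports the local node as having left is skipped, and that the remaining candidates are ranked by (previous #nodes - #nodes known to have left), then by previous membership size, then by smallest nodeid.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Downlist:
    # Hypothetical record; field names are illustrative, not corosync's structs.
    sender_nodeid: int
    old_members: int       # size of the sender's previous membership
    left_nodes: frozenset  # node ids the sender believes have left


def choose_downlist(downlists: Iterable[Downlist],
                    local_nodeid: int) -> Optional[Downlist]:
    # Assumption from 816d7687b0: ignore any downlist that lists the
    # local node among the nodes that left.
    candidates = [d for d in downlists if local_nodeid not in d.left_nodes]
    if not candidates:
        return None
    # Ranking from a966506c1e, expressed as a sort key for min():
    #   1) largest (previous #nodes - #nodes known to have left)
    #   2) then largest previous membership
    #   3) then smallest nodeid as the tie-breaker
    return min(
        candidates,
        key=lambda d: (
            -(d.old_members - len(d.left_nodes)),
            -d.old_members,
            d.sender_nodeid,
        ),
    )
```

Negating the first two key components lets a single `min()` call express "largest, largest, smallest" without a custom comparator.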


1607 Commits