mirror_corosync

mirror of https://git.proxmox.com/git/mirror_corosync synced 2025-10-23 19:33:39 +00:00

Author	SHA1	Message	Date
Christine Caulfield	fa37b6073c	totemknet: Use new knet_link_set_config() API TC_PRIO_INTERACTIVE is now a link option in knet, so we have to provide it at link config time. This needs the latest knet git to compile as this is an updated API. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2017-06-09 13:28:46 +01:00
Michael Jones	afd97d7884	coroapi: Use size_t for private_data_size Unsigned int and size_t represent two different concepts. Same problem was present in ipc_glue. Signed-off-by: Michael Jones <jonesmz@jonesmz.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-05-29 17:23:37 +02:00
Christine Caulfield	5b1df51aa6	votequorum: Report errors from votequorum_exec_send_reconfigure If votequorum_exec_send_reconfigure() returns an error (ie the packet could not be sent) then we should either return it to the sender (for a library call) or, for an internal call, log it. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-05-26 16:18:33 +02:00
Jan Friesse	95b91e4ae7	main: Display reason why cluster cannot be formed Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2017-05-18 17:15:55 +02:00
Christine Caulfield	571f499e0a	knet: Allow space for encapsulated messages Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2017-05-09 09:05:12 +01:00
Andrew Price	86012ebb45	Main: Call mlockall after fork Man page of mlockall is clear: Memory locks are not inherited by a child created via fork(2) and are automatically removed (unlocked) during an execve(2) or when the process terminates. So calling mlockall before corosync_tty_detach is noop when corosync is executed as a daemon (corosync -f was not affected). This regression is caused by `ed7d054e55` (setprio for logsys/qblog was correct, mlockall was not). Solution is to move corosync_mlockall call on correct place. Signed-off-by: Andrew Price <anprice@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-04-25 14:50:04 +02:00
Bin Liu	c83e6c7ed9	coroparse: Use readdir instead of readdir_r readdir_r is deprecated in glibc 2.24 in favor of readdir (which became thread safe). Also because corosync never calls read_uidgid_files_into_icmap in multiple threads, no problem should appear even with libc where readdir is thread-safe. Signed-off-by: Bin Liu <bliu@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-04-20 08:53:54 +02:00
Bin Liu	725f9039e9	totemknet: Handle logpipe creation failure Signed-off-by: Bin Liu <bliu@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-04-20 08:53:49 +02:00
Bin Liu	be3e166249	wd: Report error when close of wd fails Signed-off-by: Bin Liu <bliu@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-04-20 08:53:45 +02:00
Christine Caulfield	44afff227d	totemknet: Got back to recvmsg() from recvmmsg() The kernel team have recommended us not to use recvmmsg and as it confers no particular speed advantage (especially given the extra memory consumption) I'm going back to single message recvmsg() again. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2017-04-11 13:44:08 +01:00
Bin Liu	0462b5e609	totemconfig: Prefer nodelist over bindnetaddr In a two-node cluster, I've one node configured with open-vswitch: 5: br-fixed: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default inet 192.168.124.88/24 scope global br-fixed inet 192.168.124.87/24 scope global secondary br-fixed inet 192.168.124.83/24 brd 192.168.124.255 scope global secondary tentative br-fixed inet 192.168.124.89/24 scope global secondary br-fixed while I use 192.168.124.83 in node list of corosync.conf with udpu, and the bind_addr is 192.168.124.0. After upgrading corosync on this node, it uses 192.168.124.88 instead of 192.168.124.83. As we can see: corosync-cfgtool -s Printing ring status. Local node ID 1084783704 corosync-quorumtool -s Membership information: Nodeid Votes Name 1084783697 1 d52-54-77-77-01-02 1084783699 1 d52-54-77-77-01-01 (local) while the other node can only see itself: corosync-cfgtool -s Printing ring status. Local node ID 1084783697 RING ID 0 id = 192.168.124.81 status = ring 0 active with no faults corosync-quorumtool -s Membership information: Nodeid Votes Name 1084783697 1 d52-54-77-77-01-02.virtual.cloud.suse.de (local) this patch will check if there are both nodelist and bindnetaddr and if so, display warning and use nodelist information. Signed-off-by: Bin Liu <bliu@suse.com> Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-04-11 11:19:31 +02:00
Christine Caulfield	6076e840f5	knet: Close libknet down cleanly at shutdown By tidily shutting down knet in totemknet_finalize we make sure all the links are cleanly taken down and, more importantly for us, the corosync LEAVE message gets sent so we don't get fenced on a clean exit. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2017-04-11 09:03:26 +01:00
Bin Liu	d2a5e1442e	logconfig: Do not overwrite logger_subsys priority logfile_priority and syslog_priority could be modified by logging.logger_subsys.{logfile_priority|syslog_priority}, which could lead to the following output (which is at notice level): corosync[21419]: [QUORUM] Using quorum provider corosync_votequorum corosync[21419]: [QUORUM] Members[1]: 1084777643 corosync[21419]: [QUORUM] This node is within the primary component and will provide service. corosync[21419]: [QUORUM] Members[3]: 1084777563 1084777584 1084777643 even though syslog_priority is warning. This patch avoids the overwrite. Signed-off-by: Bin Liu <bliu@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-03-10 09:09:42 +01:00
Christine Caulfield	16770a4153	totem: Fix buffer sizes knet needs buffers to be KNET_MAX_PACKET_SIZE or messages will get lost or corrupted. UDPU packets shouldn't be that big so I introduced UDP_FRAME_SIZE_MAX for that transport. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2017-03-02 14:57:39 +00:00
Christine Caulfield	30771a39a8	main: Don't ask libqb to handle segv, it doesn't work segv should be handled by corosync, libqb is not the place to be handling emergency signals. This currently requires the head of libqb git tree to generate a blackbox & coredump in the event of a segfault, but it's better than the write() spin that currently happens. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2017-02-27 15:14:41 +00:00
Jan Friesse	8b6bd86a55	Logsys: Change logsys syslog_priority priority LibQB adds default "*" syslog filter so we have to set syslog_priority as low as possible so filters applied later in _logsys_config_apply_per_file takes effect. Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2017-02-24 16:23:50 +01:00
Fabio M. Di Nitto	36ef2af5a7	knet: improve logging messages by adding knet subsystem Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2017-02-24 09:41:35 +01:00
Fabio M. Di Nitto	19232f6052	knet: Change nodeids to knet_node_id_t for new knet compatibility after some feedback on github, people prefer to have the option to support up to 64K node_id's. libknet added knet_node_id_t to mask the size and type, currently set to uint16_t. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2017-02-14 06:08:45 +01:00
Christine Caulfield	c0f1d576d6	knet: Fix MTU sizes & allow transport config in corosync.conf Corosync layers don't need to know the knet MTU size - this way corosync fragments buffers only when they get larger than the KNET buffer size (64K) and knet fragments below that based on the actual MTU and transport considerations. It is also now possible to configure knet to use UDP or SCTP transports in corosync.conf. This is currently done per-link so if you have more than 1 link you need several interface{} stanzas inside totem{} to make it use other than the default of UDP. if it's useful I might add the option of a global default. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2017-02-13 16:54:30 +00:00
Fabio M. Di Nitto	970549ddfc	knet: PMTUd data_mtu already accounts for IP and knet header overheads provide some more space for data and small (+1% perf boost) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2017-02-11 06:41:38 +01:00
Fabio M. Di Nitto	18fef0ae7f	knet: switch from write to sendto() this provides another 9.6% performance boost on 2 node clusters Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2017-02-11 06:24:12 +01:00
Christine Caulfield	d9df98ceba	knet: Change nodeids to 8 bit for new knet compatibility I've also put an assert in totemknet_member_add() to check for invalid nodeids. Later on we need to fix the rest of the corosync code to only use 8bit nodeids (or force people to use UDPU if they want large nodeids). Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2017-02-03 09:38:32 +00:00
Christine Caulfield	2d478505e5	knet: Fix member_remove to shut down existing links first Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2017-01-16 13:16:15 +00:00
Christine Caulfield	029b8ebad6	knet: Reduce default pong count to 2 for faster startup The default PONG_COUNT of 5 made corosync slow to connect to other nodes. This helps.	2017-01-03 13:30:26 +00:00
Christine Caulfield	950cca886e	totemknet: Make it compile with kronosnet git master Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-12-22 10:25:11 +00:00
Takeshi MIZUTA	4939c75629	Remove redundant header file inclusion Signed-off-by: Takeshi MIZUTA <miz.take4@gmail.com> Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-12-05 09:59:08 +01:00
Bin Liu	819d66ca1c	Totempg: remove duplicate memcpy in mcast_msg func In function mcast_msg of totempg.c, line 923, there is a memcpy call in "else" branch, and also another memcpy out of the "else" branch, while the two calls have the same parameters. It is possible to remove the memcpy in "else" branch. Signed-off-by: Bin Liu <bliu@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-12-05 09:40:55 +01:00
Takeshi MIZUTA	034553c080	man: Modify man-page according to command usage Signed-off-by: Takeshi MIZUTA <miz.take4@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-12-01 16:32:42 +01:00
Takeshi MIZUTA	9c5b39d438	totempg: totempg_groups_join return valid error totempg_groups_join() is called by sync_init(). sync_init() judges that totempg_groups_join() failed if the return code of totempg_groups_join() is -1. Therefore, totempg_groups_join() should return -1 when it fails. Signed-off-by: Takeshi MIZUTA <miz.take4@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-11-23 09:22:21 +01:00
Christine Caulfield	401f483cce	knet: Support reload of link parameters Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-11-17 11:41:54 +00:00
Takeshi MIZUTA	f5dcc4a5f2	list: Unify the list processing with qb_list func Signed-off-by: Takeshi MIZUTA <miz.take4@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-11-15 12:19:13 +01:00
Christine Caulfield	7cec6a131d	knet: Allow configuration of more params knet_pmtud_interval & knet_pong_count Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-11-15 09:32:09 +00:00
Chrissie Caulfield	65219a6300	knet: Don't lose log messages when knet gets busy (#165) Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-11-14 15:01:34 +00:00
Jan Friesse	1f90c31ba7	list: Replace for_each by safe version where need Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2016-10-27 14:56:52 +02:00
Michael Jones	b4c06e52f3	list: Replace uses of list.h with qblist.h Signed-off-by: Michael Jones <jonesmz@jonesmz.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-10-27 14:56:52 +02:00
Christine Caulfield	86de6ce1e6	totem: add totemknet.[ch] it seems git is better at deleting files than adding them Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-10-13 08:46:34 +01:00
Michael Jones	a24d26c46a	cfg: Prevents use of uninitialized buffer Signed-off-by: Michael Jones <jonesmz@jonesmz.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-10-12 16:19:05 +02:00
Christine Caulfield	268cde6ee4	totem: Add Kronosnet transport. This is a big update that removes RRP & MRP from the codebase and makes knet the default transport for corosync. UDP & UDPU are still (currently) supported but are deprecated. Also crypto and multiple interfaces are only supported over knet. To compile this codebase you will need to install libknet from https://github.com/fabbione/kronosnet The corosync.conf(5) man page has been updated with info on the new options. Older config files should still work but many options have changed because of the knet implementation so configs should be checked carefully. In particular any cluster using RRP over UDP or UDPU will not start as RRP is no longer present. If you need multiple interface support then you should be using the knet transport. Knet brings many benefits to the corosync codebase, it provides support for more interfaces than RRP (up to 8), will be more reliable in the event of network outages and allows dynamic reconfiguration of interfaces. It also fixes the ifup/ifdown and 127.0.0.1 binding problems that have plagued corosync/openais from day 1 Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-10-11 10:09:42 +01:00
HideoYamauchi	f1ffe31ce5	coropase: Set a poll_period value for wd monitor Signed-off-by: HideoYamauchi <renayama19661014@ybb.ne.jp> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-10-06 15:48:38 +02:00
Christine Caulfield	c4683be9b0	votequorum: simplify reconfigure message handling As we now have update_node_expected_votes(), we can use that when receiving a new EXPECTED_VOTES value from another node rather than having our own loop. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-09-13 15:55:58 +01:00
Christine Caulfield	bd2e6b5d9d	votequorum: Don't update expected_votes display if value is too high If expected_votes was set via the library but the calculation decides it's too high, then an error is correctly returned but the value is still set in the nodes' expected_votes field and turns up in the corosync-quorumtool display. This patch separates out the quorum calculation from the updating of expected_votes per node to prevent this from happening. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-09-13 14:28:56 +01:00
Ferenc Wágner	cf10a754e9	Fix various typos occured -> occurred parantheses -> parentheses configuraton -> configuration aquire -> acquire retrive -> retrieve prefered -> preferred Signed-off-by: Ferenc Wágner <wferi@niif.hu> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-09-12 09:50:11 +02:00
Jan Friesse	f837f95dfe	Config: Flag config uidgid entries Uidgid entries parsed from configuration files now have the prefix (uidgid.config.) so they are distinguishable from dynamically added entries. Entries added from the config file are pruned on reload if they no longer exist in the config file (dynamic ones stay unaffected). Also the whole uidgid.config. prefix is made read only. This makes PCMK work again after configuration reload is called. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2016-08-04 16:13:48 +02:00
HideoYamauchi	71c9035c27	Low: totemsrp: Addition of the log. Signed-off-by: HideoYamauchi <renayama19661014@ybb.ne.jp> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-08-01 10:11:45 +02:00
Jan Friesse	1925074909	Fix few bugs found by coverity Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2016-06-28 13:58:43 +02:00
Christine Caulfield	0665aca9e1	quorum: revert patch that adds qdevice (node 0) to quorum callback Revert patch 9f54f0a1fad7dad42c55562a50dfb9d773e6a660 as it causes more troubles than it solves. Code that uses the quorum nodelist to get a list of actual nodes in the cluster for communication break using this as well as the display from corosync-quorumtool Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-06-28 13:58:43 +02:00
Christine Caulfield	c9c6d9e30f	quorum: Return qdevice nodeid in the quorum callbacks (if active). Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-06-28 13:58:41 +02:00
Christine Caulfield	e41b256c67	votequorum: Allow wait_for_all with qdevice Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-06-28 13:58:39 +02:00
Christine Caulfield	98548e1880	qnetd: lms: Fix search for node/ring_id check We were looking for us in other node lists, rather than others in our nodelist. Also, remove debug print in votequorum.c Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-06-28 13:58:39 +02:00
Christine Caulfield	3a5d51fca7	votequorum: Fix up quorum/nodelist callbacks This patch tidies the two state change callbacks and explains them in the man page: The difference between votequorum_nodelist_notification_t and votequorum_quorum_notification_t is subtle but important. The 'nodelist' callback is sent at the start of a cluster state transition and contains the new ring_id and only the list of nodes that are included in the sync state - ie only active nodes. No quorum information is included in this callback because it is not available at that time. The 'quorum' callback is sent after the cluster state transition has completed and does contain quorum information. In addition, the nodelist contains a list of all nodes known to votequorum (whether up or down) and their state as well as information about the quorum device attached (if any). quorum callbacks will not be sent for qdevice up and down events unless they affect quorum. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-06-28 13:58:39 +02:00
Christine Caulfield	cf0028c86e	votequorum: split callbacks into nodelist and quorum This split is needed for qdevice, so that it gets the ring_id and nodelist as part of the sync process and not afterwards - when quorum has been calculated. As this is an unsupported API I'm not too worried about breaking existing code - all the clients I know of are using the quorum API anyway as they should be. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-06-28 13:58:38 +02:00
Jan Friesse	44df76a7ee	config: get_cluster_mcast_addr error is not fatal Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2016-06-28 13:57:14 +02:00
Ferenc Wágner	c76ee39f61	Fix typo: Diabled -> disabled Signed-off-by: Ferenc Wágner <wferi@niif.hu> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-06-22 14:26:48 +02:00
Ferenc Wágner	b1de8efd15	Fix typo: aquire -> acquire Signed-off-by: Ferenc Wágner <wferi@niif.hu> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-06-22 14:26:28 +02:00
Ferenc Wágner	841f48e253	Fix typo: Uknown -> Unknown Signed-off-by: Ferenc Wágner <wferi@niif.hu> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-06-22 14:26:22 +02:00
Christine Caulfield	f2a1fcc5bf	logconfig: Fix logging reload disabling logfiles In my previous logconfig patch, adding a subsys so the logging stanzas could disable logging to a file, because the subsys closed the file used by the main logging. This patch only applies defaults to higher-level logging and non-deprecated keys. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-05-27 17:36:30 +02:00
yuusuke	2ef086bd9b	wd: Warn if values are out of range Signed-off-by: yuusuke <yusk.iida@gmail.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2016-05-27 10:38:30 +02:00
yuusuke	39cd6b3d1d	parser: WD Read type correctly from corosync.conf Signed-off-by: yuusuke <yusk.iida@gmail.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2016-05-27 10:36:24 +02:00
Christine Caulfield	571b1621e9	Add some more RO keys Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-05-24 12:33:55 +02:00
Christine Caulfield	125848d80a	Reapply config defaults corosync.conf reload There were several places where defaults were not restored if the keys were removed from corosync.conf and the file reloaded. This patch adds those back so that reloading corosync.conf has the expected effect when keys are deleted. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-05-24 12:33:35 +02:00
Jan Friesse	b93d75abc4	schedwrk: Cleanup and make it work on PPC BE Schedwrk is passing hdb handle (64-bit) to totempg_callback_token_create as a context. Context is defined to be pointer, so there is conversion function which stores 64-bit hdb_handle into pointer. Potentially, pointer can be 32-bit. This means, check part of hdb is discarded (and have to get special no_check value in schedwrk_do) later. This works quite well on 32-bit Little-Endian system. Sadly on Big-Endian system, check partition of hdb is stored instead of value. Result is error of hdb_handle_get call. Proposed solution is to pass handle pointer to totempg_callback_token_create as context. This means full hdb (check + value) can be used in schedwrk_do (easier detection of memory corruption). Main reason for this patch is to remove usage of pointer as integer value. Small drawback of given solution is that handle pointer must be memory allocated on heap or static memory, making API more bug-prone. Current usage of schedwrk API across corosync always use memory in .text section (safe), so it's not a problem. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2016-05-17 16:29:25 +02:00
Valentin Vidic	8d8d4a936a	wd: make watchdog device configurable Add configuration option resources.watchdog_device allowing runtime selection of watchdog device. Useful for newer servers having more than one watchdog available (IPMI and iTCO). Special value "off" disables watchdog in configuration rather than just using build options. Useful when watchdog device is needed elsewhere (SBD cluster stonith service). Signed-off-by: Valentin Vidic <Valentin.Vidic@CARNet.hr> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-05-03 15:47:15 +02:00
Christine Caulfield	1e2de52ef1	logging: Use our own version of basename basename() function has some potentially odd issues on other platforms. So, to be safe, here's an internal version. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-05-03 15:31:29 +02:00
Christine Caulfield	d245831d65	logsys: fix TOTEM logging when corosync built out of tree If corosync is built out-of-tree (passing --srcdir to configure) then TOTEM logging doesn't print anything. This is caused by the source filenames (from __FILE__ at compilation time) having the configured path in them - in this example ../corosync/exec/totemudp.c etc. The list of totem source filenames passed to libqb logging facility only has the basenames so the filenames never match up as libqb does an exact string match. I looked into fixing this in libqb but it causes a regression. We can't simply basename() __FILE__ at the point of calling log_printf as it's common also to use __FILE__ to generate the logging source, and using basename() on both removes the distinction between similarly named files from different directories which could be a requirement. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2016-04-26 09:49:53 +01:00
Christine Caulfield	aab55a004b	parser: Make config file parser more hierarchy pass 'state' down the stack so that the state of the hierarchy doesn't get lost when there are unexpected items in the config hierarchy. Don't bother setting 'state' on SECTION_END as there's no point now we're going back up the stack. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-04-22 13:01:04 +02:00
Jan Friesse	60565b7da7	totemconfig: Explicitly pass IP version If resolver was set to prefer IPv6 (almost always) and interface section was not defined (almost all config files created by pcs), IP version was set to mcast_addr.family. Because mcast_addr.family was unset (reset to zero), IPv6 address was returned causing failure in totemsrp. Solution is to pass correct IP version stored in totem_config->ip_version. Patch also simplifies get_cluster_mcast_addr. It was using mix of explicitly passed IP version and bindnet IP version. Also return value of get_cluster_mcast_addr is now properly checked. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2016-04-07 14:45:05 +02:00
Jan Friesse	600fb4084a	totempg: Fix memory leak Previously there were two free lists. One for operational and one for transitional state. Because every node starts in transitional state and always ends in the operational state, assembly was always put to normal state free list and never in transitional free list, so new assembly structure was always allocated after new node connected. Solution is to have only one free list. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Steven Dake <stdake@cisco.com>	2016-02-10 15:57:20 +01:00
Richard B Winters	028c473886	Fix spelling error in binary corosync - Changed paramater to parameter in exec/logcconfig.c Change-Id: I8a24b0ef5c6621dc6c19d7decbdfe7a255afd10d Signed-off-by: Richard B Winters <rik@mmogp.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-01-27 18:29:25 +01:00
Ruben Kerkhof	37f092bbed	totemsrp: Fix clang warning (tautological compare) gsfrom is always >= 0 Signed-off-by: Ruben Kerkhof <ruben@rubenkerkhof.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-01-04 17:28:14 +01:00
Ruben Kerkhof	da3288217c	Remove a few unused variables and functions Signed-off-by: Ruben Kerkhof <ruben@rubenkerkhof.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2016-01-04 17:11:06 +01:00
Ruben Kerkhof	479ec4dbf0	Check for fdatasync If we don't have it, fall back to fsync Fixes the build on FreeBSD Signed-off-by: Ruben Kerkhof <ruben@rubenkerkhof.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-12-16 16:43:27 +01:00
Hideo Yamauchi	5ab922701a	quorum: Display node id as unsigned int. Signed-off-by: Hideo Yamauchi <renayama19661014@ybb.ne.jp> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-11-27 15:56:54 +01:00
Christine Caulfield	165561df9b	totemudp: Move udp bind() so that multicast works with IPv6 It seems that the IPv6 multicast parameters only take effect when bind() is called, so I've moved the mcast recv socket bind() to the bottom of totemudp_build_sockets_ip(). Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-11-16 16:00:36 +00:00
Christine Caulfield	a71ec5d95d	votequorum: Don't send multiple callbacks when nodes join This patch aligns the votequorum callbacks so that they are the same as the quorum ones. Previously it was quite common for votequorum to send one callback for every node in the cluster when a single new node joined (because it sent one for every nodeinfo message it received). This new system makes much more sense in itself and being consistent with the internal quorum is also an advantage! Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-10-22 11:45:26 +01:00
Ferenc Wágner	73910bd66e	totmesrp: Fix typo in log message Signed-off-by: Ferenc Wágner <wferi@niif.hu> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-08-26 09:26:26 +02:00
Christine Caulfield	d64ee7b531	wd: fix setting of watchdog timeouts Fix setting of initial watchdog timeout, and also changing of timeout. Remove redundant starting of timer in exec_init_fn Signed-off-by: Kazunori INOUE <kazunori.inoue3@gmail.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2015-07-14 10:04:06 +01:00
Jason HU	15b2e94cca	CFG: Prevent CFG orignating messages during SYNC During SYNC, corosync-cfgtool -R/-H commands can pass through IPC then send totem messages. This may corrupt assembly_list_inuse/assembly_list_free if those messages are received after SYNC is done. The solution is marking related CFG APIs as CS_LIB_FLOW_CONTROL_REQUIRED. Signed-off-by: Jason HU <huzhijiang@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-07-02 16:49:38 +02:00
Christine Caulfield	b9f5c290b7	votequorum: Fix auto_tie_breaker behaviour in odd-sized clusters auto_tie_breaker can behave incorrectly in the case of a cluster with an odd number of nodes. It's possible for a partition to have quorum while the other side has the ATB node, and both will continue working. (Of course in a properly configured cluster one side will be fenced but that becomes an indeterminate race .. just what ATB is supposed to avoid). This patch prevents ATB from running in a partition if the 'other' partition might have quorum, and also mandates the use of wait_for_all in clusters with an odd number of nodes so that a quorate partition cannot start services or fence an existing partition with the tie breaker node. Signed-Off-By: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-06-18 09:57:59 +01:00
Christine Caulfield	ab8942f626	totemsrp: Improve logging of left/down nodes This patch from Hideo Yamauchi improves the logging of whether nodes leave the cluster cleanly or uncleanly, making it easier to determine if a node was shut down by the operator. There is also the possibility that a LEAVE message could get missed (due to the node being in flush state) so this can also make that clearer. The modifications are as follows. Change 1) I added the list which maintained LEAVE node to totemsrp. Change 2) I added registration, a search, the handling of to clear LEAVE node. Change 3) I added the output to log. Change 4) I changed an output level of the log. Signed-off-by: Hideo Yamauchi <renayama19661014@ybb.ne.jp> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-06-12 16:16:45 +01:00
Christine Caulfield	53f67a2a79	totem: Log a message if JOIN or LEAVE message is ignored As per recent email thread, this patch adds a log message if a JOIN or LEAVE message is discarded while corosync is flushing the receive queue. While ignoring a JOIN message is harmless (it will be resent), ignoring a LEAVE message can cause a longer state transition as it is treated as a node crashing rather than leaving gracefully, so the system admin might be confused as to the cause. Unfortunately, we can't (at the totemudp level) distinguish between JOIN or LEAVE messages without a lot more protocol-specific code creeping in the lower layer so the message is left ambiguous. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2015-04-17 15:49:53 +01:00
Christine Caulfield	997074cc3e	totemconfig: Check for duplicate nodeids Having duplicate nodeids in corosync.conf can play havoc with a cluster, so (as suggested by someone on this list) here is some code to check that all nodeids are unique. Even if a nodeid is not specified it will check to be sure that the ID generated from the IP address (ipv4 only) does not clash with one that is provided. It logs all non-unique nodeids to syslog, but only the last is reported on the command-line to the user which should be enough to get them to check further. At startup this will cause corosync to fail to start. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2015-04-10 14:22:07 +01:00
Christine Caulfield	82526d2fe9	quorum: don't allow quorum_trackstart to be called twice If quorum_trackstart() or votequorum_trackstart() are called twice with CS_TRACK_CHANGES then the client gets added twice to the notifications list effectively corrupting it. Users have reported segfaults in corosync when they did this (by mistake!). As there's already a tracking_enabled flag in the private-data, we check that before adding to the list again and return an error if the process is already registered. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-03-16 11:37:52 +00:00
Christine Caulfield	8cc8e51363	cpg: Add support for messages larger than 1Mb If a cpg client sends a message larger than 1Mb (actually slightly less to allow for internal buffers) cpg will now fragment that into several corosync messages before sending it around the ring. cpg_mcast_joined() can now return CS_ERR_INTERRUPT which means that the cpg membership was disrupted during the send operation and the message needs to be resent. The new API call cpg_max_atomic_msgsize_get() returns the maximum size of a message that will not be fragmented internally. New test program cpghum was written to stress test this functionality, it checks message integrity and order of receipt. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-03-05 16:45:15 +00:00
Andrey N. Groshev	5d9acc5604	totemsrp: Format member list log as unsigned int Signed-off-by: Andrey N. Groshev <greenx@yandex.ru> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-03-05 16:34:07 +01:00
Christine Caulfield	c832ade034	Don't allow both two_node and auto_tie_breaker in corosync.conf The two_node and auto_tie_breaker options are incompatible as they specify conflicting methods of determining the quorate half of a cluster partition. This patch detects this error in corosync.conf, issues a message and disables two_node if auto_tie_breaker is present. Signed-Off-By: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-03-02 15:50:21 +00:00
Christine Caulfield	314a01c98e	Votequorum: Fix auto_tie_breaker default The default for auto_tie_breaker should be 'lowest' - which is what it was before the extended ATB functionality of auto_tie_breaker_node was added, and what the documentation states. However this was broken so that if auto_tie_breaker_node was not specified then auto_tie_breaker itself was ignored. This patch fixes that. It also fixes a typo in a comment. Signed-Off-By: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2015-03-02 15:48:01 +00:00
Jan Friesse	d77cec24d0	Handle adding and removing UDPU members atomically When the config file is reloaded with a UDPU member removed, the internal icmap index of nodelist.node can change. This can result in a node being removed and then added back. Combined with UDPU alive filtering (where a member is by default considered not a member), this makes corosync stop sending messages to such members, resulting in creation of a new membership. Solution is to properly test which members were really deleted and added (instead of relying on the internal and dynamic naming of icmap hash table keys). Also, truly dynamic addition and removal of a node (via cmap) is now handled by the same function, so totem_config->interfaces is now updated properly. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2015-01-21 16:37:26 +01:00
Jan Friesse	252b38ab8a	corosync_ring_id_store: Use safer permissions corosync_ring_id_store should use same (safer) permissions as corosync_ring_id_create_or_load for (eventually) newly created ringid file. Credit to Sjerek for finding this problem. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2015-01-20 11:21:05 +01:00
Jason	4ee84c51fa	totem: Ignore duplicated commit tokens in recovery In active rrp mode, commit tokens are treated as mcast data messages, so rrp delivers them directly to the srp layer via active_mcast_recv(). This results in duplicated commit tokens being received by srp from different heartbeat links. If a node is in recovery state and has already sent out the initial orf token, those duplicated commit tokens will cause message_handler_memb_commit_token() to send the initial orf token again! This is wrong because it resets the orf token content in instance->orf_token_retransmit, which breaks the token retransmission state. Furthermore, sending those initial orf tokens again and again may lead active_token_recv() to drop some subsequent orf tokens. That is OK for rrp because srp will do token retransmission, but as said above, the srp retransmission state has already been broken, so finally we meet a "token lost in recovery state" condition caused by software. If the token timeout value is large, it will then take a long time to create a new ring. This can be reproduced by having two nodes set to active rrp mode, with two heartbeat links. Then, with one node always on, let the other one do stop/start again and again. It has a low probability of reproducing. In theory, I think, the more heartbeat links are used, the more easily it can be reproduced. This problem can be resolved by letting message_handler_memb_commit_token() ignore duplicated commit tokens in recovery state if the node (the ring representative) has already sent out the initial orf token. Unlike the previous take, this version does not depend on stored token data but uses originated_orf_token in totemsrp_instance to remember whether the initial orf token has already been originated for the current membership. Signed-off-by: Jason <huzhijiang@gmail.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2015-01-15 17:33:04 +01:00
Jan Friesse	e0ac861efd	Log auto-recovery of ring only once Make sure to log auto-recovery of ring only once. Every MESSAGE_TYPE_RING_TEST_ACTIVATE receive is logged, but with lower priority and more detailed information. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2015-01-14 18:13:29 +01:00
Jan Friesse	177ef0e524	Set RR priority by default Experience with larger production clusters showed that setting RR priority for corosync is a viable way to prevent random fencing, ... Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2015-01-05 15:01:49 +01:00
Jason	8f284b26b3	Reset timer_problem_decrementer on fault After a heartbeat link became FAULTY and was automatically re-enabled, active_instance->timer_problem_decrementer was not reset to zero. So in the next timer_function_active_token_expired() round, active_timer_problem_decrementer_start() will not be called. As a result, the active_instance->counter_problems of this link can no longer be decreased, causing rrp to lose the ability to tolerate network fluctuation. This problem can be reproduced by the following sequence: 1) Set RRP in active mode, configure at least 2 heartbeat links. 2) Unplug one link until corosync-cfgtool -s shows it is FAULTY. 3) Re-plug this link; corosync-cfgtool -s then shows it is active with no faults. 4) Unplug this link again but quickly re-plug it before it becomes FAULTY. 5) Finally, you can see corosync-cfgtool -s shows it is in the "Incrementing problem counter" state even though it is currently physically healthy. It can be solved by resetting timer_problem_decrementer to zero in active_timer_problem_decrementer_cancel(). Signed-off-by: Jason <huzhijiang@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-12-08 16:26:28 +01:00
Jan Friesse	6449bea835	config: Ensure mcast address/port differs for rrp When using multiple interfaces, it's necessary to use different multicast address/port pair for each interface to make rrp work correctly. This is now checked in parser. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-11-24 11:55:37 +01:00
Jan Friesse	70bd35fc06	config: Process broadcast option consistently Broadcast option is global but in config set in interface section. When more interfaces are defined, only broadcast from last section was used. Solution is to use broadcast whenever at least one interface use broadcast. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-11-24 11:55:37 +01:00
Jan Friesse	6c028d4d9c	config: Make sure user doesn't mix IPv6 and IPv4 Checking code was there, sadly not correct, so it was possible to enter one bindnet addr as IPv4 and second as IPv6. Fix is trivial. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-11-24 11:55:37 +01:00
Jan Friesse	bb52fc2774	Store configuration values used by totem to cmap Some totem configuration values (like token, consensus, ...) are either computed or a default value is used. It's hard to find out what value is really used. Solution is to store values in cmap. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-10-13 11:59:06 +02:00
Jan Friesse	03f95ddaa1	Adjust MTU for IPv6 correctly The IPv6 header is 20 bytes larger than the IPv4 header. This fact was not taken into account, so IPv6 packets were larger than the MTU, resulting in fragmentation. Solution is to subtract the correct IP header size. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-10-01 14:20:21 +02:00
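The header arithmetic behind this fix can be sketched as follows. This is only an illustration and not corosync's code: the function name `max_udp_payload` is hypothetical, and the sketch ignores any totem or crypto headers that corosync itself adds. The header sizes are the minimum IPv4 header (20 bytes), the fixed IPv6 header (40 bytes), and the UDP header (8 bytes).

```python
IPV4_HEADER_BYTES = 20  # minimum IPv4 header, no options
IPV6_HEADER_BYTES = 40  # fixed IPv6 header, no extension headers
UDP_HEADER_BYTES = 8


def max_udp_payload(mtu: int, ipv6: bool) -> int:
    """Largest UDP payload that fits in one unfragmented packet.

    Hypothetical helper for illustration; the key point is that the
    IP header size subtracted must match the address family in use.
    """
    ip_header = IPV6_HEADER_BYTES if ipv6 else IPV4_HEADER_BYTES
    return mtu - ip_header - UDP_HEADER_BYTES
```

With an MTU of 1500, subtracting the IPv4 header size for an IPv6 packet would overshoot the real limit by 20 bytes, which is the kind of mismatch that leads to fragmentation.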
Fabio M. Di Nitto	239e239782	[crypto] fix crypto block rounding/padding calculation libnss is "weird" in this respect as some block sizes are hardcoded, others need to be determined dynamically. For AES we need to use the values we know since GetBlockSize would return errors, for 3des (that hopefully nobody is using) the value returned by GetBlockSize is 8, but let's use the call into libnss to avoid possible conflicts with distro patching or older versions. Now, given the correct block size, the old calculation simply added block size to the hdr_size. This is not sufficient. We use _PAD encryption methods and we need to take that into account. _PAD is calculated given the current input buf len and rounded up to block size boundary, then block_size is added. Ideally we would do that on a per packet base but current transport infrastructure doesn't allow it yet. So round up the hdr_size to double the block_size reported by the cipher. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-09-06 07:11:56 +02:00
Jan Friesse	2429481b96	totemudpu: Send msgs to all members occasionally To follow spec it's needed to send messages to all nodes (not only active members) from time to time to detect merge. This is needed in situations when totemsrp merge timer isn't running (because there is enough messages sent by processors) to detect merge. Example scenario: - 3 nodes, all of them running cpgverify - One node is isolated (iptables for example) - Node is un-isolated Without this commit, node will not merge as long as the cpgverify is running. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-08-26 15:36:07 +02:00
Jan Friesse	71f1b99649	totemudpu: Implement member_set_active Member active is used for sending "multicast" messages only to members of the ring. This reduces network load if some nodes are intentionally down. Only regular multicast message load is reduced (messages sent by totemudpu_mcast_noflush_send), because special messages (like hold cancel, join message, ...) still have to be sent to all members to ensure correct behavior. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-08-26 15:36:05 +02:00
Jan Friesse	371a99e961	totemrrp: Implement _membership_changed All _membership_changed calls totemnet_member_set_active passing 1 as active parameter for joined nodes and 0 for left nodes. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-08-26 15:36:02 +02:00
Jan Friesse	4c717942cf	totemnet: Add totemnet_member_set_active totemnet_member_set_active together with transport specific member_set_active makes possible for totemnet (and more interestingly transport) to be informed about membership changes. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-08-26 15:35:59 +02:00
Jan Friesse	acb55cdb03	totem: Inform RRP about membership changes Services are informed about membership changes, but if same information is needed inside totemrrp or totemnet, it's impossible to gather this information. Patch makes this possible for now only for RRP with empty callbacks. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-08-26 15:35:56 +02:00
Christine Caulfield	02f58aec9c	YKD: Fix loading of YKD quorum module Although YKD is currently unsupported, untested and deprecated, it's handy for testing things in the quorum module. This patch allows YKD to actually load without an error. It does not fix anything else in the service! Also remove vsftype and its reference to YKD being the preferred and default provider from the corosync.conf man page, as that hasn't been true for a considerable time. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-08-18 09:33:59 +01:00
Christine Caulfield	cbf753405b	votequorum: Add cmap key to reset wait_for_all It's possible in a two_node cluster (and others but it's more likely with just two) that a node could be booted up after downtime or failure and the other node is not available for some reason. In this case it would not be allowed to proceed because wait_for_all is enforced. This patch provides a cmap key to clear this flag in the desperate situation where that becomes necessary. It should only be used with extreme caution and will be wrapped up in pcs which should also check that fencing has been run. Signed-Off-By: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-08-12 16:02:46 +01:00
Jason HU	f135b68096	Cancel token holding while in retransmition When there is no other activity on the ring but only retransmission, and the token is in hold mode, the retransmission becomes slow. Moreover, if the retransmission always fails but token rotation works well, then it takes quite a long time (fail_to_recv_const * token_hold = 2500 * 180ms = 450sec) for the retransmit requester to meet the "FAILED TO RECEIVE" condition and re-construct a new ring. This problem can be solved by checking whether retransmits are present before going into hold. If a node is the retransmit requester or the resender, it sets my_token_held to 0 to speed up retransmission and omits further unnecessary sending of the token_hold_cancel signal. Signed-off-by: Jason HU <huzhijiang@gmail.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-08-12 09:28:04 +02:00
Jan Friesse	17488909d4	votequorum: Make qdev timeout in sync configurable Configuration option quorum.device.sync_timeout is available for setting qdevice poll timeout for synchronization phase. Default value is 30 sec. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-08-05 17:22:52 +02:00
Jan Friesse	b4c9934635	votequorum: Block sync until qdevice poll If qdevice is registered and alive, corosync waits in sync phase until timeout expires or qdevice votes with correct nodeid parameter. This gives qdevice time to decide whether to vote or not, undisturbed and without time hazard. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-08-05 17:22:47 +02:00
Jan Friesse	7cad804629	ipc: Process votequorum messages during sync This is needed for qdevice to be able to process messages during synchronization phase. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-08-05 17:22:44 +02:00
Jan Friesse	b8902464d1	votequorum: Add ring id to poll call If votequorum service receives incorrect (not current) ringid, call is ignored and CS_ERR_MESSAGE_ERROR is returned. This and previous commits makes incompatible changes in votequorum API/ABI, so library version is increased. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-08-05 17:22:41 +02:00
Jan Friesse	5f6f68805c	votequorum: Return current ring id in callback Returning ring id will be used in poll function. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-08-05 17:22:37 +02:00
Christine Caulfield	88dbb9f722	totemconfig: Make sure join timeout is less than consensus The thesis contains this paragraph: "The Join timeout is shorter than the Consensus timeout and is used to increase the probability that Join messages from all currently working processors are received during a single round of consensus." Empirically I can confirm that making join greater than consensus can cause havoc with a cluster, so I think we should enforce this. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-07-25 08:24:02 +01:00
Christine Caulfield	3b8365e806	config: Fix typos Fix several places where 'then' is used instead of 'than' in error messages and a comment. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-07-24 10:27:45 +01:00
Jan Friesse	63bf09776f	totemconfig: refactor nodelist_to_interface func Move finding of bindaddr in nodelist to generally usable function totem_config_find_local_addr_in_nodelist and refactor config_convert_nodelist_to_interface function to use it. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2014-07-22 14:59:31 +02:00
Jan Friesse	10c80f454e	totemconfig: totem_config_get_ip_version Add totem_config_get_ip_version to get user configured ip version. Make totem_config_read use this newly introduced function. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2014-07-22 14:59:27 +02:00
Jan Friesse	dc35bfae62	totemconfig: Free ifaddrs list Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2014-07-22 14:59:20 +02:00
Fabio M. Di Nitto	84b9e5989a	be consistent in using CPPFLAGS vs CFLAGS Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-07-21 08:47:21 +02:00
Vladislav Bogdanov	e3ffd4fedc	Implement config file testing mode Signed-off-by: Vladislav Bogdanov <bubble@hoster-ok.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-07-16 16:10:32 +02:00
Jan Friesse	dfaca4b10a	Fix compiler warning introduced by previous patch QB loop signal handler prototype differs from signal(2) prototype. Solution is to create wrapper functions. Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2014-07-09 15:57:35 +02:00
zouyu	384760cb67	Handle SIGSEGV and SIGABRT signals SIGSEGV and SIGABRT signals are now correctly handled (blackbox is dumped and logsys is finalized). Signed-off-by: zouyu <hopkings2005@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-07-03 15:13:48 +02:00
zouyu	cc80c8567d	fix memory leak produced by 'corosync -v' Signed-off-by: zouyu <hopkings2005@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-07-03 14:54:05 +02:00
Jan Friesse	72cf15af27	votequorum: Do not process events during reload During reload, local_node_pos is deleted and its reinstatement is handled in totemconfig after the reload is finished. votequorum handles these events and tries to reload its configuration. This led to logging of slightly scary messages (even though nothing bad is happening, because after local_node_pos is reinstated everything is back to normal). Solution is to stop processing events during reload. Sadly, simple tracking of config.reload_in_progress doesn't work because the LibQB event triggering order is undefined, so the votequorum reload handler can be called before totemconfig (and before local_node_pos is reinstated). So a new config.totemconfig_reload_in_progress key is defined with very similar semantics to config.reload_in_progress but set inside the totem_reload_notify function. Votequorum then uses this new key. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-06-27 11:40:21 +02:00
Jan Friesse	c8e3f14fdb	Make config.reload_in_progress key read only It's not very good idea to allow user apps changing internal key reload_in_progress. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-06-27 11:40:18 +02:00
Jan Friesse	4e9716ed30	coroparse: More strict numbers parsing Previous safe_atoi didn't check the range of input values, so if for example the user used -1 as token timeout, it was converted to UINT32_MAX without letting the user know. Another safe_atoi problem was its use of strtol. This works pretty well on 64-bit systems, where a long integer is usually 64 bits long; sadly, on 32-bit systems it is usually 32 bits long. And because strtol returns a signed integer, it was not possible to enter a 32-bit value with the highest bit set. Solution is to use strtoll, whose return type is guaranteed to be at least 64 bits long, and check the value range. The error message now also contains information about the expected value range. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-06-12 14:49:00 +02:00
Jan Friesse	da46ecfc30	Move ringid store and load from totem library Functions for storing and loading ring id were in the totem library. This raises the problem of what to do when it's impossible to load or store the ring id. Easy solution seemed to be assert, but sadly this makes it hard for the user to find out what happened (because corosync was just aborted and logsys didn't flush). Solution is to move these functions to main.c, where it is much easier to handle errors. This also makes libtotem free of any file system operations. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-06-02 14:54:57 +02:00
Jan Friesse	d310b251c3	Introduce get_run_dir function Run dir (LOCALSTATEDIR/lib/corosync) was hardcoded thru whole codebase. Totemsrp was trying to create and chdir into it, but also takes into account environment variable COROSYNC_RUN_DIR creating inconsistency. get_run_dir correctly returns COROSYNC_RUN_DIR (when set) or LOCALSTATEDIR/lib/corosync. This is now used by all functions instead of hardcoded string. All occurrences of mkdir/chdir are removed from totemsrp and chdir is now called in main function. Mkdir call is completely removed, because it was not used anyway (check in main.c was called before totemsrp init, so mkdir was never called) and also make install and/or package system should take care of creating this directory with correct permissions/context. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-06-02 14:53:18 +02:00
Jan Friesse	8f13a98320	logsys: Log warning if flightrecorder init fails Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-06-02 14:36:10 +02:00
Jan Friesse	19c5b63ff5	logsys: Log error if blackbox cannot be created Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-06-02 14:36:08 +02:00
Jan Friesse	e905f92bf5	totemiba: Fix incorrect failed log message rdma_join_multicast failed ... message parameters was swapped. Also information about multicast join is now logged as notice. Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2014-05-15 15:28:51 +02:00
Yevheniy Demchenko	4d6a18d8a5	totemiba: Add multicast recovery Totemiba wasn't able to survive SubnetManager handover or restart. If SM was migrated to another node, corosync logged "multicast error" and lost connectivity. This commit should solve this situation. Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-05-14 14:51:07 +02:00
hfu	d0dc9ae93c	Indent: Remove newline before else branch start Signed-off-by: hfu <askfuhu@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-05-09 11:38:02 +02:00
hfu	b6e2c8024d	Indent: Remove space in negation of expression Signed-off-by: hfu <askfuhu@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-05-09 11:37:47 +02:00
Jan Friesse	7557fdec48	config: Allow dynamic change of token_coefficient A token_coefficient change in cmap didn't trigger a change, so the only way to change token_coefficient was to edit the config file and reload. Patch lets key totem.token_coefficient be processed so token_coefficient can be dynamically changed. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-05-07 15:55:26 +02:00
Jan Friesse	58176d6779	Add token_coefficient option Token coefficient is used only when nodelist is specified and contains at least 3 nodes. If so, real token timeout is then computed as token + (number_of_nodes - 2) * token_coefficient. This allows cluster to scale without manually changing token timeout every time new node is added. This value can be set to 0 resulting in effective removal of this feature. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-03-25 15:29:17 +01:00
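The timeout computation described in this commit message can be sketched as follows. This is a minimal illustration and not corosync's code: the function name `effective_token_timeout` and the sample values in the usage note are assumptions; only the formula and the three-node threshold come from the message above.

```python
def effective_token_timeout(token_ms: int, node_count: int,
                            token_coefficient_ms: int) -> int:
    """Return the real token timeout in milliseconds.

    Hypothetical helper: the coefficient applies only when the nodelist
    contains at least 3 nodes, as stated in the commit message.
    """
    if node_count < 3:
        # Fewer than 3 nodes: the configured token timeout is used unchanged.
        return token_ms
    return token_ms + (node_count - 2) * token_coefficient_ms
```

For example, with illustrative values of token = 1000 ms and token_coefficient = 650 ms, a 5-node nodelist gives 1000 + 3 * 650 = 2950 ms. Setting the coefficient to 0 leaves the token timeout unchanged regardless of the node count.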
Jan Friesse	9a8de87c34	totemconfig: Log errors on key change and reload When volatile key was changed (cmap set or reload) and checks fails, nothing was logged. Values are now checked and error string is logged on problems. Also totem_config is dumped to log (DEBUG level) after every volatile key change and every reload. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-03-25 15:29:14 +01:00
Jan Friesse	b95ebd640e	totemconfig: Key change process dependencies When a key with dependencies was changed, dependent keys were not recomputed. A nice example is consensus timeout. If token timeout was changed, consensus timeout was not recomputed correctly (neither via cmap change of key nor via cfg reload). Solution is almost complete refactor of handling volatile defaults. totem_volatile_config_read now handles not only storing cmap key to totem_config structure, but also checking of existence, comparing with zero value and properly storing defaults. totem_set_volatile_defaults is gone. Its functionality was split into the totem_volatile_config_read and totem_volatile_config_validate functions. Reload callback and change of key callback are now mostly the same functions and both call totem_volatile_config_read. Patch also fixes small memory leak. totem.vsftype key has not been used for a long time and original totem_volatile_config_read wasn't freeing allocated memory returned by icmap_get_string. Whole reading of totem.vsftype is removed. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-03-25 15:29:12 +01:00
Jan Friesse	eeb2384157	Really clear totemconfig nodes on reload When reload was called nodes were constantly added to totemconfig nodelist. So simple corosync-cfgtool -R resulted very quickly in filling whole array and segfault. Solution is to clear member_count. Clearing is also moved directly to put_nodelist_members_to_config to make sure it's always processed. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-03-25 15:29:09 +01:00
Jan Friesse	1b6abcc7d5	Log: Make reload of logging work When reload was called multiple times (~20), logging to file stopped working. Main problem was hidden in the fact that the log file was opened multiple times, because even though target_id was shared via subsystem loggers, file name was not. Solution is to ALWAYS set proper log file name into subsystem logger (copy is stored). This not only fixes the problem but also removes a small leak. Also, if filename didn't change, function can return sooner. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-03-25 15:13:33 +01:00
Jan Friesse	2f0cad20a9	config: Handle totem_set_volatile_defaults errors When totem_set_volatile_defaults is called from totem_config_validate return code is unchecked. It's then perfectly possible to set (for example) join timeout to very small value (1) and consensus value is then set to 0 making corosync unable to create membership. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-03-17 10:04:00 +01:00
Jan Friesse	e1801ba497	votequorum: Properly initialize atb and atb_string icmap_get_* behavior is to NOT modify the passed variable when it doesn't succeed. So we must initialize the variable before the icmap_get_* call. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2014-02-26 16:59:02 +01:00
Jan Friesse	ff67daa55f	mon: Make monitoring work Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-25 14:57:20 +01:00
Jan Friesse	099f704cdd	mon: Pass correct pointer to inst Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-25 14:57:16 +01:00
Jan Friesse	57ff693b70	mon: Fix comparsion typo Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-25 14:57:13 +01:00
Jan Friesse	e1e2390b61	mon: Make mon compilable with libstatgrab ver 0.9 Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-25 14:57:10 +01:00
Jan Friesse	fbe8768f1b	cpg: Make sure left nodes are really removed When a node is paused and other nodes have in the meantime exited a cpg process, the paused node doesn't update its membership correctly after resume, so the exited cpg process is still visible on the previously paused node. Solution is to compare join list with cpd and remove all pids which are not included in join list. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-19 10:59:14 +01:00
Jan Friesse	83c63b247f	cpg: Make sure nodid is always logged as hex num Also number is prefixed by 0x so it's easier to spot that number is hexadecimal. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-19 10:59:10 +01:00
Jan Friesse	fcf26e0303	cpg: Refactor mh_req_exec_cpg_procleave Most of functionality is moved to do_proc_leave function to make it reusable. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-19 10:59:05 +01:00
Jan Friesse	38c04d9a66	totemsrp: Fix typo with cont gather Patch `f3ffd3da5c` introduced named states of state-machine, but sadly contains logical problem causing stats.continuous_gather increasing even when it shouldn't. Problem is not critical, because continuous_gather is set to 0 on successful membership creation. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-02-18 16:12:57 +01:00
Christine Caulfield	90d448af3b	votequorum: Add extended options to auto_tie_breaker This patch adds more flexibility to the auto_tie_breaker feature of votequorum. With this, not only can the lowest nodeid be used as a tie breaker, but also the highest, or a node from a nominated list. If there is a list of nodes, the first node in the list that was not part of the previous partition is used. This allows the user to specify a preferred set of nodes but prevents a split-brain if the cluster divides evenly with a node in each half. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-02-17 16:29:45 +00:00
Masatake YAMATO	fa71067a93	Free object allocated at quorum_register_callback Memory object allocated with malloc at quorum_register_callback is not freed. The object is linked to internal_trackers_list. The object is unlinked at quorum_unregister_callback. However, it is not freed at the function. Signed-off-by: Masatake YAMATO <yamato@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-01-23 17:18:44 +01:00
Jan Friesse	45dd9861ff	Properly check result of symlink Error message is displayed when it's impossible to create symlink to fdata file. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-01-14 11:24:31 +01:00
Jan Friesse	5c54f941ac	Fix cppchecks warning Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-01-14 11:24:29 +01:00
Jan Friesse	178c0d82d9	Close devnull file handler Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2014-01-14 11:24:26 +01:00
Jason	cfbb021e13	totem: Drop invalid join msg in operational state According to the totem paper, if a processor receives a join message in the operational state and if the receiver's identifier is in the join message's fail list, then the join message should be ignored. By applying this validation of join messages, we can avoid unnecessary switching from the operational state to the gather state (or even rings that cannot be merged), as in the following scenario. 1. Initially, there is only one ring containing three nodes, say ring(A,B,C). 2. A and B are partitioned by the network and, at the same time, C goes down. 3. Node A sends join message with proclist:A,B,C. faillist:NULL. Node B sends join message with proclist:A,B,C. faillist:NULL. 4. Both A and B hit the consensus timeout due to the network partition. 5. The network between A and B is remerged. 6. Node A sends join message with proclist:A,B,C. faillist:B,C. and creates ring(A). Node B sends join message with proclist:A,B,C. faillist:A,C. and creates ring(B). 7. Say the join message with proclist:A,B,C. faillist:A,C sent by node B is received by node A because the network remerged. 8. Node A shifts to gather state and sends out a modified join message with proclist:A,B,C. faillist:B. Such a join message will prevent both A and B from merging. 9. Node A hits the consensus timeout (caused by waiting for node C) and sends join message with proclist:A,B,C. faillist:B,C again. The same thing happens on node B, so A and B will loop forever in steps 7, 8 and 9. As the paper also says: "If a processor receives a join message in the operational state and if the sender's identifier is in the receiver's my_proclist and the join message's ring_seq is less than the receiver's ring sequence number, then it ignores the join message too." So this patch applies these validations of join messages altogether. Signed-off-by: Jason <huzhijiang@gmail.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2014-01-13 14:46:13 +01:00
Christine Caulfield	ff6a43edb3	votequorum: Add persistent expected_votes tracking. This patch adds the option to store expected_votes to persistent storage. This is needed to allow_downscale to operate properly. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2014-01-07 15:30:11 +00:00
Jan Friesse	b88c0766fe	logsys: Make logging of totem work again Because of a change in libqb (9abb686), logging of the TOTEM subsystem stopped working. Instead of relying on the previous behavior (implicit substring match), all totem files are now explicitly given. Also, the QB subsystem now uses a comma-separated file list instead of the previous function calls. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-11-04 12:32:35 +01:00
Masatake YAMATO	f3ffd3da5c	totemsrp: Show English message when memb_state_gather_enter is called The reason why memb_state_gather_enter is invoked was printed in integer code. This patch introduces human readable English messages for the code. Signed-off-by: Masatake YAMATO <yamato@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-10-24 16:46:17 +02:00
Yevheniy Demchenko	805b3423ee	totemiba: Check if configured MTU is allowed by HW The solution uses an approximation of totem structures. This needs to be rewritten in a proper way. Also, MTU checking should be implemented for IP transports. Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz> Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-20 11:27:08 +02:00
Yevheniy Demchenko	8f14a5788f	totemiba: Fix parameters position for poll_add Parameters in functions like mcast_cq_send_event_fn, ... were defined in incorrect order. Also their names were weird. Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz> Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-20 11:26:50 +02:00
Yevheniy Demchenko	c5d4a0762f	totemiba: Del channel fd from poll before destroy Corosync freezes after several peer node connects/disconnects. The freeze happens in recv_token_cq_recv_event_fn in the ibv_get_cq_event call. The problem is that after each peer node connect, recv_token_accept_destroy is called, which tries to call poll_dispatch_delete _after_ freeing completion_channel. As completion_channel contains the fd, handlers are not disconnected from the poller properly. This leads to complete inconsistency in subsequent calls to handlers. Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-20 11:26:04 +02:00
Yevheniy Demchenko	5046de387b	totemiba: Properly allocate RDMA buffers 1. In UD mode the receiving side of an RDMA application should have enough space in the buffer to hold the data and the GRH. Also, sge.length on the receiving side should be set to max_msg_size + sizeof (struct ibv_grh). Current corosync doesn't take the GRH into account and does not work if mtu is set to the real mtu of the IB port (it works if netmtu is set to < 2048-40). 2. ibv_wc.byte_len is the actual length of the received packet, i.e. msg_len + GRH. The GRH length should be subtracted in further processing. If not, it might cause problems when messages get retransmitted, as their apparent size will constantly grow. 3. Current corosync will not work with rdma and mtus > 2048. Most modern IB HW supports 4096 mtu. Signed-off-by: Yevheniy Demchenko <zheka@uvt.cz> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-20 11:26:00 +02:00
Christine Caulfield	1a046793cb	Reload: Add atomic reload to log config When a reload is in progress, wait until it has all finished before re-reading all of the logging parameters Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-12 16:10:07 +01:00
Christine Caulfield	c0bfd48928	Reload: Add atomic reload to totemconfig When a reload is in progress, wait until the whole thing has finished before setting parameters Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-12 16:09:55 +01:00
Christine Caulfield	82fbffc34b	Reload: Add reload code to cfg Add the code to do the actual corosync.conf reload to cfg, along with a corosync-cfgtool -R command to trigger it Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-12 16:09:41 +01:00
Christine Caulfield	bc47c583bd	Reload: Make coroparse use a designated icmap hash table Pass an icmap hashtable into coroparse so we can load it into a temporary one during reload Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-12 16:09:06 +01:00
Jan Friesse	95133a5d77	icmap: Add func to test equality of two key values Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-09-10 17:02:12 +02:00
Christine Caulfield	8567887abb	[PATCH] Replace freopen with open/dup2 when daemonizing This patch replaces the existing freopen method of forcing stdin/out/err to /dev/null with the more usual system of open/dup2. While I don't like posting patches I don't fully understand, this patch seems to fix a problem where stdout/err get assigned to a socket causing double logging output on systemd. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-10 15:33:31 +01:00
Christine Caulfield	3663622576	Add log message to exit signal handler I've seen a few instances where corosync has shut down for apparently 'no reason'. In fact most of the time the shutdown has been caused by an external source (often an init script) but it's not been obvious what has happened and people implicate the daemon. This patch simply adds a log message to the signal handler when it is called so that the cause of the shutdown is obvious. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-09-03 14:04:50 +01:00
Jan Friesse	26ef8e15db	icmap: Add map copy function Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-08-29 17:08:46 +02:00
Jan Friesse	e363f8b06d	icmap: Add function to return item data pointer icmap_get_r is now implemented using this function. Function is not very safe tho defined as static. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-08-29 17:08:41 +02:00
Jan Friesse	624cd439aa	icmap: Fix value len checking for strings Implementation should allow pass only parts of string (shorten string) and must prohibit reading of uninitialized memory. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-08-29 17:08:37 +02:00
Jan Friesse	04ddddd6d2	icmap: Add function to return global icmap Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-08-29 17:08:32 +02:00
Jan Friesse	e5a528c5cb	icmap: Allow multiple icmap instances Patch adds reentrant version of most of functions (with exception of RO flags support and tracking) to allow multiple icmap instances existence inside corosync. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-08-27 15:23:52 +02:00
Michael Chapman	2740cfd1ea	Fix scheduler pause-detection timeout qb_loop_timer_add expects the timeout to be in nanoseconds, but we were passing the value in milliseconds. Scale the timeout appropriately. Signed-off-by: Michael Chapman <mike@very.puzzling.org> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-08-19 09:03:24 +02:00
David Vossel	b424acc3a0	ipc_glue: proper ref counting during service connection iteration Signed-off-by: David Vossel <dvossel@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-07-04 13:05:52 +02:00
David Vossel	aa8e56a0fe	ipc_glue: Remove connection unref with no matching reference. We don't reference the connection object on creation, so there is no reason to dereference it on disconnect. Signed-off-by: David Vossel <dvossel@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-07-04 13:05:36 +02:00
David Vossel	771b239603	ipc_glue: Fixes connection ref count leak Signed-off-by: David Vossel <dvossel@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-07-04 13:05:02 +02:00
Christine Caulfield	074e57910e	The corosync message "A processor joined or left the membership" is vague and unhelpful. People have to look for the following quorum message and try to deduce which nodes have joined or left from that and past membership messages, even though the routine printing the message already has this information to hand. This patch fixes that message so that it prints the nodeids of the nodes that have joined/left the cluster. Signed-Off-By: Christine Caulfield <ccaulfie@redhat.com> Reviewed-By: Jan Friesse <jfriesse@redhat.com>	2013-06-27 14:44:46 +01:00
Jan Friesse	615d7592fb	Log: Output parse errors to syslog When corosync was started in daemon mode and there was a parse error, there was no way to find out what happened (this is the usual situation on systemd-enabled systems). The solution is to output to syslog by default. Also, a redundant line setting logsys is removed because it's no longer needed: the FORK and THREADED mode options no longer have any effect. FORK is handled by libqb by default and THREADED mode is forced by calling logsys_thread_start. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-21 11:21:42 +02:00
Jan Friesse	d6dd2e455d	totemconfig: Prevent leak of cluster_name str Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-21 11:21:33 +02:00
Jan Friesse	7cba14fb61	service: Fix memleak in service_unlink_and_exit Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-21 11:21:29 +02:00
Jan Friesse	514eb0f37d	ipc_glue: Check service name len Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-18 14:36:12 +02:00
Jan Friesse	f7beba46c5	ipc_glue: Introduce constant for service name len Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-18 14:36:12 +02:00
Jan Friesse	90da72cd7f	cfg: Check interface status and name length Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-18 14:36:12 +02:00
Jan Friesse	335da1ecfd	cfg: Check number of interfaces Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-18 14:36:12 +02:00
Jan Friesse	5dc3fc4bda	totemrrp: Make status string shorter The status string should be the same length as needed for the cfg ringstatusget function. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>	2013-06-18 14:36:11 +02:00
Jan Friesse	845a625908	totem: Don't leak instance variable on crypto fail Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-18 14:35:25 +02:00
Jan Friesse	93286a344e	totemudpu: Handle fd leak in totemudpu Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-18 14:35:21 +02:00
Jan Friesse	421de34972	totemconfig: Check length of rrp_mode string Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-18 14:35:15 +02:00
Jan Friesse	675da75759	coroparse: Ensure that config items fits into cmap Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-18 14:35:05 +02:00
Jan Friesse	e094ab2e2c	votequorum: Prevent leak in qdevice_is_configured Also LEAVE from function is now properly logged. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-17 15:47:27 +02:00
Jan Friesse	4310d84e4d	Initialize error variable in ykd_init Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:57 +02:00
Jan Friesse	92b900da67	Initialize node_found in nodelist_to_interface fun Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:57 +02:00
Jan Friesse	903e02875d	Initialize item in cmap_mcast_send Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:56 +02:00
Jan Friesse	f198955644	votequrorum: Assert sender nodeid is known Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:56 +02:00
Jan Friesse	56ee492471	Check result of logsys_subsys_create Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:56 +02:00
Jan Friesse	d5d4cdb972	Check logsys_format_set result in logsys setup Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:56 +02:00
Jan Friesse	90f8a68a2b	Use proper totem_ip_address size in memset Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:56 +02:00
Jan Friesse	df6b87f293	Free icmap strings in logconfig Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:56 +02:00
Jan Friesse	ce9c69da03	Properly break MAIN_CP_CB_DATA_STATE_QDEVICE state Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:55 +02:00
Jan Friesse	d5d3fb4d45	Do not dereference format_buffer when it's NULL Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:55 +02:00
Jan Friesse	96a89a0085	Check icmap str get for clustername Even this check is really not needed, it's nice to have it and on fault ensure that cluster_name is really NULL. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:55 +02:00
Jan Friesse	966f461b69	Properly check result of stat func in coroparse Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-06-13 10:53:55 +02:00
Jan Friesse	e684e4ca6f	Remove unnecessary mmap in cpg Code for zero-copy in cpg does following mmaps: - Mmap anonymous, private memory to some address (-> malloc) - Mmap shared memory of fd to address returned by first mmap (effectively shadows first mapping) This is not necessary and only one mapping is needed. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Steven Dake <sdake@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-05-21 14:46:15 +02:00
Jan Friesse	8429d01389	Detect big scheduling pauses Add poll timer scheduler to be called 3 times per token timeout. If the poll timer was not called for more than 0.8 * token timeout, it means the corosync process was not scheduled and either token_timeout should be increased or load should be reduced (useful for VMs, where the host is overcommitted so the VM is not scheduled as expected). Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-04-08 09:58:42 +02:00
Jan Friesse	86b074dc1a	Support for numerical uid/gid Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-04-02 09:32:10 +02:00
Andrei Belov	005e7fd3b9	Improved POSIX-compliant handling of getpwnam_r() and getgrnam_r(). Signed-off-by: Andrei Belov <defanator@gmail.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-03-28 16:32:53 +01:00
Jan Friesse	0e3d1a9c51	totempg: Make iov_delv local variable Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-03-21 14:24:23 +01:00
Xia Li	ca6051e80c	Convert the nodeid byte order to be aligned with network order When using corosync with clear_node_high_bit setting to yes, the highest bit is cleared. When all the cluster nodes are in one subnet, we probably configure the IP addresses as follows: node1: 147.2.207.64 node2: 147.2.207.192 If the byte order of the nodeid is little endian, wiping off the highest bit will make the two nodes have the same nodeid! This patch fixes this by converting the nodeid to network order. Signed-off-by: Xia Li <xli@suse.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-03-19 16:39:59 +01:00
Jeremy Fitzhardinge	52f88d04ea	Handle ERANGE from getpwnam_r / getgrnam_r These functions return ERANGE if the supplied buffer is too small to fit a line. Try doubling the buffer a few times until it works.	2013-03-07 16:59:51 -08:00
Jan Friesse	66172a501a	Handle unexpected closing brace in config file If configuration file contains closing brace before opening brace at top level, configuration parsing is stopped and file is not completely parsed. Solution is to detect extra closing brace and display error. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-01-31 16:11:22 +01:00
Jan Friesse	663489d277	Handle colon in configuration file If colon was entered as part of value on end of value, it is deleted. This makes impossible to enter (legal) IPv6 address ending with :: (like fed0::). Also when line contains both brace and colon, it is parsed twice (first as key = value and second as start of section). This is handled by continue in if section. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2013-01-31 16:11:18 +01:00
Fabio M. Di Nitto	98d0245c7e	votequorum: port to sync API (take 2) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-01-31 15:32:07 +01:00
Fabio M. Di Nitto	55dc09ea23	totemconfig: enforce hmac config when crypto is enabled Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-01-14 12:31:47 +01:00
Kazunori INOUE	1ad21e384e	log: move Corosync started log messages "Corosync Cluster Engine ... started" message is shown after logsys is full configured. Signed-off-by: Kazunori INOUE <inouekazu@intellilink.co.jp> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-01-14 11:52:26 +01:00
Fabio M. Di Nitto	ed6bca3293	crypto: drop < 2.3 protocols and onwire compat Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-01-14 11:49:32 +01:00
Fabio M. Di Nitto	b3f456a8ce	totemcrypto: fix hmac key initialization Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2013-01-14 11:23:32 +01:00
Jan Friesse	6127be1806	Move qb_loop creation after daemonization Creating qb_loop before daemonization is not problem for poll or epoll type loops, but it's problem for kqueue, because kqueue is not shared in child with parent after fork. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-12-12 11:47:42 +01:00
Jan Friesse	dd588d004e	Add option to specify ip version Default is ipv4. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-12-03 14:02:32 +01:00
Jan Friesse	92e0f9c7bb	Add waiting_trans_ack also to fragmentation layer Patch for support waiting_trans_ack may fail if there is synchronization happening between delivery of fragmented message. In such situation, fragmentation layer is waiting for message with correct number, but it will never arrive. Solution is to handle (callback) change of waiting_trans_ack and use different queue. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-22 11:48:12 +01:00
Jan Friesse	2d4e7bebb5	Handle segfault in backlog_get If instance->memb_state is not OPERATION or RECOVERY, we were passing NULL to the cs_queue_used call. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-22 11:48:07 +01:00
Steven Dake	402638929e	Fix problem with sync operations under very rare circumstances This patch creates a special message queue for synchronization messages. This prevents a situation in which messages are queued in the new_message_queue but have not yet been originated from corrupting the synchronization process. Signed-off-by: Steven Dake <sdake@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-22 11:47:57 +01:00
Fabio M. Di Nitto	220d659b38	totemcrypto: implement crypto packet format 2.2 and crypto_compat: config opt Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-11-22 11:13:30 +01:00
Evgeny Barskiy	e3f615b4a0	corosync to start in infiniband + redundant ring active/passive mode Corosync now works with infiniband transport in any redundant ring mode Signed-off-by: Evgeny Barskiy <barskiy@rts.ru> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-11-21 10:28:57 +01:00
Fabio M. Di Nitto	ed63c812af	votequorum: fix handling of expected_votes/votes changes from cmapctl and allow natural selection to take place.... Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-11-20 15:45:57 +01:00
Jan Friesse	3cd4f9a1f5	Add support for selecting IPC type Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-08 12:16:11 +01:00
Jan Friesse	89809ec80e	Check successful initialization of IPC Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-08 12:16:06 +01:00
Angus Salkeld	abc3b6abed	Try reduce the number of sprintf's Signed-off-by: Angus Salkeld <asalkeld@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-11-07 21:28:31 +11:00
Jan Friesse	d4db2ea535	If failed_to_recv is set, consensus can be empty If failed_to_recv is set (node detect itself not able to receive message), we can end up with assert, because my_failed_list and my_member_list are same list. This is happening because we are not following specification and we allow to mark node itself as failed. Because if failed_to_recv is set and we reached consensus across nodes, single node membership is created (ignoring both fail list and member_list), we can skip assert. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-11-05 15:16:25 +01:00
Jacek Konieczny	07832748f2	link libtotem_pg to libqb The libtotem_pg library uses symbols from libqb, so it should be explicitly linked with it. This doesn't cause problems for the corosync binary itself, as it is linked to both libraries, but can cause problems if anything else links to libtotem_pg.so, and automated checkers can show this as a library problem. Signed-off-by: Jacek Konieczny <jajcus@jajcus.net> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-10-29 16:49:19 +01:00
Jan Friesse	8a9869eeec	Correctly check if service was unloaded my_processing_idx is pointer to received service list, instead of global service number. If we check state of service we should use service_id instead of my_processing_idx. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-17 15:06:36 +02:00
Jan Friesse	c165bf4f51	Define AES_*_KEY_LENGTH if not defined Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-17 15:06:32 +02:00
Fabio M. Di Nitto	20c5871525	totemcrypto: add support for different encryption methods (backport from nsscrypto kronosnet code) Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-10-15 10:00:16 +02:00
Jan Friesse	fc50443f5f	Make totemiba compile again Signed-off-by: Jan Friesse <jfriesse@redhat.com>	2012-10-08 17:44:09 +02:00
Jan Friesse	b7635ab9f7	Return back "Totem is unable to form..." message This patch brings back the functionality named in the subject. It relies on the fact that sendmsg will return an error, and if such an error is returned for a long time, it's probably because of a firewall. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-08 16:53:35 +02:00
Jan Friesse	d042671369	Move "Totem is unable to form..." message to main Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-08 16:53:33 +02:00
Jan Friesse	6c3b337b37	Use unix socket for local multicast loop Instead of relying on the multicast loop functionality of the kernel, we now use a unix socket created by socketpair to deliver multicast messages to the local node. This handles problems with an improperly configured local firewall. So if output/input to/from the ethernet interface is blocked, the node is still able to create a single node membership. The downside of the patch is that membership is always created, so "Totem is unable to form a cluster..." will never appear (the same applies to the continuous_gather key). Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-08 16:53:30 +02:00
Jan Friesse	4354ed6ecb	Store config_version of other nodes Config version of other nodes is stored in runtime.totem.pg.mrp.srp.members.NODEID.config_version key. Also when local config_version is changed, all nodes are informed. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-03 11:26:35 +02:00
Jan Friesse	d2a85593c4	Support for check of config version on start Config version is requested from other nodes. If our config version is not 0 and differs from the highest config version of other nodes, corosync quits. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:32 +02:00
Jan Friesse	73b0fe688d	Make cmap_mcast_send return correct error code Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:28 +02:00
Jan Friesse	a273be58ae	Make service_build contain correct number of msgs Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:24 +02:00
Jan Friesse	3c019f2130	Align items in cmap_mcast_send Aligning function (kernel style magic) MAR_ALIGN_UP is used for aligning of items in req_exec_cmap_mcast message. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:20 +02:00
Jan Friesse	2214a60639	Support for flt and dbl in mcast_endian_convert Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:17 +02:00
Jan Friesse	cbaa2977ae	Add support for sending cmap values to wire Function is little more complex, but it is designed to be used in future without big changes. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:07 +02:00
Jan Friesse	6825c1d39b	Parse config_version as 64-bit uint Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-10-02 16:04:02 +02:00
Jan Friesse	373ded0652	Don't access invalid mem in totemconfig interfaces When ringnumber in config file was set to value bigger or equal to INTERFACE_MAX, we are using this big value as index to totemconfig interfaces array, resulting to access to invalid memory and segfault. Instead of that, ringnumber is now checked and proper error message is printed if value is too big. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-09-27 13:54:39 +02:00
Jan Friesse	5ce59f49ba	Move some totem and cpg messages to trace level Messages which are flow messages, rather than lifecycle messages, are now logged at trace level. Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-09-19 11:03:16 +02:00
Jan Friesse	5717655019	Add support for debug level trace in config file Signed-off-by: Jan Friesse <jfriesse@redhat.com> Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>	2012-09-19 11:03:10 +02:00
Fabio M. Di Nitto	8a2e936381	icmap: fix mapping return codes Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Angus Salkeld <asalkeld@redhat.com>	2012-09-12 08:18:50 +02:00
Fabio M. Di Nitto	bb5946babb	build: clean AM_CFLAGS and AM_CPPFLAGS usage around also set common include dirs. fPIC and DPIC are automatically detected and added as required by libtool. We don't need to carry them around. Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com> Reviewed-by: Jan Friesse <jfriesse@redhat.com>	2012-09-07 09:04:07 +02:00
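
The nodeid collision described in commit `ca6051e80c` can be reproduced with plain integer arithmetic. The following is a minimal sketch, not the corosync implementation: the helper names (`nodeid_little_endian`, `nodeid_network_order`, `clear_high_bit`) are hypothetical, and the mask assumes that `clear_node_high_bit` simply drops bit 31 of a 32-bit nodeid derived from the IPv4 address.

```c
#include <stdint.h>

/* Hypothetical helper: read the four address bytes (stored in network
 * order) the way a little-endian host would load them as a 32-bit value. */
uint32_t nodeid_little_endian(const uint8_t b[4])
{
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Hypothetical helper: read the same bytes in network (big-endian) order,
 * so the first octet of the address ends up in the most significant byte. */
uint32_t nodeid_network_order(const uint8_t b[4])
{
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Assumed effect of clear_node_high_bit: drop bit 31 of the nodeid. */
uint32_t clear_high_bit(uint32_t nodeid)
{
	return nodeid & 0x7FFFFFFFU;
}
```

With the addresses from the commit message, 147.2.207.64 and 147.2.207.192, the little-endian reading puts the last octet (0x40 versus 0xC0) in the most significant byte, so clearing the high bit leaves both nodes with 0x40CF0293. In network order the octet that loses its high bit is the shared 147, and the two nodeids stay distinct (0x1302CF40 and 0x1302CFC0).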

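The arithmetic behind commits `8429d01389` (detect big scheduling pauses) and `2740cfd1ea` (pass the timeout in nanoseconds) can be sketched as two pure functions. This is an illustration under stated assumptions, not the code in corosync: the function names and the `NS_IN_MSEC` constant are hypothetical, and only the figures quoted in the commit messages (three timer calls per token timeout, a 0.8 * token timeout threshold, a nanosecond timer unit) are taken from the log.

```c
#include <stdbool.h>
#include <stdint.h>

#define NS_IN_MSEC 1000000ULL /* nanoseconds per millisecond */

/* Interval between poll-timer callbacks: three per token timeout.
 * The result is in nanoseconds, the unit the timer is said to expect;
 * passing the millisecond value unscaled was the bug fixed later. */
uint64_t pause_timer_interval_ns(uint32_t token_timeout_ms)
{
	return ((uint64_t)token_timeout_ms * NS_IN_MSEC) / 3;
}

/* Report a scheduling pause when the gap between two consecutive
 * callbacks exceeds 0.8 * token timeout. Timestamps are monotonic
 * nanosecond counters supplied by the caller. */
bool pause_detected(uint64_t last_ns, uint64_t now_ns, uint32_t token_timeout_ms)
{
	uint64_t threshold_ns = ((uint64_t)token_timeout_ms * NS_IN_MSEC * 8) / 10;

	return (now_ns - last_ns) > threshold_ns;
}
```

For a 3000 ms token timeout the timer would fire every second and the threshold would be 2.4 s, so a callback arriving on schedule is silent while one arriving 2.5 s after its predecessor is flagged.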

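Commit `52f88d04ea` describes retrying `getpwnam_r` / `getgrnam_r` with a doubled buffer when they return ERANGE. A minimal sketch of that retry loop follows; it is not the corosync code. The wrapper name `lookup_uid`, the starting buffer size and the retry limit are all assumptions chosen for illustration; only the `getpwnam_r` call itself is the standard POSIX interface.

```c
#include <errno.h>
#include <pwd.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical wrapper: returns 0 and stores the uid on success,
 * -1 if no such user exists, or a positive errno value on failure. */
int lookup_uid(const char *name, uid_t *uid_out)
{
	size_t buflen = 64; /* assumed small start so the retry path can trigger */
	int attempt;

	for (attempt = 0; attempt < 10; attempt++) {
		struct passwd pwd;
		struct passwd *result = NULL;
		char *buf = malloc(buflen);
		int rc;

		if (buf == NULL) {
			return ENOMEM;
		}

		rc = getpwnam_r(name, &pwd, buf, buflen, &result);
		if (rc == 0 && result != NULL) {
			*uid_out = result->pw_uid;
			free(buf);
			return 0;
		}
		free(buf);

		if (rc == 0) {
			return -1; /* lookup succeeded but found no entry */
		}
		if (rc != ERANGE) {
			return rc; /* a real error, not a short buffer */
		}
		buflen *= 2; /* buffer too small: double it and try again */
	}

	return ERANGE; /* still too small after all retries */
}
```

The same pattern would apply to `getgrnam_r` with `struct group` in place of `struct passwd`. Bounding the number of retries keeps a misbehaving name service from causing unbounded allocation.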
2100 Commits