1978 Commits

Author SHA1 Message Date
Christine Caulfield
137b31397c knet: Don't try to create loopback interface twice
It wasn't harmful, but it generated an annoying message.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-07-02 08:00:36 +02:00
Christine Caulfield
5dda71ae29 knet: Fix knet log buffer size
knet sends log messages as struct knet_log_msg, not a string
of KNET_MAX_LOG_MSG_SIZE (which is only part of that structure).
So we were both losing and corrupting messages.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-07-02 08:00:15 +02:00
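
A minimal sketch of the size mismatch described in 5dda71ae29, assuming a log record that carries a fixed-size text field plus some metadata. The names `example_log_msg`, `EXAMPLE_MAX_LOG_MSG_SIZE` and `example_read_log_record` are hypothetical stand-ins, not the libknet definitions. The sketch only shows that a reader has to size its buffer by the whole structure rather than by the text field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical size of the text part of one log record. */
#define EXAMPLE_MAX_LOG_MSG_SIZE 254

/* Hypothetical record layout: text plus metadata. */
struct example_log_msg {
	char msg[EXAMPLE_MAX_LOG_MSG_SIZE];
	uint8_t subsystem;
	uint8_t msglevel;
};

/*
 * Copy one record out of buf.
 * Returns 1 on success, 0 if buf holds less than a whole record.
 */
int example_read_log_record(const void *buf, size_t len,
	struct example_log_msg *out)
{
	assert(buf != NULL && out != NULL);

	/* A buffer sized for the text field alone is too small. */
	if (len < sizeof(*out)) {
		return 0;
	}
	memcpy(out, buf, sizeof(*out));
	return 1;
}
```

A buffer of `EXAMPLE_MAX_LOG_MSG_SIZE` bytes is rejected here, while one of `sizeof(struct example_log_msg)` bytes is accepted.
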
Jan Friesse
23e17953fe cpg: Inform clients about left nodes during pause
Patch tries to fix incorrect behaviour during the following test case:
- 3 nodes
- Node 1 is paused
- Nodes 2 and 3 detect node 1 as failed and inform CPG clients
- Node 1 is unpaused
- Node 1 clients are informed about the new membership, but not about Node 1
  being paused, so from Node 1's point of view, Nodes 2 and 3 failed

Solution is to:
- Remove downlist master choose and always choose local node downlist.
  For Node 1 in example above, downlist contains Node 2 and 3.
- Keep code which informs clients about left nodes
- Use joinlist as an authoritative source of nodes/clients which exist
  in membership

This patch doesn't break backwards compatibility.

I've walked through all the patches which changed the behavior of cpg to
ensure this patch does not break CPG behavior. The most important were:
- 058f50314c - Base. Code was significantly
  changed to handle double free by splitting group_info into two structures
  cpg_pd (local node clients) and process_info (all clients). Joinlist
  was
- 97c28ea756 - This patch removed
  confchg_fn and made CPG sync correct
- feff0e8542 - I've tested described
  behavior without any issues
- 6bbbfcb6b4 - Added idea of using
  heuristics to choose same downlist on all nodes. Sadly this idea
  was beginning of the problems described in
  040fda8872,
  ac1d79ea7c,
  559d4083ed,
  02c5dffa5b,
  64d0e5ace0 and
  b55f32fe2e
- 02c5dffa5b - Made joinlist the
  authoritative source of nodes/clients but left downlist_master_choose
  as a source of information about left nodes

Long story short: this patch basically reverts the
idea of using heuristics to choose the same downlist on all nodes.

(ported from needle 9c2a97f4f9)

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-04-30 14:37:20 +02:00
Jan Friesse
e45bbcc92a totemsrp: Fix leave message regression
A leave message in totem is just a join message in which the leaving
member is excluded from the member list and included in the fail list. It
also contains a special nodeid in the header.nodeid and system_from.nodeid
fields.

Before "totem: Use nodeid ONLY in srp_addr" fix, most of the functions
were using system_from addresses and not nodeid, which was used only in
one specific case for memb_consensus_set function.

After the patch, addresses are gone and only nodeid is used. The result is
that the leaving node's nodeid is not added to the local fail list
(my_faillist), so the node is unable to reach consensus until the token
timeout, which starts a new gather process.

Solution is to send valid leaving node nodeid in system_from.nodeid and
handle specific case for memb_consensus_set in memb_join_process.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-04-23 17:46:05 +02:00
Jan Friesse
dc590159f5 totemsrp: Log proc/fail lists in memb_join_process
This information is useful, and at trace log level it should not be
too irritating.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-04-23 17:45:51 +02:00
Jan Friesse
9b3782e48e totemsrp: Fix srp_addr_compare
There is a regression caused by the "totem: Use nodeid ONLY in srp_addr"
patch in the srp_addr_compare function. This function should be usable
with qsort, so it should return a value less than, equal to or greater
than zero. It was, however, returning only zero or the negation of zero.
As a result, nodes were unable to reach consensus in the following test
case:
- 3 node cluster
- start nodes 1, 2, 3
- shutdown node 3
- start node 3
- shutdown node 2
- start node 2
- shutdown node 1

After these steps, nodes 2 and 3 were unable to reach consensus.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-04-23 17:45:29 +02:00
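
The comparator contract described in 9b3782e48e can be sketched as follows. This is an illustration only: `example_srp_addr` is a hypothetical structure holding just a nodeid, and the actual corosync function may differ. The point is that a qsort comparator needs three distinct outcomes (negative, zero, positive), not just "equal or not equal".

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical address type that identifies a node by nodeid only. */
struct example_srp_addr {
	unsigned int nodeid;
};

/*
 * qsort-compatible comparator: returns a value less than, equal to
 * or greater than zero. Explicit comparisons are used instead of a
 * subtraction, which could wrap around for unsigned values.
 */
int example_srp_addr_compare(const void *a, const void *b)
{
	const struct example_srp_addr *x = a;
	const struct example_srp_addr *y = b;

	assert(x != NULL && y != NULL);

	if (x->nodeid < y->nodeid) {
		return -1;
	}
	if (x->nodeid > y->nodeid) {
		return 1;
	}
	return 0;
}

/* Sort a member list in ascending nodeid order. */
void example_sort_addrs(struct example_srp_addr *list, size_t count)
{
	qsort(list, count, sizeof(*list), example_srp_addr_compare);
}
```

With a comparator that only ever reports "equal" or "greater", qsort may leave lists in different orders on different nodes, which would be consistent with the failure to reach consensus described above.
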
Ferenc Wágner
baece74c39 Fix typo: sucesfully -> successfully
Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-04-20 12:04:49 +02:00
Jan Friesse
ccb2290f84 totemsrp: Check join and leave msg length
If the number of proc_list, failed_list or active members is too high, it
may be impossible to fit them into the message, which is allocated on the
stack, resulting in stack corruption.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-04-12 15:25:38 +02:00
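
A hedged sketch of the kind of length check ccb2290f84 describes: refuse to build a message when the member lists would not fit into a fixed-size buffer. The function name, the use of plain `uint32_t` entries and the return convention are assumptions made for this example, not the totemsrp code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Pack proc_list followed by fail_list into buf.
 * Returns the number of bytes written, or -1 if the lists do not fit.
 */
long example_pack_lists(unsigned char *buf, size_t buf_len,
	const uint32_t *proc_list, size_t proc_entries,
	const uint32_t *fail_list, size_t fail_entries)
{
	size_t max_entries = buf_len / sizeof(uint32_t);
	size_t proc_bytes;
	size_t fail_bytes;

	assert(buf != NULL);

	/* Compare entry counts so the byte arithmetic cannot overflow. */
	if (proc_entries > max_entries ||
	    fail_entries > max_entries - proc_entries) {
		return -1;
	}

	proc_bytes = proc_entries * sizeof(uint32_t);
	fail_bytes = fail_entries * sizeof(uint32_t);

	if (proc_bytes > 0) {
		memcpy(buf, proc_list, proc_bytes);
	}
	if (fail_bytes > 0) {
		memcpy(buf + proc_bytes, fail_list, fail_bytes);
	}
	return (long)(proc_bytes + fail_bytes);
}
```

The check runs before any byte is written, so an oversized list leaves the buffer untouched.
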
Jan Friesse
c139255669 totemsrp: Implement sanity checks of received msgs
Sanity checkers are used to prevent crashing because of
accessing unallocated memory.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-04-12 15:25:33 +02:00
Jan Friesse
69857efb5b totem: Display IP of sender
To make finding victim of incompatible messages easier, IP of sender is
logged. Propagating IP in layers makes patch slightly larger.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-03-16 13:58:15 +01:00
Jan Friesse
0c509a25a7 totemsrp: Add magic and version into header
Magic number (0xC070) together with version in every packet
is used for detecting that other node is really
Corosync 3.x.

Endian_detector field is removed and magic number is now
used instead.

If received packet magic number differs, guessing is used to show more
about the source (Corosync 2.3+, 2.2 are quite reliable, Knet and
unencrypted Corosync 2.1/2.0/1.x/OpenAIS are semi-reliable and encrypted
Corosync 2.1/2.0/1.x/OpenAIS are quite unreliable).

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-03-16 13:57:55 +01:00
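
A rough sketch of a magic-number check in the spirit of 0c509a25a7. Only the value 0xC070 comes from the commit message. The header layout, the enum and the function name are hypothetical, and treating a byte-swapped magic as "same protocol, other endianness" is an assumption about how the magic replaces the endian_detector field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_MAGIC 0xC070u /* value quoted in the commit message */

/* Hypothetical header layout, for illustration only. */
struct example_header {
	uint16_t magic;
	uint8_t version;
	uint8_t type;
};

enum example_result {
	EXAMPLE_NATIVE,   /* magic matches in host byte order */
	EXAMPLE_SWAPPED,  /* magic matches after a byte swap */
	EXAMPLE_FOREIGN,  /* magic does not match at all */
	EXAMPLE_SHORT     /* packet too short to contain a header */
};

enum example_result example_classify(const void *pkt, size_t len)
{
	struct example_header hdr;
	uint16_t swapped;

	assert(pkt != NULL);

	if (len < sizeof(hdr)) {
		return EXAMPLE_SHORT;
	}
	/* Copy first, so an unaligned packet buffer is not a problem. */
	memcpy(&hdr, pkt, sizeof(hdr));

	if (hdr.magic == EXAMPLE_MAGIC) {
		return EXAMPLE_NATIVE;
	}
	swapped = (uint16_t)((hdr.magic << 8) | (hdr.magic >> 8));
	if (swapped == EXAMPLE_MAGIC) {
		return EXAMPLE_SWAPPED;
	}
	return EXAMPLE_FOREIGN;
}
```

A real receiver would also compare the version field, and for the `EXAMPLE_FOREIGN` case could fall back to the guessing described in the commit message.
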
Christine Caulfield
066525efd3 knet: Fix display of links with unconfigured link0
Because totemknet always configures link0 as loopback even
if it's not known to corosync, we need to filter it
out when returning the link status, as things get misaligned
in cfg.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-03-16 13:11:13 +01:00
Jan Friesse
b3f3a1df26 main: Set errno before calling of strtol
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-03-02 17:29:22 +01:00
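
The reason for b3f3a1df26 can be illustrated with a small wrapper. strtol reports overflow only through errno, and it does not clear errno on success, so a stale value from an earlier call can be mistaken for a failure. The helper name `example_parse_long` is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a base-10 long.
 * Returns 0 and stores the value in *out on success,
 * -1 on empty input, trailing garbage or overflow.
 */
int example_parse_long(const char *s, long *out)
{
	char *end;
	long value;

	assert(s != NULL && out != NULL);

	errno = 0; /* clear any stale error before the call */
	value = strtol(s, &end, 10);

	if (errno == ERANGE) {
		return -1; /* value out of range for long */
	}
	if (end == s || *end != '\0') {
		return -1; /* no digits, or trailing characters */
	}
	*out = value;
	return 0;
}
```

Without the `errno = 0` line, a caller whose errno already held ERANGE would see a valid input such as "7" rejected.
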
Christine Caulfield
2c20590d16 knet: Always use link0 for loopback
Even if it's not used for anything else.

Also, make cfgtool show the correct link ID when links are not
contiguous

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-03-01 14:23:20 +01:00
Christine Caulfield
111bfbc11d totem: Fix debug warnings printed by knet
Fix crash introduced a couple of commits ago in iface_get

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-03-01 14:22:22 +01:00
Christine Caulfield
f5871c6b4c config: Allow use of ring0_addr
Allow ring0_addr to be used in place of 'name' for
backwards compatibility

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-03-01 14:21:37 +01:00
Christine Caulfield
7a639d1b62 config: Update message when local host isn't found
Make the message more representative of what's going on.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-03-01 14:20:00 +01:00
Christine Caulfield
386d710ed1 cfg: Fix cfg_get_node_addrs so that DLM works
Also update copyright dates

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-03-01 14:19:45 +01:00
Christine Caulfield
f5b690bd96 totem: Return interface count correctly
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-03-01 14:19:12 +01:00
Christine Caulfield
fc8580bdbf totem: Use nodeid ONLY in srp_addr
This shrinks the srp_addr (and consequently every packet sent by
corosync) so that instead of containing loads of IP addresses to
identify a node, it just sends the nodeid.

This then allows us to make ring0 optional and replaceable when running
knet.

It also means that we need some other way of identifying the local
node in corosync.conf, so the nodelist.node.name entry is now mandatory
and is mapped to the local host using the same algorithm as used in
cman.

This code needs LOTS of testing as it touches a huge amount of totemsrp
and totemconfig.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-03-01 14:18:51 +01:00
Rytis Karpuška
105f3ae98c totempg: Fix corrupted messages
Commit 899cb29983 changed copy_len to iovec[i].iov_len, assuming that
copy_len is always the same as iovec[i].iov_len under those
circumstances, but it missed the possibility of a small message being
partly put at the end of a packet, which cuts the message in two parts
and therefore makes copy_len not equal to iovec[i].iov_len.

This is a revert of 899cb29983.

Signed-off-by: Rytis Karpuška <rytisk@neurotechnology.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-02-09 17:38:05 +01:00
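
A simplified model of why copy_len and iovec[i].iov_len can differ, as 105f3ae98c explains. The function below is a hypothetical helper, not the totempg code: it computes how many bytes of one iovec element can be copied into the space left in the current packet.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of bytes of one iovec element that fit into the current packet.
 * iov_len:        total length of the element
 * already_copied: bytes of the element sent in earlier packets
 * space_left:     free bytes in the packet being filled
 */
size_t example_copy_len(size_t iov_len, size_t already_copied,
	size_t space_left)
{
	size_t remaining;

	assert(already_copied <= iov_len);

	remaining = iov_len - already_copied;
	return remaining < space_left ? remaining : space_left;
}
```

For a 300-byte element and 212 free bytes, the result is 212, so copying `iov_len` bytes instead would write past the packet. The two values agree only when the whole element fits.
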
Rytis Karpuška
899cb29983 totempg: use iovec[i].iov_len instead of copy_len
To be more explicit that we are copying whole message.

Related to 0ebae6b47d.

Signed-off-by: Rytis Karpuška <rytisk@neurotechnology.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-02-08 09:30:07 +01:00
Rytis Karpuška
0ebae6b47d totempg: Fix fragmentation segfault
The problem was that two or more messages were concatenated
together during fragmentation in the mcast_msg() function. In one specific
case, a message of just under 1MB was passed to mcast_msg(), and it so
happened that the remainder (212 bytes to be exact) left some free
space in the packet, so the branch

  if ((copy_len + fragment_size) <
    (max_packet_size - sizeof (unsigned short))) {
...

was selected and this was the last message in the provided iovec.
Then, on the second call, came another big message (about 300KB) and
during fragmentation mcast.fragmented was set to 1.

On the other end, while receiving messages, due to the missing
mcast.fragmentation==0 those two messages were concatenated, and
therefore the assembly->data array overflowed, overwriting the linked list
pointers and the offset (which happened to be set to 0, so that the 300KB
message was being copied from the beginning again).
After the whole 300KB message had been sent, mcast.fragmentation==0 arrived
and totempg_deliver_fn() tried to move the assembly structure to the
assembly_list_free list, but as the linked list pointers had been
overwritten, a segfault occurred.

Signed-off-by: Rytis Karpuška <rytisk@neurotechnology.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-02-08 09:29:22 +01:00
Fabio M. Di Nitto
1411608a81 [build] fix build with non-standard knet location
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-02-05 15:57:12 +01:00
Jan Friesse
11fa527ed4 logging: Close before and open blackbox after fork
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-01-30 13:21:52 +01:00
Jan Friesse
79dba9c51f logging: Make blackbox configurable
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-01-30 13:21:48 +01:00
Jan Friesse
1fba1b83aa build: Replace -lknet with autoconf generated vars
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-01-25 16:08:09 +01:00
Jan Friesse
589ed92505 build: Remove rdma/ibverbs
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-01-25 16:08:07 +01:00
Christine Caulfield
31ddba64a2 config: Don't fudge port numbers
When I was adding knet I wanted the port numbers to default to the
base port number + the linknumber.

However I seem to have messed this up such that any port number
specified in the config file has the link number added to it. Which
is almost certainly not what people would expect.

This patch sets it right. If a port number is not specified
then 5405+linknumber is used. If a port number IS specified
then that actual number is used.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-01-18 16:31:24 +01:00
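
The port rule stated in 31ddba64a2 is small enough to write down directly. The base port 5405 comes from the commit message. The function name and the use of 0 as the "not specified" marker are assumptions for this sketch.

```c
#include <assert.h>

#define EXAMPLE_BASE_PORT 5405  /* default base port from the commit message */
#define EXAMPLE_PORT_UNSET 0    /* hypothetical marker for "not in config" */

/*
 * Port to use for a link: the configured port as-is when one is given,
 * otherwise the base port plus the link number.
 */
int example_link_port(int configured_port, int link_number)
{
	assert(link_number >= 0);

	if (configured_port == EXAMPLE_PORT_UNSET) {
		return EXAMPLE_BASE_PORT + link_number;
	}
	return configured_port;
}
```

The bug described above corresponds to adding `link_number` in both branches, so a configured port of 6000 on link 2 would have ended up as 6002.
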
Christine Caulfield
22ae4cacda knet: Allow ping_timers to be auto-configured
knet ping_timers are auto-configured according to token value.

This patch also fixes some knet config bugs that resulted in defaults
not being applied when values were removed from corosync.conf.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-01-15 15:08:19 +01:00
yuskiida
e7734fab70 build: Add the headers necessary for RPM build
Signed-off-by: yuskiida <yusk.iida@gmail.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-01-11 14:47:46 +01:00
Christine Caulfield
236032f7b5 config: if local node addr is wrong, fail with a sensible message
If no valid local address is found in corosync.conf then corosync
exits with: "parse error in config: No multicast port specified"

This is because of the config change for knet that always populates
the interfaces. The old error of "no interfaces found" was only
slightly better anyway IMHO.

This patch adds an explicit check that local_node_pos has been
set in icmap and uses that to determine if a valid local address
has been found.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-01-09 17:50:12 +01:00
Jan Friesse
96cb977880 totemknet: Drop truncated packets on receive
This is a backport of part of the "totemudpu: Scale receive buffer" patch.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-01-09 17:46:31 +01:00
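
One common way to detect a truncated datagram on receive is to look at the MSG_TRUNC flag that recvmsg() sets in msg_flags. The sketch below uses that POSIX mechanism. Whether 96cb977880 does exactly this is an assumption, and the function name and return codes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Receive one datagram into buf.
 * Returns the datagram length, -1 on a receive error,
 * or -2 if the datagram did not fit and was dropped.
 */
ssize_t example_recv_or_drop(int fd, void *buf, size_t buf_len)
{
	struct iovec iov;
	struct msghdr msg;
	ssize_t received;

	assert(buf != NULL);

	iov.iov_base = buf;
	iov.iov_len = buf_len;
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;

	received = recvmsg(fd, &msg, 0);
	if (received < 0) {
		return -1;
	}
	if (msg.msg_flags & MSG_TRUNC) {
		/* Part of the datagram was discarded by the kernel. */
		return -2;
	}
	return received;
}
```

Dropping the packet is safer than processing it, because the tail of a truncated message is gone and any length fields inside it no longer match the data.
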
Jan Friesse
0f1813adff totemudp: Make use of UDP_RECEIVE_FRAME_SIZE_MAX
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-01-09 17:46:28 +01:00
Jan Friesse
32535b842c totemudpu: Export and rename UDPU_FRAME_SIZE_MAX
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-01-09 17:46:25 +01:00
Jan Friesse
3982f795d5 totemconfig: Fix UDP autogeneration of mcast addr
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-01-09 17:46:21 +01:00
Jan Friesse
155c0d4052 totemudpu: Scale receive buffer
Receive buffer should be based on PROCESSOR_COUNT_MAX and not static
buffer.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2018-01-09 17:46:04 +01:00
Christine Caulfield
98bb0c78c8 config: Allow selection of crypto_model
KNET has options for nss or openssl crypto libraries; make this
available to corosync.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2018-01-05 15:25:17 +01:00
Christine Caulfield
2a6a571c06 config: Allow links to have different ip_versions
knet allows links to have different IP versions, provided they
all match per link. So don't force them all to be the same.

I've added a check here to make sure that all nodes on the same
link are using the same IP version.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2017-12-22 17:15:19 +01:00
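
The per-link check mentioned in 2a6a571c06 could look roughly like this. It assumes the address family of every node on one link has already been collected into an array; the function name and that representation are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Returns 1 if every node address on one link uses the same address
 * family (for example AF_INET or AF_INET6), 0 otherwise.
 */
int example_link_families_match(const int *families, size_t node_count)
{
	size_t i;

	assert(families != NULL || node_count == 0);

	for (i = 1; i < node_count; i++) {
		if (families[i] != families[0]) {
			return 0;
		}
	}
	return 1;
}
```

The check is run once per link, so link 0 may be entirely IPv4 while link 1 is entirely IPv6.
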
Bin Liu
b1d3eca448 wd: fix snprintf warnings
When running ./configure --enable-watchdog, gcc 7.2.1 will report
warnings for snprintf. This patch fixes the warnings.

Signed-off-by: Bin Liu <bliu@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2017-12-01 17:23:54 +01:00
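
Newer gcc versions can warn when an snprintf result may be truncated and the return value is ignored. A common way to address this, shown here as a sketch and not necessarily what b1d3eca448 does, is to check the return value. The function name and the key format are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build "prefix.name" in dst.
 * Returns 0 on success, -1 on an output error or if the result
 * would not fit (including the terminating NUL).
 */
int example_build_key(char *dst, size_t dst_len,
	const char *prefix, const char *name)
{
	int needed;

	assert(dst != NULL && prefix != NULL && name != NULL);

	needed = snprintf(dst, dst_len, "%s.%s", prefix, name);
	if (needed < 0 || (size_t)needed >= dst_len) {
		return -1;
	}
	return 0;
}
```

snprintf returns the length the full string would have had, so a result greater than or equal to the buffer size means the output was cut short.
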
Christine Caulfield
1ca72a1154 totemsrp: Revert totemsrp_get_ifaces() changes
In my enthusiasm for removing code while integrating knet I
also deleted the correct code for returning IP address for a node,
so that only the IP address of the local node was ever returned.

This commit restores the previous code.

Also, because we always return INTERFACE_MAX interfaces now (they don't
have to be contiguous) set ss_family to zero if that interface is not
in use so that downstream apps know and don't display a lot of 0.0.0.0

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2017-11-30 16:59:05 +01:00
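
From the consumer side, the ss_family convention described in 1ca72a1154 might be used as sketched below. The function is a hypothetical downstream helper, not part of corosync: it skips slots whose family is zero and does not assume the used slots are contiguous.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Count the interface slots that are actually in use.
 * A slot with ss_family == 0 is treated as unused.
 */
size_t example_count_used_ifaces(const struct sockaddr_storage *ifaces,
	size_t slot_count)
{
	size_t i;
	size_t used = 0;

	assert(ifaces != NULL || slot_count == 0);

	for (i = 0; i < slot_count; i++) {
		if (ifaces[i].ss_family == 0) {
			continue; /* unused slot, nothing to display */
		}
		used++;
	}
	return used;
}
```

A display tool would apply the same test before printing an address, which avoids the rows of 0.0.0.0 mentioned above.
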
Bin Liu
af21baf0ff totemconfig: remove duplicate aes256 test
Signed-off-by: Bin Liu <bliu@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2017-11-29 18:18:52 +01:00
Jan Friesse
154895dfbe sync: Call sync_init of all services at once
This patch solves a situation which can happen very rarely:
- Node B is running
- Node A is started and tries to create singleton membership. It also
  initializes service S which tries to send a message during initialization
- Just before node A finishes its move to operational state, it gets
  Node B's multicast message, so it moves to gather state
- Nodes A and B create membership and move to operational state and
  sync is started
- Nodes A and B receive the message sent by node A during initialization of
  service S
- Node A exits before sync of service is finished

In this situation, node B may never execute sync_init for
service S. So node B service S is not aware of existence of node A but
it received message from it.

Similar situation can theoretically also happen during merge.

Solution is to change flow of sync, so now it looks like:

- Build service_list
- Call sync_init for all local services
- Send service_list
- Receive service_list from all members and send barrier
- For all services:
  - Receive barrier
  - Call sync_activate if this is not first service
  - Call sync_process for next service or finish sync if
    this service is the last one
  - Send barrier

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2017-11-16 15:22:19 +01:00
Jan Friesse
499eaac80f sync: Remove unneeded determine sync code
Code was used for compatibility with old sync v1 (in needle this was
deleted and previous version 2 became v1), and it's no longer needed.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2017-11-16 15:22:14 +01:00
Christine Caulfield
1df7eca5ad stats: Add some missing knet stats
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2017-11-16 08:35:50 +01:00
Ferenc Wágner
09b0123d58 Send corosync startup notification to systemd
This enables starting the daemon directly in the service file, because
dependent units won't be started until initialization is complete.

Signed-off-by: Ferenc Wágner <wferi@debian.org>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2017-11-09 09:49:18 +01:00
Jan Friesse
f05d1c9293 coroparse: Do not convert empty uid, gid to 0
When the uid (or gid) value was an empty string, it was incorrectly
converted to 0. The solution is to check the input string for emptiness.

Thanks Bin Liu <bliu@suse.com> for reporting the bug.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Bin Liu <bliu@suse.com>
2017-11-06 09:37:54 +01:00
Christine Caulfield
45fe19ed86 stats: Don't display errors when reading knet stat
Only add the knet handle stat keys if we are actually running knet. This
prevents errors occurring when iterating through all of the stats keys

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2017-11-03 13:40:41 +01:00
Christine Caulfield
d9dfd41e4e stats: Add cmap key to clear the various stats.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2017-10-31 17:39:14 +01:00
Bin Liu
cf339c20c3 totemconfig: generate mcast icmap items for UDP
Generating mcastaddr and mcastport in icmap makes
sense only for UDP transport.

Signed-off-by: Bin Liu <bliu@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2017-10-30 14:14:48 +01:00