Commit Graph

1589 Commits

Author SHA1 Message Date
Jerome FLESCH
99faa3b864 When flushing, discard only memb_join messages
Patch solves problem when 1 ring out of 2 went up/down quite often.

The simplest setup to reproduce bug is following:
- 2 VMs, connected by 2 network interfaces
- OS: Linux
- On one of the VMs, a test program sending some CPG messages (see the
  script "test_corosync.sh" joined to this mail for example)

Here are the Corosync logs we get when we do this setup:

Jun 06 16:23:40 corosync [TOTEM ] A processor joined or left the
membership and a new membership was formed.
Jun 06 16:23:40 corosync [CPG   ] chosen downlist: sender r(0)
ip(192.168.56.104) r(1) ip(192.168.57.104) ; members(old:1 left:0)
Jun 06 16:23:40 corosync [MAIN  ] Completed service synchronization,
ready to provide service.
Jun 06 16:24:37 corosync [TOTEM ] Marking ringid 1 interface
192.168.57.105 FAULTY
Jun 06 16:24:38 corosync [TOTEM ] Automatically recovered ring 1
Jun 06 16:25:33 corosync [TOTEM ] Marking ringid 1 interface
192.168.57.105 FAULTY
Jun 06 16:25:34 corosync [TOTEM ] Automatically recovered ring 1
Jun 06 16:26:35 corosync [TOTEM ] Marking ringid 1 interface
192.168.57.105 FAULTY
Jun 06 16:26:36 corosync [TOTEM ] Automatically recovered ring 1
(...)

The second ring goes down about every 2 minutes and automatically back
up right after.

We spent some times looking for the commit that introduced this bug, and
it appears it's due the following one:
Corosync 1.3.3 -> 1.3.4: e27a58d93d
Corosync 1.4.1 -> 1.4.2: be608c0502
Commit message: Ignore memb_join messages during flush operations

I had a look at this commit, and it seems to me it's dropping too many
packets:
Because of this commit, while totemrrp_recv_flush() is called, Corosync
drops memb_join packets, but also ORF tokens. In the end, it seems that
sometimes, we drop so many of them that Corosync marks the ring as
faulty.

To fix that, only memb_join messages are dropped now.

Signed-off-by: Jerome FLESCH <jerome.flesch@netasq.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-06-11 10:59:30 +02:00
Jan Friesse
2766e57ce5 Store fdata with timestamp and pid in name
This should allow easier handling of various blackbox dumps. Original
fdata name is now symlink to latest created dump.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-06-05 12:19:42 +02:00
Jan Friesse
7ce332a713 totemudpu: Bind sending sockets to bindto address
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-05-31 09:28:52 +02:00
Fabio M. Di Nitto
f008cf442c rename mainconfig to logconfig
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-05-29 09:36:00 +02:00
Fabio M. Di Nitto
b283ef8f12 mainconfig: allow mainconfig logic to be used both internally and externally
corosync logging configuration logic is rather complex and in order
to make it simpler to reuse (at least within corosync/ tree)
we need to be able to use both icmap and cmap.

the patch might seem controversial, but it reduces heaps of code around
from qdevices (coming next).

It might be useful to consider moving this to a common shared library
but there aren't enough users yet and a shared lib would force
corosync to link with cmap (that we do not want at all costs)

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-05-29 09:04:03 +02:00
Angus Salkeld
5831136c87 LOG: make sure the log target is enabled.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-05-29 14:02:42 +10:00
Angus Salkeld
e6b35bdb7a LOG: handle closing unused logfiles better
This fixes a bug where having a second log file will close
the previous one.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-05-29 14:02:42 +10:00
Angus Salkeld
e6afc761fe LOG: be more explict about the qb file names
else we can get messages been put in the wrong subsys.

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-05-29 14:02:42 +10:00
Jan Friesse
2894f33c4f totemip: Support bind to exact address
Logic for binding now works in following way:
- Try to find exact match
- If not exact match is found, use first found network address

This allows set concrete IP even if network settings contains two IPs on
same network.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-05-24 14:01:12 +02:00
Jan Friesse
aaa575e091 totemip: insert items in correct order
list_add_tail is used instead of list_add so ip addresses are inserted
in same order as returned by getifaddrs.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-05-24 14:01:08 +02:00
Jan Friesse
0791f44c41 Include ringid in processor joined log message
This should help correlate syslog entires with their blackbox
counterparts.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Andrew Beekhof <andrew@beekhof.net>
2012-05-17 14:58:04 +02:00
Fabio M. Di Nitto
f2444effd0 icmap: don't leak memory when changing ro/rw status on a key
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-04-24 09:28:23 +02:00
Fabio M. Di Nitto
1dcb2d43d9 icmap: fix a valgrind errors (pass 1)
clean up a lot of allocated blocks at exit.
those changes has no runtime effects, but it makes valgrind
output a bit more useful by dropping over 700 errors/warnings to skip
over every single run.

there are still a few icmap related valgrind errors but those need
some more complex and timeconsuming investigation.

pre patch:

==21844== HEAP SUMMARY:
==21844==     in use at exit: 1,229,321 bytes in 1,516 blocks
==21844==   total heap usage: 7,191 allocs, 5,675 frees, 3,819,853 bytes allocated

==21844== LEAK SUMMARY:
==21844==    definitely lost: 3,617 bytes in 11 blocks
==21844==    indirectly lost: 21,960 bytes in 11 blocks
==21844==      possibly lost: 1,080,101 bytes in 131 blocks
==21844==    still reachable: 123,643 bytes in 1,363 blocks
==21844==         suppressed: 0 bytes in 0 blocks

==21844== ERROR SUMMARY: 136 errors from 136 contexts (suppressed: 0 from 0)

post patch:

==25793== HEAP SUMMARY:
==25793==     in use at exit: 1,185,870 bytes in 808 blocks
==25793==   total heap usage: 9,427 allocs, 8,619 frees, 4,156,841 bytes allocated

==25793== LEAK SUMMARY:
==25793==    definitely lost: 3,697 bytes in 12 blocks
==25793==    indirectly lost: 22,248 bytes in 13 blocks
==25793==      possibly lost: 1,079,655 bytes in 113 blocks
==25793==    still reachable: 80,270 bytes in 670 blocks
==25793==         suppressed: 0 bytes in 0 blocks

==25793== ERROR SUMMARY: 119 errors from 119 contexts (suppressed: 0 from 0)

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-04-24 09:28:23 +02:00
Fabio M. Di Nitto
d2872aec70 crypto init: release *_slot resource after init
Those are only used at init phase and we can free some memory for the system.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-04-20 10:57:16 +02:00
Fabio M. Di Nitto
b34c1e2870 ipcs: allow connections only after all services are ready
this fixes a rather annoying race condition at startup where a client
connects to corosync "too fast" before the service is ready to operate
and client gets some random data during initialization phase.

With this fix, we allow connections to ipc only after the main engine
is operational and configured (and after the first totem transition).

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
2012-04-16 13:39:03 +02:00
Jan Friesse
f89d7b715f Always allocate totemrrp stats array
This prevents segfault when rrp mode is set with only one ring.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-04-10 09:08:42 +02:00
Jan Friesse
92ead6106f Properly parse uidgid files
Full path to key is now tested rather then key name only.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-04-10 09:08:36 +02:00
Fabio M. Di Nitto
cde4468581 totemcrypt: fix build warning (unused variable)
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-27 12:06:46 +02:00
Fabio M. Di Nitto
4378915a33 totemcrypto: major code cleanup (no functional or onwire changes)
- cleanup include list
- reorder code and functions (crypto then hash)
- split crypt/decrypt/hash functions
- some micro optimizations by dropping a few memcpy
- make the code more readable (better var names and buffers mapping)
- improve exit paths on error (return codes and free)
- store crypto header size instead of recalculating it per packet

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-27 11:43:07 +02:00
Jan Friesse
e925f42165 Make ifaces_get work with dynamic no_rings
Commit which added number of addresses to srp_address structure didn't
count with totemsrp_ifaces_get where whole structure was copied instead
of addresses only. This is now fixed.

Also to make API totempg forward compatible, size of interfaces array
must be passed to ifaces_get like functions to prevent memory overwrite.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-26 11:54:26 +02:00
Jan Friesse
124ff4339c Add no_addrs field in srp_addr structure
This should allow us future change to dynamic number of rings without
breaking wire compatibility.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-22 14:03:38 +01:00
Jan Friesse
7a0a39b949 Mark few more icmap keys as read only
Also most of the key settings are now centralized in one function, so
it's easier to audit.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-16 09:37:25 +01:00
Jan Friesse
e57b5b9e6d crypto: Remove sha224 and add md5 hash
SHA224 is not supported on RHEL6 and also it's kind of weird. Instead of
that, md5 can now be configured.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-15 17:36:56 +01:00
Jan Friesse
3b7c2f0588 Update crypto_set API
Also few leftovers from cfg is removed and version of totempg is
increased to 5 to reflect all changes we made

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-15 17:33:53 +01:00
Fabio M. Di Nitto
c75153feb4 crypto: allocate padding in crypto_header
while it might seem a waste of space by using 2 extra bytes in
the crypto_config_header, it actually gives us the option
to grow "unknown at this time" features without hopefully
breaking onwire compat

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-15 12:55:11 +01:00
Fabio M. Di Nitto
4a2d503643 crypto: add new hashing methods and fix config defaults
add support for sha224/256/384/512

change config defaults to match coroparse and totemconfig

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-15 10:55:32 +01:00
Fabio M. Di Nitto
737de4dbd4 crypto: change network packets and add dynamic crypto header/data
The new network packet will look:

struct crypto_config_header * that provides info on crypto/hashing
hash_block[size based on hashing function] (if hash is selected)
salt[SALT_SIZE] (if crypto is selected)
...data...

and we kill the concept of crypto_security_header completely since
values are now dynamic for hash_block_size.

the reason why hash_block needs to be there, is because we do
hash salt in case both hashing and crypto are selected.

the crypto_config_header is totally transparent to totem
and to any underlaying crypto functions.

as we go cleaning, also use HASH_BLOCK_SIZE to generate hash_block.
the input buffer and output buffer size are dependent on the algo
used to hash.

we can now determine the real header size and adjust net_mtu properly
at startup. This will allow in future to use any algorithm since
size is dynamic.

some part of the code still needs some polishing to make it more
readable (specially the mapping of pointers into the packet
is still a bit obscure).

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-14 15:57:01 +01:00
Fabio M. Di Nitto
c3f7d0ef3e totem: don't send garbage onwire if we fail to crypt
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-14 15:30:40 +01:00
Fabio M. Di Nitto
452800c958 crypto: add crypto config to network data
this add 2 bytes at the end of the each packet to propagate
config info.

in case there is a config mismatch packet must be rejected.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-14 12:32:10 +01:00
Fabio M. Di Nitto
0a6a6bbcfa crypto: drop secauth and make crypto none work again
keep totem.secauth config key for compatibility

if the key is NOT set, crypto will default to aes256/sha1
if the key is set to "off", crypto is disabled.
this reflects pretty much old behavior

keywords totem.crypto_cipher and totem.crypto_hash can
override secauth individually.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-14 11:28:36 +01:00
Jan Friesse
ab1675f0fe Parse and use hash and crypto from config file
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-13 17:38:59 +01:00
Jan Friesse
cb97ed186a Rename totemcrypto
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-13 17:38:46 +01:00
Fabio M. Di Nitto
55e8476697 crypto: mask the crypto operations from totem packet size management
totem doesn't need to understand what crypto does.

totem needs to be able to tell crypto: "those are data, play with them"
and crypto needs to return: "here are your scrambled data and the new size"

similar to decrypt/verify.

this way we add enough dynamic within crypto to change header size and all
at any given time (for different hash algorithm for example) without
affecting on wire compat.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-13 15:50:58 +01:00
Jan Friesse
42a2f69e6f onecrypt: move encryption code to crypto.c
This will remove duplicity of code.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-13 12:23:13 +01:00
Jan Friesse
b5f7dcefeb cfg: remove crypto_set
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-13 12:23:10 +01:00
Jan Friesse
8cdd2fc493 Remove libtomcrypt
Tomcrypt in corosync is for long time not updated. Because we have
support for libnss, libtomcrypt can be removed.

Also few leftovers (AES is 256 bits, not 128, ...) are removed.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-13 09:19:47 +01:00
Fabio M. Di Nitto
20a5289074 drop evs service
there are several reasons for this:

1) evs is only partially implemented with no plans to complete it

typedef enum {
       EVS_TYPE_UNORDERED, /* not implemented */
       EVS_TYPE_FIFO,          /* same as agreed */
       EVS_TYPE_AGREED,
       EVS_TYPE_SAFE           /* not implemented */
} evs_guarantee_t;

2) evs has no users in any upstream distribution and no search
   engine can find any other upstream using it.

3) the only reason (I was told) to carry around evs was that evs
   receives the full ring_id struct from totem. This is only
   partially correct because while the structures are prepared
   to carry around those data, they are never transmitted from
   corosync engine down the IPC line to the user.
   CPG ring_id contains the exact same information and it's
   actually less buggy (due to prototying of the info).

worst case scenario where a user really absolutely need libevs,
it can be easily reimplemented as libcpg wrapper and avoid
lots of code duplication.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-12 15:51:50 +01:00
Fabio M. Di Nitto
c00502a70a build: drop another leftover from the past
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-12 07:13:04 +01:00
Fabio M. Di Nitto
fd79118110 build: drop last LCRSO references
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-12 07:12:20 +01:00
Fabio M. Di Nitto
eb3d49ef7d pload: make it a test service and not a public one
pload is a performance benchmark that measures the onwire
speed of corosync.

problem is that once pload has been executed, the cluster
is basically dead.

turn pload into a test tool, by removing corosync-pload tool
and user library.

cleanup pload code to make it more readable and drop lots
of unnecessary stuff.

add test/ploadstart tool that can configure and start pload
via cmap calls.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-12 07:11:51 +01:00
Fabio M. Di Nitto
142ce8c3a1 totem: drop crypt_accept: concept/option
this was another old onwire compat mode that is not useful anylonger.

we can safely move the new model by default.

According to Honza (real hardware 1 node testing) there are no
performance impact.

My tests (8 nodes VM cluster), there is up to 10/12% performance
improvements up to 1M packet size where old and new models are equal.

As a side note, nss still shows to be a performance loss on both
real and virtual hw (without any kind of nss hw acceleration).

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
2012-03-10 07:08:30 +01:00
Angus Salkeld
03b32d7fad Fix typo in stats key name.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-09 21:54:51 +11:00
Angus Salkeld
41b4416bd4 Remove unused function logsys_priority_name_get()
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-09 21:54:51 +11:00
Angus Salkeld
f628ccba8b Add pid, hostname and process name to the logfile
Note this is only for file targets not stderr or syslog.

https://bugzilla.redhat.com/show_bug.cgi?id=789925

Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
2012-03-09 21:54:51 +11:00
Fabio M. Di Nitto
e0e27e3d12 utils: cleanup main daemon exit codes
some of them are not in use anymore and can be dropped.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-09 11:15:44 +01:00
Fabio M. Di Nitto
8f6e5ff530 sync: kill evil and syncv1 in one shot
this change breaks onwire compatibility.

cpg is the only user of sync_* interface and it's the only
service that will require extra testing.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2012-03-09 11:15:08 +01:00
Fabio M. Di Nitto
64fd946086 votequorum: move last malloc/alloca buf to static
this should guarantee that votequorum won't fail under high memory
pressure. Price is 3500 bytes extra preallocated at startup.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-05 14:30:17 +01:00
Fabio M. Di Nitto
90c602902c votequorum: fix node allocation memory leak
stop using malloc for each new node, because we cannot free the memory
easily. Move to a static allocated buffer that can contain
PROCESSOR_MAX + qdevice cluster_node instead.

We can never have more than PROCESSOR_MAX nodes anyway and the memory
footprint is small enough compared to memory leaks (those can
effectively happen only in very dynamic clusters with tons of different
nodes joining/leaveing with different nodeids).

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-05 14:30:17 +01:00
Fabio M. Di Nitto
2d7a8ab29a votequorum: rename leave_remove to allow_downscale
pointed out that leave_remove can be easily confused with the old
cman leave_remove behavior. The two are substantially different
and we need to avoid confusion both for users and our support team.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:48 +01:00
Fabio M. Di Nitto
75b3dc0f4e votequorum: fix handling of config updates
cmap changes are local to the node only and should not be broadcasted
as configuration changes.

if any change has happened to us, we will inform other nodes via
send_nodeinfo.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
2012-03-02 14:36:48 +01:00