This required adding a lot of return values to two previously
'void' functions. I did two rather than just the one that was
needed because it seemed to make sense to do them both together.
Although these functions now return errors, they are probably
still ignored higher up. This really needs a comprehensive audit.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
The commit that drops packets from unlisted IPs broke the ifdown case,
because msg_name is unset for a socketpair.
The solution is to drop packets from unlisted IPs only when the bind
state is BIND_STATE_REGULAR.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Currently we horribly over-use totempg_ifaces_get() to
retrieve information about knet interfaces. This is an attempt to
improve on that.
All transports are supported (so not only Knet but also UDP(U)).
This patch builds best against the "onwire-upgrade" branch of knet
as that's what sparked my interest in getting more information out.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
This feature allows corosync to block packets received from unknown
nodes (nodes whose IP address is not in the nodelist). This is
mainly for situations when a "forgotten" node is booted and tries to
join a cluster which has already removed that node from its
configuration. Another use case is to allow atomic reconfiguration and
rejoin of two separate clusters.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
This patch intends to solve the long-standing corosync ifdown problem.
The idea is to use a local socket for sending both unicast and
multicast messages if the interface is down.
Together with checking the current bind state, this makes it possible
to keep pretending the old IP address exists instead of rebinding to
localhost, which breaks a lot of things badly.
Heavily based on the work of Yu, Zou <zouyu@shiqichuban.com>; it is
basically a port of the UDP patch created by
Jan Friesse <jfriesse@redhat.com>.
(ported from needle 96354fba72)
Signed-off-by: Bin Liu <bliu@suse.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
To make finding the victim of incompatible messages easier, the IP of
the sender is logged. Propagating the IP through the layers makes the
patch slightly larger.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
This shrinks the srp_addr (and consequently every packet sent by
corosync) so that instead of containing loads of IP addresses to
identify a node, it just sends the nodeid.
This then allows us to make ring0 optional and replaceable when running
knet.
It also means that we need some other way of identifying the local
node in corosync.conf, so the nodelist.node.name entry is now mandatory
and is mapped to the local host using the same algorithm as used in
cman.
This code needs LOTS of testing as it touches a huge amount of totemsrp
and totemconfig.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
The receive buffer size should be based on PROCESSOR_COUNT_MAX and not
on a static buffer.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Now that we are using knet, it's possible to dynamically add, remove
and reconfigure links on the fly.
Also print 'n' for non-existent knet links. This will show up
only on loopback links >0, but it looks better than 'status ='.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
If the bind call fails, it's retried up to BIND_MAX_RETRIES times.
If it's still unsuccessful, corosync exits instead
of working incorrectly.
Slightly modified by reviewer.
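
The retry behaviour described above could be sketched roughly as
follows. This is only an illustration, not the code from the patch:
the function name bind_with_retries, the value of BIND_MAX_RETRIES and
the sleep interval parameter are all assumptions.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Assumed value; the commit message names the constant but not its value. */
#define BIND_MAX_RETRIES 10

/*
 * Hypothetical helper: try bind() up to BIND_MAX_RETRIES times, sleeping
 * interval_sec seconds between attempts.
 * Returns 0 on success and -1 if every attempt failed, in which case the
 * caller is expected to exit rather than continue with an unbound socket.
 */
static int bind_with_retries(int fd, const struct sockaddr *addr,
                             socklen_t addr_len, unsigned int interval_sec)
{
	int attempt;

	assert(addr != NULL);

	for (attempt = 0; attempt < BIND_MAX_RETRIES; attempt++) {
		if (bind(fd, addr, addr_len) == 0) {
			return 0;
		}
		/* Do not sleep after the final failed attempt. */
		if (attempt + 1 < BIND_MAX_RETRIES) {
			sleep(interval_sec);
		}
	}

	return -1;
}
```

A caller would log the failure and terminate when -1 is returned.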
Signed-off-by: Masse Nicolas <nicolas.masse@stormshield.eu>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
knet needs buffers to be KNET_MAX_PACKET_SIZE or messages will
get lost or corrupted.
UDPU packets shouldn't be that big so I introduced UDP_FRAME_SIZE_MAX
for that transport.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
This is a big update that removes RRP & MRP from the codebase
and makes knet the default transport for corosync. UDP & UDPU
are still (currently) supported but are deprecated. Also, crypto
and multiple interfaces are only supported over knet.
To compile this codebase you will need to install libknet from
https://github.com/fabbione/kronosnet
The corosync.conf(5) man page has been updated with info on the new
options. Older config files should still work but many options
have changed because of the knet implementation so configs should
be checked carefully. In particular, any cluster using RRP
over UDP or UDPU will not start as RRP is no longer present. If you
need multiple interface support then you should be using the knet transport.
Knet brings many benefits to the corosync codebase: it provides support
for more interfaces than RRP (up to 8), will be more reliable in the event
of network outages and allows dynamic reconfiguration of interfaces.
It also fixes the ifup/ifdown and 127.0.0.1 binding problems that have
plagued corosync/openais from day 1.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
The IPv6 header is 20 bytes larger than the IPv4 header. This fact was
not taken into account, so IPv6 packets were larger than the MTU,
resulting in fragmentation.
The solution is to subtract the correct IP header size.
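
As a rough illustration of the arithmetic (not the code from the
patch), the usable payload can be derived by subtracting a
family-dependent IP header size. The function name max_udp_payload and
the constant names are assumptions, as is the choice to subtract the
8-byte UDP header here as well; IP options and IPv6 extension headers
are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>	/* AF_INET, AF_INET6 */

/* Header sizes without IP options / IPv6 extension headers. */
#define IPV4_HEADER_SIZE 20
#define IPV6_HEADER_SIZE 40
#define UDP_HEADER_SIZE   8

/*
 * Hypothetical helper: largest UDP payload that fits into one packet
 * of the given MTU without fragmentation.
 */
static size_t max_udp_payload(size_t mtu, int family)
{
	size_t ip_header;

	assert(family == AF_INET || family == AF_INET6);

	ip_header = (family == AF_INET6) ? IPV6_HEADER_SIZE : IPV4_HEADER_SIZE;
	assert(mtu > ip_header + UDP_HEADER_SIZE);

	return mtu - ip_header - UDP_HEADER_SIZE;
}
```

With an MTU of 1500 this gives 1472 bytes for IPv4 and 1452 bytes for
IPv6, which is the 20-byte difference the fix accounts for.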
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
To follow the spec, messages need to be sent to all nodes (not only
active members) from time to time to detect a merge.
This is needed in situations when the totemsrp merge timer isn't
running (because enough messages are being sent by processors).
Example scenario:
- 3 nodes, all of them running cpgverify
- One node is isolated (iptables for example)
- Node is un-isolated
Without this commit, the node will not merge as long as cpgverify is
running.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
The member active flag is used for sending "multicast" messages only to
members of the ring. This reduces network load if some nodes are
intentionally down.
Only the regular multicast message load is reduced (messages sent by
totemudpu_mcast_noflush_send), because special messages (like hold
cancel, join message, ...) still have to be sent to all members to
ensure correct behavior.
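
The selection rule above can be sketched like this. It is only an
illustration under assumptions: struct member, its fields and
select_recipients are hypothetical names and do not reflect the actual
totemudpu data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical member entry; the real member list looks different. */
struct member {
	unsigned int nodeid;
	bool active;
};

/*
 * Fill recipients[] with the members a "multicast" message should go to.
 * Regular messages go only to active members; special messages (hold
 * cancel, join, ...) go to every configured member.
 * recipients[] must have room for member_count entries.
 * Returns the number of recipients selected.
 */
static size_t select_recipients(const struct member *members,
                                size_t member_count, bool special_message,
                                const struct member **recipients)
{
	size_t selected = 0;
	size_t i;

	assert(members != NULL && recipients != NULL);

	for (i = 0; i < member_count; i++) {
		if (special_message || members[i].active) {
			recipients[selected++] = &members[i];
		}
	}

	return selected;
}
```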
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
This patch brings back SUBJ functionality. It relies on the fact that
sendmsg will return an error, and if such an error keeps being returned
for a long time, it's probably because of a firewall.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
drop all SOLARIS specific ifdefs and replace them with feature checks
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
This will remove the (non-critical) debug message from QB about polling
on a closed FD.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Also, a few leftovers from cfg are removed and the version of totempg
is increased to 5 to reflect all the changes we made.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Fabio M. Di Nitto <fdinitto@redhat.com>
keep totem.secauth config key for compatibility
if the key is NOT set, crypto will default to aes256/sha1
if the key is set to "off", crypto is disabled.
this pretty much reflects the old behavior
keywords totem.crypto_cipher and totem.crypto_hash can
override secauth individually.
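
The precedence described above can be sketched as follows. This is an
illustration only: struct crypto_config and resolve_crypto are
hypothetical names, and representing "disabled" with the string "none"
is an assumption. A NULL argument stands for a key that is not set in
the configuration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result type; not the structure used by totemconfig. */
struct crypto_config {
	const char *cipher;
	const char *hash;
};

/*
 * Resolve the effective crypto settings from the three config keys.
 * secauth unset        -> aes256/sha1
 * secauth set to "off" -> crypto disabled ("none" is an assumed marker)
 * crypto_cipher / crypto_hash, when set, override secauth individually.
 */
static struct crypto_config resolve_crypto(const char *secauth,
                                           const char *crypto_cipher,
                                           const char *crypto_hash)
{
	struct crypto_config cfg = { "aes256", "sha1" };

	if (secauth != NULL && strcmp(secauth, "off") == 0) {
		cfg.cipher = "none";
		cfg.hash = "none";
	}
	if (crypto_cipher != NULL) {
		cfg.cipher = crypto_cipher;
	}
	if (crypto_hash != NULL) {
		cfg.hash = crypto_hash;
	}

	/* Both fields always end up pointing at a valid string. */
	assert(cfg.cipher != NULL && cfg.hash != NULL);
	return cfg;
}
```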
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
totem doesn't need to understand what crypto does.
totem needs to be able to tell crypto: "those are data, play with them"
and crypto needs to return: "here are your scrambled data and the new size"
similar to decrypt/verify.
this way we add enough flexibility within crypto to change the header
size and all at any given time (for a different hash algorithm, for
example) without affecting on-wire compat.
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Tomcrypt in corosync has not been updated for a long time. Because we
have support for libnss, libtomcrypt can be removed.
Also, a few leftovers (AES is 256 bits, not 128, ...) are removed.
Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
this was another old onwire compat mode that is not useful any longer.
we can safely move to the new model by default.
According to Honza (real hardware 1 node testing) there is no
performance impact.
In my tests (8 nodes VM cluster), there is up to a 10-12% performance
improvement up to 1M packet size, where old and new models are equal.
As a side note, nss still shows to be a performance loss on both
real and virtual hw (without any kind of nss hw acceleration).
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
These look ugly, are inconsistently done and just have
to be removed later in libqb before calling syslog.
Signed-off-by: Angus Salkeld <asalkeld@redhat.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Our preferred shared logging system is exported via the libqb library. As
a result, the corosync project no longer needs to export logsys.so and the
code can be directly included in the binary. The header file can also be
removed.
Signed-off-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Add a new object called totem.interface.dynamic to allow creation/deletion
of new child objects using the corosync-objctl utility:
To add a new member:
linux# corosync-objctl -c totem.interface.dynamic.10-211-55-12
To delete an existing member:
linux# corosync-objctl -d totem.interface.dynamic.10-211-55-12
Corosync will dynamically add these members to the configuration and start
communicating with those nodes.
Signed-off-by: Anton Jouline <anton.jouline@cbsinteractive.com>
Reviewed-by: Steven Dake <sdake@redhat.com>