mirror_corosync/exec
Jason cfbb021e13 totem: Drop invalid join msg in operational state
According to the Totem paper, if a processor
receives a join message in the operational state and the
receiver's identifier is in the join message's fail list,
then the join message should be ignored.

Applying this validation to join messages avoids unnecessary
switching from operational state to gather state (which can even
prevent rings from merging), as in the following scenario.

1. Initially, there is a single ring containing three nodes, say
   ring(A,B,C).
2. The network between A and B partitions and, at the same time, C goes down.
3. Node A sends join message with proclist:A,B,C. faillist:NULL.
   Node B sends join message with proclist:A,B,C. faillist:NULL.
4. Both A and B hit the consensus timeout because of the network partition.
5. The network between A and B is restored.
6. Node A sends join message with proclist:A,B,C. faillist:B,C. and
   creates ring(A).
   Node B sends join message with proclist:A,B,C. faillist:A,C. and
   creates ring(B).
7. Suppose the join message with proclist:A,B,C. faillist:A,C sent
   by node B is received by node A now that the network is restored.
8. Node A shifts to gather state and sends out a modified join message
   with proclist:A,B,C. faillist:B. Such a join message prevents
   A and B from merging.
9. Node A hits the consensus timeout (caused by waiting for node C) and
   sends a join message with proclist:A,B,C. faillist:B,C again.

The same thing happens on node B, so A and B loop forever
through steps 7, 8 and 9.

The paper also says: "If a processor receives a join message in the
operational state and if the sender's identifier is in the receiver's
my_proclist and the join message's ring_seq is less than the receiver's
ring sequence number, then it ignores the join message too." So this
patch applies both of these join message validations together.
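
The two rules quoted above can be sketched as a single predicate. This is
only an illustration under stated assumptions: the struct layouts, the
MAX_NODES limit, and the names id_in_list and ignore_join_in_operational
are hypothetical and are not taken from totemsrp.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified types for illustration only;
 * these are not the structures used in totemsrp.c. */
#define MAX_NODES 16

struct join_msg {
    unsigned int sender_id;
    unsigned long long ring_seq;
    unsigned int proc_list[MAX_NODES];
    size_t proc_count;
    unsigned int fail_list[MAX_NODES];
    size_t fail_count;
};

struct node_state {
    unsigned int my_id;
    unsigned long long my_ring_seq;
    unsigned int my_proc_list[MAX_NODES];
    size_t my_proc_count;
};

/* Linear search for a node identifier in a list. */
static bool id_in_list(const unsigned int *list, size_t count, unsigned int id)
{
    for (size_t i = 0; i < count; i++) {
        if (list[i] == id) {
            return true;
        }
    }
    return false;
}

/* Returns true if a join message received in the operational state
 * should be ignored according to the two rules from the paper. */
static bool ignore_join_in_operational(const struct node_state *self,
                                       const struct join_msg *msg)
{
    /* Rule 1: the receiver's identifier is in the message's fail list. */
    if (id_in_list(msg->fail_list, msg->fail_count, self->my_id)) {
        return true;
    }
    /* Rule 2: the sender is in the receiver's my_proclist and the
     * message carries an older ring sequence number. */
    if (id_in_list(self->my_proc_list, self->my_proc_count, msg->sender_id) &&
        msg->ring_seq < self->my_ring_seq) {
        return true;
    }
    return false;
}
```

In the scenario above, node A (in ring(A)) receiving node B's join message
with faillist:A,C matches rule 1, so A would stay in the operational state
instead of shifting to gather state in step 8.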

Signed-off-by: Jason <huzhijiang@gmail.com>
Reviewed-by: Steven Dake <sdake@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
2014-01-13 14:46:13 +01:00
.gitignore Add .gitignore files. 2010-10-21 07:43:46 -07:00
apidef.c sync: kill evil and syncv1 in one shot 2012-03-09 11:15:08 +01:00
apidef.h Update copyright header dates in exec directory 2012-02-13 17:05:04 -07:00
cfg.c Reload: Add reload code to cfg 2013-09-12 16:09:41 +01:00
cmap.c Initialize item in cmap_mcast_send 2013-06-13 10:53:56 +02:00
coroparse.c votequorum: Add persistent expected_votes tracking. 2014-01-07 15:30:11 +00:00
cpg.c Remove unnecessary mmap in cpg 2013-05-21 14:46:15 +02:00
cs_queue.h Update copyright header dates in exec directory 2012-02-13 17:05:04 -07:00
fsm.h Make logging of WD and MON service correct 2012-08-16 14:45:15 +02:00
icmap.c icmap: Add func to test equality of two key values 2013-09-10 17:02:12 +02:00
ipc_glue.c ipc_glue: proper ref counting during service connection iteration 2013-07-04 13:05:52 +02:00
logconfig.c Reload: Add atomic reload to log config 2013-09-12 16:10:07 +01:00
logconfig.h rename mainconfig to logconfig 2012-05-29 09:36:00 +02:00
logsys.c logsys: Make logging of totem work again 2013-11-04 12:32:35 +01:00
main.c logsys: Make logging of totem work again 2013-11-04 12:32:35 +01:00
main.h Reload: Make coroparse use a designated icmap hash table 2013-09-12 16:09:06 +01:00
Makefile.am link libtotem_pg to libqb 2012-10-29 16:49:19 +01:00
mon.c build: bring SOLARIS up to the same standard as other OSes 2012-08-30 15:00:27 +02:00
pload.c build: bring SOLARIS up to the same standard as other OSes 2012-08-30 15:00:27 +02:00
quorum.c sync: kill evil and syncv1 in one shot 2012-03-09 11:15:08 +01:00
quorum.h Update copyright header dates in exec directory 2012-02-13 17:05:04 -07:00
schedwrk.c Update copyright header dates in exec directory 2012-02-13 17:05:04 -07:00
schedwrk.h Update copyright header dates in exec directory 2012-02-13 17:05:04 -07:00
service.c service: Fix memleak in service_unlink_and_exit 2013-06-21 11:21:29 +02:00
service.h service: remove leftovers from mt corosync 2012-08-09 15:10:16 +02:00
sync.c Correctly check if service was unloaded 2012-10-17 15:06:36 +02:00
sync.h sync: kill evil and syncv1 in one shot 2012-03-09 11:15:08 +01:00
timer.c Update copyright header dates in exec directory 2012-02-13 17:05:04 -07:00
timer.h Update copyright header dates in exec directory 2012-02-13 17:05:04 -07:00
totemconfig.c Reload: Add atomic reload to totemconfig 2013-09-12 16:09:55 +01:00
totemconfig.h Tweak nodeid warning 2012-02-21 16:33:56 +01:00
totemcrypto.c crypto: drop < 2.3 protocols and onwire compat 2013-01-14 11:49:32 +01:00
totemcrypto.h crypto: drop < 2.3 protocols and onwire compat 2013-01-14 11:49:32 +01:00
totemiba.c totemiba: Check if configured MTU is allowed by HW 2013-09-20 11:27:08 +02:00
totemiba.h Return back "Totem is unable to form..." message 2012-10-08 16:53:35 +02:00
totemip.c Convert the nodeid byte order to be aligned with network order 2013-03-19 16:39:59 +01:00
totemmrp.c Add waiting_trans_ack also to fragmentation layer 2012-11-22 11:48:12 +01:00
totemmrp.h Add waiting_trans_ack also to fragmentation layer 2012-11-22 11:48:12 +01:00
totemnet.c Return back "Totem is unable to form..." message 2012-10-08 16:53:35 +02:00
totemnet.h Return back "Totem is unable to form..." message 2012-10-08 16:53:35 +02:00
totempg.c totempg: Make iov_delv local variable 2013-03-21 14:24:23 +01:00
totemrrp.c totemrrp: Make status string shorter 2013-06-18 14:36:11 +02:00
totemrrp.h Update crypto_set API 2012-03-15 17:33:53 +01:00
totemsrp.c totem: Drop invalid join msg in operational state 2014-01-13 14:46:13 +01:00
totemsrp.h Add waiting_trans_ack also to fragmentation layer 2012-11-22 11:48:12 +01:00
totemudp.c totem: Don't leak instance variable on crypto fail 2013-06-18 14:35:25 +02:00
totemudp.h Return back "Totem is unable to form..." message 2012-10-08 16:53:35 +02:00
totemudpu.c totem: Don't leak instance variable on crypto fail 2013-06-18 14:35:25 +02:00
totemudpu.h Return back "Totem is unable to form..." message 2012-10-08 16:53:35 +02:00
util.c drop evs service 2012-03-12 15:51:50 +01:00
util.h rename mainconfig to logconfig 2012-05-29 09:36:00 +02:00
votequorum.c votequorum: Add persistent expected_votes tracking. 2014-01-07 15:30:11 +00:00
votequorum.h Remove include/engine/quorum and integrate it into exec/engine.h 2012-02-08 08:31:10 -07:00
vsf_quorum.c build: bring SOLARIS up to the same standard as other OSes 2012-08-30 15:00:27 +02:00
vsf_ykd.c Initialize error variable in ykd_init 2013-06-13 10:53:57 +02:00
vsf_ykd.h Remove include/engine/quorum and integrate it into exec/engine.h 2012-02-08 08:31:10 -07:00
vsf.h Update copyright header dates in exec directory 2012-02-13 17:05:04 -07:00
wd.c build: bring SOLARIS up to the same standard as other OSes 2012-08-30 15:00:27 +02:00