JGroups is deployed on two servers. When data changes on one of them, that machine receives the message locally, but the other machine never receives anything, and the logs show no errors. Any help would be appreciated.
PS: the same configuration works with jgroups-2.6.12.jar but does not work with jgroups-2.8.0.GA.jar. Configuration:
<config>
<synchronization>
<enabled>true</enabled>
<jgroupsInit>
<![CDATA[
UDP(mcast_addr=239.190.1.95;mcast_port=32986;discard_incompatible_packets=true;enable_diagnostics=false;
max_bundle_size=60000;max_bundle_timeout=30;ip_ttl=32;enable_bundling=true;
use_concurrent_stack=true;thread_pool.enabled=true;thread_pool.min_threads=1;
thread_pool.max_threads=25;thread_pool.keep_alive_time=5000;
thread_pool.queue_enabled=false;thread_pool.queue_max_size=100;
thread_pool.rejection_policy=Run;oob_thread_pool.enabled=true;oob_thread_pool.min_threads=1;
oob_thread_pool.max_threads=8;oob_thread_pool.keep_alive_time=5000;oob_thread_pool.queue_enabled=false;
 oob_thread_pool.queue_max_size=100;oob_thread_pool.rejection_policy=Run):
PING(timeout=2000;num_initial_members=3):
MERGE2(max_interval=30000;min_interval=10000):
FD_SOCK:FD(timeout=10000;max_tries=5;shun=true):
VERIFY_SUSPECT(timeout=1500):
pbcast.NAKACK(use_mcast_xmit=false;gc_lag=0;retransmit_timeout=300,600,1200,2400,4800;discard_delivered_msgs=true):
UNICAST(timeout=300,600,1200,2400,3600):
pbcast.STABLE(stability_delay=1000;desired_avg_gossip=50000;max_bytes=400000):
pbcast.GMS(print_local_addr=true;join_timeout=3000;shun=false;view_bundling=true):
FC(max_credits=20000000;min_threshold=0.10):
FRAG2(frag_size=59999):pbcast.STATE_TRANSFER
    ]]>
</jgroupsInit>
<groupName>csdn</groupName>
</synchronization>
</config>
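Before digging into the protocol stack, a useful first check for "one node never receives" is to verify that raw IP multicast actually flows between the two hosts, using the test classes bundled in the JGroups jar. A sketch, assuming the jar file name and the class names from the JGroups manual; flag names can differ between versions, so run each class with -help to confirm:

```shell
# Receiver side (the host that never gets messages) -- joins the multicast
# group and prints every datagram it receives:
java -cp jgroups-2.8.0.GA.jar org.jgroups.tests.McastReceiverTest \
     -mcast_addr 239.190.1.95 -port 32986

# Sender side -- lines typed here should appear on the receiver.
# If nothing arrives, the problem is interface/routing/firewall, not JGroups:
java -cp jgroups-2.8.0.GA.jar org.jgroups.tests.McastSenderTest \
     -mcast_addr 239.190.1.95 -port 32986
```

If this raw test fails while both JVMs are up, the JGroups configuration is not the culprit and the network layer (or the interface the socket binds to) is.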

Solution »

  1.   

     I found the root cause:
     It's caused by bind_addr. Both the .101 and .103 servers have VMware installed, and on the .101 server the ifconfig command shows:
    [root@xmlapi-remedy1 bin]# ifconfig
    eth0      Link encap:Ethernet  HWaddr 00:15:17:0D:DA:12
              inet addr:10.224.118.101  Bcast:10.224.118.255  Mask:255.255.255.0
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:46873435 errors:0 dropped:0 overruns:0 frame:0
              TX packets:19680485 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:100
              RX bytes:33923524573 (31.5 GiB)  TX bytes:10429994201 (9.7 GiB)
              Base address:0x2020 Memory:b8820000-b8840000

    lo        Link encap:Local Loopback
              inet addr:127.0.0.1  Mask:255.0.0.0
              UP LOOPBACK RUNNING  MTU:16436  Metric:1
              RX packets:2584408 errors:0 dropped:0 overruns:0 frame:0
              TX packets:2584408 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:295871444 (282.1 MiB)  TX bytes:295871444 (282.1 MiB)

    vmnet1    Link encap:Ethernet  HWaddr 00:50:56:C0:00:01
              inet addr:172.16.65.1  Bcast:172.16.65.255  Mask:255.255.255.0
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:0 errors:0 dropped:0 overruns:0 frame:0
              TX packets:126 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

    vmnet8    Link encap:Ethernet  HWaddr 00:50:56:C0:00:08
              inet addr:172.16.185.1  Bcast:172.16.185.255  Mask:255.255.255.0
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:0 errors:0 dropped:0 overruns:0 frame:0
              TX packets:12802 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
    If bind_addr is not set in the JGroups settings, JGroups creates its socket on the vmnet8 address 172.16.185.1 rather than on 10.224.118.101, so the messages never go out on the right interface.
    How to solve it? I added bind_addr=10.224.118.101 and then it works. But this is only a stopgap; there may be a cleaner way.
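One way to avoid hard-coding the address in the XML: JGroups also resolves bind_addr from the jgroups.bind_addr system property, so the address can be supplied per host at JVM launch while both servers share one config file. A sketch; "your-app.jar" is a placeholder for the actual application, and the address is this post's eth0 address:

```shell
# Pin JGroups to the physical NIC without editing the shared XML config.
# JGroups consults the jgroups.bind_addr system property for the bind
# address; substitute each host's own eth0 address here.
java -Djgroups.bind_addr=10.224.118.101 -jar your-app.jar
```

Some JGroups releases also accept symbolic bind_addr values such as SITE_LOCAL or NON_LOOPBACK, which would pick a private address automatically, but since vmnet1/vmnet8 are also in private ranges that may still select the wrong interface; check your version's manual before relying on it.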
    <jgroupsInit><![CDATA[
    UDP(bind_addr=10.224.118.101;mcast_addr=239.190.1.1;mcast_port=32986;discard_incompatible_packets=true;enable_diagnostics=false;
    max_bundle_size=60000;max_bundle_timeout=30;ip_ttl=5;enable_bundling=false;
    use_concurrent_stack=true;thread_pool.enabled=true;thread_pool.min_threads=2;
    thread_pool.max_threads=25;thread_pool.keep_alive_time=5000;
    thread_pool.queue_enabled=true;thread_pool.queue_max_size=10000;
    thread_pool.rejection_policy=discard;oob_thread_pool.enabled=true;oob_thread_pool.min_threads=1;
    oob_thread_pool.max_threads=8;oob_thread_pool.keep_alive_time=5000;oob_thread_pool.queue_enabled=false;
     oob_thread_pool.queue_max_size=100;oob_thread_pool.rejection_policy=Run):
    PING(timeout=2000;num_initial_members=3):
    MERGE2(max_interval=30000;min_interval=10000):
    FD_SOCK:FD(timeout=10000;max_tries=5;shun=true):
    VERIFY_SUSPECT(timeout=1500):
    pbcast.NAKACK(use_mcast_xmit=false;gc_lag=0;retransmit_timeout=300,600,1200,2400,4800;discard_delivered_msgs=true):
    UNICAST(timeout=300,600,1200,2400,3600):
    pbcast.STABLE(stability_delay=1000;desired_avg_gossip=50000;max_bytes=400000):
    pbcast.GMS(print_local_addr=true;join_timeout=3000;shun=false;view_bundling=true):
    FC(max_credits=20000000;min_threshold=0.10):
    FRAG2(frag_size=59999):pbcast.STATE_TRANSFER
                ]]></jgroupsInit>

    Some properties need to be removed -- they are deprecated now:
    a. UDP: use_concurrent_stack=true
    b. FD: shun=true
    c. GMS: shun=false

    Some properties need to be added:
    UDP: ucast_recv_buf_size=20000000; ucast_send_buf_size=640000; mcast_recv_buf_size=25000000; mcast_send_buf_size=640000;
    If these sizes exceed the OS socket buffer limits, JGroups logs a WARN message, so we should either add the properties above under UDP, or raise the limits on the OS:
    sysctl -w net.core.rmem_max=83886088
    sysctl -w net.core.wmem_max=8388608
    sysctl -w net.core.rmem_default=65536888
    sysctl -w net.core.wmem_default=65536888
    sysctl -w net.ipv4.route.flush=1
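Note that sysctl -w changes are lost on reboot. To make them permanent, the same keys can be appended to /etc/sysctl.conf and reloaded with sysctl -p. A sketch, assuming root access; the values simply mirror the commands above:

```shell
# Persist the kernel socket buffer limits across reboots
# (same values as the sysctl -w commands above).
cat >> /etc/sysctl.conf <<'EOF'
net.core.rmem_max = 83886088
net.core.wmem_max = 8388608
net.core.rmem_default = 65536888
net.core.wmem_default = 65536888
EOF
sysctl -p   # reload settings from /etc/sysctl.conf
```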