beta17 Posted May 27, 2016 #1 Posted May 27, 2016

Hi guys

I've tried to create a High Availability cluster with two VMs on Proxmox. LAN 1 on both VMs is connected to the vmbr1 bridge. These bridges are directly connected to each other.

The configuration wizard fails with this message: "This connection appears to be unstable. Please try a different network cable."

And here is the output from the Heartbeat check result:

Xpenology1> more HA.Heartbeat.check.result
=======================
==2016/05/27 11:02:17==
=======================
/sbin/ifconfig eth0:DRBD 169.254.1.1 netmask 255.255.255.252
=======================
==2016/05/27 11:02:17==
=======================
/bin/ping 169.254.1.2 -s 1400 -I eth0 -c 3 -w 60
PING 169.254.1.2 (169.254.1.2): 1400 data bytes
1408 bytes from 169.254.1.2: seq=0 ttl=64 time=2004.519 ms
1408 bytes from 169.254.1.2: seq=1 ttl=64 time=1003.909 ms
1408 bytes from 169.254.1.2: seq=2 ttl=64 time=3.034 ms
--- 169.254.1.2 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 3.034/1003.820/2004.519 ms
=======================
==2016/05/27 11:02:19==
=======================
/bin/ping 169.254.1.2 -s 1400 -I eth0 -w 5
PING 169.254.1.2 (169.254.1.2): 1400 data bytes
1408 bytes from 169.254.1.2: seq=0 ttl=64 time=0.392 ms
1408 bytes from 169.254.1.2: seq=1 ttl=64 time=0.528 ms
1408 bytes from 169.254.1.2: seq=2 ttl=64 time=0.774 ms
1408 bytes from 169.254.1.2: seq=3 ttl=64 time=0.884 ms
1408 bytes from 169.254.1.2: seq=4 ttl=64 time=0.756 ms
--- 169.254.1.2 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max = 0.392/0.666/0.884 ms
=======================
==2016/05/27 11:02:24==
=======================
/usr/bin/wget --tries=1 --timeout=1 --delete-after http://169.254.1.2:5000/webman/modules/HAManager/10M
--2016-05-27 11:02:24-- http://169.254.1.2:5000/webman/modules/HAManager/10M
Connecting to 169.254.1.2:5000... connected.
HTTP request sent, awaiting response... Read error (Connection timed out) in headers. Giving up.

Any ideas?

thx beta17
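The first ping run in that log spikes past 2000 ms, which is plausibly what trips the wizard's stability test. For anyone reading such a transcript, here is a small hypothetical shell helper (not part of DSM's HA Manager) that flags a ping run whose max round-trip time exceeds a threshold, by parsing the "round-trip min/avg/max" summary line:

```shell
# check_rtt: read a ping transcript on stdin and report whether the
# max round-trip time exceeds a threshold in ms (first argument).
# Hypothetical helper sketch, not part of DSM's HA Manager.
check_rtt() {
  awk -v t="$1" '
    # Summary line format: round-trip min/avg/max = 3.034/1003.820/2004.519 ms
    /round-trip/ {
      split($4, rtt, "/")                 # rtt[3] holds the max RTT
      if (rtt[3] + 0 > t + 0)
        printf "UNSTABLE: max RTT %s ms exceeds %s ms\n", rtt[3], t
      else
        printf "OK: max RTT %s ms\n", rtt[3]
    }'
}

# Run against the first heartbeat ping summary from the log:
printf 'round-trip min/avg/max = 3.034/1003.820/2004.519 ms\n' | check_rtt 1000
```

The second ping run in the log (max 0.884 ms) would pass the same check, which suggests the instability is intermittent rather than constant.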
sbv3000 Posted May 27, 2016 #2 Posted May 27, 2016

Maybe the latency on the heartbeat is too high because of the overhead of the VM networking? That aside, have you searched the forum for other attempts at HA? I think people have tried, but it's not working with XPE/DSM. If your data is critical enough to need HA, then maybe you should be using 'real' Syno boxes to get the full benefit of support and functionality. As an alternative to HA, you could just set up folder replication and/or a regular backup between two boxes; these utilities work fine in XPE/DSM and would probably be enough to keep your data available.
beta17 Posted May 27, 2016 Author #3 Posted May 27, 2016

It's just because I have the hardware to play around with, and it's a lot of fun. I would never go into production with unofficial, unsupported products.

I've located the issue: I use a USB 3 to Gigabit dongle, and it does not support MTU 9000. I just changed the network interface.
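For anyone hitting the same symptom: jumbo-frame support can be verified end to end with ping and the don't-fragment flag before trusting the HA heartbeat link. A minimal sketch follows; the interface name eth0 and the peer address 169.254.1.2 are taken from the log above, and the commented commands need root. Note the ping payload is the MTU minus 28 bytes (20 for the IPv4 header, 8 for ICMP):

```shell
# Compute the ICMP payload size that exactly fills a given MTU:
# subtract 20 bytes of IPv4 header and 8 bytes of ICMP header.
jumbo_payload() {
  echo $(( $1 - 28 ))
}

# With MTU 9000, the non-fragmenting ping payload is 8972 bytes:
#   ip link set eth0 mtu 9000
#   ping -M do -s "$(jumbo_payload 9000)" -c 3 169.254.1.2
# If the NIC (e.g. a USB gigabit dongle) cannot handle MTU 9000, the
# 'ip link set' call fails or the pings report the frame is too long.
jumbo_payload 9000
```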
AllGamer Posted May 27, 2016 #4 Posted May 27, 2016

"maybe the latency on the heartbeat is too much because of the overheads of the VM networking?"

I was thinking the same thing, and I know from experience that there are a lot of network-timing-sensitive applications that do not play well when run inside VMs.

I'll try later today to slap together two bare-bones XPEnology machines and see if High Availability works on physical hardware, as that was part of the reason I'm spending so much time investing in this project.

I just ordered a Norco 24 hot-swap bay box, the 3x Supermicro controllers you recommended, an i3-6100 (Skylake), a workstation motherboard, a 1000W PSU, and 32GB of DDR4 RAM (not that it needs it, but it's good to have extra for the future). I already have 8x 3TB HDDs from an older Linux setup that I'm planning to decommission if XPEnology works well 24/7/365 with High Availability.

I currently have a cron job running an rsync script on Ubuntu/Fedora for "HA", but if XPEnology works, I'd rather have XPEnology do all the work. If the new system works well, I was planning to turn the two old Linux boxes into XPEnology boxes as well and run HA on all three machines.
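The cron-plus-rsync "poor man's HA" mentioned above (and the folder replication sbv3000 suggested) can be sketched in a few lines. The paths, the schedule, and the peer hostname here are illustrative assumptions, not AllGamer's actual script:

```shell
# Mirror a source tree into a destination, deleting files that have
# disappeared from the source so both sides stay identical.
# Sketch of the cron-driven rsync setup described above; paths are
# examples only.
mirror() {
  rsync -a --delete "$1"/ "$2"/
}

# A crontab entry running it nightly at 02:00 against a second box
# over SSH might look like (hypothetical host and share names):
#   0 2 * * * rsync -a --delete /volume1/share/ backupbox:/volume1/share/
```

Unlike real HA this gives no automatic failover, only a nightly copy, which is why it is a backup strategy rather than high availability.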
sbv3000 Posted May 29, 2016 #5 Posted May 29, 2016 AllGamer - You have too much money and too much 'spare time' :)
alwaysbadboy Posted May 30, 2016 #6 Posted May 30, 2016

Please see my post from Feb 2016, re-posted below.

Re: XPEnology High Availability
Post by alwaysbadboy » 10 Feb 2016 12:50

Hi All

I am facing the same problem, but solved it with the steps below (version XPEnoboot 5.2-5644):

1. Modify the S/N; make sure both XPEnology NAS boxes have different serial numbers.
2. Modify the MAC address; make sure it starts with 00:11:29 (it is easy to change in ESXi).
3. The virtual PC must have a serial port (otherwise you will see the "operation fail, please re-login again" message).
4. HA setup goes very smoothly if you do not create a volume at all. Unfortunately, that introduced a new problem:
5. It shows "operation fail, please make sure the passive server is online" when I try to create a volume after the HA cluster is built.

Note on (4): if you create a volume before configuring the HA cluster, HA will fail.

Any ideas? How do I check the HA log file?

Thanks
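Step 2 above is easy to get wrong when cloning VMs, so a quick sanity check of the configured MAC prefix may help. This is a hypothetical helper; the prefix 00:11:29 is the value given in the post:

```shell
# is_syno_mac: report whether a MAC address starts with the 00:11:29
# prefix required by step 2 above. Hypothetical helper for checking a
# VM's configured MAC before binding the HA cluster.
is_syno_mac() {
  case "$1" in
    00:11:29:*) echo yes ;;
    *)          echo no  ;;
  esac
}

is_syno_mac "00:11:29:aa:bb:cc"
```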
beta17 Posted May 31, 2016 Author #7 Posted May 31, 2016

@alwaysbadboy check this file to find any issues with creating the HA cluster:

/tmp/ha/HA.Heartbeat.check.result

Also check that your network cards support MTU 9000.

Let us know...
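Since the interesting failures in that file are buried between the "=====" separators, a grep can pull them out. A sketch; the failure strings are an assumption based on the log excerpt earlier in this thread, not an exhaustive list:

```shell
# scan_ha_log: scan a heartbeat check result file for the failure
# strings seen earlier in this thread. The pattern list is an
# assumption from the posted log excerpt, not exhaustive.
scan_ha_log() {
  if grep -E 'Read error|Giving up|100% packet loss' "$1"; then
    echo "heartbeat problems found"
  else
    echo "no obvious heartbeat problems"
  fi
}

# Typical use on the box itself:
#   scan_ha_log /tmp/ha/HA.Heartbeat.check.result
```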
beefy147 Posted June 17, 2016 #8 Posted June 17, 2016

I found I could get HA working if you create the HA setup without a volume.

When you get the "operation fail, please make sure the passive server is online" message: unbind the passive server, create a volume, and bind the passive server again afterwards (it takes a while to resync). This seems to work, though I've only tested it by creating some files and folders. I've even done a switchover and it appears to be working. Early days.