zeroho

XpEnology High Availability


I went through lots of topics and searched the forums, but I can't find anybody who has successfully enabled the HA feature with 2 XPEnology VM guests on a single ESXi host. I changed the 2 XPEnology guests to use different MAC addresses and the corresponding SNs. I also tried versions 4.2-3202 (ends with "failed to create cluster after 10 minutes") and 4.3-3810 (ends with "Operation failed. Please re-login and try again" after I click the Apply button), but neither was successful.

 

Please kindly give me a hand on this.


I had a play with this today

 

For testing:

2 x Supermicro 1RU servers

20 GB RAM

2 x 250 GB drives in each, set up as a mirror (High Availability doesn't allow SHR)

2 x Intel LAN ports

Set up the first one with 4.3-3810 v1.0 from trantor

Once installed, I broke the SHR config and rebuilt it as a RAID1 mirror set

Set the local time properly

Added a user

Made sure networking was set up properly

Edited grub.conf to reflect the proper MAC addresses in each server, and gave both different serial numbers

Set static IPs

 

Server 1

LAN port 1 = 10.0.0.52, 255.255.255.0

LAN port 2 = 192.168.0.1, 255.255.255.0 (which doesn't matter, as the HA software later changes this to 169.254.1.1, 255.255.255.252)

Server 2

LAN port 1 = 10.0.0.53, 255.255.255.0

LAN port 2 = 192.168.0.2, 255.255.255.0 (which doesn't matter, as the HA software later changes this to 169.254.1.1, 255.255.255.252)
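
As a side note, the 255.255.255.252 mask the HA software applies is a /30, which leaves room for exactly two hosts, so it fits a direct server-to-server link. A minimal Python sketch to list them (the function name `heartbeat_hosts` is made up for illustration; only the address and mask come from the post):

```python
import ipaddress


def heartbeat_hosts(address: str, netmask: str) -> list[str]:
    """Return the usable host addresses of the subnet that contains `address`."""
    iface = ipaddress.ip_interface(f"{address}/{netmask}")
    return [str(host) for host in iface.network.hosts()]


if __name__ == "__main__":
    # Address and mask as reported after the HA software reconfigures LAN port 2.
    for host in heartbeat_hosts("169.254.1.1", "255.255.255.252"):
        print(host)
```

So the two ends of the heartbeat link would presumably end up on different addresses inside that small subnet.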

 

Connected a cable directly from Server 1 to Server 2, both on LAN port 2

Double-checked that all ports were running at 1000 Mbps, as 2 were not at first. Swapped some cables around until all 4 ports were up at Gigabit

Then on to installing the HA package (which first needs you to install the Python package)

Tested the setup, but yes, it fails at the end

Then I thought: as each server has two IP addresses on LAN port 1 (one that you set up, then I guess a virtual / bridged one that you type in during the setup), it must need bridging enabled??

 

had a looksee and it seems the bridge module isn't loaded...

 

/lib/module/bridge.ko

 

is there

 

so let's load it

insmod /lib/module/bridge.ko

fails, as it also needs /lib/module/stp.ko and /lib/module/llc.ko

now with all three loaded, let's try again......
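
For anyone repeating this, the load order matters because insmod does not resolve dependencies by itself. A rough Python sketch that works out the order from a dependency map; the map is an assumption (the post only says bridge needs stp and llc, and the stp-needs-llc entry reflects how these modules relate in mainline Linux), and `load_order` is a made-up helper name:

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Assumed dependency map: module -> modules that must be loaded before it.
DEPS = {
    "bridge": {"stp", "llc"},
    "stp": {"llc"},
}


def load_order(deps: dict[str, set[str]]) -> list[str]:
    """Return module names ordered so that dependencies come first."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    # Print the insmod commands in a workable order, using the path from the post.
    for name in load_order(DEPS):
        print(f"insmod /lib/module/{name}.ko")
```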

 

now it fails with

 

Jan 1 23:03:50 SUPER1 ha.cgi: write to uart2 error, error=(5)Input/output error

Jan 1 23:03:50 SUPER1 ha.cgi: ha.cc:1712: SYNOHWExternalControl failed 8192

 

I wonder if this has something to do with the serial port?

I will enable that next.... though I wonder if the REAL hardware has something attached to a serial port inside it?

 


I tried ESXi, with 2 DSM instances running on 2 different ESXi boxes.

I got some different log messages, but I don't know whether they are critical (they may be caused by ESXi).

 

============

 

Jan 2 01:56:10 DSM-43-AA ha.cgi: net_wol_enable.c:110 Cannot get current wake-on-lan settings

Jan 2 01:56:10 DSM-43-AA ha.cgi: net_wol_enable.c:176 Failed to set wake-on-lan eth2 g

Jan 2 01:56:10 DSM-43-AA ha.cgi: ha.cc:1666 Failed to set Wake on Lan 2

Jan 2 01:56:10 DSM-43-AA ha.cgi: net_wol_enable.c:110 Cannot get current wake-on-lan settings

Jan 2 01:56:10 DSM-43-AA ha.cgi: net_wol_enable.c:176 Failed to set wake-on-lan eth3 g

Jan 2 01:56:10 DSM-43-AA ha.cgi: ha.cc:1666 Failed to set Wake on Lan 3

Jan 2 01:56:10 DSM-43-AA kernel: [ 3992.332923] synobios_ioctl: SYNOIO_SET_AUTO_POWERON

Jan 2 01:56:10 DSM-43-AA ha.cgi: external_auto_power_on_set.c:70 set auto poweron fail

Jan 2 01:56:10 DSM-43-AA ha.cgi: ha.cc:4689 failed to clear rtc schedule poweron

Jan 2 01:56:10 DSM-43-AA ha.cgi: ha.cc:1734 Failed to disable schedule power on and off.

 

 

Jan 2 02:52:48 DSM-43-AA ha.cgi: main.cc:3734 Failed to get remote node information.
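
If it helps to compare logs between setups, here is a small Python sketch that pulls only the ha.cgi messages out of syslog-style lines like the ones above. The regular expression and the helper names are my own assumptions based on the format of these pasted lines, not anything taken from DSM itself:

```python
import re

# Assumed line layout: "<Mon> <day> <hh:mm:ss> <host> <process>: <message>"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]+)\s+(?P<host>\S+)\s+"
    r"(?P<proc>[^:\s]+):\s+(?P<msg>.*)$"
)


def parse_line(line: str):
    """Split one log line into its fields, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def ha_messages(lines):
    """Return the message part of every line logged by ha.cgi."""
    messages = []
    for line in lines:
        entry = parse_line(line)
        if entry and entry["proc"] == "ha.cgi":
            messages.append(entry["msg"])
    return messages


# Three lines copied from the log excerpt in this post.
SAMPLE = [
    "Jan 2 01:56:10 DSM-43-AA ha.cgi: ha.cc:1666 Failed to set Wake on Lan 2",
    "Jan 2 01:56:10 DSM-43-AA kernel: [ 3992.332923] synobios_ioctl: SYNOIO_SET_AUTO_POWERON",
    "Jan 2 02:52:48 DSM-43-AA ha.cgi: main.cc:3734 Failed to get remote node information.",
]

if __name__ == "__main__":
    for message in ha_messages(SAMPLE):
        print(message)
```

Feeding it the full log instead of `SAMPLE` would give just the HA-related lines, which makes it easier to spot where two setups diverge.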


Actually, come to think of it.

 

the errors

 

Jan 1 23:03:50 SUPER1 ha.cgi: write to uart2 error, error=(5)Input/output error
Jan 1 23:03:50 SUPER1 ha.cgi: ha.cc:1712: SYNOHWExternalControl failed 8192

 

are probably to do with writing stuff to the LCD panel on a REAL Synology :wink:

 


Hi,

 

I'm not sure, but HA can fail to work if the MAC addresses are fake (as they normally are on a newly installed XPEnology). After editing the vender file as described in the tutorial, it is worth trying again.

You need exactly the same config on the 2 XPEnology machines (amount of RAM / HDDs / Ethernet cards) and, as above, the fixed MAC addresses. Editing the vender file when booting from a USB stick is quite easy, but editing the vender file in ESXi when booting from the special vmdk image is more complicated...


Hi all,

 

I'm trying to do the same HA setup on 2 Supermicro boards with onboard Atom CPUs, with the same hardware in both servers. Did you get it to work in the end?

Please share your results.

Did changing the MAC address and the vender file solve the problem?

 

Thanks in advance.

 

Regards.

 

 

 


I replicated your scenario with the 2 Supermicro boards, RAID 1 in both servers, different MACs on eth0 and eth1 on each, and different serial numbers too, but unfortunately no luck.

Any help would be greatly appreciated.

 

Thanks


Any news or a fix in a newer version?

I tried with 2 XPEnology VMs on the same ESXi host. With a volume configured, the process finishes with the error "unable configure ha"...

If I try to create the cluster without any volumes configured, the process finishes successfully, but afterwards I can't configure any volumes: the configuration process says it can't configure the slave NAS because it doesn't respond.


Hello,

 

I've run a lot of cross tests to enable high availability on ESXi VMs.

Different SNs and virtual MAC addresses are defined in the boot parameters.

In every test, it crashes after 10~20 min of configuration and falls back to the original state.

Note that on both DSMs the error message says that HA was configured as slave, rather than one as master and one as slave. Maybe that is a lead worth investigating.

Any other ideas?

Thanks!


I think the volumes are incompatible...

Try to create a cluster without any volumes configured... the process goes well...

But afterwards you can't create new volumes.

My guess is that the disks "virtually" mapped to physical ones (which, for reasons I don't know, start from disk 3) are incompatible with the cluster configuration...


Hi, I'm trying to set up HA with two VMs that are identical in terms of RAM, CPU, processor, version and update. I'm using the latest nanoboot.

My setup:

Hypervisor: VMware Workstation 11

Every VM has two NICs, one bridged and the other connected to a specific LAN segment so they can talk to each other

Same HDD, same RAM, different names

When I start creating the cluster, I see errors about heartbeatd, which doesn't seem to start although no error is shown. Another error is about DDNS; I don't know why HA tries to use DDNS.

The IP configuration of the second NIC doesn't matter, because the HA package sets it to use DHCP, which doesn't exist on the LAN segment, so both NICs acquire an APIPA address (169.x.x.x).

I suspect some sort of check in the heartbeatd daemon.

Another attempt was to set up both VMs with a Synology MAC address and a real serial number generated with the Excel sheet:

http://xpenology.com/forum/viewtopic.php?f=2&t=5861

No luck.


I have tried with physical PC's to make it work, no success.

 

Without the volume, the cluster creates perfectly but after this, the volume fails to create on the cluster. When I first create a volume and then the cluster....the cluster creation fails.

 

Does anybody have more info?


I have tried it and it did not work well.

My storage crashed and showed only 1 byte.

I did a reboot and everything is back. I have broken up the HA and will try again later.

Regarding the earlier post saying that HA can fail if the MAC addresses are fake, and that editing the vender file in ESXi (booting from the special vmdk image) is more complicated than on a USB stick:

Where did you find information on editing the vender file (serial number) on ESXi?


I edited the serial number in the XPEnoboot ISO. The file I edited is "isolinux.cfg", but I still can't get HA to work. Since I use ESX, I modified the MAC addresses of all the interfaces in ESX. The MAC range for Synology starts with 00:11:32.

 

I get all the way to "confirm settings" and then click apply. After a minute or two, I get "Operation failed. Please re-login and try again."
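
For checking the interfaces quickly, a minimal Python sketch that tests whether a MAC address falls under the 00:11:32 prefix mentioned above. The helper names are made up for illustration, and the second MAC in the usage part is just an arbitrary example value:

```python
import re


def normalize_mac(mac: str) -> str:
    """Return the MAC as lowercase, colon-separated pairs (aa:bb:cc:dd:ee:ff)."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def has_prefix(mac: str, prefix: str = "00:11:32") -> bool:
    """Check whether the MAC starts with the given three-byte prefix."""
    return normalize_mac(mac).startswith(prefix.lower())


if __name__ == "__main__":
    for mac in ("00:11:32:AA:BB:CC", "00-0C-29-12-34-56"):
        print(mac, has_prefix(mac))
```

This only checks the prefix; it says nothing about whether DSM will accept the address for HA.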


Got HA up and running by adding serial ports and making sure the server names were unique, BUT when I add a disk it fails. This error was in the logs:

 

Jun 2 22:14:15 SAN-HA volumehandler.cgi: ha_space.cc:4997 part size fake:1.

Jun 2 22:14:15 SAN-HA volumehandler.cgi: ha_space.cc:5049 sec:0.

Jun 2 22:14:16 SAN-HA volumehandler.cgi: storage_util.cpp:148 SYNOHACheckRemoteHddDL() failed

Jun 2 22:14:16 SAN-HA volumehandler.cgi: volumehandler.cpp:1384 HAValidRemote() failed


Good news...

With the latest active XPEnoboot 5.2.5592.2 on 2 different PCs (old Dell 745s), each with one 200 GB disk, 2 GB RAM and 2 network interfaces (one integrated and one Intel Pro100 desktop adapter), the cluster creation process completed correctly!

I noticed that the added network adapter is essential for completing the process. With only the normal OEM adapter, the process gave an error during the first phase.

The configuration is really simple...

Create 2 USB keys from the XPEnoboot img file; on the second key, modify the cfg file to change the serial number (the last digit is 3, I changed it to 5).

Install the Synology system normally on both systems.

On the first network adapter, configure a static IP address. Leave the second adapter (for the heartbeat) on DHCP.

Connect the second adapters (the heartbeat) of both systems directly with a crossover cable (do not go through a switch, because the process changes the MTU value to 9000).

Create a volume on each system in standard mode (the Synology RAID mode SHR isn't supported); in my case I have only one disk, without RAID protection.

Launch the HA procedure, entering the second IP, user and password. The process completed successfully.

I noticed that the disk slot marked as occupied in this case (with this Dell PC) is number 1, but on other hardware where I tried to install XPEnology it is number 3. Maybe this difference matters for the procedure to complete correctly.

Bye.


Hi, thanks for the info. From what I read, it seems hard to get this working in a virtual environment. Maybe at the weekend I will have some spare time to try.


Hi All

 

I was facing the same problem, but solved it with the steps below.

(version XPEnoboot 5.2-5644)

1: Modify the S/N; make sure both XPEnology NASes have different SNs

2: Modify the MAC address; make sure it starts with 00:11:29:XXXXX. It is easy to change in ESXi

3: The virtual PC must have a serial port (otherwise you will see the "operation fail, please re-login again" message)

4: The HA setup goes very smoothly if you haven't created a volume at all.

But unfortunately, this introduces a new problem.

5: It shows "operation fail, please make sure the passive server is online" if I try to create a volume after HA is built.

Note:

(4): If you create a volume before configuring the HA cluster, then HA will fail

Any ideas?

How can I check the HA log file?

Thanks
