flyride

Running 6.2.3 on ESXi? Synoboot is BROKEN, fix available


Posted (edited)

NOTE: This problem manifests consistently when running on ESXi, but many have also encountered problems with synoboot devices on baremetal installs of 6.2.3.  The fix can be applied safely on baremetal installs and resolves the issue there as well.

 

TL;DR:

  1. When running DSM 6.2.3 under ESXi, Jun's 1.03b and 1.04b bootloaders fail to build /dev/synoboot
    (this can be fixed by installing an extracted script from the loader to re-run after the boot has completed)
  2. DSM 6.2.3 displays SATA devices (such as the bootloader on 1.04b) that are mapped beyond the MaxDisks limit, when previous versions did not
  3. The DSM 6.2.3 update rewrites the synoinfo.conf disk port bitmasks, which may break some high-disk-count arrays and cause odd behavior with the bootloader device

 

Background:

 

Setting the PID/VID for a baremetal install allows Jun's loader to pretend that the USB key is a genuine Synology flash loader.  On an ESXi install, there is no USB key - instead, the loader runs a script to find its own boot device, and then remakes it as /dev/synoboot. This was very reliable on 6.1.x and Jun's loader 1.02b. But moving to DSM 6.2.x and loaders 1.03b and 1.04b, there are circumstances when /dev/synoboot is created and the original boot device is not suppressed. The result is that sometimes the loader device is visible in Storage Manager. Someone found that if the controller was mapped beyond the maximum number of disk devices (MaxDisks), any errant /dev/sd boot device was suppressed.  Adjusting DiskIdxMap became an alternative way to "hide" the loader device on ESXi and Jun's latest loaders use this technique.

 

Now, DSM 6.2.3: The upgrade changes at least two fundamental DSM behaviors:

  1. SATA devices that are mapped beyond the MaxDisks limit are no longer suppressed, including the loader (appearing as /dev/sdm if DiskIdxMap is set to 0C)
  2. The disk port configuration bitmasks in synoinfo.conf (internalportcfg, usbportcfg and esataportcfg) are rewritten and, on 1.04b, no longer match the default MaxDisks=16. NOTE: If you have more than 12 disks, it will probably break your array and you will need to edit them back (and that's not just an ESXi issue)!
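To make the bitmask arithmetic concrete, here is a minimal sketch. It assumes that each bit in a port bitmask stands for one disk slot, counting up from bit 0; the post does not spell this out, so treat it as an assumption. mask_for_slots is a hypothetical helper, not part of DSM or the loader.

```shell
#!/bin/sh
# Hypothetical helper: print a hex mask with the lowest N bits set.
# Assumes one bit per disk slot, starting at bit 0.
mask_for_slots() {
    printf '0x%x\n' $(( (1 << $1) - 1 ))
}

mask_for_slots 12   # mask covering 12 slots
mask_for_slots 16   # mask covering 16 slots (the 1.04b default MaxDisks in the post)
```

Under that assumption, a mask narrower than the number of disks in the array would leave the highest-numbered disks out, which may be what goes wrong on arrays with more than 12 disks.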

Also, when running under ESXi, DSM 6.2.3 breaks Jun's loader synoboot script such that /dev/synoboot is not created at all. Negative impacts:

  1. The loader device might be accidentally configured in Storage Manager, which will crash the system
  2. The loader partitions may inadvertently be mapped as USB or eSATA folders in File Station and become corrupted
  3. Absence of /dev/synoboot devices may cause future upgrades to fail, when the upgrade wants to modify rd.gz in the loader (often, ERROR 21)

Unpacking Jun's synoboot script reveals that it harvests the device nodes, deletes the devices altogether, and remakes them as /dev/synoboot. It tries to identify the boot device by looking for a partition smaller than the smallest array partition allowed. That is an ambiguous way to identify the device, and something new in 6.2.3 causes it to fail in the early-boot system state. There are a few technical configuration options that can cause the script to select the correct device, but they are difficult and depend on loader version, DSM platform, and BIOS/EFI boot.
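The "small partition" heuristic can be illustrated roughly as follows. This is not Jun's code; it is a sketch that assumes the partition list is available in /proc/partitions format (sizes in 1 KiB blocks). small_partitions is a hypothetical helper, and both the 100 MiB threshold and the sample data are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not Jun's code: list sdX partitions whose size is
# below a threshold. Input is text in /proc/partitions format
# (major, minor, size in 1 KiB blocks, name), read from stdin.
small_partitions() {
    awk -v max="$1" '$4 ~ /^sd[a-z]+[0-9]+$/ && ($3 + 0) < max { print $4 }'
}

# Made-up sample: one 32 GiB disk with a system partition, plus a 50 MiB loader disk.
sample='major minor  #blocks  name

   8        0   33554432 sda
   8        1    2490240 sda1
   8      192      51200 sdm
   8      193      15360 sdm1
   8      194      30720 sdm2
   8      195       4096 sdm3'

# 102400 blocks (100 MiB) is an arbitrary example threshold.
printf '%s\n' "$sample" | small_partitions 102400
```

On a live system the same helper could be fed with `small_partitions 102400 < /proc/partitions`. The sketch also shows why the strategy is ambiguous: any other small partition that happens to be present at that moment would match too.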

 

However, if Jun's script is re-run after the system is fully started, everything is as it should be. So extracting the script from the loader, and adding it to post-boot actions is a universal solution to this problem:

  1. Download the attached FixSynoboot.sh script
  2. Copy the file to /usr/local/etc/rc.d
  3. chmod 0755 /usr/local/etc/rc.d/FixSynoboot.sh

Thus, Jun's own code will re-run after the initial boot, once whatever system initialization conditions broke the first run of the script no longer apply. This solution works with either 1.03b or 1.04b and is simple to install. It should be considered required for ESXi running 6.2.3, and it won't hurt anything if installed or ported to another environment.
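The copy and chmod steps above can be wrapped in one small function. This is only a sketch; install_rc_script is a hypothetical helper name, and the upload folder in the commented example is just a placeholder path.

```shell
#!/bin/sh
# Hypothetical helper: copy a startup script into an rc.d directory and
# make it executable (mode 0755), mirroring the copy and chmod steps above.
install_rc_script() {
    src=$1
    dest_dir=$2
    cp "$src" "$dest_dir/" || return 1
    chmod 0755 "$dest_dir/$(basename "$src")"
}

# On DSM, as root (the source folder is an example upload location):
# install_rc_script /volume1/folder/FixSynoboot.sh /usr/local/etc/rc.d
```

Keeping the destination as a parameter makes it possible to try the function against a scratch directory before touching /usr/local/etc/rc.d.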

FixSynoboot.sh

Edited by flyride
Updated FixSynoboot.sh to clean up volumetab to avoid spurious /var/log/messages

Posted (edited)

For those Linux newbs who need exact instructions on installing the script, follow the steps below.  Please be VERY careful with syntax, especially when working as root.

  1. If you have not turned on ssh in Control Panel remote access, do it

  2. Download putty or other ssh terminal emulator for ssh access

  3. Connect to your nas with putty and use your admin credentials.  It will give you a command line "$" which means non-privileged

  4. In File Station, upload FixSynoboot.sh to a shared folder.  If the folder name is "folder" and it's on Volume 1, the path in command line is /volume1/folder

  5. From command line, enter "ls /volume1/folder/FixSynoboot.sh" and the filename will be returned if uploaded correctly.  Case always matters in Linux.

  6. Enter "sudo -i" which will elevate your admin to root.  Use the admin password again. Now everything you do is destructive, so be careful.  The prompt will change to "#" to tell you that you have done this.

  7. Copy the file from your upload location to the target location "cp /volume1/folder/FixSynoboot.sh /usr/local/etc/rc.d"

  8. Make the script executable by "chmod 0755 /usr/local/etc/rc.d/FixSynoboot.sh"

  9. Now verify by "ls -la /usr/local/etc/rc.d/FixSynoboot.sh" and it should return something like this:
    -rwxr-xr-x  1 root root 2184 May 18 17:54 FixSynoboot.sh

The important part is the first -rwx which indicates it can be executed.

 

Now reboot the nas and FixSynoboot will be enabled.
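The check in step 9 can also be scripted before rebooting. A minimal sketch follows; mode_string and is_executable are hypothetical helper names, not DSM commands.

```shell
#!/bin/sh
# Hypothetical helper: print the 10-character mode string of a file,
# for example -rwxr-xr-x for a regular file with mode 0755.
mode_string() {
    ls -l "$1" | cut -c1-10
}

# Hypothetical helper: succeeds only if the path is a regular file
# that the current user may execute.
is_executable() {
    [ -f "$1" ] && [ -x "$1" ]
}

# Example on DSM:
# is_executable /usr/local/etc/rc.d/FixSynoboot.sh && echo "ready to reboot"
```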

Edited by flyride

Thank you very much @flyride for this extremely well-written article!! 💪👏

 

I will try your solution today on my second backup NAS, which runs as a VM under ESXi 7.0 (since yesterday 😀 ) on a HPE Microserver Gen8 (as DS3615 with 1.03b) and give some feedback.

 

So to make it clear. I will:
 

- install the Update 6.2.3 manually

- reboot DSM as part of the update process (as usual)

- afterwards I will see the loader as ESATA

- I will apply/install your script

- reboot again

=> and now the loader disappears as ESATA and everything is like before

 

Posted (edited)

It seems that I am unable to edit my first post so I have to make another one instead:

 

 

After applying `FixSynoboot.sh` the first time, the loader disappeared as an ESATA device, but at the same time I lost access to the serial console (which is my last resort for contact if something fails).

I found that my mistake was that I had just run `chmod 755 FixSynoboot.sh`.
I deleted `/usr/local/etc/rc.d/FixSynoboot.sh` and rebooted: access to the serial console was back again. :-)

At the reboot after deleting `FixSynoboot.sh` I saw some messages regarding the console again:

[341.514080] synobios write k to /dev/ttyS1 failed
[341.558121] synobios write k to /dev/ttyS1 failed
[341.591942] synobios write t to /dev/ttyS1 failed

 

The solution is to set the permissions correctly:

root@ds2:~# chmod 0755 FixSynoboot.sh
root@ds2:~# cp FixSynoboot.sh /usr/local/etc/rc.d/

(Just as you have written! :-) ).

 

After a second reboot I now get the message "External device External SATA Disk1 on ds2 was not ejected properly." and I see two new shares "satashare1-1" and "satashare1-2" with the contents of the loader:
 

# Rights and Owner of the script:
root@ds2:~# ls -lah /usr/local/etc/rc.d/
total 12K
drwxr-xr-x  2 root root 4.0K Apr 18 11:02 .
drwxr-xr-x 10 root root 4.0K Aug 11  2018 ..
-rwxr-xr-x  1 root root 1.6K Apr 18 11:02 FixSynoboot.sh


# remaining ESATA-Share after boot and running the script:
root@ds2:~# mount | grep satashare
/dev/sdm1 on /volumeSATA1/satashare1-1 type vfat (rw,relatime,uid=1024,gid=100,fmask=0000,dmask=0000,allow_utime=0022,codepage=fault,iocharset=default,shortname=mixed,quiet,utf8,flush,errors=remount-ro)
/dev/sdm2 on /volumeSATA1/satashare1-2 type vfat (rw,relatime,uid=1024,gid=100,fmask=0000,dmask=0000,allow_utime=0022,codepage=fault,iocharset=default,shortname=mixed,quiet,utf8,flush,errors=remount-ro)


# Contents of the shares are the Loader:
root@ds2:~# ls -lah /volumeSATA1/satashare1-1
total 2.6M
drwxrwxrwx 6 admin users  16K Jan  1  1970 .
drwxr-xr-x 6 root  root  4.0K Apr 18 11:13 ..
-rwxrwxrwx 1 admin users 2.5M Aug  1  2018 bzImage
drwxrwxrwx 3 admin users 2.0K Apr 18 11:13 @eaDir
drwxrwxrwx 3 admin users 2.0K Aug  1  2018 EFI
drwxrwxrwx 6 admin users 2.0K Aug  1  2018 grub
-rwxrwxrwx 1 admin users  103 Apr 25  2019 GRUB_VER
-rwxrwxrwx 1 admin users  225 Aug  1  2018 info.txt
drwxrwxrwx 2 admin users 2.0K Apr 18 10:27 @tmp

root@ds2:~# ls -lah /volumeSATA1/satashare1-2
total 11M
drwxrwxrwx 4 admin users  16K Jan  1  1970 .
drwxr-xr-x 6 root  root  4.0K Apr 18 11:13 ..
-rwxrwxrwx 1 admin users  111 Apr 18 10:25 checksum.syno
drwxrwxrwx 3 admin users 2.0K Apr 18 11:13 @eaDir
-rwxrwxrwx 1 admin users 1.9M Aug  1  2018 extra.lzma
-rwxrwxrwx 1 admin users   55 Apr 18 10:25 grub_cksum.syno
-rwxrwxrwx 1 admin users 5.9M Apr 18 10:25 rd.gz
-rwxrwxrwx 1 admin users  512 Nov 26  2018 Sone.9
drwxrwxrwx 2 admin users 2.0K Apr 18 10:27 @tmp
-rwxrwxrwx 1 admin users 3.0M Apr 18 10:25 zImage

root@ds2:~# ls -lah /volumeSATA1/satashare1-3
total 8.0K
drwxrwxrwx 2 root root 4.0K Apr 18 11:47 .
drwxr-xr-x 6 root root 4.0K Apr 18 11:47 ..


I do not know where my error is, or if it's an issue with loader 1.03b (you wrote that you took the parts of the script from loader 1.04b).

 

My loader was not edited in any way besides the usual PID/VID and MAC entries.

Then I started the script manually as root and the ESATA-Shares went away:

# manually start the script:
root@ds2:~# ./FixSynoboot.sh start

# check ESATA-mounts afterwards:
root@ds2:~# mount | grep satashare

# check contents of ESATA-Shares:
root@ds2:~# ls -lah /volumeSATA1/satashare1-2
total 8.0K
drwxrwxrwx 2 root root 4.0K Apr 18 11:13 .
drwxr-xr-x 6 root root 4.0K Apr 18 11:13 ..

root@ds2:~# ls -lah /volumeSATA1/satashare1-1
total 8.0K
drwxrwxrwx 2 root root 4.0K Apr 18 11:13 .
drwxr-xr-x 6 root root 4.0K Apr 18 11:13 ..

 

In DSM I see "External device External SATA Disk1 on ds2 was not ejected properly."

 

I think that the script is sensitive to when, or in what order, it runs, or that a DSM unmount command is missing.

 

I found a shell unmount command for USB-Devices (e.g. `/usr/syno/bin/synousbdisk -umount sdk1`) but I haven't found a similar command for correctly unmounting ESATA-shares in DSM from the shell. :-(

 

I just tried:

root@ds2:~# umount /dev/sdm1
root@ds2:~# umount /dev/sdm2

 

It works, but `satashare1-1` and `satashare1-2` remain as empty shares in DSM.

 

After another reboot there is only one mount left:

root@ds2:~# mount | grep sdm
/dev/sdm1 on /volumeSATA1/satashare1-1 type vfat (rw,relatime,uid=1024,gid=100,fmask=0000,dmask=0000,allow_utime=0022,codepage=fault,iocharset=default,shortname=mixed,quiet,utf8,flush,errors=remount-ro)

 

 

 

So I think there is a timing or ordering issue left with the script, or we need to clean up the auto-mounted "ESATA" shares afterwards.
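To see which loader partitions are still mounted as ESATA shares, the mount output can be filtered. This is only a sketch and not the method used by FixSynoboot.sh; sata_share_mounts is a hypothetical helper, and the sample lines abbreviate the mount options shown earlier in this post.

```shell
#!/bin/sh
# Hypothetical helper: from `mount` output on stdin, print the device and
# mount point of every entry mounted under /volumeSATA.
sata_share_mounts() {
    awk '$2 == "on" && $3 ~ /^\/volumeSATA/ { print $1, $3 }'
}

# Abbreviated sample in the same shape as the mount output above.
sample_mounts='/dev/md0 on / type ext4 (rw,relatime)
/dev/sdm1 on /volumeSATA1/satashare1-1 type vfat (rw,relatime)
/dev/sdm2 on /volumeSATA1/satashare1-2 type vfat (rw,relatime)'

printf '%s\n' "$sample_mounts" | sata_share_mounts
```

On a live system this would be `mount | sata_share_mounts`. Listing the mounts is the safe part; as the experiments in this post show, a plain umount still leaves the empty shares visible in DSM.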

 

 

EDIT:

 

So I am now pretty sure that there is really some time or order related issue.

 

- I inserted the function "FixSynoboot" into "/etc/rc.local" so it runs a little bit earlier in the boot process. :-D

 

- Now I don't have any mounts in DSM regarding ESATA and it looks clean. :-)

 

- But I lost the serial console again. Not the entire console (I must correct myself) but there is no login-prompt anymore. Messages are still written to the console but I am unable to login.

 

- I can access the console via ssh without problems. So there must be a link between the loader and the local serial console.

 

- I will comment out my edits in "/etc/rc.local" and will use the standalone script in "/usr/local/etc/rc.d/FixSynoboot.sh" again. Better to have some entries left in DSM than no login prompt anymore via the serial console.

 

Maybe you have an idea how to solve the remaining issues. :-)

 

Edited by Balrog
News about remaining issues

Posted (edited)

So, a couple of comments.

 

There isn't anything in Jun's script that has anything to do with the console.  It only manipulates devices that have a partition structure.  If the console is failing it has to be a general problem with 6.2.3 and the loader, or something else is going on.  Can you reboot a few times and see if that behavior is reliable? 

 

That code from 1.04b is identical to 1.03b and is probably the same back to the first Jun loader.

 

There are really two problems being solved here - First, the synoboot device is being generated as a generic block device in the range reserved for esata. If a device exists too late in the boot sequence, DSM sees the device and tries to mount the share.  So we want the device gone before that.  Second, we need the boot device to be present as synoboot.

 

When you are checking on status, run ls /dev/syno* and you should see synobios and three synoboot entries if everything is working right.
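That status check can be turned into a small pass/fail test. A minimal sketch, assuming the expected node names are synobios, synoboot, synoboot1 and synoboot2 (as they appear in the ls output later in this thread); check_syno_nodes is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: report which of the expected device nodes are missing
# from a directory (default /dev). Returns non-zero if any are missing.
check_syno_nodes() {
    dir=${1:-/dev}
    missing=0
    for node in synobios synoboot synoboot1 synoboot2; do
        if [ ! -e "$dir/$node" ]; then
            echo "missing: $dir/$node"
            missing=1
        fi
    done
    return $missing
}

# Example on DSM:
# check_syno_nodes && echo "synoboot devices look fine"
```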

 

I was hoping not to have to recommend this, but if you either change DiskIdxMap=1600 in grub.cfg, or change the portcfg bitmasks in /etc/synoinfo.conf, that will keep the esata shares from generating, and you can keep running the synoboot fix in /usr/local/etc/rc.d
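For reference, the relationship between a DiskIdxMap value and the resulting device name can be sketched like this. It assumes that each pair of hex digits is the starting disk index for one controller and that index 0 corresponds to sda; this matches 0C showing up as /dev/sdm, but it is an interpretation rather than documented behavior. idx_to_dev is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: map a two-digit hex disk index to an sdX name,
# assuming index 0 is sda, 1 is sdb, and so on.
# Only single-letter names are handled (hex 00 to 19); input must be valid hex.
idx_to_dev() {
    n=$(( 0x$1 ))
    [ "$n" -ge 0 ] && [ "$n" -le 25 ] || return 1
    echo "sd$(echo abcdefghijklmnopqrstuvwxyz | cut -c$(( n + 1 )))"
}

idx_to_dev 0C   # the value from the existing grub.cfg
idx_to_dev 16   # the first pair of DiskIdxMap=1600
```

Under that assumption, 1600 places the first controller at index 22 and the second at index 0.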

 

Then all we have left is the console problem, which I really think is unrelated here but warrants investigation.  AFAIK /dev/ttyS1 is the general-purpose I/O port in Syno hardware to manage the lights and beep and fanspeed and doesn't do anything in XPenology.  I think the console is /dev/ttyS0.  It might be informative if you did a dmesg | fgrep tty

Edited by flyride

Posted (edited)

Thank you for your suggestions!

 

I will compare everything from the ground up and give feedback. Just to make sure that I do not have any settings which are not default in the grub.cfg. :-)

 

These are my current entries in the grub.cfg:

set common_args_3615='syno_hdd_powerup_seq=0 HddHotplug=0 syno_hw_version=DS3615xs vender_format_version=2 console=ttyS0,115200n8 withefi elevator=elevator quiet syno_port_thaw=1'

set sata_args='sata_uid=1 sata_pcislot=5 synoboot_satadom=1 DiskIdxMap=0C SataPortMap=114 SasIdxMap=0'

It has been a long time since I last had to look into grub.cfg, and I am not sure whether this is modified or not.
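When comparing grub.cfg settings between systems, it can help to pull single values out of these argument strings. A minimal sketch; get_arg is a hypothetical helper, and the string is copied from the sata_args line above.

```shell
#!/bin/sh
# Hypothetical helper: print the value of key=value from a space-separated
# argument string such as the sata_args line in grub.cfg.
get_arg() {
    key=$1
    for word in $2; do
        case $word in
            "$key"=*) echo "${word#*=}"; return 0 ;;
        esac
    done
    return 1
}

sata_args='sata_uid=1 sata_pcislot=5 synoboot_satadom=1 DiskIdxMap=0C SataPortMap=114 SasIdxMap=0'
get_arg DiskIdxMap "$sata_args"
get_arg SataPortMap "$sata_args"
```

The function returns non-zero when the key is absent, so it can also be used to test whether a setting is present at all.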

 

I have 2 SATA disks connected to the VM:

- "SATA Controller 0" with a 32 GByte VMDK connected as "SATA 0:0" (for some packages). This will be seen as /volume1 in DSM.

- "SATA Controller 1" with the 50 Mbyte-Loader connected as "SATA 1:0"

- and finally 1 Dell H200 flashed to IT-Mode with pass through to the VM (these are the data-disks) which will be seen as /volume2 in DSM.

 

Maybe this is the root of the issue. But the VM runs fine for about 2 years now. :-D

 

I just tried to swap SATA 0:0 with 1:0 and change the boot order in the BIOS of the VM:

- DSM is booting but /volume1 is not found anymore.

So I changed the settings back to the original ones.

 

Besides these Tests I rebooted a few times and till now I have a stable serial console on  /dev/ttyS0 just like described in grub.cfg:  "console=ttyS0,115200n8".

 

EDIT:

 

The syno-devices are looking good with the script in "/usr/local/etc/rc.d/FixSynoboot.sh" and the above settings for the VM:

root@ds2:~# ls /dev/syno*
/dev/synobios  /dev/synoboot  /dev/synoboot1  /dev/synoboot2

 

Edited by Balrog
New Info


I was able to duplicate your report with the esata shares populating before the loader device is killed.  Apparently there are some differences in boot timing between 918 and 3615, I guess.

 

The DiskIdxMap=1600 should solve the esata problem on 3615 for now, but I'm searching for a simpler fix to the script.

Posted (edited)

Script updated to gracefully remove loader partitions mounted as esata shares if they exist.  No need to edit DiskIdxMap with this new version.

Edited by flyride

Posted (edited)

Thank you very much for the new updated version of your script!
It works absolutely fabulously! :-)

I just exchanged the first version with the new one and rebooted a few times:

 

root@ds2:~# chmod 0755 FixSynoboot.sh
root@ds2:~# cp FixSynoboot.sh /usr/local/etc/rc.d/
root@ds2:~# reboot

 

- the script works very well! :-)
- the login on the serial console is now rock stable and has not disappeared once since the new script has been in use
- no remnants of the loader are left in DSM :-)

 

- in the console of course I can see the loader still as block device:

 

root@ds2:~# lsblk
NAME                            MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
[ .... other block devices .... ]
sdm                               8:192  0   50M  0 disk
├─sdm1                            8:193  0   15M  0 part
├─sdm2                            8:194  0   30M  0 part
└─sdm3                            8:195  0    4M  0 part

 

or a little bit longer:
 

root@ds2:~# lsblk --fs
NAME                            FSTYPE            LABEL                      UUID                                   MOUNTPOINT
[ .... other block devices .... ]
sdm
├─sdm1                          vfat                                         00C2-16A7
├─sdm2                          vfat                                         2016-2016
└─sdm3

 

I installed "lsblk" as ipkg as it is my favorite command to list block devices in a much more human readable way. :-)

 

The loader is correctly available:
 

root@ds2:~# ls /dev/syno*
/dev/synobios  /dev/synoboot  /dev/synoboot1  /dev/synoboot2


So for now I can only say "Thank you very much!" and I have a lot to do to understand the new parts of the script. :-D
I never had contact with "uevents" before and it looks that my unmount-commands are not so bad at all.
"synocheckshare" is the next command which I will have a deeper look also. :-)

 

I will run this for a few days and then I am going to update my main NAS.

Have a nice day!

Edited by Balrog


Can someone check if the SMART info is showing correctly? I passed through the onboard lsi2308 and the SMART info is missing (it was there before the update).


Please post this on an appropriate thread, SMART info has nothing to do with synoboot.


Hello Flyride,

 

I updated today on DSM-Backup (ESXi 6.7 U2) and used your script for the satashare, and it's working well :)

 

Thanks!

 

 


I'm completely new to the XPEnology project; is this the type of significant change/update that will lead to an updated loader, or should I follow the instructions above? Or would it behoove me to wait for a few weeks for things to settle?


The existing loader works, and the individual who authored it has not been active since January (and historically is not active, period).  So I would not expect any updated loader.

 

You have to make your own decision about when to pull the trigger on upgrades as they do take some time to sort out.  That said, 6.2.3 seems like a good release for XPEnology as many of the interim hacks that were necessary for 6.2.1 and 6.2.2 are not needed anymore (in fact, most of the problems people are having with 6.2.3 are because they still have those hacks in place and they no longer work properly).

 

That said, this script fixes a problem unique to running XPEnology under ESXi.  The issue was partially a problem before and folks lived with it. This is a complete fix despite the band-aid style implementation.

 


I just installed 6.2.3 on ESXi 7.0.0 with the script and it worked fine!
 

Now I have moved my 5 disks from the N54L to ESXi in the right order, RDM-mapped to the VM, and after a reboot I have my old server running. Not even a migration was necessary!

Posted (edited)

Where has the script gone? It's not available for download anymore :-(

Edit: Oh f***, I had to login to make it available.

Nevermind!

Edited by Tattoofreak


Thank you flyride, I deployed the script according to the steps and it worked. After applying the script, /dev/synoboot is available.


I noticed that the eSATA volume removal method left a reference in a volume table which caused constant errors to be posted to /var/log/messages.  Also under certain circumstances the folder structure is left intact from the eSATA volume mount (this may actually be normal behavior with eSATA/USB shares).

 

Neither of these are issues that affect functionality or are visible in normal operations, but I updated the script to clean them both up.


 

So, what are the steps for installing this update?

Should I install this script before or after installing the update?

 

In the previous post #4, it seems that I should install this script after installing the update in DSM, but won't there be a problem because the VM will boot once with no script installed?

Posted (edited)

It won't hurt to install it on your current system.  But it can be installed after, it isn't a problem to boot without it, you will just have "ESATA" shares exposing your loader until you install it.

 

If you think about a clean install (or a major version upgrade involving a loader change) there is no way to install the script ahead of time, so it must work without it.

Edited by flyride

