DSM broke after a Proxmox 6.x to 7.x upgrade. I fixed it, but it keeps breaking again

Hi all,


DSM keeps breaking after a Proxmox 6 to 7 upgrade: I fix it, and then it breaks again. Jun's loader shows this on the serial terminal:

"lz failed 9
alloc failed
Cannot load /boot/zImage"


But please let me first explain all the details, and do read to the end, because the situation is very strange...


I'm using JUN'S LOADER v1.03b - DS3617xs in a Proxmox VM (on HP Gen8), with DSM 6.2.3-25426. I've been using this config for months without any problem on Proxmox **6.x** (this seems relevant).


The other day I decided to upgrade Proxmox to **7.x**. I upgraded Proxmox and rebooted. All my VMs booted up perfectly... except the DSM one :(


I noticed that I couldn't reach my network shares on DSM, and after a quick investigation I discovered that DSM had booted in "installation mode", showing this message in the web interface:

"We've detected that the hard drives of your current DS3617xs has been moved from a previous DS3617xs, and installing a newer DSM is required before continuing.".


I thought the DSM partition might have been corrupted in some way or, more likely, that Proxmox 7 introduced some kind of "virtual hardware change", so DSM now thinks it has been booted on different hardware. The latter seems plausible because Proxmox 7 uses QEMU 6.0, while the latest Proxmox 6 uses QEMU 5.2. Other changes in the new Proxmox version could also be involved (for instance, I've read that the MAC address assigned to a bridge interface can now be different).
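
The idea that the virtual hardware changed could be checked by comparing the QEMU command line Proxmox generates before and after the upgrade (for example, the output of `qm showcmd 100` saved on each version). Below is a minimal Python sketch of such a comparison. The function names are hypothetical, and the grouping rule (every token starting with `-` opens a new option) is a simplifying assumption, not how QEMU itself parses arguments.

```python
from __future__ import annotations

import shlex


def option_groups(cmdline: str) -> list[str]:
    """Split a QEMU command line into "-flag value" groups.

    Assumption: a token starting with "-" begins a new option and any
    following tokens belong to it. Good enough for a rough comparison.
    """
    groups: list[str] = []
    for token in shlex.split(cmdline):
        if token.startswith("-") or not groups:
            groups.append(token)
        else:
            groups[-1] += " " + token
    return groups


def diff_cmdlines(old: str, new: str) -> tuple[list[str], list[str]]:
    """Return (removed, added) option groups between two command lines."""
    old_groups = option_groups(old)
    new_groups = option_groups(new)
    removed = [g for g in old_groups if g not in new_groups]
    added = [g for g in new_groups if g not in old_groups]
    return removed, added
```

Calling `diff_cmdlines(old_text, new_text)` on the two saved outputs would list only the options that differ, such as a changed `-machine` type, which narrows down what DSM might be seeing as new hardware.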


What I did was:

1/ Power off DSM VM.

2/ Back up partition 1 of all 4 of my disks (i.e. the md0 array, which contains the DSM OS).

3/ Power on DSM VM.

4/ I followed the instructions in the web interface, chose "Migrate" (which basically keeps my data and config untouched), selected a manual installation of DSM and uploaded the .pat corresponding to the very same version I was already running before the problem, i.e. DSM_DS3617xs_25426.pat (DSM 6.2.3-25426). I didn't want to downgrade, and of course I shouldn't upgrade, because the next version is 6.2.4, which is incompatible with Jun's loader.

5/ Migration finished, DSM rebooted and... FIXED!!! :-) DSM was working again with no loss of data or config.
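
The partition backup in step 2 can be sketched in Python as a chunked copy that also records a SHA-256 checksum, so a later restore can be verified. This is only an illustration under assumptions: `copy_with_checksum` is a hypothetical helper, the device and image paths in the usage note are placeholders, and it has to run as root on the Proxmox host with the DSM VM powered off.

```python
import hashlib


def copy_with_checksum(src: str, dst: str, chunk_size: int = 4 * 1024 * 1024) -> str:
    """Copy src to dst in chunks and return the SHA-256 of the copied bytes.

    src may be a partition device node (placeholder: /dev/sdX1) and dst an
    image file, or the other way round for a restore.
    """
    digest = hashlib.sha256()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            block = fin.read(chunk_size)
            if not block:
                break
            fout.write(block)
            digest.update(block)
    return digest.hexdigest()
```

For example, `copy_with_checksum("/dev/sdX1", "/root/dsm-backup/sdX1.img")` would be run once per disk, with the real partition names substituted. A restore would be the same call with the arguments swapped, and it would be worth double-checking the target device name before writing to it.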


*But* another problem arose later... When my server (Proxmox) was rebooted again, the DSM VM was broken again, but this time in a very different way: I couldn't ping my DSM VM, and after some investigation I concluded that the DSM kernel was not being loaded at all. Indeed, I attached a serial terminal to the DSM VM and could see Jun's loader stuck at the very beginning with these messages:

"lz failed 9
alloc failed
Cannot load /boot/zImage"


I have no idea why this is happening or what these messages really mean (well, it seems obvious the kernel is not being loaded, but I don't know why)!!


I managed to fix it again (yeah xD) by:

1/ Power off DSM VM.

2/ Restore partition 1 on all my disks from the backup I took when solving the first problem :)

3/ Power on DSM VM

4/ I confirmed loader worked again and that I got to the same point where DSM needed a migration

5/ I "migrated" exactly in the same way I had done minutes before :). FIXED!!


What's the problem then? Easy... every time I reboot my server (so Proxmox reboots), my DSM VM breaks again with the second error ("lz failed...", etc.), i.e. the kernel is not being loaded by the loader. I can temporarily fix it, but sooner or later I'll need to reboot Proxmox again and... boom again :-(
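
One way to narrow this down, offered only as a sketch, is to check whether the loader image file itself changes between a working state and a broken one. The hypothetical helper below compares two files and reports the offset of the first differing byte. It assumes a copy of the loader image was saved while DSM was booting correctly; nothing here confirms that the image actually changes.

```python
from typing import Optional


def first_difference(path_a: str, path_b: str, chunk_size: int = 1024 * 1024) -> Optional[int]:
    """Return the byte offset where two files first differ, or None if identical.

    If one file is a strict prefix of the other, the returned offset is the
    length of the shorter file.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Locate the exact byte inside this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                return offset + min(len(a), len(b))
            if not a:
                return None  # Both files ended with no difference found.
            offset += len(a)
```

Running it on the saved copy and the current image after a failed boot would show either `None` (the image is unchanged, so the cause lies elsewhere) or an offset that can be mapped to a partition inside the image.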


Are any of these problems familiar to you? Any clue about how to solve this or, at least, some ideas about where I should focus my investigation?


PLEASE, help!! :_(



PS: My Proxmox VM config (a.k.a. qemu config) (with some info redacted):



args: -device 'nec-usb-xhci,id=usb-ctl-synoboot,addr=0x18' -drive 'id=usb-drv-synoboot,file=/var/lib/vz/images/100/synoboot_103b_ds3617_roman.img,if=none,format=raw' -device 'usb-storage,id=usb-stor-synoboot,bootindex=1,removable=off,drive=usb-drv-synoboot' -netdev type=tap,id=net0,ifname=tap100i0 -device e1000e,mac=00:XX:XX:XX:XX:XX,netdev=net0,bus=pci.0,addr=0x12,id=net0

bios: seabios

boot: d

cores: 4

cpu: IvyBridge

hotplug: disk,network,usb

memory: 2048

name: NAS-Synology

numa: 0

onboot: 1

ostype: l26

sata0: /dev/disk/by-id/ata-WDC_WD30EFRX-68EUZN0_WD-WCC4NXXXXXXX,size=2930266584K

sata1: /dev/disk/by-id/ata-WDC_WD30EFRX-68EUZN0_WD-WCC4NYYYYYYY,size=2930266584K

sata2: /dev/disk/by-id/ata-WDC_WD30EFRX-68EUZN0_WD-WCC4NZZZZZZZ,size=2930266584K

sata3: /dev/disk/by-id/ata-WDC_WD40EFRX-68N32N0_WD-WCC7KAAAAAAA,size=3907018584K

scsihw: virtio-scsi-pci

serial0: socket

smbios1: uuid=9ba1da8f-1321-4a5e-8b00-c7020e51f8ee

sockets: 1

startup: order=1

usb0: host=5-2,usb3=1
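
To make a config like the one above easier to inspect, here is a small Python sketch that parses the `key: value` lines and pulls out the `-device` entries hidden inside `args`. The function names are hypothetical, and the parsing rules (skip `#` lines, stop at the first `[section]` header) are assumptions about the file layout rather than an official parser.

```python
import shlex


def parse_vm_config(text: str) -> dict:
    """Parse "key: value" lines from a Proxmox VM config into a dict."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("["):
            break  # Assumed: later sections describe snapshots, not the live VM.
        key, _, value = line.partition(":")
        config[key.strip()] = value.strip()
    return config


def qemu_devices(config: dict) -> list:
    """Return the values passed to each -device option in the raw args line."""
    tokens = shlex.split(config.get("args", ""))
    return [tokens[i + 1] for i, tok in enumerate(tokens[:-1]) if tok == "-device"]
```

Applied to this VM, `qemu_devices(parse_vm_config(text))` would list the xHCI controller, the USB storage device carrying the loader, and the e1000e NIC, which are the pieces added by hand and therefore the first candidates to compare against what Proxmox 7 generates on its own.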


