DSM 6.2.4 / 7 loader - major kernel changes?


I decided to pursue the franken-DSM route: a newer VDSM kernel (which supports older CPUs!) with the DS3617xs OS.



Notable Changes

  • Features only in DS:
    • HA (can be fixed, but does anyone use that with xpenology?)
    • Disk compatibility database (this is most likely easily fixable with manual DB update from settings)
  • Drivers only in DS: 
    • igb (various Intel Gigabit ethernets)
    • ip_tunnel
    • ixgbe (Intel 82598/82599 10Gb ethernets)
    • leds-lp3943
    • mpt2sas (LSI HBA)
    • mpt3sas (LSI HBA)
    • mv14xx (Marvell 1475 HBA)
    • phy_alloc_0810_x64 (?)
    • synotty (Synology's custom serial driver)
    • e1000e
  • Drivers only in VDSM:
    • virtio (everything for running under KVM properly)
    • EHCI (native USB 2.0)
    • i40evf (Intel XL710 10/40Gb ethernets)
    • igbvf (Intel eth. passed via SR-IOV)
    • ixgbevf (Intel 10Gb eth. virtual functions passed via SR-IOV)
    • be2net (Emulex OneConnect 10Gb ethernets, never saw that before)
  • Driver support is overall better. I deliberately omitted some modules which weren't cross-present, since they changed names between the old and the new kernel but the functionality is still there. I bolded those which will be problematic. The biggest ones are probably those supporting LSI and Intel GbE, but these can most likely be compiled using a standard v4.4 tree.



Patching time!

  • The goal is to have a split-brain situation where as much of the stuff thinks it's a DS while DRMs are thinking it's a VDSM ;) 
  • Some of the stuff checks the platform using the platform identifier (unique key in synoinfo), but some of it checks the BIOS
  • After initial patching I've got:
  • I was able to get through disk formatting up to the moment where the installation actually takes place. The installation stalled with error 13. Here are the steps to debug such a thing if anybody struggles during development:
    • I started comparing installation from Jun's DS3617xs running under VM vs. vanilla VDSM kernel + 3617 PAT
    • The installer running under VDSM image is MUCH more verbose (e.g. it prints details about every file's code signature from synocodesign)
    • The only clue I was able to find is the following:
      # Jun's
      updater: updater.c:631 file [/tmp/bootmnt/VERSION] process done.
      updater: updater.c:641 checksum file updated by [/bin/cp -f /tmp/checksum.syno.tmp /tmp/bootmnt/checksum.syno]
      updater: updater.c:825 GRUB version file /tmp/bootmnt/GRUB_VER does not exist
      updater: updater.c:7221 ==== Finish flash update ====
      # VDSM-3617
      updater: updater.c:631 file [/tmp/bootmnt/VERSION] process done.
      updater: updater.c:641 checksum file updated by [/bin/cp -f /tmp/checksum.syno.tmp /tmp/bootmnt/checksum.syno]
      updater: updater.c:1655 Bios upgrade needed file not exist, skip BIOS updating
      updater: updater.c:7082 fail to upgrade BIOS
      updater: updater.c:727 Restoring [zImage]
      updater: updater.c:738 Copying file [/tmpData/synob@ckup/zImage] as [/tmp/bootmnt/zImage], result[0]
      updater: updater.c:727 Restoring [rd.gz]
      <and here it continues restoring files>


    • The failure seems to be related to the BIOS update:
      • The error message is... bogus according to what updater is actually doing
      • The phy_alloc_0810_x64 is one of the components of the flasher (not present in VDSM image)
      • The updater also verifies presence of H2OFFT-Lx64 and platform.ini files
      • If any of the three is missing the "Bios upgrade needed file not exist, skip BIOS updating" is triggered
      • Attempting to make sure that the BIOS update is at least attempted causes KVM CPU emulation crash - don't do that :lol:
        QEMU[1071]: KVM internal error. Suberror: 1
        QEMU[1071]: emulation failure
        QEMU[1071]: Code=kvm: /build/pve-qemu/pve-qemu-kvm-5.2.0/include/hw/core/cpu.h:660: cpu_asidx_from_attrs: Assertion `ret < cpu->num_ases && ret >= 0' failed.
      • Since there were more things to discover I did a simple hack to remove the bios file from the update after signatures are checked but before bios check is performed - good enough for a PoC:
        rm /tmp/checksum.syno.tmp
        while [ ! -f /tmp/checksum.syno.tmp ]; do true; done; rm '/tmpData/upd@te/bios.ROM'


    • After all these above I've got "Congratulations!! The update has been completed!! Do configupdate when next bootup." 


  • ... but the happiness after installation was short-lived as it welcomed me with an error I've never seen before:
    • Looking at the logs the update was applied correctly but something weird happens afterwards:
      linuxrc.syno executed successfully.
      Post init
      [   15.502431] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: barrier=1
      switch_root: error moving root
      START /linuxrc.syno
      mkdir: can't create directory '/tmpRoot': File exists

      It seems like it applies the update and then tries to chroot but it fails for some reason.

    • Debugging further I found an innocent-looking error which starts the cascade of doom:

      [   11.922345] VFS: opened file in mnt_point: (/dev), file: (/console)
      umount: can't umount /dev: Device or resource busy

      In essence, to move root successfully, stuff like /dev has to be moved first. If anything fails before that, switch_root will fail at the mount() syscall with a very cryptic error. As an aside, the notes in the switch_root linked here are worth reading too.

    • This should be an easy fix... and in fact Jun's patch had some provisioning for that (now it makes sense why he nuked console everywhere... quantum linux: it breaks when you're looking at it :lol:)

    • ...it turned out to not be so easy: after debugging for over a night I found that any attempts to remount rootfs from initramfs (which is the last step of the ramdisk/first step of actual OS boot) fail for no obvious reason. I think I tried everything and found that for some reason MS_MOVE doesn't want to work - maybe someone here will have any idea?

    • Currently I have a manual patch-list which gets me to the point of getting the OS properly verified, installed, and rebooted for the first time (and it should work further after I found the solution for that pesky problem above). After I verify it's working I will re-arrange that to a proper patch file and obviously open source.



More work in progress

Now I turned to looking into the kernel itself, as no matter which kernel is used (VDSM or DS) a module will be needed to fake VID/PID for bare metal. Also, if the DS kernel is used, the synobios checks (which @IG-88 mentioned above) will need to be stubbed. I also cannot find the list of PCI VID/PIDs which DSM checks anywhere... I swear I saw a list on the forum but I cannot find it now.


So, you guys deserved an update like 3 days ago... but I hate doing updates with bad news, so I decided to wait until I had good news ;)



On 6/5/2021 at 7:39 AM, Vortex said:

DSM checks hda1 checksums / sigs upon install.

If they fail, DSM will flag it and then won't do switchroot anymore.

I think you're mixing two different checksums here. There's a checksum of the PAT when it's uploaded and unpacked (which can be easily defeated) and there's a custom kernel-level checksum.



18 hours ago, ilovepancakes said:

@kiler129 Nice progress on your last 2 analysis posts! Did you happen to try DSM 7 with the same approach to see if it installs the same with your franken-DSM method, or did you only stick to 6.2.4 for now?

Thanks! I didn't experiment with DSMv7 as, the last time I checked, it wasn't publicly available and only invited people could access an early preview.
Franken-DSM is still on the table as this will give us a newer kernel but since I was able to defeat the kernel checking (details below) booting a ready-made distribution (i.e. a PAT from a DS) is probably more stable... or so I can guess. 




What's stopping ramdisk modification?

Initially I started writing a post on Saturday as I hoped the fix would be quite simple. Then I realized it isn't... and my browser crashed somewhere during that time, deleting my post (and now restoring my previous one instead). I will try to summarize in short (I have probably close to 10 pages of notes since Friday), since some of the things I will be writing for the 3rd time now.

The error I described in the previous post regarding MS_MOVE not working is actually caused by a custom mechanism implemented in the kernel which deliberately:

  • disables mount(MS_MOVE)
  • disables mount(MS_BIND)
  • forcefully enables the module signature requirement!


Everything starts here:

## init/initramfs.c
static int __init populate_rootfs(void)
{
	char *err = unpack_to_rootfs(__initramfs_start, __initramfs_size);
	if (err)
		/* ... */

#ifdef MY_ABC_HERE
	/* ... */
	if (hydro_sign_verify(sig, initrd_start, rd_len, ctx, pk)) {
		ramdisk_check_failed = true;
		printk(KERN_ERR "ramdisk corrupt");
	} else {
		ramdisk_check_failed = false;
		initrd_end -= hydro_sign_BYTES;
	}
	/* ... */


In essence the whole ramdisk is verified so it cannot be modified (or so they say ;)). A failed check will cause "ramdisk corrupt" to be present in dmesg output, while the boot continues "normally":

DiskStation> dmesg | grep ramdisk  
[    0.577409] ramdisk corrupt


This flag is then checked in mount syscall implementation:

## fs/namespace.c:do_mount()
	else if ((flags & MS_BIND) && ramdisk_check_failed)
		retval = -EPERM;
	else if ((flags & MS_MOVE) && ramdisk_check_failed)
		retval = -EPERM;



The same flag then enables the pesky module signature enforcement:

## kernel/module.c
static int module_sig_check(struct load_info *info, int flags)
{
	int err = -ENOKEY;
	const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1;
	const void *mod = info->hdr;

#ifdef MY_ABC_HERE
	sig_enforce |= ramdisk_check_failed;
#endif
	/* ... */

Normally the kernel is built with signature verification enabled (CONFIG_MODULE_SIG=y) but it is not enforced by default. In that state all modules signed by any of the keys present in the kernel (i.e. Synology's keys) can be loaded without problems. Theoretically you CAN load any other module too, if you can get it onto the box. While I didn't try, you could possibly download a module via curl and do an insmod foo.ko. However, as soon as the ramdisk (vd.rd) is modified in any way (e.g. by adding new modules), enforcement kicks in and the signature check will not allow such modules to load:

DiskStation> insmod hello.ko
insmod: can't insert 'hello.ko': Required key not available
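
To make the cascade easier to follow, here is a tiny userland model of the two kernel excerpts above. It is only a sketch: the function names do_mount_model() and module_sig_enforced() are made up for illustration, and nothing here touches the real kernel. The MS_BIND/MS_MOVE values are the standard Linux ones.

```python
import errno

# Mount flag values from the Linux UAPI headers (include/uapi/linux/fs.h).
MS_BIND = 4096
MS_MOVE = 8192


def do_mount_model(flags: int, ramdisk_check_failed: bool) -> int:
    """Mimic only the two Synology-specific branches quoted from do_mount()."""
    if (flags & MS_BIND) and ramdisk_check_failed:
        return -errno.EPERM
    if (flags & MS_MOVE) and ramdisk_check_failed:
        return -errno.EPERM
    return 0  # the real kernel would carry on with the actual mount here


def module_sig_enforced(sig_enforce: bool, ramdisk_check_failed: bool) -> bool:
    """Mimic `sig_enforce |= ramdisk_check_failed` from module_sig_check()."""
    return sig_enforce or ramdisk_check_failed


if __name__ == "__main__":
    for failed in (False, True):
        print(
            f"ramdisk_check_failed={failed}: "
            f"move={do_mount_model(MS_MOVE, failed)} "
            f"bind={do_mount_model(MS_BIND, failed)} "
            f"sig_enforce={module_sig_enforced(False, failed)}"
        )
```

One flag set during early boot is enough to break both switch_root (which needs MS_MOVE) and loading of unsigned modules, which is why it matters so much.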





unDRMing without full sources?

If Synology complied with GPL properly a simple patch to the kernel, cutting out the ramdisk/initramfs verification, would be sufficient. However due to MY_ABC_HERE nonsense this is not an option.


There are in essence two options:

  • finding weakness in their signature implementation
  • binary patching the kernel

The first option may be possible. Their implementation doesn't do the due diligence of initializing the hydrogen engine, which most likely causes huge cryptographic weaknesses. However, this is way over my amateur head, and the second option is (or seemed, see next section) simpler.

Fortunately, having even partial sources is very useful. Due to legalities I will not publish disassembled code in this post (as TECHNICALLY the binary is not GPLed). However, I will just say that loading it in NSA's Ghidra along with kallsyms reveals very nice code when entering populate_rootfs(). After cleaning up we're getting:

  signature_verify_result = elv_rb_find(/** ... **/);
  if (signature_verify_result == 0) {
    initrd_end = initrd_end + -0x40;
    ramdisk_check_failed = 0;
  } else {
    ramdisk_check_failed = 1;
  }

If we could only change ramdisk_check_failed in the "else" to 0.... it turns out it is possible and requires exactly one byte of a patch in the binary.
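
For anyone who wants to try this kind of patch on their own binary, below is a minimal sketch of a "safe" single-byte patcher. Everything specific in it is an assumption: the byte string is a synthetic stand-in, not taken from the Synology kernel, and the real offset is deliberately not given here. On x86-64 a store of an immediate byte to a RIP-relative address is commonly encoded as C6 05 <disp32> <imm8>, so flipping a 1 into a 0 means changing only the last byte of such an instruction.

```python
def patch_byte(image: bytes, offset: int, expected: int, replacement: int) -> bytes:
    """Return a copy of `image` with one byte replaced, refusing to patch
    if the byte currently at `offset` is not the one we expect."""
    if image[offset] != expected:
        raise ValueError(
            f"unexpected byte 0x{image[offset]:02x} at 0x{offset:x}, "
            f"wanted 0x{expected:02x}"
        )
    patched = bytearray(image)
    patched[offset] = replacement
    return bytes(patched)


# Synthetic stand-in for `mov byte ptr [rip+disp32], 1`:
# opcode C6, ModRM 05, four displacement bytes, then the immediate.
FAKE_CODE = bytes.fromhex("c605efbeadde01")
IMM_OFFSET = len(FAKE_CODE) - 1  # the immediate is the last byte

if __name__ == "__main__":
    fixed = patch_byte(FAKE_CODE, IMM_OFFSET, expected=0x01, replacement=0x00)
    print(FAKE_CODE.hex(), "->", fixed.hex())
```

Checking the expected byte first is cheap insurance: if the offset is wrong for a given kernel build, the patcher stops instead of silently corrupting the image.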




Testing the patch, or so I was thinking

So all that's left is to just boot the patched binary and check, right? Wrong... very wrong... that was ready on Saturday morning ;) 


It turns out that while the kernel can be unpacked pretty easily with a tool present in the Linux source, repacking it back into a (b)zImage is something people say is impossible. After hours of googling, the conclusion I was getting from every source was similar to the one described in the very first source I found: it's a compile-time option and without rebuilding the kernel the process cannot be replicated. Even worse, due to how stupidly messy x86 & the IBM PC are after 40 years of hacks, booting an uncompressed image is simply not an option (anybody wants to hack GRUB? ;)).


At first I unpacked the kernel and was under the impression that I could just grab the header from the previous kernel, repack the old one again, glue it together and call it a day... oh boy, I couldn't have been more wrong. 

However, everything is impossible until someone says "I will do it, hold my beer". To understand how ridiculous the bzImage structure is, you need to look at the two pictures:


(a very shortened build process of vmlinux => bzImage, normally it has like 20+ more files; credits: opensource.com)



(summary of the bzImage layers)


I'm not gonna go into details of how it is done (as I said, it's ~10 pages of notes) as when I clean it up it will be shared as an open-source tool with docs. However I will only say that:

  • The build flow cannot be directly applied to an unpacked kernel, as some files are missing (information is stripped while building the kernel and cannot be recovered from bzImage, yet the process makes calculations based on these pieces)
  • Some ASM modification is required to the sources to make it buildable
  • The kernel build process has parts which haven't been touched since the '90s, and at times it is very much spaghetti code (e.g. bash generating C which generates ASM with printf()s)
  • Kernel uses non-standard hacky Makefile tricks
  • Only some objects in the kernel source tree can be MAKEd without building the whole thing
  • I discovered that the 10-year-old script present in the kernel source doesn't ACTUALLY produce a correct file, but I had no energy to develop a patch for it
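
For contrast, the easy half (unpacking) boils down to finding the compressed payload inside the image and decompressing it, which is roughly what the kernel's scripts/extract-vmlinux does for several formats. The sketch below is a simplified illustration that assumes a gzip payload and runs on a synthetic blob; a real kernel may use a different compressor, and extract_gzip_payload() is a name made up here.

```python
import gzip
import zlib
from typing import Tuple

GZIP_MAGIC = b"\x1f\x8b\x08"  # gzip header with the "deflate" method byte


def extract_gzip_payload(image: bytes) -> Tuple[int, bytes]:
    """Scan for gzip magic and return (offset, decompressed data) for the
    first candidate that actually decompresses."""
    pos = image.find(GZIP_MAGIC)
    while pos != -1:
        try:
            # wbits=31 makes zlib expect a gzip wrapper; bytes after the end
            # of the stream land in `unused_data` instead of raising.
            data = zlib.decompressobj(31).decompress(image[pos:])
            if data:
                return pos, data
        except zlib.error:
            pass  # false positive, keep scanning
        pos = image.find(GZIP_MAGIC, pos + 1)
    raise ValueError("no gzip payload found")


if __name__ == "__main__":
    # Synthetic "image": fake setup code, a compressed payload, trailing bytes.
    blob = b"\x00" * 64 + gzip.compress(b"vmlinux goes here") + b"\xff" * 16
    offset, payload = extract_gzip_payload(blob)
    print(f"payload at offset {offset}, {len(payload)} bytes unpacked")
```

Going the other way is the hard part, because the bytes around the payload encode sizes and offsets computed at build time, so a re-compressed payload of a different length does not simply drop back in.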


But you know what? Give me back my beer - it does work :D I can unpack the kernel, byte-patch it, put it back together into bzImage and run it. And the best part? Signature verification is no longer a problem:

DiskStation> dmesg | grep ramdisk
[    0.546989] ramdisk alterrd
DiskStation> insmod virtio.ko 
[   69.283548] virtio: module verification failed: signature and/or required key missing - tainting kernel
DiskStation> lsmod | grep virtio
virtio                  3506  0 



What does it mean?

Well, a kernel can be booted without magic kexec() calls and patching-on-the-fly. Just a good ol' binary patch. So far I don't have a ready loader for you guys, but the kernel boots and loads natively in Proxmox and allows any modules to be loaded. Modules can be compiled against the native version of the kernel (not a loader kernel), which is v3.10.105 in DS3617xs running 25556.


...one more thing

Just looking at my real boxes and those with Jun's loader I found something strange:

user@xpenobox1:~$ dmesg | grep 'verification fail'
[    0.692073] ce: module verification failed: signature and/or required key missing - tainting kernel
user@xpenobox1:~$ lsmod | grep -E '^ce'
ce                     17708  0


user@xpenobox2:~$ dmesg | grep 'verification fail'
[    1.201593] fudfffucff: module verification failed: signature and/or required key missing - tainting kernel
user@xpenobox2:~$ lsmod | grep fudfffucff
fudfffucff             17708  0


user@synology:~$ dmesg | grep 'verification fail'
user@synology:~$ lsmod | grep '17708'


Interesting, isn't it? Based on this + my other research, here's my speculation as to how Jun's loader works:

  • Boots his own kernel
  • Unpacks the kernel on-the-fly
  • Binary-patches it to load a custom module from memory (?)
  • Loads the kernel into a proper RAM location
  • Boots the kernel using kexec() as.... kexec was the only way to boot images after unpacking ;) 
  • The module injected into the kernel does all the stubbing needed. It may not even be a patch per se, but a dynamic monkey patch for certain calls
Edited by kiler129
typos, clarity
6 hours ago, kiler129 said:

I didn't experiment with DSMv7 as, the last time I checked, it wasn't publicly available and only invited people could access an early preview.


They updated the page a few days ago to provide DSM 7 RC publicly. So while not a final release I suppose, it seems pretty damn close unless they find a major show-stopping bug at this point. No registration required either. https://www.synology.com/en-us/beta/DSM70Beta

Edited by ilovepancakes
On 6/8/2021 at 11:34 AM, ilovepancakes said:

They updated the page a few days ago to provide DSM 7 RC publicly. So while not a final release I suppose, it seems pretty damn close unless they find a major show-stopping bug at this point. No registration required either. https://www.synology.com/en-us/beta/DSM70Beta

Good to know, thanks! I didn't follow their DSM 7 release cycle closely, as storage is definitely not an area where I like to experiment. Since all my real DSM systems are used for production tasks and I cannot make any surprise configuration changes on them (not even running a VDSM), I was waiting until some stable release is published ;)



7 hours ago, NooL said:

@kiler129 Wow this is such an impressive and exciting read.. Awesome work! 







There's a liar in the room

I had some time to tinker with the code again and I discovered something which changes the view I have of the code from Synology. After writing the last post I had module loading working. However, all mounts with --move or --bind were instantly returning a permission error. I was puzzled and I slept on it. This shouldn't be possible with the code of do_mount(), because there is a strict set of conditions under which it may happen:

  • There's lack of sufficient user capabilities (CAP_SYS_ADMIN is needed, but this doesn't apply to superuser)

  • security_sb_mount() returns some error (which is not possible as of now, it's hardcoded to 0)

  • Synology's ramdisk corruption flag is set (previous post)

That's it. There aren't any more for bind/move in the code Synology shared. There are more cases for remounts (not important here) but that code path is not executed for bind/move.

I re-read the code several times and I had no idea what may be causing it, so I came back to painstakingly analyzing the actual ASM from binary kernel...


There's another DRM measure hidden there, not present in the GPL source. IANAL but in my book this is a big GPL violation. In short the source shared with us does:

# pseudocode, not the actual ASM in the binary
MOV        retval,-0x1
CMP        byte ptr [ramdisk_check_failed],0x0
JNZ        dput_out
# execution continues


While the real code does something more like:

# pseudocode, not the actual ASM in the binary
CMP        byte ptr [ramdisk_check_failed],0x0
MOV        retval,-0x1
JNZ        dput_out
CMP        qword ptr [<obscure pointer>],0x0
JNZ        dput_out
# execution continues


The flow is slightly obscure to people not familiar with assembly. Instructions in ASM are executed in order and there are no blocks per se. So, in essence, the ramdisk flag is compared first, THEN retval is preemptively set to -EPERM (MOV does not alter the CPU flags, so the result of the comparison survives), and then a JNZ (jump if not zero, i.e. if the flag is set) is executed. However, if that jump is not taken (i.e. the ramdisk wasn't tampered with), it does another comparison against an <obscure pointer>. If the value there is non-zero, it jumps to the return path. It will return -EPERM, as retval still holds that value.
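
If ASM isn't your thing, the difference between the two listings can be written down as two small functions. This is just my reading of the pseudocode above expressed in Python; the names check_as_published(), check_as_shipped() and hidden_qword are invented for the illustration and do not appear in any Synology code.

```python
EPERM = 1


def check_as_published(ramdisk_check_failed: bool) -> int:
    """Branch as written in the GPL source drop."""
    retval = -EPERM
    if ramdisk_check_failed:
        return retval
    return 0  # "execution continues"


def check_as_shipped(ramdisk_check_failed: bool, hidden_qword: int) -> int:
    """Branch as observed in the binary, with the extra comparison."""
    retval = -EPERM  # set after the CMP, but MOV leaves the CPU flags alone
    if ramdisk_check_failed:
        return retval
    if hidden_qword != 0:
        return retval  # retval was never reset, so this is still -EPERM
    return 0


if __name__ == "__main__":
    # A clean ramdisk passes the published check but not the shipped one
    # as long as the hidden value is non-zero.
    print(check_as_published(False), check_as_shipped(False, hidden_qword=1))
```

This also shows why fixing only the ramdisk flag is not enough: with the flag cleared, the second comparison alone still produces the same permission error.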

I don't like it - the shared GPL code is not the same code as used to compile the kernel. For me it's a much bigger deal than MY_ABC_HERE and stripping comments (I get that, they want to protect their internal know-how).






But why? What is that check doing?

So, we have yet another check. Once I found it, patching it was easy: just two bytes to replace JNZ with JL (here it's practically a NOOP, but with the correct length, as the pointer references an unsigned value). However, I went further and started digging into what that obscure pointer points to. It turns out it's a hook present very early in the boot process. It references a function which is present in the compiled kernel:

$ cat /proc/kallsyms | grep -E '\spci_setup$'
ffffffff818aa83f t pci_setup



While there's nothing unusual there, if we look at the same function in both the C source and the ASM, there's a striking difference in one of the branches:

} else if (!strncmp(str, "pcie_scan_all", 13)) {

#ASM translated to pseudo-C
if (_boot_params.hdr.version < 0x209) {
  pciFlags = pciFlags | 1;


By itself, reading through it, nothing stands out... until you consider the cross-reference: mount checks the <obscure pointer>, which points to pciFlags. For me it was a "wait a second..." moment:

  1. PCI setup checks boot params header version (offtopic: this nicely shows how messy x86 is - 40 letter pages of explanation for booting a binary :lol:)
  2. A flag to rescan PCI devices is set
  3. Mount checks if any PCI flags were set and blocks mounting

This paints a nice picture to me:

  • stuff cannot be remapped using a variety of PCI-related flags
  • boot header is populated by the bootloader (in essence it's a "care package" from the bootloader to the kernel)
  • Synology controls what version of the header will be detected by the kernel
  • They use that innocent-looking check to block mounting of the filesystem...
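
To see which boot protocol version a given image advertises, you can read it straight from the setup header. The sketch below relies on the documented x86 boot protocol layout ("HdrS" signature at offset 0x202, 16-bit little-endian version at 0x206) and runs on a synthetic header; flag_would_be_set() simply restates the `< 0x209` comparison from the pseudo-C above and is not a real kernel function. Note that it reads the value from the file, whereas the kernel looks at what the bootloader actually handed over.

```python
import struct

HEADER_MAGIC_OFFSET = 0x202  # "HdrS" signature
VERSION_OFFSET = 0x206       # 16-bit little-endian, encoded as 0xMMmm


def read_boot_protocol(image: bytes) -> int:
    """Return the raw boot protocol version from a bzImage setup header."""
    if image[HEADER_MAGIC_OFFSET:HEADER_MAGIC_OFFSET + 4] != b"HdrS":
        raise ValueError("no HdrS signature, not a bzImage setup header")
    (version,) = struct.unpack_from("<H", image, VERSION_OFFSET)
    return version


def flag_would_be_set(version: int) -> bool:
    """Restate the comparison seen in the disassembled pci_setup()."""
    return version < 0x209


def fake_image(version: int) -> bytes:
    """Build a synthetic header carrying only the two fields used here."""
    image = bytearray(0x400)
    image[HEADER_MAGIC_OFFSET:HEADER_MAGIC_OFFSET + 4] = b"HdrS"
    struct.pack_into("<H", image, VERSION_OFFSET, version)
    return bytes(image)


if __name__ == "__main__":
    for raw in (0x0208, 0x020D):
        version = read_boot_protocol(fake_image(raw))
        print(f"protocol {version >> 8}.{version & 0xFF:02d}: "
              f"flag set = {flag_would_be_set(version)}")
```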


But you know what? It can magically be worked around:

DiskStation> mkdir /tmp1 /tmp2
DiskStation> mount -t tmpfs tmpfs /tmp1
DiskStation> mount --move /tmp1 /tmp2
DiskStation> echo $?
DiskStation> mount
# ...
tmpfs on /tmp1 type tmpfs (0)
/tmp1 on /tmp2 type bind (move)





Jun's loader

Also a super quick update on the Jun's loader. I wanted to extract that magical dynamic module. However it's not that easy:

  • Module is not loaded using insmod
  • Trying to extract the ELF from a memory dump is difficult as the name is random
  • The kernel is in-memory patched within kexec() and then jumped to
  • Jun's loader implements a simple ELF parser to get to correct offsets
  • "va not found" is actually an error referencing inability to extract the standard "p_vaddr" from the ELF binary. Most likely it's because that method is very fuzzy in nature.
  • Jun most likely patched the hydrogen signature check function as his kernel accepts modified ramdisks without throwing anything into dmesg

Have you published your work so far anywhere? I would like to have a wee look at it and study it to see if I can help with anything. I'm not a developer but I like to challenge myself and look into coding etc. from time to time. In the past I have built kernels from source for different Android devices, so I might be able to help in some way.

On 6/10/2021 at 3:51 AM, Aigor said:

I would like to have a quarter of your knowledge 

I will say it's more fake-it-till-you-make-it for me ;) 

My experience is very limited in this space really.



On 6/10/2021 at 4:27 AM, Vortex said:

Nice findings!


Here is Jun's stuff from loader v1.04.

modprobe is the kernel module injector, and jun.ko is the module itself.

Happy reversing.

Thank you! That helps immensely! A classic example of why different people thinking differently can come up with something great.



On 6/11/2021 at 2:14 AM, john_matrix said:

Very interesting topic!

Thank you very much @kiler129 for your work and for sharing with us this knowledge, I am pretty sure it will be very valuable.

My whole goal is to actually describe everything and make the knowledge accessible. As of now most of the stuff is not publicly available and seems to be known only by a couple of devs who developed previous loaders.



21 hours ago, Eoghan said:

Have you published your work so far anywhere? I would like to have a wee look at it and study it to see if I can help with anything. I'm not a developer but I like to challenge myself and look into coding etc. from time to time. In the past I have built kernels from source for different Android devices, so I might be able to help in some way.

As of now no, I didn't share any repo (except some code snippets to e.g. generate DSM root password). However it will definitely land on github - currently it's a messy brain dump of multiple files in one directory + 4 different test & build systems across two Proxmox servers... yeah, it's bad, it's a huge PoC/research.

What I'm trying to prioritize is to document the findings in posts here in a way that is less brain-dumpy and get back to the documentation part later (oh, we all know how developers are :lol:). But in all seriousness, you may find the update below helpful, as I could definitely use some help.




What am I doing right now?

Since the last update, as the kernel is booted and seems to be working correctly, I shifted my focus towards developing a scaffolding for a loader. The goal of the MVP is to:

  • Have a somewhat defined structure so we don't end up with spaghetti code
  • Contain a description of what needs to be hooked and where
  • Implement basic elements like parsing cmdline


What's explicitly not a goal for now:

  • Creating an installer
  • Process of creating images (it's simple for a developer)
  • Making it stealthy (for now, but kernel has many facilities to hide)




What I have done?

Currently I have:

  • A byte-patched 25556 kernel which boots & is able to use privileged MS_MOVE/MS_BIND
  • A simple kernel module which parses cmdline
  • A rough understanding of how to do USB VID/PID faking
[    3.272716] redpill_lkm: module verification failed: signature and/or required key missing - tainting kernel
[    3.274282] <redpill_lkm/redpill_lkm.c:29> RedPill loaded
[    3.275588] <redpill_lkm/call_protected.c:31> Got addr ffffffff81164480 for cmdline_proc_show
[    3.276700] <redpill_lkm/cmdline_delegate.c:20> Cmdline count: 334
[    3.277601] <redpill_lkm/cmdline_delegate.c:41> Param #0: |syno_hdd_powerup_seq=0|
[    3.278725] <redpill_lkm/cmdline_delegate.c:41> Param #1: |HddHotplug=0|
[    3.279695] <redpill_lkm/cmdline_delegate.c:41> Param #2: |syno_hw_version=DS3615xs|
[    3.280803] <redpill_lkm/cmdline_delegate.c:41> Param #3: |vender_format_version=2|
[    3.281911] <redpill_lkm/cmdline_delegate.c:41> Param #4: |console=ttyS0,115200n8|
[    3.283016] <redpill_lkm/cmdline_delegate.c:41> Param #5: |withefi|
[    3.283915] <redpill_lkm/cmdline_delegate.c:41> Param #6: |quiet|
[    3.284746] <redpill_lkm/cmdline_delegate.c:41> Param #7: |syno_port_thaw=1|
[    3.285721] <redpill_lkm/cmdline_delegate.c:41> Param #8: |DiskIdxMap=0C0005|
[    3.286740] <redpill_lkm/cmdline_delegate.c:41> Param #9: |SataPortMap=157|
[    3.287734] <redpill_lkm/cmdline_delegate.c:41> Param #10: |root=/dev/md0|
[    3.288740] <redpill_lkm/cmdline_delegate.c:41> Param #11: |sn=1230LWN022239|
[    3.289838] <redpill_lkm/cmdline_delegate.c:41> Param #12: |mac1=e631e2fba63f|
[    3.290742] <redpill_lkm/cmdline_delegate.c:41> Param #13: |mac2=cea5327542cb|
[    3.291644] <redpill_lkm/cmdline_delegate.c:41> Param #14: |netif_num=2|
[    3.292476] <redpill_lkm/cmdline_delegate.c:41> Param #15: |vid=0x46f4|
[    3.293277] <redpill_lkm/cmdline_delegate.c:78> VID override: 0x46f4
[    3.294086] <redpill_lkm/cmdline_delegate.c:41> Param #16: |pid=0x0001|
[    3.294895] <redpill_lkm/cmdline_delegate.c:107> PID override: 0x0001
[    3.295690] <redpill_lkm/cmdline_delegate.c:41> Param #17: |earlycon=uart8250,io,0x3f8,115200n8|
[    3.296763] <redpill_lkm/cmdline_delegate.c:41> Param #18: |earlyprintk|
[    3.297593] <redpill_lkm/cmdline_delegate.c:41> Param #19: |loglevel=15|
[    3.298475] <redpill_lkm/cmdline_delegate.c:49> CmdLine processed successfully, tokens=20
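
For readers who want to play with the same logic outside of the kernel, here is a rough userland equivalent of what the log above shows: split the cmdline into tokens and pick out the vid/pid overrides. It is a sketch and not the module's code; parse_cmdline() is a made-up name, and quoted arguments containing spaces are not handled.

```python
from typing import Dict, List, Optional, Tuple


def parse_cmdline(cmdline: str) -> Tuple[List[str], Dict[str, Optional[str]], Dict[str, int]]:
    """Split a kernel cmdline and extract the vid/pid overrides.

    Returns (tokens, params, overrides); flag-only tokens map to None.
    """
    tokens = cmdline.split()
    params: Dict[str, Optional[str]] = {}
    for token in tokens:
        key, sep, value = token.partition("=")  # split on the first '=' only
        params[key] = value if sep else None
    overrides: Dict[str, int] = {}
    for name in ("vid", "pid"):
        if params.get(name):
            overrides[name] = int(params[name], 16)  # accepts the 0x prefix
    return tokens, params, overrides


SAMPLE = (
    "syno_hdd_powerup_seq=0 HddHotplug=0 syno_hw_version=DS3615xs "
    "vender_format_version=2 console=ttyS0,115200n8 withefi quiet "
    "syno_port_thaw=1 DiskIdxMap=0C0005 SataPortMap=157 root=/dev/md0 "
    "sn=1230LWN022239 mac1=e631e2fba63f mac2=cea5327542cb netif_num=2 "
    "vid=0x46f4 pid=0x0001 earlycon=uart8250,io,0x3f8,115200n8 "
    "earlyprintk loglevel=15"
)

if __name__ == "__main__":
    tokens, params, overrides = parse_cmdline(SAMPLE)
    print(f"tokens={len(tokens)}")
    for name, value in overrides.items():
        print(f"{name.upper()} override: 0x{value:04x}")
```

Splitting on the first "=" only matters for values like the earlycon one, which contain commas and could just as well contain further "=" characters.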



Jun's loader kernel module

First of all, a big shout-out to @Vortex for sharing the module and discovering where it is! Jun used some clever tricks to hide parts of his loader. The major kernel-side one is implementing his own ELF parser & patcher in kexec. This makes it hard to work with, as his code is a small addition to an overall big codebase. However, I don't think digging there is needed.

Next it turns out that the kernel module, which does most of the shimming, was sneakily hidden in the modprobe binary:

$ binwalk ds918-modprobe.elf

0             0x0             ELF, 64-bit LSB executable, AMD x86-64, version 1 (SYSV)
872           0x368           ELF, 64-bit LSB relocatable, AMD x86-64, version 1 (SYSV)


That file essentially has two ELF headers. When executed it will load itself and inject the kernel module from its own file. I didn't spend time analyzing that part as IMHO it's not needed, since I'm not intending to hide how the loader works.
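
If you don't have binwalk at hand, spotting embedded ELF files takes only a few lines: look for the "\x7fELF" magic and decode the class, byte order and type fields that follow it. The sketch below runs on a synthetic blob that merely mimics the two offsets from the binwalk output above; find_elf_headers() and fake_header() are illustrative names, and a bare magic match can of course be a false positive in a real file.

```python
import struct
from typing import List, Tuple

ELF_MAGIC = b"\x7fELF"
ELF_TYPES = {1: "relocatable", 2: "executable", 3: "shared object"}


def find_elf_headers(blob: bytes) -> List[Tuple[int, int, str]]:
    """Return (offset, bits, type) for every ELF magic found in `blob`."""
    hits = []
    pos = blob.find(ELF_MAGIC)
    while pos != -1:
        if pos + 18 <= len(blob):  # need the bytes up to and including e_type
            bits = 64 if blob[pos + 4] == 2 else 32      # EI_CLASS
            endian = "<" if blob[pos + 5] == 1 else ">"  # EI_DATA
            (e_type,) = struct.unpack_from(endian + "H", blob, pos + 16)
            hits.append((pos, bits, ELF_TYPES.get(e_type, f"type {e_type}")))
        pos = blob.find(ELF_MAGIC, pos + 1)
    return hits


def fake_header(e_type: int) -> bytes:
    """Synthetic 64-bit little-endian ELF identification plus e_type."""
    return ELF_MAGIC + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<H", e_type)


if __name__ == "__main__":
    # An executable at offset 0 with a relocatable object tucked in at 872.
    blob = fake_header(2) + bytes(872 - 18) + fake_header(1) + bytes(32)
    for offset, bits, kind in find_elf_headers(blob):
        print(f"{offset:<8} 0x{offset:<6x} ELF, {bits}-bit, {kind}")
```

A kernel module (.ko) is a relocatable object, which is why a second header of that type inside an executable is such a strong hint.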




Approaching the unknown

Jun's kernel module contains a huge body of knowledge. Unfortunately there's no source for it. There are two ways to handle this problem:

  • Start from scratch and reverse engineer protections & quirks present in DSM (as to be fair most of the stuff is not a DRM but rather how the DSs are designed)
  • Use Jun's module as a knowledge base by disassembling it and documenting all findings


I picked the second route. I think it's also a fair approach - while Jun didn't share the C source code, his module is marked as licensed under GPL (see below). Whether the decision was conscious I cannot say. However I will definitely not pretend that I didn't get any knowledge from his binary - that would be silly, open source movement is all about giving credit where it's due and not reinventing the wheel ;) 


(anyone can verify this by doing "strings FILE | grep license"  where FILE is any file from the package shared by @Vortex)


However, as I mentioned before, due to the lack of source this approach is definitely not easy either. Additionally, most of the strings are obfuscated (e.g. when the address of cmdline_proc_show cannot be retrieved the error message reads "gc: ret=-1"). It will take time, but it's definitely doable. However, I hope it's the last time this has to be done by yet another person.




Task #-1: bootloader

Short and sweet: this isn't done and will probably be the last thing I touch (unless someone decides to chime in after the github repo is public). Currently I'm loading the kernel using the VDSM loader image, which contains a slightly modified GRUB 0.97. I don't think it makes any difference what it is, as long as it's able to load the kernel. However, we will need a clean GRUB build with some scripts to build it automatically, as sharing a binary from Synology would be an easy route to a DMCA takedown. This is essentially how hackintosh loaders exist without being removed: they don't contain copyrighted code. 


For anyone who's willing to try:

  • Get the VDSM 25556 & DS3617xs PATs
  • Unpack both PATs (tar -xvf)
  • Mount VDSM bootloader using losetup
  • Mount the first partition from the loop device
  • Replace VDSM's zImage with zImage from DS3617xs PAT
  • Replace VDSM's rd.gz with rd.gz from DS3617xs PAT
  • Adjust grub.cfg to match what you want to do




Task #0: kernel cmdline

In order to have any configurability of the module after it's built, the obvious choice is to read from cmdline. Jun's loader did the same thing and I agree that this is the correct approach. However, as simple as it seems from userland, it is not that simple in the kernel. Inside kernel space the cmdline is saved very early and there's no public API to access it.

However, as the cmdline is accessible in /proc/cmdline we can hijack the function which feeds that file:

static int cmdline_proc_show(struct seq_file *m, void *v)
{
	seq_printf(m, "%s\n", saved_command_line);
	return 0;
}


This requires three tricks:

  • Obtaining the address of the function, as the symbol is not exported (kallsyms_lookup_name() is helpful here, yet requires some unusual trickery)
  • Building the seq_file struct properly, supplying just enough while not overusing the kernel stack for buffers (IIRC each kernel thread has only ~8K of stack space)
  • Simulating just enough of the expected behavior to read the data (normally seq_file structures are handled by the proc FS code to respond to userland FS calls)
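
Once the raw string has been read, picking options out of it is ordinary string handling. Below is a minimal userland sketch of that step, not the module's actual code. The function name cmdline_get_u16 and the option names vid/pid used in the usage note are assumptions made for illustration.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look for "key=<number>" in a space-separated command line.
 * The key has to start a token, so "pid" does not match inside "xpid=".
 * The number may be decimal or 0x-prefixed hex and must fit in 16 bits.
 * Returns true and stores the value in *out on success.
 */
bool cmdline_get_u16(const char *cmdline, const char *key, unsigned short *out)
{
    size_t key_len = strlen(key);
    const char *p = cmdline;

    if (key_len == 0)
        return false;

    while ((p = strstr(p, key)) != NULL) {
        bool token_start = (p == cmdline) || (p[-1] == ' ');

        if (token_start && p[key_len] == '=') {
            const char *val = p + key_len + 1;
            char *end;
            unsigned long num;

            /* Reject empty values, signs and leading whitespace */
            if (!isdigit((unsigned char)*val))
                return false;

            num = strtoul(val, &end, 0);

            /* Reject trailing junk and values above 16 bits */
            if ((*end != ' ' && *end != '\n' && *end != '\0') || num > 0xffff)
                return false;

            *out = (unsigned short)num;
            return true;
        }
        p += key_len; /* not our token, keep searching */
    }
    return false;
}
```

With a command line such as "console=ttyS0 vid=0x46f4 pid=0x0001", calling cmdline_get_u16(cmdline, "vid", &v) stores 0x46f4 in v. Inside a module the input would be the text captured from cmdline_proc_show instead of a string literal.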

At this point I have to make a confession: it took me almost a whole day to get this working, as the last time I programmed in pure C was... 15 or so years ago, so my knowledge of raw pointers and memory management got pretty rusty 😅 


However, the code now works properly and I set up a simple structure that makes it easy to call unexported functions.
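
One possible shape for such a structure is sketched below. This is not the author's code: the macro DEFINE_UNEXPORTED, the lookup_symbol() helper and the demo_add function are all hypothetical. The lookup is a small table so that the sketch runs in userland. In a module, that role would be played by a resolver such as kallsyms_lookup_name().

```c
#include <stddef.h>
#include <string.h>

typedef void (*generic_fn)(void);

struct sym_entry {
    const char *name;
    generic_fn addr;
};

/* Plays the role of a kernel function that is not exported */
int demo_add(int a, int b)
{
    return a + b;
}

/* Userland stand-in for the kernel symbol table */
static const struct sym_entry sym_table[] = {
    { "demo_add", (generic_fn)demo_add },
};

/* Hypothetical resolver: returns NULL when the name is unknown */
generic_fn lookup_symbol(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(sym_table) / sizeof(sym_table[0]); i++)
        if (strcmp(sym_table[i].name, name) == 0)
            return sym_table[i].addr;
    return NULL;
}

/*
 * Declares a typed pointer <name>_ptr and resolve_<name>(), which looks
 * the address up once, caches it, and returns 0 on success or -1 on failure.
 */
#define DEFINE_UNEXPORTED(ret, name, ...)                     \
    typedef ret (*name##_t)(__VA_ARGS__);                     \
    static name##_t name##_ptr;                               \
    int resolve_##name(void)                                  \
    {                                                         \
        if (!name##_ptr)                                      \
            name##_ptr = (name##_t)lookup_symbol(#name);      \
        return name##_ptr ? 0 : -1;                           \
    }

DEFINE_UNEXPORTED(int, demo_add, int, int)
DEFINE_UNEXPORTED(int, missing_fn, void)

/* Callers go through a wrapper instead of linking against the symbol */
int call_demo_add(int a, int b, int *result)
{
    if (resolve_demo_add() != 0)
        return -1;
    *result = demo_add_ptr(a, b);
    return 0;
}
```

The point of this design is that each unexported function costs one macro line plus a thin wrapper, and a failed lookup surfaces as an error code rather than a call through a NULL pointer.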



Side note: in the meantime I picked up a book "Linux Kernel Programming" by Billimoria - I'm obviously still reading it but I can already highly recommend it to anyone who's interested in kernel hacking.




Task #1: USB VID/PID faking

Once I had the cmdline handling working in a modular way, I thought a good starting point would be to handle the VID/PID override. Conceptually it's not that difficult: get a tree of devices and override the VID+PID on it with a predefined one (0xf400 in mfg mode or 0xf401 in retail mode). However, there are some challenges, which I discovered both by testing and by reading the disassembly of Jun's module:

  • The kernel module must be loaded as soon as possible, the earlier the better
  • To override the VID/PID, the usbcore module must already be loaded so that there are some symbols to work with
  • The override of the VID/PID must happen AFTER devices are enumerated but BEFORE the SCSI layer assigns device types & creates /dev/sdX nodes

Fortunately it's not that hard to do, but it's definitely not how you would normally write code. The flow is as follows:

  1. Register our module as quickly as possible
  2. Register for notifications using register_module_notifier() to get notified when a new module is loaded into the kernel
  3. Wait until usbcore is registered while continuing other tasks
  4. When the usbcore registration event comes in, there's a short window (~6-10ms) to call usb_register_notify()
    1. This will watch for new devices found on USB hubs
    2. This symbol cannot be referenced directly by our module, as the kernel module loader will fail on the unresolved symbol when our module is loaded before usbcore (and it has to be!)
    3. Symbols can be accessed using the __symbol_get() kernel interface
  5. When a new device is inserted, we need to respond to the event quickly by checking whether its VID+PID match the ones specified as the override in the cmdline and replacing them with 0xf400 or 0xf401 (mfg/retail) before the SCSI layer is able to get to them


The last point is a straight race condition. Normally you avoid those like the plague, but in this case it's the desirable approach (sort of like RGH on Xbox 360 ;)). This is why the hack may fail on slower single-core machines, and I believe even Jun warned about that.


As of writing this I was able to code & test steps 1-3. Step 4 is technically coded but I cannot verify it due to a problem described below.
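
The matching logic from step 5 can be sketched in isolation. This is a userland illustration, not the module's code: struct usb_ids and struct override_cfg are hypothetical stand-ins, and the fake values are the ones quoted above. In the kernel, the fields would be idVendor and idProduct of the device descriptor, which are stored little-endian, so a real implementation would also need the usual byte-order conversion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fake IDs as quoted in the text above */
#define FAKE_ID_MFG    0xf400
#define FAKE_ID_RETAIL 0xf401

/* Stand-in for the two descriptor fields we care about */
struct usb_ids {
    uint16_t id_vendor;
    uint16_t id_product;
};

/* Values that would come from the kernel cmdline */
struct override_cfg {
    uint16_t match_vid; /* VID of the real boot stick */
    uint16_t match_pid; /* PID of the real boot stick */
    bool mfg_mode;      /* true when booting in mfg mode */
};

/*
 * Called for every newly added USB device. Rewrites the IDs only when
 * both VID and PID match the configured boot device and leaves every
 * other device untouched. Returns true if the device was rewritten.
 */
bool override_boot_device_ids(struct usb_ids *dev,
                              const struct override_cfg *cfg)
{
    uint16_t fake;

    if (dev->id_vendor != cfg->match_vid ||
        dev->id_product != cfg->match_pid)
        return false;

    fake = cfg->mfg_mode ? FAKE_ID_MFG : FAKE_ID_RETAIL;
    dev->id_vendor = fake;
    dev->id_product = fake;
    return true;
}
```

Keeping this function free of lookups or allocations matters because of the race described above: the less work is done per device event, the better the chance of finishing before the SCSI layer reads the IDs.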




Current problem at hand: devices are not detected

So, there's this weird behavior of the DSM kernel/modules: when e.g. the USB or even e1000e drivers are loaded, no device detection actually takes place:

DiskStation> insmod usbcore.ko ; insmod usb-storage.ko ; insmod etxhci-hcd.ko ; 
insmod xhci-hcd.ko ; insmod uhci-hcd.ko ; insmod usbhid.ko
[ 4143.899687] ACPI: bus type USB registered
[ 4143.900791] usbcore: registered new interface driver usbfs
[ 4143.902366] usbcore: registered new interface driver hub
[ 4143.904496] usbcore: registered new interface driver ethub
[ 4143.906427] usbcore: registered new device driver usb
[ 4143.908860] usbcore: registered new interface driver usb-storage
[ 4143.916707] uhci_hcd: USB Universal Host Controller Interface driver
[ 4143.918145] uhci_hcd 0000:00:1a.0: setting latency timer to 64
[ 4143.918872] uhci_hcd 0000:00:1a.0: UHCI Host Controller
[ 4143.919599] uhci_hcd 0000:00:1a.0: new USB bus registered, assigned bus number 1
[ 4143.920617] uhci_hcd 0000:00:1a.0: irq 16, io base 0x00005040
[ 4143.921955] hub 1-0:1.0: USB hub found
[ 4143.922550] hub 1-0:1.0: 2 ports detected
// repeated for all hubs
[ 4143.984063] usbcore: registered new interface driver usbhid
[ 4143.985472] usbhid: USB HID core driver
DiskStation> insmod e1000e.ko
[ 4222.346786] e1000e: Intel(R) PRO/1000 Network Driver - 3.3.4-NAPI
[ 4222.347994] e1000e: Copyright(c) 1999 - 2016 Intel Corporation.


Normally I would expect some devices to be found on the USB bus and the ethernet card to be present after their drivers are loaded. But this doesn't happen on DSM. However, it's not some kernel-wide setting, as loading the VirtIO drivers makes ethernet & other devices detected right away and perfectly usable.

I suspect one of Synology's checks is not passing. However, rather than DRM, I think it's a mechanism to control disk spin-up and USB power. If someone could poke at that, it would be great (I know @Vortex has experience with IDA ;)). The answer will be somewhere between Jun's binary module and the kernel sources. Something is stopping devices from being autodetected. This is probably nothing crazy complex.
