As far as I know, a virtual SSD can't be used as an SSD cache.
In my experience, SSD cache only works when the host's SATA interface and the SSDs are passed through to the VM.
If you have tried it successfully, please post about it.
BTW, my recommended configuration for an ESXi system is as follows.
In this config, the whole system is hosted on storage with SSD cache, and (almost) all drives are managed by DSM.
It performs well and notifies me of any disk trouble. I'm satisfied with this configuration.
(I think) it's not too complex, but it has enough performance and good flexibility.
I hope this helps.
My recommended XPEnology-based configuration for an ESXi system:
Boot ESXi from a USB drive (as you do).
Add 1 disk for a VMFS datastore (and use its disk interface directly for ESXi). This datastore is just for booting the "Host DSM" VM. (*1)
Create 1 XPEnology VM as the "Host DSM" VM and pass through all disk interfaces (other than the one above) to the VM.
(This VM is used only for the ESXi datastore and the ESXi host.)
Add all the other HDDs / SSDs to that XPEnology VM, set up the SSD cache, and format the disk group with ext4 (for VM performance).
Create a shared folder for the ESXi datastore and export it over NFS (better for performance and easy to maintain).
You can also add SMB access to that folder for direct maintenance of the datastore from client PCs.
Add that NFS-exported datastore in ESXi.
Add your own VMs on that datastore. (*2)
Add another XPEnology VM ("User DSM" VM) with a thick-provisioned vmdk, formatted with btrfs (for usual file sharing, etc.).
Add users and apps only on the "User DSM". (*3)
*1) If you don't use USB sharing for VMs, you can perhaps use a USB disk for this datastore.
*2) I can also add a Windows/macOS desktop VM with GPU and USB pass-through. (Choose ESXi 6.0 for hosting macOS. You can still use vCenter 6.5 or later.)
*3) The only thing I wish for but can't get is H/W encoding with DSM 6.1 + a DS916 VM. (Perhaps you have to pass through the host's iGPU. I don't have a system with such an iGPU.)
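
For the step where the NFS-exported folder is added to ESXi as a datastore, here is a minimal sketch of doing it from the ESXi shell instead of the client UI. The IP address, share path, and datastore name are placeholder assumptions (nothing above specifies them), and nfs_add_cmd and RUN_ESXCLI are just hypothetical helper names for this sketch. By default it only prints the command, so you can check it before running anything on the host.

```shell
#!/bin/sh
# Placeholder values (assumptions): replace with your own.
DSM_HOST="192.168.1.20"               # assumed IP of the "Host DSM" VM
NFS_SHARE="/volume1/esxi-datastore"   # assumed path of the DSM shared folder
DS_NAME="dsm-nfs"                     # assumed datastore name shown in ESXi

# Build the esxcli command line for mounting an NFS export as a datastore.
# Values must not contain spaces, because the result is word-split below.
nfs_add_cmd() {
  printf 'esxcli storage nfs add --host=%s --share=%s --volume-name=%s\n' "$1" "$2" "$3"
}

cmd=$(nfs_add_cmd "$DSM_HOST" "$NFS_SHARE" "$DS_NAME")
echo "$cmd"

# Only touch the host when explicitly asked to (RUN_ESXCLI=1).
if [ "${RUN_ESXCLI:-0}" = "1" ]; then
  # Mount the export, then list NFS datastores to confirm it is there.
  $cmd && esxcli storage nfs list
fi
```

Run it as-is to see the command, or with RUN_ESXCLI=1 on the ESXi host to actually mount the datastore. The "Host DSM" VM has to be booted and the NFS permission for the ESXi host's IP has to be set on the shared folder first, otherwise the mount will fail.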