Posts posted by flyride

  1. If you haven't built a server yet, getting a compatible chipset, SATA controller, and network card is most important.

    But you could always select a dual-core platform that is upgradable to a quad-core chip later.

     

    If you are not going to transcode, run a 10GBe network, or run a lot of VMs, you don't even need a high clock rate.  There's a reason that 8-year-old Atom CPUs are still running DSM.

     

  2. If you have access to all your drives in all use cases, and you aren't out of total slots, why do the numbering and order matter to you?  It doesn't matter to DSM, as it writes a GUID to each partition.

     

    SATAPortMap is normally used to gain access to disk slots that aren't accessible with the default detection.

    I would investigate how many unique controllers DSM actually perceives (lspci, evaluate the udev table, etc.).  Without that knowledge it is hard to determine what port map to use.
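
    If you want to see what the OS is actually working with, here is a minimal sketch of commands to run over SSH (output details vary by platform, so treat it as a starting point):

    # List the PCI storage controllers the kernel detects
    lspci | grep -iE 'sata|ahci|sas|scsi'
    # Show the SCSI/SATA host adapters that have been registered
    ls -d /sys/class/scsi_host/host*
    # Map each host adapter back to its parent PCI device
    for h in /sys/class/scsi_host/host*; do echo "$h -> $(readlink -f "$h"/device)"; done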

     

    If scenario #1 has two logical disk controllers, try SATAPortMap=22 and see what happens.  I don't think you can use it to address Scenarios #2 or #3.

    • Like 1
  3. On 3/31/2018 at 9:29 PM, swords80 said:

    There seems to be something odd with 6.1.6: on an ESXi VM with DS3617xs, the paravirtual controller and the virtual LSI SAS controller don't load either:

     

    Loading module vmw_pvscsi[ 3.849379] general protection fault

    Loading module mptsas[ 3.997779] BUG: unable to handle kernel paging request

     

    However, if the XPEnology ESXi VM is booted with a USB key attached to the virtual USB controller, these modules load fine.

    root@testnas:/etc# cat /proc/cmdline
    syno_hdd_powerup_seq=0 HddHotplug=0 syno_hw_version=DS3617xs vender_format_version=2 console=ttyS0,115200n8
    withefi quiet root=/dev/md0 sn=[redacted] mac1=[redacted] netif_num=1
    root@testnas:/etc# cat /etc.defaults/VERSION
    majorversion="6"
    minorversion="1"
    productversion="6.1.6"
    buildphase="GM"
    buildnumber="15266"
    smallfixnumber="0"
    builddate="2018/03/26"
    buildtime="16:58:27"
    root@testnas:/etc# lsmod | grep mptsas
    mptsas                 38463  2
    mptscsih               18705  2 mptsas,mptspi
    mptbase                61898  4 mptctl,mptsas,mptspi,mptscsih
    root@testnas:/etc# lsmod | grep vmw_pvscsi
    vmw_pvscsi             15239  0

     

     

  4. The synoboot.vmdk referencing the bootloader img should be connected via the default virtual SATA controller, and it will be automatically hidden when you boot with the ESXi option.  No other devices should be attached to that SATA controller.

     

    I have had the best results with VMDKs for data drives by adding a separate VM SCSI controller and connecting the virtual disk to that.  If you plan to pass through a physical SATA controller and all the drives attached to it, that generally works without much fanfare, assuming that DSM has driver support for the hardware.
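
    As a rough illustration only (the controller types and file names here are placeholders, and the same settings are easier to make in the ESXi VM editor), the relevant .vmx entries for that layout would look something like:

    sata0.present = "TRUE"
    sata0:0.present = "TRUE"
    sata0:0.fileName = "synoboot.vmdk"
    scsi0.present = "TRUE"
    scsi0.virtualDev = "lsisas1068"
    scsi0:0.present = "TRUE"
    scsi0:0.fileName = "datadisk1.vmdk"

    The sata0 controller carries only the loader image; the separate scsi0 controller (here the "LSI Logic SAS" type) carries the data VMDK.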

  5. 1 hour ago, bmac6996 said:

    Wanted to add: running ESXi 6.5 and using DS3617xs, latest 6.1.6.

    Choosing the ESXi option in the bootloader, at most I can get through the DSM upload, but after the reboot it throws an error and cannot see the hard drives.

     

    This is the same problem I'm encountering specifically with the 6.1.6 upgrade.  Try using 6.1.5 for now until we figure out what's going on.

  6. - Outcome of the update: UNSUCCESSFUL

    - DSM version prior update: DSM 6.1.5-15254 Update 1

    - Loader version and model: JUN'S LOADER v1.02b - DS3617xs

    - Using custom extra.lzma: NO

    - Installation type: VM - ESXi 6.5 (test VM)

    - Additional comments: Does not come back up after reboot and shows up in Synology Assistant as Not Installed, with no disk visible. Replacing the boot loader makes the disk visible and the VM comes up as "Recoverable." Attempting to recover just reboots back into the Not Installed state.

    • 6.1.6 installs successfully on the same configuration (ESXi DS3617xs test VM) when booting from a USB key instead of the synoboot vmdk.
    • Migrated the test installation (with synoboot vmdk) to DS3615xs and successfully upgraded to 6.1.6 with no issues.
    • Migrated my production nas (ESXi 6.5, NVMe via pRDM, and passthrough SATA) to DS3615xs, and upgraded to 6.1.6 with no issues.
  7. I'm seeing a problem where both the boot loader and the boot drive are corrupted by the upgrade.  Tried a few times, each with the same results.  But I realized that somewhere along the line, I accidentally substituted my production boot loader so that two serial numbers were active on the network at the same time, and I am wondering if maybe that is the issue.

     

    I will test later today to prove or disprove this idea.  But is a duplicate serial number on the same network a possibility for you?

     

  8. USING PHYSICAL RDM TO ENABLE NVMe (or any other ESXi accessible disk device) AS REGULAR DSM DISK

    Summary:

    1. Heretofore, XPEnology DSM under ESXi using virtual disks has been unable to retrieve SMART information from those disks. Disks connected to passthrough controllers work, however.

    2. NVMe SSDs are now verified to work with XPEnology using ESXi physical Raw Device Mapping (RDM). pRDM allows the guest to directly read/write to the device, while still virtualizing the controller.

    3. NVMe SSDs configured with pRDM are about 10% faster than as a VMDK, and the full capacity of the device is accessible.

    4. Configuring pRDM using the ESXi native SCSI controller set specifically to use the "LSI Logic SAS" dialect causes DSM to generate the correct smartctl commands for SCSI drives. SMART temperature, life remaining, etc are then properly displayed from DSM, /var/log/messages is not filled with spurious errors, and drive hibernation should now be possible. EDIT: SAS controller dialect works on 6.1.7 only (see this post).

    Like many other posters, I was unhappy with ESXi filling the logfiles with SMART errors every few seconds, mostly because it made the logs very hard to use for anything else.  Apparently this also prevents hibernation from working. I was able to find postings online using ESXi and physical RDM to enable SMART functionality under other platforms, but this didn't seem to work with DSM, which apparently tries to query all drives as ATA devices. This is also validated by synodisk --read_temp /dev/sdn returning "-1".

     

    I also didn't believe that pRDM would work with NVMe, but in hindsight I should have known better, as pRDM is frequently used to access SAN LUNs, and it is always presented as SCSI to the ESXi guests.  Here's how pRDM is configured for a local device: https://kb.vmware.com/s/article/1017530  If you try this, understand that pRDM presents the whole drive to the guest - you must have a separate datastore to store your virtual machine and the pointer files to the pRDM disk!  By comparison, a VMDK and the VM that uses it can coexist on one datastore.  The good news is that none of the disk capacity is lost to ESXi, like it is with a VMDK.
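
    For reference, here is a hedged sketch of the command-line side of that KB article (the device identifier and paths are placeholders; the GUI steps in the article accomplish the same thing):

    # On the ESXi host, find the local device identifier (t10.* or naa.*)
    ls /vmfs/devices/disks/
    # Create a physical-compatibility RDM pointer file on an existing datastore,
    # then attach that .vmdk to the DSM VM as an existing disk
    vmkfstools -z /vmfs/devices/disks/t10.NVMe____<device-id> /vmfs/volumes/datastore1/dsm-vm/nvme-prdm.vmdk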

     

    Once configured as a pRDM, the NVMe drive showed up with its native naming and was accessible normally.  Now, the smartctl --device=sat,auto -a /dev/sda syntax worked fine!  Using smartctl --device=test, I found that the pRDM devices were being SMART-detected as SCSI, but as expected, DSM would not query them.
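
    In other words (a sketch; substitute your own device name):

    # Ask smartctl how it auto-detects the pRDM device
    smartctl --device=test /dev/sda
    # Query it explicitly with SAT pass-through
    smartctl --device=sat,auto -a /dev/sda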

     

    NVMe device performance received about a 10% boost, which was unexpected based on VMware documentation.  Here are the results of the mirroring operation:

    root@nas:/proc/sys/dev/raid# echo 1500000 >speed_limit_min
    root@nas:/proc/sys/dev/raid# cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [raidF1]
    <snip>
    md2 : active raid1 sdb3[2] sda3[1]
          1874226176 blocks super 1.2 [3/1] [__U]
          [==>..................]  recovery = 11.6% (217817280/1874226176) finish=20.8min speed=1238152K/sec
    <snip>

    Once the pRDM drive had mirrored and been fully tested, I connected the other drive to my test VM to try a few device combinations.  Creating a second ESXi SATA controller has never tested well for me, but I configured it anyway to see if I could get DSM to use SMART correctly.  I tried every possible permutation, and the last one was the "LSI Logic SAS" controller dialect associated with the virtual SCSI controller... and it worked!  DSM correctly identified the pRDM drive as a SCSI device, and both smartctl and synodisk worked!

     

    root@testnas:/dev# smartctl -a /dev/sdb
    smartctl 6.5 (build date Jan  2 2018) [x86_64-linux-3.10.102] (local build)
    Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
    === START OF INFORMATION SECTION ===
    Vendor:               NVMe
    Product:              INTEL SSDPE2MX02
    Revision:             01H0
    Compliance:           SPC-4
    User Capacity:        2,000,398,934,016 bytes [2.00 TB]
    <snip>
    SMART support is:     Available - device has SMART capability.
    SMART support is:     Enabled
    Temperature Warning:  Disabled or Not Supported
    === START OF READ SMART DATA SECTION ===
    SMART Health Status: OK
    Current Drive Temperature:     26 C
    Drive Trip Temperature:        85 C
    <snip>
    root@testnas:/dev# synodisk --read_temp /dev/sdb
    disk /dev/sdb temp is 26

    Finally, /var/log/messages is now quiet.  There is also a strong likelihood that drive hibernation is possible, although I can't really test that with NVMe SSDs.

     

    Postscript PSA: My application for pRDM was to make enterprise NVMe SSDs accessible to DSM.  As DSM recognizes the devices as SSDs, it then offers scheduled TRIM support (which I decided to turn on over 18 months later). The TRIM job resulted in corruption of the array and flagged the member disks as faulty. While a full recovery was possible, I don't know whether that was due to an incompatibility between NVMe drives and the DSM TRIM implementation, or an unexpected problem with pRDM not supporting TRIM correctly. You've been warned!

    • Like 2
    • Thanks 2
  9. On 3/9/2018 at 6:59 PM, raumi20 said:

    The open-vm-tools spk produces too many failures and spams the logs...

     

    Solution to the logfile spam problem:

     

    1) cat /proc/version > /etc/arch-release

    2) edit /var/packages/open-vm-tools/scripts/start-stop-status, and comment out these lines:

    #               if [ -e ${PIDFILE} ]; then
    #                       echo "$(date) vmtoolsd ($(cat  ${PIDFILE})) is running..." >> ${LOGFILE}
    #               else
    #                       echo "$(date) vmtoolsd is not running..." >> ${LOGFILE}
    #               fi

     

    On 9/15/2017 at 7:59 PM, yale-xpo said:

    I wanted open-vm-tools / VMware Tools installed on my XPEnology NAS running on top of ESXi, mainly so I could press the power button on the host and still have the DSM guest shut down safely. The thought of compiling and installing a custom spk for open-vm-tools seemed too complicated and risky, so instead I created a Docker image that solves the problem.

    Any chance to make the SSH port configurable?

    Ah, I didn't see that you were running VMware Workstation.  I don't have any experience with that, as all my knowledge comes from working with ESXi.  I'm guessing that the "physical hardware" option of VMware Workstation isn't a true passthrough, so XPEnology doesn't have unfettered access to the hardware.  Maybe someone else who knows the product can comment.

     

    Most of us build up a server expressly to run XPEnology.  I started with DSM on a baremetal server, but switched to ESXi when I couldn't get NVMe functional.  As far as running DSM in a VM (and other VMs side-by-side with DSM) versus running VMs within DSM... Virtual Machine Manager in DSM has limitations that ESXi does not.  For example, it only supports specific OSes and versions.  ESXi is a little bit finicky about hardware, but then so is XPEnology.

     

    Regarding NVMe performance under ESXi, I think IG-88 already quoted this post:

     

    I have never edited the ssd db to get a SATA SSD recognized on Synology or XPEnology.  I'm not sure the ssd db even matches the current compatibility list.  Honestly, I really don't know what the db is for.  For what it's worth, every Intel, Samsung and VM SSD I've tried has been recognized as an SSD.  Obviously there are a lot of other SSD products on the market.

     

    This is in /etc/rc:

    if [ "$PLATFORM" != "kvmx64" -a -f /usr/syno/bin/syno_hdd_util ]; then
            syno_hdd_util --ssd_detect --without-id-log 2>/dev/null
    fi

     

    I can only guess that syno_hdd_util evaluates SSD status and updates something, otherwise why would Synology run it there?  However, I can't find a reference to syno_hdd_util in /lib/udev, even though a hotplug event somehow has to determine HDD/SSD status as well.
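
    If you want to poke at this yourself, here is a rough sketch (using the /run/synostorage paths shown in the next post; adjust as needed):

    # Re-run Synology's SSD detection manually (same call as in /etc/rc)
    /usr/syno/bin/syno_hdd_util --ssd_detect --without-id-log
    # Then check what DSM recorded for each disk
    for d in /run/synostorage/disks/*; do echo "$d: $(cat "$d"/vendor) $(cat "$d"/model) $(cat "$d"/type)"; done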

  12. On 3/25/2018 at 1:06 AM, IG-88 said:

    In theory you can mark a virtual disk as SSD to make it look like an SSD in the VM; I don't know how DSM reacts to this.

    Also, you can read this:

    By default, if ESXi is virtualizing any type of SSD, it will present it to the guest as an SSD. This can be overridden (SSD->HDD or HDD->SSD) as needed, and there may be some drives for which ESXi can't determine SSD status.

     

    From the ESXi console:

    [root@esxi:/] esxcli storage core device list (results are trimmed for clarity)
    t10.NVMe____INTEL_SSDPE2MX020T4_CVPD6114003E2P0TGN__00000001
       Display Name: Local NVMe Disk (t10.NVMe____INTEL_SSDPE2MX020T4_CVPD6114003E2P0TGN__00000001)
       Size: 1907729
       Device Type: Direct-Access
       Vendor: NVMe
       Model: INTEL SSDPE2MX02
       Is SSD: true

     

    The above drive has a vmdk configured on it, and is presented to my Synology VM as /dev/sda:

    synology:/run/synostorage/disks/sda$ cat vendor model type
    VMware  Virtual disk            SSD
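
    If you ever need to flip that flag manually, this is the sort of override I mean; treat it as a sketch, since the exact syntax varies between ESXi releases and the device name is a placeholder:

    # Tag a local device as SSD (use "disable_ssd" for the reverse), then reclaim it
    esxcli storage nmp satp rule add --satp=VMW_SATP_LOCAL --device=<device-id> --option="enable_ssd"
    esxcli storage core claiming reclaim -d <device-id>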

     

    On 3/25/2018 at 4:58 AM, pcdtox02 said:

    Right, I can RAID together my NVMe and SSD as a volume... I'm just saying neither shows up as SSD, either before or after.  Both just show up as HDD.

    I don't think there is any supportable way to get an NVMe drive to work natively as a volume within current versions of DSM; it only works through virtualization via ESXi.

     

    On 3/25/2018 at 4:25 AM, pcdtox02 said:

    I'm also trying just a basic SATA SSD (850 EVO) to use as cache until I find a way to use the NVMe.  But I can't even get it to show up as an SSD.  It just shows up as HDD and the Cache Advisor is greyed out... any thoughts?

    This doesn't make sense to me.  You state that you are allowing ESXi access to physical disks.  If you used a full passthru of your SATA controller, DSM should have full hardware access to the SATA drive and recognize it as an SSD.  I don't know exactly how RDM would behave, though.  A vmdk shows up per the above.  Passthru of your NVMe drive won't work at all. Maybe you are mixing up results from your SATA and NVMe drives?

     

    On 3/24/2018 at 2:54 PM, pcdtox02 said:

    I bought this card and a 960EVO NVMe to use as cache.  I'm not finding any info on compatibility and NVMe cache via expansion cards.... Can anyone point me in the right direction?

    I never tried to run NVMe as a cache drive, since what I was trying to accelerate was an entire volume, and I think the DSM implementation of SSD cache is a poor value for the cost of NVMe SSD.  But I took some scratch space on an NVMe drive (in this case, a Samsung PM961, which is the closest thing I have available to OP's 960 EVO) and created another virtualized SCSI drive as a test:

     

    [screenshot: xp1.jpg]

     

    And here is how it shows up in DSM (as disk #9; also note the SATA SSD on the passthru controller as disk #10):

    [screenshot: xp4.jpg]

     

    And how both are recognized for SSD cache:

    [screenshot: xp5.jpg]

     

    Also understand that any NVMe "controller" is nothing but a mechanism to map PCIe lanes to the NVMe drive.  The driver in ESXi, Windows and Linux is standardized and works with all drives (although some manufacturers like Intel have their own to support extended features). So OP's original request to use NVMe for cache should be possible using ESXi.

     

  13. A faster CPU is going to transcode faster than a slower CPU,  unless the slower CPU has hardware acceleration like Quicksync, AND you have software that will use it.

     

    So despite your reluctance to use PassMark or another CPU benchmark site, you will only confirm what is already known.  If you must test yourself, you should use a standardized transcoding workload.  Pick a file to transcode and run it on each of the platforms.  If you are using Synology Videostation for transcoding, you'll probably have to test with a stopwatch and the GUI.  If you are using Plex, just use the installed ffmpeg and transcode your standard workload from the command line.
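
    For example, a rough standardized test along those lines (the clip name and encoder settings are placeholders; use whatever ffmpeg binary your package provides):

    # Transcode the same source clip on each platform and compare the wall-clock time
    time ffmpeg -i testclip.mkv -c:v libx264 -preset veryfast -f null -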

     

    Again, from a ranking standpoint, you won't get any different results than a pure CPU benchmark unless you have hardware transcoding capability, are running the DS916+ image, and your transcoding software is able to take advantage of that (Synology Videostation or Plex Pass).

     

  14. I don't think this has anything to do with your update.  Your Disk 1 disconnected from the RAID momentarily, which caused your array to go critical (non-redundant).

     

    The disk reconnected and appears to be working, but SMART (the drive self-test information) is now reporting a hardware fail state or pending failure on the drive.  You should replace it.  It is under warranty, so you should be able to print the details of the SMART status (under "Health Info") and WD will send you a new drive.

     

    Once you install the new drive, manage your RAID Group 1 and add it to the array to restore redundancy.

     

    • Thanks 1
    This doesn't get any space back; it just avoids disk access.  The system drive partition structure is intact on all drives even after the adjustment.  So if DSM "reclaims" via hotspare activity or otherwise, it only operates within the preallocated system partition, so there is no possibility of damage to any other existing RAID partition on the drives.

     

    If the system or swap partitions are deleted on any disk, DSM will call the drive Not Initialized.  Any activity that initializes a drive will create them, no exceptions.

  16. Just to clarify the example layout:

     

    /dev/md0 is the system partition, /dev/sda.../dev/sdd.  This is a 4-disk RAID1

    /dev/md1 is the swap partition, /dev/sda../dev/sdd.  This is a 4-disk RAID1

    /dev/md2 is /volume1 (on my system /dev/sda../dev/sdd RAID5)

     

    Failing RAID members manually in /dev/md0 will cause DSM to initially report that the system partition is crashed as long as the drives are present and hotplugged.  But it is still functional and there is no risk to the system, unless you fail all the drives of course.

     

    At that point cat /proc/mdstat will show the failed drives with the (F) flag, but the number of members in the array will still be n.

     

    mdadm --grow -n 2 /dev/md0 forces the system RAID down to the first two available devices (in my case, /dev/sda and /dev/sdb).  Failed drives will continue to be flagged as (F), but the member count will be two.

     

    Adding the removed members back in at that point will result in them being flagged as hotspares (S).

     

    It's probably worth mentioning that I do not use LVM (no SHR) but I think this works fine in systems with SHR arrays.

     

    • Like 1
  17. USE NVMe AS VIRTUAL DISK / HOW TO LIMIT DSM TO SPECIFIC DRIVES

    NOTE: if you just want to use NVMe drives as cache and can't get them to work on DSM 6.2.x, go here.

    Just thought to share some of the performance I'm seeing after converting from baremetal to ESXi in order to use NVMe SSDs.

    My hardware: SuperMicro X11SSH-F with E3-1230V6, 32GB RAM, Mellanox 10GBe, 8-bay hotplug chassis, with 2x WD Red 2TB (sda/sdb) in RAID1 as /volume1 and 6x WD Red 4TB (sdc-sdh) in RAID10 as /volume2

     

    I run a lot of Docker apps installed on /volume1. This worked the 2TB Reds (which are not very fast) pretty hard, so I thought to replace them with SSD.  I ambitiously acquired NVMe drives (Intel P3500 2TB) to try and get those to work in DSM. I tried many tactics to get them running in the baremetal configuration. But ultimately, the only way was to virtualize them and present them as SCSI devices.

     

    After converting to ESXi, /volume1 is on a pair of VMDKs (one on each NVMe drive) in the same RAID1 configuration. This was much faster, but I noted that Docker causes a lot of OS system writes, which were spanning all the drives (since Synology replicates the system and swap partitions across all devices). I was able to isolate DSM I/O to the NVMe drives by disabling the system partition on the spinning disks (the example below excludes disks 3 and 4 from a 4-drive RAID):

    root@dsm62x:~# mdadm /dev/md0 -f /dev/sdc1 /dev/sdd1
    mdadm: set /dev/sdc1 faulty in /dev/md0
    mdadm: set /dev/sdd1 faulty in /dev/md0
    
    root@dsm62x:~# mdadm --manage /dev/md0 --remove faulty
    mdadm: hot removed 8:49 from /dev/md0
    mdadm: hot removed 8:33 from /dev/md0
    
    root@dsm62x:~# mdadm --grow /dev/md0 --raid-devices=2
    raid_disks for /dev/md0 set to 2
    
    root@dsm62x:~# cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
    md2 : active raid5 sdc3[5] sda3[0] sdd3[4] sdb3[2]
          23427613632 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
    
    md1 : active raid1 sdc2[1] sda2[0] sdb2[3] sdd2[2]
          2097088 blocks [16/4] [UUUU____________]
    
    md0 : active raid1 sda1[0] sdb1[1]
          2490176 blocks [2/2] [UU]
    
    unused devices: <none>
    
    root@dsm62x:~# mdadm --add /dev/md0 /dev/sdc1 /dev/sdd1
    mdadm: added /dev/sdc1
    mdadm: added /dev/sdd1
    
    root@dsm62x:~# cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
    md2 : active raid5 sdc3[5] sda3[0] sdd3[4] sdb3[2]
          23427613632 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
    
    md1 : active raid1 sdc2[1] sda2[0] sdb2[3] sdd2[2]
          2097088 blocks [16/4] [UUUU____________]
    
    md0 : active raid1 sdd1[2](S) sdc1[3](S) sda1[0] sdb1[1]
          2490176 blocks [2/2] [UU]
    root@dsm62x:~#

     

    If you want to disable the swap partition I/O in the same way, substitute /dev/md1 and its sd(x)2 members into the procedure above, as sketched below (NOTE: swap partition management was not verified with DSM 6.1.x, but it should work).
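
    A literal substitution looks like this (a sketch only, per the note above; the device names follow the example and must be adjusted to your system):

    # Fail and remove the swap-partition members on the disks to exclude
    mdadm /dev/md1 -f /dev/sdc2 /dev/sdd2
    mdadm --manage /dev/md1 --remove faulty
    # Shrink the swap RAID1 to the remaining members
    mdadm --grow /dev/md1 --raid-devices=2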

     

    After this, no Docker or DSM system I/O ever touches a spinning disk. Results: The DSM VM now boots in about 15 seconds. Docker used to take a minute or more to start and launch all the containers, now about 5 seconds.  Copying to/from the NVMe volume maxes out the 10GBe interface (1 gigaBYTE per second) and it cannot fill the DSM system cache; the NVMe disks can sustain the write rate indefinitely.  This is some serious performance, and a system configuration only possible because of XPEnology!

     

    Just as a matter of pointing out what is possible with Jun's boot loader, I was able to move the DSM directly from baremetal to ESXi, without reinstalling, by passthru of the SATA controller and the 10GBe NIC to the VM.  I also was able to switch back and forth between USB boot using the baremetal bootloader menu option and ESXi boot image using the ESXi bootloader menu option. Without the correct VM settings, this will result in hangs, crashes and corruption, but it can be done.

     

    I did have to shrink /volume1 to convert it to the NVMe drives (because some space was lost by virtualizing them), but ultimately I was able to retain all aspects of the system configuration and many months of btrfs snapshots when converting from baremetal to ESXi. For those who are contemplating such a conversion, it helps to have a mirror copy to fall back on, because it took many iterations to learn the ideal ESXi configuration.

    • Like 3
  18. On 1/13/2018 at 2:58 AM, test4321 said:

    Hey guys,

    Now for networking I want to go 10GBe SFP+

    I am looking at:

    2X - Mellanox ConnectX-2

    https://www.ebay.com/itm/391459428428

     

    Was reviewing this thread and saw discussion that the Mellanox ConnectX-2 might not be supported by the Synology driver set.  I can confirm that the standard Mellanox driver supports ConnectX-2 single- and dual-port 10GBe cards on baremetal XPEnology 6.2 with no problem, tested on my own system.  However, I switched to a ConnectX-3 for its PCIe 3.0 and SR-IOV support under ESXi.

  19. Answers:

     

    1) Don't pass through the USB controller; just virtualize the USB devices you need in the VM.  This also works for a physical synoboot key if you want to do that.  A related hint: don't use two USB keys with the same VID/PID (obvious in hindsight; see the note after this list).

    2) The ESXi option on the grub boot menu hides the boot drive/controller (it is not affected by SATAPortMap, etc.).  I got better results by making sure that the drive and controller order is correct in the VM (boot drive and controller first).
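
    For context on the VID/PID hint: Jun's loader identifies its boot key by the vid/pid values in grub.cfg, so each key/loader pair needs its own matching values. The stock defaults look like this; edit them to match each physical key so no two loaders on the network share the same pair:

    # grub.cfg (Jun's loader)
    set vid=0x058f
    set pid=0x6387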
