XPEnology Community

flyride

Moderator
Posts posted by flyride

  1. For others who might be reading this: software management and recovery of RAID is gradually becoming a more capable strategy than hardware RAID.  Even enterprise storage systems are moving to software models instead of custom circuitry.  In the case of DSM, using btrfs and intelligent file recovery is a significant enhancement over the hardware implementation.  But enough said, you've explained your preference.

     

    Regarding your loader and DSM versions - are you asking me if you should be nervous about being on the latest and greatest because I'm not?

    If that's your question, the answer is: it depends.  6.2 support is not as robust as earlier 6.x versions, and depending on your hardware, there may be problems with upgrading.  I have one system on 6.2-23739U2 because if I upgrade, the Realtek NIC will cease to work, at least until a new loader and/or driver-signing strategy comes forth.  My main system is actually on DSM 6.1.7-15284U2 because of its complexity and the amount of testing I would need to do before changing major DSM versions.

  2. What is the logic of the hardware RAID to run DSM?  It seems a bit backwards to run an OS designed for software RAID on top of a hardware RAID.

     

    In any case, if you want to preserve the system state, do it all within DSM, and end up with a larger VMDK, you can add a second 4TB VMDK and mirror it (RAID1) in DSM, then delete the original VMDK, replace it with a 4TB VMDK, and remirror; your DSM storage will then be expanded for you.  Then remove one of the vdisks and force the array back to one member (RAID1 to Basic) using mdadm.
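    If you go the mdadm route for that last step, here is a minimal sketch, assuming the data array is /dev/md2 and the vdisk to be dropped is /dev/sdb3 (both placeholders - check cat /proc/mdstat on your own system first):

    $ mdadm /dev/md2 --fail /dev/sdb3 --remove /dev/sdb3    # drop the second vdisk from the mirror
    $ mdadm --grow /dev/md2 --raid-devices=1 --force        # shrink the array definition to a single member
    $ cat /proc/mdstat                                      # confirm md2 is now a one-member array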

  3. So far Synology only supports NVMe for SSD cache, and DSM only has that capability with the DS918+ image.  Other images have no NVMe support.

     

    Several people (myself included) have attempted to modify DSM to make NVMe supported as a normal disk device.  To my knowledge, the only way this can be done today is using a hypervisor.  Here's some detail on how this was successfully implemented: https://xpenology.com/forum/topic/12391-nvme-optimization-baremetal-to-esxi-report/?tab=comments#comment-88243

  4. Yes, you need 3 drives for RAID5.  You can set up RAID1 and then convert to RAID5 when you have more than two drives, if needed.

    Your other statements are accurate for RAID5.  SHR will allow you to mix drive sizes, but the capacity of one of the largest drives will be entirely reserved for parity.
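    As a quick worked example (drive sizes are hypothetical): with 4TB + 4TB + 8TB + 8TB drives, SHR yields roughly 4+4+8 = 16TB usable out of 24TB raw - the equivalent of one 8TB drive, the largest in the set, is consumed by redundancy.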

     

    As mentioned, DSM is normally installed to all drives.  You can isolate Plex to the SSD if you install it on a separate Basic volume (obviously with no redundancy).

    I do something like this with my system - 2x SSD's in RAID1 for Docker/Plex, 8xHDD in RAID10 for media files and other storage.

     

    You actually can remove the DSM replicas from specific devices - i.e. from the HDD's, leaving DSM on the SSD's - but how to do that is a little obscure (and unsupported), and there is no way to reclaim the space that DSM would have used.  It's a bit of an extreme tuning strategy and I would not consider it without having at least 2 devices for DSM.  If you want to learn more, read this: How to manage DSM to specific drives

     

    Quote

    Correct me if I'm wrong, but for RAID 5 the following limitations apply:

    • I need at least three drives of the same size (unless I want to underutilize the larger drives)
    • The only expansion I can do is add more drives of the same size, or they will be underutilized
    • To use larger drives in the future, I'd have to replace smaller ones one at a time and rebuild
    • During the replacement cycle, the larger drives will be underutilized until all the smaller drives are replaced

    Where does DSM install itself initially? Would it be possible to force the OS to install on the SSD drive and dedicate the SSD to that and the other apps I want to host like PLEX?

     

  5. The system partition is the multi-drive RAID1, spanning all member drives, where DSM is stored.  The array is being initialized during boot before the USB drives are online, so it goes critical.  Using USB devices as array members is non-standard, so something isn't quite going right during boot.  Repairing the system partition is just resyncing that RAID1.
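    If you want to see what that repair amounts to (or do it by hand), a rough sketch - the device name below is an example only, and this is roughly what I believe the repair action does rather than its exact implementation:

    $ cat /proc/mdstat                  # md0 (the DSM system partition) will show a missing member, e.g. [UU_U]
    $ mdadm /dev/md0 --add /dev/sdc1    # re-add the missing member's first partition; the RAID1 then resyncs on its own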

     

    Edit: didn't see Balrog's post but he basically says the same.

  6. All good advice.  However, if you ever have to manually troubleshoot your volumes, a standard RAID1/RAID5 array is easier to work with than SHR (which is multiple RAID arrays joined via LVM).  So if you don't need that extra 500GB of storage, I'd leave those two drives out and just add to your array when you get additional, larger drives.
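    If you ever want to see that layering for yourself, a quick read-only look from the shell (assuming a typical SHR pool; output will vary):

    $ cat /proc/mdstat    # an SHR pool usually shows more than one data array (md2, md3, ...)
    $ sudo pvs            # those md devices appear as LVM physical volumes...
    $ sudo lvs            # ...joined into a single logical volume that DSM presents as the storage volume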

     

    Also, consider skipping the cache.  It is less effective than you might think, and much of its value is duplicated by the large amount of RAM you have, which already caches writes.  There are many, many instances of corrupted volumes due to SSD cache.  At a minimum, do some testing with and without cache on a workload you are likely to run.  I think you'll find that the benefit isn't meaningful enough to warrant its use.

  7. You are on the XPenology forum.  The point of DSM (the operating system enabled by the XPenology loaders) is to implement RAID in software, not at the hardware/ASIC level.  So what you really want for the best functionality and performance is drive port density to match your desired number of drives.

     

    So you should look at the hardware compatibility threads for the SATA controller chips that are supported by DSM and see if any of those are implemented in mini-PCI.  I don't know of RAID controllers on mini-PCI, but again, that should not be what you are looking for.

     

    FWIW, most modern RAID controllers can be reflashed to perform as multi-channel SATA controllers, which would maximize their usability for DSM/XPenology.
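    As a very rough outline for the common LSI SAS2008-based cards (the firmware file names below are board-specific placeholders - follow a crossflash guide for your exact card before running anything):

    $ sas2flash -listall                          # confirm the controller and current firmware are visible
    $ sas2flash -o -e 6                           # erase the existing IR/RAID firmware
    $ sas2flash -o -f 2118it.bin -b mptsas2.rom   # flash the IT-mode (plain HBA) firmware and boot ROM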

  8. You should be able to deselect any boot source in your BIOS boot priority menu.  If a USB key is removed and the system powered on, that will usually cause the BIOS to forget any preferred boot settings for that key (even if the key is reinstalled later).

     

    It also might be worth trying to disable legacy mode in case you are seeing the same key twice (once for UEFI, once for legacy).

     

  9. To date, Synology doesn't provide a DAS connectivity option.  I had a similar desire to use DSM as the back-end storage for a device that could only connect via USB.  I was able to make this work by purchasing a $25 "NanoPi NEO2", which is a tiny Linux computer that 1) has a gigabit Ethernet interface, and 2) supports USB OTG mode.  I built a sparsely populated image file on a Synology share, NFS-mounted it on the NanoPi, and used the Linux g_mass_storage gadget to emulate a USB drive backed by the image file.
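    For anyone curious, the rough shape of that setup (the paths, size and hostname below are placeholders):

    On the DSM box, create a sparse image file on a share:
    $ truncate -s 512G /volume1/das/usbdisk.img

    On the NanoPi, mount the share and expose the image through the USB OTG port:
    $ mount -t nfs dsm:/volume1/das /mnt/das
    $ modprobe g_mass_storage file=/mnt/das/usbdisk.img removable=1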

     

    A Raspberry Pi only has 100Mbit Ethernet, and that's not fast enough to provide performance similar to traditional USB DAS.

     

  10. The board will work fine.  I am running the DS918+ image right now on a J4105.  As long as your RAM is compatible with the motherboard, it doesn't matter very much.

     

    The board has an onboard Realtek NIC.  This works fine with the DS918+ image.  You want a second NIC?  Just make sure the Realtek or Intel driver is supported.  I have not seen anyone with this board running a second NIC however.

     

  11. You have a fairly typical setup.  I am running on this same motherboard with no problem.

     

    The SSD won't be very helpful to you aside from SSD cache, and frankly many do not recommend SSD cache because of the DSM stability and volume corruption problems that are regularly encountered with it.

     

  12. 1 hour ago, Jamzor said:

    I have an old version, DSM 6.0.2-8451 Update 8 (installed about a year ago; I have not updated a single time since).

    Now it's time to update my system to the latest possible version (I guess 6.2-23739).

    So do I go through this tutorial first and then update from 6.1.x to 6.2, or is there a way to jump directly to the latest version?

     

    Also, a little question about data integrity: I have a lot of data on my NAS that I can't really back up.  Is there a risk of losing my data here?  I'm running it bare-metal, no virtualisation.

    If there is a risk, how can I secure my data before attempting to update (without backing up everything... 6TB or so)?

     

    Why is it time to update your system to the latest possible version?  The latest possible version is alpha status for XPenology.

    Upgrading to the current version is not yet a highly robust process and there are a lot of ways for it to go wrong.

    Because of this, you risk making your data inaccessible.  If that happens and you are unable to follow the recovery steps, you may inadvertently lose your data.

     

    Even if upgrading worked consistently on all hardware, it would be unwise to attempt it without a backup.

     

    If you simply must proceed, perhaps you could attempt to duplicate your current system using a different set of drives and a second USB stick on your existing motherboard, NIC and PCIe disk controller card (if you have one).  In other words, carefully remove your production drives in order and set them aside, and remove your production USB stick.  Then configure a test USB stick and test disks.  If you successfully upgrade this TEST system and take careful notes, you will know how to upgrade successfully when you reattach your original USB stick and disks.

  13. I've found that using the Synology Assistant app can be more reliable than find.synology.com as it's fully contained within your network and doesn't rely on any Synology cloud services.

  14. What is your objective in using ESXi to host DSM as a guest?  Access to DSM features?  Improved hardware compatibility?  Performance?  Reducing the number of physical boxes?  Where is your storage - drives physically connected to the ESXi platform, or a SAN?  You should be able to answer these questions in order to decide how to approach your DSM storage strategy.

     

    For me, I am using ESXi because it is the most effective way to get directly connected NVMe SSD's functionally supported on DSM.  If it were not for that, I would run DSM baremetal for the best performance.  Initially, I used VMDK's to assign storage to DSM from NVMe-based storage pools.  Later I discovered that the NVMe drives could be passed via physical RDM and presented to DSM via SCSI translation.  This offers spectacular performance by exposing the NVMe SSD hardware to DSM as directly as possible.  Similarly, hardware that is natively supported by DSM (10GbE NIC, SATA controller) is passed through to the VM for the best support and performance.
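    For reference, creating the physical RDM pointer is a one-liner in the ESXi shell (the device identifier and datastore path below are placeholders - list your devices under /vmfs/devices/disks first):

    $ vmkfstools -z /vmfs/devices/disks/<nvme_device_id> /vmfs/volumes/datastore1/DSM/nvme-rdm.vmdk

    The resulting nvme-rdm.vmdk is then attached to the DSM VM as an existing disk on a SCSI controller.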

     

    Is it even possible to do snapshots with RDM?  I thought that Dependent mode is only available with a VMDK.  Even with a VMDK, I think you would have to have a lot of free space in your storage pool.  This might be feasible in a large VM environment with an external SAN, where the objective of hosting DSM is not to maximize storage available.  Again, in my use case of using ESXi specifically to host DSM, I want all my directly-connected storage allocated to it.

     

    My strategy to test upgrades with ESXi is to maintain a small storage pool separate from DSM (could be the same pool running ESXi itself) and build a second test VM with the same attributes as your production VM.  That way it's easy to copy off and/or burn down and rebuild without affecting production.  I've never considered using a VMWare snapshot to resolve this due to the limited size of the storage pool, but if I had a dedicated SAN behind ESXi, that might be more feasible as a test approach.

     

  15. I'm sorry but you are in a bad way now, and your actions thus far have probably made things more difficult to recover.  There is not a step-by-step method of fixing your problem at this point.  However, depending on how much time and energy you want to devote to learning md and lvm, you might be able to retrieve some data from the system.

     

    I have a few thoughts that may help you, and also explain the situation for others:

    1. A RAID corruption like this is a case where SHR makes the problem much harder to resolve.  If this were a plain RAID1, RAID5, RAID6 or RAID10, the solution would be easier.  Personally, I think about this when building a system and selecting the array redundancy strategy.
       
    2. MDRAID writes a comprehensive superblock containing information about the entire array to each member partition on each drive.  When the array is healthy, you can move drives around without much of a complaint from DSM.  When the array is broken, moving drives can cause an array rebuild to fail.  You should restore the drive order to what it was before the crash prior to doing anything else.
       
    3. /proc/mdstat reports four arrays in the system.  /dev/md0 is the OS array and is healthy.  /dev/md1 is the swap array and is healthy.  These are multi-member RAID1's, so they remain functional even with a drive missing from the system.  However, it's odd that the missing member in each of these arrays is out of order, and the numbering does not match between the arrays.
       
    4. I believe /dev/md2 and /dev/md3 are the two arrays that make up the SHR.  Each is reporting only 1 partition present (there should be 4 in /dev/md2 and 3 in /dev/md3), which is why everything is crashed.
       
    5. When the drives are in the correct physical order, hopefully /dev/md2 and /dev/md3 will show up in a critical (but functional) state instead of crashed.  If so, you just need to replace your bad drive, and repair your array.
       
    6. If you still see missing member partitions in /dev/md2 and /dev/md3, you may want to try to stop those arrays and force-assemble them (see the rough sketch after this list).  The system can scan the superblocks and guess which partitions belong to which arrays, or you can manually specify them.  Again, there isn't a step-by-step method for this, as it will depend on what /proc/mdstat and the individual partition superblock dumps tell you at that point.  You might start by reviewing the recovery examples in this thread here and mdadm administrative information here.
       
    7. If you are successful in getting your arrays started, you might end up in the situation where the volume is accessible from the command line but not visible in DSM or via a network share.  In that case, you can get data off by copying directly from /volume1/<share> to an external drive.  If that works, I suggest copying the data off, deleting your volume and storage pool, recreating them, and copying your data back on.
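    As a very rough sketch of step 6 (the array and partition names are examples only - yours will come from /proc/mdstat and the superblock dumps):

    $ mdadm --examine /dev/sda5      # dump a member partition's superblock to see which array it belongs to
    $ mdadm --stop /dev/md2          # stop the broken array
    $ mdadm --assemble --force /dev/md2 /dev/sda5 /dev/sdb5 /dev/sdc5 /dev/sdd5
    $ cat /proc/mdstat               # check whether md2 has come back in a critical but functional state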
       

    I wish I could be of more help, but this is pretty far down the rabbit hole.  Take your time, and good luck to you.

  16. How did the array look before the crash?  Right now DSM seems confused as to the construction of the original array and your drives are in an odd order.

     

    Did you rearrange the drives or remove drives from the array in an attempt to fix it?

     

    SSH into your box and run:

    $ cat /proc/mdstat

    Post the results here.
