sublimnl

Have I lost everything?

sublimnl replied to sublimnl's topic in General Post-Installation Questions/Discussions (non-hardware specific)

In case someone else runs across this topic in the future, I WAS ABLE TO GET EVERYTHING BACK. Had to use Recovery Explorer Pro to get the data back after the failed restriping. At first it seemed like it could not recover the data (but at least was able to get file names), so I reached out to their support team and they offered to have a look over TeamViewer. The guy took about 10 mins sussing out the drives, changing some recovery settings that were over my head despite working in all levels of IT Infrastructure for 25 years now, it was cool to watch him piece it back together. In the end, left it scanning overnight and I was able to fully recover everything. There were maybe 10 files that were corrupted out of over 300,000. Highly recommend these guys if someone else ends up in the same boat.

Have I lost everything?

sublimnl replied to sublimnl's topic in General Post-Installation Questions/Discussions (non-hardware specific)

OK, I managed to ssh into the xpenology VM. here is some output from there which can hopefully shed some light on what needs to be done next. It seems like it is still trying to reshape the array (46.5% (903431996/1942787584) finish=109615.1min speed=158K/sec) but I don't think its actually doing anything as it only says 158K/sec and I dont see any activity on the disks themselves. If I am interpreting the output correctly it looks like it thinks disk 0 is missing and only 1 and 2 are present even though disk 0 should be /dev/sdb5, which IS present and online. /dev/sdb is the disk that I originally copied all my data to (and then created a RAID1 with /dev/sdc5). /dev/sdb is also the disk that went offline during the expansion causing things to fall apart. I also see the reshape positions are different between the three disks, which doesn't seem like a great thing. Hoping someone out there can help 🙏 # cat /proc/mdstat Personalities : [raid6] [raid5] [raid4] md127 : active raid5 sdc5[1] sdd5[2] 1942787584 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/2] [_UU] [=========>...........] reshape = 46.5% (903431996/1942787584) finish=109615.1min speed=158K/sec unused devices: <none> # mdadm --detail /dev/md127 /dev/md127: Version : 1.2 Creation Time : Sat Sep 2 04:39:01 2023 Raid Level : raid5 Array Size : 1942787584 (1852.79 GiB 1989.41 GB) Used Dev Size : 1942787584 (1852.79 GiB 1989.41 GB) Raid Devices : 3 Total Devices : 2 Persistence : Superblock is persistent Update Time : Wed Sep 6 03:08:54 2023 State : clean, degraded, reshaping Active Devices : 2 Working Devices : 2 Failed Devices : 0 Spare Devices : 0 Layout : left-symmetric Chunk Size : 64K Consistency Policy : resync Reshape Status : 46% complete Delta Devices : 1, (2->3) Name : Synology:3 UUID : e8adb5a7:6b89d8ea:fa38fa83:5679ef65 Events : 15528 Number Major Minor RaidDevice State - 0 0 0 removed 1 8 37 1 active sync /dev/sdc5 2 8 53 2 active sync /dev/sdd5

sublimnl started following Have I lost everything? September 6, 2023

Have I lost everything?

sublimnl posted a topic in General Post-Installation Questions/Discussions (non-hardware specific)

I've just recently migrated to xpenology with DSM 7.2 and things went seriously wrong while expanding a storage pool. Here is the run-down of what I have done so far... I was moving off on an unraid array with 4 disks to xpenology. I freed up one of the disks on my physical unraid machine and moved it to my new xpenology VM on ESXi 8. Attached that disk to my xpenology VM as an RDM (via a spare 4 port USB3 enclosure I had laying around), created a new SHR volume and copied all my data over. Freed another disk from unraid and moved it to the Xpenology VM, added it to the storage pool and waited for that to sync up - great, now I have a RAID1 array on xpenology with all my data. Feeling confident that my data was now protected in Xpenology, I moved the remaining 2 disks from my unraid array into Xpenology. Now I have all 4 disks in my USB3 enclosure. Each disk is mapped as an RDM to my xpenology VM. Went into DSM and started pool expansion using the 2 new disks that were just added. Expansion ran for 12 hours and was maybe 20% complete. At this point I started looking into why it was taking so long and found out that I had accidentally attached the enclosure to a USB 2.0 port. Woops. I did some reading and found that I could safely shutdown the xpenology VM via the shutdown option in DSM and it should resume expansion when powered back on. Shut it down, moved the enclosure to a USB3 port, remapped the RDM's, being careful to make sure they were attached to the VM on the same exact SATA addresses. Booted back up. As advertised, the expansion picked up where it left off and was chugging along much faster. It was now saying another 15 hours to finish, which I felt much better about. After about 2 hours I got an email saying disk 1 (the one I originally created the pool on) had crashed. This was in the middle of my work day, so I didn't have time to investigate right then and there. I did see that the expansion was still going, and I could still access my data, so at this point I crossed my fingers that it would complete and I would at least have an SHR pool with 3 disks at the end. A couple hours later, got another email saying the ENTIRE POOL had crashed. WTF. After some investigating I found that ESX had completely dropped the USB connection to the enclosure and I couldn't even see the devices anymore from ESX's perspective. I have now gotten the USB connection stable again, but I cannot even boot into DSM. I have noticed that I can ping the IP address of the VM while it is booting, but once it starts load the kernel pings drop again and never come back (supposedly because DSM never finishes initializing). I noticed this behavior previoulsy with the pings while the system was healthy. It seems like the system does an initial boot which brings networking online, then when DSM starts to load pings drop again until everything is up and running. So I guess that is normal behavior even on a healthy system? So it seems like something is going wrong when DSM is loading, but without networking I have no way to see whats going on inside the VM. I can see that CPU is steady at about 30% when this happens, like it is stuck in a loop of some sort. No disk activity is happening at this time. Any ideas on what steps I should take from here?

sublimnl joined the community September 5, 2023

Performance when adding disk to pool in ESXi 8

sublimnl posted a topic in DSM 7.x

Just wondering if anyone else has had a peek at the disk monitoring for their Xpenology VM when adding a disk to a volume. Mine looks like this, which basically means its not doing anything 25-30% of the time. Not sure if this is normal or not. Disks are in an external USB3 JBOD enclosure (I know, I know) attached to a USB3 port and are mapped to the VM as RDM's.

September 5, 2023

Sign In

Posts

Joined

Last visited

sublimnl's Achievements

Newbie (1/7)

Reputation

Have I lost everything?

Have I lost everything?

Have I lost everything?

Performance when adding disk to pool in ESXi 8

Forums

What's new

MUST READ

Members