kaku Posted September 18, 2021 #1

This is a catastrophe for me. I know a lot of it is my fault, but please help out if you can.

My setup is a 3-HDD SHR (as shown in the pic). I am only concerned with Volume 1 (Drives 1, 2, 3). Drive 2 (2 TB) was failing with bad sectors. I could not get physical access, and told someone who did to turn off the NAS. Apparently he did not! Before the drive could be replaced, Drive 1 (4 TB) crashed one morning. This happened within a week or two. After that I shut the NAS down for good until I could recover the data. Today is that day... hopefully.

I am posting some basic data below. If someone could help me out, please do. I am confused between md2 (/dev/sd*5) and md3 (/dev/sd*6), so I will be posting data for both. I feel md3 is the current one.

# cat /proc/mdstat

root@ANAND-NAS:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [raidF1]
md3 : active raid5 sda6[5] sdc6[4]
      1953485824 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/2] [_UU]
md2 : active raid5 sdc5[3] sda5[5]
      1943862912 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/2] [_UU]
md5 : active raid1 sdc7[1]
      1953494912 blocks super 1.2 [2/1] [_U]
md4 : active raid1 sdd3[0]
      971940544 blocks super 1.2 [1/1] [U]
md1 : active raid1 sdd2[3] sdc2[2] sdb2[1] sda2[0]
      2097088 blocks [12/4] [UUUU________]
md0 : active raid1 sdb1[2] sdc1[1] sdd1[3]
      2490176 blocks [12/3] [_UUU________]
unused devices: <none>

# ls /dev/sd* /dev/md* /dev/vg*

root@ANAND-NAS:~# ls /dev/sd*
/dev/sda  /dev/sda1  /dev/sda2  /dev/sda5  /dev/sda6  /dev/sda7
/dev/sdb  /dev/sdb1  /dev/sdb2  /dev/sdb3  /dev/sdb5  /dev/sdb6
/dev/sdc  /dev/sdc1  /dev/sdc2  /dev/sdc5  /dev/sdc6  /dev/sdc7
/dev/sdd  /dev/sdd1  /dev/sdd2  /dev/sdd3
root@ANAND-NAS:~# ls /dev/md*
/dev/md0  /dev/md1  /dev/md2  /dev/md3  /dev/md4  /dev/md5
root@ANAND-NAS:~# ls /dev/vg*
/dev/vga_arbiter

/dev/vg1000:
lv

# mdadm --detail /dev/md2

root@ANAND-NAS:~# mdadm --detail /dev/md2
/dev/md2:
Version : 1.2
Creation Time : Sat Feb 11 20:12:44 2017
Raid Level : raid5
Array Size : 1943862912 (1853.81 GiB 1990.52 GB)
Used Dev Size : 971931456 (926.91 GiB 995.26 GB)
Raid Devices : 3
Total Devices : 2
Persistence : Superblock is persistent
Update Time : Sat Sep 18 16:48:41 2021
State : clean, degraded
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
Name : ANAND-NAS:2 (local to host ANAND-NAS)
UUID : e0f9ada5:5340d9a9:4d65da22:4a8309ef
Events : 3405974

Number Major Minor RaidDevice State
-      0     0     0          removed
3      8     37    1          active sync /dev/sdc5
5      8     5     2          active sync /dev/sda5

# mdadm --examine /dev/sd[abcde]5 | egrep 'Event|/dev/sd'

root@ANAND-NAS:~# mdadm --examine /dev/sd[abcde]5 | egrep 'Event|/dev/sd'
/dev/sda5:
Events : 3405974
/dev/sdb5:
Events : 1984014
/dev/sdc5:
Events : 3405974

# mdadm --examine /dev/sd[abcdefklmnopqr]5 >>/tmp/raid.status

root@ANAND-NAS:~# mdadm --examine /dev/sd[abcdefklmnopqr]5 >>/tmp/raid.status
root@ANAND-NAS:~# cat /tmp/raid.status
/dev/sdb5:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : e0f9ada5:5340d9a9:4d65da22:4a8309ef
Name : ANAND-NAS:2 (local to host ANAND-NAS)
Creation Time : Sat Feb 11 20:12:44 2017
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 1943862912 (926.91 GiB 995.26 GB)
Array Size : 1943862912 (1853.81 GiB 1990.52 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
Unused Space : before=1968 sectors, after=0 sectors
State : clean
Device UUID : 94883b4f:ce8c63c1:892ec4a8:53428795
Update Time : Tue Jul 20 20:32:58 2021
Checksum : 89f7077e - correct
Events : 1984014
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 0
Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)

/dev/sdc5:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : e0f9ada5:5340d9a9:4d65da22:4a8309ef
Name : ANAND-NAS:2 (local to host ANAND-NAS)
Creation Time : Sat Feb 11 20:12:44 2017
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 1943862912 (926.91 GiB 995.26 GB)
Array Size : 1943862912 (1853.81 GiB 1990.52 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
Unused Space : before=1968 sectors, after=0 sectors
State : clean
Device UUID : 88873b1c:649dc11d:5bb0f405:ee1826c7
Update Time : Sat Sep 18 16:48:41 2021
Checksum : 428a02b8 - correct
Events : 3405974
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 1
Array State : .AA ('A' == active, '.' == missing, 'R' == replacing)

/dev/sda5:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : e0f9ada5:5340d9a9:4d65da22:4a8309ef
Name : ANAND-NAS:2 (local to host ANAND-NAS)
Creation Time : Sat Feb 11 20:12:44 2017
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 1943862912 (926.91 GiB 995.26 GB)
Array Size : 1943862912 (1853.81 GiB 1990.52 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
Unused Space : before=1968 sectors, after=0 sectors
State : clean
Device UUID : d9d33dec:f7fcb455:96363412:16b96f43
Update Time : Sat Sep 18 16:48:41 2021
Checksum : d308d501 - correct
Events : 3405974
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 2
Array State : .AA ('A' == active, '.' == missing, 'R' == replacing)

[the file then repeats the /dev/sdb5 and /dev/sdc5 entries above verbatim]

# mdadm --detail /dev/md3

root@ANAND-NAS:~# mdadm --detail /dev/md3
/dev/md3:
Version : 1.2
Creation Time : Thu Mar 1 15:12:43 2018
Raid Level : raid5
Array Size : 1953485824 (1862.99 GiB 2000.37 GB)
Used Dev Size : 976742912 (931.49 GiB 1000.18 GB)
Raid Devices : 3
Total Devices : 2
Persistence : Superblock is persistent
Update Time : Sat Sep 18 16:48:41 2021
State : clean, degraded
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
Name : ANAND-NAS:3 (local to host ANAND-NAS)
UUID : 16a6ee01:9b92f0fd:bc15ec91:deec644a
Events : 386242

Number Major Minor RaidDevice State
-      0     0     0          removed
5      8     6     1          active sync /dev/sda6
4      8     38    2          active sync /dev/sdc6

# mdadm --examine /dev/sd[abcde]6 | egrep 'Event|/dev/sd'

root@ANAND-NAS:~# mdadm --examine /dev/sd[abcde]6 | egrep 'Event|/dev/sd'
/dev/sda6:
Events : 386242
/dev/sdb6:
Events : 361776
/dev/sdc6:
Events : 386242

# mdadm --examine /dev/sd[abcdefklmnopqr]6 >>/tmp/raid.status

root@ANAND-NAS:~# mdadm --examine /dev/sd[abcdefklmnopqr]6 >>/tmp/raid.status
root@ANAND-NAS:~# cat /tmp/raid.status
[the partition-5 entries shown above are repeated first, because the file was appended to; the new entries follow]

/dev/sda6:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 16a6ee01:9b92f0fd:bc15ec91:deec644a
Name : ANAND-NAS:3 (local to host ANAND-NAS)
Creation Time : Thu Mar 1 15:12:43 2018
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 1953485856 (931.49 GiB 1000.18 GB)
Array Size : 1953485824 (1862.99 GiB 2000.37 GB)
Used Dev Size : 1953485824 (931.49 GiB 1000.18 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
Unused Space : before=1968 sectors, after=32 sectors
State : clean
Device UUID : ace310f3:16f9f59b:6760b452:ab0266f0
Update Time : Sat Sep 18 16:48:41 2021
Checksum : 74926764 - correct
Events : 386242
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 1
Array State : .AA ('A' == active, '.' == missing, 'R' == replacing)

/dev/sdb6:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 16a6ee01:9b92f0fd:bc15ec91:deec644a
Name : ANAND-NAS:3 (local to host ANAND-NAS)
Creation Time : Thu Mar 1 15:12:43 2018
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 1953485856 (931.49 GiB 1000.18 GB)
Array Size : 1953485824 (1862.99 GiB 2000.37 GB)
Used Dev Size : 1953485824 (931.49 GiB 1000.18 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
Unused Space : before=1968 sectors, after=32 sectors
State : clean
Device UUID : fca446c7:fa907b0b:96b8782b:1804d140
Update Time : Mon Aug 9 21:28:23 2021
Checksum : e14a402b - correct
Events : 361776
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 0
Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)

/dev/sdc6:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 16a6ee01:9b92f0fd:bc15ec91:deec644a
Name : ANAND-NAS:3 (local to host ANAND-NAS)
Creation Time : Thu Mar 1 15:12:43 2018
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 1953485856 (931.49 GiB 1000.18 GB)
Array Size : 1953485824 (1862.99 GiB 2000.37 GB)
Used Dev Size : 1953485824 (931.49 GiB 1000.18 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
Unused Space : before=1968 sectors, after=32 sectors
State : clean
Device UUID : ff8b56d3:fe921dbb:29ae5f41:785222d3
Update Time : Sat Sep 18 16:48:41 2021
Checksum : 4567472d - correct
Events : 386242
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 2
Array State : .AA ('A' == active, '.' == missing, 'R' == replacing)

Also, I may have made one array read-only by following this post: DANGER : Raid crashed. Help me restore my data!

I feel md3 is the current one, due to its creation date.
flyride Posted September 22, 2021 #2

On 9/18/2021 at 5:30 AM, kaku said:
  I am confused between md2 (/dev/sd*5) and md3 (/dev/sd*6), so I will be posting data for both. I feel md3 is the current one.

Yes, this is a problem. You have an SHR, which is comprised of multiple arrays bound together. All of the arrays are important; there is no single array that is the "current" one.

Follow this post specifically and post the discovery of your SHR arrays, then we can build an array map that helps us visualize what is going on.

https://xpenology.com/forum/topic/14337-volume-crash-after-4-months-of-stability/?do=findComment&comment=107971
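For reference, the discovery the linked post asks for comes down to a handful of read-only queries, roughly like the following (a sketch; the md device names match this system but will differ elsewhere). These are the same commands whose output appears in the next few posts, and none of them modify anything:

sudo cat /proc/mdstat            # which arrays exist and which members are missing
sudo mdadm --detail /dev/md2     # per-array membership and event counts; repeat for /dev/md3 and /dev/md5
sudo vgdisplay -v                # how the arrays are bound into the vg1000 volume group
sudo lvdisplay -v                # the logical volume that carries the btrfs filesystem
sudo lvm pvscan                  # which arrays are LVM physical volumes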
kaku Posted September 23, 2021 (edited) #3

OK @flyride, here you go.

cat /etc/fstab

root@ANAND-NAS:~# cat /etc/fstab
none /proc proc defaults 0 0
/dev/root / ext4 defaults 1 1
/dev/vg1000/lv /volume1 btrfs auto_reclaim_space,synoacl,noatime 0 0
/dev/md4 /volume2 btrfs auto_reclaim_space,synoacl,noatime 0 0

lvdisplay -v

root@ANAND-NAS:~# lvdisplay -v
Using logical volume(s) on command line.
--- Logical volume ---
LV Path /dev/vg1000/lv
LV Name lv
VG Name vg1000
LV UUID 2kxrUL-GlLw-QNWB-DKnN-0u3C-gMnj-uLNI8f
LV Write Access read/write
LV Creation host, time ,
LV Status available
# open 0
LV Size 5.45 TiB
Current LE 1428427
Segments 4
Allocation inherit
Read ahead sectors auto
- currently set to 512
Block device 253:0

vgdisplay -v

root@ANAND-NAS:~# vgdisplay -v
Using volume group(s) on command line.
--- Volume group ---
VG Name vg1000
System ID
Format lvm2
Metadata Areas 3
Metadata Sequence No 10
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 1
Open LV 0
Max PV 0
Cur PV 3
Act PV 3
VG Size 5.45 TiB
PE Size 4.00 MiB
Total PE 1428427
Alloc PE / Size 1428427 / 5.45 TiB
Free PE / Size 0 / 0
VG UUID Mke5hF-n7l1-wMpl-9Te4-Rvfq-Z21n-EQqcFk

--- Logical volume ---
LV Path /dev/vg1000/lv
LV Name lv
VG Name vg1000
LV UUID 2kxrUL-GlLw-QNWB-DKnN-0u3C-gMnj-uLNI8f
LV Write Access read/write
LV Creation host, time ,
LV Status available
# open 0
LV Size 5.45 TiB
Current LE 1428427
Segments 4
Allocation inherit
Read ahead sectors auto
- currently set to 512
Block device 253:0

--- Physical volumes ---
PV Name /dev/md2
PV UUID 2JE2jc-v1cv-cRNa-NmUw-u2qO-wtUp-Wfg5mj
PV Status allocatable
Total PE / Free PE 474575 / 0

PV Name /dev/md3
PV UUID R0kZZM-1Ao0-AkZF-lWQ4-3dJj-RZFl-FeOnCO
PV Status allocatable
Total PE / Free PE 476925 / 0

PV Name /dev/md5
PV UUID EKN4BC-gbee-9l0t-NIyi-eqaW-AG9D-Ok0g9j
PV Status allocatable
Total PE / Free PE 476927 / 0

I see that md5 may be involved too, so I am posting its stats below as well.

mdadm --detail /dev/md5

root@ANAND-NAS:/# mdadm --detail /dev/md5
/dev/md5:
Version : 1.2
Creation Time : Wed Mar 25 06:34:07 2020
Raid Level : raid1
Array Size : 1953494912 (1863.00 GiB 2000.38 GB)
Used Dev Size : 1953494912 (1863.00 GiB 2000.38 GB)
Raid Devices : 2
Total Devices : 1
Persistence : Superblock is persistent
Update Time : Thu Sep 23 11:18:37 2021
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
Name : ANAND-NAS:5 (local to host ANAND-NAS)
UUID : bb1662c4:8b29e668:ef84a145:ef29ee17
Events : 395326

Number Major Minor RaidDevice State
-      0     0     0          removed
1      8     39    1          active sync /dev/sdc7

mdadm --examine /dev/sd[abcdefklmnopqr]7 >>/tmp/raid.status

root@ANAND-NAS:/# mdadm --examine /dev/sd[abcdefklmnopqr]7 >>/tmp/raid.status
root@ANAND-NAS:/# cat /tmp/raid.status
/dev/sda7:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : bb1662c4:8b29e668:ef84a145:ef29ee17
Name : ANAND-NAS:5 (local to host ANAND-NAS)
Creation Time : Wed Mar 25 06:34:07 2020
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 3906989856 (1863.00 GiB 2000.38 GB)
Array Size : 1953494912 (1863.00 GiB 2000.38 GB)
Used Dev Size : 3906989824 (1863.00 GiB 2000.38 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
Unused Space : before=1968 sectors, after=32 sectors
State : clean
Device UUID : 93621687:1b547db6:710969ff:0dc48e9b
Update Time : Fri Aug 27 23:26:41 2021
Checksum : 2d7f219f - correct
Events : 388516
Device Role : Active device 0
Array State : AA ('A' == active, '.' == missing, 'R' == replacing)

/dev/sdc7:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : bb1662c4:8b29e668:ef84a145:ef29ee17
Name : ANAND-NAS:5 (local to host ANAND-NAS)
Creation Time : Wed Mar 25 06:34:07 2020
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 3906989856 (1863.00 GiB 2000.38 GB)
Array Size : 1953494912 (1863.00 GiB 2000.38 GB)
Used Dev Size : 3906989824 (1863.00 GiB 2000.38 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
Unused Space : before=1968 sectors, after=32 sectors
State : clean
Device UUID : 8d6d572c:96c53779:a01333cf:4d4322eb
Update Time : Thu Sep 23 11:18:37 2021
Checksum : b4fc2ff5 - correct
Events : 395326
Device Role : Active device 1
Array State : .A ('A' == active, '.' == missing, 'R' == replacing)

Again, thanks for helping.

Edited September 23, 2021 by kaku: some more info
kaku Posted September 23, 2021 #4

root@ANAND-NAS:~# vgdisplay
--- Volume group ---
VG Name vg1000
System ID
Format lvm2
Metadata Areas 3
Metadata Sequence No 10
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 1
Open LV 0
Max PV 0
Cur PV 3
Act PV 3
VG Size 5.45 TiB
PE Size 4.00 MiB
Total PE 1428427
Alloc PE / Size 1428427 / 5.45 TiB
Free PE / Size 0 / 0
VG UUID Mke5hF-n7l1-wMpl-9Te4-Rvfq-Z21n-EQqcFk

root@ANAND-NAS:~# lvm pvscan
PV /dev/md2   VG vg1000   lvm2 [1.81 TiB / 0    free]
PV /dev/md3   VG vg1000   lvm2 [1.82 TiB / 0    free]
PV /dev/md5   VG vg1000   lvm2 [1.82 TiB / 0    free]
Total: 3 [5.45 TiB] / in use: 3 [5.45 TiB] / in no VG: 0 [0 ]
flyride Posted September 23, 2021 #5

Ok, let's define a few things first:

1. Storage Pool - the suite of arrays that make up your storage system
2. Volume - the filesystem built upon the Storage Pool
3. Crashed (Storage Pool) - one or more arrays cannot be started, and/or the vg that binds them cannot be started
4. Crashed (Volume) - the volume cannot be mounted, for a variety of reasons
5. Crashed (Disk) - the disk probably has an unrecoverable bad sector which has resulted in array failure and possible data loss. The disk may still be operational.
6. Failing (Disk) - the drive is failing a SMART test. The disk may be (and probably is) still operational.

Your system has four drives. It looks like three of them participate in an SHR which spans three arrays (md2, md3, md5). Each array has its own redundancy, but all of them need to be operational in order to access data in the Storage Pool. One drive seems to be set up as a Basic volume all by itself (Disk 4, sdd, md4).

If this doesn't sound right to you, please explain what is incorrect. Otherwise, this is your disk layout. Please note that the rank order of your arrays, and of the members within the arrays, is not at all consistent. This is not a problem, but it can be confusing and a source of error if you are trying to correct array problems via the command line.

This means that arrays /dev/md2 and /dev/md3 are degraded because of missing /dev/sdb (Drive 2), and array /dev/md5 is degraded because of missing /dev/sda. This is a bit unusual (missing/stale array members that span multiple physical disks), and it also means that there is cross-linked loss of redundancy across the arrays. In simple terms, you have no redundancy, and ALL THREE DRIVES are required to provide access to your data.

On 9/18/2021 at 5:30 AM, kaku said:
  Also, I may have made one array read-only by following this post: DANGER : Raid crashed. Help me restore my data!

I'm not 100% sure what you actually did here. Did you stop one of the drives using the UI? If so, your Storage Pool should now show Crashed and your data should not be accessible. If this is the case, do a new cat /proc/mdstat and post the results.

If the Storage Pool still shows Degraded, then your data should be accessible. In that case, your #1 job is to copy all the data off of your NAS, because if ANYTHING goes wrong now, your data is lost. Don't try to repair anything until all your data is copied and safe. Then we can experiment with restoring redundancy and replacing drives.
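A quick way to see this cross-linked degradation at a glance is to ask each array which member it is missing; these are read-only queries (a sketch, using the array names above):

for md in md2 md3 md5; do
  echo "== /dev/$md =="
  sudo mdadm --detail /dev/$md | egrep 'State :|removed|active sync'
done

In this output, the "removed" slot for md2 and md3 corresponds to /dev/sdb (Drive 2), while for md5 it corresponds to /dev/sda (Drive 1).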
kaku Posted September 24, 2021 #6

10 hours ago, flyride said:
  One drive seems to be set up as a Basic volume all by itself (Disk 4, sdd, md4).

Correct. Drive 4 is just another storage pool, with volume2, no RAID, containing non-important data. We can ignore it.

10 hours ago, flyride said:
  I'm not 100% sure what you actually did here. Did you stop one of the drives using the UI?

I did mdadm --misc -o /dev/md2, from your comment here. I think I did it on md2. But does this carry over after a reboot? I didn't want to keep the NAS running, to avoid more damage to the disks.

14 hours ago, flyride said:
  you have no redundancy, and ALL THREE DRIVES are required to provide access to your data.

Well, volume1 was accessible in degraded mode even after Drive 2 was kicked out due to bad sectors. It's when Drive 1 "crashed" (it's a relatively new drive) that the volume crashed. There was also a loss of config for the shared folders. Then my brother installed the Photo Station app again, thinking it might have had some error (I know I shouldn't have given him this much power over the NAS, but he had physical access, so I thought it might be useful). That's the last thing we did before shutting it down for good.

14 hours ago, flyride said:
  If this is the case, do a new cat /proc/mdstat and post the results.

Current state: the storage pool is Degraded, Volume1 is Crashed. Data is not accessible.

cat /proc/mdstat

root@ANAND-NAS:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [raidF1]
md5 : active raid1 sdc7[1]
      1953494912 blocks super 1.2 [2/1] [_U]
md2 : active raid5 sdc5[3] sda5[5]
      1943862912 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/2] [_UU]
md3 : active raid5 sda6[5] sdc6[4]
      1953485824 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/2] [_UU]
md4 : active raid1 sdd3[0]
      971940544 blocks super 1.2 [1/1] [U]
md1 : active raid1 sda2[0] sdb2[1] sdc2[2] sdd2[3]
      2097088 blocks [12/4] [UUUU________]
md0 : active raid1 sdb1[2] sdc1[1] sdd1[3]
      2490176 blocks [12/3] [_UUU________]
unused devices: <none>
flyride Posted September 24, 2021 #7

Ok, we know your Storage Pool is functional but degraded. Again, degraded = loss of redundancy, not loss of your data, as long as we don't do something to break it further.

Your btrfs filesystem has some corruption that appears to be causing it not to mount. btrfs attempts to heal itself in real time, and when it cannot, it throws errors or refuses to mount. Your goal is to get it to mount read-only through a combination of discovery and leveraging the redundancy in btrfs.

btrfs is not like ext4 and fsck. You should not expect to "fix" the problem. If you can make your data accessible through command-line directives, then COPY everything off, delete the volume entirely, and recreate it.

I'm no expert at ferreting out btrfs problems, but you might find some of the strategies in this thread helpful (starting with the linked post):

https://xpenology.com/forum/topic/14337-volume-crash-after-4-months-of-stability/?do=findComment&comment=107979

Post back here if you aren't sure what to do.
kaku Posted September 24, 2021 #8

OK, I will check it out.
kaku Posted September 24, 2021 (edited) #9

Thanks for sticking with me!!

Oh no. I found that /volume1 appears mounted, but with "blank" data. Has it been overwritten? On the other hand, I could not find a current mount for volume1 in cat /etc/mtab (attached).

root@ANAND-NAS:/volume1$ ls -lr
total 32
-rw-rw---- 1 system log 19621 Aug 28 22:08 @SynoFinder-log
drwxr-xr-x 3 root root 4096 Aug 28 22:07 @SynoFinder-LocalShadow
drwxr-xr-x 3 root root 4096 Aug 28 22:07 Plex
drwxrwxrwx 14 root root 4096 Aug 28 22:07 @eaDir

So, am I still good to go? I tried the following and got the same error each time:

sudo mount /dev/vg1000/lv /volume1
sudo mount -o clear_cache /dev/vg1000/lv /volume1
sudo mount -o recovery /dev/vg1000/lv /volume1
sudo mount -o recovery,ro /dev/vg1000/lv /volume1

mount: wrong fs type, bad option, bad superblock on /dev/vg1000/lv,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try dmesg | tail or so.

dmesg | tail

root@ANAND-NAS:~# dmesg | tail
[35831.857718] parent transid verify failed on 88007344128 wanted 7582040 found 7582038
[35831.857733] md2: syno_self_heal_is_valid_md_stat(496): md's current state is not suitable for data correction
[35831.867644] md2: syno_self_heal_is_valid_md_stat(496): md's current state is not suitable for data correction
[35831.877616] parent transid verify failed on 88007344128 wanted 7582040 found 7582038
[35831.877641] parent transid verify failed on 88007344128 wanted 7582040 found 7582038
[35831.877643] BTRFS error (device dm-0): BTRFS: dm-0 failed to repair parent transid verify failure on 88007344128, mirror = 2
[35831.890418] parent transid verify failed on 88007344128 wanted 7582040 found 7582041
[35831.890476] BTRFS: Failed to read block groups: -5
[35831.915227] BTRFS: open_ctree failed

Next I tried:

root@ANAND-NAS:~# sudo btrfs rescue super /dev/vg1000/lv
All supers are valid, no need to recover

Then sudo btrfs-find-root /dev/vg1000/lv. The dump is big, so I am showing the first few lines and attaching the full dump:

root@ANAND-NAS:~# sudo btrfs-find-root /dev/vg1000/lv
parent transid verify failed on 88007344128 wanted 7582040 found 7582041
parent transid verify failed on 88007344128 wanted 7582040 found 7582041
parent transid verify failed on 88007344128 wanted 7582040 found 7582038
parent transid verify failed on 88007344128 wanted 7582040 found 7582041
Ignoring transid failure
incorrect offsets 15625 135
Superblock thinks the generation is 7582040
Superblock thinks the level is 1
Found tree root at 88006344704 gen 7582040 level 1
Well block 87994417152(gen: 7582033 level: 0) seems good, but generation/level doesn't match, want gen: 7582040 level: 1
Well block 87992172544(gen: 7582032 level: 0) seems good, but generation/level doesn't match, want gen: 7582040 level: 1
Well block 87981391872(gen: 7582029 level: 0) seems good, but generation/level doesn't match, want gen: 7582040 level: 1
Well block 87975280640(gen: 7582028 level: 1) seems good, but generation/level doesn't match, want gen: 7582040 level: 1
Well block 87968268288(gen: 7582024 level: 1) seems good, but generation/level doesn't match, want gen: 7582040 level: 1
Well block 87966810112(gen: 7582020 level: 0) seems good, but generation/level doesn't match, want gen: 7582040 level: 1

Finally, sudo btrfs insp dump-s -f /dev/vg1000/lv:

root@ANAND-NAS:~# sudo btrfs insp dump-s -f /dev/vg1000/lv
superblock: bytenr=65536, device=/dev/vg1000/lv
---------------------------------------------------------
csum 0x812b0015 [match]
bytenr 65536
flags 0x1 ( WRITTEN )
magic _BHRfS_M [match]
fsid 8ae2592d-0773-4a58-86cd-9be492d7cabe
label 2017.02.11-14:42:46 v8451
generation 7582040
root 88006344704
sys_array_size 226
chunk_root_generation 6970521
root_level 1
chunk_root 21037056
chunk_root_level 1
log_root 88008196096
log_root_transid 0
log_root_level 0
total_bytes 5991257079808
bytes_used 3380737892352
sectorsize 4096
nodesize 16384
leafsize 16384
stripesize 4096
root_dir 6
num_devices 1
compat_flags 0x0
compat_ro_flags 0x0
incompat_flags 0x16b ( MIXED_BACKREF | DEFAULT_SUBVOL | COMPRESS_LZO | BIG_METADATA | EXTENDED_IREF | SKINNY_METADATA )
csum_type 0
csum_size 4
cache_generation 312768
uuid_tree_generation 7582040
dev_item.uuid 4e64f8b3-55a5-4625-82b1-926c902a62e0
dev_item.fsid 8ae2592d-0773-4a58-86cd-9be492d7cabe [match]
dev_item.type 0
dev_item.total_bytes 5991257079808
dev_item.bytes_used 3528353382400
dev_item.io_align 4096
dev_item.io_width 4096
dev_item.sector_size 4096
dev_item.devid 1
dev_item.dev_group 0
dev_item.seek_speed 0
dev_item.bandwidth 0
dev_item.generation 0
sys_chunk_array[2048]:
  item 0 key (FIRST_CHUNK_TREE CHUNK_ITEM 0)
    chunk length 4194304 owner 2 stripe_len 65536
    type SYSTEM num_stripes 1
    stripe 0 devid 1 offset 0
    dev uuid: 4e64f8b3-55a5-4625-82b1-926c902a62e0
  item 1 key (FIRST_CHUNK_TREE CHUNK_ITEM 20971520)
    chunk length 8388608 owner 2 stripe_len 65536
    type SYSTEM|DUP num_stripes 2
    stripe 0 devid 1 offset 20971520
    dev uuid: 4e64f8b3-55a5-4625-82b1-926c902a62e0
    stripe 1 devid 1 offset 29360128
    dev uuid: 4e64f8b3-55a5-4625-82b1-926c902a62e0
backup_roots[4]:
  backup 0:
    backup_tree_root: 88006344704 gen: 7582040 level: 1
    backup_chunk_root: 21037056 gen: 6970521 level: 1
    backup_extent_root: 88004870144 gen: 7582040 level: 2
    backup_fs_root: 29556736 gen: 6 level: 0
    backup_dev_root: 88006639616 gen: 7582040 level: 1
    backup_csum_root: 88007917568 gen: 7582041 level: 3
    backup_total_bytes: 5991257079808
    backup_bytes_used: 3380737892352
    backup_num_devices: 1
  backup 1:
    backup_tree_root: 88004640768 gen: 7582037 level: 1
    backup_chunk_root: 21037056 gen: 6970521 level: 1
    backup_extent_root: 88002723840 gen: 7582037 level: 2
    backup_fs_root: 29556736 gen: 6 level: 0
    backup_dev_root: 29802496 gen: 7581039 level: 1
    backup_csum_root: 87998988288 gen: 7582038 level: 3
    backup_total_bytes: 5991257079808
    backup_bytes_used: 3380737863680
    backup_num_devices: 1
  backup 2:
    backup_tree_root: 88009687040 gen: 7582038 level: 1
    backup_chunk_root: 21037056 gen: 6970521 level: 1
    backup_extent_root: 88004018176 gen: 7582039 level: 2
    backup_fs_root: 29556736 gen: 6 level: 0
    backup_dev_root: 29802496 gen: 7581039 level: 1
    backup_csum_root: 88001593344 gen: 7582039 level: 3
    backup_total_bytes: 5991257079808
    backup_bytes_used: 3380737896448
    backup_num_devices: 1
  backup 3:
    backup_tree_root: 88013193216 gen: 7582039 level: 1
    backup_chunk_root: 21037056 gen: 6970521 level: 1
    backup_extent_root: 88004018176 gen: 7582039 level: 2
    backup_fs_root: 29556736 gen: 6 level: 0
    backup_dev_root: 29802496 gen: 7581039 level: 1
    backup_csum_root: 88001593344 gen: 7582039 level: 3
    backup_total_bytes: 5991257079808
    backup_bytes_used: 3380737949696
    backup_num_devices: 1

If I want to restore to a new HDD, should its size be the total volume/storage pool size (6 TB), or the actual used volume space (I think it was less than 4 TB)? I checked the other post and got my answer, but how can I know how much data was ACTUALLY present? Can it be checked from the dumps above?

Next I want to do:

sudo btrfs check --init-extent-tree /dev/vg1000/lv
sudo btrfs check --init-csum-tree /dev/vg1000/lv
sudo btrfs check --repair /dev/vg1000/lv

Holding off for now. Any suggestions?

btrfs-find-root.txt
mtab.txt

Edited September 24, 2021 by kaku: found the answer in the other post
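On the "how much data was actually present" question, the dump-s output above appears to answer it: bytes_used 3380737892352 is the filesystem's count of allocated data plus metadata, while total_bytes 5991257079808 (about 5.45 TiB) is just the size of the whole filesystem. A quick check of the arithmetic, runnable in any shell that has bc:

echo 'scale=2; 3380737892352 / 1024^4' | bc   # about 3.07 TiB
echo 'scale=2; 3380737892352 / 1000^4' | bc   # about 3.38 TB

So, assuming that reading of bytes_used is right, a 4 TB destination disk should be enough to hold a full copy-out.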
flyride Posted September 24, 2021 #10

8 minutes ago, kaku said:
  Oh no. I found that /volume1 appears mounted, but with "blank" data. Has it been overwritten?

I think perhaps you mean that you see a /volume1 folder. If the volume is not mounted, that folder will be blank. Check which volumes have been mounted with df -v at the command line. You should see a /volume1 entry connected to your /dev/vg1000/lv device if it is mounted successfully.

You can also run sudo mount from the command line, and any unmounted but valid volumes will be mounted (or you will receive an error message as to why they cannot be).
flyride Posted September 24, 2021 #11

Obviously we need to prove that you do not have a filesystem mounted on /volume1, per the previous post. (Incidentally, it looks like you have some jobs, i.e. Plex, that have written to /volume1; in the absence of the mounted filesystem, those files landed in the root filesystem. You should stop Plex from running while this is going on.)

I would see if you can do a restore (copy out) using the alternate tree roots before you try to make corrections to the filesystem.

http://www.infotinks.com/btrfs-restoring-a-corrupt-filesystem-from-another-tree-location/

If nothing works, I would go ahead with at least the first two check options.
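In practice that copy-out is usually done with btrfs restore pointed at one of the candidate tree roots that btrfs-find-root printed earlier. A sketch (flag names per recent btrfs-progs, so the DSM build may differ slightly; /mnt/recovery is a hypothetical mount point on a separate disk with enough free space, and 87994417152 is just one of the candidate blocks from the earlier output):

# dry run first: -D lists what would be restored without writing anything
sudo btrfs restore -D -i -v -t 87994417152 /dev/vg1000/lv /mnt/recovery
# if the listing looks like the expected folder structure, run it again without -D
sudo btrfs restore -i -v -t 87994417152 /dev/vg1000/lv /mnt/recovery

-i ignores per-file errors instead of aborting, and -t selects the alternate tree root; trying several of the "Well block ..." candidates, newest generation first, is the usual approach.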
kaku Posted September 24, 2021 #12

29 minutes ago, flyride said:
  Check which volumes have been mounted with df -v at the command line.

I checked: volume1 is not mounted, so /volume1 is just a folder.

19 minutes ago, flyride said:
  it looks like you have some jobs, i.e. Plex, that have written to /volume1

All packages are in "error" mode, so I think those were old crons which ran at the time (the folder doesn't contain anything).

34 minutes ago, flyride said:
  I would see if you can do a restore (copy out) using the alternate tree roots before you try to make corrections to the filesystem.
  http://www.infotinks.com/btrfs-restoring-a-corrupt-filesystem-from-another-tree-location/

I will try this tomorrow and post back.
kaku Posted September 25, 2021 #13

A roadblock, and a WARNING about a stupid thing I did.

13 hours ago, flyride said:
  I would see if you can do a restore (copy out) using the alternate tree roots before you try to make corrections to the filesystem.
  http://www.infotinks.com/btrfs-restoring-a-corrupt-filesystem-from-another-tree-location/

I ran the script posted there to find the best tree root to use, and it did show my folder structure intact! But the script ran too long. I checked why, and sure enough, it was filling the /root dir with gigabytes of output! I did not realize it would fill /root so fast; I should have run it from a USB stick. And now I cannot log in to DSM or SSH because storage is full!!

  You cannot login to the system because the disk space is full currently. Please restart the system and try again.

Is there a GRUB/bootloader script I can run to clear the /root folder at startup? Especially the 333* files generated by the script there.
Orphée Posted September 25, 2021 #14

Did you try to log in with SSH (PuTTY) and be patient? You may succeed in logging in if it does not time out.
kaku Posted September 25, 2021 (edited) #15

@Orphée I did. It says access denied on SSH, and the error above on DSM.

The closest thing I can find to my issue is this old post. Reinstalling DSM is out of the question for me (because of the recovery). I have dug a pretty deep hole for myself. 😕

Edited September 25, 2021 by kaku
kaku Posted September 25, 2021 #16

OK, another old post I found which could get me into a root shell is here.

  Restart
  Press the ESC key at the grub prompt
  Press e to enter modification mode
  Select the line of the starting kernel and press e
  On the last line, enter rw init=/bin/bash
  Press enter, then b to restart the computer
  At this point the computer will drop into a root shell without a password

I am not able to get this to work in grub.cfg. Should I just add the options to the line in the cfg like so, or do it the way mentioned above?

  linux $img/$zImage $common_args $bootdev_args $extra_args $@ rw init=/bin/bash
flyride Posted September 25, 2021 #17

What does it say when access is denied? You should be able to log in over SSH if it was configured before.

The alternative is to set up a Linux boot environment, then manually start /dev/md0 and mount the partition temporarily.
kaku Posted September 26, 2021 #18

It just says access denied, on SSH. Now DSM is also not showing.

I booted into an Ubuntu live environment. I can see md2, md3, md4, md5 and vg1000, the same as in DSM (mdadm -Asf && vgchange -ay). I think I can do the tree-root work from here too. But I cannot see md0 to fix /root. How can I rebuild md0?
flyride Posted September 26, 2021 #19

root is from md0. Did you try mdadm --scan --assemble? Try not to modify anything other than /dev/md0 from Linux.
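From the Ubuntu live environment, that cleanup could look roughly like this (a sketch: /mnt/dsmroot is a hypothetical mount point, and 333* stands for whatever files the earlier script actually left in /root):

sudo mdadm --assemble --scan        # assembles md0 (the DSM system partition) along with the data arrays
cat /proc/mdstat                    # confirm md0 is now listed
sudo mkdir -p /mnt/dsmroot
sudo mount /dev/md0 /mnt/dsmroot    # DSM's root filesystem, which contains /root
sudo rm -v /mnt/dsmroot/root/333*   # remove only the files that filled the space
df -h /mnt/dsmroot                  # verify free space has returned
sudo umount /mnt/dsmroot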
kaku Posted October 1, 2021 #20

OK. Before I did that, I saw that DSM had stopped loading (a "no HDD connected" error on the web page), and it was also on a different local IP than the one I had specified. The arrays are shown correctly in the live Ubuntu environment, though.

I assembled md0 as you suggested and cleared /root, but the DSM issue is still there. I think it is best I do the remaining work in Ubuntu, as mdstat shows exactly the same thing it showed in DSM. Is there any other way to boot into DSM? SSH now shows connection refused.
kaku Posted October 1, 2021 #21

OK, found the issue: somehow the grub entry got changed from "Loadlinux 3617 usb" to "Loadlinux 3617 sata". Now I am able to boot. I will report back with what I find.
kaku Posted October 2, 2021 #22

On 9/24/2021 at 10:01 PM, flyride said:
  I would see if you can do a restore (copy out) using the alternate tree roots before you try to make corrections to the filesystem.

UPDATE: THANK YOU @flyride!!! I AM SAVED.

I have copied my data out to another HDD!!!!! I made another storage pool and volume and copied the data off to it: 3 TB. I think the data is good. How can I check which files were corrupted or not fully restored?

On 9/24/2021 at 9:39 PM, kaku said:
  sudo btrfs check --init-extent-tree /dev/vg1000/lv
  sudo btrfs check --init-csum-tree /dev/vg1000/lv
  sudo btrfs check --repair /dev/vg1000/lv

Now that the data is back, should I try repairing the original volume?
flyride Posted October 2, 2021 #23

7 hours ago, kaku said:
  I have copied my data out to another HDD!!!!! I made another storage pool and volume and copied the data off to it: 3 TB. I think the data is good. How can I check which files were corrupted or not fully restored?

Did you have the btrfs checksum feature "on" for the affected volume? If so, btrfs would have told you via a pop-up in the UI if there was data corruption. Normally it would also fix it, but you have no redundancy. Without the checksum, there is no way to determine corruption. If files were missing for some reason, that is also not really detectable. A good reason to have a real backup somewhere...

7 hours ago, kaku said:
  Now that the data is back, should I try repairing the original volume?

I say this often: btrfs will fix itself if it can. If it cannot, there is underlying corruption that MAY be fixable via filesystem correction tools, but that probably won't fully address the problem. Linux culture holds that ext4/fsck can fix ANYTHING and that you should always have confidence in the filesystem afterward, but that just isn't true with btrfs.

I strongly recommend you delete the volume and rebuild it from scratch. Since you also have a Storage Pool problem, there is consequently no reason not to delete that as well, replace any drives that are actually faulty, and re-create a clean Storage Pool too. Then copy your files back in from your backup.

Glad this worked out, probably as well as it could have for you given the difficult intermediate steps.
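One thing that can still be verified is that the copy itself landed intact on the new volume: a btrfs scrub reads everything back and checks it against whatever checksums btrfs holds for it. It cannot say anything about corruption that happened before the copy was made. A sketch, assuming the new volume is mounted at /volume3 (hypothetical path; adjust to the real mount point):

sudo btrfs scrub start -B -d /volume3   # -B stays in the foreground, -d prints per-device statistics
sudo btrfs scrub status /volume3        # summary, including any checksum or read errors found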