XPEnology Community

Help! Volume Crashed. SHR btrfs



This is a catastrophe for me. I know a lot of it is my fault, but please help out if you can.

My setup is a 3-HDD SHR (as shown in the attached pictures). I am only concerned with Volume1 (Drives 1, 2, 3). Drive 2 (2TB) was failing with bad sectors. I could not get physical access, so I told someone who did to turn off the NAS. Apparently he did not! Before the drive could be replaced, one morning Drive 1 (4TB) crashed. This happened within a week or two. After that I shut the NAS down for good until I could recover the data. Today is that day... hopefully.
I am posting some basic data below. If someone could help me out, please do.

 

I am confused between md2 (/dev/sd*5) and md3 (/dev/sd*6)! So I will be posting data for both. I feel md3 is the current one.

 

 

# cat /proc/mdstat

root@ANAND-NAS:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [raidF1]
md3 : active raid5 sda6[5] sdc6[4]
      1953485824 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/2] [_UU]

md2 : active raid5 sdc5[3] sda5[5]
      1943862912 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/2] [_UU]

md5 : active raid1 sdc7[1]
      1953494912 blocks super 1.2 [2/1] [_U]

md4 : active raid1 sdd3[0]
      971940544 blocks super 1.2 [1/1] [U]

md1 : active raid1 sdd2[3] sdc2[2] sdb2[1] sda2[0]
      2097088 blocks [12/4] [UUUU________]

md0 : active raid1 sdb1[2] sdc1[1] sdd1[3]
      2490176 blocks [12/3] [_UUU________]

unused devices: <none>

# ls /dev/sd* /dev/md* /dev/vg*

root@ANAND-NAS:~# ls /dev/sd*
/dev/sda   /dev/sda5  /dev/sdb   /dev/sdb3  /dev/sdc   /dev/sdc5  /dev/sdd   /dev/sdd3
/dev/sda1  /dev/sda6  /dev/sdb1  /dev/sdb5  /dev/sdc1  /dev/sdc6  /dev/sdd1
/dev/sda2  /dev/sda7  /dev/sdb2  /dev/sdb6  /dev/sdc2  /dev/sdc7  /dev/sdd2

root@ANAND-NAS:~# ls /dev/md*
/dev/md0  /dev/md1  /dev/md2  /dev/md3  /dev/md4  /dev/md5

root@ANAND-NAS:~# ls /dev/vg*
/dev/vga_arbiter
/dev/vg1000:
lv

 

# mdadm --detail /dev/md2

root@ANAND-NAS:~# mdadm --detail /dev/md2
/dev/md2:
        Version : 1.2
  Creation Time : Sat Feb 11 20:12:44 2017
     Raid Level : raid5
     Array Size : 1943862912 (1853.81 GiB 1990.52 GB)
  Used Dev Size : 971931456 (926.91 GiB 995.26 GB)
   Raid Devices : 3
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Sat Sep 18 16:48:41 2021
          State : clean, degraded
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           Name : ANAND-NAS:2  (local to host ANAND-NAS)
           UUID : e0f9ada5:5340d9a9:4d65da22:4a8309ef
         Events : 3405974

    Number   Major   Minor   RaidDevice State
       -       0        0        0      removed
       3       8       37        1      active sync   /dev/sdc5
       5       8        5        2      active sync   /dev/sda5

 

# mdadm --examine /dev/sd[abcde]5 | egrep 'Event|/dev/sd'

root@ANAND-NAS:~# mdadm --examine /dev/sd[abcde]5 | egrep 'Event|/dev/sd'
/dev/sda5:
         Events : 3405974
/dev/sdb5:
         Events : 1984014
/dev/sdc5:
         Events : 3405974

# mdadm --examine /dev/sd[abcdefklmnopqr]5 >>/tmp/raid.status

root@ANAND-NAS:~# mdadm --examine /dev/sd[abcdefklmnopqr]5 >>/tmp/raid.status
root@ANAND-NAS:~# cat /tmp/raid.status
/dev/sdb5:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : e0f9ada5:5340d9a9:4d65da22:4a8309ef
           Name : ANAND-NAS:2  (local to host ANAND-NAS)
  Creation Time : Sat Feb 11 20:12:44 2017
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 1943862912 (926.91 GiB 995.26 GB)
     Array Size : 1943862912 (1853.81 GiB 1990.52 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=0 sectors
          State : clean
    Device UUID : 94883b4f:ce8c63c1:892ec4a8:53428795

    Update Time : Tue Jul 20 20:32:58 2021
       Checksum : 89f7077e - correct
         Events : 1984014

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 0
   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc5:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : e0f9ada5:5340d9a9:4d65da22:4a8309ef
           Name : ANAND-NAS:2  (local to host ANAND-NAS)
  Creation Time : Sat Feb 11 20:12:44 2017
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 1943862912 (926.91 GiB 995.26 GB)
     Array Size : 1943862912 (1853.81 GiB 1990.52 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=0 sectors
          State : clean
    Device UUID : 88873b1c:649dc11d:5bb0f405:ee1826c7

    Update Time : Sat Sep 18 16:48:41 2021
       Checksum : 428a02b8 - correct
         Events : 3405974

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 1
   Array State : .AA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sda5:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : e0f9ada5:5340d9a9:4d65da22:4a8309ef
           Name : ANAND-NAS:2  (local to host ANAND-NAS)
  Creation Time : Sat Feb 11 20:12:44 2017
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 1943862912 (926.91 GiB 995.26 GB)
     Array Size : 1943862912 (1853.81 GiB 1990.52 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=0 sectors
          State : clean
    Device UUID : d9d33dec:f7fcb455:96363412:16b96f43

    Update Time : Sat Sep 18 16:48:41 2021
       Checksum : d308d501 - correct
         Events : 3405974

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 2
   Array State : .AA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb5:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : e0f9ada5:5340d9a9:4d65da22:4a8309ef
           Name : ANAND-NAS:2  (local to host ANAND-NAS)
  Creation Time : Sat Feb 11 20:12:44 2017
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 1943862912 (926.91 GiB 995.26 GB)
     Array Size : 1943862912 (1853.81 GiB 1990.52 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=0 sectors
          State : clean
    Device UUID : 94883b4f:ce8c63c1:892ec4a8:53428795

    Update Time : Tue Jul 20 20:32:58 2021
       Checksum : 89f7077e - correct
         Events : 1984014

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 0
   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc5:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : e0f9ada5:5340d9a9:4d65da22:4a8309ef
           Name : ANAND-NAS:2  (local to host ANAND-NAS)
  Creation Time : Sat Feb 11 20:12:44 2017
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 1943862912 (926.91 GiB 995.26 GB)
     Array Size : 1943862912 (1853.81 GiB 1990.52 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=0 sectors
          State : clean
    Device UUID : 88873b1c:649dc11d:5bb0f405:ee1826c7

    Update Time : Sat Sep 18 16:48:41 2021
       Checksum : 428a02b8 - correct
         Events : 3405974

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 1
   Array State : .AA ('A' == active, '.' == missing, 'R' == replacing)

 

 

 

# mdadm --detail /dev/md3

root@ANAND-NAS:~# mdadm --detail /dev/md3
/dev/md3:
        Version : 1.2
  Creation Time : Thu Mar  1 15:12:43 2018
     Raid Level : raid5
     Array Size : 1953485824 (1862.99 GiB 2000.37 GB)
  Used Dev Size : 976742912 (931.49 GiB 1000.18 GB)
   Raid Devices : 3
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Sat Sep 18 16:48:41 2021
          State : clean, degraded
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           Name : ANAND-NAS:3  (local to host ANAND-NAS)
           UUID : 16a6ee01:9b92f0fd:bc15ec91:deec644a
         Events : 386242

    Number   Major   Minor   RaidDevice State
       -       0        0        0      removed
       5       8        6        1      active sync   /dev/sda6
       4       8       38        2      active sync   /dev/sdc6

# mdadm --examine /dev/sd[abcde]6 | egrep 'Event|/dev/sd'

root@ANAND-NAS:~# mdadm --examine /dev/sd[abcde]6 | egrep 'Event|/dev/sd'
/dev/sda6:
         Events : 386242
/dev/sdb6:
         Events : 361776
/dev/sdc6:
         Events : 386242

# mdadm --examine /dev/sd[abcdefklmnopqr]6 >>/tmp/raid.status

root@ANAND-NAS:~# mdadm --examine /dev/sd[abcdefklmnopqr]6 >>/tmp/raid.status
root@ANAND-NAS:~# cat /tmp/raid.status
/dev/sdb5:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : e0f9ada5:5340d9a9:4d65da22:4a8309ef
           Name : ANAND-NAS:2  (local to host ANAND-NAS)
  Creation Time : Sat Feb 11 20:12:44 2017
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 1943862912 (926.91 GiB 995.26 GB)
     Array Size : 1943862912 (1853.81 GiB 1990.52 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=0 sectors
          State : clean
    Device UUID : 94883b4f:ce8c63c1:892ec4a8:53428795

    Update Time : Tue Jul 20 20:32:58 2021
       Checksum : 89f7077e - correct
         Events : 1984014

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 0
   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc5:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : e0f9ada5:5340d9a9:4d65da22:4a8309ef
           Name : ANAND-NAS:2  (local to host ANAND-NAS)
  Creation Time : Sat Feb 11 20:12:44 2017
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 1943862912 (926.91 GiB 995.26 GB)
     Array Size : 1943862912 (1853.81 GiB 1990.52 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=0 sectors
          State : clean
    Device UUID : 88873b1c:649dc11d:5bb0f405:ee1826c7

    Update Time : Sat Sep 18 16:48:41 2021
       Checksum : 428a02b8 - correct
         Events : 3405974

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 1
   Array State : .AA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sda5:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : e0f9ada5:5340d9a9:4d65da22:4a8309ef
           Name : ANAND-NAS:2  (local to host ANAND-NAS)
  Creation Time : Sat Feb 11 20:12:44 2017
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 1943862912 (926.91 GiB 995.26 GB)
     Array Size : 1943862912 (1853.81 GiB 1990.52 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=0 sectors
          State : clean
    Device UUID : d9d33dec:f7fcb455:96363412:16b96f43

    Update Time : Sat Sep 18 16:48:41 2021
       Checksum : d308d501 - correct
         Events : 3405974

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 2
   Array State : .AA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb5:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : e0f9ada5:5340d9a9:4d65da22:4a8309ef
           Name : ANAND-NAS:2  (local to host ANAND-NAS)
  Creation Time : Sat Feb 11 20:12:44 2017
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 1943862912 (926.91 GiB 995.26 GB)
     Array Size : 1943862912 (1853.81 GiB 1990.52 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=0 sectors
          State : clean
    Device UUID : 94883b4f:ce8c63c1:892ec4a8:53428795

    Update Time : Tue Jul 20 20:32:58 2021
       Checksum : 89f7077e - correct
         Events : 1984014

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 0
   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc5:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : e0f9ada5:5340d9a9:4d65da22:4a8309ef
           Name : ANAND-NAS:2  (local to host ANAND-NAS)
  Creation Time : Sat Feb 11 20:12:44 2017
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 1943862912 (926.91 GiB 995.26 GB)
     Array Size : 1943862912 (1853.81 GiB 1990.52 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=0 sectors
          State : clean
    Device UUID : 88873b1c:649dc11d:5bb0f405:ee1826c7

    Update Time : Sat Sep 18 16:48:41 2021
       Checksum : 428a02b8 - correct
         Events : 3405974

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 1
   Array State : .AA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sda6:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 16a6ee01:9b92f0fd:bc15ec91:deec644a
           Name : ANAND-NAS:3  (local to host ANAND-NAS)
  Creation Time : Thu Mar  1 15:12:43 2018
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 1953485856 (931.49 GiB 1000.18 GB)
     Array Size : 1953485824 (1862.99 GiB 2000.37 GB)
  Used Dev Size : 1953485824 (931.49 GiB 1000.18 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=32 sectors
          State : clean
    Device UUID : ace310f3:16f9f59b:6760b452:ab0266f0

    Update Time : Sat Sep 18 16:48:41 2021
       Checksum : 74926764 - correct
         Events : 386242

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 1
   Array State : .AA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb6:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 16a6ee01:9b92f0fd:bc15ec91:deec644a
           Name : ANAND-NAS:3  (local to host ANAND-NAS)
  Creation Time : Thu Mar  1 15:12:43 2018
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 1953485856 (931.49 GiB 1000.18 GB)
     Array Size : 1953485824 (1862.99 GiB 2000.37 GB)
  Used Dev Size : 1953485824 (931.49 GiB 1000.18 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=32 sectors
          State : clean
    Device UUID : fca446c7:fa907b0b:96b8782b:1804d140

    Update Time : Mon Aug  9 21:28:23 2021
       Checksum : e14a402b - correct
         Events : 361776

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 0
   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc6:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 16a6ee01:9b92f0fd:bc15ec91:deec644a
           Name : ANAND-NAS:3  (local to host ANAND-NAS)
  Creation Time : Thu Mar  1 15:12:43 2018
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 1953485856 (931.49 GiB 1000.18 GB)
     Array Size : 1953485824 (1862.99 GiB 2000.37 GB)
  Used Dev Size : 1953485824 (931.49 GiB 1000.18 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=32 sectors
          State : clean
    Device UUID : ff8b56d3:fe921dbb:29ae5f41:785222d3

    Update Time : Sat Sep 18 16:48:41 2021
       Checksum : 4567472d - correct
         Events : 386242

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 2
   Array State : .AA ('A' == active, '.' == missing, 'R' == replacing)

 

Also, I may have made one of the arrays read-only by following this post - DANGER: Raid crashed. Help me restore my data!

I feel md3 is the current one, due to its creation date.

 

(attachments: storage.PNG, hdds.PNG)


 

On 9/18/2021 at 5:30 AM, kaku said:

I am confused between md2 (/dev/sd*5) and md3 (/dev/sd*6)! So I will be posting data for both. I feel md3 is the current one.

 

Yes, this is a problem.  You have an SHR, which is composed of multiple arrays bound together.  So all the arrays are important; there is no single array that is the "current" one.

 

Follow this post specifically and post the discovery of your SHR arrays, then we can build an array map that helps us visualize what is going on.

https://xpenology.com/forum/topic/14337-volume-crash-after-4-months-of-stability/?do=findComment&comment=107971


OK @flyride, here you go.

 


cat /etc/fstab

root@ANAND-NAS:~#  cat /etc/fstab
none /proc proc defaults 0 0
/dev/root / ext4 defaults 1 1
/dev/vg1000/lv /volume1 btrfs auto_reclaim_space,synoacl,noatime 0 0
/dev/md4 /volume2 btrfs auto_reclaim_space,synoacl,noatime 0 0

lvdisplay -v

root@ANAND-NAS:~# lvdisplay -v
    Using logical volume(s) on command line.
  --- Logical volume ---
  LV Path                /dev/vg1000/lv
  LV Name                lv
  VG Name                vg1000
  LV UUID                2kxrUL-GlLw-QNWB-DKnN-0u3C-gMnj-uLNI8f
  LV Write Access        read/write
  LV Creation host, time ,
  LV Status              available
  # open                 0
  LV Size                5.45 TiB
  Current LE             1428427
  Segments               4
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     512
  Block device           253:0

vgdisplay -v

root@ANAND-NAS:~# vgdisplay -v
    Using volume group(s) on command line.
  --- Volume group ---
  VG Name               vg1000
  System ID
  Format                lvm2
  Metadata Areas        3
  Metadata Sequence No  10
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               0
  Max PV                0
  Cur PV                3
  Act PV                3
  VG Size               5.45 TiB
  PE Size               4.00 MiB
  Total PE              1428427
  Alloc PE / Size       1428427 / 5.45 TiB
  Free  PE / Size       0 / 0
  VG UUID               Mke5hF-n7l1-wMpl-9Te4-Rvfq-Z21n-EQqcFk

  --- Logical volume ---
  LV Path                /dev/vg1000/lv
  LV Name                lv
  VG Name                vg1000
  LV UUID                2kxrUL-GlLw-QNWB-DKnN-0u3C-gMnj-uLNI8f
  LV Write Access        read/write
  LV Creation host, time ,
  LV Status              available
  # open                 0
  LV Size                5.45 TiB
  Current LE             1428427
  Segments               4
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     512
  Block device           253:0

  --- Physical volumes ---
  PV Name               /dev/md2
  PV UUID               2JE2jc-v1cv-cRNa-NmUw-u2qO-wtUp-Wfg5mj
  PV Status             allocatable
  Total PE / Free PE    474575 / 0

  PV Name               /dev/md3
  PV UUID               R0kZZM-1Ao0-AkZF-lWQ4-3dJj-RZFl-FeOnCO
  PV Status             allocatable
  Total PE / Free PE    476925 / 0

  PV Name               /dev/md5
  PV UUID               EKN4BC-gbee-9l0t-NIyi-eqaW-AG9D-Ok0g9j
  PV Status             allocatable
  Total PE / Free PE    476927 / 0

 

I see that md5 may be involved too, so I'm posting its stats below as well.

mdadm --detail /dev/md5

root@ANAND-NAS:/# mdadm --detail /dev/md5
/dev/md5:
        Version : 1.2
  Creation Time : Wed Mar 25 06:34:07 2020
     Raid Level : raid1
     Array Size : 1953494912 (1863.00 GiB 2000.38 GB)
  Used Dev Size : 1953494912 (1863.00 GiB 2000.38 GB)
   Raid Devices : 2
  Total Devices : 1
    Persistence : Superblock is persistent

    Update Time : Thu Sep 23 11:18:37 2021
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           Name : ANAND-NAS:5  (local to host ANAND-NAS)
           UUID : bb1662c4:8b29e668:ef84a145:ef29ee17
         Events : 395326

    Number   Major   Minor   RaidDevice State
       -       0        0        0      removed
       1       8       39        1      active sync   /dev/sdc7

mdadm --examine /dev/sd[abcdefklmnopqr]7 >>/tmp/raid.status

root@ANAND-NAS:/# mdadm --examine /dev/sd[abcdefklmnopqr]7 >>/tmp/raid.status
root@ANAND-NAS:/# cat /tmp/raid.status



/dev/sda7:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : bb1662c4:8b29e668:ef84a145:ef29ee17
           Name : ANAND-NAS:5  (local to host ANAND-NAS)
  Creation Time : Wed Mar 25 06:34:07 2020
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 3906989856 (1863.00 GiB 2000.38 GB)
     Array Size : 1953494912 (1863.00 GiB 2000.38 GB)
  Used Dev Size : 3906989824 (1863.00 GiB 2000.38 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=32 sectors
          State : clean
    Device UUID : 93621687:1b547db6:710969ff:0dc48e9b

    Update Time : Fri Aug 27 23:26:41 2021
       Checksum : 2d7f219f - correct
         Events : 388516


   Device Role : Active device 0
   Array State : AA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc7:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : bb1662c4:8b29e668:ef84a145:ef29ee17
           Name : ANAND-NAS:5  (local to host ANAND-NAS)
  Creation Time : Wed Mar 25 06:34:07 2020
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 3906989856 (1863.00 GiB 2000.38 GB)
     Array Size : 1953494912 (1863.00 GiB 2000.38 GB)
  Used Dev Size : 3906989824 (1863.00 GiB 2000.38 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=32 sectors
          State : clean
    Device UUID : 8d6d572c:96c53779:a01333cf:4d4322eb

    Update Time : Thu Sep 23 11:18:37 2021
       Checksum : b4fc2ff5 - correct
         Events : 395326


   Device Role : Active device 1
   Array State : .A ('A' == active, '.' == missing, 'R' == replacing)

 

Again, thanks for helping.


root@ANAND-NAS:~# vgdisplay
  --- Volume group ---
  VG Name               vg1000
  System ID
  Format                lvm2
  Metadata Areas        3
  Metadata Sequence No  10
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               0
  Max PV                0
  Cur PV                3
  Act PV                3
  VG Size               5.45 TiB
  PE Size               4.00 MiB
  Total PE              1428427
  Alloc PE / Size       1428427 / 5.45 TiB
  Free  PE / Size       0 / 0
  VG UUID               Mke5hF-n7l1-wMpl-9Te4-Rvfq-Z21n-EQqcFk


root@ANAND-NAS:~# lvm pvscan
  PV /dev/md2   VG vg1000   lvm2 [1.81 TiB / 0    free]
  PV /dev/md3   VG vg1000   lvm2 [1.82 TiB / 0    free]
  PV /dev/md5   VG vg1000   lvm2 [1.82 TiB / 0    free]
  Total: 3 [5.45 TiB] / in use: 3 [5.45 TiB] / in no VG: 0 [0   ]

 


Ok, let's define a few things first:

 

1. Storage Pool - the suite of arrays that make up your storage system

2. Volume - the filesystem built upon the Storage Pool

3. Crashed (Storage Pool) - one or more arrays cannot be started and/or the vg that binds them cannot be started

4. Crashed (Volume) - the volume cannot be mounted for a variety of reasons

5. Crashed (Disk) - the disk probably has an unrecoverable bad sector which has resulted in array failure and possible data loss. The disk may still be operational.

6. Failing (Disk) - this means that the drive is failing a SMART test. The disk may be (and probably is) still operational.

 

Your system has four drives.  It looks like three of them participate in a SHR which spans three arrays (md2, md3, md5).  Each array has its own redundancy, but all need to be operational in order to access data in the Storage Pool.  One drive seems to be set up as a Basic volume all by itself (Disk 4, sdd, md4).  If this doesn't sound right to you, please explain what is incorrect.

 

Otherwise, this is your disk layout.  Please note that the rank order of your arrays, and the members within the arrays, is not at all consistent.  This is not a problem, but it can be confusing and a source of error if you are trying to correct array problems via command line.

 

(attached image: disk and array layout map)

 

This means that arrays /dev/md2 and /dev/md3 are degraded because of missing /dev/sdb (Drive 2) and array /dev/md5 is degraded because of missing /dev/sda. This is a bit unusual (missing/stale array members that span multiple physical disks).  And it also means that there is cross-linked loss of redundancy across the arrays.  In simple terms, you have no redundancy and ALL THREE DRIVES are required to provide access to your data.
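
For reference, a minimal sketch (not from the original exchange; array and device names are taken from the outputs above) of how to reproduce this kind of map at the command line:

# Hedged sketch: list each SHR member array with its disks, then the LVM layer on top
for md in /dev/md2 /dev/md3 /dev/md5; do
    echo "== $md =="
    mdadm --detail "$md" | egrep 'Raid Level|State :|/dev/sd'
done
pvs -o pv_name,vg_name,pv_size            # which arrays act as LVM physical volumes
lvs -o lv_name,vg_name,lv_size,devices    # how vg1000/lv is laid out across them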

 

On 9/18/2021 at 5:30 AM, kaku said:

Also, I may have made one of the arrays read-only by following this post - DANGER: Raid crashed. Help me restore my data!

 

I'm not 100% sure what you actually did here. Did you stop one of the drives using the UI?  If so, your Storage Pool should now show Crashed and your data should not be accessible.  If this is the case, do a new cat /proc/mdstat and post the results.

 

If the Storage Pool still shows Degraded, then your data should be accessible. In that case, your #1 job is to copy all the data off of your NAS, because if ANYTHING goes wrong now, your data is lost. Don't try and repair anything until all your data is copied and safe.  Then we can experiment with restoring redundancy and replacing drives.


10 hours ago, flyride said:

One drive seems to be set up as a Basic volume all by itself (Disk 4, sdd, md4)

Correct. Drive 4 is just another storage pool (volume2) with no RAID, which contains non-important data. We can ignore it.

 

10 hours ago, flyride said:

I'm not 100% sure what you actually did here. Did you stop one of the drives using the UI? 

I did mdadm --misc -o /dev/md2 from your comment here. I think I did it on md2, but does this carry over after a reboot? I didn't want to keep the NAS running, to avoid more damage to the disks.
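
(For reference, a minimal sketch, not from the thread, of how to check whether that read-only flag survived the reboot, and how to clear it if needed:)

cat /sys/block/md2/md/array_state     # "readonly" means the flag is still in effect; "clean"/"active" means it is not
mdadm --misc --readwrite /dev/md2     # counterpart of 'mdadm --misc -o' (--readonly); only run this once you actually want writes again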

 

14 hours ago, flyride said:

you have no redundancy and ALL THREE DRIVES are required to provide access to your data.

Well, volume1 was accessible in degraded mode even after Drive 2 was kicked out due to bad sectors. It's when Drive 1 "crashed" (it's a relatively new drive) that the volume crashed. There is also loss of the shared folder configuration. Then my brother reinstalled the Photo Station app, thinking it might have had some error (I know I shouldn't have given him this much power over the NAS, but he had physical access, so I thought it might be useful :|). That's the last thing we did before shutting it down for good.

 

 

14 hours ago, flyride said:

If this is the case, do a new cat /proc/mdstat and post the results.

Current state: the storage pool is degraded and Volume1 is crashed. The data is not accessible.

 

cat /proc/mdstat

root@ANAND-NAS:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [raidF1]
md5 : active raid1 sdc7[1]
      1953494912 blocks super 1.2 [2/1] [_U]

md2 : active raid5 sdc5[3] sda5[5]
      1943862912 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/2] [_UU]

md3 : active raid5 sda6[5] sdc6[4]
      1953485824 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/2] [_UU]

md4 : active raid1 sdd3[0]
      971940544 blocks super 1.2 [1/1] [U]

md1 : active raid1 sda2[0] sdb2[1] sdc2[2] sdd2[3]
      2097088 blocks [12/4] [UUUU________]

md0 : active raid1 sdb1[2] sdc1[1] sdd1[3]
      2490176 blocks [12/3] [_UUU________]

unused devices: <none>

(attachments: storage.PNG, volume crashed.PNG)


Ok, we know your Storage Pool is functional but degraded.  Again, degraded = loss of redundancy, not your data, as long as we don't do something to break it further.

 

Your btrfs filesystem has some corruption that appears to be causing it not to mount.

 

btrfs attempts to heal itself in real time, and when it cannot, it throws errors or refuses to mount.  Your goal is to get it to mount read-only through a combination of discovery and leveraging the redundancy within btrfs.

 

btrfs is not like ext4 and fsck. You should not expect to "fix" the problem.  If you can make your data accessible through command-line directives, then COPY everything off, delete the volume entirely, and recreate it.
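
As a rough illustration of that approach (these are standard btrfs mount options, but which ones exist depends on the kernel DSM ships, so treat them as candidates rather than guarantees):

mount -o ro,recovery /dev/vg1000/lv /volume1        # older kernels: fall back to a backup tree root
mount -o ro,usebackuproot /dev/vg1000/lv /volume1   # newer kernels: replacement name for 'recovery'
mount -o ro,nologreplay /dev/vg1000/lv /volume1     # read-only without replaying the log tree, if that is the damaged part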

 

I'm no expert at ferreting out btrfs problems, but you might find some of the strategies in this thread helpful (starting with the linked post):

https://xpenology.com/forum/topic/14337-volume-crash-after-4-months-of-stability/?do=findComment&comment=107979

 

 

Post back here if you aren't sure what to do.


Thanks for sticking with me!!

 

Oh no, I found that volume1 is mounted but with "blank data". Is this overwritten? On the other hand, I could not find a current mount for volume1 in cat /etc/mtab (attached).


root@ANAND-NAS:/volume1$ ls -lr
total 32
-rw-rw----  1 system log  19621 Aug 28 22:08 @SynoFinder-log
drwxr-xr-x  3 root   root  4096 Aug 28 22:07 @SynoFinder-LocalShadow
drwxr-xr-x  3 root   root  4096 Aug 28 22:07 Plex
drwxrwxrwx 14 root   root  4096 Aug 28 22:07 @eaDir

So, am I still good to go?

 

 

I did the following and got the same error each time.

sudo mount /dev/vg1000/lv /volume1

sudo mount -o clear_cache /dev/vg1000/lv /volume1

sudo mount -o recovery /dev/vg1000/lv /volume1

sudo mount -o recovery,ro /dev/vg1000/lv /volume1

mount: wrong fs type, bad option, bad superblock on /dev/vg1000/lv,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

 

dmesg | tail

 

root@ANAND-NAS:~#  dmesg | tail
[35831.857718] parent transid verify failed on 88007344128 wanted 7582040 found 7582038
[35831.857733] md2: syno_self_heal_is_valid_md_stat(496): md's current state is not suitable for data correction
[35831.867644] md2: syno_self_heal_is_valid_md_stat(496): md's current state is not suitable for data correction
[35831.877616] parent transid verify failed on 88007344128 wanted 7582040 found 7582038
[35831.877641] parent transid verify failed on 88007344128 wanted 7582040 found 7582038
[35831.877643] BTRFS error (device dm-0): BTRFS: dm-0 failed to repair parent transid verify failure on 88007344128, mirror = 2

[35831.890418] parent transid verify failed on 88007344128 wanted 7582040 found 7582041
[35831.890476] BTRFS: Failed to read block groups: -5
[35831.915227] BTRFS: open_ctree failed

 

Next, I tried:
sudo btrfs rescue super /dev/vg1000/lv

root@ANAND-NAS:~#  sudo btrfs rescue super /dev/vg1000/lv
All supers are valid, no need to recover

 

sudo btrfs-find-root /dev/vg1000/lv

The dump is big, so I'm showing the first few lines and attaching the full dump.

root@ANAND-NAS:~#  sudo btrfs-find-root /dev/vg1000/lv
parent transid verify failed on 88007344128 wanted 7582040 found 7582041
parent transid verify failed on 88007344128 wanted 7582040 found 7582041
parent transid verify failed on 88007344128 wanted 7582040 found 7582038
parent transid verify failed on 88007344128 wanted 7582040 found 7582041
Ignoring transid failure
incorrect offsets 15625 135
Superblock thinks the generation is 7582040
Superblock thinks the level is 1
Found tree root at 88006344704 gen 7582040 level 1
Well block 87994417152(gen: 7582033 level: 0) seems good, but generation/level doesn't match, want gen: 7582040 level: 1
Well block 87992172544(gen: 7582032 level: 0) seems good, but generation/level doesn't match, want gen: 7582040 level: 1
Well block 87981391872(gen: 7582029 level: 0) seems good, but generation/level doesn't match, want gen: 7582040 level: 1
Well block 87975280640(gen: 7582028 level: 1) seems good, but generation/level doesn't match, want gen: 7582040 level: 1
Well block 87968268288(gen: 7582024 level: 1) seems good, but generation/level doesn't match, want gen: 7582040 level: 1
Well block 87966810112(gen: 7582020 level: 0) seems good, but generation/level doesn't match, want gen: 7582040 level: 1

 

Finally

sudo btrfs insp dump-s -f /dev/vg1000/lv

root@ANAND-NAS:~#  sudo btrfs insp dump-s -f /dev/vg1000/lv
superblock: bytenr=65536, device=/dev/vg1000/lv
---------------------------------------------------------
csum                    0x812b0015 [match]
bytenr                  65536
flags                   0x1
                        ( WRITTEN )
magic                   _BHRfS_M [match]
fsid                    8ae2592d-0773-4a58-86cd-9be492d7cabe
label                   2017.02.11-14:42:46 v8451
generation              7582040
root                    88006344704
sys_array_size          226
chunk_root_generation   6970521
root_level              1
chunk_root              21037056
chunk_root_level        1
log_root                88008196096
log_root_transid        0
log_root_level          0
total_bytes             5991257079808
bytes_used              3380737892352
sectorsize              4096
nodesize                16384
leafsize                16384
stripesize              4096
root_dir                6
num_devices             1
compat_flags            0x0
compat_ro_flags         0x0
incompat_flags          0x16b
                        ( MIXED_BACKREF |
                          DEFAULT_SUBVOL |
                          COMPRESS_LZO |
                          BIG_METADATA |
                          EXTENDED_IREF |
                          SKINNY_METADATA )
csum_type               0
csum_size               4
cache_generation        312768
uuid_tree_generation    7582040
dev_item.uuid           4e64f8b3-55a5-4625-82b1-926c902a62e0
dev_item.fsid           8ae2592d-0773-4a58-86cd-9be492d7cabe [match]
dev_item.type           0
dev_item.total_bytes    5991257079808
dev_item.bytes_used     3528353382400
dev_item.io_align       4096
dev_item.io_width       4096
dev_item.sector_size    4096
dev_item.devid          1
dev_item.dev_group      0
dev_item.seek_speed     0
dev_item.bandwidth      0
dev_item.generation     0
sys_chunk_array[2048]:
        item 0 key (FIRST_CHUNK_TREE CHUNK_ITEM 0)
                chunk length 4194304 owner 2 stripe_len 65536
                type SYSTEM num_stripes 1
                        stripe 0 devid 1 offset 0
                        dev uuid: 4e64f8b3-55a5-4625-82b1-926c902a62e0
        item 1 key (FIRST_CHUNK_TREE CHUNK_ITEM 20971520)
                chunk length 8388608 owner 2 stripe_len 65536
                type SYSTEM|DUP num_stripes 2
                        stripe 0 devid 1 offset 20971520
                        dev uuid: 4e64f8b3-55a5-4625-82b1-926c902a62e0
                        stripe 1 devid 1 offset 29360128
                        dev uuid: 4e64f8b3-55a5-4625-82b1-926c902a62e0
backup_roots[4]:
        backup 0:
                backup_tree_root:       88006344704     gen: 7582040    level: 1
                backup_chunk_root:      21037056        gen: 6970521    level: 1
                backup_extent_root:     88004870144     gen: 7582040    level: 2
                backup_fs_root:         29556736        gen: 6  level: 0
                backup_dev_root:        88006639616     gen: 7582040    level: 1
                backup_csum_root:       88007917568     gen: 7582041    level: 3
                backup_total_bytes:     5991257079808
                backup_bytes_used:      3380737892352
                backup_num_devices:     1

        backup 1:
                backup_tree_root:       88004640768     gen: 7582037    level: 1
                backup_chunk_root:      21037056        gen: 6970521    level: 1
                backup_extent_root:     88002723840     gen: 7582037    level: 2
                backup_fs_root:         29556736        gen: 6  level: 0
                backup_dev_root:        29802496        gen: 7581039    level: 1
                backup_csum_root:       87998988288     gen: 7582038    level: 3
                backup_total_bytes:     5991257079808
                backup_bytes_used:      3380737863680
                backup_num_devices:     1

        backup 2:
                backup_tree_root:       88009687040     gen: 7582038    level: 1
                backup_chunk_root:      21037056        gen: 6970521    level: 1
                backup_extent_root:     88004018176     gen: 7582039    level: 2
                backup_fs_root:         29556736        gen: 6  level: 0
                backup_dev_root:        29802496        gen: 7581039    level: 1
                backup_csum_root:       88001593344     gen: 7582039    level: 3
                backup_total_bytes:     5991257079808
                backup_bytes_used:      3380737896448
                backup_num_devices:     1

        backup 3:
                backup_tree_root:       88013193216     gen: 7582039    level: 1
                backup_chunk_root:      21037056        gen: 6970521    level: 1
                backup_extent_root:     88004018176     gen: 7582039    level: 2
                backup_fs_root:         29556736        gen: 6  level: 0
                backup_dev_root:        29802496        gen: 7581039    level: 1
                backup_csum_root:       88001593344     gen: 7582039    level: 3
                backup_total_bytes:     5991257079808
                backup_bytes_used:      3380737949696
                backup_num_devices:     1

 

 

If I want to restore to a new HDD, should its size match the total volume/storage pool size (6TB) or the actually used volume space (I think it was less than 4TB)? I checked the other post and got my answer, but how can I know how much data was ACTUALLY present? Can it be checked from the dumps above?
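
(For what it's worth, a rough way to read this off the dump-s output above, assuming the superblock's bytes_used field reflects the allocated file data:)

echo 3380737892352 | awk '{ printf "%.2f TiB\n", $1 / 1024^4 }'   # bytes_used from the dump, about 3.07 TiB, so a 4TB destination should be enough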

 

 

Next I want to do

sudo btrfs check --init-extent-tree /dev/vg1000/lv

sudo btrfs check --init-csum-tree /dev/vg1000/lv

sudo btrfs check --repair /dev/vg1000/lv

Holding off for now, any suggestions?

 

 

 

(attachments: btrfs-find-root.txt, mtab.txt)


8 minutes ago, kaku said:

Oh no, I found that volume1 is mounted but with "blank data". Is this overwritten?

 

I think perhaps you mean that you see a /volume1 folder.  If the volume is not mounted, that folder will be blank.

 

Check to see what volumes have been mounted with df -v at the command line.  You should see a /volume1 entry connected to your /dev/lv device if it is mounted successfully.

 

You can also sudo mount from the command line and any unmounted but valid volumes will be mounted (or you will receive an error message as to why they cannot be).


Obviously we need to prove that you do not have a filesystem mounted on /volume1, per the previous post. (Incidentally, it looks like you have some jobs - i.e. Plex - that have written to /volume1, and in the absence of the mounted filesystem, those files are in the root filesystem. You should stop Plex from running while this is going on.)

 

I would see if you can do a restore (copy out) using the alternate tree roots before you try and make corrections from the filesystem.

http://www.infotinks.com/btrfs-restoring-a-corrupt-filesystem-from-another-tree-location/
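
A minimal sketch of what that copy-out looks like (the bytenr below is just one of the candidate roots btrfs-find-root reported earlier; the destination path is a placeholder and must be on a separate, healthy filesystem):

btrfs-find-root /dev/vg1000/lv                                  # list candidate tree roots and their generations
btrfs restore -t 87994417152 -D -v /dev/vg1000/lv /mnt/rescue   # -D = dry run, only lists what would be recovered
btrfs restore -t 87994417152 -i -v /dev/vg1000/lv /mnt/rescue   # real copy-out; -i ignores per-file errors and keeps going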

 

If nothing works I would go ahead with at least the first two check options.


29 minutes ago, flyride said:

Check to see what volumes have been mounted with df -v at the command line

I checked. volume1 is not mounted, so /volume1 is just a folder.

 

19 minutes ago, flyride said:

it looks like you have some jobs - ie Plex - that have written

All packages are in "error" mode, so I think it was old cron jobs that ran at the time. (It doesn't contain anything.)

 

 

34 minutes ago, flyride said:

I would see if you can do a restore (copy out) using the alternate tree roots before you try and make corrections from the filesystem.

http://www.infotinks.com/btrfs-restoring-a-corrupt-filesystem-from-another-tree-location/

I will try this tomorrow and post back.


A roadblock. WARNING for a stupid thing I did.

 

13 hours ago, flyride said:

I would see if you can do a restore (copy out) using the alternate tree roots before you try and make corrections from the filesystem.

http://www.infotinks.com/btrfs-restoring-a-corrupt-filesystem-from-another-tree-location/

 

 

I ran the script posted there to find out the best tree root to use. It did show my folder structures intact! But the script ran too long; I inspected why, and sure enough, it was filling the /root dir with GBs of data! I didn't realize it would fill /root so fast. I should have run it on a USB stick. And now I can't log in to DSM or SSH because storage is full!!

 

Quote

You cannot login to the system because the disk space is full currently. Please restart the system and try again.

 

Is there a GRUB/bootloader script I can run to clear the /root folder at startup? Especially the 333* files generated by that script.

 

(attachment: storage full.PNG)


OK

 

Another old post I found which could get me into a root shell is here.
 

Quote

 

Restart

Press the ESC key at the grub prompt

Press e to enter modification mode

Select the line of the starting kernel and press e

To the last line, enter rw init=/bin/bash

Press enter, then b to restart the computer

At this time, the computer will live in the root shell without a password

 

I'm not able to get this to work in grub.cfg.

Should I just add a line in the cfg like so, or do it the way mentioned above?

linux $img/$zImage $common_args $bootdev_args $extra_args $@ rw init=/bin/bash

 


It just says access denied over SSH. Now DSM is also not loading.

I booted into an Ubuntu live environment. I can see md2, md3, md4, md5 and vg1000, same as in DSM (mdadm -Asf && vgchange -ay). I think I can do the tree-root stuff from here too.

But I can't see md0 to correct /root. How can I rebuild md0?
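
(For illustration only, a hedged sketch of assembling DSM's system array from the live environment; md0 is DSM's system partition and the sd*1 member names below are assumptions carried over from the earlier mdstat output, so confirm them with lsblk first:)

mdadm --assemble --run /dev/md0 /dev/sdb1 /dev/sdc1 /dev/sdd1   # members listed for md0 earlier; --run starts it even while degraded
mkdir -p /mnt/dsm-sys
mount /dev/md0 /mnt/dsm-sys
du -sh /mnt/dsm-sys/root                                        # confirm what is filling the system partition
rm -i /mnt/dsm-sys/root/333*                                    # the files the tree-root script generated, per the earlier post
umount /mnt/dsm-sys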



OK. Before I did that, I could see that DSM had stopped loading (a "no HDD connected" error on the web page), and it was also on a different local IP than the one I specified.

But the arrays show up correctly in the live Ubuntu environment. I loaded md0 as you suggested and cleared /root, but the DSM issue is still there.

I think it's best I do the remaining stuff in Ubuntu, as mdstat is exactly the same as it showed in DSM.

Is there ANY other way to boot into DSM? SSH now shows connection refused.



On 9/24/2021 at 10:01 PM, flyride said:

I would see if you can do a restore (copy out) using the alternate tree roots before you try and make corrections from the filesystem.

UPDATE: THANK YOU @flyride!!! I AM SAVED.

 

I have copied out my data to another HDD!!!!! I made another storage pool and volume and copied the data off to it, about 3TB. I think the data is good. How can I check which files were corrupted or not fully restored?

 

On 9/24/2021 at 9:39 PM, kaku said:

sudo btrfs check --init-extent-tree /dev/vg1000/lv

sudo btrfs check --init-csum-tree /dev/vg1000/lv

sudo btrfs check --repair /dev/vg1000/lv

Now that the data is back, should I try repairing the original volume?

 


7 hours ago, kaku said:

I have copied out my data to another HDD!!!!! I made another storage pool and volume and copied the data off to it, about 3TB. I think the data is good. How can I check which files were corrupted or not fully restored?

 

Did you have btrfs checksum "on" for the affected volume?  If so, btrfs would have told you if there was data corruption via pop-up in the UI.  Normally it would also fix it, but you have no redundancy.  Without the checksum, there is no way to determine corruption.  If files were missing for some reason, that is also not really detectable.  A good reason to have a real backup somewhere...
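
(One hedged way to narrow it down, assuming checksums were enabled: the kernel logs a message for each block that fails verification during reads, so the logs from the session when the data was copied can point at affected files. The exact wording varies by kernel version:)

dmesg | grep -i 'csum failed'                        # in-memory kernel log from the current boot
grep -i 'csum failed' /var/log/messages 2>/dev/null  # older messages, if the system kept them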

 

(attached screenshot)

 

7 hours ago, kaku said:

Now that the data is back, should I try repairing the original volume?

 

I say this often: btrfs will fix itself if it can.  If it cannot, there is underlying corruption that MAY be fixable via filesystem correction tools, but they probably won't fully address the problem. Linux culture holds that ext4/fsck can fix ANYTHING and that you should always have confidence in the filesystem afterward, but that just isn't true with btrfs.

 

I strongly recommend you delete the volume and rebuild it from scratch.  Since you also have a Storage Pool problem, there is consequently no reason not to delete that as well, replace any drives that are actually faulty, and re-create a clean Storage Pool too.  Then copy your files back in from your backup.

 

Glad this worked out, probably as well as it could have for you given the difficult intermediate steps.

