petersnows

One or more raid groups/ssd caches are crashed


"One or more raid groups/ssd caches are crashed."
My volume 3 shows up as crashed.
It had been working fine for more than a week.

I can still access it, though (all the files are fine).
I would like to know how to change the HD/volume/raid-group status back to normal.
I actually don't know why it shows as crashed.


HP MicroServer Gen8
ESXi 6.5
3 TB hard drive, RDM (Raw Device Mapping)
DSM 6.1-15047
Jun's loader 1.02b
 

** mdstat


root@fs02:/etc/space# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [raidF1]
md2 : active raid1 sdb5[0]
      47695808 blocks super 1.2 [1/1] [U]

md3 : active raid1 sdc3[0](E)
      2925444544 blocks super 1.2 [1/1] [E]

md1 : active raid1 sdc2[1] sdb2[0]
      2097088 blocks [12/2] [UU__________]

md0 : active raid1 sdc1[1] sdb1[0]
      2490176 blocks [12/2] [UU__________]

unused devices: <none>
root@fs02:/etc/space#
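The `(E)` flag next to `sdc3` in the `md3` line is the Synology-specific "error" state that DSM surfaces as "crashed"; the array is still assembled and readable, which matches the files being fine. A minimal sketch to spot flagged members, reading mdstat-style text on stdin so it can be tried against a saved copy (on the live box you would feed it `/proc/mdstat`):

```shell
#!/bin/sh
# List array members carrying the (E) error flag.
# Usage on a live system: sh check_e.sh < /proc/mdstat
grep -o 'sd[a-z][0-9]*\[[0-9]*\](E)' || echo 'no (E) members'
```

Plain mdadm has no such flag; it is a Synology extension, which is why `mdadm --detail` below still reports the array as clean.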

** mdadm


root@fs02:/etc/space# mdadm --detail /dev/md3
/dev/md3:
        Version : 1.2
  Creation Time : Sun Apr 16 19:58:33 2017
     Raid Level : raid1
     Array Size : 2925444544 (2789.92 GiB 2995.66 GB)
  Used Dev Size : 2925444544 (2789.92 GiB 2995.66 GB)
   Raid Devices : 1
  Total Devices : 1
    Persistence : Superblock is persistent

    Update Time : Tue Jun 27 23:15:13 2017
          State : clean
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           Name : fs03:3
           UUID : bbe73c5c:694d1437:20f54790:be2b9bfb
         Events : 5

    Number   Major   Minor   RaidDevice State
       0       8       35        0      active sync   /dev/sdc3
root@fs02:/etc/space#
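The array UUID in this output is what you need if you later recreate the array with the same identity (as in the fix below). A sketch to pull out just the UUID, reading `mdadm --detail` text on stdin:

```shell
#!/bin/sh
# Extract the array UUID from `mdadm --detail` output.
# On a live system: mdadm --detail /dev/md3 | awk '/UUID/ {print $3}'
awk '/UUID/ {print $3}'
```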

** mount


root@fs02:/etc/space# mount
/dev/md0 on / type ext4 (rw,relatime,journal_checksum,barrier,data=ordered)
none on /dev type devtmpfs (rw,nosuid,noexec,relatime,size=1022480k,nr_inodes=255620,mode=755)
none on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
none on /proc type proc (rw,nosuid,nodev,noexec,relatime)
none on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
/tmp on /tmp type tmpfs (rw,relatime)
/run on /run type tmpfs (rw,nosuid,nodev,relatime,mode=755)
/dev/shm on /dev/shm type tmpfs (rw,nosuid,nodev,relatime)
none on /sys/fs/cgroup type tmpfs (rw,relatime,size=4k,mode=755)
cgmfs on /run/cgmanager/fs type tmpfs (rw,relatime,size=100k,mode=755)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,relatime,cpuset,release_agent=/run/cgmanager/agents/cgm-release-agent.cpuset,clone_children)
cgroup on /sys/fs/cgroup/cpu type cgroup (rw,relatime,cpu,release_agent=/run/cgmanager/agents/cgm-release-agent.cpu)
cgroup on /sys/fs/cgroup/cpuacct type cgroup (rw,relatime,cpuacct,release_agent=/run/cgmanager/agents/cgm-release-agent.cpuacct)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,relatime,memory,release_agent=/run/cgmanager/agents/cgm-release-agent.memory)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,relatime,devices,release_agent=/run/cgmanager/agents/cgm-release-agent.devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,relatime,freezer,release_agent=/run/cgmanager/agents/cgm-release-agent.freezer)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,relatime,blkio,release_agent=/run/cgmanager/agents/cgm-release-agent.blkio)
none on /proc/bus/usb type devtmpfs (rw,nosuid,noexec,relatime,size=1022480k,nr_inodes=255620,mode=755)
none on /sys/kernel/debug type debugfs (rw,relatime)
securityfs on /sys/kernel/security type securityfs (rw,relatime)
/dev/mapper/vol1-origin on /volume1 type ext4 (rw,relatime,journal_checksum,synoacl,data=writeback,jqfmt=vfsv0,usrjquota=aquota.user,grpjquota=aquota.group)
/dev/md3 on /volume3 type btrfs (rw,relatime,synoacl,nospace_cache,flushoncommit_threshold=1000,metadata_ratio=50)
none on /config type configfs (rw,relatime)
/dev/mapper/vol1-origin on /opt type ext4 (rw,relatime,journal_checksum,synoacl,data=writeback,jqfmt=vfsv0,usrjquota=aquota.user,grpjquota=aquota.group)
none on /proc/fs/nfsd type nfsd (rw,relatime)
/dev/mapper/vol1-origin on /volume1/@docker/aufs type ext4 (rw,relatime,journal_checksum,synoacl,data=writeback,jqfmt=vfsv0,usrjquota=aquota.user,grpjquota=aquota.group)
root@fs02:/etc/space#

** console logs

[91206.299324] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[91206.301555] ata3.00: irq_stat 0x40000000
[91206.303097] ata3.00: failed command: WRITE DMA
[91206.304644] ata3.00: cmd ca/00:08:80:06:4c/00:00:00:00:00/e0 tag 28 dma 4096 out
[91206.304644]          res 41/02:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device error)
[91206.309667] ata3.00: status: { DRDY ERR }
[91206.699074] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[91206.701293] ata3.00: irq_stat 0x40000000
[91206.702640] ata3.00: failed command: WRITE DMA
[91206.704173] ata3.00: cmd ca/00:08:80:06:4c/00:00:00:00:00/e0 tag 29 dma 4096 out
[91206.704173]          res 41/02:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device error)
[91206.709172] ata3.00: status: { DRDY ERR }
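These repeated `WRITE DMA` failures on ata3 are errors reported by the device itself (`DRDY ERR`), which is the likely reason md marked the member with `(E)`. A quick way to gauge how often this is happening is to count the failed commands per ATA port in a saved kernel log (a sketch; on the live system it would also be worth checking the disk with smartmontools, assuming it is installed):

```shell
#!/bin/sh
# Count ATA "failed command" lines per port in a saved kernel log.
# Usage: sh count_ata_errors.sh < dmesg.txt
grep -o 'ata[0-9.]*: failed command: [A-Z ]*' | sort | uniq -c
```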

 



 


So far I was able to resolve this by doing the following:

   - syno_poweroff_task -d
   - SSH in again
   - mdadm --detail /dev/md3
     Get the UUID; mine is bbe73c5c:694d1437:20f54790:be2b9bfb
   
        root@fs02:~# mdadm --detail /dev/md3
        /dev/md3:
                Version : 1.2
          Creation Time : Sun Apr 16 19:58:33 2017
             Raid Level : raid1
             Array Size : 2925444544 (2789.92 GiB 2995.66 GB)
          Used Dev Size : 2925444544 (2789.92 GiB 2995.66 GB)
           Raid Devices : 1
          Total Devices : 1
            Persistence : Superblock is persistent
        
            Update Time : Wed Jun 28 08:09:20 2017
                  State : clean
         Active Devices : 1
        Working Devices : 1
         Failed Devices : 0
          Spare Devices : 0
        
                   Name : fs03:3
                   UUID : bbe73c5c:694d1437:20f54790:be2b9bfb
                 Events : 5
        
            Number   Major   Minor   RaidDevice State
               0       8       35        0      active sync   /dev/sdc3
        root@fs02:~#
   
   - mdadm --stop /dev/md3         
   - btrfs check --repair /dev/md3
   - mdadm -Cf /dev/md3 -e1.2 -n1 -l1 /dev/sdc3 -ubbe73c5c:694d1437:20f54790:be2b9bfb
   - reboot
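The recreate step is the dangerous one: `mdadm -Cf` writes a fresh superblock, and a typo in the member device or UUID can destroy the array. A dry-run sketch that assembles the command from variables and only prints it, so it can be reviewed before running for real (the device and UUID here are the ones from this thread; substitute your own):

```shell
#!/bin/sh
# Build the mdadm recreate command and print it instead of running it.
MD_DEV=/dev/md3                               # array to recreate
MEMBER=/dev/sdc3                              # its single member partition
UUID=bbe73c5c:694d1437:20f54790:be2b9bfb      # from `mdadm --detail` above
echo "mdadm -Cf $MD_DEV -e1.2 -n1 -l1 $MEMBER -u$UUID"
```

Recreating with `-e1.2` and the original UUID keeps the metadata version and identity the existing superblock reports, so DSM should pick the array back up after the reboot.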

Commands to check the status:

  cat /proc/mdstat
  mdadm --detail /dev/md3
  mdadm --examine /dev/sdc3


So far this issue has happened twice.
It has been OK for the past two days; let's see how it goes.
