satdream

Members
  • Content Count: 66
  • Joined
  • Last visited

Community Reputation: 1 Neutral

About satdream
  • Rank: Regular Member
  1. - Outcome of the update: SUCCESSFUL
     - DSM version prior update: disk migration from DS3615xs / DSM 6.1.2 / loader 1.02b
     - Loader version and model: JUN'S LOADER v1.03b - DS3617xs
     - Using custom extra.lzma: YES
     - Installation type: BAREMETAL - HP Gen8 MicroServer, 12GB RAM, i5 3470T - new install - Dell H310 (LSI 9211-8i, P20 IT mode) + extension rack of mixed SAS/SATA HDDs
     - Additional comments: the Gen8 onboard dual Broadcom NIC works (no need for an additional NIC thanks to the native drivers from IG-88)
  2. Did tests with a fresh install: I confirm that the IronWolf works fine in a new fresh install in a mixed SAS/SATA environment, but I then tried the migration again and hit the same issue ... IronWolf support in DSM 6.2.2-24922 Update 4 has a bug: the disk is put out of the pool as faulty even though its health status is normal ... Closing the topic/my contribution, and thanks again to all of you who sent me suggestions (and private messages), and especially to @flyride! FINISH
  3. Finally got the pool working with all statuses at Normal by removing the IronWolf, then re-inserting it, and adding another disk too, after a long (long) resync. The config is now fully working with DSM 6.2.2-24922 Update 4 ... But the IronWolf is really an issue; I will run other tests (not with my data, but with a fake config) in order to try to understand, but for the moment I have to reconfigure all data accesses etc. Thanks all for the support!
  4. Resync finalized with the new WD 12TB, extracted the 10TB IronWolf = all data available ... resync successful ... But the pool is still failed ... I then removed the unused 5TB Toshiba (since the beginning of these issues I do not understand why this HDD's status was changed to "Initialized" in DSM, given that the RAID manager considers it OK in the RAID volume); as DSM asked for a disk of at least 8TB, I plugged in a new 8TB Seagate NAS HDD ... Resync initiated, 8h estimated ... for the 1st round ... Note: the resync duration estimate is imprecise and does not account for the two resyncs that are required, only the ongoing one: both md3 and md2 must be resynced (SHR keeps md0/md1 as system partitions duplicated on all disks, data on md2/md3 and parity error correction on md4/md5; the bulk of the volume to sync is on md3 and md2, so DSM shows two consecutive repairing/resync actions but is not able to give a cumulated duration estimate).
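[Editor's sketch] The cumulated progress DSM does not show can be estimated by hand: the standard Linux md driver exposes a `sync_completed` file ("sectors done / sectors total") per array under /sys/block/mdX/md/. The snippet below sums those counters across two arrays; since no real array is available here, the sysfs files are mocked under /tmp with made-up sector counts.

```shell
#!/bin/sh
# Sketch: combined resync progress across two md arrays.
# The sysfs files are mocked here; on a live system you would read
# /sys/block/md2/md/sync_completed and /sys/block/md3/md/sync_completed.
mkdir -p /tmp/md2 /tmp/md3
echo "1000000 / 4000000" > /tmp/md2/sync_completed   # sectors done / total
echo "0 / 2000000"       > /tmp/md3/sync_completed   # second resync not started

# Sum "done" and "total" sectors across both arrays, then print one figure.
awk -F'/' '{done += $1; total += $2}
           END {printf "%.0f%% of combined resync done\n", 100*done/total}' \
    /tmp/md2/sync_completed /tmp/md3/sync_completed
```

With the mocked numbers above this prints "17% of combined resync done"; DSM's per-array estimate would instead report the first resync as 25% complete.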
  5. I ran a basic grep -rnw from root on the IronWolf serial number, which returned a limited number of files ... From that I understood that the displayed disk details are now stored in SQLite databases, which I was able to edit with DB Browser ... not difficult to remove SMART test results etc.; it is also in those files that the account, connection etc. logs are stored ... plugging into AHCI/SATA also generated a specific disk_latency file (the Gen8 internal AHCI is a 3Gb/s link, while the Dell H310 has two 6Gb/s links, so DSM is able to determine a latency in the accesses). These are the .SYNOxxxx files listed previously, plus a disk_latency_tmp.db. I then cleaned up, removing in SQLite the records where the IronWolf serial was identified, but no change in the disk status ... apart from removing the log/trace/history of the SMART tests (including the extended one), no change for the disk itself. But now the issue seems to be more precisely linked to pool management, as the disk health status is "Normal" but the allocation is "Faulty" ... which means working out how the pool considers disks in its structure ... and why eSATA has an impact (as for the IronWolf management) ...
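[Editor's sketch] The SQLite cleanup described above can also be done with the sqlite3 command line instead of DB Browser. The table and column names below are invented stand-ins (the real .SYNODISKTESTDB schema is not documented here), and the demo builds a throwaway database rather than touching the Synology files; on a real system, always work on a copy of the .db file.

```shell
#!/bin/sh
# Sketch only: table/column names are assumptions, not the real Synology
# schema. Never edit the live DSM databases directly; copy them first.
DB=/tmp/disk_test_demo.db
rm -f "$DB"

# Build a stand-in database with two fake SMART test records.
sqlite3 "$DB" "CREATE TABLE smart_test (serial TEXT, result TEXT);
INSERT INTO smart_test VALUES ('ZA2XXXXX', 'completed');
INSERT INTO smart_test VALUES ('WD9YYYYY', 'completed');"

# Remove every record matching the target drive's serial number.
sqlite3 "$DB" "DELETE FROM smart_test WHERE serial = 'ZA2XXXXX';"

# Verify: only the other drive's record remains.
sqlite3 "$DB" "SELECT serial FROM smart_test;"
```

As the post notes, this kind of cleanup only erases history; it does not change the disk's "Faulty" allocation status, which lives elsewhere in the pool management layer.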
  6. Some news, a few other tests performed:
     - Installed the IronWolf directly on a SATA port of the Gen8 interface and triggered an install/migration with another boot card = IronWolf listed as "Normal/OK", but the 8TB SATA WD installed in the external enclosure with the other SAS HDDs was not recognized / disks missing
     - Updated the synoconf removing eSATA etc. = the 8TB WD is detected etc. => the IronWolf becomes "Faulty" and is automatically put out of the pool by DSM
     => IronWolf HDDs are managed specifically in DSM, enabling a few additional functions such as monitoring, specific SMART tests etc. BUT it is an issue in my case: in order to support SAS + SATA HDDs in the same enclosure, the parameters modify the way DSM detects the IronWolf ... potentially the eSATA support is used when interfacing the IronWolf ...
     => Not the same behaviour under DSM 6.1, where it worked perfectly; the issue is with 6.2.2
     Current status:
     - Installed a new 12TB WD SATA on the Gen8 AHCI SATA (in addition to the 8 disks installed via the H310 LSI card)
     - The DSM pool manager accepted the disk and initiated a recovery: 1st disk check (took ~12h) = OK
     - Then recovery in progress ... estimated at 24h, to be continued ...
  7. Now working with the integrated NIC using the driver extension published two weeks ago. I run the Gen8 with 6.2.2-24922 Update 4, but I performed a fresh install; no idea if updating from a previous release will work (see the specific way to use the driver extension)
  8. Many thanks, no worries about the 6.1/6.2 release confusion; I provided a lot of info in this topic (and a few mix-ups) ... OK, understood concerning the array rebuild etc. I am with you: the more I check, the more it looks like a "cosmetic" issue as you said ... but after archiving the two files as .bak nothing changed, so I assume the log files do not drive the identification of the faulty disk on 6.2 ... and I have no idea where to change the setting in the config files ... Interesting point: even with the files renamed to .bak, DSM still lists the performed tests and their status in the history ... so the /var/log files are not what it reads. Is it something to do in /var/log/synolog?
     root@Diskstation:/var/log/synolog# ll
     total 296
     drwx------  2 system log    4096 Dec 27 02:01 .
     drwxr-xr-x 19 root   root   4096 Dec 26 20:51 ..
     -rw-r--r--  1 root   root  26624 Dec 27 02:01 .SYNOACCOUNTDB
     -rw-r--r--  1 system log  114688 Dec 27 01:47 .SYNOCONNDB
     -rw-r--r--  1 system log   32768 Dec 27 02:01 .SYNOCONNDB-shm
     -rw-r--r--  1 system log   20992 Dec 27 02:01 .SYNOCONNDB-wal
     -rw-r--r--  1 system log   12288 Dec 25 18:10 .SYNODISKDB
     -rw-rw-rw-  1 root   root   3072 Dec 27 01:50 .SYNODISKHEALTHDB
     -rw-r--r--  1 system log    8192 Dec 27 01:47 .SYNODISKTESTDB
     -rw-r--r--  1 root   root   2048 Dec 22 15:40 .SYNOISCSIDB
     -rw-r--r--  1 system log   14336 Dec 27 01:50 .SYNOSYSDB
     -rw-r--r--  1 system log   32768 Dec 27 01:51 .SYNOSYSDB-shm
     -rw-r--r--  1 system log    1080 Dec 27 01:51 .SYNOSYSDB-wal
     None are editable ... Thanks
     PS: It is 2am and I will stop investigating for the next 2 days as I am away from home; I will continue when back
  9. Sorry for the wrong formulation, I mixed up SMART test and SMART status:
     - The SMART Extended test reported No Errors, everything "Normal" = final disk health status is "Normal"
     - But strangely (see the screenshot of the SMART details) the SMART status shows a few errors (granted, SMART details are sometimes a bit "complex" to analyze): Raw_Read_Error_Rate, Seek_Error_Rate, Hardware_ECC_Recovered. Checking a few forums, it seems not really an issue and common with the IronWolf ... given that the reallocated sector count is at 0 ...
     I feel pretty confident in the SMART Extended test as it is a non-destructive physical test (read/write back of each disk sector) ... In addition the IronWolf has a specific test integrated in DSM, which I ran and which reports No Error (000, Normal)
     Found no smart_test_log.xml, but:
     /var/log/healthtest/dhm_<IronWolfSerialNo>.xz
     /var/log/smart_result/2019-12-25_<longnum>.txz
     I am keeping them while waiting for a recommendation!
     What about rebuilding the array? The sequence to rebuild the array is the well-known one:
     1. umount /opt
     2. umount /volume1
     3. syno_poweroff_task -d
     4. mdadm --stop /dev/mdX
     5. mdadm -Cf /dev/mdxxxx -e1.2 -n1 -l1 /dev/sdxxxx -u<id number>
     6. e2fsck -pvf -C0 /dev/mdxxxx
     7. cat /proc/mdstat
     8. reboot
     But my array looks to be correct, as cat /proc/mdstat gives:
     Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [raidF1]
     md2 : active raid5 sdk5[1] sdh5[7] sdi5[6] sdm5[8] sdn5[4] sdg5[3] sdj5[2]
           27315312192 blocks super 1.2 level 5, 64k chunk, algorithm 2 [8/7] [_UUUUUUU]
     md3 : active raid5 sdk6[1] sdm6[7] sdh6[6] sdi6[5] sdn6[4] sdg6[3] sdj6[2]
           6837200384 blocks super 1.2 level 5, 64k chunk, algorithm 2 [8/7] [_UUUUUUU]
     md5 : active raid5 sdi8[0] sdh8[1]
           3904788864 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/2] [UU_]
     md4 : active raid5 sdn7[0] sdm7[3] sdh7[2] sdi7[1]
           11720987648 blocks super 1.2 level 5, 64k chunk, algorithm 2 [5/4] [UUUU_]
     md1 : active raid1 sdg2[0] sdh2[5] sdi2[4] sdj2[1] sdk2[2] sdl2[3] sdm2[7] sdn2[6]
           2097088 blocks [14/8] [UUUUUUUU______]
     md0 : active raid1 sdg1[0] sdh1[5] sdi1[4] sdj1[1] sdk1[2] sdl1[3] sdm1[6] sdn1[7]
           2490176 blocks [12/8] [UUUUUUUU____]
     unused devices: <none>
     Only the IronWolf disk is considered faulty ... not sure rebuilding the array will reset the disk error. It is completely crazy: the disks are all normal, the array is fully accessible, but DSM considers one disk as faulty and blocks any action (including adding a drive etc.)
     Thx
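[Editor's sketch] The [8/7] [_UUUUUUU] notation in the mdstat output above can be read mechanically: the first pair is configured/active member counts, and each "_" in the map marks a missing member. The snippet below spots degraded arrays in an mdstat-style dump (run here against a saved sample, not the live /proc/mdstat):

```shell
#!/bin/sh
# List degraded md arrays from an mdstat-style dump: a status line whose
# bracketed device map contains '_' has at least one missing member.
cat > /tmp/mdstat.sample <<'EOF'
md2 : active raid5 sdk5[1] sdh5[7]
      27315312192 blocks super 1.2 level 5, 64k chunk, algorithm 2 [8/7] [_UUUUUUU]
md0 : active raid1 sdg1[0] sdh1[1]
      2490176 blocks [2/2] [UU]
EOF

# Remember the array name from each "mdX :" line; flag the following
# "blocks" line if its trailing device map contains an underscore.
awk '/^md/ {name=$1}
     /blocks/ && /\[.*_.*\]$/ {print name " degraded"}' /tmp/mdstat.sample
```

With the sample above this prints only "md2 degraded"; on the poster's real dump every array except none would be flagged, since even md0/md1 carry "_" slots for the unpopulated bays.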
  10. No change; the regenerated disk_overview.xml is now without the disconnected SSDs, and the structure of each disk entry is exactly the same tag as for the IronWolf:
      <SN_xxxxxxxx model="ST10000VN0004-1ZD101">
        <path>/dev/sdm</path>
        <unc>0</unc>
        <icrc>0</icrc>
        <idnf>0</idnf>
        <retry>0</retry>
      </SN_xxxxxxxx>
      Where does DSM store the disk status? I remember something about Synology having a customized version of the md driver and mdadm toolset that adds a 'DriveError' flag to the rdev->flags structure in the kernel ... but I don't know how to change it ... Thx
  11. Done, removed the XML tag of the IronWolf, but after reboot still the same status ... and I see that the former SSDs I removed from the install (disconnected) are still listed in disk_overview.xml. What about performing a reinstall? Or changing the serial ID of the 3615xs in order to initiate a migration? Will it reset the status? As I would not have to reinstall applications/config etc., it is not a big issue to perform a migration ... Thanks a lot
  12. Here is the screenshot, but sorry it is in French ...
      - The IronWolf is displayed as "En panne" (= "Failed"/"Broken") but with 0 bad sectors (though the SMART status shows a few errors)
      - The pool status is "Degraded", with the failed drive shown with a "Failed" allocation status and the rest of the disks as Normal
      - The list of disks (the unused Toshiba is displayed as "Initialized")
      - And the SMART status of the IronWolf
      Many thanks!
  13. Hi again, latest news: after 20 hours of testing, the IronWolf Pro 10TB is analysed as "Normal" without errors ... but DSM shows it as faulty ... I do not understand (again, sorry), as the configuration is:
      - Toshiba 5TB: sdg, sdj, sdk, sdl
      - Seagate Exos 10TB: sdh, sdi
      - Western Digital 8TB: sdn
      - Seagate IronWolf Pro 10TB: sdm
      And I understand the "Initialized" status is certainly correct for one Toshiba 5TB, while the "faulty" IronWolf Pro 10TB is fully operational ... @flyride could you please confirm my analysis below?
      dmesg extract (disk mapping added by me after "="):
      [ 32.965011] md/raid:md3: device sdk6 operational as raid disk 1 = Toshiba 5TB
      [ 32.965013] md/raid:md3: device sdm6 operational as raid disk 7 = IronWolf Pro 10TB
      [ 32.965013] md/raid:md3: device sdh6 operational as raid disk 6 = Exos 10TB
      [ 32.965014] md/raid:md3: device sdi6 operational as raid disk 5 = Exos 10TB
      [ 32.965015] md/raid:md3: device sdn6 operational as raid disk 4 = WD 8TB
      [ 32.965016] md/raid:md3: device sdg6 operational as raid disk 3 = Toshiba 5TB
      [ 32.965016] md/raid:md3: device sdj6 operational as raid disk 2 = Toshiba 5TB
      ...
      [ 32.965507] md/raid:md3: raid level 5 active with 7 out of 8 devices, algorithm 2
      [ 32.965681] RAID conf printout:
      [ 32.965682]  --- level:5 rd:8 wd:7
      [ 32.965683]  disk 1, o:1, dev:sdk6 = Toshiba 5TB
      [ 32.965684]  disk 2, o:1, dev:sdj6 = Toshiba 5TB
      [ 32.965685]  disk 3, o:1, dev:sdg6 = Toshiba 5TB
      [ 32.965685]  disk 4, o:1, dev:sdn6 = WD 8TB
      [ 32.965686]  disk 5, o:1, dev:sdi6 = Exos 10TB
      [ 32.965687]  disk 6, o:1, dev:sdh6 = Exos 10TB
      [ 32.965688]  disk 7, o:1, dev:sdm6 = IronWolf Pro 10TB
      ...
      mdstat returns:
      Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [raidF1]
      md3 : active raid5 sdk6[1] sdm6[7] sdh6[6] sdi6[5] sdn6[4] sdg6[3] sdj6[2]
            6837200384 blocks super 1.2 level 5, 64k chunk, algorithm 2 [8/7] [_UUUUUUU]
      md2 : active raid5 sdk5[1] sdh5[7] sdi5[6] sdm5[8] sdn5[4] sdg5[3] sdj5[2]
            27315312192 blocks super 1.2 level 5, 64k chunk, algorithm 2 [8/7] [_UUUUUUU]
      md5 : active raid5 sdi8[0] sdh8[1]
            3904788864 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/2] [UU_]
      md4 : active raid5 sdn7[0] sdm7[3] sdh7[2] sdi7[1]
            11720987648 blocks super 1.2 level 5, 64k chunk, algorithm 2 [5/4] [UUUU_]
      md1 : active raid1 sdg2[0] sdh2[5] sdi2[4] sdj2[1] sdk2[2] sdl2[3] sdm2[7] sdn2[6]
            2097088 blocks [14/8] [UUUUUUUU______]
      md0 : active raid1 sdg1[0] sdh1[5] sdi1[4] sdj1[1] sdk1[2] sdl1[3] sdm1[6] sdn1[7]
            2490176 blocks [12/8] [UUUUUUUU____]
      => In all cases the Toshiba 5TB identified as "sdl" is never used, except in the md0/md1 partitions (= status "Initialized") ... and the "faulty" IronWolf Pro (identified as "sdm") is working ... !!!
      mdadm --detail /dev/md2 gives this result:
      /dev/md2:
              Version : 1.2
        Creation Time : Thu Aug  3 09:46:31 2017
           Raid Level : raid5
           Array Size : 27315312192 (26049.91 GiB 27970.88 GB)
        Used Dev Size : 3902187456 (3721.42 GiB 3995.84 GB)
         Raid Devices : 8
        Total Devices : 7
          Persistence : Superblock is persistent
          Update Time : Thu Dec 26 18:41:14 2019
                State : clean, degraded
       Active Devices : 7
      Working Devices : 7
       Failed Devices : 0
        Spare Devices : 0
               Layout : left-symmetric
           Chunk Size : 64K
                 Name : Diskstation:2  (local to host Diskstation)
                 UUID : 42b3969c:b7f55548:6fb5d6d4:f70e8e8b
               Events : 63550
          Number   Major   Minor   RaidDevice State
             -       0        0        0      removed
             1       8      165        1      active sync   /dev/sdk5
             2       8      149        2      active sync   /dev/sdj5
             3       8      101        3      active sync   /dev/sdg5
             4       8      213        4      active sync   /dev/sdn5
             8       8      197        5      active sync   /dev/sdm5
             6       8      133        6      active sync   /dev/sdi5
             7       8      117        7      active sync   /dev/sdh5
      By the way, the RAID mechanisms are working, as the data is accessible ... but DSM does not change the status of the IronWolf Pro ... I was close to a big mistake: I must NOT replace the IronWolf Pro, because it is part of the working RAID, and the 20-hour SMART Extended test on it, showing a correct status without corrupted sectors etc., confirms this. How do I force DSM to change the status of the IronWolf Pro 10TB from faulty to normal?!? Because the Toshiba 5TB is not initiated (and is perhaps the actual faulty drive?) but DSM does not want to initiate it until the 10TB is "replaced" ... which I certainly must not do if I want to keep all my data ... Thanks!
  14. In topic: DSM 6.2 Loader
      Gen8 users (and others?), in case of SAS HDDs installed together with SATA on the same Mini-SAS port with mixed SATA/SAS disks (connected via the LSI PCIe card): do NOT configure "SasIdxMap=0". It must not be part of grub.cfg in order to allow detection of the disk mix; if configured, the SATA disks are not mapped by DSM!
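[Editor's sketch] For orientation, this is roughly where the setting lives in Jun's loader; the exact surrounding arguments vary by loader build and the values shown here are illustrative, not a recommendation. The point of the post above is simply that SasIdxMap=0 must be absent:

```
# grub.cfg fragment (illustrative; other sata_args values vary per build).
# With mixed SAS/SATA behind the LSI HBA, leave SasIdxMap out entirely:
set sata_args='DiskIdxMap=0C SataPortMap=1'
# NOT: set sata_args='DiskIdxMap=0C SataPortMap=1 SasIdxMap=0'
```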
  15. Yes, I listed the exact message shown by DSM, which is generic about the installed SSD structure and mentions a "read-write SSD cache" ... but it was obviously configured as "read-only". I fully agree about the "Initialized" status meaning not strictly in use in the disk pool, but I figured either a wrong status, or that it was not possible to access the data given that the faulty HDD is also out of the pool ... and 2 disks out does not allow access to the data (or I am missing something in the pool mechanisms)
      Current status:
      - After 24 hours of parity (coherence) check = OK
      - 2x SSD excluded (cache switched off)
      - 3x 5TB + 1x WD 8TB + 2x Seagate Exos = Normal
      - 5TB Toshiba = Initialized
      - 10TB Seagate IronWolf Pro = still in fault
      => Full SMART test in progress on the IronWolf Pro = ongoing (~20h planned)
      => Spare 10TB disk ordered for exchange asap
      Thanks!
      PS: for Gen8 users with mixed SATA/SAS disks = do not configure "SasIdxMap=0"; it must not be part of grub.cfg in order to allow detection of the disk mix; if configured, the SATA disks are not mapped by DSM.