New SATA/AHCI cards with more than 4 ports (and no SATA port multiplier)



39 minutes ago, Vaifranz said:

Maybe my problem is the backplane.

That would be my guess too. Reconnection errors usually originate from cable problems; the value refers directly to the S.M.A.R.T. attribute "UDMA_CRC_Error_Count".
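For reference, that counter can be read with smartctl. A minimal sketch; the device name and the sample attribute line below are placeholders, not output from this system:

```shell
# On a live system you would run something like:
#   smartctl -A /dev/sdb
# and look at attribute 199. Here the awk filter is demonstrated on a
# captured sample line. A rising RAW_VALUE points to signal problems on
# the cable/backplane path, not to a failing disk surface.
sample='199 UDMA_CRC_Error_Count   0x003e   200   200   000    Old_age   Always       -       12'
crc_errors=$(printf '%s\n' "$sample" | awk '$2 == "UDMA_CRC_Error_Count" { print $NF }')
echo "UDMA CRC errors: $crc_errors"
```

A value that keeps climbing after reseating cables is a strong hint the backplane itself is marginal.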

I also did a run creating a RAID 5 with the 2nd JMB585 controller on 3617; no reconnection errors.

2 hours ago, Vaifranz said:

[ 3365.801343] ata14: SATA link up 3.0 Gbps (SStatus 123 SControl 300)

That's odd; the "normal" link speed is 6.0 Gbps.

On 2/7/2021 at 2:02 AM, Vaifranz said:

Hi, I am testing the JMB585 card in a DS3617 system, but I am having several problems and I don't understand why.
Fresh installation, DSM 6.2.3 Update 3. During the installation phase everything is OK, but during RAID creation (SHR, RAID 5 or 6) several HDDs disconnect. I tried it on two motherboards, an ASRock Z97 Extreme9 and an ASRock C226WS+, but the problems are the same. The HDDs are WD Red.

 

OK, I did some more tests. Nothing in the case of reconnections (which would point to interface/cable/backplane/connectors); those were still zero. But I did see something "unusual" in the dmesg log, and only for WD disks (I had two 500 GB disks, one 2.5" and one 3.5"); nothing like that with HGST, Samsung, Seagate or a Crucial MX300 SSD.

 

[   98.256360] md: md2: current auto_remap = 0
[   98.256363] md: requested-resync of RAID array md2
[   98.256366] md: minimum _guaranteed_  speed: 10000 KB/sec/disk.
[   98.256366] md: using maximum available idle IO bandwidth (but not more than 600000 KB/sec) for requested-resync.
[   98.256370] md: using 128k window, over a total of 483564544k.
[  184.817938] ata5.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x6 frozen
[  184.825608] ata5.00: failed command: READ FPDMA QUEUED
[  184.830757] ata5.00: cmd 60/00:00:00:8a:cf/02:00:00:00:00/40 tag 0 ncq 262144 in
                        res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  184.845546] ata5.00: status: { DRDY }
[  184.849222] ata5.00: failed command: READ FPDMA QUEUED
[  184.854373] ata5.00: cmd 60/00:08:00:8c:cf/02:00:00:00:00/40 tag 1 ncq 262144 in
                        res 40/00:00:e0:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  184.869165] ata5.00: status: { DRDY }
[  184.872839] ata5.00: failed command: READ FPDMA QUEUED
[  184.877994] ata5.00: cmd 60/00:10:00:8e:cf/02:00:00:00:00/40 tag 2 ncq 262144 in
                        res 40/00:00:e0:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  184.892784] ata5.00: status: { DRDY }
...
[  185.559602] ata5: hard resetting link
[  186.018820] ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  186.022265] ata5.00: configured for UDMA/100
[  186.022286] ata5.00: device reported invalid CHS sector 0
[  186.022331] ata5: EH complete
[  311.788536] ata5.00: exception Emask 0x0 SAct 0x7ffe0003 SErr 0x0 action 0x6 frozen
[  311.796228] ata5.00: failed command: READ FPDMA QUEUED
[  311.801372] ata5.00: cmd 60/e0:00:88:3a:8e/00:00:01:00:00/40 tag 0 ncq 114688 in
                        res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  311.816151] ata5.00: status: { DRDY }
...
[  312.171072] ata5.00: status: { DRDY }
[  312.174841] ata5: hard resetting link
[  312.634480] ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  312.637992] ata5.00: configured for UDMA/100
[  312.638002] ata5.00: device reported invalid CHS sector 0
[  312.638034] ata5: EH complete
[  572.892855] ata5.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x6 frozen
[  572.900523] ata5.00: failed command: READ FPDMA QUEUED
[  572.905680] ata5.00: cmd 60/00:00:78:0a:ec/02:00:03:00:00/40 tag 0 ncq 262144 in
                        res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  572.920462] ata5.00: status: { DRDY }
...
[  573.630587] ata5.00: status: { DRDY }
[  573.634262] ata5: hard resetting link
[  574.093716] ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  574.096662] ata5.00: configured for UDMA/100
[  574.096688] ata5.00: device reported invalid CHS sector 0
[  574.096732] ata5: EH complete
[  668.887853] ata5.00: NCQ disabled due to excessive errors
[  668.887857] ata5.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x6 frozen
[  668.895522] ata5.00: failed command: READ FPDMA QUEUED
[  668.900667] ata5.00: cmd 60/00:00:98:67:53/02:00:04:00:00/40 tag 0 ncq 262144 in
                        res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  668.915449] ata5.00: status: { DRDY }
...
[  669.601057] ata5.00: status: { DRDY }
[  669.604730] ata5.00: failed command: READ FPDMA QUEUED
[  669.609879] ata5.00: cmd 60/00:f0:98:65:53/02:00:04:00:00/40 tag 30 ncq 262144 in
                        res 40/00:00:e0:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  669.624748] ata5.00: status: { DRDY }
[  669.628425] ata5: hard resetting link
[  670.087717] ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  670.090796] ata5.00: configured for UDMA/100
[  670.090814] ata5.00: device reported invalid CHS sector 0
[  670.090859] ata5: EH complete
[ 6108.391162] md: md2: requested-resync done.
[ 6108.646861] md: md2: current auto_remap = 0
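A quick way to see which ata ports are affected in a log like the one above is to count the timed-out commands per port. A sketch; the here-doc sample stands in for live `dmesg` output, which you would normally pipe in:

```shell
# Count "failed command" lines per ata port. With real data, replace
# the printf with:  dmesg | awk ... | sort
log='[  184.825608] ata5.00: failed command: READ FPDMA QUEUED
[  184.849222] ata5.00: failed command: READ FPDMA QUEUED
[  311.796228] ata5.00: failed command: READ FPDMA QUEUED
[  572.900523] ata14.00: failed command: READ FPDMA QUEUED'
counts=$(printf '%s\n' "$log" \
  | awk '/failed command/ { sub(/\.00:$/, "", $3); n[$3]++ }
         END { for (p in n) print p, n[p] }' \
  | sort)
echo "$counts"
```

If the counts follow a disk when it is moved to another port, the disk (or its firmware's NCQ behaviour) is the common factor rather than the port.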

 

I could shift the problem between ports by moving the disk, so it is specific to the WD disks, not to a port.

It was the same on both kernels, 3617 and 918+ (3.10.105 and 4.4.59).

As the error points to NCQ, and I found references to similar cases on the internet, I tried to "fix" it by disabling NCQ in the kernel.

I added "libata.force=noncq" to the kernel parameters in grub.cfg, rebooted, and did the same procedure as before (with 918+), and I did not see the errors. (There will be entries about NCQ not being used for every disk, so it is easy to verify that the kernel parameter is applied as intended.)
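Those confirmation entries can be checked with a simple grep. A sketch; the sample line below stands in for live `dmesg` output (disk identification lines normally show "NCQ (depth 31/32)" when NCQ is active):

```shell
# With real data:  dmesg | grep 'NCQ (not used)'
line='[    1.634051] ata5.00: 976773168 sectors, multi 16: LBA48 NCQ (not used)'
if printf '%s\n' "$line" | grep -q 'NCQ (not used)'; then
  ncq_state="disabled"
else
  ncq_state="enabled"
fi
echo "NCQ is $ncq_state"
```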

In theory it might be possible to disable NCQ only for the disks that are actually WD, but that would need intervention later if anything is changed on the disks.
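For illustration, `libata.force` also accepts per-port prefixes (documented in the kernel's kernel-parameters.txt), so in theory NCQ could be disabled only on the ports where the WD disks sit. The port number 5 below is just an example matching the ata5 messages and would have to be adjusted whenever disks move:

```shell
# Kernel boot parameters for grub.cfg, not shell commands.
# global form: disable NCQ for every disk
libata.force=noncq
# per-port form: disable NCQ only for ata5 (example port number)
libata.force=5:noncq
```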

 

In general there was no problem with the RAIDs I built, even with the NCQ errors, and btrfs had nothing to complain about.

I'd suggest using this when you have WD disks in the system.

 

I'm only using HGST and Seagate on the system with the JMB585, so it was not visible before on my main NAS.

 


Interesting, I have all WD disks, except the ones I use for tests! Does the 3615 kernel also have this "flaw"? Could you be more precise and tell me where to insert the "libata.force=noncq" parameter in grub.cfg? I would also like to do some tests on the systems I have. Thank you.

4 hours ago, Vaifranz said:

Does the 3615 kernel also have this "flaw"?

Same kernel version as 3617; there is no reason to assume a change in the central AHCI code would not be in both.

 

4 hours ago, Vaifranz said:

Could you be more precise and tell me where to insert the "libata.force=noncq" parameter in grub.cfg?

 

Just add it at the end of the line, as an additional parameter after the others:

set common_args_918='syno_hdd_powerup_seq=1 HddHotplug=0 syno_hw_version=DS918+ vender_format_version=2 console=ttyS0,115200n8 withefi elevator=elevator quiet syno_hdd_detect=0 syno_port_thaw=1'

A space separates entries, and the line is closed with the trailing '.

For 3617 it's the line starting with "set common_args_3617=".
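Applied to the 918+ line above, the edited result would look like this, with everything unchanged except the added parameter before the closing ':

```shell
set common_args_918='syno_hdd_powerup_seq=1 HddHotplug=0 syno_hw_version=DS918+ vender_format_version=2 console=ttyS0,115200n8 withefi elevator=elevator quiet syno_hdd_detect=0 syno_port_thaw=1 libata.force=noncq'
```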

2 hours ago, IG-88 said:
7 hours ago, Vaifranz said:

Does the 3615 kernel also have this "flaw"?

Same kernel version as 3617; there is no reason to assume a change in the central AHCI code would not be in both.

That's what I thought. I'm not very well versed in computer systems, just a little passionate.

 

2 hours ago, IG-88 said:
7 hours ago, Vaifranz said:

Could you be more precise and tell me where to insert the "libata.force=noncq" parameter in grub.cfg?

 

Just add it at the end of the line, as an additional parameter after the others:


set common_args_918='syno_hdd_powerup_seq=1 HddHotplug=0 syno_hw_version=DS918+ vender_format_version=2 console=ttyS0,115200n8 withefi elevator=elevator quiet syno_hdd_detect=0 syno_port_thaw=1'

A space separates entries, and the line is closed with the trailing '.

For 3617 it's the line starting with "set common_args_3617=".

Thanks for the tip, I'll try to follow it... in the meantime I did a test:

 

Run: 3617, DSM 6.2.3, RAID 5 on the ASRock Z97 Extreme9, without "libata.force=noncq":

 

Maxtor STM3320820AS on Z97 (onboard SATA)
Maxtor 6L250S0 on Z97 (onboard SATA)
WDC WD6400AAKS-22A7B0 on JMB585 (SATA card)
WDC WD4000KS-00MNB0 on JMB585 (SATA card)
WDC WD3200AAJS-00Z0A0 on JMB585 (SATA card)
WDC WD3200AAKS-00YGA0 on JMB585 (SATA card)
Seagate ST3320620AS on ASMedia ASM1061 (onboard)

 

RAID completed, and no disconnections or warnings except this:

 

[ 5023.397585] perf interrupt took too long (2643 > 5000), lowering kernel.perf_event_max_sample_rate to 50000

 

I don't know what to think, except that my problem is the backplane.

3 minutes ago, Vaifranz said:

I don't know what to think, except that my problem is the backplane

What backplane is it? Some old part that does not support SATA3's 6 Gbps?

8 minutes ago, IG-88 said:

What backplane is it? Some old part that does not support SATA3's 6 Gbps?

I don't know exactly; I would have to take it apart, but it doesn't seem very old to me. The strange thing is that it is running now, not with the JMB585 but with an HBA, a Dell H200 in IT mode (LSI 9211), and it works fine, except that every 4-5 days it freezes: it remains on but is not accessible.

58 minutes ago, Vaifranz said:

I don't know exactly; I would have to take it apart, but it doesn't seem very old to me. The strange thing is that it is running now, not with the JMB585 but with an HBA, a Dell H200 in IT mode (LSI 9211), and it works fine, except that every 4-5 days it freezes: it remains on but is not accessible.

 

Maybe try another NAS distribution like OMV or FreeNAS to see if it works stably.

3 hours ago, IG-88 said:

 

Maybe try another NAS distribution like OMV or FreeNAS to see if it works stably.

Yes, a great idea. I can also try the system without the backplane, so only the mobo, disks, and HBA card. I will update you; in the meantime, thanks for your time.


I did some tests; my system works fine without the 24-bay case backplane. I will test it with other systems like OMV, Unraid, or FreeNAS to understand whether the problem is hardware or software. Here is a photo of the only code present.

21B60B19-299B-4F2A-B873-713712BC773C.jpeg

I searched the net but found nothing about it; the case has the code CSE-S46524. I hope this is useful to someone.

(quoting the dmesg log posted above, with the READ FPDMA QUEUED timeouts and NCQ eventually being disabled)

I had the same problem with this HDD model: WDC WD60EFAX-68SHWN0, connected to the JMB585. So it really is a problem with Western Digital drives, especially the 6 TB WD Red model.

 

I haven't tried the "libata.force=noncq" modification yet. What difference does it make to the system?

  • 5 months later...

Hi, these seem interesting:

 

https://www.sybausa.com/index.php?route=product/product&path=64_181_85&product_id=1087 (quad ASM1064 connected to an ASM2806 PCIe bridge)

 

https://aliexpress.com/item/1005002813788487.html (ASMedia ASM1166 + JMicron JMB5xx)

 

https://aliexpress.com/item/1005002767617226.html (ASM1166, 12-24 ports, even though they are described as ASM1064)

 

Any thoughts?

 

