PnoT

*Update* Slow SSD performance


I'm seeing slow performance from the SSDs in my SuperMicro chassis and can't pinpoint the source of the problem. Throughput for the entire array was in the sub-200MB/sec range; today those numbers have increased a bit, but they are still below what I would expect from a RAID 0 of all SSDs with this setup. I've pieced together everything I can think of in this post in the hope that someone can help me diagnose the problem.

 

One thing I couldn't find is a reliable way to determine the link speed of each drive, as most of the commands I found on the net came back with "".
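One place that might be worth checking is sysfs. On kernels that expose the SAS transport class, each phy gets a `negotiated_linkrate` attribute under `/sys/class/sas_phy`. The snippet below is only a sketch under that assumption: the `show_linkrates` name is made up for illustration, and DSM may not expose this directory at all, in which case it simply prints nothing.

```shell
#!/bin/sh
# Print the negotiated link rate of every SAS/SATA phy the kernel reports.
# Assumption: the SAS transport class is available under /sys/class/sas_phy.
# The base directory can be passed as $1 (handy for testing on another box).
show_linkrates() {
    base="${1:-/sys/class/sas_phy}"
    for phy in "$base"/*; do
        # Skip entries without a readable attribute (or an unmatched glob).
        [ -r "$phy/negotiated_linkrate" ] || continue
        printf '%s\t%s\n' "$(basename "$phy")" "$(cat "$phy/negotiated_linkrate")"
    done
}

show_linkrates
```

Phys that sit behind the backplane expander typically show up with longer names than the HBA's own phys. If `smartctl` is installed, `smartctl -i /dev/sdq` is another option, since it reports the current SATA speed on drives that support it.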

 

Current Setup

X8SIL-F
Xeon 3450
16GB ECC RAM
SuperMicro SC846 Chassis
SAS2 backplane
IBM M1015 flashed to an LSI 9211-8i in IT mode and running R19 firmware.
Single cable from M1015 P0 to PRI_J0 on the backplane
5592.2 Update 3

 

Drive / RAID layout

8 x 4TB WD RED + 2 x 5TB WD RED in SHR
4 x 256GB Samsung 850 Pro in RAID 0

 

Drive Info:

/dev/sdq:

ATA device, with non-removable media
       Model Number:       Samsung SSD 850 PRO 256GB
       Serial Number:      S1SUNSAFC81422B
       Firmware Revision:  EXM02B6Q
       Transport:          Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
Standards:
       Used: unknown (minor revision code 0x0039)
       Supported: 9 8 7 6 5
       Likely used: 9
Configuration:
       Logical         max     current
       cylinders       16383   16383
       heads           16      16
       sectors/track   63      63
       --
       CHS current addressable sectors:   16514064
       LBA    user addressable sectors:  268435455
       LBA48  user addressable sectors:  500118192
       Logical  Sector size:                   512 bytes
       Physical Sector size:                   512 bytes
       Logical Sector-0 offset:                  0 bytes
       device size with M = 1024*1024:      244198 MBytes
       device size with M = 1000*1000:      256060 MBytes (256 GB)
       cache/buffer size  = unknown
       Nominal Media Rotation Rate: Solid State Device
Capabilities:
       LBA, IORDY(can be disabled)
       Queue depth: 32
       Standby timer values: spec'd by Standard, no device specific minimum
       R/W multiple sector transfer: Max = 1   Current = 1
       DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
            Cycle time: min=120ns recommended=120ns
       PIO: pio0 pio1 pio2 pio3 pio4
            Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
       Enabled Supported:
          *    SMART feature set
               Security Mode feature set
          *    Power Management feature set
          *    Write cache
          *    Look-ahead
          *    Host Protected Area feature set
          *    WRITE_BUFFER command
          *    READ_BUFFER command
          *    NOP cmd
          *    DOWNLOAD_MICROCODE
               SET_MAX security extension
          *    48-bit Address feature set
          *    Device Configuration Overlay feature set
          *    Mandatory FLUSH_CACHE
          *    FLUSH_CACHE_EXT
          *    SMART error logging
          *    SMART self-test
          *    General Purpose Logging feature set
          *    WRITE_{DMA|MULTIPLE}_FUA_EXT
          *    64-bit World wide name
               Write-Read-Verify feature set
          *    WRITE_UNCORRECTABLE_EXT command
          *    {READ,WRITE}_DMA_EXT_GPL commands
          *    Segmented DOWNLOAD_MICROCODE
          *    Gen1 signaling speed (1.5Gb/s)
          *    Gen2 signaling speed (3.0Gb/s)
          *    Gen3 signaling speed (6.0Gb/s)
          *    Native Command Queueing (NCQ)
          *    Phy event counters
          *    unknown 76[15]
          *    DMA Setup Auto-Activate optimization
               Device-initiated interface power management
          *    Asynchronous notification (eg. media change)
          *    Software settings preservation
               unknown 78[8]
          *    SMART Command Transport (SCT) feature set
          *    SCT LBA Segment Access (AC2)
          *    SCT Error Recovery Control (AC3)
          *    SCT Features Control (AC4)
          *    SCT Data Tables (AC5)
          *    reserved 69[4]
          *    DOWNLOAD MICROCODE DMA command
          *    SET MAX SETPASSWORD/UNLOCK DMA commands
          *    WRITE BUFFER DMA command
          *    READ BUFFER DMA command
          *    Data Set Management TRIM supported (limit 8 blocks)
Security:
       Master password revision code = 65534
               supported
       not     enabled
       not     locked
       not     frozen
       not     expired: security count
               supported: enhanced erase
       2min for SECURITY ERASE UNIT. 2min for ENHANCED SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 50025388a08df889
       NAA             : 5
       IEEE OUI        : 002538
       Unique ID       : 8a08df889
Checksum: correct

 

{ /volume3}-> dmesg | grep "Write cache"
[    8.714585] sd 0:0:15:0: [sdp] Write cache: enabled, read cache: enabled, supports DPO and FUA
[    8.714698] sd 0:0:8:0: [sdi] Write cache: enabled, read cache: enabled, supports DPO and FUA
[    8.715425] sd 0:0:9:0: [sdj] Write cache: enabled, read cache: enabled, supports DPO and FUA
[    8.716042] sd 0:0:10:0: [sdk] Write cache: enabled, read cache: enabled, supports DPO and FUA
[    8.716334] sd 0:0:12:0: [sdm] Write cache: enabled, read cache: enabled, supports DPO and FUA
[    8.716425] sd 0:0:11:0: [sdl] Write cache: enabled, read cache: enabled, supports DPO and FUA
[    8.716564] sd 0:0:14:0: [sdo] Write cache: enabled, read cache: enabled, supports DPO and FUA
[    8.769484] sd 0:0:13:0: [sdn] Write cache: enabled, read cache: enabled, supports DPO and FUA
[    8.772838] sd 0:0:5:0: [sdf] Write cache: enabled, read cache: enabled, supports DPO and FUA
[    8.773529] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA
[    8.773881] sd 0:0:6:0: [sdg] Write cache: enabled, read cache: enabled, supports DPO and FUA
[    8.775287] sd 0:0:2:0: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
[    8.778207] sd 0:0:3:0: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
[    8.778531] sd 0:0:1:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA
[    8.779942] sd 0:0:4:0: [sde] Write cache: enabled, read cache: enabled, supports DPO and FUA
[    8.790654] sd 0:0:7:0: [sdh] Write cache: enabled, read cache: enabled, supports DPO and FUA
[   66.494905] sd 7:0:0:0: [synoboot] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[236816.271293] sd 0:0:17:0: [sdq] Write cache: enabled, read cache: enabled, supports DPO and FUA
[236830.021295] sd 0:0:18:0: [sdr] Write cache: enabled, read cache: enabled, supports DPO and FUA
[236838.521456] sd 0:0:19:0: [sds] Write cache: enabled, read cache: enabled, supports DPO and FUA
[236927.771667] sd 0:0:20:0: [sdt] Write cache: enabled, read cache: enabled, supports DPO and FUA

 

The hdparm results for each drive, and finally for the overall array, look right on the money, so why are the tests with dd so horrible?

 

Disk /dev/sdq: 256GB
Disk /dev/sdr: 256GB
Disk /dev/sds: 256GB
Disk /dev/sdt: 256GB

hdparm -tT --direct /dev/sdr
/dev/sdr:
Timing O_DIRECT cached reads:   946 MB in  2.00 seconds = 472.71 MB/sec
Timing O_DIRECT disk reads: 1468 MB in  3.00 seconds = 489.13 MB/sec

hdparm -tT --direct /dev/sds
/dev/sds:
Timing O_DIRECT cached reads:   966 MB in  2.00 seconds = 482.10 MB/sec
Timing O_DIRECT disk reads: 1476 MB in  3.00 seconds = 491.79 MB/sec

hdparm -tT --direct /dev/sdt
/dev/sdt:
Timing O_DIRECT cached reads:   962 MB in  2.00 seconds = 480.54 MB/sec
Timing O_DIRECT disk reads: 1464 MB in  3.00 seconds = 487.95 MB/sec

hdparm -tT --direct /dev/sdq
/dev/sdq:
Timing O_DIRECT cached reads:   964 MB in  2.00 seconds = 481.62 MB/sec
Timing O_DIRECT disk reads: 1466 MB in  3.00 seconds = 488.58 MB/sec

hdparm -tT --direct /dev/vg3/volume_3
/dev/vg3/volume_3:
Timing O_DIRECT cached reads:   2880 MB in  2.00 seconds = 1439.44 MB/sec
Timing O_DIRECT disk reads: 4570 MB in  3.00 seconds = 1523.11 MB/sec
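To put the array number in context, here is a quick sanity check that adds up the four per-drive O_DIRECT disk read rates and compares the total with what the striped volume delivered. The figures are copied from the hdparm output above; the `stripe_efficiency` helper is just an illustrative name.

```shell
#!/bin/sh
# stripe_efficiency ARRAY_RATE DRIVE_RATE...
# Sums the per-drive rates and reports the array rate as a share of that sum.
stripe_efficiency() {
    array="$1"
    shift
    echo "$@" | awk -v array="$array" '{
        sum = 0
        for (i = 1; i <= NF; i++) sum += $i
        printf "sum of drives: %.2f MB/sec\n", sum
        printf "array / sum:   %.0f%%\n", 100 * array / sum
    }'
}

# Array rate first, then sdr, sds, sdt, sdq.
stripe_efficiency 1523.11 489.13 491.79 487.95 488.58
# sum of drives: 1957.45 MB/sec
# array / sum:   78%
```

So with hdparm the RAID 0 reaches roughly three quarters of the combined single-drive read rate, which is why the sub-200MB/sec dd figures look so out of place.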

 

Here is the query against my LSI controller to determine its link speed, which looks like x8 at 5GT/s:

 

lspci -vvv -d 1000:0072
02:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
       Subsystem: Device 1028:1f1c
       Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
       Latency: 0, Cache Line Size: 32 bytes
       Interrupt: pin A routed to IRQ 16
       Region 0: I/O ports at c000 [size=11]
       Region 1: Memory at fb3b0000 (64-bit, non-prefetchable) [size=64K]
       Region 3: Memory at fb3c0000 (64-bit, non-prefetchable) [size=256K]
       Expansion ROM at fb400000 [disabled] [size=1M]
       Capabilities: [50] Power Management version 3
               Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
               Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
       Capabilities: [68] Express (v2) Endpoint, MSI 00
               DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                       ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
               DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
                       RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
                       MaxPayload 256 bytes, MaxReadReq 512 bytes
               DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
               LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s <64ns, L1 <1us
                       ClockPM- Surprise- LLActRep- BwNot-
               LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
                       ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
               LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
               DevCap2: Completion Timeout: Range BC, TimeoutDis+, LTR-, OBFF Not Supported
               DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
               LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                        Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                        Compliance De-emphasis: -6dB
               LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                        EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
       Capabilities: [d0] Vital Product Data
               Unknown small resource type 00, will not decode more.
       Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
               Address: 0000000000000000  Data: 0000
       Capabilities: [c0] MSI-X: Enable+ Count=15 Masked-
               Vector table: BAR=1 offset=0000e000
               PBA: BAR=1 offset=0000f800
       Capabilities: [100 v1] Advanced Error Reporting
               UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
               UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
               UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
               CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
               CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
               AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
       Capabilities: [138 v1] Power Budgeting <?>
       Kernel driver in use: mpt2sas
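As a rough cross-check of where the ceilings sit, the arithmetic below estimates the raw bandwidth of the two links in the path. It assumes 8b/10b encoding on both (10 line bits per data byte), that the single cable carries four 6Gb/s lanes, and it ignores protocol overhead, so real throughput will be somewhat lower. The `link_ceiling_mb` helper is a made-up name for this sketch.

```shell
#!/bin/sh
# link_ceiling_mb RATE_MBIT_PER_LANE LANES
# With 8b/10b encoding, every 10 bits on the wire carry one byte of data,
# so MB/sec per lane = Mbit/sec / 10.
link_ceiling_mb() {
    echo $(( $1 / 10 * $2 ))
}

echo "SAS2 x4 (one cable): $(link_ceiling_mb 6000 4) MB/sec"
echo "PCIe 2.0 x8:         $(link_ceiling_mb 5000 8) MB/sec"
# SAS2 x4 (one cable): 2400 MB/sec
# PCIe 2.0 x8:         4000 MB/sec
```

Both figures are above the roughly 1500MB/sec that hdparm measured on the array, so under these assumptions neither the cable nor the PCIe slot looks like the limit.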

 

If you look at the performance of volume2, which consists of the WD Reds in SHR, it's not bad when using dd and the speeds are very consistent. Keep in mind that this image only shows 7 / 9 drives in the array due to a limitation of the resource monitor tool, so if you add another 2 drives @ 90MB/sec that's well over 1GB/sec.

 

dd if=/dev/zero of=/volume2/test.bin bs=1M count=500M

 

A8htvig.jpg

 

The same command on the RAID 0 of Samsung 850 Pros nets some pretty crappy results. If you look closely you can see huge swings in performance, from 50MB/sec to almost 200MB/sec per drive, which is the complete opposite of how the SHR array of spinning disks performs.

 

dd if=/dev/zero of=/volume3/test.bin bs=1M count=500M

 

QmcPeWa.jpg
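One caveat about the dd runs above: `bs=1M count=500M` asks for 500M blocks of 1MiB each, so dd keeps writing until it is interrupted or the volume fills up, and the writes go through the page cache. A bounded run that flushes before reporting might be easier to compare against hdparm. The sketch below assumes GNU dd (for `conv=fdatasync`); the dd that ships with DSM may not accept it, and `bounded_write` is only an illustrative name.

```shell
#!/bin/sh
# bounded_write TARGET [MIB]
# Writes MIB mebibytes of zeros to TARGET (default 4096, i.e. 4GiB) and
# makes dd flush the data to disk before it prints its throughput figure.
bounded_write() {
    target="$1"
    mib="${2:-4096}"
    dd if=/dev/zero of="$target" bs=1M count="$mib" conv=fdatasync
}

# Example (not run here because it writes 4GiB):
#   bounded_write /volume3/test.bin 4096
#   rm /volume3/test.bin
```

With GNU dd, swapping `conv=fdatasync` for `oflag=direct` would bypass the page cache entirely, which is closer to what `hdparm --direct` does on the read side.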


I updated the original post with a ton of information as I felt there was a lot lacking in it initially.
