PnoT Posted August 15, 2015 #1

I'm on the latest and greatest and am seeing what look like timeouts while expanding a volume. During these timeouts none of my volumes are accessible and the NAS pretty much freezes until it's over, which takes about 10-45 seconds.

Aug 15 11:06:00 SYN kernel: [686987.368171] cdb[0]=0x28: 28 00 d3 8c 03 e0 00 03 80 00
Aug 15 11:06:02 SYN kernel: [686987.857817] cdb[0]=0x28: 28 00 d3 8c 07 60 00 00 80 00
Aug 15 11:06:02 SYN kernel: [686987.857847] cdb[0]=0x28: 28 00 d3 8c 07 e0 00 01 00 00
Aug 15 11:06:02 SYN kernel: [686987.857861] cdb[0]=0x28: 28 00 d3 8c 08 e0 00 00 80 00
Aug 15 11:06:02 SYN kernel: [686987.857874] cdb[0]=0x28: 28 00 d3 8c 09 60 00 02 80 00
Aug 15 11:06:02 SYN kernel: [686987.857887] cdb[0]=0x28: 28 00 d3 8c 0b e0 00 02 80 00
Aug 15 11:06:02 SYN kernel: [686987.857900] cdb[0]=0x28: 28 00 d3 8c 0e 60 00 02 80 00
Aug 15 11:06:02 SYN kernel: [686987.857914] cdb[0]=0x28: 28 00 d3 8c 10 e0 00 01 00 00
Aug 15 11:06:02 SYN kernel: [686987.857927] cdb[0]=0x28: 28 00 d3 8c 11 e0 00 00 80 00
Aug 15 11:06:02 SYN kernel: [686987.857941] cdb[0]=0x28: 28 00 d3 8c 12 60 00 00 80 00
Aug 15 11:06:02 SYN kernel: [686987.857954] cdb[0]=0x28: 28 00 d3 8c 12 e0 00 00 78 00
Aug 15 11:06:02 SYN kernel: [686987.857967] cdb[0]=0x28: 28 00 d3 8c 13 58 00 00 08 00
Aug 15 11:06:02 SYN kernel: [686987.857980] cdb[0]=0x28: 28 00 d3 8c 13 60 00 00 80 00
Aug 15 11:06:02 SYN kernel: [686987.857994] cdb[0]=0x28: 28 00 d3 8c 13 e0 00 01 00 00
Aug 15 11:06:02 SYN kernel: [686987.858037] cdb[0]=0x28: 28 00 d3 8c 14 e0 00 00 80 00
Aug 15 11:06:02 SYN kernel: [686987.858050] cdb[0]=0x28: 28 00 d3 8c 15 60 00 00 80 00
Aug 15 11:06:02 SYN kernel: [686987.858064] cdb[0]=0x28: 28 00 d3 8c 15 e0 00 02 80 00
Aug 15 11:06:02 SYN kernel: [686987.858077] cdb[0]=0x28: 28 00 d3 8c 18 60 00 00 80 00
Aug 15 11:06:02 SYN kernel: [686987.858091] cdb[0]=0x28: 28 00 d3 8c 18 e0 00 02 80 00
Aug 15 11:06:02 SYN kernel: [686987.858104] cdb[0]=0x28: 28 00 d3 8c 1b 60 00 00 80 00
Aug 15 11:06:02 SYN kernel: [686987.858117] cdb[0]=0x28: 28 00 d3 8c 1b e0 00 01 00 00
Aug 15 11:06:02 SYN kernel: [686987.858130] cdb[0]=0x28: 28 00 d3 8c 1c e0 00 01 00 00
Aug 15 11:06:02 SYN kernel: [686987.858144] cdb[0]=0x28: 28 00 d3 8c 1d e0 00 00 80 00
Aug 15 11:06:02 SYN kernel: [686987.858157] cdb[0]=0x28: 28 00 d3 8c 1e 60 00 00 80 00
Aug 15 11:06:02 SYN kernel: [686987.858170] cdb[0]=0x28: 28 00 d3 8c 1e e0 00 00 80 00
Aug 15 11:06:02 SYN kernel: [686987.858184] cdb[0]=0x28: 28 00 d3 8c 1f 60 00 00 80 00
Aug 15 11:06:02 SYN kernel: [686987.858197] cdb[0]=0x28: 28 00 d3 8c 1f e0 00 00 80 00
Aug 15 11:06:02 SYN kernel: [686987.858210] cdb[0]=0x28: 28 00 d3 8c 20 60 00 00 80 00
Aug 15 11:06:02 SYN kernel: [686987.858224] cdb[0]=0x28: 28 00 d3 8c 20 e0 00 00 80 00
Aug 15 11:06:02 SYN kernel: [686987.858237] cdb[0]=0x28: 28 00 d3 8c 21 60 00 01 00 00
Aug 15 11:06:02 SYN kernel: [686987.858250] cdb[0]=0x28: 28 00 d3 8c 22 60 00 00 80 00
Aug 15 11:06:02 SYN kernel: [686987.858263] cdb[0]=0x28: 28 00 d3 8c 22 e0 00 04 00 00

This is also showing up, which looks like some type of timeout:

Aug 15 05:15:53 SYN kernel: [665966.766989] mpt2sas0: log_info(0x31120436): originator(PL), code(0x12), sub_code(0x0436)

It looks like the LSI SAS card is having issues. Any ideas?
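For anyone reading the cdb lines above: opcode 0x28 is the SCSI READ(10) command, so each line is a read request that was in flight. The sketch below decodes one, assuming the standard 10-byte READ(10) layout (opcode, flags, 4-byte big-endian LBA, group number, 2-byte transfer length, control). The function name `decode_read10` is made up for illustration and is not part of any tool mentioned in this thread.

```python
import struct


def decode_read10(cdb_hex: str) -> dict:
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes.

    Assumes the standard 10-byte layout; raises ValueError otherwise.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    # ">" means big-endian with no padding: B B I B H B is 10 bytes in total.
    _opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    return {"lba": lba, "blocks": blocks}


if __name__ == "__main__":
    # The first two CDBs from the log above.
    first = decode_read10("28 00 d3 8c 03 e0 00 03 80 00")
    second = decode_read10("28 00 d3 8c 07 60 00 00 80 00")
    print(hex(first["lba"]), first["blocks"])
    # True when the second read starts where the first one ends.
    print(first["lba"] + first["blocks"] == second["lba"])
```

Decoded this way, the logged commands look like one sequential run of reads (each LBA picks up where the previous request ended), which would fit a volume expansion being in progress.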
PnoT Posted August 18, 2015 Author #2

I've finally figured out what was happening, and it has to do with the 2TB Samsung F3s. These drives have been rock solid since the dawn of time, but apparently they time out the controller in XPEnology for some odd reason. I've had no issues with them in my 1812/1815, so at this point I'm not sure whether it's due to the LSI driver that was recently updated in XPEnology or some odd incompatibility between the controller and the drives. The firmware on the controller and the drives is the latest, and there are no failures on the drives themselves.
baruch Posted September 7, 2015 #3

The LSI log info code decodes to an IO Aborted condition. The error is most likely caused by a bad cable, but it can also be due to a bad port. I suggest checking the cables first, as they are cheaper to replace.

If it's a SAS disk, you can look at the SAS counters (how you read them depends on the OS) and see on which cable/port/phy the issue exists.

I've written a tool to decode the LSI log info codes to help troubleshoot such problems. Relevant links:

* http://blog.disksurvey.org/knowledge-base/lsi-loginfo/ -- pre-decoded list
* https://github.com/baruch/lsi_decode_loginfo -- command-line tool that essentially generated that list and can help with unknown codes

LSI never fully documented all the codes openly, so the list I have is incomplete.

Hope this helps, and let me know if you need more help figuring it out.
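As a rough illustration of how the log_info value from the first post splits into the fields the kernel prints, here is a minimal sketch. The bit layout (top nibble bus type, next nibble originator, then an 8-bit code and a 16-bit sub-code) and the originator names are assumptions based on how the mpt2sas driver prints the value; they reproduce the originator(PL), code(0x12), sub_code(0x0436) shown in the log. This is not the linked lsi_decode_loginfo tool, and `decode_sas_loginfo` is a hypothetical name. It does not map codes to meanings; use the pre-decoded list for that.

```python
# Originator names as printed by the driver (assumed mapping).
ORIGINATORS = {0: "IOP", 1: "PL", 2: "IR"}


def decode_sas_loginfo(value: int) -> dict:
    """Split a 32-bit LSI SAS log_info value into its bit fields."""
    originator = (value >> 24) & 0xF
    return {
        "bus_type": (value >> 28) & 0xF,   # top nibble; 3 in the logged value
        "originator": ORIGINATORS.get(originator, hex(originator)),
        "code": (value >> 16) & 0xFF,
        "sub_code": value & 0xFFFF,
    }


if __name__ == "__main__":
    fields = decode_sas_loginfo(0x31120436)
    print(fields["originator"], hex(fields["code"]), hex(fields["sub_code"]))
```

Splitting the value by hand like this is mainly useful for looking up the originator/code/sub-code triple in the list when the driver only logs the raw number.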
snoopy78 Posted September 7, 2015 #4

I don't know which version you mean by LATEST, but I strongly recommend NOT using the P20 firmware: viewtopic.php?f=2&t=4985&p=29300
PnoT Posted September 7, 2015 Author #5

@baruch: Wow, thank you for helping out. I've bookmarked those sites for future use; that's pretty amazing. My fix was to remove the Samsung drives, as there are known issues with them dropping out of RAID sets with LSI cards, and since then I haven't had a single problem. I will swap the cable out, try a different slot, and give the batch of drives another try.

@snoopy78: I should have been more specific and just said I was on P19. I've seen the issues revolving around P20, but thank you for pointing it out.
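When swapping the cable or slot, the SAS counters baruch mentions can be compared before and after the change. The sketch below assumes a Linux kernel that exposes the SAS transport class under /sys/class/sas_phy with the usual per-phy error counter files. Whether the XPEnology/DSM kernel exposes them has not been checked here, and `read_phy_counters` is a hypothetical helper name.

```python
from pathlib import Path

# Per-phy error counter files exposed by the Linux SAS transport class.
COUNTERS = (
    "invalid_dword_count",
    "running_disparity_error_count",
    "loss_of_dword_sync_count",
    "phy_reset_problem_count",
)


def read_phy_counters(base: str = "/sys/class/sas_phy") -> dict:
    """Return {phy_name: {counter_name: value}} for every readable phy.

    Returns an empty dict when the directory does not exist.
    """
    root = Path(base)
    result = {}
    if not root.is_dir():
        return result
    for phy in sorted(root.iterdir()):
        counts = {}
        for name in COUNTERS:
            try:
                counts[name] = int((phy / name).read_text().strip())
            except (OSError, ValueError):
                continue  # counter missing or unreadable on this phy
        if counts:
            result[phy.name] = counts
    return result


if __name__ == "__main__":
    for phy, counts in read_phy_counters().items():
        print(phy, counts)
```

A phy whose counters keep climbing under load would be the first cable or port to suspect; counters that stay flat on every phy would point back at the drives or the driver.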