snoopy78 Posted January 11, 2015 #1 Posted January 11, 2015 Hi all, just a warning for those who, like me, use the LSI 9201-16i HBA. DO NOT flash the latest FW P20, as there is a bug in it (LSI confirmed this AFTER my system crashed). The P19 FW seems to be fine and is recommended by LSI. The issue is random I/O errors reported on your drives, so at some point DSM will throw all volumes out and mark the drives as faulty, which makes a normal recovery via the web GUI impossible. Manual recovery via CLI is needed and can be successful. BR snoopy78
jestosebno Posted January 11, 2015 #2 Posted January 11, 2015 C'mon. Flashed my 9211-8i to P20 yesterday and after the reboot, volume 1 was crashed. 3 disks show as crashed and 2 as not initialized. How can I recover this? And why didn't LSI remove the BIOS if it is faulty?
snoopy78 Posted January 11, 2015 Author #3 Posted January 11, 2015
I didn't do it by myself; luckily one of my colleagues was able to help me ^^
My setup for each volume is max. 4 drives using SHR, so we had to do the following things (one of my disaster scenarios was to have a spare system with an HBA and XPEnology available, which we used, but the original system should be fine too).
My drives (4) were labelled sdg5/sdh5/sdi5/sdj5.
=> stop the LVM
=> add the drives back to the RAID
=> rebuild
=> restart LVM/server
These should be the commands for my system...!! Be advised: know what you are doing or ALL is gone !!
"
mdadm --manage --stop /dev/vg1002/lv
mdadm --examine /dev/sdg5
mdadm --examine /dev/sdi5
mdadm --examine /dev/sdh5
mdadm --examine /dev/sdj5
vgchange -an vg1001
mdadm --stop /dev/md3
mdadm --query --detail /dev/md3
cat /proc/mdstat
mdadm --verbose --create /dev/md3 --chunk=64 --level=5 --raid-devices=4 /dev/sdi5 /dev/sdj5 missing /dev/sdh5
mdadm --manage /dev/md3 --add /dev/sdg5
cat /proc/mdstat
"
THIS is LSI's reply to my issue report:
" There is an issue with P20. We are expecting a fixed version any day now. I recommend you downgrade to P19 until then. You have to erase P20 to downgrade and this can only be done in DOS or UEFI. Doc attached. Data Center Solutions Group Avago 4165 Shackleford Road Norcross, GA 30093 "
BR
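Not part of the original post: the command list above ends with `cat /proc/mdstat` to see whether the array came back and whether a member is still missing. A minimal shell sketch for picking degraded arrays out of that output follows. The function name `degraded_arrays` is made up for illustration, and it assumes the usual member map that the md driver prints, such as `[UU_U]`, where `_` marks a missing member.

```shell
# Read /proc/mdstat-style text on stdin and print the name of every md array
# whose member map (e.g. [UU_U]) contains an underscore, meaning at least one
# member is missing. Hypothetical helper, not a Synology or mdadm tool.
degraded_arrays() {
    awk '
        # Remember the array name from lines like "md3 : active raid5 ..."
        /^md[0-9]+ :/ { name = $1 }
        # The member map appears on the following status line
        /\[[U_]+\]/ {
            if (match($0, /\[[U_]+\]/)) {
                map = substr($0, RSTART, RLENGTH)
                if (index(map, "_") > 0 && name != "") print name
            }
        }
    '
}
```

On a live system it would be called as `degraded_arrays < /proc/mdstat`. An array that is still listed after the `--add` step is expected while the rebuild is running; it only reads the status text and changes nothing.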
Diverge Posted January 11, 2015 #4 Posted January 11, 2015 Thanks for heads up! I updated to P20 on a new system not too long ago. I haven't done much with it but test stuff... but I just downgraded to P19.
jestosebno Posted January 16, 2015 #5 Posted January 16, 2015 Is there any solution provided by LSI, or is a downgrade the only solution?
snoopy78 Posted January 18, 2015 Author #6 Posted January 18, 2015 As long as they don't provide the new version, a downgrade to P19 seems to be the only solution. For me, since I went back to P19 the system is working fine and without issues.
NeoID Posted September 10, 2015 #7 Posted September 10, 2015 How did you guys downgrade? I've tried to follow this guide, but I only get "Cannot downgrade NVDATA version 14.01.00.06" and "Failed to get valid NVDATA image from file": https://amussey.github.io/2015/02/19/up ... mware.html Also, it hangs on "Reconnecting the efi driver..."
snoopy78 Posted September 10, 2015 Author #8 Posted September 10, 2015 When you download the firmware (e.g. the DOS version), there is a manual included, e.g. for the 9201-16i: http://docs.avagotech.com/docs/12350436 All required commands are in there too. You just need to delete the firmware before installing the new one, as Avago/LSI told me: " You have to erase P20 to downgrade and this can only be done in DOS or UEFI. Doc attached. " http://sc836.lindem.de/update.pdf br snoopy78
NeoID Posted September 10, 2015 #9 Posted September 10, 2015 In DOS I get the following error when trying to launch SASFLSH.exe: ERROR: Failed to initialize PAL. Exiting Program ...and the EFI shell hangs when trying to --listall... I guess I'll have to try DOS on a non-UEFI PC. Edit: Flashed the HBA from DOS on a non-UEFI PC and it worked right out of the box. Hopefully the change in FW doesn't screw up the volume. Thanks for the tip on downgrading, highly appreciated! I'll try to stress test the HBA later this weekend to see if this actually fixes the I/O issues...
NeoID Posted September 11, 2015 #10 Posted September 11, 2015 I can confirm that this is not related to the firmware version. Even on P19 I get the I/O error when doing a data scrub!
snoopy78 Posted September 11, 2015 Author #11 Posted September 11, 2015 Then it's most likely a faulty drive...
NeoID Posted September 11, 2015 #12 Posted September 11, 2015 Nope, all drives are new and tested. The only thing it might be is a bad cable... even though I've now replaced all of them too without solving the issue... No idea, but as this only happens on the initiation of a data scrub or other really high I/O activities, I guess it's nothing to really worry about for now. Will look into it in the days to come.
dose Posted September 14, 2015 #13 Posted September 14, 2015 This is a heat-related issue. It has nothing to do with the firmware revision, except that perhaps P20 is more sensitive to high-heat conditions. At the same time, it could just be that P20 'runs' that much hotter. In any case, install a fan onto or over the top of your HBA's PU.
NeoID Posted September 14, 2015 #14 Posted September 14, 2015 I doubt that. It also happens when the card is cold.
dose Posted September 14, 2015 #15 Posted September 14, 2015 These cards alone, are only ever cold while they're powered off. So it's quite obvious you've made no real attempt to cool it/them.
NeoID Posted September 15, 2015 #16 Posted September 15, 2015 That's my point exactly. Even when my server has been powered off, I can trigger this issue within just a few minutes after the system has booted. The server and card never get hot in my testing.
dilemmaprison Posted September 27, 2015 #17 Posted September 27, 2015 Might want to check your P20 subversion. The original release (20.00.00.00) had I/O bugs according to LSI. There have been more releases since: 20.00.02.00 and 20.20.04.00. That fixes it according to folks who had issues with this on FreeNAS.
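Not part of the original post: a small shell sketch of the subversion check suggested above. It takes a firmware version string that you have read from the flash utility or the card's boot banner and sorts it into the buckets mentioned in this thread. The function name `classify_fw` and the bucket labels are made up for illustration, and the only facts it encodes are the ones reported here (20.00.00.00 reported buggy, later P20 builds reported fixed).

```shell
# Classify an LSI/Avago firmware version string of the form AA.BB.CC.DD.
# Hypothetical helper; the labels reflect reports in this thread only.
classify_fw() {
    version=$1
    phase=${version%%.*}    # text before the first dot, e.g. "20"
    case $phase in
        # Empty or non-numeric phase: not a version string we understand
        ''|*[!0-9]*) echo "unknown"; return 1 ;;
    esac
    if [ "$version" = "20.00.00.00" ]; then
        echo "original P20 release (reported I/O bugs)"
    elif [ "$phase" -lt 20 ]; then
        echo "pre-P20"
    elif [ "$phase" -eq 20 ]; then
        echo "later P20 release"
    else
        echo "newer than P20"
    fi
}
```

For example, `classify_fw 20.00.02.00` prints `later P20 release`. It does not talk to the card; it only compares the string you pass in.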