In our data center we use IBM System x3650 M4 servers. About a month ago we updated the firmware and drivers on one of these System x servers. Since then it had been halting on random days, especially under heavy load (for example, while the backup was running), and showing errors on the screen.
The RAID controller details were as follows.
SERVER---
OS name: Windows Server 2008 R2
OS Version: 6.1
OS Architecture: x86_64
Driver Name: megasas2.sys
Driver Version: 6.702.07.00
Application Version: MegaRAID Storage Manager - 13.01.04.00

HARDWARE---
Controller: ServeRAID M5110e(Bus 22,Dev 0)
Status: Optimal
Firmware Package Version: 23.22.0-0024
Firmware Version: 3.340.75-3372
BBU: NO
After some diagnostics, we found that the culprit was driver version "6.702.07.00", as stated on the IBM web site.
After that, we downloaded IBM UpdateXpress System Pack Installer version 9.63 (ibm_utl_uxspi_9.63_winsrvr_32-64.exe) and ran the update for the selected drivers on the live, running system that hosts our Lotus Domino email system. The download and update together took around one hour. Since the reboot, and up to the time of writing this post, the problem seems to be solved. After the update, the controller details were as follows.
SERVER---
OS name: Windows Server 2008 R2
OS Version: 6.1
OS Architecture: x86_64
Driver Name: megasas2.sys
Driver Version: 6.708.09.00
Application Version: MegaRAID Storage Manager - 15.03.01.00

HARDWARE---
Controller: Controller0: ServeRAID M5110e(Bus 22,Dev 0,Domain 0)
Status: Optimal
Firmware Package Version: 23.33.0-0043
Firmware Version: 3.450.145-4983
BBU: NO
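If you have several servers to check, the driver versions listed above can be compared with a small script instead of by eye. The following is only a minimal Python sketch: the function names are made up for illustration, and treating every version older than 6.708.09.00 as needing an update is my assumption based on this one case, not an official IBM rule.

```python
# Driver versions taken from the MegaRAID Storage Manager details in this post.
AFFECTED_DRIVER = "6.702.07.00"    # version that was installed when the halts occurred
KNOWN_GOOD_DRIVER = "6.708.09.00"  # version installed by the update


def parse_version(text):
    """Split a dotted version such as '6.702.07.00' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_update(installed, known_good=KNOWN_GOOD_DRIVER):
    """Return True if the installed driver is older than the known-good version.

    Assumption: any version older than the one that fixed the problem here
    should be updated. Check the vendor advisory before relying on this.
    """
    return parse_version(installed) < parse_version(known_good)


if __name__ == "__main__":
    for version in (AFFECTED_DRIVER, KNOWN_GOOD_DRIVER):
        print(version, "needs update:", needs_update(version))
```

Comparing tuples of integers rather than the raw strings avoids mistakes with leading zeros, such as "07" versus "7".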
Note: I would recommend NOT upgrading the firmware on any critical system unless it is really required.