Multi-service Interruption - ECE Virtualization Platform – Memory Failure (Apr 25th 2014)

From ECE Information Technology Services
Revision as of 13:09, 28 April 2014 by Mberdan (talk | contribs)

This issue was resolved at ~4pm on April 25, 2014.

On the morning of April 25th, ECE’s server virtualization platform experienced a hardware memory failure that destabilized several dependent virtual servers. Faculty, researchers, students and staff experienced the following issues as a result:

  1. Individuals whose MATLAB installations depend on ECE’s departmental licensing server could not use MATLAB.
  2. ECE’s E-mail system experienced numerous LDAP lookup query failures, despite in-place LDAP load-balancing and redundancy. As a result, the mail system failed to deliver messages to @ece.ubc.ca mail accounts. Messages remained queued but undelivered for up to 3 hours.
  3. The Electric Power & Energy Systems Lab software license server intermittently failed to issue licenses to requesting hosts.

The virtualization platform was restored to working order at ~4pm on April 25, 2014. All dependent virtual machines and their hosted services were recovered, which resolved the issues above.

Thank you for your patience while we addressed this issue. If you have any questions or comments about this outage, don’t hesitate to contact the IT Support Team at help@ece.ubc.ca or in MacLeod 105.