Difference between revisions of "Kaiser Server Room Network Failure (Mar 2013)"

From ECE Information Technology Services
m (Service restored 2013-02-22)
 
Line 12:
  === Update 22 March 2013 ===
- The same network switch failed again 22 March 19:33.  We are investigating the possibility of replacing the switch.
+ The same network switch failed again 22 March 19:33.  We have confirmed that it is due to failing hardware, and have replaced it with another unit.  Service was restored just before midnight.

Latest revision as of 01:48, 23 March 2013

At 2 am on 21 March 2013, a network switch in the Kaiser server room spontaneously failed. Service was restored at 4:15 am by power-cycling the switch.

As a result, the following services were unavailable during the outage:

  • Authentication to the UBC_ECE domain
  • The ability to change account passwords
  • ssh-linux5, ssh-linux6, ssh-linux7
  • Electronic Software Distribution
  • Graduate Application Data Store
  • Several software license servers
  • CMC CAD tools
  • Several research groups' servers hosted in the Kaiser server room

Update 22 March 2013

The same network switch failed again on 22 March at 19:33. We have confirmed that the failure was due to failing hardware and have replaced the switch with another unit. Service was restored just before midnight.