January 10, 2014 Network Outage Information
Before business hours on Friday morning, January 10th, a routine policy change was made to the campus perimeter firewall. It’s important to stress that CaTS makes this type of change almost daily on the firewalls to accommodate new systems and services coming online. While the change was being applied, the firewall cluster that protects the campus perimeter went down, and access to the internet was lost. This firewall is configured in high availability mode (two synchronized firewall members), which is designed to prevent the type of catastrophic failure that was experienced.
Immediately after the failure, an attempt was made to apply configuration changes to bring the firewall back to a working state, but those changes were not accepted by the firewall. The CaTS firewall team immediately called Check Point support for assistance. The team was on the line with Check Point support, working on the problem remotely, for approximately 14 hours. During this time, the problem was escalated three times to more senior members of the Check Point support team. To restore temporary connectivity to campus, an older firewall appliance was put into service at approximately 2:00 pm, which restored internet access for the majority of campus. Connectivity through the production perimeter firewall was restored last night at approximately 8:30 pm.
At this time, Check Point has no explanation for the underlying cause of this failure. We do know it was not a hardware failure but a problem within the firewall software. We are continuing to work with Check Point support to uncover the cause. To reduce the likelihood of a recurrence, only necessary changes will be made to the firewall until the cause of the problem is sufficiently addressed.