Mon Dec 31 11:51:27 PST 2007

Power Back; Restoring Service

Power was restored at around 9:30 this morning.

I'm in the process of turning machines back on and making sure things are working properly. The Scientific Computing Lab machines (accessible through shell.math.hmc.edu) are up and running, as are the Clinic lab machines. Faculty machines are being restarted as well.

Our mirror server, yum.math.hmc.edu, is also back on line, although its mirrors are probably a bit out of date at the moment. Keep checking; they'll catch up.

Note that the Amber cluster and hex will remain offline for a bit longer. The chillers have not come back on line, which means that our machine room is running very hot. The CS department is also working on a few things in their machine room, so bringing the Amber cluster back up will have to wait a bit.

More later....


Posted by Claire Connelly | Permalink

Sun Dec 30 17:48:45 PST 2007

Power Still Out: Maybe Monday Morning

The latest news is that the power will be out until at least Monday morning.

Stay tuned....


Posted by Claire Connelly | Permalink

Sun Dec 30 13:10:24 PST 2007

Power Still Out...

Apparently the work has taken a bit longer than they expected, and power still hasn't been restored. According to Tom Shaffer, HMC's plant engineer, they're estimating anywhere between two and twelve hours before power is restored and reliable (i.e., won't have to be switched between the grid and generators).

I'm supposed to get a call when things are working again, but it's looking like things won't be back up until sometime tomorrow morning, at best.


Posted by Claire Connelly | Permalink

Fri Dec 28 12:08:46 PST 2007

Power Outage Underway

The scheduled power outage is currently underway. Our mail services (POP, IMAP, SMTP) are still available and working, and we also have shell service accessible to all users at ponder.math.hmc.edu.

With the exception of our core servers, all other systems are shut down until the work is complete and power is restored. The scheduled time for power restoration is 12:00 noon on Sunday.

Note that some machines may not be completely functional with a simple power on, as the power to Olin was cut about ten hours earlier than we were told, around 2:00 PM on Thursday. I scrambled to get machines shut down, but some workstations shut down on their own, and some were interrupted when their UPSs failed. I will check on each machine on Sunday.

We also had some initial problems getting some of the servers to be happy with only one power supply. As part of exploring whether it was possible to turn off an alarm through the BIOS, we rebooted our new file server, which came back up with a new kernel that includes some extremely annoying bugs in its NFS server. We've gone back to an earlier kernel on that machine until a new kernel is released.

Other than the NFS issues, I believe that everything we promised is up and running. If you have problems, please let me know.


Posted by Claire Connelly | Permalink

Fri Dec 14 14:21:22 PST 2007

Winter Break Outage Schedule

There are two major outages planned over the winter break period.

Power Outage, 12:01 AM Friday, December 28 - 12:00 PM Sunday, December 30

The first outage is being imposed on us by Southern California Edison and the Colleges. They will be shutting down the campuses' power grids in order to upgrade the systems that supply power to the Colleges and to install additional equipment that will make it possible for the campuses to expand their usage without intervention by Edison.

Note that the outage starts at midnight on Friday, which is really Thursday night. The outage is scheduled to be over at noon on Sunday; past imposed outages have often required additional downtime to complete, so don't expect things to actually be back to normal at noon.

During this outage, the Colleges will run generators to supply limited power to important services. The math department expects to be able to have mail and web service available over the outage period, with some interactive support via ponder. Other resources, such as individual workstations, Clinic machines, printers, and the Amber cluster and hex, will be shut down for the duration of the outage.

We will also be shutting down our mirror server, yum.math.hmc.edu, during this period.

Intermittent Outages

I will be doing some work over the break that I haven't been able to do during the semester. The key project here is migrating user accounts from our old server to our new server, which has significantly more disk space than the old machine (as well as being faster).

Migrating user accounts will require me to shut down mail service (as mail is delivered directly to user home directories) and interactive usage (as it might otherwise be possible for users to lose work if they make changes to a directory after it has been migrated on the server side, but before the new directories are mounted on workstations). In addition, web service for home directory accounts will be disrupted during the work period.

This work will probably take place in the second week of January 2008 (January 7 - 11). I will post more details here when I know exactly when I will be working on this project.

If you have important work that must be done during this period, please get in touch with me so that we can coordinate. Otherwise, kick back and enjoy your holiday break, and look forward to coming back to new stuff in the spring!


Posted by Claire Connelly | Permalink