I have pulled out the CPU expansion
board from hex and sent it
back to the vendor, who is going to try
to get a replacement from the
manufacturer. In the meantime,
hex is running with eight
cores (four CPUs) and 16 GB of RAM, and
it seems to be stable.
Please let me know about any problems, and check back here for updates.
I will be doing some systems work this weekend, June 7--8.
Work will probably begin around 11:00 AM on Saturday, June 7, and will continue for several hours. If necessary, additional work may be done on Sunday, June 8, within a similar block of time.
The work will disrupt most of our networked services, including e-mail, file service, interactive sessions, and the web server for periods of several minutes to an hour over the course of the work.
I also want to make sure that all of our Macs are running the latest security updates, so I will be updating these machines during this time period as well.
If you're using a Mac or Linux system that mounts file systems from our servers, before you leave on Friday evening,
This work is necessary for us to ensure the security and improve the stability of the overall system. In particular, I am hoping that ongoing issues with our web server will be resolved as a result of this work.
I will do my best to keep as much of the system functional as possible for as much of the time as I can, but there will still be some outages.
Last semester we had some serious issues with interactions between the NFS support on our new file server and on our workstations and older servers, exacerbated by the HVAC failure. I was able to stabilize things, but we still see some flaky behavior (especially from the web server, which needs to be rebooted periodically).
On the Linux server side, I plan to update to the latest kernel releases and do some experimentation to see if everything will work together happily. I will need to reboot various servers and workstations an arbitrary number of times to explore all the possible interactions.
For Macs, I will install the latest updates, most of which require the machines to be rebooted. As Tiger (Mac OS X 10.4) has problems when an NFS server disappears and reappears, these machines would need to be rebooted anyway.
As usual, if there are problems with the scheduling of this work, requests or any other comments, please let me know.
As usual, updates on the status of the systems and progress reports will be posted to the ``sysblog'', on our web server at
Thanks for your cooperation!
I ended up doing some fairly
significant work in the machine room
Thursday afternoon and evening, which
involved rewiring the entire rack. In
order to be sure that some of the
systems were working properly, I
rebooted several of the machines in the
rack, including the department's main
file server (gytha) and
our parallel compute server
hex. As a result, some
workstations -- especially Mac OS X
machines -- may be confused about their
NFS mounts. If you have problems
logging in or if you can log in but you
can't access your home directory or
applications or other materials stored
in /shared/local, please
reboot the machine and try again.
I'm about to go to bed, but I will be reachable at home or by cell tomorrow if there are any unforeseen issues.
As anticipated, the system ran
fsck to check the disks on
most of the home-directory partitions,
taking on the order of half an hour to
complete.
The partitions came up clean and the system rebooted. The tape drive is working again, and I am flushing the previous day's backups to tape.
We now resume normal service. However,
if you come across any problems, please
let us know by sending e-mail to
system@math.hmc.edu.
The department's main server will be rebooted Saturday afternoon to clear a stuck IO process.
Services affected will include
Length of Outage: Approximately one hour.
The SCSI driver for our tape drive is stuck in a low-level IO loop that can't be interrupted. As we can't use the tape drive until this process is cleared, and the only way to clear it is by rebooting, we need to reboot the server.
The minimum time for a reboot of this system is around ten minutes based on various hardware tests and initializations. The actual reboot will probably take longer, especially if the system needs to run checks on disk partitions, in which case the reboot time could extend to around forty minutes or so. Rebooting can also reveal unforeseen consequences of some configuration changes, which can add additional delays before all services are available.
I will send messages to all logged-in users about ten minutes before I start the reboot. If you happen to log in shortly after the reboot, don't expect that the system will remain up unless the department's system blog has been updated with a message stating that the system is back up and maintenance is complete.
As usual, we apologize for any inconvenience that this downtime will impose, but occasional maintenance is required to keep the system running.
I have completed the repairs to our primary server, and everything should be working as usual. If not, please let me know!
We have a hardware problem with our
main server, esme, which
requires me to take the machine offline
in order to replace some parts.
I will shut the server down at 2:00 PM tomorrow, Saturday, January 20. The work will either take about twenty minutes or will require much more extensive part swapping, which could take several hours. Please check this blog (which will remain available) for updates and notification about everything being back on line.
Because the problem is with our primary server, e-mail, logins, and printing will not be available during the outage. Home directories will also not be available, so class websites hosted out of professors' home directories (which is most of them) will also not be available during the outage. Our web server is a separate machine and will remain available, with all content not kept in home directories.
Sorry for the inconvenience and short notice; I've only just received the necessary parts.
As announced in November 2005, support for the use of PHP, a popular, but problematic, web-programming language, has now ceased.
Any pages that relied on the Apache PHP module being available will no longer render properly.
If the lack of PHP poses a problem for you, please let me know and we can look into alternatives that pose less of a security risk for our system.
I've updated the version of Firefox
in /shared/local to 1.5,
which is the latest release.
You can run Firefox by typing
firefox at a terminal
prompt or by creating a GNOME Panel
launcher by right-clicking on a panel,
choosing Add to Panel, then choosing
Custom Application Launcher and filling
in the fields in the dialog box that
will appear.
You can find a Firefox icon in
/shared/local/firefox/icons.
The canonical path to the application
until such time as it is installed by
default on individual machines is
/shared/local/firefox/firefox.
For most people (unless you've tinkered
with your PATH), just
putting firefox in the
Command field will do the trick.
Among other improvements, Firefox 1.5 supports RSS, Atom, and other feed protocols in a much more convenient way than previous versions of Firefox did.
Despite the dramatic-sounding title of this entry, I expect that there will be little or no actual change in the way that the system works for about 99% of the affected users.
As another short-term way of dealing
with the ongoing disk space crisis on
/home/faculty, I have
migrated the emeriti and
former-faculty accounts that had their
home directories in
/home/faculty to a new
partition on the server.
Practically speaking, there should be no real impact from this change for anyone, even the people whose accounts were moved, as I have added links to preserve the appearance of the file system.
If your account has been moved (you can
tell by logging in and running
pwd, which will tell you
your present working directory) and you
notice some issues, or if you try to
reach a personal web resource (i.e.,
one that has a URL
similar to
http://www.math.hmc.edu/~someaccount)
that is no longer available, please
report the problem to me so I can track
it down and fix it.
If you're interested in the details of what was done, most of it is pretty visible, kind of like post-surgical scars.
/home/guests is now a link
farm, with symbolic links pointing to
actual directories that are located in
/home/guests-one or
/home/guests-two. The
account database has been set up so
that home directories for migrated
accounts are in
/home/guests (that is,
they point to the links that point to
the real directories).
Because of the limitations of
NFS, we
now have to export
/home/guests,
/home/guests-one, and
/home/guests-two, and
mount all three of those shares on each
machine that is available for general
use.
The original directories in
/home/faculty have been
replaced with links that point to the
directories in
/home/guests, so any
web-related links will still work.
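For the curious, the layout described above can be reproduced in a scratch directory. This is only a sketch under assumptions: the account name someaccount and the temporary root are made up for illustration, and it leaves out the account-database and NFS-export parts of the real change.

```shell
#!/bin/sh
# Sketch of the link-farm layout in a throwaway directory (not the real /home).
# "someaccount" is a hypothetical account name used only for illustration.
set -eu
root=$(mktemp -d)
trap 'rm -rf "$root"' EXIT   # clean up the scratch directory when the script ends

mkdir -p "$root/home/faculty" "$root/home/guests" "$root/home/guests-one"

# The real directory lives on the new partition.
mkdir "$root/home/guests-one/someaccount"
echo "hello" > "$root/home/guests-one/someaccount/notes.txt"

# /home/guests is the link farm: one symbolic link per migrated account.
ln -s ../guests-one/someaccount "$root/home/guests/someaccount"

# The old location in /home/faculty becomes a link that points at the link.
ln -s ../guests/someaccount "$root/home/faculty/someaccount"

# Both the old path and the new path reach the same file.
old_path_contents=$(cat "$root/home/faculty/someaccount/notes.txt")
new_path_contents=$(cat "$root/home/guests/someaccount/notes.txt")
echo "$old_path_contents $new_path_contents"
```

The links are relative, so the whole tree behaves the same wherever it is mounted, which is one reason relative targets are often preferred for this kind of farm.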
Because of the links, everything should work as it always has. At some point down the road, however, I hope to be able to add some additional disk space, which will allow me to do some rejuggling of account locations. At that point I will probably try to clean up some of the remaining links to make everything neat and less complex.
With the removal of the links, scripts or other materials that refer to hard-coded, complete paths to your home directory or directories within your home directory may break. In other words, if you had a script that looked for files in your home directory and specified them as
/home/faculty/username/some/directory/or/file
(where username
is your username), but your physical
home directory is now located in
/home/guests-two/username,
and is referred to by the system as
/home/guests/username,
you will have problems when one or more
of the links is removed or changed.
If you're working with shell scripts,
the best way to refer to your home
directory is with the environment
variable $HOME, which is
pretty much guaranteed to resolve to
the correct answer no matter what shell
you or your script use. For many modern
shells (and scripts written in those
shells' languages), you can use the
tilde (~) to refer to your
home directory, but $HOME
is safer and more likely to work no
matter what. (You'll want to use
~ on the command line, of
course.)
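As a minimal sketch of the difference, here is the same lookup written both ways. The some/directory part is the placeholder path from the example above, not a real location.

```shell
#!/bin/sh
# Fragile: breaks if the home directory moves or one of the links is removed.
#   data_dir=/home/faculty/username/some/directory

# Robust: let the system say where the home directory is.
data_dir="$HOME/some/directory"
echo "Looking for files in $data_dir"
```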
This mess will be cleaned up after we've obtained more disk space, which is on the agenda for a departmental computing-committee meeting on Friday. I hope that we will be able to find the money to move quickly on that project, and that I will be able to put additional disk space online over spring break (March 10 - 19).
In the meantime, keeping an eye on your disk usage and avoiding excessive disk usage (which I would define as usage that's significantly more than others with home directories in your partition) is, and will always be, a good thing to do that will benefit everyone else all the time, and you when you have a sudden, short-term need for a larger amount of disk space.
Thanks for your cooperation.
Version 1.5 of Mozilla Thunderbird, the Mozilla Foundation's e-mail client, was released today.
I have installed it in
/shared/local/thunderbird,
where it takes the place of the
previous release (which was 1.0.7). The
old release will still be available in
/shared/local/thunderbird-1.0.7
for at least a couple of weeks.
Please enjoy the new release, and let me know about any problems that you have with it.
The testing is complete. I have restarted all workstations and printers, so everything should be back to normal.
We did suffer one casualty, a faculty workstation whose power supply died. It's not clear that the problem was caused by the testing, but I have called the vendor for a replacement. (I have notified the machine's primary user by e-mail, so if you didn't get e-mail from me about your machine dying, your machine should be working -- please let me know if it isn't!)
Power-system and generator testing by the Claremont Colleges Physical Plant staff will affect all non-UPS-backed workstations on the mathematics department network. Because we cannot be certain about the duration of the outages, we will also be shutting down nonessential workstations.
The following systems will be unavailable:
shell.math.hmc.edu alias)
We expect that the servers and other
equipment in our machine room (backed
by UPSs and local generators) will
continue to operate during the outage
period.
ponder.math.hmc.edu should
be available for checking e-mail and
other simple usage during the outage.
The testing is scheduled for completion at 4:00 PM. If all goes well, the affected systems should be back soon after that time. If there are problems, systems will either be back by 6:00 PM or will not be running until sometime tomorrow, Saturday, 2005 December 31.
Please check back here,
http://www.math.hmc.edu/computing/blog/,
for updates on system status.
The Claremont Colleges have scheduled some power-related work during the holiday break. The outages on December 27 and 29 do not affect our systems, but the outage on December 30 affects the entire campus.
These outages are to replace the last of the "G&W boxes" that were responsible for the (unscheduled) power outage earlier this semester.
I'm checking with Theresa Potter to verify whether the outages will affect our servers. They will definitely affect our workstations, however.
At a minimum, power will be interrupted for significant periods of time between 12:00 PM (noon) and 4:00 PM (or later -- Theresa's message indicates that the end time is approximate). Because workstations in faculty offices, the Clinic lab, and the scientific-computing lab have short-run or no UPSs, they will not be available during this time period.
During this time, no workstations will
be available, including individual
office and lab machines and the
shell.math.hmc.edu alias.
If machine-room power will be interrupted, mail, file, print, and web services will also be interrupted.
If machine-room power is maintained,
people using POP or IMAP will be able
to access their mail. Web service
should also continue to be available,
as should
ponder.math.hmc.edu.
I will shut down any workstations (including faculty, Clinic, and scientific-computing lab machines) that are running on Friday morning (around 10:00 AM).
If it turns out that the outage will affect the servers, I will arrange to monitor them in person or remotely and shut some or all of them down if that seems to be required.
If I'm not already on campus to deal with things when the outage ends, I will come by in the evening to check on the situation and (possibly) restart machines. Given the previous record on electrical work requiring power outages, I am not expecting it to end at the scheduled time. If power is not restored until sometime after early evening on Friday, I will come in and try to restart machines on Saturday.
Check back here for updates and notice of restored service.
I have updated the version of Maple
installed on our server to 10.02. It's
set as the default, so just typing
maple or
xmaple should launch the
latest version.
If you have problems, (1) please
tell me, and then (2) run the
previous version by specifying the full
path to the maple or
xmaple executables, as in
/shared/local/maple10.01/bin/{maple|xmaple}.
I noticed that there was support for the AMD 64/x86_64 64-bit processors in the update, but found that I didn't have the original installation media for the 64-bit version of Maple. I got a copy from CIS, so we now have both 32-bit and 64-bit versions of Maple available for your use (assuming you're using one of the 64-bit workstations the department has, of course).
To be honest, I'm not sure what having the 64-bit version buys you, as Maple is a symbolic math application rather than a major number cruncher, but 64 bits must be cooler than 32 bits, right?
I came in today and rebooted our main
server, esme. As I had
expected, the home partitions needed
checking. Once that process had
finished, however, the machine came
back up and was running just fine.
I was able to move the new tape library
onto esme and verify that
it works. Very cool.
While I was working with the machine, I took the opportunity to update various firmware packages (BIOS, SCSI RAID, etc.). As far as I can tell, those updates worked fine, too.
I rebooted the scientific-computing laboratory machines. Faculty and Clinic workstations should probably also be rebooted; I will look at rebooting the Clinic machines over the next couple of days. Faculty should reboot their machines sometime next week (ideally when I'm in the office, just in case there are any issues).
Thanks for everyone's patience and
cooperation. As usual, if you notice
any problems with the systems, please
send mail to
system@math.hmc.edu
describing the problems you're having.
I will be rebooting the department's server systems over the Thanksgiving holiday. Exactly when, I'm not sure, but our main server, which provides file, print, mail, and some other services, hasn't been rebooted in over 200 days. As there's been a major update (from CentOS 3.5 to CentOS 3.6) during that period, we're more than due for a reboot.
If you were planning on running processes over the Thanksgiving holiday, please contact me immediately. As of right now, no one has spoken to me about any such processes (which, you'll recall, is a requirement of the department's long-jobs policy), so I'm assuming it's safe for me to reboot the systems whenever it's most convenient for me to do so.
This summer, we learned about a matching grant program run by IBM. Mudders who had gone on to work at IBM had donated money to be given to Mudd, with IBM matching those funds. Altogether, the donation was around $40,000.
After the department chairs hashed out who would get how much of the pool of funds, the mathematics department opted for a 3581 Tape Autoloader, a device that contains a single Ultrium LTO 3 drive and a robotic tape carousel that can hold eight Ultrium 3 tapes.
Each Ultrium 3 tape can hold 400 GB of data uncompressed, or up to 800 GB of data if compression is used. Our current backup system, which uses DLT IV tape, can hold 40 GB uncompressed, 80 GB compressed per tape, so the new system represents a tenfold increase in capacity.
The eight-cartridge carousel also means less tape changing -- the system can be set up to cycle through the tapes in order or to select particular tapes based on the ``slot'' in which they're loaded.
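The capacity figures above work out as follows. This is just the arithmetic from the numbers quoted in this entry; the variable names are mine.

```shell
#!/bin/sh
# Figures quoted above, in GB per tape.
lto3_native_gb=400
lto3_compressed_gb=800
dlt4_native_gb=40
slots=8   # tapes in the carousel

per_tape_ratio=$((lto3_native_gb / dlt4_native_gb))
carousel_native_gb=$((slots * lto3_native_gb))
carousel_compressed_gb=$((slots * lto3_compressed_gb))

echo "One new tape holds ${per_tape_ratio}x what one old tape holds"
echo "Full carousel: ${carousel_native_gb} GB uncompressed, up to ${carousel_compressed_gb} GB compressed"
```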
I'm currently in the process of testing the new tape system. It's installed in our rack, but actually getting the servers to talk to it and make it do what we want is going to require a bit of fiddling. I hope to have it online by next semester.
This new tape library clears the way for increasing our disk-space capacity. Now that we can back up larger amounts of data, we can start working toward obtaining additional disk space, knowing that we will be able to protect that data.
This IBM 3581 Tape Autoloader, with rack-mount kit and a SCSI cable, sells for $9,293. We would like to say ``Thank you, IBM,'' and especially to thank the Mudders now working there for contributing to this fund.
I am in the process of converting the only system pages that make use of PHP to a form that does not use PHP. Once that conversion is complete (probably by the end of the day on Monday, 2005 November 14), I will be removing all support for the use of PHP on the department's web server(s).
PHP is a server-side programming language that allows developers to write web pages with computer code embedded in them. It is widely used in the hobbyist market for writing web log, bulletin board, and forum-type applications. Unfortunately, PHP appears to be insecure by design, as numerous security holes continue to be found in the core PHP Apache module even though the system is about ten years old and has undergone several major rewrites and reimplementations.
Note that I am not speaking of insecure code written in PHP -- such buggy code is trivial to produce in any language. But we are still seeing numerous flaws in the Apache module that implements the core language itself. Such flaws can open up the entire server to attack, and the risks are greater than the benefits.
As I've mentioned before, work is underway to replace the "brains" of the air conditioning system in the department's machine room and hook it into the general HVAC monitoring system.
Among the other things we keep in that room is a magic button that cuts the power for the servers in that room, a relic of the days when water-cooled mainframe computers might need to be shut down all at once to prevent electrocution.
These days, of course, our systems are air cooled. And they're all on UPS power, which means that hitting the panic button just switches them onto battery power. But the button is still there, waiting to be pressed....
Which is what happened this morning. The Physical Plant folks working on the air conditioning accidentally triggered the power cutoff. To make matters worse (and more confusing), the cutoff doesn't just affect the power in our machine room, but also the power in the scientific-computing lab, the publications room, and, I believe, at least one or two of the biology labs nearby.
Our servers, with their UPSs, were fine. But any jobs that were running on the scientific-computing lab machines were stopped when the power went out and the machines crashed. The machines rebooted, as they were set to, but they didn't restart your jobs -- you'll have to restart them yourselves.
Before you do that, however, I encourage you to review the department's policy on long jobs. Let me summarize for you: You're not supposed to leave processes running when you're not sitting in front of a machine unless you check with me first. The lab machines are meant for use by people sitting in front of them first, with people logging in remotely to run interactive jobs next. Long, unattended jobs should be run so that they don't dominate the processing power of the machine when someone is trying to do things at the console.
That means that you should run your job with the
nice command, as in
nice -n 19 your_process_name
Ideally, you should also write your code so that it periodically writes out its status and results, and can resume by reading in that information and starting from where it left off. Writing such code is a bit more difficult, but it might save you from having to redo hours of computations when the power fails, someone reboots the machine because it's running too slowly, or other unforeseen events stop your job from running.
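Here is a rough sketch of that checkpoint-and-resume idea. Everything in it is a stand-in: the "work" is just summing integers, and the state file is a temporary file, where a real job would use a fixed path that survives a reboot.

```shell
#!/bin/sh
# Checkpoint/resume sketch. The state file and the "work" (summing integers)
# are placeholders for a real computation.
set -eu
state_file=$(mktemp)   # a real job would use a fixed path, e.g. under $HOME
total_steps=10

run_job() {
    # Resume from the saved state if there is any; otherwise start fresh.
    if [ -s "$state_file" ]; then
        read -r step sum < "$state_file"
    else
        step=0
        sum=0
    fi
    stop_at=$1
    while [ "$step" -lt "$stop_at" ]; do
        step=$((step + 1))
        sum=$((sum + step))
        # Write to a temporary name, then rename, so a crash in the middle
        # of a write cannot leave a half-written checkpoint behind.
        echo "$step $sum" > "$state_file.tmp"
        mv "$state_file.tmp" "$state_file"
    done
}

run_job 4                # simulate a run that is interrupted after step 4
run_job "$total_steps"   # a later run picks up again at step 5
echo "finished at step $step with sum $sum"
```

A script like this would still be started under nice, as shown above, so that it stays out of the way of anyone working at the console.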
If you're still looking for reasons to tell me about your long jobs, let me point out that I routinely update packages on the lab machines for security issues, and some of those updates require a reboot to take effect. If I don't know your job is running, I might reboot the machine without checking with you first. Letting me know means that I can maintain a list of machines to avoid rebooting without notice.
Please remember that the lab machines are a shared resource, and sharing requires that everyone using them behave responsibly and respect the other users.
It turns out that there are some significant issues with the air-conditioning unit in the mathematics department machine room. They're being looked into by F&M and the CUC Physical Plant HVAC people.
There should be no disruption of services, but should it become necessary for something major to be done, I will let people know as far in advance as I can. I would also hope that we could arrange for any significant disruption to occur on a weekend or over a break period.
Thanks for your patience during this work.
The Amber cluster has been successfully moved into its new home. Tim and I will probably be doing some additional shuffling around over the next few weeks or months, but we should be able to either make those disruptions short enough to be unnoticeable or announce the disruptions in advance.
There may still be some issues that
users might notice that I'm not seeing;
if you have any issues, please report
them to
system@math.hmc.edu.
Thanks for your patience and cooperation!
The mathematics and computer-science departments' Beowulf cluster, Amber, is going to be moving from the mathematics department's machine room to the much more commodious CS machine room.
We will be moving the cluster sometime tomorrow, Wednesday, 2005 November 2.
If all goes perfectly, the cluster move will be simple and quick. If things get a bit more complicated, we will have to disassemble and reassemble the cluster, which means disconnecting sixteen computers (power & Ethernet), moving them in groups of three or four, then reconnecting everything in the new location, which will require at least an hour, maybe longer.
To make the process as easy as possible, we're asking that anyone who is actively using the cluster stop their work by 10:00 AM on Wednesday. We will post here when the cluster is back up.
(People who are authorized to use the Amber cluster have already received e-mail messages at their math addresses with this information, and will also receive a message when the cluster is running again.)
The Amber cluster has sixteen Dell PowerEdge 400SC nodes, each with a 2.8 GHz Pentium 4 processor and 1 or 1.25 GiB of RAM. The nodes communicate over a gigabit Ethernet switch. The cluster is running CentOS 3 with various additional cluster-related software packages (notably LAM/MPI). Use of the cluster is limited to faculty, students, and staff of the colleges who are doing computationally intensive research, especially research that requires or can take advantage of parallel-computing techniques.
Amber cluster nodes were purchased with funds from several CS faculty members. Systems integration and support is provided by the mathematics department.
In the process of cleaning up the path
for most users, I inadvertently left
most people using the java
and similar scripts installed by the
libgcj package. As those
scripts don't actually do anything,
that situation wasn't ideal. ;-)
I've added some code to the
global.tcshrc and
global.cshrc files, so you
should now have
/shared/local/java/bin
added to your path if you're using the
tcsh. (If you're not sure
what I'm talking about here, then you
are using the tcsh and
shouldn't have to do anything.) If
you're using another shell, you're
already having to massage your path;
you're welcome to take a look at the
global.tcshrc file (which
is in
~setup/global.tcshrc) to
see how my code works.
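For people on Bourne-style shells (sh, bash, zsh), something along these lines should have a similar effect. It is a sketch, not a copy of what is in global.tcshrc, and putting the directory at the front of the path is my assumption about what is needed for it to be found ahead of the libgcj scripts.

```shell
# Sketch for a Bourne-style startup file; not a copy of the tcsh code.
java_bin=/shared/local/java/bin
case ":$PATH:" in
    *":$java_bin:"*) ;;            # already on the path; do nothing
    *) PATH="$java_bin:$PATH" ;;   # assumption: put it first so it is found before the libgcj scripts
esac
export PATH
```

The case test keeps the directory from being added a second time if the file is read more than once.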
The longer term solution, I think, is
to figure out how to add the
/shared/local/java
binaries to the alternatives system
such that they're used in preference to
the libgcj scripts, but
have a lower priority than (and are replaced
by) the binaries from a locally
installed Java package.
Late last week F&M was able to take a look at our machine-room air conditioning. It turned out that there was a loose wire in the thermostat that was periodically breaking contact and resetting the system. (At a guess, it's possible that as the room cooled down, the wire contracted and broke contact. Once the room warmed up again, the wire expanded and the system worked again.)
Whatever the exact details were, the air conditioning is now running again, and I have restarted the Amber cluster. Please let me know if you have any issues with the cluster.
In related news, I have swapped out the thermally compromised drive from our backup array with a new drive I'd purchased for that purpose a few months ago. The array is now working as expected, as is our disk-based backup system.
The air conditioning unit for our machine room is continuing to have problems. I have entered the room twice and found the controller flashing OFF. The buttons on the controller don't seem to work, and I have to power off the whole system before the controller responds again and the air conditioner runs.
I have reported the problem to F&M, but until they can fix it, I will have to keep the Amber cluster offline.
Sorry for any inconvenience.
Yesterday's power outage was fun for all, but it's had some negative effects on our computing services.
Our servers are all supplied with power through Uninterruptible Power Supplies -- big batteries, basically. The most important servers (the ones that provide file, print, e-mail, and web service) actually have two power supplies, each of which has a UPS that is, in turn, connected to a different power source. One of those sources has a local generator, so when the power went out, the servers first went on UPS, then were able to run off the generator power.
Unfortunately, our air conditioning unit for our server room, while separate from the Libra complex's air conditioning, does rely on actual power. And all the air conditioning systems share a set of chillers and other support equipment. That equipment is the source of the current air conditioning outage, and that is affecting our machine room.
The primary effect you may have noticed is that our Amber Beowulf cluster is offline. Sixteen machines generate a lot of heat, and therefore the Amber cluster will be offline until air conditioning is restored.
A secondary effect is that our disk-based backup system is suffering a thermal-related issue with one of the drives in the array. I first noticed the problem just before the power outage -- the air conditioning in the machine room wasn't working, and the machine had gotten hot enough for the drive to seize up. I was able to cool the system down and get the array rebuilding, but then we had the power failure followed by the air conditioning failure, so our disk-based backups will be offline until such time as the temperature drops, as well.
Note that we also do regular tape backups, so we do still have backups; it's just that they are slightly less current and much less convenient. Please try not to delete anything important until we have A/C back!
Faculty and staff (mostly) have small UPS units for their desktop machines. Those UPSs are meant to smooth the transition between line power and generator power during power outages and not to allow you to continue working through a significant power outage. These UPS units will not keep a typical desktop system running for more than about 5-15 minutes at the longest.
Because yesterday's power outage occurred on campus, the disruption prevented both line and generator power from being distributed. Thus your UPSs may have been run down to the point that your machines shut down (or crashed when the power stopped).
You should save your work as soon as a power outage occurs. If the power isn't back after about three minutes, you should log out and shut your machine down manually.
Once the power is back, the UPS batteries will begin charging again. It should be safe to work with your machine once the power is back, but be aware that if additional power outages occur while the battery is charging, your machine will have less runtime than it did when the battery was fully charged.
I have updated our network install of Maple to 10.01, as mentioned in a previous entry.
I also have updates for standalone copies, so if you have one and you haven't updated by choosing the ``Check for Updates...'' option from the Tools menu, you can download the updates and apply them manually using the links in the previous entry.
Just in time for the new semester, both MathWorks, makers of MATLAB, and MapleSoft, makers of Maple, came out with new updates for their products.
Details about Service Pack 3 are
available. The network install has been
updated; if you type
matlab to start MATLAB on
a math Linux system, you'll get the
service pack 3 version.
I haven't yet figured out what the best way of distributing updates to locally installed copies of MATLAB is, although it sounds like we're going to basically need to reinstall the app on each machine.
See our MATLAB support page for information and ways to run different versions (including the classroom or research licenses or older versions).
As of this writing, Maplesoft only has
updates for the single-user version of
Maple. If you have Maple 10 installed
on your system and you would like to
update it yourself, you can
download the updates from our site
(this link will only work for machines
with an hmc.edu address)
or
get the updates direct from
Maplesoft.
We will be updating our network install of Maple as soon as the media are available. Check back here for updates.
Our Maple support page may have additional information you might find interesting or useful.
Bringing us along into 2005, I have installed Firefox, the official Mozilla standalone web browser; Thunderbird, the Mozilla project's standalone mail client; and Nvu, the Linspire web-page editor, which happens to be based on Mozilla code.
All of these programs are installed in
the /shared/local
partition, and should be usable from
any math department Linux system by
simply typing firefox,
thunderbird, or
nvu, respectively.
If you want to add an icon to your
GNOME Panel or KDE Kicker, please do
so. Icons are hiding in sneakily named
icons directories inside
the installation directories in
/shared/local.
I would strongly encourage you to
consider using Thunderbird with our
IMAP
server, imap.math.hmc.edu,
which will allow you to read mail with
Thunderbird while you're at a machine
in your office or one of the labs, but
also read mail from a text-based mail
client if you're so inclined, and read
mail using an IMAP mail client from
home.
You should be able to print to
wuffles now using math
department Linux systems, Macintoshes
with Mac OS X, and Windows machines.
Please see
my earlier message for details on
the name and IP address of the printer.
Drivers are available from
our wuffles page.
Note for Linuxy types trying to set
things up on their own: I haven't been
able to get the copier to behave using
straight CUPS and the PPD file; it
seems to work just fine when I install
the BrightQ drivers available from
canon.codehost.com.
I still want to get the thing working
without the additional software, but in
light of the looming start of the
semester, I'm tabling it 'til later.
I have added support information, including drivers and some of the secondary applications (e.g., scanner-interface software) for the new Canon imageRunner 8070 copier to the department's computing website.
The copier will be called
wuffles, at least on the
math network, and has the IP address
134.173.34.138 for those of you playing
from home.
wuffles will, we hope, be
up and running on Friday.
Enjoy!
You may have heard (or seen, as it's in
the hallway) that we're getting a new
networked copier. The new machine is a
Canon imageRunner 8070, and will be
replacing our existing imageRunner
5000, fluffy.
The new copier has several major improvements over the old model, including
From the manual, it sounds like it could potentially do a whole slew of additional things, some of which might actually be useful. However, we apparently haven't actually paid to turn any of that additional functionality on. We won't know for absolute certain what we have and what we don't until we can plug the thing in and get it running.
The downside of the newer, faster,
stronger model is that it uses more
electricity. As a result, we will need
to have the electrical socket rewired.
As fluffy also uses more
juice than your average household
incinerator, and there isn't room for
two crazy sockets in the same box, we
will have to take the old copier
offline, then have the electrical work
done, then have Canon come out and
assemble and configure the new copier.
That pretty much guarantees a downtime
of a day or two while we coordinate
several teams of workers. Oh, and Canon
shipped us (or ordered us) the wrong
finishing unit, so we can't really go
ahead until we have the right one
anyway.
You'll hear more when we know it -- in the meantime, I am in the process of assembling webpages with pointers to the software that you'll need to use the new copier. I'll announce that here once it's in place.
Apparently the problems I was having getting Maple 10 working on the cluster were related to the problems with getting Maple 10 running on other machines. But they're sorted out now, so I have made Maple 10 the default Maple installation on the mathematics cluster!
Maple 10's interface has changed dramatically. Not surprisingly, it has lots of new features (and probably some new bugs, too). So I am keeping Maple 9.5 around for a while.
Running maple or
xmaple will now launch
Maple 10. If you need to run the older
version, you can do so by typing the
full path to the command, as in
/shared/local/maple9.5/bin/{,x}maple
or by adding the old path to your
PATH environment variable
using one of the following methods:
setenv PATH /shared/local/maple9.5/bin:$PATH (for tcsh, csh)
or
PATH=/shared/local/maple9.5/bin:$PATH; export PATH (for bash, zsh, sh, etc.)
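If you find yourself switching back and forth, a small helper that prepends a directory only when it isn't already listed keeps your PATH from growing every time you run it. What follows is a minimal sketch for Bourne-style shells; the function name path_prepend is made up for illustration and is not something installed on our systems.

```sh
# Prepend a directory to PATH unless it is already listed somewhere in it.
# path_prepend is a hypothetical helper name used only for this example.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present: leave PATH untouched
        *) PATH="$1:$PATH" ;;     # otherwise put the new directory first
    esac
    export PATH
}

path_prepend /shared/local/maple9.5/bin
echo "$PATH"
```

Note that if the directory is already on your PATH but not at the front, this sketch leaves the order alone, so the newer Maple would still win.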
I expect that Maple 9.5 will be removed sometime before the end of the fall semester or whenever CIS's license server stops working for Maple 9.5.
The chiller is back up, which means that we have air conditioning in the machine room again, so I've restarted the Amber cluster.
Sorry for the disruption in service, but some things are out of my hands. The servers have to come first.
Tom Shaffer, the college's plant engineer, has informed us that the separate air conditioning system that supplies cooling for various labs and computer machine rooms is offline. As a result, I have taken down the Amber cluster until air conditioning is restored.
I may have to take some additional servers offline in the near future, but we'll keep our fingers crossed that it won't come to that.
Some tiny number of you may have noticed that the Amber cluster was offline for a couple of hours. During that time, the cluster was completely disassembled, stacked up in the hallway, and then moved to its new (temporary) location in the department's small machine room.
The cluster was moved because with the summer heat, my office was running around 85 - 90° Fahrenheit, which isn't healthy for people or computers. This move is temporary because the department's machine room is now completely filled up with machines, leaving little room for humans to move around and do any maintenance.
The plan is still for the cluster to move to the CS machine room by the end of the summer. It will remain on the mathematics department subnet, and will continue to be available to people who are in the amber group.
The old cluster will be retired; at this point I'm thinking that we will probably maintain the head node in some form so that people who haven't already done so can retrieve their data, but the rest of the machines will probably be stripped or scrapped outright. If you're in the market for a Pentium II machine, let me know and we might be able to hook you up.
I have installed the latest versions of
Intel's
FORTRAN and C++ compilers for 32-bit
architectures in
/shared/local/intel.
You can use the compilers by running one of the following:
For the Intel C++ compiler:
/shared/local/intel/bin/iccvars.csh
/shared/local/intel/bin/iccvars.sh
For the Intel FORTRAN compiler:
/shared/local/intel/bin/ifortvars.csh
/shared/local/intel/bin/ifortvars.sh
These commands set various environment
variables (PATH,
MANPATH,
LD_LIBRARY_PATH, etc.) to
include directories needed to run the
compilers. Their effects end when you
quit the shell you run them in (e.g.,
log out, close the terminal window). If
you should find yourself using these
compilers all the time, you can add the
contents of these files to your own
startup files.
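Scripts of this kind generally change your current shell only if they are read into it (with source in csh-style shells, or . in Bourne-style shells) rather than executed as a separate program; I'm assuming that applies to the Intel scripts as well. The sketch below shows the difference using a throwaway stand-in script. The file name demovars.sh and the variable DEMO_COMPILER_HOME are invented for the example.

```sh
# Stand-in for a vendor "vars" script; the names here are hypothetical.
unset DEMO_COMPILER_HOME
demo_dir=$(mktemp -d)
cat > "$demo_dir/demovars.sh" <<'EOF'
DEMO_COMPILER_HOME=/opt/demo
export DEMO_COMPILER_HOME
EOF

# Executed as a child process: the variable is set in the child only.
sh "$demo_dir/demovars.sh"
after_run="${DEMO_COMPILER_HOME:-unset}"

# Read into the current shell with ".": the variable is set here.
. "$demo_dir/demovars.sh"
after_source="${DEMO_COMPILER_HOME:-unset}"

echo "after running:  $after_run"
echo "after sourcing: $after_source"
rm -rf "$demo_dir"
```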
The main advantage of the Intel compilers over the GNU Compiler Collection (gcc) compilers is that, in theory, Intel compilers take better advantage of the quirks in various CPU models. In practice, most code will not see a significant performance change when compiled with the Intel compilers, but there are exceptions. YMMV.
The Intel FORTRAN compiler also
supports FORTRAN90 and FORTRAN95,
whereas g77, the GNU
FORTRAN compiler, only supports
FORTRAN77 (as its name implies).
I have also installed the Intel Math Kernel Library, which provides mathematical functions optimized for use on Intel processors.
Documentation for these compilers and
the Intel Math Kernel Library is
available in
/shared/local/intel/doc,
and includes PDF and HTML manuals and
training material.
Please note that our license for using these materials requires that they be used solely for noncommercial purposes. If you're planning to compile code that you hope to make money on, please use the standard GCC compilers or download your own Intel compilers. (Even better, don't do commercial work on our systems.)
Enjoy!
I've just purged the system of accounts
that were marked as expired as of 2004.
The purge has gained us about 14 GB of
space on the
/home/students partition,
and bits and pieces elsewhere.
If, by chance, I accidentally deleted an account that should still exist, please let me know as soon as possible. I can still restore such accounts from our disk or tape backups.
Faculty folks: If you end up working with a student whose account has been removed, we have a tape archive of the older accounts, so we can restore their contents if need be. There will be a delay, however, as I plan to store that tape in another physical location.
There's been a problem with the open
file dialog in MATLAB ever since we
upgraded to MATLAB 7. The problem
manifests itself as follows: you click
on File->Open from the menu bar or
you click on the open icon on the tool
bar. You get a file picker dialog. You
move to the directory where your
.mat file lives, then
click it to select it and click open or
simply double-click the file name. One
of two things then happens: You get a
dialog telling you ``File not Found''
or you get an error message similar to
java.lang.InterruptedException
    at javax.swing.filechooser.FileSystemView.getFiles(Unknown Source)
    at javax.swing.plaf.basic.BasicDirectoryModel$LoadFilesThread.run(Unknown Source)
It turns out that this problem is
somehow triggered by the
LANG environment variable
(it looks like something to do with
Unicode). There are a couple of
workarounds:
Use the load or edit commands in the MATLAB Command Window.
Change to the directory that holds your file (with cd in your terminal window or with the navigation buttons in the MATLAB Current Directory browser pane) and open files from the Current Directory browser.
Unset the LANG environment variable before starting MATLAB.
I have replaced the link to the latest
MATLAB binary in
/shared/local/bin with a
small script that unsets the
LANG environment variable
and then starts MATLAB. The change
should be transparent to end users, but
it should be possible to open files
directly from the open file dialog with
this change.
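The idea behind the wrapper fits in a few lines. The following is a sketch of the approach, not the exact script installed on our systems; the function name run_without_lang and the stand-in command are for illustration only.

```sh
# Run a command with LANG removed from its environment.
# The subshell keeps the change from leaking into the calling shell.
run_without_lang() {
    (
        unset LANG
        "$@"
    )
}

# Stand-in for MATLAB: a child process that reports what it sees in LANG.
LANG=en_US.UTF-8
export LANG
seen=$(run_without_lang sh -c 'echo "${LANG-unset}"')
echo "child saw LANG=$seen; caller still has LANG=$LANG"
```

A standalone wrapper script would do the same thing and then hand off to the real binary, passing along any arguments it was given.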
Note that if you start MATLAB in any
way other than using the
matlab in
/shared/local/bin, this
change won't help you. You can check to
see what your shell thinks it should
run when you type matlab
by typing the following:
linux% which matlab
You should see
/shared/local/bin/matlab
If you don't, you can get the same effect by typing something similar to
For csh variants:
linux% ( unsetenv LANG ; /path/to/my/matlab ) &
For Bourne-shell variants:
linux$ ( unset LANG ; /path/to/my/matlab ) &
Last night around 5:30 we had some
issues with the department's main
server. They originally manifested as
problems reading mail; investigation
showed that there was something up with
NIS,
which caused problems logging in,
lsing files, and so forth.
The server seemed to be thrashing
badly; most of the system's resources
appeared to be devoted to the
kswapd daemon.
ypserv was running, but
not listening to any network ports.
After trying various less drastic means
to try to get the system working
properly (including dumping it down to
single-user mode and then back to
network-server level, which initially
seemed to work but didn't last), I
rebooted the server. When it came back
up, it (of course) had to run a check
on the various /home
filesystems. As these total around 200
GB, this process took a considerable
amount of time. Once the checking was
complete, the server came back up and
appears to be running normally at this
time.
I had been thinking about scheduling a reboot for this system in the near future, after classes and exams were over, so actually having to reboot it wasn't the worst thing in the world. (It had been running for 186 days without a reboot.) I do, however, apologize for any inconvenience you may have experienced when the server was unavailable.
If you notice any problems, please let me know ASAP so that I can take a look at them and get them resolved as quickly as possible.
There's probably no good time to upgrade your operating system, and I think that goes quadruple or more for a systems administrator. Suddenly your familiar working environment is completely different. Icons and menus have changed or are in different places. Some bits are missing. New functionality has replaced the old, familiar (working) functionality. Keys are remapped so they don't do the same things. Programs you used to have aren't there any more, because you don't have packages for this OS....
Of course, given a couple of days to concentrate, you could clear up the problem in no time. Just write that script to clean up old SRPMs and build shiny new RPMs you can install. Take the time to port your old configuration files over to the new system. Figure out where they moved things (and speculate on why). But a couple of days off are pretty rare in this biz, and when a user asks you a question, it's hard to say no. So you stumble along from issue to issue (A computer just died! No, two! Someone needs an application built! Someone else needs some technical advice on a paper they're writing! The printers aren't working! The mail system is broken!) and gradually piece your world back together.
All that is my way of saying, relax, I'll get back to you as soon as I can. As soon as I can get my editor to work, the mail server to send mail, printers to print, TeX to TeX, and so on.... Just relax....
The mathematics department hasn't had any systems running Red Hat Linux 7.3 for almost a year now. Accordingly, we are announcing the end of support for Red Hat Linux 7.3, and we are removing packages built for Red Hat Linux 7.3 from our mirror server.
The removal of these obsolete packages will free up some space for supported systems and will also allow us to clean up our directory structure a bit.
Support for Red Hat Linux 9 and the Fedora Legacy packages for RHL 9 will continue until sometime this summer, when the last of our RHL 9 systems will be retired, replaced, or rebuilt.
I have installed gaspode
in the Math Workroom (Olin-1264). It
should be available for general use as
of this writing.
You can obtain drivers, instructions, and other useful information for using this printer from its new webpage.
I have also added similar pages for the other ``public'' printers. They're all accessible from our printing support page (which has been up for some time).
Enjoy!
We have taken delivery of a new color
printer, gaspode, a
Hewlett Packard Color LaserJet
5550dtn.
This printer was a gift from Hewlett Packard and its Hardcopy Technologies Lab's director, John Meyer. We would not have received this generous gift without the work of Professor Mike Raugh, our department's Clinic Director.
gaspode replaces
winter, our Minolta-QMS
magiColor 6100. The new printer is much
faster (up to 27 pages per minute in
black and white or color) and uses HP's
ImageREt technology to achieve
resolutions of up to 3600 dpi.
Information on printing to the new printer is available on our math computing support website. (Please use this link, as this information is not yet tied into the site as a whole; I expect to create similar pages for each printer in the near future.)
Please thank Mike Raugh for obtaining this printer for the department.
I have updated Sun's Java Software Development Kits (SDKs) to the latest versions -- 1.4.2_07 and 1.5.0_02. The permission-elevation problem in the 1.4.2 series is addressed in the 07 update.
The standard Java remains 1.4.2. To use
Java 5 (really 1.5), you will have to
run the binaries by typing their full
pathnames or add the Java 5 directory
to your PATH. I recommend
that rather than using the
release-specific directory, you use
/shared/local/java5, which
is a link that will be updated to point
to the latest version installed on the
system.
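For the curious, a version-independent link like this is just a symbolic link that gets repointed whenever a new release is installed. Here is a minimal sketch in a throwaway directory. Only the java5 link name comes from our setup; the jdk directory names are invented for the example, and the ln -n option assumes GNU or BSD ln.

```sh
# Build a scratch tree with two versioned directories and one stable link.
# The jdk* directory names are hypothetical.
demo=$(mktemp -d)
mkdir "$demo/jdk1.5.0_01" "$demo/jdk1.5.0_02"

ln -s jdk1.5.0_01 "$demo/java5"      # link starts out at the older release
ln -sfn jdk1.5.0_02 "$demo/java5"    # repoint it once the newer one is installed

target=$(readlink "$demo/java5")
echo "java5 -> $target"
rm -rf "$demo"
```

Anything that refers to the link, such as a PATH entry, picks up the new release without being edited.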
MathWorks has announced the release of Service Pack 2 for Release 14 of MATLAB. I will be installing it as soon as I get hold of the media, but in the meantime, you can read about the changes in this release.
Please note that I will be installing
the new release in parallel with the
existing release. To use it, you will
need to specify the complete path to
the new version of MATLAB on the
command line, or add the new directory
to your PATH before the
old directory.
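The order matters because the shell runs the first match it finds while walking PATH from left to right. The sketch below demonstrates this with two scratch directories, each holding a fake matlab script that just prints a label; nothing in it touches the real installations.

```sh
# Two scratch directories, each with a fake "matlab" that names its release.
demo=$(mktemp -d)
mkdir "$demo/old" "$demo/new"
printf '#!/bin/sh\necho old\n' > "$demo/old/matlab"
printf '#!/bin/sh\necho new\n' > "$demo/new/matlab"
chmod +x "$demo/old/matlab" "$demo/new/matlab"

# Put the new directory ahead of the old one, then see which one is found.
saved_path=$PATH
PATH="$demo/new:$demo/old:$PATH"
found=$(command -v matlab)
release=$(matlab)
echo "$found reports: $release"

# Restore the original PATH and clean up.
PATH=$saved_path
rm -rf "$demo"
```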
I've added the TEXMF tree
provided by Wolfram for use in
compiling TeX documents exported from
Mathematica to the system.
The files are located in
/shared/local/share/Mathematica/texmf;
to use them, modify your
TEXMF environment variable
by adding that directory. You will
probably want something like
setenv TEXMF "$TEXMF, \!\!/shared/local/share/Mathematica/texmf" (*csh)
or
export TEXMF="$TEXMF, "'!!/shared/local/share/Mathematica/texmf' (Bourne/Korn variants)
which adds the files from the
Mathematica TEXMF tree
after the rest of the files in the
standard TEXMF path.
Those of you using the Fugu SCP/SFTP client for Mac OS X should update to the latest version of the program.
It's available from the
upstream site or from
yum.math.
Red Hat released version 4 of their Red Hat Enterprise Linux products last week. RHEL 4 is based on Fedora Core, Red Hat's ``free'' distribution, and includes features such as GNOME 2.8, SELinux, and the 2.6 Linux kernel.
RHEL 4 also drops the Mozilla suite in favor of Firefox and Thunderbird, and changes a whole bunch of other stuff in ways I haven't yet discovered.
CentOS 4 will be coming out soon, incorporating these changes.
I have been running a release candidate of CentOS 4 on a machine in my office, and thus far my impression is that it has many shiny improvements over CentOS 3, but that the changes may cause some issues if they aren't handled carefully. I expect to install CentOS 4 on my workstation and run it for a while before making a decision about rolling the new version of the OS out onto desktops. (Among other things, there's a fair amount of locally built and deployed software that will need to be rebuilt, updated, or replaced before a rollout can happen.)
Exactly when we upgrade workstations to CentOS 4 is unclear at this time, although it's likely that the upgrade will happen this summer at the latest, and probably sooner than that for lab workstations.
I may update the Amber cluster sooner, to see whether the changes affect some problems that have been seen there. Our servers will remain on CentOS 3 until I can see clear evidence that updating them would add enough valuable features to be worthwhile.
As usual, if you have any questions or
comments, please feel free to write to
me at cmc@math.hmc.edu.