Thursday, November 14, 2024

The sad tale of the Alpha massacre


Who, Me? Good morning and welcome, once again, to Who, Me? in which Register readers share tales of tech support moments they might prefer to forget. But forgetting is not a way to learn from mistakes, is it?

This week’s hero is a veteran we’ll Regomize as “Gandalf”, for such is the grayness of his beard nowadays. Back in less gray years, though, Gandalf worked for a significant – but now defunct – database maker.

Primary development was carried out on SunOS, but the developer also maintained releases for HP-UX, AIX, Tru64, Siemens Nixdorf UNIX, SCO UNIX, Linux and Windows NT among other exotic operating systems. Gandalf was on the porting team, whose mission was to back-port releases and fixes to the relevant platform branch, run the full suite of tests, and prepare software for release.

Got that so far? Good.

To aid in this endeavor, the team had a set of Quality Assurance tools for each specific database version for each platform. To test, they would simply copy the relevant set of QA Tools from a shared directory and place it on the local filesystem. Then they would set the shell variable QATOOLS to point to the location of the QA Tools. For example: set QATOOLS=/opt/qatools/

The /qatools/ directory contained all the binaries, configuration files, log file locations and other information needed to test that particular product suite: $QATOOLS/bin, $QATOOLS/etc, $QATOOLS/incl, $QATOOLS/var.

Of course it was important to ensure that they were using the right tools for the right version of the database on the right system, so before each round of testing they needed to ensure the old version of the toolset had been cleared. This will become important in a minute.

One fine day, Gandalf and team were tasked with testing a new release of the Tru64 port on a DEC Alpha installation located remotely (in fact, it was in Menlo Park, California – a possibly unnecessary detail, but Gandalf included it so you may as well know). They logged in as root, and issued the command to clear out the past version of the tool kit:

rm -rf $QATOOLS/bin $QATOOLS/etc $QATOOLS/incl $QATOOLS/var

Shortly thereafter, Gandalf noticed that things stopped working. Most notably, the telnet connection (yes, we are well into the before times here) went dead and could not be revived.

Investigation revealed a genuinely horrific error: before typing in the very powerful command above, they had failed to point the QATOOLS variable at the correct location. Or indeed at any location.

If you recall anything about Alpha, you’ll know what that command did. As Gandalf put it: “The resulting carnage was as swift as the DEC Alpha was powerful.” And he recalled one of the graybeards at the time telling him: “That machine was dead before the sweat of your brow hit the keyboard!”

In short, everything was gone.
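
For readers who want the mechanics, here is a minimal sketch, assuming a Bourne-style shell (the story does not say which shell the team used). With QATOOLS unset, $QATOOLS/bin expands to plain /bin, and likewise for the other three paths. The function names show_targets and clear_qatools are hypothetical, invented for illustration, and are not part of Gandalf's toolset.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not from the original story.

# Print the paths the rm command would receive, without deleting anything.
show_targets() {
    echo "$QATOOLS/bin" "$QATOOLS/etc" "$QATOOLS/incl" "$QATOOLS/var"
}

# Guarded cleanup: refuse to run when QATOOLS is unset, empty or "/".
clear_qatools() {
    case "${QATOOLS:-}" in
        "" | /)
            echo "QATOOLS is unset, empty or /; refusing to delete" >&2
            return 1
            ;;
    esac
    rm -rf -- "$QATOOLS/bin" "$QATOOLS/etc" "$QATOOLS/incl" "$QATOOLS/var"
}

# Reproduce the mistake harmlessly: only echo, never rm.
unset QATOOLS
show_targets   # prints: /bin /etc /incl /var
```

The guard is deliberately simple and would not catch every dangerous value (a QATOOLS of // slips through, for instance). A terser alternative in scripts is the POSIX ${QATOOLS:?} expansion, which aborts a non-interactive shell when the variable is unset or empty.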

Thankfully the sysadmins in Menlo Park were able to reconstruct their devastated system and ultimately there were no serious repercussions for Gandalf or his team. Just an important lesson learned – and a heck of a war story gathered for the retelling.

Who, Me? needs your war stories! Our mailbag is getting very low and dusty, so if you have a tale of tech support gone awry or lessons learned the hard way, please click here to send an email to Who, Me? so we can possibly share your adventures on some future Monday morn. ®
