Introduction to Xgrid:
Cluster Computing for Everyone
Barbara J. Breen¹, John F. Lindner²
¹Department of Physics, University of Portland, Portland, Oregon 97203
²Department of Physics, The College of Wooster, Wooster, Ohio 44691
(First posted 4 January 2007; last revised 24 July 2007)
Xgrid is the first distributed computing architecture built into a desktop operating system. It allows you to run a single job across multiple computers at once. All you need is at least one Macintosh computer running Mac OS X v10.4 or later. (Mac OS X Server is not required.) We provide explicit instructions and example code to get you started, including examples of how to distribute your computing jobs, even if your initial cluster consists of just two old laptops in your basement.
Apple's Xgrid technology enables you to readily convert any ad hoc collection of Macintosh computers into a low-cost supercomputing cluster. Xgrid functionality is integrated into every copy of Mac OS X v10.4. For more information, visit http://www.apple.com/macosx/features/xgrid/.
In this article, we show how to unlock this functionality. In Section 2, we guide you through setting up the cluster. In Section 3, we illustrate two simple ways to distribute jobs across the cluster: shell scripts and batch files. We don't assume you know what shell scripts, batch files, or C/C++ programs are (although you will need to learn). Instead, we supply explicit, practical examples.
In a typical cluster of three or more computers (or processors), a client computer submits a job to the controller computer, which assigns an agent computer to perform it. However, the client, controller, and agent can be the same computer. In fact, we initially set up a "mini-grid" consisting of just two laptop computers, a G3 and a G4 PowerBook.
Designating an agent is a simple procedure handled in the Mac OS X System Preferences. For any computer running Mac OS X v10.4 "Tiger", simply check the Xgrid option under the Sharing pane of System Preferences, as in Fig. 1. (Older computers running Mac OS X v10.3 "Panther" can download a dedicated agent installer from http://www.apple.com/support/downloads/xgridagentformacosx103.html.)

Fig. 1. Creating an agent by checking Xgrid in the Sharing pane of System Preferences.

Fig. 2. These are the simplest settings for an agent.
Once the box is checked, the computer is available to act as an agent in a distributed network. Clicking the 'Configure' button opens a dialog box that allows this network to be described. The simplest configuration is to tell the computer to use the first available controller (though a specific controller can be specified), to always accept tasks (there is an option to accept tasks only when idle), and to not require authentication (the other options here are password or Kerberos single sign-on), as in Fig. 2.
2.2.1. XgridLite
The friendliest way to start (and stop) an Xgrid controller is to download the free XgridLite from http://edbaskerville.com/software/xgridlite/. It installs an XgridLite icon under "Other" in System Preferences, letting you turn controllers on and off in much the same way as you turn agents on and off.

Fig. 3. Note the XgridLite icon under "Other".
Clicking on the XgridLite icon opens a dialog box that lets you configure the machine as the controller. You can also choose whether or not to require authentication. (This authentication will be a non-Kerberos password; you can't enable single sign-on with XgridLite.)

Fig. 4. One mouse click stops or starts a controller.
2.2.2. Terminal
Alternatively, you can turn on the controller directly from the Terminal app using the command-line tool xgridctl. Simply type the command
breen$ sudo xgridctl c start
and provide an administrator's password when requested. The word sudo stands for "super user do", and "c" refers to the controller (rather than "a" for agent, another option). To familiarize yourself with this tool, access the manual by typing the command
breen$ man xgridctl
The simplest way to communicate with the cluster is via the Terminal app. In each session (that is, each time you open a Terminal window), you should set the controller hostname and password so that you don't have to type them with every Xgrid command. From the Mac OS X v10.4 default bash terminal, type
breen$ export XGRID_CONTROLLER_HOSTNAME=<hostname>
breen$ export XGRID_CONTROLLER_PASSWORD=<password>
From the alternate tcsh terminal, type
breen$ setenv XGRID_CONTROLLER_HOSTNAME <hostname>
breen$ setenv XGRID_CONTROLLER_PASSWORD <password>
The password doesn't have to be set if you are not requiring authentication, something we recommend in the early stages to ease the learning curve just a bit. If the computers are on the same network, you may be able to designate the controller hostname using the ".local" extension given in the System Preferences AppleTalk pane, for example "breenCluster.local". Otherwise, you should find the controller's full name including domain, such as "breenCluster.up.edu". IP addresses will also work.
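If you tire of retyping the export lines in every new Terminal window, one option is to wrap them in a small bash function, for instance in your ~/.bash_profile. The sketch below is only an illustration: the function name use_controller is hypothetical (it is not part of Xgrid), and it assumes nothing beyond the two environment variables described above.

```shell
# Hypothetical helper (not part of Xgrid): set the controller for this session.
# Usage: use_controller <hostname> [password]
use_controller() {
    if [ -z "$1" ]; then
        echo "usage: use_controller <hostname> [password]" >&2
        return 1
    fi
    export XGRID_CONTROLLER_HOSTNAME="$1"
    if [ -n "$2" ]; then
        export XGRID_CONTROLLER_PASSWORD="$2"
    else
        # No password given: clear any stale one from an earlier call.
        unset XGRID_CONTROLLER_PASSWORD
    fi
}

# Example: a controller on the local network that requires no authentication.
use_controller breenCluster.local
echo "controller is $XGRID_CONTROLLER_HOSTNAME"
```

Calling the function with a second argument sets the password as well, which suits a controller that requires password authentication.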
Xgrid commands have the form
xgrid <options> <action> <parameters>
Immediately after setting the controller hostname, type
breen$ xgrid -grid list
If the Xgrid is functioning, you should get something like
{gridList = (0); }
meaning that there is one Xgrid of ID 0 on your network. If you don't get this response, you've probably erred in setting the hostname or password. Watch for typos! Any time the terminal doesn't understand a command line instruction starting with xgrid, it will return the xgrid manual page, which you can also helpfully invoke by typing
breen$ man xgrid
(Exit from the manual by typing "q".) You can get the attributes of an Xgrid with
breen$ xgrid -grid attributes -gid 0
{gridAttributes = {gridMegahertz = 0; isDefault = YES; name = Xgrid; }; }
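The grid ID passed to -gid comes from the gridList reply, so the two commands can be chained. The following is a minimal sketch, assuming the reply always has the form shown above; the function name grid_ids_from is hypothetical, and the two-grid reply is an assumed example rather than real output. The loop only prints the commands it would run, so you can try it without a cluster.

```shell
# Hypothetical helper: pull the grid IDs out of a reply such as
# "{gridList = (0); }" and print them separated by spaces.
grid_ids_from() {
    printf '%s\n' "$1" |
        sed -n 's/.*gridList = (\([^)]*\)).*/\1/p' |
        tr -d ','
}

# The reply shown in the text; on a live cluster you would instead use
#   reply=$(xgrid -grid list)
reply='{gridList = (0); }'

for gid in $(grid_ids_from "$reply"); do
    # Print, rather than run, the attributes command for each grid.
    echo "xgrid -grid attributes -gid $gid"
done
```

Removing the echo and the surrounding quotes would run the attributes command for each grid in turn.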
You can run the Unix command "echo" on the cluster by typing
breen$ xgrid -job run /bin/echo "Hello, World"
Hello, World
You can submit the job "cal" (for calendar) to the cluster by typing
breen$ xgrid -job submit /usr/bin/cal 05 2007
{jobIdentifier = 80411; }
Retrieve the results by noting the job ID number (80411 in this example) and typing
breen$ xgrid -job results -id 80411
      May 2007
 S  M Tu  W Th  F  S
       1  2  3  4  5
 6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30 31
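Copying the job ID by hand quickly becomes tedious. One way around it is to capture the reply from -job submit in a shell variable and extract the number. This is a sketch under the assumption that the reply always looks like the one shown above; the function name job_id_from is hypothetical, and the final line prints the follow-up command instead of running it.

```shell
# Hypothetical helper: extract the numeric ID from a reply such as
# "{jobIdentifier = 80411; }". Prints nothing if no ID is found.
job_id_from() {
    printf '%s\n' "$1" |
        sed -n 's/.*jobIdentifier = \([0-9][0-9]*\);.*/\1/p'
}

# The reply shown in the text; on a live cluster you would instead use
#   reply=$(xgrid -job submit /usr/bin/cal 05 2007)
reply='{jobIdentifier = 80411; }'
id=$(job_id_from "$reply")

# Print the command that would fetch the results for this job.
echo "xgrid -job results -id $id"
```

With the sample reply, the last line prints the same results command typed by hand above.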
You can also get the job's attributes via
breen$ xgrid -job attributes -id 80411
{
    jobAttributes = {
        activeCPUPower = 0;
        applicationIdentifier = "com.apple.xgrid.cli";
        dateNow = 2007-05-23 22:22:39 -0400;
        dateStarted = 2007-05-23 22:22:15 -0400;
        dateStopped = 2007-05-23 22:22:16 -0400;
        dateSubmitted = 2007-05-23 22:22:14 -0400;
        jobStatus = Finished;