Introduction to Xgrid:

Cluster Computing for Everyone

 

Barbara J. Breen¹, John F. Lindner²

¹Department of Physics, University of Portland, Portland, Oregon 97203

²Department of Physics, The College of Wooster, Wooster, Ohio 44691

 

(First posted 4 January 2007; last revised 24 July 2007)

 

Xgrid is the first distributed computing architecture built into a desktop operating system. It allows you to run a single job across multiple computers at once. All you need is at least one Macintosh computer running Mac OS X v10.4 or later. (Mac OS X Server is not required.) We provide explicit instructions and example code to get you started, including examples of how to distribute your computing jobs, even if your initial cluster consists of just two old laptops in your basement.

 

1. INTRODUCTION

Apple's Xgrid technology enables you to readily convert any ad hoc collection of Macintosh computers into a low-cost supercomputing cluster. Xgrid functionality is integrated into every copy of Mac OS X v10.4. For more information, visit http://www.apple.com/macosx/features/xgrid/.

 

In this article, we show how to unlock this functionality. In Section 2, we guide you through setting up the cluster. In Section 3, we illustrate two simple ways to distribute jobs across the cluster: shell scripts and batch files. We don't assume you know what shell scripts, batch files, or C/C++ programs are (although you will need to learn). Instead, we supply explicit, practical examples.

 

2. SETTING UP THE CLUSTER

In a typical cluster of three or more computers (or processors), a client computer submits a job to the controller computer, which assigns one or more agent computers to perform it. However, the client, controller, and agent can be the same computer. In fact, we initially set up a "mini-grid" consisting of just two laptop computers, a G3 and a G4 PowerBook.

2.1. Assigning Agents

Designating an agent is a simple procedure handled in the Mac OS X System Preferences. For any computer running Mac OS X v10.4 "Tiger", simply check the Xgrid option under the Sharing pane of System Preferences, as in Fig. 1. (Older computers running Mac OS X v10.3 "Panther" can download a dedicated agent installer from http://www.apple.com/support/downloads/xgridagentformacosx103.html.)

 

Fig. 1. Creating an agent by checking Xgrid in the Sharing pane of System Preferences.

 

Fig. 2. These are the simplest settings for an agent.

 

 

Once the box is checked, the computer is available to act as an agent in a distributed network. Clicking the 'Configure' button opens a dialog box that allows this network to be described. The simplest configuration is to tell the computer to use the first available controller (though a specific controller can be specified), to always accept tasks (there is an option to accept tasks only when idle) and to not require authentication (the other options here are password or Kerberos single sign-on), as in Fig. 2.

2.2. Assigning a Controller

2.2.1. XgridLite

The friendliest way to start (and stop) an Xgrid controller is to download the free XgridLite from http://edbaskerville.com/software/xgridlite/. It installs an XgridLite icon under "Other" in System Preferences, so that you can turn controllers on and off in much the same way as you turn agents on and off.

 

Fig. 3. Note the XgridLite icon under "Other".

 

Clicking on the XgridLite icon opens a dialog box that lets you configure the machine as the controller. You can also choose whether or not to require authentication. (This authentication will be a non-Kerberos password; you can't enable single sign-on with XgridLite.)

 

Fig. 4. One mouse click stops or starts a controller.

 

2.2.2. Terminal

Alternatively, you can turn on the controller directly from the Terminal app using the command-line tool xgridctl. Simply type the command

 

breen$ sudo xgridctl c start

 

and provide an administrator's password when requested. The word sudo stands for "super user do" and "c" refers to the controller (rather than "a" for agent, another option). To familiarize yourself with this tool, access the manual by typing the command

 

breen$ man xgridctl
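
The role letters described above lend themselves to a small wrapper. The following is only a minimal sketch: the function name xgridctl_command is hypothetical (it is not part of Xgrid), and the "stop" action is assumed to be accepted by xgridctl, which you should confirm in the manual page. The function just builds and prints the command line so that you can inspect it before running it yourself.

```shell
#!/bin/bash
# Hypothetical helper (not part of Xgrid): build the xgridctl command line
# for a given role and action so it can be inspected before it is run.
#   role:   c (controller) or a (agent), as described in the text
#   action: start or stop ("stop" is an assumption; check "man xgridctl")
xgridctl_command() {
    local role="${1:-}" action="${2:-}"
    case "$role" in
        c|a) ;;
        *) echo "unknown role: $role (use c or a)" >&2; return 1 ;;
    esac
    case "$action" in
        start|stop) ;;
        *) echo "unknown action: $action (use start or stop)" >&2; return 1 ;;
    esac
    echo "sudo xgridctl $role $action"
}

# Print the command for starting a controller.
xgridctl_command c start
```

Printing rather than executing keeps the sketch harmless on a machine without xgridctl; on a Mac you would copy the printed line into the Terminal.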

 

2.3. Communicating with the Cluster

The simplest way to communicate with the cluster is via the Terminal app. At the start of each session (that is, every time you open a Terminal window), set the controller hostname and password so that you don't have to type them with every Xgrid command. From the Mac OS X v10.4 default bash terminal, type

 

breen$ export XGRID_CONTROLLER_HOSTNAME=<hostname>

breen$ export XGRID_CONTROLLER_PASSWORD=<password>

 

From the alternate tcsh terminal, type

 

breen$ setenv XGRID_CONTROLLER_HOSTNAME <hostname>

breen$ setenv XGRID_CONTROLLER_PASSWORD <password>

 

The password doesn't have to be set if you are not requiring authentication, something we recommend in the early stages to ease the learning curve just a bit. If the computers are on the same network, you may be able to designate the controller hostname using the ".local" name shown in the Sharing pane of System Preferences, for example "breenCluster.local". Otherwise, you should find the controller's full name including domain, such as "breenCluster.up.edu". IP addresses will also work.
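
Because these variables must be set again in every new Terminal window, it may be convenient to wrap the bash exports in a small function. This is a minimal sketch under stated assumptions: the function name xgrid_use is hypothetical, and the password is exported only when one is supplied, which matches the no-authentication setup recommended above.

```shell
#!/bin/bash
# Hypothetical helper (name and layout are illustrative assumptions):
# set the controller hostname, and the password only if one is supplied.
xgrid_use() {
    if [ -z "${1:-}" ]; then
        echo "usage: xgrid_use <hostname> [password]" >&2
        return 1
    fi
    export XGRID_CONTROLLER_HOSTNAME="$1"
    if [ -n "${2:-}" ]; then
        export XGRID_CONTROLLER_PASSWORD="$2"
    else
        # No authentication: do not leave a stale password behind.
        unset XGRID_CONTROLLER_PASSWORD
    fi
}

# Example with the hostname used in the text and no password.
xgrid_use breenCluster.local
echo "controller: $XGRID_CONTROLLER_HOSTNAME"
```

If you place such a function in a startup file like ~/.bash_profile, keep in mind that any password you type or store this way sits in plain text.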

 

Xgrid commands have the form

 

       xgrid <options> <action> <parameters>

 

Immediately after setting the controller hostname, type

 

breen$ xgrid -grid list

 

If the Xgrid is functioning, you should get something like

 

{gridList = (0); }

 

meaning that there is one Xgrid of ID 0 on your network. If you don't get this response, you've probably erred in setting the hostname or password. Watch for typos! Any time the terminal doesn't understand a command line instruction starting with xgrid, it will return the xgrid manual page, which you can also helpfully invoke by typing

 

breen$ man xgrid

 

(Exit from the manual by typing "q".) You can get the attributes of an Xgrid with

 

breen$ xgrid -grid attributes -gid 0

{gridAttributes = {gridMegahertz = 0; isDefault = YES; name = Xgrid; }; }
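
The xgrid tool prints its replies in the braces-and-semicolons format shown above. When you start scripting, it can help to pull a single value out of such a reply. The following is a minimal sketch using sed; the function name xgrid_field is hypothetical, and it assumes simple "key = value;" pairs like the ones in this article (values containing semicolons would not be handled).

```shell
#!/bin/bash
# Hypothetical helper: read xgrid output on stdin and print the value of
# the first line containing a "key = value;" pair whose key is exactly $1.
# Assumes simple values without embedded semicolons; quoted values are
# returned with their quotation marks.
xgrid_field() {
    sed -n -E "s/(^|.*[^A-Za-z])$1 = ([^;]*);.*/\2/p" | head -n 1
}

# Sample reply copied from the attributes command above.
reply='{gridAttributes = {gridMegahertz = 0; isDefault = YES; name = Xgrid; }; }'
echo "$reply" | xgrid_field name        # prints Xgrid
```

On a working cluster you could pipe the command itself into the helper, for example "xgrid -grid attributes -gid 0 | xgrid_field isDefault".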

 

You can run the Unix command "echo" on the cluster by typing

 

breen$ xgrid -job run /bin/echo "Hello, World"

Hello, World
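
When you are developing a script away from the cluster, it can be handy to fall back to running the command on your own machine. The wrapper below is a sketch under stated assumptions: the name grid_run is hypothetical, and it uses the grid only if the xgrid tool is installed and XGRID_CONTROLLER_HOSTNAME is set. As in the example above, commands are given by absolute path so that the agent can find them.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command through "xgrid -job run" when the
# xgrid tool is installed and a controller hostname is set; otherwise run
# the same command locally.
grid_run() {
    if command -v xgrid >/dev/null 2>&1 && [ -n "${XGRID_CONTROLLER_HOSTNAME:-}" ]; then
        xgrid -job run "$@"
    else
        "$@"
    fi
}

grid_run /bin/echo "Hello, World"
```

Either way the output should be the same single line, which makes the wrapper a quick check that a command behaves identically on and off the grid.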

 

You can submit the job "cal" (for calendar) to the cluster by typing

 

breen$ xgrid -job submit /usr/bin/cal 05 2007

{jobIdentifier = 80411; }

 

retrieve the results by noting the job ID number (80411 in this example) and typing

 

breen$ xgrid -job results -id 80411

      May 2007

 S  M Tu  W Th  F  S

       1  2  3  4  5

 6  7  8  9 10 11 12

13 14 15 16 17 18 19

20 21 22 23 24 25 26

27 28 29 30 31

 

and get the job's attributes via

 

breen$ xgrid -job attributes -id 80411

{

    jobAttributes = {

        activeCPUPower = 0;

        applicationIdentifier = "com.apple.xgrid.cli";

        dateNow = 2007-05-23 22:22:39 -0400;

        dateStarted = 2007-05-23 22:22:15 -0400;

        dateStopped = 2007-05-23 22:22:16 -0400;

        dateSubmitted = 2007-05-23 22:22:14 -0400;

        jobStatus = Finished;