Pegasus II Cluster



Thomas Vojta
Research Group
Physics Department

 

How to build a diskless Linux cluster?


Reasons why you would want your compute nodes to be diskless:

money saved on hard disks
nodes use less power (and produce less heat)
system administration is simplified (instead of updating all nodes you modify a single boot image)

Of course, there are some disadvantages, too:

the RAM disk that holds the root file system occupies part of each node's RAM
    (nowadays this is hardly an issue, since 64 MByte out of 8 GByte is not a noticeable chunk of memory)
the nodes will not have swap space (swapping over the network is obviously a bad idea)
cluster installation becomes more involved since one has to create the boot image and the root RAM disk
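
To put the RAM cost in numbers, here is a minimal sketch using the 64 MByte and 8 GByte figures quoted above. The function name is only illustrative.

```python
def ramdisk_fraction(ramdisk_mbyte, node_ram_gbyte):
    """Fraction of a node's RAM taken up by the root RAM disk."""
    return ramdisk_mbyte / (node_ram_gbyte * 1024)

fraction = ramdisk_fraction(64, 8)
print(f"{fraction:.2%}")  # prints 0.78%
```

So the root RAM disk costs less than one percent of each node's memory.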

Root-NFS versus Root-RAM disk

When designing a diskless cluster, one has to decide where to put the nodes' root file systems: on a RAM disk in each node, or on the server, mounted via NFS. We chose the RAM disk setup, mainly for three reasons:

less network traffic
better scalability: the Root-NFS approach needs a separate root file system on the server for each node,
    whereas the same RAM disk image can be used for all nodes
all nodes are guaranteed to have the same configuration, which does not drift over time
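
The scalability argument can be illustrated with a rough estimate of how much space the root file systems take up on the server. This is only a sketch: it assumes that a per-node NFS root would be about as large as the roughly 32 MByte that the Pegasus II root file system occupies, which the text does not state, and the function name is hypothetical.

```python
def server_root_storage_mbyte(nodes, root_mbyte, shared_image):
    """Server-side space for node root file systems, in MByte.

    shared_image=True  -> one RAM disk image serves every node
    shared_image=False -> Root-NFS, one root file system per node
    """
    return root_mbyte if shared_image else nodes * root_mbyte

# 96 nodes, assumed 32 MByte per root file system
print(server_root_storage_mbyte(96, 32, shared_image=False))  # prints 3072
print(server_root_storage_mbyte(96, 32, shared_image=True))   # prints 32
```

The space itself is modest either way; the point is that the number of file systems to create and maintain grows with the node count under Root-NFS and stays at one with a shared RAM disk image.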

In the following we discuss only the strategy implemented on the Pegasus II cluster: the 96 diskless compute nodes are booted using the PXE network boot ROM and the pxelinux network boot loader. The basic root file system is stored in a 64 MByte RAM disk (~32 MByte occupied); the rest, namely the whole of /usr, is mounted read-only via NFS.
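
As a rough illustration of what such a boot setup involves, the sketch below generates a pxelinux configuration entry that loads a kernel and an initial RAM disk and uses it as the root file system. It is not the actual Pegasus II configuration: the file names vmlinuz and initrd.img and the label diskless are placeholders, and only the 64 MByte RAM disk size is taken from the text. The NFS mount of /usr would be set up inside the root image itself and is not shown.

```python
def render_pxelinux_config(kernel="vmlinuz", initrd="initrd.img", ramdisk_mbyte=64):
    """Return pxelinux config text that boots a node into a root RAM disk.

    kernel and initrd are placeholder file names (assumptions), looked up
    relative to the TFTP directory that pxelinux is served from.
    """
    # The kernel's ramdisk_size parameter is given in KiB.
    ramdisk_kib = ramdisk_mbyte * 1024
    lines = [
        "DEFAULT diskless",
        "LABEL diskless",
        f"  KERNEL {kernel}",
        f"  APPEND initrd={initrd} root=/dev/ram0 ramdisk_size={ramdisk_kib} rw",
    ]
    return "\n".join(lines) + "\n"

print(render_pxelinux_config())
```

Because every node receives the same entry and the same image, a single file of this kind (typically pxelinux.cfg/default on the TFTP server) would cover all compute nodes.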

While building the Pegasus cluster we used numerous web sources. The article by Charles M. Coldwell was particularly useful, and we based most of the Pegasus cluster installation on it.