Pegasus II Cluster
How to build a diskless Linux cluster?

Reasons why you would want your compute nodes to be diskless:

- money is saved on hard disks
- nodes use less power (and produce less heat)
- system administration is simplified (instead of updating all nodes, you modify a single boot image)

Of course, there are some disadvantages, too:

- the RAM disk which holds the root file system occupies part of each node's RAM (nowadays this is hardly an issue, since 64 MB out of 8 GB is less than 1% of memory)
- the nodes have no swap space (swapping over the network is obviously a bad idea)
- cluster installation becomes more involved, since one has to create the boot image and the root RAM disk

Root-NFS versus Root-RAM disk

When designing a diskless cluster, one has to decide where to put the root file systems of the nodes: on a RAM disk, or on the server, mounted via NFS. We went with the RAM disk setup for three main reasons:

- less network traffic
- better scalability: in the Root-NFS approach one needs a separate file system on the server for each node, while the same RAM disk image can be used for all nodes
- all nodes are guaranteed to have the same configuration, and it does not drift over time

In the following we discuss only the strategy implemented on the Pegasus II cluster: 96 diskless compute nodes are booted using the PXE network boot ROM and the pxelinux network boot loader. The basic root file system is stored in a 64 MB RAM disk (about 32 MB occupied); the rest, namely the whole of /usr, is mounted read-only via NFS.

While building the Pegasus cluster we used numerous web sources. The article by Charles M. Coldwell was particularly useful, and we based most of the Pegasus cluster installation on it.
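To make the boot strategy described above more concrete, here is a minimal Python sketch that renders the kind of pxelinux entry and /usr mount line such a setup might use, and checks the RAM cost of the root RAM disk. Only the 64 MB RAM disk size and the 8 GB of node memory come from the text; the kernel and image file names, the label, and the server host name are hypothetical placeholders, and the actual Pegasus II configuration may well differ.

```python
# Illustrative sketch only: file names, label, and host name are assumptions,
# not the actual Pegasus II configuration.
RAMDISK_MB = 64   # root RAM disk size, from the text
NODE_RAM_GB = 8   # node memory, from the text


def ramdisk_size_kib(size_mb: int) -> int:
    """Value for the kernel's ramdisk_size= parameter, which is given in KiB."""
    return size_mb * 1024


def ram_fraction(ramdisk_mb: int, node_ram_gb: int) -> float:
    """Fraction of a node's memory taken by the root RAM disk."""
    return ramdisk_mb / (node_ram_gb * 1024)


def pxelinux_entry(kernel: str, initrd: str, ramdisk_mb: int) -> str:
    """Render a pxelinux entry that boots with the root file system on a RAM disk."""
    append = (
        f"initrd={initrd} root=/dev/ram0 rw "
        f"ramdisk_size={ramdisk_size_kib(ramdisk_mb)}"
    )
    return "\n".join([
        "DEFAULT node",
        "LABEL node",
        f"  KERNEL {kernel}",
        f"  APPEND {append}",
    ])


def usr_fstab_line(server: str) -> str:
    """Render an fstab line that mounts /usr read-only from the NFS server."""
    return f"{server}:/usr  /usr  nfs  ro  0 0"


if __name__ == "__main__":
    # Hypothetical kernel and image names, and a hypothetical server host name.
    print(pxelinux_entry("vmlinuz", "rootfs.img.gz", RAMDISK_MB))
    print(usr_fstab_line("server"))
    print(f"RAM disk share of node memory: {ram_fraction(RAMDISK_MB, NODE_RAM_GB):.2%}")
```

Because every node receives the same entry and the same image, changing the node configuration amounts to rebuilding one image on the server, which is the administrative advantage listed above.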