In my lab at Duke University, we had a lot of old computers from prior research projects that were no longer being used. I volunteered to put them together into a cluster for the lab to use for computationally intensive tasks. I didn’t know anything about cluster computing before this project, so it was a great experience learning how to put together and use a computer cluster.
If you’re new to cluster computing and are interested in setting up your own small computer cluster, the following overview may be helpful.
Hardware & Network
The cluster has seven x86-64 desktop computers of varying age with a range of processors and memory capacities. They are all connected with a single 8-port unmanaged network switch that is connected to Duke’s network. This is a photograph of the cluster:
Six of the computers are compute nodes, and the remaining one (dsg02) is the login node, SLURM controller, and file server. This is the network topology:
The hardest part of setting up the cluster was figuring out what software to use and how to configure it. Since I was unfamiliar with cluster computing, I strongly favored projects with good documentation that were fairly easy to set up. I decided on the following:
- Debian stable for the OS. It’s free software, is reliable, and has long-term support. The Debian project also works very hard to minimize changes to Debian stable, which reduces the work required to administer the cluster.
- Gluster for the shared file system (for users’ home directories). The Gluster documentation is pretty good, so I found it easier to set up than the alternatives. It’s also a distributed file system, so if I need to add more storage capacity or increase transfer speeds in the future, I can add more storage nodes.
- MUNGE for hosts to authenticate each other (needed for SLURM). It is easy to set up.
- Sphinx to build the cluster’s documentation (hosted on the head node).
- Apache as the web server on the head node, hosting the documentation and Ganglia. Debian makes setting up Apache very easy.
- OpenSSH for users to connect to the cluster and transfer files with SFTP. I also set up passwordless (key-based) authentication between hosts for all users, which MPICH needs.
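To give a feel for how these pieces fit together, here is a rough sketch of a minimal `slurm.conf`. The controller hostname (dsg02) matches the cluster described above, but the compute node names, CPU counts, and memory sizes are placeholders I made up for illustration, not the lab’s real values:

```ini
# Minimal slurm.conf sketch (illustrative only -- node names and
# hardware numbers below are placeholders, not the real cluster's).
ClusterName=dsg
SlurmctldHost=dsg02
# Hosts authenticate each other with MUNGE.
AuthType=auth/munge
SchedulerType=sched/backfill
SelectType=select/cons_tres
# Example compute node and partition definitions (placeholder values).
NodeName=node[1-6] CPUs=4 RealMemory=7900 State=UNKNOWN
PartitionName=main Nodes=node[1-6] Default=YES MaxTime=INFINITE State=UP
```

In practice the configuration builder mentioned at the end of this post generates a much more complete starting point than this.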
I installed additional software for users to develop and run their programs, including:
- Miniconda for the Python environment because it’s the easiest way to get up-to-date Python packages on Debian stable.
- GNU Compiler Collection (GCC) for the C/C++/Fortran environment.
- GNU Octave as a free alternative to MATLAB.
- MATLAB, because the other researchers in my lab use it.
If you’re unfamiliar with computer clusters, it’s helpful to know how they work from the user’s perspective. This is how the small cluster I built is set up:
The user has access to their home directory and the `/tmp` directory on each node. The user’s home directory is shared across the nodes with Gluster, so all programs and input/output files in the user’s home directory are available on all nodes. To run a job on the cluster:
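The home-directory sharing described above boils down to a Gluster volume mounted on every node. A rough sketch, assuming a single storage brick on dsg02 and a volume name of `gv-home` (both the brick path and the volume name are my placeholders):

```shell
# On the file server (brick path and volume name are placeholders):
# create and start a single-brick Gluster volume for home directories.
gluster volume create gv-home dsg02:/srv/gluster/home
gluster volume start gv-home

# On every node, mount the volume over /home (and add a matching
# fstab entry so the mount persists across reboots).
mount -t glusterfs dsg02:/gv-home /home
```

With more storage nodes, the `volume create` step would list additional bricks, which is what makes growing the file system later straightforward.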
1. The user transfers their program and input data to the login node with SFTP.
2. The user SSHes into the cluster’s login node. They can run inexpensive tasks on the login node, such as compiling small programs. However, for computationally intensive tasks, the user should submit a job with SLURM to run on the compute nodes.
3. On the login node, the user can use the following SLURM commands:
   - `srun` to run a single job and wait for it to complete,
   - `salloc` to allocate resources (primarily for an interactive job), or
   - `sbatch` to schedule a batch job for execution.
4. When the necessary resources (i.e. processors and memory) become available on the compute nodes, SLURM starts the job on the available compute nodes. The user can cancel the job with `scancel` or check its status with `squeue`.
5. If the user submitted a batch job, SLURM saves the standard output and standard error from the job to the specified location (typically files in the user’s home directory). The program being run can also save output files itself to the user’s home directory, because the home directory is transparently synchronized between the nodes with Gluster.
6. When the job is complete, the user can download the output files from the login node with SFTP.
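The batch-job path above might look like this in practice. The script name, program, and resource numbers are made up for illustration:

```shell
#!/bin/bash
# myjob.sh -- example batch script (program and resources are made up).
#SBATCH --job-name=myjob
#SBATCH --ntasks=4                # four tasks (e.g. MPI ranks)
#SBATCH --mem-per-cpu=1G
#SBATCH --time=01:00:00
#SBATCH --output=myjob-%j.out     # stdout/stderr land in the home directory

# The home directory is shared via Gluster, so ./simulate and its input
# files are visible on whichever compute nodes SLURM picks.
srun ./simulate input.dat
```

The user would submit this with `sbatch myjob.sh`, watch it with `squeue`, and cancel it with `scancel <jobid>` if necessary.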
Configuration Management & Testing
One of my goals was to automate the installation and configuration of the cluster as much as possible in order to simplify maintenance and enable version control of the configuration. For installation and configuration, I’m using:
- Ansible for configuration management. Ansible is relatively simple to set up, is extensible, and works well enough for my needs.
- Git for version control of the configuration.
- Debian preseeding for the initial installation of the OS. Unfortunately, preseeding is not well documented, but I succeeded by basing my template on this example and the partman-auto documentation.
- Jinja for generating the preseed files by filling a template with variables from Ansible.
- GNU Make for automating the builds of the configuration and the test images.
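To make the preseeding step concrete, here is a fragment of what a Jinja-templated preseed file can look like. The variable names are my own illustrative placeholders (filled in from Ansible variables), not the ones from the actual templates:

```jinja
# Fragment of a Jinja-templated Debian preseed file. The {{ ... }}
# variables are placeholders substituted from Ansible before install.
d-i netcfg/get_hostname string {{ hostname }}
d-i netcfg/get_ipaddress string {{ ip_address }}
d-i mirror/http/hostname string deb.debian.org
d-i passwd/root-password-crypted password {{ root_password_hash }}
d-i partman-auto/method string regular
```

Rendering one such file per host is what lets a single template drive the unattended installation of every node.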
Since users could be running jobs on the cluster, I needed a way to test changes that didn’t interfere with the actual cluster. I’m using the following additional software to test the configuration with a network of virtual machines on my laptop:
- Packer to build clean Debian virtual machine images with the preseed files.
- Vagrant to start and provision the virtual machines with Ansible.
- VirtualBox to run the virtual machines.
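The test loop with these tools looks roughly like the following. The template, box, and machine names are placeholders, not my actual file names:

```shell
# Typical test loop on the laptop (file and machine names are placeholders):
packer build debian-preseed.json   # bake a clean Debian box from the preseed file
vagrant box add dsg-test debian-preseed.box
vagrant up --provision             # boot the VMs and run the Ansible playbooks
vagrant ssh controller             # poke around a test node interactively
vagrant destroy -f                 # throw the whole test network away when done
```

Because the same preseed files and playbooks drive both the VMs and the real machines, a change that works in VirtualBox is very likely to work on the cluster.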
Documentation & Sustainability
One of my goals when building the cluster was to make it sustainable after I leave Duke. As a result, I automated as much of the configuration as possible and documented everything. I’m using Sphinx for documentation, and I’m keeping the configuration and documentation on Duke’s GitLab instance.
If you’d like to set up your own small cluster, the following resources may be helpful:
- The documentation for the software I listed above.
- Partway through the project, I found ajdecon’s ansible-simple-slurm-cluster repository, which contains Ansible roles to set up a SLURM-based cluster. I made some decisions differently from ajdecon, but his example was really helpful as an outline of what to do.
- Many universities publish documentation for their SLURM clusters, which is helpful for learning how users interact with a cluster. For example, UT Austin has good documentation for their Stampede cluster.
- To generate an initial configuration, use one of the configuration builders, available at `/usr/share/doc/slurmctld/slurm-wlm-configurator.html` once you have `slurmctld` installed. See the `slurm.conf(5)` man page for more information about the options.