3.) Building a Linux Cluster
Linux clusters are generally more common, more robust, and more cost-effective than Windows clusters. We will now look at the steps involved in building a Linux cluster.
Step 1

Install a Linux distribution (I am using Red Hat 7.1 and working with two Linux boxes) on each computer in your cluster. During the installation process, assign a hostname and, of course, a unique IP address to each node in your cluster. Usually, one node is designated as the master node (where you will control the cluster, write and run programs, and so on), and all the other nodes are used as computational slaves. We name one of our nodes Master and the other Slave. Our cluster is private, so in theory we could assign any valid IP address to our nodes as long as each has a unique value. I have used the IP address 192.168.0.190 for the master node and 192.168.0.191 for the slave node. If you already have Linux installed on each node in your cluster, you do not have to change your IP addresses or hostnames unless you want to. Changes (if needed) can be made with your network configuration program (Linuxconf in Red Hat). Finally, create identical user accounts on each node. In our case, we create the user DevArticle on each node in our cluster. You can create the identical user accounts during installation, or afterwards with the adduser command as root, as sketched below.
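If you create the accounts after installation, a minimal sketch, run as root on each node, looks like this (the account name DevArticle is the one this article uses; substitute your own):

adduser DevArticle
passwd DevArticle    # set the account password; use the same one on every node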
Step 2

We now need to configure rsh on each node in our cluster. Create .rhosts files in the user and root home directories. Our .rhosts file for the DevArticle users is as follows:

Master DevArticle
Slave DevArticle

The .rhosts file for the root users is as follows:

Master root
Slave root

Next, we create a hosts file in the /etc directory. Below is our hosts file for Master (the master node):

192.168.0.190 Master.home.net Master
127.0.0.1 localhost
192.168.0.191 Slave

Do not remove the 127.0.0.1 localhost line.

Step 3

The hosts.allow file on each node was modified by adding ALL: ALL as the only line in the file. This gives every node in our private cluster permission to connect to every other node. To allow root users to use rsh, I had to add the following lines to the /etc/securetty file:

rsh
rlogin
rexec
pts/0
pts/1

I also modified the /etc/pam.d/rsh file:

#%PAM-1.0
# For root login to succeed here with pam_securetty, "rsh" must be
# listed in /etc/securetty.
auth       sufficient   /lib/security/pam_nologin.so
auth       optional     /lib/security/pam_securetty.so
auth       sufficient   /lib/security/pam_env.so
auth       sufficient   /lib/security/pam_rhosts_auth.so
account    sufficient   /lib/security/pam_stack.so service=system-auth
session    sufficient   /lib/security/pam_stack.so service=system-auth

Step 4

Rsh, rlogin, Telnet, and rexec are disabled in Red Hat 7.1 by default. To change this, I navigated to the /etc/xinetd.d directory and modified each of the service files (rsh, rlogin, telnet, and rexec), changing the disable = yes line to disable = no. Once the changes were made to each file (and saved), I closed the editor and restarted xinetd (for example, with service xinetd restart) to enable rsh, rlogin, and the rest.
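Before moving on, it is worth confirming that passwordless rsh actually works between the nodes. A quick check from the master node, assuming the hostnames and account used in this article:

rsh Slave hostname                      # should print the slave's hostname with no password prompt
rcp /etc/hosts Slave:/tmp/hosts.copy    # a remote copy should also succeed

If either command prompts for a password or is refused, recheck the .rhosts files, /etc/hosts.allow, and the xinetd changes from Steps 2 through 4.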
Step 5

Next, download the latest version of MPICH (UNIX, all flavors) to the master node. Untar the file in either the common user directory (the identical user, DevArticle, that you established on all nodes of the cluster) or in the root directory (if you want to run the cluster as root) by issuing the following command:

tar zxfv mpich.tar.gz

Change into the newly created mpich-1.2.2.3 directory. Type ./configure, and when the configuration is complete and you have a command prompt, type make. The make may take a few minutes, depending on the speed of your master computer. Once make has finished, add the mpich-1.2.2.3/bin and mpich-1.2.2.3/util directories to your PATH in .bash_profile, or however you set your PATH environment variable. The full root paths for the MPICH util and bin directories on our master node are /root/mpich-1.2.2.3/util and /root/mpich-1.2.2.3/bin. For the DevArticle user on our cluster, /root is replaced with /home/DevArticle in the path statements. Log out and then log in again to enable the modified PATH containing your MPICH directories.

Step 6

Next, make all of the example files and the MPE graphics files. First, navigate to the mpich-1.2.2.3/examples/basic directory and type make to build all the basic example files. When this process has finished, change to the mpich-1.2.2.3/mpe/contrib directory and make some additional MPE example files, especially if you want to view graphics. Within the mpe/contrib directory you should see several subdirectories; the one we are interested in is the mandel directory. Change into the mandel directory and type make to create the pmandel executable. You are now ready to test your cluster.
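Collected in one place, the whole Step 5 sequence is sketched below, assuming a root install of the MPICH 1.2.2.3 tarball used in this article (substitute /home/DevArticle for a user install):

tar zxfv mpich.tar.gz
cd mpich-1.2.2.3
./configure
make
# Append the MPICH directories to the PATH line in ~/.bash_profile,
# then log out and back in:
export PATH=$PATH:/root/mpich-1.2.2.3/bin:/root/mpich-1.2.2.3/util

mpirun chooses the hosts it starts processes on from a machines file; Section 4 refers to it as machines.LINUX, which in this MPICH release is typically found under the util/machines directory. Following this article's convention of not listing the master node, a two-node cluster needs only one line:

# mpich-1.2.2.3/util/machines/machines.LINUX -- one slave hostname per line
Slave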
4.) Testing Your Linux Cluster
The first program we will run is cpilog. From within the mpich-1.2.2.3/examples/basic directory, copy the cpilog executable (if this file is not present, run make again) to your top-level directory. On our cluster, this is either /root (if we are logged in as root) or /home/DevArticle (if we are logged in as DevArticle; we have installed MPICH in both places). Next, from your top directory, rcp the cpilog file to each node in your cluster, placing the file in the corresponding directory on each node. For example, if I am logged in as DevArticle on the master node, I will issue rcp cpilog Slave:/home/DevArticle to copy cpilog to the DevArticle directory on Slave. I will do the same for each node (if there are more than two nodes). If I want to run a program as root, then I will copy the cpilog file to the root directories of all nodes on the cluster. Congratulations: your supercomputer (Linux cluster) is ready to run MPI programs!

Once the files have been copied, I will type the following from the top directory of my master node to test my cluster:

mpirun -np 1 cpilog

This runs the cpilog program on the master node alone to see if the program works correctly. Some MPI programs require at least two processors (-np 2), but cpilog will work with only one. The output looks like this:

pi is approximately 3.1415926535899406, Error is 0.0000000000001474
Process 0 is running on Master.home.net
wall clock time = 0.360909

Now try both nodes (or however many you want to try) by typing:

mpirun -np 2 cpilog

and you will see something like this:

pi is approximately 3.1415926535899406, Error is 0.0000000000001474
Process 0 is running on Master.home.net
Process 1 is running on Slave.home.net
wall clock time = 0.0611228

The number following the -np parameter corresponds to the number of processors (nodes) you want to use to run your program. It may not exceed the number of machines listed in your machines.LINUX file plus one (the master node is not listed in the machines.LINUX file).

To see some graphics, we must run the pmandel program. Copy the pmandel executable (from the mpich-1.2.2.3/mpe/contrib/mandel directory) to your top-level directory and then to each node (as you did for cpilog). Then, if X is not already running, issue a startx command. From a command console, type xhost + to allow any node to use your X display, and then set your DISPLAY variable as follows: DISPLAY=Master:0 (be sure to replace Master with the hostname of your master node if it differs). Setting the DISPLAY variable directs all graphics output to your master node. Run pmandel by typing:

mpirun -np 2 pmandel

The pmandel program requires at least two processors to run correctly. You should see the Mandelbrot set rendered on your master node. Adding more processors (mpirun -np 10 pmandel) should increase the rendering speed dramatically. The Mandelbrot set graphic is partitioned into small rectangles for rendering by the individual nodes, so you can actually see the nodes working as the rectangles are filled in. If one node is a bit slow, the rectangles from that node will be the last to fill in. It is quite fascinating to watch.
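For reference, the complete graphics test from a console on the master node condenses to the following sketch (the hostname Master and the -np value are taken from this article's setup; adjust them for your own cluster):

xhost +                    # let any node draw on this X display
export DISPLAY=Master:0    # send all graphics output to the master node
mpirun -np 2 pmandel       # two processes: the minimum pmandel accepts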