I’ve had Sun GridEngine running on our cluster of 12-core HP blades from its earliest days. What has not been working is the inter-host communication (the ability of the system to schedule and distribute jobs across the nodes), so I set out to fix it. It turns out that the problems preventing this from working are mainly caused by quirks in the way the Debian (and, by inheritance, Ubuntu) packaging was done.
Prerequisites for gridengine: Most of the problems that I saw with the Debianised gridengine system are due to a lack of these prerequisites:
1. Check the hosts file for localhost.localdomain type entries. If these are present, they will cause host communication to fail. Ensure that, at minimum, the hosts file on the master has an entry for each exec node, and the hosts file on each exec node has an entry for the master. For example:
I will set up a cluster between my desktop machine, KWIAT22 and my laptop, caleb.
/etc/hosts on KWIAT22 contains:
127.0.0.1 localhost
#127.0.0.1 localhost.localdomain localhost
22.214.171.124 KWIAT22
126.96.36.199 caleb
plus some other irrelevant entries. Note that localhost.localdomain is commented out.
/etc/hosts on caleb contains:
127.0.0.1 caleb
#127.0.0.1 localhost.localdomain localhost
188.8.131.52 caleb
184.108.40.206 KWIAT22
Note again, the localhost.localdomain entry has been commented out.
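The two hosts-file rules above are easy to check by script before installing anything. The following is only a minimal sketch: check_hosts_file is a helper name I have made up (it is not part of gridengine), and it assumes hostnames are written in exactly the same case as in the file.

```shell
#!/bin/sh
# check_hosts_file is a hypothetical helper, not part of gridengine.
# Usage: check_hosts_file FILE HOST [HOST...]
# Returns non-zero if FILE has an active localhost.localdomain entry or
# lacks an entry for any of the named hosts.
check_hosts_file() {
    hosts_file=$1
    shift
    status=0
    # Strip comments first, so commented-out entries are ignored.
    if sed 's/#.*//' "$hosts_file" | grep -q 'localhost\.localdomain'; then
        echo "active localhost.localdomain entry in $hosts_file"
        status=1
    fi
    # Every named host must appear as a name (field 2 onwards) on some line.
    for host in "$@"; do
        if ! sed 's/#.*//' "$hosts_file" |
            awk -v h="$host" '{ for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
                              END { exit !found }'; then
            echo "no entry for $host in $hosts_file"
            status=1
        fi
    done
    return $status
}
```

On the master this would be run as check_hosts_file /etc/hosts KWIAT22 caleb, and on an exec node with at least the master's name.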
2. Java is required for inter-host communication. We will use Sun Java, as it is assumed to be most compatible with Sun GridEngine. Edit /etc/apt/sources.list and uncomment the entries for the partner repository:
deb http://archive.canonical.com/ubuntu maverick partner
deb-src http://archive.canonical.com/ubuntu maverick partner
Then install the JRE:
apt-get install sun-java6-jre
Check which version of java we’ve got selected:
root@caleb:~# java -version
java version "1.6.0_22"
OpenJDK Runtime Environment (IcedTea6 1.10.1) (6b22-1.10.1-0ubuntu1)
OpenJDK 64-Bit Server VM (build 20.0-b11, mixed mode)
From that we can see that I still have OpenJDK selected, so we change that:
root@caleb:~# update-alternatives --config java
There are 2 choices for the alternative java (providing /usr/bin/java).

  Selection    Path                                       Priority   Status
------------------------------------------------------------
* 0            /usr/lib/jvm/java-6-openjdk/jre/bin/java    1061      auto mode
  1            /usr/lib/jvm/java-6-openjdk/jre/bin/java    1061      manual mode
  2            /usr/lib/jvm/java-6-sun/jre/bin/java        63        manual mode

Press enter to keep the current choice[*], or type selection number: 2
update-alternatives: using /usr/lib/jvm/java-6-sun/jre/bin/java to provide /usr/bin/java (java) in manual mode.
root@caleb:~# java -version
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
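With more than a couple of nodes, checking the selected JVM by eye gets tedious. Here is a rough sketch of how the check could be scripted. jvm_vendor is a name I have invented for illustration, and it simply looks for the "OpenJDK" and "HotSpot" strings seen in the two outputs above, so treat its answer as a hint rather than proof.

```shell
#!/bin/sh
# jvm_vendor is a hypothetical helper: it classifies `java -version` output
# read from stdin as "openjdk", "sun" or "unknown".
jvm_vendor() {
    version_text=$(cat)
    case $version_text in
        *OpenJDK*) echo openjdk ;;   # e.g. "OpenJDK 64-Bit Server VM"
        *HotSpot*) echo sun ;;       # e.g. "Java HotSpot(TM) 64-Bit Server VM"
        *) echo unknown ;;
    esac
}

# Usage (java prints its version on stderr, hence the 2>&1):
#   java -version 2>&1 | jvm_vendor
# To switch without the interactive menu, as root, using the path shown above:
#   update-alternatives --set java /usr/lib/jvm/java-6-sun/jre/bin/java
```

The --set form of update-alternatives does the same job as choosing selection 2 in the menu, which makes it easier to repeat across nodes.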
Now that we have these prerequisites satisfied, we can install the relevant gridengine packages. Installing gridengine on Ubuntu systems is made simple by the packages. We can install the packages on the master node (in our case KWIAT22):
apt-get install gridengine-client gridengine-qmon gridengine-exec gridengine-master
Configure SGE automatically? Yes
SGE cell name: default
SGE master hostname: KWIAT22 (this should be the fully qualified domain name of the SGE master, not localhost)
Output will typically look something like this:
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
  gridengine-common
The following NEW packages will be installed:
  gridengine-client gridengine-common gridengine-exec gridengine-master gridengine-qmon
0 upgraded, 5 newly installed, 0 to remove and 37 not upgraded.
Need to get 0 B/18.7 MB of archives.
After this operation, 44.8 MB of additional disk space will be used.
Do you want to continue [Y/n]?
Preconfiguring packages ...
Selecting previously deselected package gridengine-common.
(Reading database ... 372804 files and directories currently installed.)
Unpacking gridengine-common (from .../gridengine-common_6.2u5-1ubuntu1_all.deb) ...
Selecting previously deselected package gridengine-client.
Unpacking gridengine-client (from .../gridengine-client_6.2u5-1ubuntu1_amd64.deb) ...
Selecting previously deselected package gridengine-exec.
Unpacking gridengine-exec (from .../gridengine-exec_6.2u5-1ubuntu1_amd64.deb) ...
Selecting previously deselected package gridengine-master.
Unpacking gridengine-master (from .../gridengine-master_6.2u5-1ubuntu1_amd64.deb) ...
Selecting previously deselected package gridengine-qmon.
Unpacking gridengine-qmon (from .../gridengine-qmon_6.2u5-1ubuntu1_amd64.deb) ...
Processing triggers for man-db ...
Processing triggers for ureadahead ...
Setting up gridengine-common (6.2u5-1ubuntu1) ...
Creating config file /etc/default/gridengine with new version
Setting up gridengine-client (6.2u5-1ubuntu1) ...
Setting up gridengine-exec (6.2u5-1ubuntu1) ...
error: communication error for "KWIAT22/execd/1" running on port 6445: "can't bind socket"
error: commlib error: can't bind socket (no additional information available)
..........................
critical error: abort qmaster registration due to communication errors
daemonize error: child exited before sending daemonize state
Setting up gridengine-master (6.2u5-1ubuntu1) ...
Initializing cluster with the following parameters:
 => SGE_ROOT: /var/lib/gridengine
 => SGE_CELL: default
 => Spool directory: /var/spool/gridengine/spooldb
 => Initial manager user: sgeadmin
Initializing spool (/var/spool/gridengine/spooldb)
Initializing global configuration based on /usr/share/gridengine/default-configuration
Initializing complexes based on /usr/share/gridengine/centry
Initializing usersets based on /usr/share/gridengine/usersets
Adding user sgeadmin as a manager
Cluster creation complete
Setting up gridengine-qmon (6.2u5-1ubuntu1) ...
Note that the execd cannot bind the socket. This occurs because of a left-over execd that failed to stop from a previous install. It also results if you don’t have java installed, as the execd won’t respond to /etc/init.d/gridengine-exec stop without java. Also, if you’re doing an apt-get purge gridengine-* to get back to a fresh slate, typically the execd will not be stopped properly, despite being removed from the system. This can be fixed by:
root@KWIAT22:~# ps aux |grep sge
sgeadmin 22244 0.0 0.0 135172 4940 ? Sl 17:42 0:00 /usr/lib/gridengine/sge_qmaster
sgeadmin 24272 0.0 0.0 58688 2500 ? Sl May16 0:22 /usr/lib/gridengine/sge_execd
root@KWIAT22:~# kill 24272
root@KWIAT22:~# /etc/init.d/gridengine-exec start
root@KWIAT22:~# /etc/init.d/gridengine-master restart
 * Restarting Sun Grid Engine Master Scheduler sge_qmaster
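Picking the stale execd out of the ps listing by hand works on one machine, but it can be scripted for several. This is a small sketch only: execd_pids is a made-up helper name, and it assumes the standard `ps aux` column layout shown above, where the PID is the second field and the command is the eleventh.

```shell
#!/bin/sh
# execd_pids is a hypothetical helper: it reads `ps aux` output on stdin and
# prints the PID of every process whose command is sge_execd.
execd_pids() {
    # Field 2 is the PID, field 11 is the command (with or without a path).
    awk '$11 ~ /(^|\/)sge_execd$/ { print $2 }'
}

# Usage, as root on the affected node:
#   for pid in $(ps aux | execd_pids); do kill "$pid"; done
#   /etc/init.d/gridengine-exec start
```

Matching on the command field rather than grepping for "sge" means the qmaster and the grep process itself are never picked up by mistake.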
The logfiles we can use for tracking down problems in communication between the qmaster and execd processes are not in the standard Debian/Ubuntu locations. Instead, they are stored in /var/spool/gridengine/qmaster/messages for the qmaster, and in /tmp/execd_messages.[pid] or /var/spool/gridengine/execd/messages for the execd processes. The log messages for our previous socket problem look like this (/tmp/execd_messages.24107):
05/16/2011 20:17:16| main|KWIAT22|E|communication error for "KWIAT22/execd/1" running on port 6445: "can't bind socket"
05/16/2011 20:17:17| main|KWIAT22|E|commlib error: can't bind socket (no additional information available)
05/16/2011 20:17:45| main|KWIAT22|C|abort qmaster registration due to communication errors
05/16/2011 20:17:47| main|KWIAT22|W|daemonize error: child exited before sending daemonize state
If you see any lines containing |E| then you have an error that must be addressed. Any lines with |W| are warnings, and it’s probably wise to fix those too.
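Since the severity letter always sits in the same pipe-delimited position, a messages file can be summarised in one pass. The sketch below assumes the line layout shown in the sample above (date and time, thread, host, level, text). sge_log_summary is a name of my own invention, and I have counted |C| lines too, since the install output labels that message a critical error.

```shell
#!/bin/sh
# sge_log_summary is a hypothetical helper: it counts the lines of each
# severity in the gridengine messages file given as its first argument.
sge_log_summary() {
    awk -F'|' '
        # Splitting on "|" puts the severity letter in field 4.
        $4 == "C" { critical++ }
        $4 == "E" { errors++ }
        $4 == "W" { warnings++ }
        END {
            printf "critical=%d errors=%d warnings=%d\n", critical, errors, warnings
        }
    ' "$1"
}

# Usage:
#   sge_log_summary /var/spool/gridengine/execd/messages
```

A file with no problems reports zero in all three counters.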
On the exec nodes:
apt-get install gridengine-exec
Configure SGE automatically? yes
SGE cell name: default
SGE master hostname: KWIAT22
After installing, you will see the following error in the /tmp/execd_messages.[pid] file and the process will exit:
05/18/2011 17:53:00| main|caleb|E|getting configuration: denied: host "caleb" is neither submit nor admin host
05/18/2011 17:53:05| main|caleb|C|can't get configuration qmaster - terminating
This occurs because the master doesn’t yet know about the exec node. We need to set up a basic configuration on the master. We will use the documentation in /usr/share/doc/gridengine-common/README.Debian, which I will duplicate here, to form the basis of our configuration:
Once you've installed SGE, you'll need to do at least some minimal cluster configuration.

Quickstart
==========

* Install gridengine-master, gridengine-exec and gridengine-client on the appropriate hosts.
* Initially, only the sgeadmin user has admin privileges
* It is suggested that you add yourself as a manager and perform the rest of these tasks as your own user:
  + sudo -u sgeadmin qconf -am myuser
* and to a userlist:
  + qconf -au myuser users
* Add a submission host:
  + qconf -as myhost.mydomain
* Add an execution host:
  + qconf -ae
  You will now be prompted for information about the execution host.
* Add a new host group:
  + qconf -ahgrp @allhosts
* Add the exec host to the @allhosts list:
  + qconf -aattr hostgroup hostlist myhost.mydomain @allhosts
* Add a queue:
  + qconf -aq main.q
* Add the host group to the queue:
  + qconf -aattr queue hostlist @allhosts main.q
* Make sure there is a slot allocated to the execd:
  + qconf -aattr queue slots "[myhost.mydomain=1]" main.q
* Running qstat -f should then show you the execd waiting for jobs
The commands that I ran in my example:
sudo su
sudo -u sgeadmin qconf -am rwh
exit
qconf -au rwh users
qconf -as KWIAT22
qconf -ahgrp @allhosts # just save the file without modifying it
qconf -aattr hostgroup hostlist KWIAT22 @allhosts
qconf -aq main.q # just save the file without modifying it
qconf -aattr queue hostlist @allhosts main.q
qconf -aattr queue slots "4, [KWIAT22=3]" main.q # 4 by default for all nodes, 3 specifically for KWIAT22, which leaves 1 of the 4 cpus free for the master process
We then add caleb as a submit and exec host, and add it to the @allhosts group:
qconf -as caleb
qconf -ae # change the hostname entry to caleb
qconf -aattr hostgroup hostlist caleb @allhosts
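Each further exec node needs the same three steps, with the new node's own name going into the host group. To avoid typing mistakes when adding several nodes, something like the following dry-run helper could print the commands for review. add_node_commands is a hypothetical name, and the script deliberately runs nothing itself.

```shell
#!/bin/sh
# add_node_commands is a hypothetical helper: it only prints the qconf
# commands for a new node, so they can be reviewed before being run by hand.
# Usage: add_node_commands NODE [HOSTGROUP]   (HOSTGROUP defaults to @allhosts)
add_node_commands() {
    node=$1
    hostgroup=${2:-@allhosts}
    echo "qconf -as $node"
    echo "qconf -ae    # set the hostname entry to $node in the editor"
    echo "qconf -aattr hostgroup hostlist $node $hostgroup"
}

# Usage:
#   add_node_commands caleb
```

I kept it as a printer rather than an executor because qconf -ae opens an editor, which does not lend itself to unattended use.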
Once this is done, we need to start the execd on caleb, using the same init script as on the master (/etc/init.d/gridengine-exec start).
Check that it doesn’t create a log file in /tmp/execd_messages.[pid]. If it doesn’t then it’s happy! Back on our master node, a qstat -f should now show us all set up. You can use the GUI qmon tool to get a better look at the setup. To use qmon, you must ssh to the master node with X11 forwarding enabled:
ssh -X hostname qmon
Click the queue control button and then the Hosts tab. If the exec nodes are communicating properly with the master, you should see them listed there, and they should NOT have dashes for the information columns. If a node does show dashes, it’s not communicating correctly, and you’ll need to go look in the log files for the reason. Note that if java is not installed, the communication between nodes will not work, and this may or may not show up in the log files.
Next, we need to set up a parallel environment. This will allow gridengine to start processes on the remote exec nodes. We can do this with qmon, though it’s also possible with the CLI tools. In qmon, click the bottom left button, Parallel Environment Configuration. Click Add. In our example, we’re setting up the simplest form of parallel environment, which doesn’t include any message passing functionality. Set the Name to simple_pe. In our case, we have two 4-core machines, with one core reserved for the master process, so we set Slots to 7. Leave the rest at their default values, click OK, then click Done. Now click the Cluster Queues button (top row, second from the left). On the Cluster Queues tab, click main.q, then click Modify. Click the Parallel Environment tab, then click simple_pe and move it over to the Referenced PEs box. Click OK, and Done.
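For those who prefer the CLI route, here is a rough sketch of the equivalent. write_pe_config is a helper name I have made up, and the field values other than the name and slot count are what I believe the 6.2 defaults to be, so compare the generated file with the output of qconf -sp on your own cluster before relying on it.

```shell
#!/bin/sh
# write_pe_config is a hypothetical helper: it writes a parallel environment
# definition to a file. Values other than the name and slot count are ASSUMED
# defaults; check them against `qconf -sp <name>` on a real cluster.
# Usage: write_pe_config NAME SLOTS FILE
write_pe_config() {
    pe_name=$1
    slot_count=$2
    out_file=$3
    # \$pe_slots below is a literal gridengine keyword, not a shell variable.
    cat > "$out_file" <<EOF
pe_name            $pe_name
slots              $slot_count
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    \$pe_slots
control_slaves     FALSE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary FALSE
EOF
}

# Usage, on the master as a gridengine manager:
#   write_pe_config simple_pe 7 /tmp/simple_pe.conf
#   qconf -Ap /tmp/simple_pe.conf                  # add the PE from the file
#   qconf -aattr queue pe_list simple_pe main.q    # reference it from main.q
```

The last two commands correspond to the Add and Modify steps in qmon.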
Lastly, you need to set up passwordless ssh access from the master node to the exec nodes for the users of the gridengine system. This is left as an exercise for the reader, but you might start with learning about OpenSSH key management.
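As a starting point for that exercise, here is a minimal sketch using the standard OpenSSH tools. make_passwordless_key is a name I have invented, and the key type and file location are just common choices rather than anything gridengine requires. Bear in mind that a key with an empty passphrase is only as safe as the file it lives in.

```shell
#!/bin/sh
# make_passwordless_key is a hypothetical helper: it creates an RSA key pair
# with an empty passphrase, unless the key file already exists.
# Usage: make_passwordless_key KEYFILE
make_passwordless_key() {
    key_file=$1
    if [ -f "$key_file" ]; then
        echo "key already exists: $key_file"
        return 0
    fi
    mkdir -p "$(dirname "$key_file")"
    # -N "" sets an empty passphrase, -q keeps ssh-keygen quiet.
    ssh-keygen -q -t rsa -N "" -f "$key_file"
}

# Usage, as each gridengine user on the master:
#   make_passwordless_key "$HOME/.ssh/id_rsa"
#   ssh-copy-id caleb    # repeat for every exec node
#   ssh caleb true       # should return without asking for a password
```

The helper refuses to overwrite an existing key, so it is safe to run for users who already have one.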