Majorana upgrade

Over the last few days, the cluster has been upgraded. As usual, please report any issue, problem, or missing documentation that you encounter to cluster_admins@iridia.etc.

List of the changes:
* the OS is Rocks 6.2, based on CentOS 6.2
* a new rack (number 5) has been added. It is composed of 4 computational nodes (c5-0 to c5-3), each with 24 cores, for a total of 96 new slots added to the cluster. Each of these new cores is roughly 3 times faster than the cores in rack4. The new nodes also have more memory, so each job running on a rack5 node can use up to 2.4GB of memory. To see how to use the new nodes, take a look at cluster composition and how to submit a job. Enjoy responsibly.
* due to power constraints, we had to remove 8 nodes from rack1, which now has 32 cores available on nodes 7-12, 14, and 15.


Majorana has been upgraded

Over the past few days we performed a complete upgrade of our beloved cluster.
This was necessary because our installation was old and suffered from many
errors (mainly the recurrent problems with the filesystem).
Most of the cluster is back online (with the major exception of the c1
nodes and some c3 nodes). Of course, since everything is new, there
might be some problems to iron out in these first days. Let us know if
you run into any.

Here is a list of the updates:

* We switched from 32 to 64 bits.
Remember to recompile everything: old (32-bit) executables will not work.
Recompiling everything includes deleting every file produced during
the compilation process, otherwise you may still be using some old ones. In
other words, remember to “make clean” before “make”. For ARGoS you need
to execute “build_simulation_framework.sh clean”.

* The Linux OS is now Rocks Cluster 6.0, based on CentOS 6.3, in case you
need to mention it in a paper. You may read online that Rocks
6.0 is based on CentOS 6.2, but we upgraded our system further.

* The installed packages have been cleaned up a bit. We took the opportunity
not to reinstall some packages that we believed were no longer
required: we prefer to keep the system as clean as possible!
If you miss something that you really need, see this page:
http://majorana.ulb.ac.be/wordpress/missing-software/
Remember that you can see the list of all packages installed (on the
compute nodes) here:
http://majorana.ulb.ac.be/wordpress/installed-software
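
Regarding the switch to 64 bits above, one possible way to spot stale 32-bit binaries or object files left in a build tree is to read the ELF header directly. This is only a sketch: the function name elf_class is hypothetical, and it relies on the ELF convention that the fifth header byte is 1 for 32-bit files and 2 for 64-bit files.

```shell
#!/bin/sh
# elf_class FILE
# Hypothetical helper: prints 32 or 64 for an ELF file, "other" otherwise.
elf_class() {
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d '[:space:]')
    if [ "$magic" != "7f454c46" ]; then
        echo other
        return 0
    fi
    # The fifth byte (EI_CLASS) is 1 for 32-bit and 2 for 64-bit files.
    class=$(head -c 5 "$1" | tail -c 1 | od -An -tu1 | tr -d '[:space:]')
    case "$class" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo other ;;
    esac
}

# Example use (commented out): list 32-bit leftovers under the current directory.
# find . -type f | while read -r f; do
#     [ "$(elf_class "$f")" = 32 ] && echo "$f"
# done
```

Any file reported as 32-bit after a rebuild is a sign that “make clean” was skipped for that part of the tree.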

Do not hesitate to ask if you have any further questions.
Expect some (small) issues at the beginning; anything else would be surprising!
Ah! And we managed to keep all your data. :-)

Have a nice week,
the cluster team.


New c4 nodes

The cluster has grown!
We removed the c0 nodes (and thus the hi_mem queue), which were the oldest ones in the cluster.
We installed a new rack (rack 4). The sixteen new nodes (c4-0 to c4-15) in this rack feature 32 cores each. Additionally, a gigabyte of memory is available to EVERY job that runs there, that is, as much as with the previous hi_mem queue.

This adds 512 new cores to our cluster, now composed of almost
1100 CPU cores. Enjoy! :)


New switch, c3 nodes and new hi_mem queue

I’m happy to announce that the cluster is up and running. Over the past two
days we installed the new switch, which should fix most of the problems
we were encountering in the past. The new switch also allows us to
have all nodes up and running.

New things:

– Most of the new c3 nodes are up and running. This means 256 new
slots for the users. Some of them are currently disabled as we have a
small problem with the hardware. You can explicitly submit to the c3
nodes using “#$ -l opteron6128”.

– Since we have more nodes, we have introduced a new type of queue, the
high memory queue. The hi_mem queue runs on the c0 nodes and
allows you to run programs that use more than 450MB of memory. The
memory limit for this queue is 960MB (soft) / 980MB (hard). There are 32
slots available for the hi_mem queue. You can submit jobs to this
queue using “#$ -l hi_mem” in your scripts. The short and long queues
on the c0 nodes are now disabled. Note that if you don’t specify any queue
(that is, you don’t put any -l in your scripts) the jobs will run in
the first available slot.
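
As a rough illustration of the two resource requests above, the following sketch prints a minimal job script for a given resource. Only the “#$ -l opteron6128” and “#$ -l hi_mem” lines come from this announcement. The function name, the job name and the program being run are hypothetical, and -N and -cwd are standard Grid Engine options that are assumed (not confirmed here) to be accepted on this cluster.

```shell
#!/bin/sh
# write_job_script RESOURCE COMMAND [ARGS...]
# Hypothetical helper: prints a job script that requests RESOURCE
# (for example hi_mem or opteron6128) and then runs COMMAND.
write_job_script() {
    resource=$1
    shift
    printf '#!/bin/bash\n'
    printf '#$ -N example_job\n'      # job name (placeholder)
    printf '#$ -cwd\n'                # run in the submission directory
    printf '#$ -l %s\n' "$resource"   # resource request from the post
    printf '%s\n' "$*"                # the command line of the job
}

# Example use (commented out; my_program is a placeholder):
# write_job_script hi_mem ./my_program > job.sh
# qsub job.sh
```

Leaving out the -l line altogether corresponds to the default behaviour described above, where the job runs in the first available slot.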


New cluster administrators

Hi all,

there are two new cluster administrators – finally I am relieved of this
duty (after 3 years, yay!). The new main administrator is Jeremie, with
Manuele serving as standby in case of Jeremie’s absence.

As there are multiple administrators now, please direct all future
communication to cluster_admins@iridia.ulb.ac.be

Thank you
Arne


Proper usage of /tmp

  • Please include something like “rm -rf $TMPDIR” as the last line of your
    scripts to make sure that no data is left on the node after your job
    has terminated. For further info, see: How to submit a job
  • If your jobs fail, you can use the script ‘clean_temp_dirs.sh’ on the
    submit host to clean out any remaining data on the nodes. Please use it,
    but only when you have a lot of jobs that failed.
  • Please don’t write gigabytes of data. There’s 130GB of free space on the
    c2-x nodes; if you manage to fill this up, you did something wrong.
  • Please be considerate. If you fuck up the cluster, you block all
    other users and, with deadlines as close as they are now, tempers rise
    easily.
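
The “rm -rf $TMPDIR” line from the first point is skipped if the script aborts before reaching it. One possible variation, offered as a sketch and not as the administrators’ recommended method, is to register the cleanup with an EXIT trap so that it also runs when the job command fails. The function names are hypothetical, and the refusal to delete “/” or “/tmp” is an added precaution for the case where the script is run outside the scheduler.

```shell
#!/bin/sh
# cleanup_tmpdir: remove the per-job scratch directory named by $TMPDIR.
cleanup_tmpdir() {
    case "${TMPDIR:-}" in
        ""|/|/tmp|/tmp/) return 0 ;;   # never wipe a shared directory
    esac
    rm -rf "$TMPDIR"
}

# run_with_cleanup COMMAND [ARGS...]
# Runs COMMAND in a subshell; the EXIT trap removes $TMPDIR whether
# COMMAND succeeds or fails, and the exit status of COMMAND is kept.
run_with_cleanup() {
    (
        trap cleanup_tmpdir EXIT
        "$@"
        exit $?
    )
}

# Example use inside a job script (commented out; my_program is a placeholder):
# run_with_cleanup ./my_program --output "$TMPDIR/results.txt"
```

Results that should survive the job still have to be copied out of $TMPDIR by the command itself, since the directory is gone once run_with_cleanup returns.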


GCC 4.4 and ARGoS on the IRIDIA cluster

****************************************
EDIT: With the new installation of the cluster, gcc44 is the default compiler, so you don’t need this anymore. In fact, leaving these settings in place may prevent you from compiling. Remove them if you run into cmake problems (such as: CMake Error: Error required internal CMake variable not set, cmake may be not be built correctly. Missing variable is: CMAKE_C_COMPILER)
****************************************

OUTDATED! READ ABOVE!!!
On the IRIDIA cluster, GCC 4.4 is available under the commands gcc44/g++44.

In order to use this with CMake and ARGoS2, you can either set the appropriate environment variables or simply add the lines


set(CMAKE_C_COMPILER "gcc44")
set(CMAKE_CXX_COMPILER "g++44")

before the version check in the main CMakeLists.txt of *each* of your packages.

On a side note, I fixed the missing packages for ARGoS 1, so that old experiments can again be compiled.
OUTDATED! READ ABOVE!!!


Cluster issues

The cluster is still experiencing technical difficulties. Most probably this is connected to a faulty/overloaded switch, but I am not sure at the moment.

Please expect interruptions or other problems; the cluster should be considered unstable until further notice. Sorry for the inconvenience.


New packages

New software packages!

  • bsdiff-4.3-3
  • google-perftools-1.6-1
  • google-perftools-devel-1.6-1
  • gsl-1.14-1
  • gsl-devel-1.14-1
  • mercurial-1.3.1-3
  • openjpeg-devel-1.3-6
  • openjpeg-libs-1.3-6
