Current outage of majorana

Hello everybody,

Between the planned switch of majorana and the unplanned emergency shutdown last Sunday, the cluster will remain unavailable until (probably) Monday, 28th March. I apologize for the inconvenience. In the meantime, we will activate log-in access to oldmajorana and newmajorana. This should allow you to retrieve experiments and data. There will be no access to submit experiments. The cluster will be put back into maintenance mode on Friday, 25th March.
Please also use this access to set a new password on newmajorana, by following the instructions below. Starting from Monday, 28th March, you will need this password to log into majorana. Be aware, however, that you only have until Friday, 25th March, to change your password. If you do not have an account on newmajorana yet or encounter any problems, please contact us at

How to change your password on newmajorana:
1) Log into majorana: $ ssh
2) Continue to newmajorana: $ ssh newmajorana
3) Check your current (random) password: $ cat /var/mail/username
4) Change your password (by following the on screen instructions): $ passwd


Major software and hardware upgrade

Hello everybody,

As some of you may have noticed, the software running on majorana is quite old
(we are still using kernel 2.6!).  We decided to postpone the update for multiple
reasons, such as hardware compatibility, filesystem upgradability, the fact that
SGE (the current scheduler) is no longer supported, and many more.  One of the
objectives of this hardware upgrade is therefore to perform a complete update of the
operating system and the distributed filesystem, and to move the cluster to a
different (and, most importantly, actively supported) job scheduler.  Before you
start panicking, we want to stress that, for the moment, nothing will change.
The move to the new system will be gradual, and even when it is complete, the old
filesystem (and data) will remain available read-only.

The new system, for now called newmajorana, is based on Springdale Linux 8,
a distribution compatible with RHEL 8 and
developed for use in academic environments.  The new version of
Lustre, the distributed filesystem, is not compatible with the old version
currently in use on majorana, so we had to build a completely new filesystem.
It will start with the same capacity as the old one (so you will have the
same quotas).  Finally, we chose SLURM as the job scheduler, as it is one of the
most used job schedulers in HPC clusters (it runs on about 60% of the TOP500
systems).

At the moment, newmajorana is composed of 3 new servers and 12 new computing
nodes, each with two AMD EPYC Rome CPUs of 32 cores, for a total of
768 cores.  The update is significant and adapting to the new software will
take time, so newmajorana will for now be in an “open beta” stage, during which we will
try to solve software issues and tune the configuration of the scheduler.
The best way to find problems is to run experiments, so if you want to
participate in the testing, send us a request.  Beta means that you may
encounter compilation and execution issues, some software may be missing, the
scheduler configuration may change, and computing nodes may be rebooted.  The
advantage of participating is two-fold: first, you will have access to 768
super-fast cores; second, once the switch to the new software is
complete, you can be sure that your experiments will work without issues.
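For those joining the beta, a SLURM job is typically submitted with sbatch. A minimal sketch — all resource values below are illustrative assumptions, not newmajorana's actual partition names or limits:

```shell
#!/bin/bash
# Hypothetical SLURM batch script for the newmajorana beta.
# Every value here is an assumption, not the cluster's real configuration.
#SBATCH --job-name=beta-test   # name shown in the queue listing
#SBATCH --ntasks=1             # a single task
#SBATCH --cpus-per-task=1      # on a single core
#SBATCH --time=00:10:00        # ten-minute wall-clock limit

echo "running on $(hostname)"
```

Such a script would be submitted with `sbatch script.sh`, after which `squeue` shows its state in the queue.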

Once your account has been activated, newmajorana can be accessed by typing
ssh newmajorana from the submit node.
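Since newmajorana is only reachable through the submit node, an OpenSSH ProxyJump entry can save you the manual hop. A sketch of an ~/.ssh/config entry — the host aliases are assumptions; adapt them to whatever names you already use for the submit node:

```
# ~/.ssh/config -- hypothetical host aliases
Host newmajorana
    HostName newmajorana
    ProxyJump majorana    # hop through the submit node automatically
```

With this in place, a plain `ssh newmajorana` from your own machine goes through the submit node transparently.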


Majorana upgrade

Over the last few days, the cluster has been updated. As usual, report any issue, problem, or lack of documentation that you encounter to cluster_admins@iridia.etc.

List of the changes:
* the OS is Rocks 6.2, based on CentOS 6.2
* a new rack (number 5) has been added. It is composed of 4 computational nodes (from c5-0 to c5-3), each with 24 cores, for a total of 96 new slots added to the cluster. Each of these new cores is roughly 3 times faster than those in rack4. The new nodes also have more memory, so each job running on a node of rack5 can use up to 2.4GB of memory. To see how to use the new nodes, take a look at cluster composition and how to submit a job. Enjoy responsibly.
* due to power considerations, we had to remove 8 nodes from rack1, which now has 32 cores available in nodes 7-12, 14 and 15.


Majorana has been upgraded

Over the past few days we performed a complete upgrade of our beloved cluster.
This was necessary, as our installation was old and many errors were
present (mainly the recurrent problems with the filesystem).
Most of the cluster is back online (with the major exception of c1
nodes and some c3 nodes). Of course, since everything is new, there
might be some problems to iron out in these first days. Let us know if
you have problems.

Here is a list of the updates:

* We switched from 32 to 64 bits.
Remember to recompile everything; old (32-bit) executables will not work.
Recompiling everything includes deleting every file produced during
the compilation process, otherwise you may still use some old ones. In
other words, remember to “make clean” before “make”. In ARGoS you need
to execute “ clean”

* The Linux OS is now Rocks Cluster 6.0, based on CentOS 6.3, in case you
need to mention it in a paper. You may read on the internet that Rocks
6.0 is based on CentOS 6.2, but we upgraded our system further.

* The installed packages have been cleaned up a bit. We took the occasion
not to re-install some packages that we believed were no longer
required: we prefer to keep the system as clean as possible!
If you miss something that you really need, see this page:
Remember that you can see the list of all packages installed (on the
compute nodes) here:

Do not hesitate to ask if you have any further questions.
Expect some (small) issues at the beginning; the contrary would be surprising!
Ah! And we managed to keep all your data. :-)

Have a nice week,
the cluster team.


New c4 nodes

The cluster has grown!
We removed the c0 nodes (and thus the hi_mem queue), which were the oldest ones in the cluster.
We installed a new rack (rack 4). The sixteen new nodes (c4-0 to c4-15) in this rack feature 32 cores each. Additionally, a gigabyte of memory is available to ALL the jobs that run there, that is, as much as the previous hi_mem queue offered.

This adds 512 new cores to our cluster, now composed of almost
1100 CPU cores. Enjoy! :)


New switch, c3 nodes and new hi_mem queue

I’m happy to announce that the cluster is up and running. In these two
days we installed the new switch which should fix most of the problems
we were encountering in the past. The new switch also allows us to
have all nodes up and running.

New things:

– Most of the new c3 nodes are up and running. This means 256 new
slots for the users. Some of them are currently disabled, as we have a
small problem with the hardware. You can explicitly submit on the c3
nodes using “#$ -l opteron6128”.

– Since we now have more nodes, we have introduced a new type of queue: the
high-memory queue. The hi_mem queue runs on the c0 nodes and
allows you to run programs that use more than 450MB of memory. The
memory limit for this queue is 960MB (soft) / 980MB (hard). There are 32
slots available in the hi_mem queue. You can submit jobs to this
queue using “#$ -l hi_mem” in your scripts. The short and long queues
on the c0 nodes are now disabled.  Note that if you don’t specify any queue
(that is, you don’t put any -l option in your scripts), the jobs will run in
the first available slot.
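The queue selectors above go into the job script itself as SGE comment directives. A hedged sketch — the job body is a placeholder, only the -l flags come from this post:

```shell
#!/bin/bash
# Sketch of an SGE job script using the queue selectors from this post.
#$ -l hi_mem            # request the high-memory queue on the c0 nodes
# (or use:  #$ -l opteron6128   to target the c3 nodes instead)
#$ -cwd                 # run from the directory the job was submitted from

echo "job running on $(hostname)"
```

Submitted as usual with `qsub script.sh`; since the `#$` lines are shell comments, they are ignored when the script runs and only read by the scheduler.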


New cluster administrators

Hi all,

there are two new cluster administrators – finally I am relieved of this
duty (after 3 years, yay!). The new main administrator is Jeremie, with
Manuele serving as standby in case of Jeremie’s absence.

As there are multiple administrators now, please direct all future
communication to

Thank you


Proper usage of /tmp

  • Please include as the last line of your scripts something like “rm -rf
    $TMPDIR” in order to make sure that no data is left on the node
    after your job has terminated. See the link below for further info: How to submit a job
  • If your jobs fail, you can use the script ‘’ on the
    submit host to clean out any remaining data on the nodes. Please use it,
    but only when a lot of your jobs have failed.
  • Please don’t write gigabytes of data. There is 130GB of free space on the
    c2-x nodes; if you managed to fill this up, you did something wrong.
  • Please be considerate. If you fuck up the cluster, you block all
    other users, and with deadlines near, as they are now, tempers rise.
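The cleanup advice in the first bullet can be made more robust with a trap, so the scratch directory is removed even if the job is killed partway through. A sketch — it uses a private mktemp directory as a stand-in for the scheduler-provided $TMPDIR, and the workload is a placeholder:

```shell
# Stand-in for $TMPDIR so the sketch is self-contained; on the cluster
# you would use the scheduler-provided scratch directory instead.
scratch=$(mktemp -d)
trap 'rm -rf "$scratch"' EXIT   # cleanup runs on normal exit and on kill

echo "intermediate data" > "$scratch/output.dat"   # placeholder workload
cp "$scratch/output.dat" ./result.dat              # copy results off the node
```

A trap on EXIT fires whether the script finishes normally or is terminated, which is exactly the failure mode the second bullet's cleanup script exists to mop up.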


GCC 4.4 and ARGoS on the IRIDIA cluster

EDIT: With the new installation of the cluster, gcc44 is the default compiler, so you don’t need this anymore. In fact, leaving these settings in place may prevent you from compiling. Remove them if you run into CMake problems (such as: CMake Error: Error required internal CMake variable not set, cmake may not be built correctly. Missing variable is: CMAKE_C_COMPILER)

On the IRIDIA cluster, GCC 4.4 is available under the commands gcc44/g++44.

In order to use it with CMake and ARGoS2, you can either set the appropriate environment variables or simply add

set(CMAKE_C_COMPILER "gcc44")

before the version check in the main CMakeLists.txt of *each* of your packages.
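Since ARGoS2 is C++, the same trick presumably needs the C++ compiler set as well. A sketch extending the line above — gcc44/g++44 are the command names given earlier in this post:

```cmake
# Set both compilers before the version check / project() call,
# otherwise CMake picks the system default toolchain.
set(CMAKE_C_COMPILER   "gcc44")
set(CMAKE_CXX_COMPILER "g++44")
```

The environment-variable route mentioned above (setting CC and CXX before the first cmake run) achieves the same thing without editing each package.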

On a side note, I fixed the missing packages for ARGoS 1, so that old experiments can again be compiled.
