Extracting specific information from the results of your computational experiments can itself be a computationally demanding task involving a large amount of data. For instance, you may need to extract a particular value from a set of thousands of files, normalize these values, and aggregate them in some way in order to run statistical tests or plot them.
You must pay as much attention to this process as to the submission process.
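As an illustration of such an extraction step, here is a minimal sketch. It assumes that each result file is named *.out and contains one line of the form "energy = <number>"; the file layout, the key name "energy" and the function name summarize_results are hypothetical and only serve the example. It reduces a whole directory of result files to one short summary line.

```shell
#!/bin/sh
# Hypothetical helper: read every *.out file under a directory, pick the
# number from lines of the form "energy = <number>", and print a single
# summary line (count, mean, minimum, maximum).
summarize_results() {
    dir=$1
    # "-exec cat {} +" streams all matching files into ONE awk process,
    # so thousands of files still produce exactly one summary line.
    find "$dir" -type f -name '*.out' -exec cat {} + |
        LC_ALL=C awk '
            $1 == "energy" && $2 == "=" {
                v = $3 + 0                      # force numeric interpretation
                if (n == 0 || v < min) min = v
                if (n == 0 || v > max) max = v
                sum += v
                n++
            }
            END {
                if (n > 0)
                    printf "n=%d mean=%.3f min=%.3f max=%.3f\n", n, sum / n, min, max
                else
                    print "n=0"
            }'
}
```

Called as summarize_results results > summary.txt, this leaves one small text file, whatever the number of input files.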
For each small step towards the final aggregation, you must consider whether it should be performed on the cluster or on your own computer. A computation that takes a few minutes can easily be done on the submit node, while transferring gigabytes of data in one go can be far more problematic for the network. Consider compressing files if you REALLY have to transfer a large amount of data.
If you use rsync, DO NOT forget the “-z” flag for compression, and with scp, use “-C”.
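When a transfer really is unavoidable, one possible approach is to bundle the files into a single compressed archive on the cluster and copy only that archive. The sketch below assumes that tar with gzip support is available; the function name pack_results, the paths and the host name user@cluster.example.org are placeholders, not real values.

```shell
#!/bin/sh
# Hypothetical helper, to be run on the cluster: bundle one results
# directory into a single gzip-compressed archive.
pack_results() {
    src_dir=$1
    archive=$2
    # -C switches to the parent directory first, so the archive stores
    # paths relative to it instead of absolute paths.
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
}

# Afterwards, on your own computer (placeholder host and paths):
#   scp user@cluster.example.org:results.tar.gz .           # already compressed
#   scp -C user@cluster.example.org:results/summary.txt .   # -C compresses in transit
#   rsync -avz user@cluster.example.org:results/ results/   # -z compresses in transit
```

A single archive also avoids the per-file overhead of copying thousands of small files one by one.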
Any pointless action, such as transferring dozens of gigabytes of uncompressed files only to count the number of files or lines on your own computer, or to perform any other simple operation that could be done directly on the cluster, will effectively be treated as a denial-of-service attack on majorana, as it may critically slow down the network, penalize other users and endanger the stability of the whole system.
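Counting files or lines, for instance, needs no transfer at all. The following sketch defines two small helpers meant to be run on the cluster; the function names, the name pattern and the host name in the closing comment are illustrative assumptions.

```shell
#!/bin/sh
# Count the regular files under a directory
# (assumes file names do not contain newline characters).
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Count the total number of lines in all files whose name matches a pattern.
count_lines() {
    find "$1" -type f -name "$2" -exec cat {} + | wc -l | tr -d ' '
}

# From your own computer, only the resulting number crosses the network
# (placeholder host and path):
#   ssh user@cluster.example.org 'find results -type f | wc -l'
```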