Umbrella Maintenance 2026 Q1

16 Feb - 18 Feb 2026

The TU/e Umbrella HPC Cluster will undergo scheduled maintenance from Monday 16 February 2026, 09:00 CET to Wednesday 18 February 2026, 17:00 CET.

The entire cluster will be offline during this period. Please make sure your jobs finish before the maintenance starts, or that they can safely be interrupted and rerun.
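One common way to make a job safe to interrupt and rerun is to checkpoint its progress regularly and resume from the last checkpoint on restart. The following is a minimal sketch of that idea, not a prescribed method for Umbrella: the function names, the JSON checkpoint format, and the toy workload (summing integers) are all illustrative assumptions.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    try:
        with open(path) as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {"next_step": 0, "total": 0}


def save_checkpoint(path, state):
    """Write the state to a temp file, then rename it over the old checkpoint.

    os.replace is atomic, so a kill in the middle of a write leaves the
    previous checkpoint intact instead of a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        json.dump(state, fh)
        fh.flush()
        os.fsync(fh.fileno())
    os.replace(tmp_path, path)


def run(path, n_steps, stop_after=None):
    """Toy workload: sum 0..n_steps-1, checkpointing after every step.

    stop_after is a hypothetical knob that simulates the job being killed
    after that many steps in the current run; it returns None in that case.
    """
    state = load_checkpoint(path)
    done_this_run = 0
    while state["next_step"] < n_steps:
        if stop_after is not None and done_this_run >= stop_after:
            return None  # simulated interruption
        state["total"] += state["next_step"]
        state["next_step"] += 1
        save_checkpoint(path, state)
        done_this_run += 1
    return state["total"]
```

Resubmitting the same job after the maintenance then continues from the last saved step rather than starting over. In a real job the checkpoint interval would be chosen so that writing the state does not dominate the run time.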

All jobs still running on Monday 16 February 2026 at 09:00 CET will be cancelled!
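To check whether a job can still finish before the cut-off, it can help to compute the time left until the maintenance starts and use that as an upper bound for the job's time limit. The sketch below is an illustration under assumptions: the helper name `max_time_limit` and the 30-minute safety margin are hypothetical, and the result is formatted in the `days-hours:minutes:seconds` form that Slurm accepts for `--time`.

```python
from datetime import datetime, timedelta, timezone

# CET is UTC+1; the start time is taken from the announcement.
CET = timezone(timedelta(hours=1), "CET")
MAINTENANCE_START = datetime(2026, 2, 16, 9, 0, tzinfo=CET)


def max_time_limit(now, margin=timedelta(minutes=30)):
    """Return the longest time limit (D-HH:MM:SS) that ends before maintenance.

    `margin` is an assumed safety buffer, not a cluster policy.
    Returns None when there is no usable time left.
    """
    remaining = MAINTENANCE_START - now - margin
    if remaining <= timedelta(0):
        return None
    total_seconds = int(remaining.total_seconds())
    days, rest = divmod(total_seconds, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"


if __name__ == "__main__":
    limit = max_time_limit(datetime.now(tz=CET))
    print(limit if limit else "Too close to the maintenance window")
```

The returned string could be passed to a submission as, for example, `sbatch --time=<value>`. Queue waiting time is not accounted for here, so a job that starts late may still run into the maintenance.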

No Backups!

There are no backups on the HPC cluster, so do not use it for archiving. You are responsible for your own data management!

Impact

Minor impact

Starting after the maintenance, the two login nodes will be updated and rebooted every two weeks. Long-running processes such as tmux, screen, and VS Code Server will be terminated on reboot and may need to be restarted.
The pam_slurm_adopt module will be enabled on compute nodes. SSH’ing into a compute node will work as it does now, but any process started through SSH will be associated with a Slurm job on that same node, and will be terminated when the job ends.
Tools such as top and ps will no longer show processes from other users.

Questions?

If you run into any issues after the maintenance window and would like assistance, please let us know. We can be reached by e-mail and through Teams.

Overview of changes

Starting after the maintenance, the two login nodes will be updated and rebooted monthly. This improves security, and will also keep the nodes "fresh": old temporary files and orphaned processes will be cleared, leaving more resources available for current users.
The pam_slurm_adopt module will be enabled on compute nodes. This ensures that users will only use their allotted CPU cores, GPUs, and memory, and cannot interfere with other users’ jobs.
Tools such as top and ps will no longer show processes from other users. This slightly improves security.
The latest updates and patches for Rocky Linux 8 will be installed.
Some software (including Slurm and rclone) will be updated.
Security fixes and firmware upgrades will be applied across all nodes and network switches, improving reliability and security.

Umbrella Maintenance 2025 Q3

25 Aug - 27 Aug 2025

The TU/e Umbrella HPC Cluster has scheduled downtime for maintenance from Monday 25 August 2025, 09:00 CEST to Wednesday 27 August 2025, 17:00 CEST. The cluster will be unavailable during this time. Please make sure that your jobs finish before the maintenance starts, or that they can be resumed after being killed or cancelled.

All jobs still running on Monday 25 August 2025 at 09:00 will be cancelled!

Umbrella Maintenance 2025 Q1

03 Mar - 05 Mar 2025

Umbrella Maintenance 2024 Q3

26 Aug - 28 Aug 2024

Umbrella Maintenance 2024 Q1

12 Feb - 14 Feb 2024

Umbrella Maintenance 2023 Q3

07 Aug - 09 Aug 2023


RELATED ITEMS