Using Ant

Login

There is 1 login node:

Hostname   Node type
ant        Ant login node

Host key fingerprints:

Algorithm   Fingerprint (SHA256)
RSA         SHA256:JOg7saslfaqZdPVy8sTv2qoWy/cCFgTIvADhzj6cHfw
ECDSA       SHA256:/fY0bZIwZ6O6+5CWAvgL79+AoxMlOelhdb71ecskKfE
ED25519     SHA256:luxPt965f5utw+7WkTvs9fJwMu93+vAktFFQA0WJHI8

Building software

ESPResSo

Release 4.3:

# last update: March 2024
module load spack/default gcc/12.3.0 cuda/12.3.0 openmpi/4.1.6 \
            fftw/3.3.10 boost/1.83.0 cmake/3.27.9 python/3.12.1

git clone --recursive --branch python --origin upstream \
    https://github.com/espressomd/espresso.git espresso-4.3
cd espresso-4.3
python3 -m venv venv
source venv/bin/activate
python3 -m pip install -c "requirements.txt" numpy scipy vtk h5py setuptools cython==3.0.6
mkdir build
cd build
cp ../maintainer/configs/maxset.hpp myconfig.hpp
sed -i "/ADDITIONAL_CHECKS/d" myconfig.hpp
cmake .. -D CMAKE_BUILD_TYPE=Release -D ESPRESSO_BUILD_WITH_CCACHE=OFF \
    -D ESPRESSO_BUILD_WITH_CUDA=ON -D CMAKE_CUDA_ARCHITECTURES="86" \
    -D CUDAToolkit_ROOT="${CUDA_HOME}" \
    -D ESPRESSO_BUILD_WITH_WALBERLA=ON -D ESPRESSO_BUILD_WITH_WALBERLA_AVX=ON \
    -D ESPRESSO_BUILD_WITH_SCAFACOS=OFF -D ESPRESSO_BUILD_WITH_HDF5=OFF
make -j 64
SITE_PACKAGES=$(python3 -c 'import sysconfig;print(sysconfig.get_path("platlib"))')
echo $(realpath ./src/python) > "${SITE_PACKAGES}/espresso.pth"
deactivate
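The final echo writes a .pth file so the virtual environment picks up the in-tree build without an install step. A minimal sketch of how Python processes such a file, using a hypothetical throwaway directory rather than the actual build tree:

```python
import pathlib
import site
import sys
import tempfile

# Stand-in directories for the build tree and the venv's site-packages.
tmp = pathlib.Path(tempfile.mkdtemp())
build_python_dir = tmp / "build" / "src" / "python"
build_python_dir.mkdir(parents=True)
site_packages = tmp / "site-packages"
site_packages.mkdir()

# A .pth file contains one path per line; each line is appended to sys.path.
(site_packages / "espresso.pth").write_text(f"{build_python_dir}\n")

# The interpreter does this automatically at startup for the real
# site-packages directory; addsitedir() triggers the same .pth
# processing manually.
site.addsitedir(str(site_packages))
print(str(build_python_dir) in sys.path)  # True
```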

Loading software

With EESSI:

# last update: October 2024
[user@ant ~]$ source /cvmfs/software.eessi.io/versions/2023.06/init/bash
{EESSI 2023.06} [user@ant ~]$ module load ESPResSo/4.2.2-foss-2023b
{EESSI 2023.06} [user@ant ~]$ module load pyMBE/0.8.0-foss-2023b
{EESSI 2023.06} [user@ant ~]$ python3 -c "import pyMBE"
{EESSI 2023.06} [user@ant ~]$ python3 -c "import espressomd"

With Spack:

# last update: October 2024
module load spack/default
module load gcc/12.3.0 openmpi/4.1.6 cuda/12.3.0

Submitting jobs

Batch command:

sbatch --job-name="test" --nodes=1 --ntasks=4 --mem-per-cpu=2GB job.sh
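As a rough sanity check on the resource request (assuming the sbatch default of one CPU per task, since --cpus-per-task is not set):

```python
# Slurm reserves --mem-per-cpu for every allocated CPU, so this request
# amounts to 4 tasks x 1 CPU x 2 GB = 8 GB on the single node.
ntasks = 4
cpus_per_task = 1   # sbatch default when --cpus-per-task is omitted
mem_per_cpu_gb = 2
total_mem_gb = ntasks * cpus_per_task * mem_per_cpu_gb
print(total_mem_gb)  # 8
```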

Job script:

#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --output %j.stdout
#SBATCH --error  %j.stderr
module load spack/default gcc/12.3.0 cuda/12.3.0 openmpi/4.1.6 \
            fftw/3.3.10 boost/1.83.0 python/3.12.1
source espresso-4.3/venv/bin/activate
srun --cpu-bind=cores python3 espresso-4.3/testsuite/python/particle.py
deactivate

Benchmarks

Multi-GPU job

Run mpi4py on multiple nodes with one MPI task per GPU, making only one GPU visible to each task.

Environment:

python3 -m venv venv
. venv/bin/activate
pip install mpi4py pycuda "numpy<2"

Launcher (gpu_vis_wrapper):

#!/bin/bash
# Expose exactly one GPU to this task by mapping the task rank onto the
# node's GPUs. "$@" preserves the wrapped command's argument quoting.
CUDA_VISIBLE_DEVICES=$((SLURM_PROCID % SLURM_GPUS_ON_NODE)) exec "$@"
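The modulo mapping assigns ranks to devices round-robin within each node. A quick sketch of the arithmetic for a 4-rank job with 2 GPUs per node, with values chosen to mirror that layout rather than read from Slurm:

```python
# Mimic SLURM_PROCID % SLURM_GPUS_ON_NODE for 4 ranks, 2 GPUs per node.
# With block task distribution (ranks 0-1 on node one, 2-3 on node two),
# each rank lands on a distinct GPU of its node.
gpus_on_node = 2
mapping = {proc_id: proc_id % gpus_on_node for proc_id in range(4)}
print(mapping)  # {0: 0, 1: 1, 2: 0, 3: 1}
```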

Executor (list_cuda.py):

import os

from mpi4py import MPI
import pycuda.driver as cuda

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
cuda_visible_devices = os.environ.get("CUDA_VISIBLE_DEVICES", "")
host_name = os.environ.get("SLURMD_NODENAME", "")
print(f"{rank=} {host_name=} {cuda_visible_devices=}")

# Initialize the CUDA driver and enumerate the devices this task can see;
# with the wrapper above, each task should report exactly one device.
cuda.init()
device_count = cuda.Device.count()
print(f"{rank=} {host_name=} Number of CUDA devices available: {device_count}")
for i in range(device_count):
    device = cuda.Device(i)
    print(f"{rank=} {host_name=} Device {i}: {device.name()} "
          f"- Memory: {device.total_memory() // (1024**2)} MB")

Output:

$ srun --nodes=2 -J mpi4py --ntasks-per-node=2 --gres=gpu:2 --mem-per-cpu=100MB \
       --time=00:02:00 bash ./gpu_vis_wrapper python3 ./list_cuda.py
rank=0 host_name='compute02' cuda_visible_devices='0'
rank=0 host_name='compute02' Number of CUDA devices available: 1
rank=0 host_name='compute02' Device 0: NVIDIA L4 - Memory: 22478 MB
rank=1 host_name='compute02' cuda_visible_devices='1'
rank=1 host_name='compute02' Number of CUDA devices available: 1
rank=1 host_name='compute02' Device 0: NVIDIA L4 - Memory: 22478 MB
rank=2 host_name='compute03' cuda_visible_devices='0'
rank=2 host_name='compute03' Number of CUDA devices available: 1
rank=2 host_name='compute03' Device 0: NVIDIA L4 - Memory: 22478 MB
rank=3 host_name='compute03' cuda_visible_devices='1'
rank=3 host_name='compute03' Number of CUDA devices available: 1
rank=3 host_name='compute03' Device 0: NVIDIA L4 - Memory: 22478 MB