.. _Using bwUniCluster:

Using bwUniCluster
==================

.. _bwUniCluster login:

Login
-----

There are four login nodes and two gateway hostnames that redirect to
one of the login nodes in a load-balanced way:

============================ =======================================
Hostname                     Node type
============================ =======================================
``bwunicluster.scc.kit.edu`` login to one of the four login nodes
``uc2.scc.kit.edu``          login to one of the four login nodes
============================ =======================================

The login nodes can also be reached directly:

============================ =======================================
Hostname                     Node type
============================ =======================================
``uc2-login1.scc.kit.edu``   bwUniCluster 2.0 first login node
``uc2-login2.scc.kit.edu``   bwUniCluster 2.0 second login node
``uc2-login3.scc.kit.edu``   bwUniCluster 2.0 third login node
``uc2-login4.scc.kit.edu``   bwUniCluster 2.0 fourth login node
============================ =======================================

Host key fingerprints:

========= ========================================================
Algorithm Fingerprint (SHA256)
========= ========================================================
RSA       ``SHA256:p6Ion2YKZr5cnzf6L6DS1xGnIwnC1BhLbOEmDdp7FA0``
ECDSA     ``SHA256:k8l1JnfLf1y1Qi55IQmo11+/NZx06Rbze7akT5R7tE8``
ED25519   ``SHA256:yEe5nJ5hZZ1YbgieWr+phqRZKYbrV7zRe8OR3X03cn0``
========= ========================================================

More details can be found in the wiki page
`bwUniCluster2.0/Login <https://wiki.bwhpc.de/e/BwUniCluster2.0/Login>`__.

.. _bwUniCluster building dependencies:

Building dependencies
---------------------

Boost
^^^^^

.. code-block:: bash

   # last update: June 2023
   module load compiler/gnu/10.2 mpi/openmpi/4.1
   mkdir boost-build
   cd boost-build
   BOOST_VERSION=1.82.0
   BOOST_DOMAIN="https://boostorg.jfrog.io/artifactory/main"
   BOOST_ROOT="${HOME}/bin/boost_mpi_${BOOST_VERSION//./_}"
   mkdir -p "${BOOST_ROOT}"
   curl -sL "${BOOST_DOMAIN}/release/${BOOST_VERSION}/source/boost_${BOOST_VERSION//./_}.tar.bz2" | tar xj
   cd "boost_${BOOST_VERSION//./_}"
   echo 'using mpi ;' > tools/build/src/user-config.jam
   ./bootstrap.sh --with-libraries=filesystem,system,mpi,serialization,test
   ./b2 -j 4 install --prefix="${BOOST_ROOT}"

FFTW
^^^^

.. code-block:: bash

   # last update: June 2023
   module load compiler/gnu/10.2 mpi/openmpi/4.1
   mkdir fftw-build
   cd fftw-build
   FFTW3_VERSION=3.3.10
   FFTW3_ROOT="${HOME}/bin/fftw_${FFTW3_VERSION//./_}"
   curl -sL "https://www.fftw.org/fftw-${FFTW3_VERSION}.tar.gz" | tar xz
   cd "fftw-${FFTW3_VERSION}"
   ./configure --enable-shared --enable-mpi --enable-threads --enable-openmp \
       --disable-fortran --enable-avx --prefix="${FFTW3_ROOT}"
   make -j 4
   make install
   make clean

CUDA
^^^^

The CUDA driver library is only present on nodes with a GPU, so the stub
``libcuda.so`` shipped with the toolkit is symlinked into a local prefix
to allow linking CUDA code on the login nodes:

.. code-block:: bash

   # last update: June 2023
   module load compiler/gnu/10.2 devel/cuda/12.0
   export CLUSTER_CUDA_ROOT="${HOME}/bin/cuda_12_0"
   mkdir -p "${CLUSTER_CUDA_ROOT}/lib"
   ln -s "${CUDA_HOME}/targets/x86_64-linux/lib/stubs/libcuda.so" "${CLUSTER_CUDA_ROOT}/lib/libcuda.so"
   ln -s "${CUDA_HOME}/targets/x86_64-linux/lib/stubs/libcuda.so" "${CLUSTER_CUDA_ROOT}/lib/libcuda.so.1"
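
Before moving on, the installations can be sanity-checked by listing the
expected library files; this is a minimal sketch assuming the versions and
prefixes used in the snippets above:

.. code-block:: bash

   # check that Boost.MPI and FFTW landed in the expected prefixes
   ls "${HOME}/bin/boost_mpi_1_82_0/lib/"libboost_mpi*.so*
   ls "${HOME}/bin/fftw_3_3_10/lib/"libfftw3*.so*
   # check that the CUDA driver stub symlinks resolve (-L follows the links)
   ls -lL "${HOME}/bin/cuda_12_0/lib/"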

.. _bwUniCluster building software:

Building software
-----------------

ESPResSo
^^^^^^^^

Release 4.2:

.. code-block:: bash

   # last update: June 2023
   module load compiler/gnu/10.2 mpi/openmpi/4.1 devel/cmake/3.23.3 devel/cuda/12.0 \
       lib/hdf5/1.12.2-gnu-10.2-openmpi-4.1 devel/python/3.8.6_gnu_10.2
   CLUSTER_FFTW3_VERSION=3.3.10
   CLUSTER_BOOST_VERSION=1.82.0
   export BOOST_ROOT="${HOME}/bin/boost_mpi_${CLUSTER_BOOST_VERSION//./_}"
   export FFTW3_ROOT="${HOME}/bin/fftw_${CLUSTER_FFTW3_VERSION//./_}"
   export CUDA_ROOT="${HOME}/bin/cuda_12_0"
   export LD_LIBRARY_PATH="${BOOST_ROOT}/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
   export LD_LIBRARY_PATH="${FFTW3_ROOT}/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
   export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}${CUDA_HOME}/targets/x86_64-linux/lib/stubs"
   export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}${CUDA_ROOT}/lib"
   git clone --recursive --branch 4.2 --origin upstream \
       https://github.com/espressomd/espresso.git espresso-4.2
   cd espresso-4.2
   python3 -m pip install --user -c "requirements.txt" cython setuptools numpy scipy vtk h5py
   mkdir build
   cd build
   cp ../maintainer/configs/maxset.hpp myconfig.hpp
   sed -i "/ADDITIONAL_CHECKS/d" myconfig.hpp
   cmake .. -D CMAKE_BUILD_TYPE=Release -D WITH_CUDA=ON \
       -D WITH_CCACHE=OFF -D WITH_SCAFACOS=OFF -D WITH_HDF5=ON
   make -j 4

Release 4.3 (``python`` development branch):

.. code-block:: bash

   # last update: June 2023
   module load compiler/gnu/10.2 mpi/openmpi/4.1 devel/cmake/3.23.3 devel/cuda/12.0 \
       lib/hdf5/1.12.2-gnu-10.2-openmpi-4.1 devel/python/3.8.6_gnu_10.2
   CLUSTER_FFTW3_VERSION=3.3.10
   CLUSTER_BOOST_VERSION=1.82.0
   export BOOST_ROOT="${HOME}/bin/boost_mpi_${CLUSTER_BOOST_VERSION//./_}"
   export FFTW3_ROOT="${HOME}/bin/fftw_${CLUSTER_FFTW3_VERSION//./_}"
   export CUDA_ROOT="${HOME}/bin/cuda_12_0"
   export LD_LIBRARY_PATH="${BOOST_ROOT}/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
   export LD_LIBRARY_PATH="${FFTW3_ROOT}/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
   export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}${CUDA_HOME}/targets/x86_64-linux/lib/stubs"
   export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}${CUDA_ROOT}/lib"
   git clone --recursive --branch python --origin upstream \
       https://github.com/espressomd/espresso.git espresso-4.3
   cd espresso-4.3
   python3 -m pip install --user -c "requirements.txt" cython setuptools numpy scipy vtk h5py
   mkdir build
   cd build
   cp ../maintainer/configs/maxset.hpp myconfig.hpp
   sed -i "/ADDITIONAL_CHECKS/d" myconfig.hpp
   cmake .. -D CUDAToolkit_ROOT="/opt/bwhpc/common/devel/cuda/12.0" \
       -D CMAKE_BUILD_TYPE=Release -D ESPRESSO_BUILD_WITH_CUDA=ON \
       -D ESPRESSO_BUILD_WITH_CCACHE=OFF -D ESPRESSO_BUILD_WITH_WALBERLA=ON \
       -D ESPRESSO_BUILD_WITH_SCAFACOS=OFF -D ESPRESSO_BUILD_WITH_HDF5=ON
   make -j 4
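
Either build can be smoke-tested from its ``build`` directory with the
``pypresso`` wrapper generated by CMake, which forwards its arguments to
the Python interpreter. A minimal check (printing the module path simply
confirms that ``espressomd`` is picked up from this build):

.. code-block:: bash

   # from inside the build directory: import the module and show where it was loaded from
   ./pypresso -c "import espressomd; print(espressomd.__file__)"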

.. _bwUniCluster submitting jobs:

Submitting jobs
---------------

Batch command:

.. code-block:: bash

   sbatch --partition=dev_multiple --nodes=2 --ntasks-per-node=2 job.sh

Job script:

.. code-block:: bash

   #!/bin/bash
   #SBATCH --job-name=test
   #SBATCH --time=00:10:00
   #SBATCH --output=%j.stdout
   #SBATCH --error=%j.stderr
   # last update: July 2023
   module load compiler/gnu/10.2 mpi/openmpi/4.1 devel/cmake/3.23.3 devel/cuda/12.0 \
       lib/hdf5/1.12.2-gnu-10.2-openmpi-4.1 devel/python/3.8.6_gnu_10.2
   CLUSTER_FFTW3_VERSION=3.3.10
   CLUSTER_BOOST_VERSION=1.82.0
   export BOOST_ROOT="${HOME}/bin/boost_mpi_${CLUSTER_BOOST_VERSION//./_}"
   export FFTW3_ROOT="${HOME}/bin/fftw_${CLUSTER_FFTW3_VERSION//./_}"
   export CUDA_ROOT="${HOME}/bin/cuda_12_0"
   export LD_LIBRARY_PATH="${BOOST_ROOT}/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
   export LD_LIBRARY_PATH="${FFTW3_ROOT}/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
   export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}${CUDA_HOME}/targets/x86_64-linux/lib/stubs"
   export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}${CUDA_ROOT}/lib"
   export PYTHONPATH="${HOME}/espresso-4.3/build/src/python${PYTHONPATH:+:$PYTHONPATH}"
   mpiexec --bind-to core --map-by core python3 script.py

The cluster documentation recommends using the MPI-specific launcher,
i.e. ``mpiexec`` or ``mpirun`` for OpenMPI, instead of SLURM's ``srun``.
The number of processes and the node information are automatically
passed to the launcher.

When using ``srun`` instead of the MPI-specific launcher, and the job
script loads Python via ``module load``, it is necessary to preload the
SLURM shared objects, like so:

.. code-block:: bash

   LD_PRELOAD=/usr/lib64/slurm/libslurmfull.so \
       sbatch --partition=dev_multiple --nodes=2 --ntasks-per-node=2 job.sh

Otherwise, the following fatal error is triggered:

.. code-block:: none

   python3: error: plugin_load_from_file: dlopen(/usr/lib64/slurm/auth_munge.so): /usr/lib64/slurm/auth_munge.so: undefined symbol: slurm_conf
   python3: error: Couldn't load specified plugin name for auth/munge: Dlopen of plugin file failed
   python3: error: cannot create auth context for auth/munge
   python3: fatal: failed to initialize auth plugin
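
Submitted jobs can be monitored and cancelled with the standard SLURM
tools; in this sketch, ``12345`` is a placeholder job id:

.. code-block:: bash

   # list your own pending and running jobs
   squeue -u "${USER}"
   # show accounting data for a job after it has finished
   sacct -j 12345 --format=JobID,JobName,State,Elapsed
   # cancel a job
   scancel 12345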