Hpc

hpc@openeuler.org

August 2025

[PATCH] Add support for application numa replicas feature
by Zheng Zengkai 26 Aug '25

This feature aims to reduce the impact of NUMA effects on HPC
application performance.

The main idea is as follows:
1. Use ldd to get all the shared object dependencies of the app elf
   binary, including direct and indirect dependencies.
2. Create replicas of the app elf binary and its lib dependencies.
3. Use "patchelf --replace-needed" to relink the app elf replica and all
   its dependency replicas, so that when numactl is used to launch each
   app binary replica on the corresponding numa node, the accessed app
   elf and lib binary data are on the local numa node, and no cross-numa
   access occurs due to accessing the app binary or shared objects.

Refer to app-numa-replicas-usage for how to use it.

Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
 .../app-numa-replicas-config                  |  8 +++
 .../app-numa-replicas-install.sh              |  5 ++
 .../app-numa-replicas-uninstall.sh            |  5 ++
 .../app-numa-replicas/app-numa-replicas-usage | 22 +++++++
 .../app-numa-replicas.service                 | 18 +++++
 .../app-numa-replicas/app-numa-replicas.sh    | 66 +++++++++++++++++++
 6 files changed, 124 insertions(+)
 create mode 100644 ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-config
 create mode 100755 ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-install.sh
 create mode 100755 ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-uninstall.sh
 create mode 100644 ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-usage
 create mode 100644 ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas.service
 create mode 100755 ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas.sh

diff --git a/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-config b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-config
new file mode 100644
index 00000000..8aaf7d82
--- /dev/null
+++ b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-config
@@ -0,0 +1,8 @@
+# Absolute path to the application binary
+APP_ELF_PATH=
+
+# The script used to set the environment variables, like the LD_LIBRARY_PATH,
+# with the env script path specified, environment variables will be set along
+# with service start.
+# You can also set the environment variables manually before starting service.
+ENV_SCRIPT_PATH=
diff --git a/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-install.sh b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-install.sh
new file mode 100755
index 00000000..afaba2aa
--- /dev/null
+++ b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-install.sh
@@ -0,0 +1,5 @@
+#!/bin/bash
+
+cp app-numa-replicas-config /etc/sysconfig/
+cp app-numa-replicas.service /usr/lib/systemd/system/
+cp app-numa-replicas.sh /usr/sbin/
diff --git a/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-uninstall.sh b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-uninstall.sh
new file mode 100755
index 00000000..5b7d940d
--- /dev/null
+++ b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-uninstall.sh
@@ -0,0 +1,5 @@
+#!/bin/bash
+
+rm -rf /etc/sysconfig/app-numa-replicas-config
+rm -rf /usr/lib/systemd/system/app-numa-replicas.service
+rm -rf /usr/sbin/app-numa-replicas.sh
diff --git a/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-usage b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-usage
new file mode 100644
index 00000000..00de4896
--- /dev/null
+++ b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-usage
@@ -0,0 +1,22 @@
+Usage for app-numa-replicas service:
+
+1. Set the APP_ELF_PATH, which is the absolute path to the
+   application elf binary in app-numa-replicas-config.
+2. Set the ENV_SCRIPT_PATH, which is the script used to set
+   the environment variables, like the LD_LIBRARY_PATH.
+   Notice: This step is optional but recommended and the env
+   setting script need to be written by user.
+   With the env script path specified, environment variables
+   will be set along with service start.
+   Certainly, you can also set the environment variables
+   manually before starting service.
+3. Run app-numa-replicas-install.sh
+4. Run the command "systemctl start app-numa-replicas" to start
+   the service, which will create the replicas of app elf and its
+   dependencies for each numa node.
+   Run the command "journalctl -u app-numa-replicas.service -f"
+   to monitor the progress of the service.
+   Other standard commands like "systemctl status app-numa-replicas"
+   or "systemctl stop app-numa-replicas" are also supported.
+5. Use "mpirun" and "numactl" command to launch the replica of
+   app elf binary for each numa node.
diff --git a/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas.service b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas.service
new file mode 100644
index 00000000..97a16116
--- /dev/null
+++ b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas.service
@@ -0,0 +1,18 @@
+[Unit]
+Description=Application numa replicas service
+After=network.target
+ConditionFileIsExecutable=/usr/sbin/app-numa-replicas.sh
+
+[Service]
+Type=oneshot
+# Environment="APP_ELF_PATH="
+# Environment="ENV_SCRIPT_PATH="
+EnvironmentFile=/etc/sysconfig/app-numa-replicas-config
+ExecStart=/usr/sbin/app-numa-replicas.sh $APP_ELF_PATH $ENV_SCRIPT_PATH
+ExecStop=kill $MAINPID
+RemainAfterExit=yes
+StandardOutput=journal+console
+StandardError=journal+console
+
+[Install]
+WantedBy=multi-user.target
diff --git a/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas.sh b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas.sh
new file mode 100755
index 00000000..d1cb62fd
--- /dev/null
+++ b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas.sh
@@ -0,0 +1,66 @@
+#!/bin/bash
+
+declare -g ld_name='ld-linux-aarch64.so.1'
+declare -g -A soname_dep_map
+declare -g all_deps_list
+
+function relink_elf_deps() {
+    local elf=$1
+    local replica_path=$2
+
+    for dep in $(patchelf --print-needed $elf | grep -Ev "$ld_name"); do
+        local dep_absolute_path=${soname_dep_map[$dep]}
+        if [ -n "$dep_absolute_path" ]; then
+            local real_soname=$(basename $dep_absolute_path)
+            [ -e $replica_path/lib/$real_soname ] || cp $dep_absolute_path $replica_path/lib
+            patchelf --replace-needed $dep $replica_path/lib/$real_soname $elf
+        fi
+    done
+}
+
+function create_elf_and_libs_replicas() {
+    local app_elf=$1
+    local filter='not found|not a dynamic executable'
+
+    # ldd can get all the shared object dependencies(direct & indirect) of app elf.
+    for dep in $(ldd $app_elf | grep -Ev "$filter" | awk '/=>/ {print $3}'); do
+        if [ -f $dep ]; then
+            local dep_absolute_path=$(readlink -f $dep)
+            local soname=$(patchelf --print-soname $dep)
+            soname_dep_map[$soname]=$dep_absolute_path
+            all_deps_list+=("$dep_absolute_path")
+        fi
+    done
+
+    local numa_node_num=$(numactl -H | grep "available:" | awk '{print $2}')
+    for i in $(seq 0 $((numa_node_num-1))); do
+        local numa_replica_path="/opt/numa-replica/$(basename $app_elf)/numa_$i"
+        echo -n "Creating replicas of $(basename $app_elf) and its so libs for numa node $i ..."
+        mkdir -p $numa_replica_path "$numa_replica_path/lib"
+        cp $app_elf $numa_replica_path
+        relink_elf_deps $numa_replica_path/$(basename $app_elf) $numa_replica_path
+        # as all_deps_list contains all the deps, no need to call relink_elf_deps recursively,
+        # patchelf --replace-needed direct_deps for all deps in the list is equivalent.
+        for dep in "${all_deps_list[@]}"; do
+            [ -e "$numa_replica_path/lib/$(basename $dep)" ] || cp $dep "$numa_replica_path/lib"
+            relink_elf_deps "$numa_replica_path/lib/$(basename $dep)" $numa_replica_path
+        done
+        echo "done"
+    done
+}
+
+if [ -n "$2" ]; then
+    source "$2"
+else
+    echo "Notice: Make sure that the environment variables are set correctly!"
+    echo "You can specify the environment setting scripts in /etc/sysconfig/app-numa-replicas-config,"
+    echo "then the environment variables, like LD_LIBRARY_PATH, will be set along with service start"
+fi
+
+if [ -n "$1" ]; then
+    create_elf_and_libs_replicas "$1"
+else
+    echo "Error: Path of application elf binary is not specified!"
+    echo "Please check service config APP_ELF_PATH in /etc/sysconfig/app-numa-replicas-config"
+    exit 1
+fi
-- 
2.33.0
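Step 5 of the usage text leaves the actual launch command to the reader. The following is only a rough sketch of how a per-rank wrapper might pick a replica, not something taken from the patch. The helper names (numa_node_for_rank, replica_path_for_node, replica_launch_cmd) are hypothetical, and the round-robin mapping of local rank to numa node is an assumption; the right mapping depends on how mpirun binds ranks. Only the /opt/numa-replica/<app>/numa_<N> layout comes from app-numa-replicas.sh.

```shell
#!/bin/bash
# Hypothetical helpers; only the replica directory layout is from the patch.

# Map an MPI local rank to a numa node index (assumed round-robin placement).
numa_node_for_rank() {
    local local_rank=$1 numa_node_num=$2
    echo $(( local_rank % numa_node_num ))
}

# Path of the relinked app replica created for the given numa node.
replica_path_for_node() {
    local app_name=$1 node=$2
    echo "/opt/numa-replica/${app_name}/numa_${node}/${app_name}"
}

# Print (not run) the numactl command line for this rank's replica.
replica_launch_cmd() {
    local app_name=$1 local_rank=$2 numa_node_num=$3
    local node
    node=$(numa_node_for_rank "$local_rank" "$numa_node_num")
    echo "numactl --cpunodebind=${node} --membind=${node} $(replica_path_for_node "$app_name" "$node")"
}
```

A wrapper script started by mpirun could then run, for example, `exec $(replica_launch_cmd myapp "$OMPI_COMM_WORLD_LOCAL_RANK" 4) "$@"`. This assumes Open MPI, which exports OMPI_COMM_WORLD_LOCAL_RANK to each rank; other MPI implementations use different variable names, and "myapp" and the node count 4 are placeholders.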
