This feature reduces the impact of cross-NUMA access on HPC
application performance.
The main idea is as follows:
1. Use ldd to get all the shared object dependencies of the app ELF
binary, including direct and indirect dependencies.
2. Create replicas of the app ELF binary and its library dependencies
for each NUMA node.
3. Use "patchelf --replace-needed" to relink each app ELF replica and
all its dependency replicas. When numactl is then used to launch each
app binary replica on the corresponding NUMA node, the app ELF and
library data it accesses are on the local NUMA node, so there is no
cross-NUMA access caused by accessing the app binary or its shared
objects.
Refer to app-numa-replicas-usage for how to use it.
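The launch step described above can be sketched as follows. This is
only an illustration, not part of the patch: the helper name
replica_launch_cmd and the app name "myapp" are hypothetical, and the
/opt/numa-replica/<app>/numa_<N>/ layout is the one created by
app-numa-replicas.sh.

```shell
#!/bin/bash
# Hypothetical helper, not part of this patch: print the numactl command
# that runs the replica built for one NUMA node.
# Assumes the /opt/numa-replica/<app>/numa_<N>/ layout created by
# app-numa-replicas.sh.
replica_launch_cmd() {
    local app_name=$1   # basename of APP_ELF_PATH
    local node=$2       # NUMA node index, starting at 0
    local replica="/opt/numa-replica/${app_name}/numa_${node}/${app_name}"

    # --cpunodebind and --membind keep both the CPUs and the memory
    # allocations of the process on the chosen node.
    printf 'numactl --cpunodebind=%s --membind=%s %s\n' "$node" "$node" "$replica"
}

# Example: print the commands for a 4-node machine and an app named "myapp".
for node in 0 1 2 3; do
    replica_launch_cmd myapp "$node"
done
```

With mpirun, one possible approach is a small wrapper that derives the
node index from the rank placement and then runs the printed command;
how ranks map to NUMA nodes depends on the MPI implementation and is
not covered by this sketch.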
Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com>
---
.../app-numa-replicas-config | 8 +++
.../app-numa-replicas-install.sh | 5 ++
.../app-numa-replicas-uninstall.sh | 5 ++
.../app-numa-replicas/app-numa-replicas-usage | 22 +++++++
.../app-numa-replicas.service | 18 +++++
.../app-numa-replicas/app-numa-replicas.sh | 66 +++++++++++++++++++
6 files changed, 124 insertions(+)
create mode 100644 ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-config
create mode 100755 ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-install.sh
create mode 100755 ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-uninstall.sh
create mode 100644 ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-usage
create mode 100644 ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas.service
create mode 100755 ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas.sh
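
Note: one way to check the result of the relinking is to list the
needed entries of a replica and look for any that do not point into
the replica's own lib directory. The sketch below is not part of the
patch; the function name list_unrelinked_deps is hypothetical, and
the loader name is the one hard-coded in app-numa-replicas.sh.
Entries it prints are not necessarily errors, since the script only
relinks dependencies that ldd resolved to an existing file.

```shell
#!/bin/bash
# Hypothetical check, not part of this patch: read needed entries
# (one per line) on stdin and print those that do not point into the
# given replica lib directory. The dynamic loader is skipped because
# app-numa-replicas.sh leaves it untouched.
list_unrelinked_deps() {
    local replica_lib=$1
    local ld_name='ld-linux-aarch64.so.1'
    local dep

    while IFS= read -r dep; do
        [ -z "$dep" ] && continue
        case "$dep" in
            *"$ld_name"*) ;;            # loader entry, left as-is by the script
            "$replica_lib"/*) ;;        # already points at the local replica
            *) printf '%s\n' "$dep" ;;  # still resolved through the normal search path
        esac
    done
}

# Usage on a real replica (requires patchelf):
#   patchelf --print-needed /opt/numa-replica/myapp/numa_0/myapp |
#       list_unrelinked_deps /opt/numa-replica/myapp/numa_0/lib
```

The function only filters text, so it can be tried without patchelf by
piping a hand-written list of entries into it.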
diff --git a/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-config b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-config
new file mode 100644
index 00000000..8aaf7d82
--- /dev/null
+++ b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-config
@@ -0,0 +1,8 @@
+# Absolute path to the application binary
+APP_ELF_PATH=
+
+# Optional script that sets environment variables such as LD_LIBRARY_PATH.
+# If a path is given here, the script is sourced when the service starts.
+# Alternatively, set the environment variables manually before starting
+# the service.
+ENV_SCRIPT_PATH=
diff --git a/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-install.sh b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-install.sh
new file mode 100755
index 00000000..afaba2aa
--- /dev/null
+++ b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-install.sh
@@ -0,0 +1,5 @@
+#!/bin/bash
+
+cp app-numa-replicas-config /etc/sysconfig/
+cp app-numa-replicas.service /usr/lib/systemd/system/
+cp app-numa-replicas.sh /usr/sbin/
diff --git a/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-uninstall.sh b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-uninstall.sh
new file mode 100755
index 00000000..5b7d940d
--- /dev/null
+++ b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-uninstall.sh
@@ -0,0 +1,5 @@
+#!/bin/bash
+
+rm -rf /etc/sysconfig/app-numa-replicas-config
+rm -rf /usr/lib/systemd/system/app-numa-replicas.service
+rm -rf /usr/sbin/app-numa-replicas.sh
diff --git a/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-usage b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-usage
new file mode 100644
index 00000000..00de4896
--- /dev/null
+++ b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas-usage
@@ -0,0 +1,22 @@
+Usage for app-numa-replicas service:
+
+1. Set APP_ELF_PATH in app-numa-replicas-config to the absolute
+ path of the application ELF binary.
+2. Set ENV_SCRIPT_PATH to a script that sets the environment
+ variables, such as LD_LIBRARY_PATH.
+ Note: This step is optional but recommended. The env script
+ must be written by the user.
+ If the env script path is specified, the environment variables
+ are set when the service starts.
+ You can also set the environment variables manually before
+ starting the service.
+3. Run app-numa-replicas-install.sh.
+4. Run "systemctl start app-numa-replicas" to start the service,
+ which creates replicas of the app ELF binary and its
+ dependencies for each NUMA node.
+ Run "journalctl -u app-numa-replicas.service -f" to monitor
+ the progress of the service.
+ Other standard commands such as "systemctl status app-numa-replicas"
+ and "systemctl stop app-numa-replicas" are also supported.
+5. Use the "mpirun" and "numactl" commands to launch the replica
+ of the app ELF binary on each NUMA node.
diff --git a/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas.service b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas.service
new file mode 100644
index 00000000..97a16116
--- /dev/null
+++ b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas.service
@@ -0,0 +1,18 @@
+[Unit]
+Description=Application numa replicas service
+After=network.target
+ConditionFileIsExecutable=/usr/sbin/app-numa-replicas.sh
+
+[Service]
+Type=oneshot
+# Environment="APP_ELF_PATH="
+# Environment="ENV_SCRIPT_PATH="
+EnvironmentFile=/etc/sysconfig/app-numa-replicas-config
+ExecStart=/usr/sbin/app-numa-replicas.sh $APP_ELF_PATH $ENV_SCRIPT_PATH
+ExecStop=kill $MAINPID
+RemainAfterExit=yes
+StandardOutput=journal+console
+StandardError=journal+console
+
+[Install]
+WantedBy=multi-user.target
diff --git a/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas.sh b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas.sh
new file mode 100755
index 00000000..d1cb62fd
--- /dev/null
+++ b/ComputingLibraryTool/scripts/app-numa-replicas/app-numa-replicas.sh
@@ -0,0 +1,66 @@
+#!/bin/bash
+
+declare -g ld_name='ld-linux-aarch64.so.1'
+declare -g -A soname_dep_map
+declare -g all_deps_list
+
+function relink_elf_deps() {
+ local elf=$1
+ local replica_path=$2
+
+ for dep in $(patchelf --print-needed "$elf" | grep -Ev "$ld_name"); do
+ local dep_absolute_path=${soname_dep_map[$dep]}
+ if [ -n "$dep_absolute_path" ]; then
+ local real_soname=$(basename "$dep_absolute_path")
+ [ -e "$replica_path/lib/$real_soname" ] || cp "$dep_absolute_path" "$replica_path/lib"
+ patchelf --replace-needed "$dep" "$replica_path/lib/$real_soname" "$elf"
+ fi
+ done
+}
+
+function create_elf_and_libs_replicas() {
+ local app_elf=$1
+ local filter='not found|not a dynamic executable'
+
+ # ldd lists all shared object dependencies (direct and indirect) of the app ELF.
+ for dep in $(ldd "$app_elf" | grep -Ev "$filter" | awk '/=>/ {print $3}'); do
+ if [ -f "$dep" ]; then
+ local dep_absolute_path=$(readlink -f "$dep")
+ local soname=$(patchelf --print-soname "$dep")
+ soname_dep_map[$soname]=$dep_absolute_path
+ all_deps_list+=("$dep_absolute_path")
+ fi
+ done
+
+ local numa_node_num=$(numactl -H | grep "available:" | awk '{print $2}')
+ for i in $(seq 0 $((numa_node_num-1))); do
+ local numa_replica_path="/opt/numa-replica/$(basename "$app_elf")/numa_$i"
+ echo -n "Creating replicas of $(basename "$app_elf") and its shared libraries for NUMA node $i ..."
+ mkdir -p "$numa_replica_path" "$numa_replica_path/lib"
+ cp "$app_elf" "$numa_replica_path"
+ relink_elf_deps "$numa_replica_path/$(basename "$app_elf")" "$numa_replica_path"
+ # all_deps_list already holds every direct and indirect dependency, so relink_elf_deps
+ # need not recurse: relinking the direct deps of each entry in the list is equivalent.
+ for dep in "${all_deps_list[@]}"; do
+ [ -e "$numa_replica_path/lib/$(basename "$dep")" ] || cp "$dep" "$numa_replica_path/lib"
+ relink_elf_deps "$numa_replica_path/lib/$(basename "$dep")" "$numa_replica_path"
+ done
+ echo "done"
+ done
+}
+
+if [ -n "$2" ]; then
+ source "$2"
+else
+ echo "Notice: Make sure that the environment variables are set correctly!"
+ echo "You can specify the environment setup script in /etc/sysconfig/app-numa-replicas-config;"
+ echo "the environment variables, such as LD_LIBRARY_PATH, are then set when the service starts."
+fi
+
+if [ -n "$1" ]; then
+ create_elf_and_libs_replicas "$1"
+else
+ echo "Error: Path of application elf binary is not specified!"
+ echo "Please check service config APP_ELF_PATH in /etc/sysconfig/app-numa-replicas-config"
+ exit 1
+fi
--
2.33.0