[PATCH 0/5] perf arm: Fix workaround ARM PMUs cpu maps having

This patchset includes a fix for the issue of which ARM will try to open events on offlined CPUs when PMUs have a cpu map, and its essential dependent patches. Some modifications are added to adapt to OpenEuler. Ian Rogers (4): perf cpumap: Add is_subset function perf cpumap: Add intersect function perf pmu: Add CPU map for "cpu" PMUs perf arm: Workaround ARM PMUs cpu maps having offline cpus Yushan Wang (1): perf cpumap: Fix incompatible perf_cpu type for perf_cpu_map__intersect tools/lib/perf/cpumap.c | 55 ++++++++++++++++++++++++ tools/lib/perf/include/internal/cpumap.h | 1 + tools/lib/perf/include/perf/cpumap.h | 2 + tools/perf/arch/arm/util/pmu.c | 10 ++++- tools/perf/util/cpumap.c | 4 +- tools/perf/util/cpumap.h | 2 +- 6 files changed, 70 insertions(+), 4 deletions(-) -- 2.33.0

From: Ian Rogers <irogers@google.com> Returns true if the second argument is a subset of the first. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: netdev@vger.kernel.org Link: http://lore.kernel.org/lkml/20220328232648.2127340-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/lib/perf/cpumap.c | 20 ++++++++++++++++++++ tools/lib/perf/include/internal/cpumap.h | 1 + 2 files changed, 21 insertions(+) diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c index ca0215047c32..302eea4ef3d5 100644 --- a/tools/lib/perf/cpumap.c +++ b/tools/lib/perf/cpumap.c @@ -287,6 +287,26 @@ int perf_cpu_map__max(struct perf_cpu_map *map) return max; } +/** Is 'b' a subset of 'a'. */ +bool perf_cpu_map__is_subset(const struct perf_cpu_map *a, const struct perf_cpu_map *b) +{ + if (a == b || !b) + return true; + if (!a || b->nr > a->nr) + return false; + + for (int i = 0, j = 0; i < a->nr; i++) { + if (a->map[i].cpu > b->map[j].cpu) + return false; + if (a->map[i].cpu == b->map[j].cpu) { + j++; + if (j == b->nr) + return true; + } + } + return false; +} + /* * Merge two cpumaps * diff --git a/tools/lib/perf/include/internal/cpumap.h b/tools/lib/perf/include/internal/cpumap.h index 840d4032587b..58c2754ea7f3 100644 --- a/tools/lib/perf/include/internal/cpumap.h +++ b/tools/lib/perf/include/internal/cpumap.h @@ -15,5 +15,6 @@ struct perf_cpu_map { #endif int perf_cpu_map__idx(struct perf_cpu_map *cpus, int cpu); +bool perf_cpu_map__is_subset(const struct perf_cpu_map *a, const struct perf_cpu_map *b); #endif /* __LIBPERF_INTERNAL_CPUMAP_H */ -- 2.33.0

From: Ian Rogers <irogers@google.com> The merge function gives the union of two cpu maps. Add an intersect function which is necessary, for example, when intersecting a PMUs supported CPUs with user requested. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230526215410.2435674-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/lib/perf/cpumap.c | 35 ++++++++++++++++++++++++++++ tools/lib/perf/include/perf/cpumap.h | 2 ++ 2 files changed, 37 insertions(+) diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c index 302eea4ef3d5..7ffa055c32e0 100644 --- a/tools/lib/perf/cpumap.c +++ b/tools/lib/perf/cpumap.c @@ -363,3 +363,38 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig, perf_cpu_map__put(orig); return merged; } + +struct perf_cpu_map *perf_cpu_map__intersect(struct perf_cpu_map *orig, + struct perf_cpu_map *other) +{ + struct perf_cpu *tmp_cpus; + int tmp_len; + int i, j, k; + struct perf_cpu_map *merged = NULL; + + if (perf_cpu_map__is_subset(other, orig)) + return perf_cpu_map__get(orig); + if (perf_cpu_map__is_subset(orig, other)) + return perf_cpu_map__get(other); + + tmp_len = max(orig->nr, other->nr); + tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu)); + if (!tmp_cpus) + return NULL; + + i = j = k = 0; + while (i < orig->nr && j < other->nr) { + if (orig->map[i].cpu < other->map[j].cpu) + i++; + else if (orig->map[i].cpu > other->map[j].cpu) + j++; + else { + j++; + tmp_cpus[k++] = orig->map[i++]; + } + } + if (k) + merged = cpu_map__trim_new(k, tmp_cpus); + free(tmp_cpus); + return merged; +} diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h index 6a17ad730cbc..aee4e6ad7fb8 100644 --- a/tools/lib/perf/include/perf/cpumap.h +++ b/tools/lib/perf/include/perf/cpumap.h @@ -14,6 +14,8 @@ LIBPERF_API struct perf_cpu_map *perf_cpu_map__read(FILE *file); LIBPERF_API struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map); LIBPERF_API struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig, struct perf_cpu_map *other); +LIBPERF_API struct perf_cpu_map *perf_cpu_map__intersect(struct perf_cpu_map *orig, + struct perf_cpu_map *other); LIBPERF_API void perf_cpu_map__put(struct perf_cpu_map *map); LIBPERF_API int perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx); LIBPERF_API int perf_cpu_map__nr(const struct perf_cpu_map *cpus); -- 2.33.0

From: Ian Rogers <irogers@google.com> A typical "cpu" PMU has no "cpus" or "cpumask" file meaning the CPU map is set to NULL, which also encodes an empty CPU map. Update pmu_cpumask so that if the "cpu" PMU fails to load a CPU map, use a default of all online PMUs. Remove const from cpu_map__online for the sake of reference counting. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230527072210.2900565-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/util/cpumap.c | 4 ++-- tools/perf/util/cpumap.h | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c index dc5c5e6fc502..337f0afa5007 100644 --- a/tools/perf/util/cpumap.c +++ b/tools/perf/util/cpumap.c @@ -577,9 +577,9 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size) return ptr - buf; } -const struct perf_cpu_map *cpu_map__online(void) /* thread unsafe */ +struct perf_cpu_map *cpu_map__online(void) /* thread unsafe */ { - static const struct perf_cpu_map *online = NULL; + static struct perf_cpu_map *online = NULL; if (!online) online = perf_cpu_map__new(NULL); /* from /sys/devices/system/cpu/online */ diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h index 3a442f021468..87e586144628 100644 --- a/tools/perf/util/cpumap.h +++ b/tools/perf/util/cpumap.h @@ -26,7 +26,7 @@ int cpu_map__build_socket_map(struct perf_cpu_map *cpus, struct perf_cpu_map **s int cpu_map__build_die_map(struct perf_cpu_map *cpus, struct perf_cpu_map **diep); int cpu_map__build_core_map(struct perf_cpu_map *cpus, struct perf_cpu_map **corep); int cpu_map__build_node_map(struct perf_cpu_map *cpus, struct perf_cpu_map **nodep); -const struct perf_cpu_map *cpu_map__online(void); /* thread unsafe */ +struct perf_cpu_map *cpu_map__online(void); /* thread unsafe */ static inline int cpu_map__socket(struct perf_cpu_map *sock, int s) { -- 2.33.0

From: Ian Rogers <irogers@google.com> When PMUs have a cpu map in the 'cpus' or 'cpumask' file, perf will try to open events on those CPUs. ARM doesn't remove offline CPUs meaning taking a CPU offline will cause perf commands to fail unless a CPU map is passed on the command line. More context in: https://lore.kernel.org/lkml/20240603092812.46616-1-yangyicong@huawei.com/ Reported-by: Yicong Yang <yangyicong@huawei.com> Closes: https://lore.kernel.org/lkml/20240603092812.46616-2-yangyicong@huawei.com/ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Yicong Yang <yangyicong@hisilicon.com> Tested-by: Leo Yan <leo.yan@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linux.dev> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Cc: John Garry <john.g.garry@oracle.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240607065343.695369-1-irogers@google.com --- tools/perf/arch/arm/util/pmu.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/tools/perf/arch/arm/util/pmu.c b/tools/perf/arch/arm/util/pmu.c index 887c8addc491..e10ac550e4e4 100644 --- a/tools/perf/arch/arm/util/pmu.c +++ b/tools/perf/arch/arm/util/pmu.c @@ -11,11 +11,14 @@ #include "arm-spe.h" #include "hisi-ptt.h" +#include "../../../util/cpumap.h" #include "../../../util/pmu.h" struct perf_event_attr -*perf_pmu__get_default_config(struct perf_pmu *pmu __maybe_unused) +*perf_pmu__get_default_config(struct perf_pmu *pmu) { + struct perf_cpu_map *intersect; + #ifdef HAVE_AUXTRACE_SUPPORT if (!strcmp(pmu->name, CORESIGHT_ETM_PMU_NAME)) { /* add ETM default config here */ @@ -29,5 +32,10 @@ struct perf_event_attr } #endif + /* Workaround some ARM PMU's failing to correctly set CPU maps for online processors. */ + intersect = perf_cpu_map__intersect(cpu_map__online(), pmu->cpus); + perf_cpu_map__put(pmu->cpus); + pmu->cpus = intersect; + return NULL; } -- 2.33.0

The patch which fixes "ARM PMUs cpu maps having offline cpus" depends on an older patch includes an introduction of struct perf_cpu, which is a wrap of single int- typed cpu. The related modification is beyond the bugfix target of the original patch, so I reverted the presence of struct perf_cpu. Signed-off-by: Yushan Wang <wangyushan12@huawei.com> --- tools/lib/perf/cpumap.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c index 7ffa055c32e0..8546ea09029c 100644 --- a/tools/lib/perf/cpumap.c +++ b/tools/lib/perf/cpumap.c @@ -296,9 +296,9 @@ bool perf_cpu_map__is_subset(const struct perf_cpu_map *a, const struct perf_cpu return false; for (int i = 0, j = 0; i < a->nr; i++) { - if (a->map[i].cpu > b->map[j].cpu) + if (a->map[i] > b->map[j]) return false; - if (a->map[i].cpu == b->map[j].cpu) { + if (a->map[i] == b->map[j]) { j++; if (j == b->nr) return true; @@ -367,7 +367,7 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig, struct perf_cpu_map *perf_cpu_map__intersect(struct perf_cpu_map *orig, struct perf_cpu_map *other) { - struct perf_cpu *tmp_cpus; + int *tmp_cpus; int tmp_len; int i, j, k; struct perf_cpu_map *merged = NULL; @@ -378,15 +378,15 @@ struct perf_cpu_map *perf_cpu_map__intersect(struct perf_cpu_map *orig, return perf_cpu_map__get(other); tmp_len = max(orig->nr, other->nr); - tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu)); + tmp_cpus = malloc(tmp_len * sizeof(int)); if (!tmp_cpus) return NULL; i = j = k = 0; while (i < orig->nr && j < other->nr) { - if (orig->map[i].cpu < other->map[j].cpu) + if (orig->map[i] < other->map[j]) i++; - else if (orig->map[i].cpu > other->map[j].cpu) + else if (orig->map[i] > other->map[j]) j++; else { j++; -- 2.33.0
participants (1)
-
Yushan Wang