[PATCH OLK-6.6 01/15] cpu/SMT: Provide a default topology_is_primary_thread()

From: Yicong Yang <yangyicong@hisilicon.com> driver inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/IBZQQR ---------------------------------------------------------------------- Currently if architectures want to support HOTPLUG_SMT they need to provide a topology_is_primary_thread() telling the framework which thread in the SMT cannot offline. However arm64 doesn't have a restriction on which thread in the SMT cannot offline, a simplest choice is that just make 1st thread as the "primary" thread. So just make this as the default implementation in the framework and let architectures like x86 that have special primary thread to override this function (which they've already done). There's no need to provide a stub function if !CONFIG_SMP or !CONFIG_HOTPLUG_SMT. In such case the testing CPU is already the 1st CPU in the SMT so it's always the primary thread. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Pierre Gondois <pierre.gondois@arm.com> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Hongye Lin <linhongye@h-partners.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- arch/x86/include/asm/topology.h | 1 - arch/x86/kernel/smpboot.c | 1 + include/linux/topology.h | 24 ++++++++++++++++++++++++ 3 files changed, 25 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h index 5b4f87329031..8bf8d5546e97 100644 --- a/arch/x86/include/asm/topology.h +++ b/arch/x86/include/asm/topology.h @@ -152,7 +152,6 @@ static inline int topology_phys_to_logical_die(unsigned int die, unsigned int cpu) { return 0; } static inline int topology_max_die_per_package(void) { return 1; } static inline int topology_max_smt_threads(void) { return 1; } -static inline bool topology_is_primary_thread(unsigned int cpu) { return true; } static inline bool topology_smt_supported(void) { return false; } #endif diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index 5726a3c9623e..2c897ad8b014 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -284,6 +284,7 @@ bool topology_is_primary_thread(unsigned int cpu) { return apic_id_is_primary_thread(per_cpu(x86_cpu_to_apicid, cpu)); } +#define topology_is_primary_thread topology_is_primary_thread /** * topology_smt_supported - Check whether SMT is supported by the CPUs diff --git a/include/linux/topology.h b/include/linux/topology.h index 80d27d717631..3aa7a3ef6771 100644 --- a/include/linux/topology.h +++ b/include/linux/topology.h @@ -212,6 +212,30 @@ static inline const struct cpumask *cpu_smt_mask(int cpu) } #endif +#ifndef topology_is_primary_thread + +#define topology_is_primary_thread topology_is_primary_thread + +static inline bool topology_is_primary_thread(unsigned int cpu) +{ + /* + * When disabling SMT the primary thread of the SMT will remain + * enabled/active. Architectures do have a special primary thread + * (e.g. x86) needs to override this function. Otherwise can make + * the first thread in the SMT as the primary thread. + * + * The sibling cpumask of an offline CPU contains always the CPU + * itself for architectures using the implementation of + * CONFIG_GENERIC_ARCH_TOPOLOGY for building their topology. + * Other architectures not using CONFIG_GENERIC_ARCH_TOPOLOGY for + * building their topology have to check whether to use this default + * implementation or to override it. + */ + return cpu == cpumask_first(topology_sibling_cpumask(cpu)); +} + +#endif + static inline const struct cpumask *cpu_cpu_mask(int cpu) { return cpumask_of_node(cpu_to_node(cpu)); -- 2.33.0

From: Sudeep Holla <sudeep.holla@arm.com> Currently as we parse the CPU topology from /cpu-map node from the device tree, we assign generated cluster count as the physical package identifier for each CPU which is wrong. The device tree bindings for CPU topology supports sockets to infer the socket or physical package identifier for a given CPU. Since it is fairly new and not supported on most of the old and existing systems, we can assume all such systems have single socket/physical package. Fix the physical package identifier to 0 by removing the assignment of cluster identifier to the same. Link: https://lore.kernel.org/r/20220704101605.1318280-17-sudeep.holla@arm.com Tested-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- drivers/base/arch_topology.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c index 48f36d8b9176..728c30ac343c 100644 --- a/drivers/base/arch_topology.c +++ b/drivers/base/arch_topology.c @@ -473,7 +473,6 @@ static int __init parse_cluster(struct device_node *cluster, int depth) bool leaf = true; bool has_cores = false; struct device_node *c; - static int package_id __initdata; int core_id = 0; int i, ret; @@ -512,7 +511,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth) } if (leaf) { - ret = parse_core(c, package_id, core_id++); + ret = parse_core(c, 0, core_id++); } else { pr_err("%pOF: Non-leaf cluster with core %s\n", cluster, name); @@ -529,9 +528,6 @@ static int __init parse_cluster(struct device_node *cluster, int depth) if (leaf && !has_cores) pr_warn("%pOF: empty cluster\n", cluster); - if (leaf) - package_id++; - return 0; } -- 2.33.0

From: Sudeep Holla <sudeep.holla@arm.com> Finally let us add support for socket nodes in /cpu-map in the device tree. Since this may not be present in all the old platforms and even most of the existing platforms, we need to assume absence of the socket node indicates that it is a single socket system and handle appropriately. Also it is likely that most single socket systems skip to as the node since it is optional. Link: https://lore.kernel.org/r/20220704101605.1318280-20-sudeep.holla@arm.com Tested-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- drivers/base/arch_topology.c | 34 ++++++++++++++++++++++++++++++---- 1 file changed, 30 insertions(+), 4 deletions(-) diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c index 728c30ac343c..1dddf518da0a 100644 --- a/drivers/base/arch_topology.c +++ b/drivers/base/arch_topology.c @@ -467,7 +467,7 @@ static int __init parse_core(struct device_node *core, int package_id, return 0; } -static int __init parse_cluster(struct device_node *cluster, int depth) +static int __init parse_cluster(struct device_node *cluster, int package_id, int depth) { char name[20]; bool leaf = true; @@ -487,7 +487,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth) c = of_get_child_by_name(cluster, name); if (c) { leaf = false; - ret = parse_cluster(c, depth + 1); + ret = parse_cluster(c, package_id, depth + 1); of_node_put(c); if (ret != 0) return ret; @@ -511,7 +511,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth) } if (leaf) { - ret = parse_core(c, 0, core_id++); + ret = parse_core(c, package_id, core_id++); } else { pr_err("%pOF: Non-leaf cluster with core %s\n", cluster, name); @@ -531,6 +531,32 @@ static int __init parse_cluster(struct device_node *cluster, int depth) return 0; } +static int __init parse_socket(struct device_node *socket) +{ + char name[20]; + struct device_node *c; + bool has_socket = false; + int package_id = 0, ret; + + do { + snprintf(name, sizeof(name), "socket%d", package_id); + c = of_get_child_by_name(socket, name); + if (c) { + has_socket = true; + ret = parse_cluster(c, package_id, 0); + of_node_put(c); + if (ret != 0) + return ret; + } + package_id++; + } while (c); + + if (!has_socket) + ret = parse_cluster(socket, 0, 0); + + return ret; +} + static int __init parse_dt_topology(void) { struct device_node *cn, *map; @@ -551,7 +577,7 @@ static int __init parse_dt_topology(void) if (!map) goto out; - ret = parse_cluster(map, 0); + ret = parse_socket(map); if (ret != 0) goto out_map; -- 2.33.0

From: Michael Ellerman <mpe@ellerman.id.au> In order to export the cpuhp_smt_control enum as part of the interface between generic and architecture code, the architecture code needs to include asm/topology.h. But that leads to circular header dependencies. So split the enum and related declarations into a separate header. [ ldufour: Reworded the commit's description ] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Zhang Rui <rui.zhang@intel.com> Link: https://lore.kernel.org/r/20230705145143.40545-3-ldufour@linux.ibm.com Signed-off-by: Qi Xi <xiqi2@huawei.com> --- arch/x86/include/asm/topology.h | 2 ++ include/linux/cpu.h | 25 +------------------------ include/linux/cpu_smt.h | 29 +++++++++++++++++++++++++++++ kernel/cpu.c | 1 + 4 files changed, 33 insertions(+), 24 deletions(-) create mode 100644 include/linux/cpu_smt.h diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h index 8bf8d5546e97..75d7de7e0b80 100644 --- a/arch/x86/include/asm/topology.h +++ b/arch/x86/include/asm/topology.h @@ -135,6 +135,8 @@ static inline int topology_max_smt_threads(void) return __max_smt_threads; } +#include <linux/cpu_smt.h> + int topology_update_package_map(unsigned int apicid, unsigned int cpu); int topology_update_die_map(unsigned int dieid, unsigned int cpu); int topology_phys_to_logical_pkg(unsigned int pkg); diff --git a/include/linux/cpu.h b/include/linux/cpu.h index 7a77bde22b61..c39f41017af5 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -19,6 +19,7 @@ #include <linux/cpumask.h> #include <linux/cpuhotplug.h> #include <linux/cpuhplock.h> +#include <linux/cpu_smt.h> struct device; struct device_node; @@ -186,30 +187,6 @@ void cpuhp_report_idle_dead(void); static inline void cpuhp_report_idle_dead(void) { } #endif /* #ifdef CONFIG_HOTPLUG_CPU */ -enum cpuhp_smt_control { - CPU_SMT_ENABLED, - CPU_SMT_DISABLED, - CPU_SMT_FORCE_DISABLED, - CPU_SMT_NOT_SUPPORTED, - CPU_SMT_NOT_IMPLEMENTED, -}; - -#if defined(CONFIG_SMP) && defined(CONFIG_HOTPLUG_SMT) -extern enum cpuhp_smt_control cpu_smt_control; -extern void cpu_smt_disable(bool force); -extern void cpu_smt_check_topology(void); -extern bool cpu_smt_possible(void); -extern int cpuhp_smt_enable(void); -extern int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval); -#else -# define cpu_smt_control (CPU_SMT_NOT_IMPLEMENTED) -static inline void cpu_smt_disable(bool force) { } -static inline void cpu_smt_check_topology(void) { } -static inline bool cpu_smt_possible(void) { return false; } -static inline int cpuhp_smt_enable(void) { return 0; } -static inline int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval) { return 0; } -#endif - extern bool cpu_mitigations_off(void); extern bool cpu_mitigations_auto_nosmt(void); diff --git a/include/linux/cpu_smt.h b/include/linux/cpu_smt.h new file mode 100644 index 000000000000..722c2e306fef --- /dev/null +++ b/include/linux/cpu_smt.h @@ -0,0 +1,29 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_CPU_SMT_H_ +#define _LINUX_CPU_SMT_H_ + +enum cpuhp_smt_control { + CPU_SMT_ENABLED, + CPU_SMT_DISABLED, + CPU_SMT_FORCE_DISABLED, + CPU_SMT_NOT_SUPPORTED, + CPU_SMT_NOT_IMPLEMENTED, +}; + +#if defined(CONFIG_SMP) && defined(CONFIG_HOTPLUG_SMT) +extern enum cpuhp_smt_control cpu_smt_control; +extern void cpu_smt_disable(bool force); +extern void cpu_smt_check_topology(void); +extern bool cpu_smt_possible(void); +extern int cpuhp_smt_enable(void); +extern int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval); +#else +# define cpu_smt_control (CPU_SMT_NOT_IMPLEMENTED) +static inline void cpu_smt_disable(bool force) { } +static inline void cpu_smt_check_topology(void) { } +static inline bool cpu_smt_possible(void) { return false; } +static inline int cpuhp_smt_enable(void) { return 0; } +static inline int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval) { return 0; } +#endif + +#endif /* _LINUX_CPU_SMT_H_ */ diff --git a/kernel/cpu.c b/kernel/cpu.c index 1c370f87d86f..fcc43aa23ed6 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -412,6 +412,7 @@ static void lockdep_release_cpus_lock(void) void __weak arch_smt_update(void) { } #ifdef CONFIG_HOTPLUG_SMT + enum cpuhp_smt_control cpu_smt_control __read_mostly = CPU_SMT_ENABLED; void __init cpu_smt_disable(bool force) -- 2.33.0

From: Michael Ellerman <mpe@ellerman.id.au> Some architectures allow partial SMT states at boot time, ie. when not all SMT threads are brought online. To support that the SMT code needs to know the maximum number of SMT threads, and also the currently configured number. The architecture code knows the max number of threads, so have the architecture code pass that value to cpu_smt_set_num_threads(). Note that although topology_max_smt_threads() exists, it is not configured early enough to be used here. As architecture, like PowerPC, allows the threads number to be set through the kernel command line, also pass that value. [ ldufour: Slightly reword the commit message ] [ ldufour: Rename cpu_smt_check_topology and add a num_threads argument ] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Zhang Rui <rui.zhang@intel.com> Link: https://lore.kernel.org/r/20230705145143.40545-5-ldufour@linux.ibm.com Signed-off-by: Qi Xi <xiqi2@huawei.com> --- arch/x86/kernel/cpu/common.c | 2 +- include/linux/cpu_smt.h | 8 ++++++-- kernel/cpu.c | 21 ++++++++++++++++++++- 3 files changed, 27 insertions(+), 4 deletions(-) diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 919f39154302..3bb626f1d99d 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -2253,7 +2253,7 @@ void __init arch_cpu_finalize_init(void) * identify_boot_cpu() initialized SMT support information, let the * core code know. */ - cpu_smt_check_topology(); + cpu_smt_set_num_threads(smp_num_siblings, smp_num_siblings); if (!IS_ENABLED(CONFIG_SMP)) { pr_info("CPU: "); diff --git a/include/linux/cpu_smt.h b/include/linux/cpu_smt.h index 722c2e306fef..0c1664294b57 100644 --- a/include/linux/cpu_smt.h +++ b/include/linux/cpu_smt.h @@ -12,15 +12,19 @@ enum cpuhp_smt_control { #if defined(CONFIG_SMP) && defined(CONFIG_HOTPLUG_SMT) extern enum cpuhp_smt_control cpu_smt_control; +extern unsigned int cpu_smt_num_threads; extern void cpu_smt_disable(bool force); -extern void cpu_smt_check_topology(void); +extern void cpu_smt_set_num_threads(unsigned int num_threads, + unsigned int max_threads); extern bool cpu_smt_possible(void); extern int cpuhp_smt_enable(void); extern int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval); #else # define cpu_smt_control (CPU_SMT_NOT_IMPLEMENTED) +# define cpu_smt_num_threads 1 static inline void cpu_smt_disable(bool force) { } -static inline void cpu_smt_check_topology(void) { } +static inline void cpu_smt_set_num_threads(unsigned int num_threads, + unsigned int max_threads) { } static inline bool cpu_smt_possible(void) { return false; } static inline int cpuhp_smt_enable(void) { return 0; } static inline int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval) { return 0; } diff --git a/kernel/cpu.c b/kernel/cpu.c index fcc43aa23ed6..075250153f94 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -414,6 +414,8 @@ void __weak arch_smt_update(void) { } #ifdef CONFIG_HOTPLUG_SMT enum cpuhp_smt_control cpu_smt_control __read_mostly = CPU_SMT_ENABLED; +static unsigned int cpu_smt_max_threads __ro_after_init; +unsigned int cpu_smt_num_threads __read_mostly = UINT_MAX; void __init cpu_smt_disable(bool force) { @@ -427,16 +429,33 @@ void __init cpu_smt_disable(bool force) pr_info("SMT: disabled\n"); cpu_smt_control = CPU_SMT_DISABLED; } + cpu_smt_num_threads = 1; } /* * The decision whether SMT is supported can only be done after the full * CPU identification. Called from architecture code. */ -void __init cpu_smt_check_topology(void) +void __init cpu_smt_set_num_threads(unsigned int num_threads, + unsigned int max_threads) { + WARN_ON(!num_threads || (num_threads > max_threads)); + if (!topology_smt_supported()) cpu_smt_control = CPU_SMT_NOT_SUPPORTED; + + cpu_smt_max_threads = max_threads; + + /* + * If SMT has been disabled via the kernel command line or SMT is + * not supported, set cpu_smt_num_threads to 1 for consistency. + * If enabled, take the architecture requested number of threads + * to bring up into account. + */ + if (cpu_smt_control != CPU_SMT_ENABLED) + cpu_smt_num_threads = 1; + else if (num_threads < cpu_smt_num_threads) + cpu_smt_num_threads = num_threads; } static int __init smt_cmdline_disable(char *str) -- 2.33.0

From: Laurent Dufour <ldufour@linux.ibm.com> Since the maximum number of threads is now passed to cpu_smt_set_num_threads(), checking that value is enough to know whether SMT is supported. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Zhang Rui <rui.zhang@intel.com> Link: https://lore.kernel.org/r/20230705145143.40545-6-ldufour@linux.ibm.com Signed-off-by: Qi Xi <xiqi2@huawei.com> --- arch/x86/include/asm/topology.h | 2 -- arch/x86/kernel/smpboot.c | 8 -------- kernel/cpu.c | 2 +- 3 files changed, 1 insertion(+), 11 deletions(-) diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h index 75d7de7e0b80..39b3b4a12773 100644 --- a/arch/x86/include/asm/topology.h +++ b/arch/x86/include/asm/topology.h @@ -142,7 +142,6 @@ int topology_update_die_map(unsigned int dieid, unsigned int cpu); int topology_phys_to_logical_pkg(unsigned int pkg); int topology_phys_to_logical_die(unsigned int die, unsigned int cpu); bool topology_is_primary_thread(unsigned int cpu); -bool topology_smt_supported(void); #else #define topology_max_packages() (1) static inline int @@ -154,7 +153,6 @@ static inline int topology_phys_to_logical_die(unsigned int die, unsigned int cpu) { return 0; } static inline int topology_max_die_per_package(void) { return 1; } static inline int topology_max_smt_threads(void) { return 1; } -static inline bool topology_smt_supported(void) { return false; } #endif static inline void arch_fix_phys_package_id(int num, u32 slot) diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index 2c897ad8b014..58f683388fc5 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -286,14 +286,6 @@ bool topology_is_primary_thread(unsigned int cpu) } #define topology_is_primary_thread topology_is_primary_thread -/** - * topology_smt_supported - Check whether SMT is supported by the CPUs - */ -bool topology_smt_supported(void) -{ - return smp_num_siblings > 1; -} - /** * topology_phys_to_logical_pkg - Map a physical package id to a logical * diff --git a/kernel/cpu.c b/kernel/cpu.c index 075250153f94..3db5760b8d29 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -441,7 +441,7 @@ void __init cpu_smt_set_num_threads(unsigned int num_threads, { WARN_ON(!num_threads || (num_threads > max_threads)); - if (!topology_smt_supported()) + if (max_threads == 1) cpu_smt_control = CPU_SMT_NOT_SUPPORTED; cpu_smt_max_threads = max_threads; -- 2.33.0

From: Yicong Yang <yangyicong@hisilicon.com> driver inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/IBZQQR ---------------------------------------------------------------------- On building the topology from the devicetree, we've already gotten the SMT thread number of each core. Update the largest SMT thread number and enable the SMT control by the end of topology parsing. The framework's SMT control provides two interface to the users [1] through /sys/devices/system/cpu/smt/control: 1) enable SMT by writing "on" and disable by "off" 2) enable SMT by writing max_thread_number or disable by writing 1 Both method support to completely disable/enable the SMT cores so both work correctly for symmetric SMT platform and asymmetric platform with non-SMT and one type SMT cores like: core A: 1 thread core B: X (X!=1) threads Note that for a theoretically possible multiple SMT-X (X>1) core platform the SMT control is also supported as expected but only by writing the "on/off" method. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Docu... Reviewed-by: Pierre Gondois <pierre.gondois@arm.com> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Hongye Lin <linhongye@h-partners.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- drivers/base/arch_topology.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c index 1dddf518da0a..76244508ccc8 100644 --- a/drivers/base/arch_topology.c +++ b/drivers/base/arch_topology.c @@ -16,6 +16,7 @@ #include <linux/sched/topology.h> #include <linux/cpuset.h> #include <linux/cpumask.h> +#include <linux/cpu_smt.h> #include <linux/init.h> #include <linux/percpu.h> #include <linux/rcupdate.h> @@ -390,6 +391,10 @@ core_initcall(free_raw_capacity); #endif #if defined(CONFIG_ARM64) || defined(CONFIG_RISCV) + +/* Used to enable the SMT control */ +static unsigned int max_smt_thread_num = 1; + /* * This function returns the logic cpu number of the node. * There are basically three kinds of return values: @@ -449,6 +454,8 @@ static int __init parse_core(struct device_node *core, int package_id, i++; } while (t); + max_smt_thread_num = max_t(unsigned int, max_smt_thread_num, i); + cpu = get_cpu_for_node(core); if (cpu >= 0) { if (!leaf) { @@ -554,6 +561,17 @@ static int __init parse_socket(struct device_node *socket) if (!has_socket) ret = parse_cluster(socket, 0, 0); + /* + * Reset the max_smt_thread_num to 1 on failure. Since on failure + * we need to notify the framework the SMT is not supported, but + * max_smt_thread_num can be initialized to the SMT thread number + * of the cores which are successfully parsed. + */ + if (ret) + max_smt_thread_num = 1; + + cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num); + return ret; } -- 2.33.0

From: Yicong Yang <yangyicong@hisilicon.com> driver inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/IBZQQR ---------------------------------------------------------------------- For ACPI we'll build the topology from PPTT and we cannot directly get the SMT number of each core. Instead using a temporary xarray to record the heterogeneous information (from ACPI_PPTT_ACPI_IDENTICAL) and SMT information of the first core in its heterogeneous CPU cluster when building the topology. Then we can know the largest SMT number in the system. If a homogeneous system's using ACPI 6.2 or later, all the CPUs should be under the root node of PPTT. There'll be only one entry in the xarray and all the CPUs in the system will be assumed identical. The framework's SMT control provides two interface to the users [1] through /sys/devices/system/cpu/smt/control: 1) enable SMT by writing "on" and disable by "off" 2) enable SMT by writing max_thread_number or disable by writing 1 Both method support to completely disable/enable the SMT cores so both work correctly for symmetric SMT platform and asymmetric platform with non-SMT and one type SMT cores like: core A: 1 thread core B: X (X!=1) threads Note that for a theoretically possible multiple SMT-X (X>1) core platform the SMT control is also supported as expected but only by writing the "on/off" method. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Docu... Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Pierre Gondois <pierre.gondois@arm.com> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Hongye Lin <linhongye@h-partners.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- arch/arm64/kernel/topology.c | 55 ++++++++++++++++++++++++++++++++++++ 1 file changed, 55 insertions(+) diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c index ea650f2b0e2e..dcc7a666ecfe 100644 --- a/arch/arm64/kernel/topology.c +++ b/arch/arm64/kernel/topology.c @@ -15,8 +15,10 @@ #include <linux/arch_topology.h> #include <linux/cacheinfo.h> #include <linux/cpufreq.h> +#include <linux/cpu_smt.h> #include <linux/init.h> #include <linux/percpu.h> +#include <linux/xarray.h> #include <asm/cpu.h> #include <asm/cputype.h> @@ -37,17 +39,28 @@ static bool __init acpi_cpu_is_threaded(int cpu) return !!is_threaded; } +struct cpu_smt_info { + unsigned int thread_num; + int core_id; +}; + /* * Propagate the topology information of the processor_topology_node tree to the * cpu_topology array. */ int __init parse_acpi_topology(void) { + unsigned int max_smt_thread_num = 1; + struct cpu_smt_info *entry; + struct xarray hetero_cpu; + unsigned long hetero_id; int cpu, topology_id; if (acpi_disabled) return 0; + xa_init(&hetero_cpu); + for_each_possible_cpu(cpu) { int i, cache_id; @@ -59,6 +72,34 @@ int __init parse_acpi_topology(void) cpu_topology[cpu].thread_id = topology_id; topology_id = find_acpi_cpu_topology(cpu, 1); cpu_topology[cpu].core_id = topology_id; + + /* + * In the PPTT, CPUs below a node with the 'identical + * implementation' flag have the same number of threads. + * Count the number of threads for only one CPU (i.e. + * one core_id) among those with the same hetero_id. + * See the comment of find_acpi_cpu_topology_hetero_id() + * for more details. + * + * One entry is created for each node having: + * - the 'identical implementation' flag + * - its parent not having the flag + */ + hetero_id = find_acpi_cpu_topology_hetero_id(cpu); + entry = xa_load(&hetero_cpu, hetero_id); + if (!entry) { + entry = kzalloc(sizeof(*entry), GFP_KERNEL); + WARN_ON_ONCE(!entry); + + if (entry) { + entry->core_id = topology_id; + entry->thread_num = 1; + xa_store(&hetero_cpu, hetero_id, + entry, GFP_KERNEL); + } + } else if (entry->core_id == topology_id) { + entry->thread_num++; + } } else { cpu_topology[cpu].thread_id = -1; cpu_topology[cpu].core_id = topology_id; @@ -81,6 +122,20 @@ int __init parse_acpi_topology(void) } } + /* + * This is a short loop since the number of XArray elements is the + * number of heterogeneous CPU clusters. On a homogeneous system + * there's only one entry in the XArray. + */ + xa_for_each(&hetero_cpu, hetero_id, entry) { + max_smt_thread_num = max(max_smt_thread_num, entry->thread_num); + xa_erase(&hetero_cpu, hetero_id); + kfree(entry); + } + + cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num); + xa_destroy(&hetero_cpu); + return 0; } #endif -- 2.33.0

From: Yicong Yang <yangyicong@hisilicon.com> driver inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/IBZQQR ---------------------------------------------------------------------- Enable HOTPLUG_SMT for SMT control. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Pierre Gondois <pierre.gondois@arm.com> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Hongye Lin <linhongye@h-partners.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- arch/arm64/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 01281aad440b..6890847c1a13 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -203,6 +203,7 @@ config ARM64 select HAVE_KPROBES select HAVE_KRETPROBES select HAVE_GENERIC_VDSO + select HOTPLUG_SMT if HOTPLUG_CPU select IOMMU_DMA if IOMMU_SUPPORT select IRQ_DOMAIN select IRQ_FORCED_THREADING -- 2.33.0

From: Joey Gouly <joey.gouly@arm.com> Add a HWCAP for FEAT_HBC, so that userspace can make a decision on using this feature. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230804143746.3900803-2-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- arch/arm64/include/asm/hwcap.h | 1 + arch/arm64/include/uapi/asm/hwcap.h | 1 + arch/arm64/kernel/cpufeature.c | 3 ++- arch/arm64/kernel/cpuinfo.c | 1 + 4 files changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h index f68fbb207473..2155094ef9f7 100644 --- a/arch/arm64/include/asm/hwcap.h +++ b/arch/arm64/include/asm/hwcap.h @@ -108,6 +108,7 @@ #define KERNEL_HWCAP_ECV __khwcap2_feature(ECV) #define KERNEL_HWCAP_AFP __khwcap2_feature(AFP) #define KERNEL_HWCAP_RPRES __khwcap2_feature(RPRES) +#define KERNEL_HWCAP_HBC __khwcap2_feature(HBC) /* * This yields a mask that user programs can use to figure out what diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h index f03731847d9d..0a3df77b8b64 100644 --- a/arch/arm64/include/uapi/asm/hwcap.h +++ b/arch/arm64/include/uapi/asm/hwcap.h @@ -78,5 +78,6 @@ #define HWCAP2_ECV (1 << 19) #define HWCAP2_AFP (1 << 20) #define HWCAP2_RPRES (1 << 21) +#define HWCAP2_HBC (1UL << 22) #endif /* _UAPI__ASM_HWCAP_H */ diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 752e1001a5c7..433f95c7bb94 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -232,7 +232,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = { }; static const struct arm64_ftr_bits ftr_id_aa64isar2[] = { - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_HIGHER_SAFE, ID_AA64ISAR2_CLEARBHB_SHIFT, 4, 0), + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_HIGHER_SAFE, ID_AA64ISAR2_CLEARBHB_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_RPRES_SHIFT, 4, 0), ARM64_FTR_END, }; @@ -2507,6 +2507,7 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = { HWCAP_CAP(SYS_ID_AA64MMFR0_EL1, ID_AA64MMFR0_ECV_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_ECV), HWCAP_CAP(SYS_ID_AA64MMFR1_EL1, ID_AA64MMFR1_AFP_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_AFP), HWCAP_CAP(SYS_ID_AA64ISAR2_EL1, ID_AA64ISAR2_RPRES_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_RPRES), + HWCAP_CAP(ID_AA64ISAR2_EL1, BC, IMP, CAP_HWCAP, KERNEL_HWCAP_HBC), {}, }; diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c index d72898af05b4..04a7d2a974ce 100644 --- a/arch/arm64/kernel/cpuinfo.c +++ b/arch/arm64/kernel/cpuinfo.c @@ -97,6 +97,7 @@ static const char *const hwcap_str[] = { [KERNEL_HWCAP_ECV] = "ecv", [KERNEL_HWCAP_AFP] = "afp", [KERNEL_HWCAP_RPRES] = "rpres", + [KERNEL_HWCAP_HBC] = "hbc", }; #ifdef CONFIG_COMPAT -- 2.33.0

From: Kristina Martsenko <kristina.martsenko@arm.com> The Arm v8.8 extension adds a new control FEAT_TIDCP1 that allows the kernel to disable all implementation-defined system registers and instructions in userspace. This can improve robustness against covert channels between processes, for example in cases where the firmware or hardware didn't disable that functionality by default. The kernel does not currently support any implementation-defined features, as there are no hwcaps for any such features, so disable all imp-def features unconditionally. Any use of imp-def instructions will result in a SIGILL being delivered to the process (same as for undefined instructions). Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Link: https://lore.kernel.org/r/20220622115424.683520-1-kristina.martsenko@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- arch/arm64/include/asm/sysreg.h | 4 ++++ arch/arm64/kernel/cpufeature.c | 18 ++++++++++++++++++ arch/arm64/tools/cpucaps | 1 + 3 files changed, 23 insertions(+) diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 3cafd4ef0c74..b8d00be796cf 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -907,6 +907,7 @@ /* id_aa64mmfr1 */ #define ID_AA64MMFR1_ECBHB_SHIFT 60 +#define ID_AA64MMFR1_TIDCP1_SHIFT 52 #define ID_AA64MMFR1_AFP_SHIFT 44 #define ID_AA64MMFR1_ETS_SHIFT 36 #define ID_AA64MMFR1_TWED_SHIFT 32 @@ -922,6 +923,9 @@ #define ID_AA64MMFR1_VMIDBITS_8 0 #define ID_AA64MMFR1_VMIDBITS_16 2 +#define ID_AA64MMFR1_TIDCP1_NI 0 +#define ID_AA64MMFR1_TIDCP1_IMP 1 + /* id_aa64mmfr2 */ #define ID_AA64MMFR2_E0PD_SHIFT 60 #define ID_AA64MMFR2_EVT_SHIFT 56 diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 433f95c7bb94..e0b79f932a5f 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -337,6 +337,7 @@ static const struct arm64_ftr_bits ftr_id_aa64mmfr0[] = { }; static const struct arm64_ftr_bits ftr_id_aa64mmfr1[] = { + ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_TIDCP1_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_AFP_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_ETS_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_TWED_SHIFT, 4, 0), @@ -1943,6 +1944,11 @@ static bool is_kvm_protected_mode(const struct arm64_cpu_capabilities *entry, in } #endif /* CONFIG_KVM */ +static void cpu_trap_el0_impdef(const struct arm64_cpu_capabilities *__unused) +{ + sysreg_clear_set(sctlr_el1, 0, SCTLR_EL1_TIDCP); +} + /* Internal helper functions to match cpu capability type */ static bool cpucap_late_cpu_optional(const struct arm64_cpu_capabilities *cap) @@ -2385,6 +2391,18 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .matches = has_cpuid_feature, .min_field_value = 1, }, + { + .desc = "Trap EL0 IMPLEMENTATION DEFINED functionality", + .capability = ARM64_HAS_TIDCP1, + .type = ARM64_CPUCAP_SYSTEM_FEATURE, + .sys_reg = SYS_ID_AA64MMFR1_EL1, + .sign = FTR_UNSIGNED, + .field_pos = ID_AA64MMFR1_TIDCP1_SHIFT, + .field_width = 4, + .min_field_value = ID_AA64MMFR1_TIDCP1_IMP, + .matches = has_cpuid_feature, + .cpu_enable = cpu_trap_el0_impdef, + }, {}, }; diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps index a590cabe890c..c44cd3b076e9 100644 --- a/arch/arm64/tools/cpucaps +++ b/arch/arm64/tools/cpucaps @@ -34,6 +34,7 @@ HAS_RNG HAS_SB HAS_STAGE2_FWB HAS_SYSREG_GIC_CPUIF +HAS_TIDCP1 HAS_TLB_RANGE HAS_VIRT_HOST_EXTN HW_DBM -- 2.33.0

反馈: 您发送到kernel@openeuler.org的补丁/补丁集,转换为PR失败! 邮件列表地址:https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/UHB... 失败原因:补丁集缺失封面信息 建议解决方法:请提供补丁集并重新发送您的补丁集到邮件列表 FeedBack: The patch(es) which you have sent to kernel@openeuler.org has been converted to PR failed! Mailing list address: https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/UHB... Failed Reason: the cover of the patches is missing Suggest Solution: please checkout and apply the patches' cover and send all again

From: Yu Liao <liaoyu15@huawei.com> stable inclusion from stable-v6.6.44 commit 3a58c590f6bd1d20eb1e76c5cea31c36cc032339 bugzilla: https://gitee.com/openeuler/kernel/issues/IAHMJO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- commit f7d43dd206e7e18c182f200e67a8db8c209907fa upstream. Running the LTP hotplug stress test on a aarch64 machine results in rcu_sched stall warnings when the broadcast hrtimer was owned by the un-plugged CPU. The issue is the following: CPU1 (owns the broadcast hrtimer) CPU2 tick_broadcast_enter() // shutdown local timer device broadcast_shutdown_local() ... tick_broadcast_exit() clockevents_switch_state(dev, CLOCK_EVT_STATE_ONESHOT) // timer device is not programmed cpumask_set_cpu(cpu, tick_broadcast_force_mask) initiates offlining of CPU1 take_cpu_down() /* * CPU1 shuts down and does not * send broadcast IPI anymore */ takedown_cpu() hotplug_cpu__broadcast_tick_pull() // move broadcast hrtimer to this CPU clockevents_program_event() bc_set_next() hrtimer_start() /* * timer device is not programmed * because only the first expiring * timer will trigger clockevent * device reprogramming */ What happens is that CPU2 exits broadcast mode with force bit set, then the local timer device is not reprogrammed and CPU2 expects to receive the expired event by the broadcast IPI. But this does not happen because CPU1 is offlined by CPU2. CPU switches the clockevent device to ONESHOT state, but does not reprogram the device. The subsequent reprogramming of the hrtimer broadcast device does not program the clockevent device of CPU2 either because the pending expiry time is already in the past and the CPU expects the event to be delivered. As a consequence all CPUs which wait for a broadcast event to be delivered are stuck forever. Fix this issue by reprogramming the local timer device if the broadcast force bit of the CPU is set so that the broadcast hrtimer is delivered. [ tglx: Massage comment and change log. Add Fixes tag ] Fixes: 989dcb645ca7 ("tick: Handle broadcast wakeup of multiple cpus") Signed-off-by: Yu Liao <liaoyu15@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240711124843.64167-1-liaoyu15@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: ZhangPeng <zhangpeng362@huawei.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- kernel/time/tick-broadcast.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c index 0916cc9adb82..ba551ec546f5 100644 --- a/kernel/time/tick-broadcast.c +++ b/kernel/time/tick-broadcast.c @@ -1137,6 +1137,7 @@ void tick_broadcast_switch_to_oneshot(void) #ifdef CONFIG_HOTPLUG_CPU void hotplug_cpu__broadcast_tick_pull(int deadcpu) { + struct tick_device *td = this_cpu_ptr(&tick_cpu_device); struct clock_event_device *bc; unsigned long flags; @@ -1144,6 +1145,28 @@ void hotplug_cpu__broadcast_tick_pull(int deadcpu) bc = tick_broadcast_device.evtdev; if (bc && broadcast_needs_cpu(bc, deadcpu)) { + /* + * If the broadcast force bit of the current CPU is set, + * then the current CPU has not yet reprogrammed the local + * timer device to avoid a ping-pong race. See + * ___tick_broadcast_oneshot_control(). + * + * If the broadcast device is hrtimer based then + * programming the broadcast event below does not have any + * effect because the local clockevent device is not + * running and not programmed because the broadcast event + * is not earlier than the pending event of the local clock + * event device. As a consequence all CPUs waiting for a + * broadcast event are stuck forever. + * + * Detect this condition and reprogram the cpu local timer + * device to avoid the starvation. + */ + if (tick_check_broadcast_expired()) { + cpumask_clear_cpu(smp_processor_id(), tick_broadcast_force_mask); + tick_program_event(td->evtdev->next_event, 1); + } + /* This moves the broadcast assignment to this CPU: */ clockevents_program_event(bc, bc->next_event, 1); } -- 2.33.0

From: Mark Brown <broonie@kernel.org> Currently we don't have any coverage of the syscall ABI so let's add a very dumb test program which sets up register patterns, does a sysscall and then checks that the register state after the syscall matches what we expect. The program is written in an extremely simplistic fashion with the goal of making it easy to verify that it's doing what it thinks it's doing, it is not a model of how one should write actual code. Currently we validate the general purpose, FPSIMD and SVE registers. There are other thing things that could be covered like FPCR and flags registers, these can be covered incrementally - my main focus at the minute is covering the ABI for the SVE registers. The program repeats the tests for all possible SVE vector lengths in case some vector length specific optimisation causes issues, as well as testing FPSIMD only. It tries two syscalls, getpid() and sched_yield(), in an effort to cover both immediate return to userspace and scheduling another task though there are no guarantees which cases will be hit. A new test directory "abi" is added to hold the test, it doesn't seem to fit well into any of the existing directories. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211210184133.320748-7-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- tools/testing/selftests/arm64/Makefile | 2 +- tools/testing/selftests/arm64/abi/.gitignore | 1 + tools/testing/selftests/arm64/abi/Makefile | 8 + .../selftests/arm64/abi/syscall-abi-asm.S | 240 +++++++++++++ .../testing/selftests/arm64/abi/syscall-abi.c | 318 ++++++++++++++++++ 5 files changed, 568 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/arm64/abi/.gitignore create mode 100644 tools/testing/selftests/arm64/abi/Makefile create mode 100644 tools/testing/selftests/arm64/abi/syscall-abi-asm.S create mode 100644 tools/testing/selftests/arm64/abi/syscall-abi.c diff --git a/tools/testing/selftests/arm64/Makefile b/tools/testing/selftests/arm64/Makefile index ced910fb4019..1e8d9a8f59df 100644 --- a/tools/testing/selftests/arm64/Makefile +++ b/tools/testing/selftests/arm64/Makefile @@ -4,7 +4,7 @@ ARCH ?= $(shell uname -m 2>/dev/null || echo not) ifneq (,$(filter $(ARCH),aarch64 arm64)) -ARM64_SUBTARGETS ?= tags signal pauth fp mte bti +ARM64_SUBTARGETS ?= tags signal pauth fp mte bti abi else ARM64_SUBTARGETS := endif diff --git a/tools/testing/selftests/arm64/abi/.gitignore b/tools/testing/selftests/arm64/abi/.gitignore new file mode 100644 index 000000000000..b79cf5814c23 --- /dev/null +++ b/tools/testing/selftests/arm64/abi/.gitignore @@ -0,0 +1 @@ +syscall-abi diff --git a/tools/testing/selftests/arm64/abi/Makefile b/tools/testing/selftests/arm64/abi/Makefile new file mode 100644 index 000000000000..96eba974ac8d --- /dev/null +++ b/tools/testing/selftests/arm64/abi/Makefile @@ -0,0 +1,8 @@ +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) 2021 ARM Limited + +TEST_GEN_PROGS := syscall-abi + +include ../../lib.mk + +$(OUTPUT)/syscall-abi: syscall-abi.c syscall-abi-asm.S diff --git a/tools/testing/selftests/arm64/abi/syscall-abi-asm.S b/tools/testing/selftests/arm64/abi/syscall-abi-asm.S new file mode 100644 index 000000000000..983467cfcee0 --- /dev/null +++ b/tools/testing/selftests/arm64/abi/syscall-abi-asm.S @@ -0,0 +1,240 @@ +// SPDX-License-Identifier: GPL-2.0-only +// Copyright (C) 2021 ARM Limited. +// +// Assembly portion of the syscall ABI test + +// +// Load values from memory into registers, invoke a syscall and save the +// register values back to memory for later checking. The syscall to be +// invoked is configured in x8 of the input GPR data. +// +// x0: SVE VL, 0 for FP only +// +// GPRs: gpr_in, gpr_out +// FPRs: fpr_in, fpr_out +// Zn: z_in, z_out +// Pn: p_in, p_out +// FFR: ffr_in, ffr_out + +.arch_extension sve + +.globl do_syscall +do_syscall: + // Store callee saved registers x19-x29 (80 bytes) plus x0 and x1 + stp x29, x30, [sp, #-112]! + mov x29, sp + stp x0, x1, [sp, #16] + stp x19, x20, [sp, #32] + stp x21, x22, [sp, #48] + stp x23, x24, [sp, #64] + stp x25, x26, [sp, #80] + stp x27, x28, [sp, #96] + + // Load GPRs x8-x28, and save our SP/FP for later comparison + ldr x2, =gpr_in + add x2, x2, #64 + ldp x8, x9, [x2], #16 + ldp x10, x11, [x2], #16 + ldp x12, x13, [x2], #16 + ldp x14, x15, [x2], #16 + ldp x16, x17, [x2], #16 + ldp x18, x19, [x2], #16 + ldp x20, x21, [x2], #16 + ldp x22, x23, [x2], #16 + ldp x24, x25, [x2], #16 + ldp x26, x27, [x2], #16 + ldr x28, [x2], #8 + str x29, [x2], #8 // FP + str x30, [x2], #8 // LR + + // Load FPRs if we're not doing SVE + cbnz x0, 1f + ldr x2, =fpr_in + ldp q0, q1, [x2] + ldp q2, q3, [x2, #16 * 2] + ldp q4, q5, [x2, #16 * 4] + ldp q6, q7, [x2, #16 * 6] + ldp q8, q9, [x2, #16 * 8] + ldp q10, q11, [x2, #16 * 10] + ldp q12, q13, [x2, #16 * 12] + ldp q14, q15, [x2, #16 * 14] + ldp q16, q17, [x2, #16 * 16] + ldp q18, q19, [x2, #16 * 18] + ldp q20, q21, [x2, #16 * 20] + ldp q22, q23, [x2, #16 * 22] + ldp q24, q25, [x2, #16 * 24] + ldp q26, q27, [x2, #16 * 26] + ldp q28, q29, [x2, #16 * 28] + ldp q30, q31, [x2, #16 * 30] +1: + + // Load the SVE registers if we're doing SVE + cbz x0, 1f + + ldr x2, =z_in + ldr z0, [x2, #0, MUL VL] + ldr z1, [x2, #1, MUL VL] + ldr z2, [x2, #2, MUL VL] + ldr z3, [x2, #3, MUL VL] + ldr z4, [x2, #4, MUL VL] + ldr z5, [x2, #5, MUL VL] + ldr z6, [x2, #6, MUL VL] + ldr z7, [x2, #7, MUL VL] + ldr z8, [x2, #8, MUL VL] + ldr z9, [x2, #9, MUL VL] + ldr z10, [x2, #10, MUL VL] + ldr z11, [x2, #11, MUL VL] + ldr z12, [x2, #12, MUL VL] + ldr z13, [x2, #13, MUL VL] + ldr z14, [x2, #14, MUL VL] + ldr z15, [x2, #15, MUL VL] + ldr z16, [x2, #16, MUL VL] + ldr z17, [x2, #17, MUL VL] + ldr z18, [x2, #18, MUL VL] + ldr z19, [x2, #19, MUL VL] + ldr z20, [x2, #20, MUL VL] + ldr z21, [x2, #21, MUL VL] + ldr z22, [x2, #22, MUL VL] + ldr z23, [x2, #23, MUL VL] + ldr z24, [x2, #24, MUL VL] + ldr z25, [x2, #25, MUL VL] + ldr z26, [x2, #26, MUL VL] + ldr z27, [x2, #27, MUL VL] + ldr z28, [x2, #28, MUL VL] + ldr z29, [x2, #29, MUL VL] + ldr z30, [x2, #30, MUL VL] + ldr z31, [x2, #31, MUL VL] + + ldr x2, =ffr_in + ldr p0, [x2, #0] + wrffr p0.b + + ldr x2, =p_in + ldr p0, [x2, #0, MUL VL] + ldr p1, [x2, #1, MUL VL] + ldr p2, [x2, #2, MUL VL] + ldr p3, [x2, #3, MUL VL] + ldr p4, [x2, #4, MUL VL] + ldr p5, [x2, #5, MUL VL] + ldr p6, [x2, #6, MUL VL] + ldr p7, [x2, #7, MUL VL] + ldr p8, [x2, #8, MUL VL] + ldr p9, [x2, #9, MUL VL] + ldr p10, [x2, #10, MUL VL] + ldr p11, [x2, #11, MUL VL] + ldr p12, [x2, #12, MUL VL] + ldr p13, [x2, #13, MUL VL] + ldr p14, [x2, #14, MUL VL] + ldr p15, [x2, #15, MUL VL] +1: + + // Do the syscall + svc #0 + + // Save GPRs x8-x30 + ldr x2, =gpr_out + add x2, x2, #64 + stp x8, x9, [x2], #16 + stp x10, x11, [x2], #16 + stp x12, x13, [x2], #16 + stp x14, x15, [x2], #16 + stp x16, x17, [x2], #16 + stp x18, x19, [x2], #16 + stp x20, x21, [x2], #16 + stp x22, x23, [x2], #16 + stp x24, x25, [x2], #16 + stp x26, x27, [x2], #16 + stp x28, x29, [x2], #16 + str x30, [x2] + + // Restore x0 and x1 for feature checks + ldp x0, x1, [sp, #16] + + // Save FPSIMD state + ldr x2, =fpr_out + stp q0, q1, [x2] + stp q2, q3, [x2, #16 * 2] + stp q4, q5, [x2, #16 * 4] + stp q6, q7, [x2, #16 * 6] + stp q8, q9, [x2, #16 * 8] + stp q10, q11, [x2, #16 * 10] + stp q12, q13, [x2, #16 * 12] + stp q14, q15, [x2, #16 * 14] + stp q16, q17, [x2, #16 * 16] + stp q18, q19, [x2, #16 * 18] + stp q20, q21, [x2, #16 * 20] + stp q22, q23, [x2, #16 * 22] + stp q24, q25, [x2, #16 * 24] + stp q26, q27, [x2, #16 * 26] + stp q28, q29, [x2, #16 * 28] + stp q30, q31, [x2, #16 * 30] + + // Save the SVE state if we have some + cbz x0, 1f + + ldr x2, =z_out + str z0, [x2, #0, MUL VL] + str z1, [x2, #1, MUL VL] + str z2, [x2, #2, MUL VL] + str z3, [x2, #3, MUL VL] + str z4, [x2, #4, MUL VL] + str z5, [x2, #5, MUL VL] + str z6, [x2, #6, MUL VL] + str z7, [x2, #7, MUL VL] + str z8, [x2, #8, MUL VL] + str z9, [x2, #9, MUL VL] + str z10, [x2, #10, MUL VL] + str z11, [x2, #11, MUL VL] + str z12, [x2, #12, MUL VL] + str z13, [x2, #13, MUL VL] + str z14, [x2, #14, MUL VL] + str z15, [x2, #15, MUL VL] + str z16, [x2, #16, MUL VL] + str z17, [x2, #17, MUL VL] + str z18, [x2, #18, MUL VL] + str z19, [x2, #19, MUL VL] + str z20, [x2, #20, MUL VL] + str z21, [x2, #21, MUL VL] + str z22, [x2, #22, MUL VL] + str z23, [x2, #23, MUL VL] + str z24, [x2, #24, MUL VL] + str z25, [x2, #25, MUL VL] + str z26, [x2, #26, MUL VL] + str z27, [x2, #27, MUL VL] + str z28, [x2, #28, MUL VL] + str z29, [x2, #29, MUL VL] + str z30, [x2, #30, MUL VL] + str z31, [x2, #31, MUL VL] + + ldr x2, =p_out + str p0, [x2, #0, MUL VL] + str p1, [x2, #1, MUL VL] + str p2, [x2, #2, MUL VL] + str p3, [x2, #3, MUL VL] + str p4, [x2, #4, MUL VL] + str p5, [x2, #5, MUL VL] + str p6, [x2, #6, MUL VL] + str p7, [x2, #7, MUL VL] + str p8, [x2, #8, MUL VL] + str p9, [x2, #9, MUL VL] + str p10, [x2, #10, MUL VL] + str p11, [x2, #11, MUL VL] + str p12, [x2, #12, MUL VL] + str p13, [x2, #13, MUL VL] + str p14, [x2, #14, MUL VL] + str p15, [x2, #15, MUL VL] + + ldr x2, =ffr_out + rdffr p0.b + str p0, [x2, #0] +1: + + // Restore callee saved registers x19-x30 + ldp x19, x20, [sp, #32] + ldp x21, x22, [sp, #48] + ldp x23, x24, [sp, #64] + ldp x25, x26, [sp, #80] + ldp x27, x28, [sp, #96] + ldp x29, x30, [sp], #112 + + ret diff --git a/tools/testing/selftests/arm64/abi/syscall-abi.c b/tools/testing/selftests/arm64/abi/syscall-abi.c new file mode 100644 index 000000000000..d8eeeafb50dc --- /dev/null +++ b/tools/testing/selftests/arm64/abi/syscall-abi.c @@ -0,0 +1,318 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2021 ARM Limited. + */ + +#include <errno.h> +#include <stdbool.h> +#include <stddef.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +#include <sys/auxv.h> +#include <sys/prctl.h> +#include <asm/hwcap.h> +#include <asm/sigcontext.h> +#include <asm/unistd.h> + +#include "../../kselftest.h" + +#define ARRAY_SIZE(a) (sizeof(a) / sizeof(a[0])) +#define NUM_VL ((SVE_VQ_MAX - SVE_VQ_MIN) + 1) + +extern void do_syscall(int sve_vl); + +static void fill_random(void *buf, size_t size) +{ + int i; + uint32_t *lbuf = buf; + + /* random() returns a 32 bit number regardless of the size of long */ + for (i = 0; i < size / sizeof(uint32_t); i++) + lbuf[i] = random(); +} + +/* + * We also repeat the test for several syscalls to try to expose different + * behaviour. + */ +static struct syscall_cfg { + int syscall_nr; + const char *name; +} syscalls[] = { + { __NR_getpid, "getpid()" }, + { __NR_sched_yield, "sched_yield()" }, +}; + +#define NUM_GPR 31 +uint64_t gpr_in[NUM_GPR]; +uint64_t gpr_out[NUM_GPR]; + +static void setup_gpr(struct syscall_cfg *cfg, int sve_vl) +{ + fill_random(gpr_in, sizeof(gpr_in)); + gpr_in[8] = cfg->syscall_nr; + memset(gpr_out, 0, sizeof(gpr_out)); +} + +static int check_gpr(struct syscall_cfg *cfg, int sve_vl) +{ + int errors = 0; + int i; + + /* + * GPR x0-x7 may be clobbered, and all others should be preserved. + */ + for (i = 9; i < ARRAY_SIZE(gpr_in); i++) { + if (gpr_in[i] != gpr_out[i]) { + ksft_print_msg("%s SVE VL %d mismatch in GPR %d: %llx != %llx\n", + cfg->name, sve_vl, i, + gpr_in[i], gpr_out[i]); + errors++; + } + } + + return errors; +} + +#define NUM_FPR 32 +uint64_t fpr_in[NUM_FPR * 2]; +uint64_t fpr_out[NUM_FPR * 2]; + +static void setup_fpr(struct syscall_cfg *cfg, int sve_vl) +{ + fill_random(fpr_in, sizeof(fpr_in)); + memset(fpr_out, 0, sizeof(fpr_out)); +} + +static int check_fpr(struct syscall_cfg *cfg, int sve_vl) +{ + int errors = 0; + int i; + + if (!sve_vl) { + for (i = 0; i < ARRAY_SIZE(fpr_in); i++) { + if (fpr_in[i] != fpr_out[i]) { + ksft_print_msg("%s Q%d/%d mismatch %llx != %llx\n", + cfg->name, + i / 2, i % 2, + fpr_in[i], fpr_out[i]); + errors++; + } + } + } + + return errors; +} + +static uint8_t z_zero[__SVE_ZREG_SIZE(SVE_VQ_MAX)]; +uint8_t z_in[SVE_NUM_PREGS * __SVE_ZREG_SIZE(SVE_VQ_MAX)]; +uint8_t z_out[SVE_NUM_PREGS * __SVE_ZREG_SIZE(SVE_VQ_MAX)]; + +static void setup_z(struct syscall_cfg *cfg, int sve_vl) +{ + fill_random(z_in, sizeof(z_in)); + fill_random(z_out, sizeof(z_out)); +} + +static int check_z(struct syscall_cfg *cfg, int sve_vl) +{ + size_t reg_size = sve_vl; + int errors = 0; + int i; + + if (!sve_vl) + return 0; + + /* + * After a syscall the low 128 bits of the Z registers should + * be preserved and the rest be zeroed or preserved. + */ + for (i = 0; i < SVE_NUM_ZREGS; i++) { + void *in = &z_in[reg_size * i]; + void *out = &z_out[reg_size * i]; + + if (memcmp(in, out, SVE_VQ_BYTES) != 0) { + ksft_print_msg("%s SVE VL %d Z%d low 128 bits changed\n", + cfg->name, sve_vl, i); + errors++; + } + } + + return errors; +} + +uint8_t p_in[SVE_NUM_PREGS * __SVE_PREG_SIZE(SVE_VQ_MAX)]; +uint8_t p_out[SVE_NUM_PREGS * __SVE_PREG_SIZE(SVE_VQ_MAX)]; + +static void setup_p(struct syscall_cfg *cfg, int sve_vl) +{ + fill_random(p_in, sizeof(p_in)); + fill_random(p_out, sizeof(p_out)); +} + +static int check_p(struct syscall_cfg *cfg, int sve_vl) +{ + size_t reg_size = sve_vq_from_vl(sve_vl) * 2; /* 1 bit per VL byte */ + + int errors = 0; + int i; + + if (!sve_vl) + return 0; + + /* After a syscall the P registers should be preserved or zeroed */ + for (i = 0; i < SVE_NUM_PREGS * reg_size; i++) + if (p_out[i] && (p_in[i] != p_out[i])) + errors++; + if (errors) + ksft_print_msg("%s SVE VL %d predicate registers non-zero\n", + cfg->name, sve_vl); + + return errors; +} + +uint8_t ffr_in[__SVE_PREG_SIZE(SVE_VQ_MAX)]; +uint8_t ffr_out[__SVE_PREG_SIZE(SVE_VQ_MAX)]; + +static void setup_ffr(struct syscall_cfg *cfg, int sve_vl) +{ + /* + * It is only valid to set a contiguous set of bits starting + * at 0. For now since we're expecting this to be cleared by + * a syscall just set all bits. + */ + memset(ffr_in, 0xff, sizeof(ffr_in)); + fill_random(ffr_out, sizeof(ffr_out)); +} + +static int check_ffr(struct syscall_cfg *cfg, int sve_vl) +{ + size_t reg_size = sve_vq_from_vl(sve_vl) * 2; /* 1 bit per VL byte */ + int errors = 0; + int i; + + if (!sve_vl) + return 0; + + /* After a syscall the P registers should be preserved or zeroed */ + for (i = 0; i < reg_size; i++) + if (ffr_out[i] && (ffr_in[i] != ffr_out[i])) + errors++; + if (errors) + ksft_print_msg("%s SVE VL %d FFR non-zero\n", + cfg->name, sve_vl); + + return errors; +} + +typedef void (*setup_fn)(struct syscall_cfg *cfg, int sve_vl); +typedef int (*check_fn)(struct syscall_cfg *cfg, int sve_vl); + +/* + * Each set of registers has a setup function which is called before + * the syscall to fill values in a global variable for loading by the + * test code and a check function which validates that the results are + * as expected. Vector lengths are passed everywhere, a vector length + * of 0 should be treated as do not test. + */ +static struct { + setup_fn setup; + check_fn check; +} regset[] = { + { setup_gpr, check_gpr }, + { setup_fpr, check_fpr }, + { setup_z, check_z }, + { setup_p, check_p }, + { setup_ffr, check_ffr }, +}; + +static bool do_test(struct syscall_cfg *cfg, int sve_vl) +{ + int errors = 0; + int i; + + for (i = 0; i < ARRAY_SIZE(regset); i++) + regset[i].setup(cfg, sve_vl); + + do_syscall(sve_vl); + + for (i = 0; i < ARRAY_SIZE(regset); i++) + errors += regset[i].check(cfg, sve_vl); + + return errors == 0; +} + +static void test_one_syscall(struct syscall_cfg *cfg) +{ + int sve_vq, sve_vl; + + /* FPSIMD only case */ + ksft_test_result(do_test(cfg, 0), + "%s FPSIMD\n", cfg->name); + + if (!(getauxval(AT_HWCAP) & HWCAP_SVE)) + return; + + for (sve_vq = SVE_VQ_MAX; sve_vq > 0; --sve_vq) { + sve_vl = prctl(PR_SVE_SET_VL, sve_vq * 16); + if (sve_vl == -1) + ksft_exit_fail_msg("PR_SVE_SET_VL failed: %s (%d)\n", + strerror(errno), errno); + + sve_vl &= PR_SVE_VL_LEN_MASK; + + if (sve_vq != sve_vq_from_vl(sve_vl)) + sve_vq = sve_vq_from_vl(sve_vl); + + ksft_test_result(do_test(cfg, sve_vl), + "%s SVE VL %d\n", cfg->name, sve_vl); + } +} + +int sve_count_vls(void) +{ + unsigned int vq; + int vl_count = 0; + int vl; + + if (!(getauxval(AT_HWCAP) & HWCAP_SVE)) + return 0; + + /* + * Enumerate up to SVE_VQ_MAX vector lengths + */ + for (vq = SVE_VQ_MAX; vq > 0; --vq) { + vl = prctl(PR_SVE_SET_VL, vq * 16); + if (vl == -1) + ksft_exit_fail_msg("PR_SVE_SET_VL failed: %s (%d)\n", + strerror(errno), errno); + + vl &= PR_SVE_VL_LEN_MASK; + + if (vq != sve_vq_from_vl(vl)) + vq = sve_vq_from_vl(vl); + + vl_count++; + } + + return vl_count; +} + +int main(void) +{ + int i; + + srandom(getpid()); + + ksft_print_header(); + ksft_set_plan(ARRAY_SIZE(syscalls) * (sve_count_vls() + 1)); + + for (i = 0; i < ARRAY_SIZE(syscalls); i++) + test_one_syscall(&syscalls[i]); + + ksft_print_cnts(); + + return 0; +} -- 2.33.0

From: Mark Brown <broonie@kernel.org> Add some trivial hwcap validation which checks that /proc/cpuinfo and AT_HWCAP agree with each other and can verify that for extensions that can generate a SIGILL due to adding new instructions one appears or doesn't appear as expected. I've added SVE and SME, other capabilities can be added later if this gets merged. This isn't super exciting but on the other hand took very little time to write and should be handy when verifying that you wired up AT_HWCAP properly. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220829154602.827275-1-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- tools/testing/selftests/arm64/abi/Makefile | 2 +- tools/testing/selftests/arm64/abi/hwcap.c | 188 +++++++++++++++++++++ 2 files changed, 189 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/arm64/abi/hwcap.c diff --git a/tools/testing/selftests/arm64/abi/Makefile b/tools/testing/selftests/arm64/abi/Makefile index 96eba974ac8d..e84ccbed660b 100644 --- a/tools/testing/selftests/arm64/abi/Makefile +++ b/tools/testing/selftests/arm64/abi/Makefile @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 # Copyright (C) 2021 ARM Limited -TEST_GEN_PROGS := syscall-abi +TEST_GEN_PROGS := hwcap syscall-abi include ../../lib.mk diff --git a/tools/testing/selftests/arm64/abi/hwcap.c b/tools/testing/selftests/arm64/abi/hwcap.c new file mode 100644 index 000000000000..8b5c646cea63 --- /dev/null +++ b/tools/testing/selftests/arm64/abi/hwcap.c @@ -0,0 +1,188 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2022 ARM Limited. + */ + +#include <errno.h> +#include <signal.h> +#include <stdbool.h> +#include <stddef.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +#include <sys/auxv.h> +#include <sys/prctl.h> +#include <asm/hwcap.h> +#include <asm/sigcontext.h> +#include <asm/unistd.h> + +#include "../../kselftest.h" + +#define TESTS_PER_HWCAP 2 + +/* + * Function expected to generate SIGILL when the feature is not + * supported and return when it is supported. If SIGILL is generated + * then the handler must be able to skip over the instruction safely. + * + * Note that it is expected that for many architecture extensions + * there are no specific traps due to no architecture state being + * added so we may not fault if running on a kernel which doesn't know + * to add the hwcap. + */ +typedef void (*sigill_fn)(void); + +static void sme_sigill(void) +{ + /* RDSVL x0, #0 */ + asm volatile(".inst 0x04bf5800" : : : "x0"); +} + +static void sve_sigill(void) +{ + /* RDVL x0, #0 */ + asm volatile(".inst 0x04bf5000" : : : "x0"); +} + +static const struct hwcap_data { + const char *name; + unsigned long at_hwcap; + unsigned long hwcap_bit; + const char *cpuinfo; + sigill_fn sigill_fn; + bool sigill_reliable; +} hwcaps[] = { + { + .name = "SME", + .at_hwcap = AT_HWCAP2, + .hwcap_bit = HWCAP2_SME, + .cpuinfo = "sme", + .sigill_fn = sme_sigill, + .sigill_reliable = true, + }, + { + .name = "SVE", + .at_hwcap = AT_HWCAP, + .hwcap_bit = HWCAP_SVE, + .cpuinfo = "sve", + .sigill_fn = sve_sigill, + .sigill_reliable = true, + }, +}; + +static bool seen_sigill; + +static void handle_sigill(int sig, siginfo_t *info, void *context) +{ + ucontext_t *uc = context; + + seen_sigill = true; + + /* Skip over the offending instruction */ + uc->uc_mcontext.pc += 4; +} + +bool cpuinfo_present(const char *name) +{ + FILE *f; + char buf[2048], name_space[30], name_newline[30]; + char *s; + + /* + * The feature should appear with a leading space and either a + * trailing space or a newline. + */ + snprintf(name_space, sizeof(name_space), " %s ", name); + snprintf(name_newline, sizeof(name_newline), " %s\n", name); + + f = fopen("/proc/cpuinfo", "r"); + if (!f) { + ksft_print_msg("Failed to open /proc/cpuinfo\n"); + return false; + } + + while (fgets(buf, sizeof(buf), f)) { + /* Features: line? */ + if (strncmp(buf, "Features\t:", strlen("Features\t:")) != 0) + continue; + + /* All CPUs should be symmetric, don't read any more */ + fclose(f); + + s = strstr(buf, name_space); + if (s) + return true; + s = strstr(buf, name_newline); + if (s) + return true; + + return false; + } + + ksft_print_msg("Failed to find Features in /proc/cpuinfo\n"); + fclose(f); + return false; +} + +int main(void) +{ + const struct hwcap_data *hwcap; + int i, ret; + bool have_cpuinfo, have_hwcap; + struct sigaction sa; + + ksft_print_header(); + ksft_set_plan(ARRAY_SIZE(hwcaps) * TESTS_PER_HWCAP); + + memset(&sa, 0, sizeof(sa)); + sa.sa_sigaction = handle_sigill; + sa.sa_flags = SA_RESTART | SA_SIGINFO; + sigemptyset(&sa.sa_mask); + ret = sigaction(SIGILL, &sa, NULL); + if (ret < 0) + ksft_exit_fail_msg("Failed to install SIGILL handler: %s (%d)\n", + strerror(errno), errno); + + for (i = 0; i < ARRAY_SIZE(hwcaps); i++) { + hwcap = &hwcaps[i]; + + have_hwcap = getauxval(hwcaps->at_hwcap) & hwcap->hwcap_bit; + have_cpuinfo = cpuinfo_present(hwcap->cpuinfo); + + if (have_hwcap) + ksft_print_msg("%s present", hwcap->name); + + ksft_test_result(have_hwcap == have_cpuinfo, + "cpuinfo_match_%s\n", hwcap->name); + + if (hwcap->sigill_fn) { + seen_sigill = false; + hwcap->sigill_fn(); + + if (have_hwcap) { + /* Should be able to use the extension */ + ksft_test_result(!seen_sigill, "sigill_%s\n", + hwcap->name); + } else if (hwcap->sigill_reliable) { + /* Guaranteed a SIGILL */ + ksft_test_result(seen_sigill, "sigill_%s\n", + hwcap->name); + } else { + /* Missing SIGILL might be fine */ + ksft_print_msg("SIGILL %sreported for %s\n", + seen_sigill ? "" : "not ", + hwcap->name); + ksft_test_result_skip("sigill_%s\n", + hwcap->name); + } + } else { + ksft_test_result_skip("sigill_%s\n", + hwcap->name); + } + } + + ksft_print_cnts(); + + return 0; +} -- 2.33.0

From: Joey Gouly <joey.gouly@arm.com> Add a test for the newly added HWCAP2_HBC. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230804143746.3900803-3-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- tools/testing/selftests/arm64/abi/hwcap.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/tools/testing/selftests/arm64/abi/hwcap.c b/tools/testing/selftests/arm64/abi/hwcap.c index 8b5c646cea63..65a7b01d7cdc 100644 --- a/tools/testing/selftests/arm64/abi/hwcap.c +++ b/tools/testing/selftests/arm64/abi/hwcap.c @@ -45,6 +45,13 @@ static void sve_sigill(void) asm volatile(".inst 0x04bf5000" : : : "x0"); } +static void hbc_sigill(void) +{ + /* BC.EQ +4 */ + asm volatile("cmp xzr, xzr\n" + ".inst 0x54000030" : : : "cc"); +} + static const struct hwcap_data { const char *name; unsigned long at_hwcap; @@ -69,6 +76,14 @@ static const struct hwcap_data { .sigill_fn = sve_sigill, .sigill_reliable = true, }, + { + .name = "HBC", + .at_hwcap = AT_HWCAP2, + .hwcap_bit = HWCAP2_HBC, + .cpuinfo = "hbc", + .sigill_fn = hbc_sigill, + .sigill_reliable = true, + }, }; static bool seen_sigill; -- 2.33.0
participants (2)
-
patchwork bot
-
Qi Xi