[PATCH OLK-6.6 0/9] cpufreq: CPPC: Don't warn on failing to read perf counters on offline cpus

From: linhongye <15041989+linhongye@user.noreply.gitee.com> driver inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ICEXYT ---------------------------------------------------------------------- Jie Zhan (2): cpufreq: CPPC: Don't warn on failing to read perf counters on offline cpus cpufreq: CPPC: Fix error handling in cppc_scale_freq_workfn() Lifeng Zheng (1): arm64: topology: Setup amu fie when cpu hotplugging Vincent Guittot (6): sched/topology: Add a new arch_scale_freq_ref() method cpufreq: Use the fixed and coherent frequency for scaling capacity cpufreq/schedutil: Use a fixed reference frequency energy_model: Use a fixed reference frequency cpufreq/cppc: Set the frequency used for computing the capacity arm64/amu: Use capacity_ref_freq() to set AMU ratio arch/arm/include/asm/topology.h | 1 + arch/arm64/include/asm/topology.h | 1 + arch/arm64/kernel/topology.c | 77 +++++++++++-------------------- arch/riscv/include/asm/topology.h | 1 + drivers/base/arch_topology.c | 56 +++++++++++++++------- drivers/cpufreq/cppc_cpufreq.c | 23 ++++++--- drivers/cpufreq/cpufreq.c | 4 +- include/linux/arch_topology.h | 8 ++++ include/linux/cpufreq.h | 1 + include/linux/energy_model.h | 6 +-- include/linux/sched/topology.h | 8 ++++ kernel/sched/cpufreq_schedutil.c | 26 ++++++++++- 12 files changed, 132 insertions(+), 80 deletions(-) -- 2.33.0

From: Vincent Guittot <vincent.guittot@linaro.org> mainline inclusion from mainline-v6.8-rc6 commit 9942cb22ea458c34fa17b73d143ea32d4df1caca category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ICEXYT CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ---------------------------------------------------------------------- Create a new method to get a unique and fixed max frequency. Currently cpuinfo.max_freq or the highest (or last) state of performance domain are used as the max frequency when computing the frequency for a level of utilization, but: - cpuinfo_max_freq can change at runtime. boost is one example of such change. - cpuinfo.max_freq and last item of the PD can be different leading to different results between cpufreq and energy model. We need to save the reference frequency that has been used when computing the CPUs capacity and use this fixed and coherent value to convert between frequency and CPU's capacity. In fact, we already save the frequency that has been used when computing the capacity of each CPU. We extend the precision to save kHz instead of MHz currently and we modify the type to be aligned with other variables used when converting frequency to capacity and the other way. [ mingo: Minor edits. ] Fixes: 1eb5dde674f5 ("cpufreq: CPPC: Add support for frequency invariance") Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20231211104855.558096-2-vincent.guittot@linaro.org Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com> Signed-off-by: Hongye Lin <linhongye@h-partners.com> --- arch/arm/include/asm/topology.h | 1 + arch/arm64/include/asm/topology.h | 1 + arch/riscv/include/asm/topology.h | 1 + drivers/base/arch_topology.c | 29 ++++++++++++++--------------- include/linux/arch_topology.h | 7 +++++++ include/linux/sched/topology.h | 8 ++++++++ 6 files changed, 32 insertions(+), 15 deletions(-) diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h index c7d2510e5a78..853c4f81ba4a 100644 --- a/arch/arm/include/asm/topology.h +++ b/arch/arm/include/asm/topology.h @@ -13,6 +13,7 @@ #define arch_set_freq_scale topology_set_freq_scale #define arch_scale_freq_capacity topology_get_freq_scale #define arch_scale_freq_invariant topology_scale_freq_invariant +#define arch_scale_freq_ref topology_get_freq_ref #endif /* Replace task scheduler's default cpu-invariant accounting */ diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h index 9fab663dd2de..a323b109b9c4 100644 --- a/arch/arm64/include/asm/topology.h +++ b/arch/arm64/include/asm/topology.h @@ -23,6 +23,7 @@ void update_freq_counters_refs(void); #define arch_set_freq_scale topology_set_freq_scale #define arch_scale_freq_capacity topology_get_freq_scale #define arch_scale_freq_invariant topology_scale_freq_invariant +#define arch_scale_freq_ref topology_get_freq_ref #ifdef CONFIG_ACPI_CPPC_LIB #define arch_init_invariance_cppc topology_init_cpu_capacity_cppc diff --git a/arch/riscv/include/asm/topology.h b/arch/riscv/include/asm/topology.h index e316ab3b77f3..61183688bdd5 100644 --- a/arch/riscv/include/asm/topology.h +++ b/arch/riscv/include/asm/topology.h @@ -9,6 +9,7 @@ #define arch_set_freq_scale topology_set_freq_scale #define arch_scale_freq_capacity topology_get_freq_scale #define arch_scale_freq_invariant topology_scale_freq_invariant +#define arch_scale_freq_ref topology_get_freq_ref /* Replace task scheduler's default cpu-invariant accounting */ #define arch_scale_cpu_capacity topology_get_cpu_scale diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c index b741b5ba82bd..0c9ae5b157b1 100644 --- a/drivers/base/arch_topology.c +++ b/drivers/base/arch_topology.c @@ -19,6 +19,7 @@ #include <linux/init.h> #include <linux/rcupdate.h> #include <linux/sched.h> +#include <linux/units.h> #define CREATE_TRACE_POINTS #include <trace/events/thermal_pressure.h> @@ -26,7 +27,8 @@ static DEFINE_PER_CPU(struct scale_freq_data __rcu *, sft_data); static struct cpumask scale_freq_counters_mask; static bool scale_freq_invariant; -static DEFINE_PER_CPU(u32, freq_factor) = 1; +DEFINE_PER_CPU(unsigned long, capacity_freq_ref) = 1; +EXPORT_PER_CPU_SYMBOL_GPL(capacity_freq_ref); static bool supports_scale_freq_counters(const struct cpumask *cpus) { @@ -170,9 +172,9 @@ DEFINE_PER_CPU(unsigned long, thermal_pressure); * operating on stale data when hot-plug is used for some CPUs. The * @capped_freq reflects the currently allowed max CPUs frequency due to * thermal capping. It might be also a boost frequency value, which is bigger - * than the internal 'freq_factor' max frequency. In such case the pressure - * value should simply be removed, since this is an indication that there is - * no thermal throttling. The @capped_freq must be provided in kHz. + * than the internal 'capacity_freq_ref' max frequency. In such case the + * pressure value should simply be removed, since this is an indication that + * there is no thermal throttling. The @capped_freq must be provided in kHz. */ void topology_update_thermal_pressure(const struct cpumask *cpus, unsigned long capped_freq) @@ -183,10 +185,7 @@ void topology_update_thermal_pressure(const struct cpumask *cpus, cpu = cpumask_first(cpus); max_capacity = arch_scale_cpu_capacity(cpu); - max_freq = per_cpu(freq_factor, cpu); - - /* Convert to MHz scale which is used in 'freq_factor' */ - capped_freq /= 1000; + max_freq = arch_scale_freq_ref(cpu); /* * Handle properly the boost frequencies, which should simply clean @@ -279,13 +278,13 @@ void topology_normalize_cpu_scale(void) capacity_scale = 1; for_each_possible_cpu(cpu) { - capacity = raw_capacity[cpu] * per_cpu(freq_factor, cpu); + capacity = raw_capacity[cpu] * per_cpu(capacity_freq_ref, cpu); capacity_scale = max(capacity, capacity_scale); } pr_debug("cpu_capacity: capacity_scale=%llu\n", capacity_scale); for_each_possible_cpu(cpu) { - capacity = raw_capacity[cpu] * per_cpu(freq_factor, cpu); + capacity = raw_capacity[cpu] * per_cpu(capacity_freq_ref, cpu); capacity = div64_u64(capacity << SCHED_CAPACITY_SHIFT, capacity_scale); topology_set_cpu_scale(cpu, capacity); @@ -321,15 +320,15 @@ bool __init topology_parse_cpu_capacity(struct device_node *cpu_node, int cpu) cpu_node, raw_capacity[cpu]); /* - * Update freq_factor for calculating early boot cpu capacities. + * Update capacity_freq_ref for calculating early boot CPU capacities. * For non-clk CPU DVFS mechanism, there's no way to get the * frequency value now, assuming they are running at the same - * frequency (by keeping the initial freq_factor value). + * frequency (by keeping the initial capacity_freq_ref value). */ cpu_clk = of_clk_get(cpu_node, 0); if (!PTR_ERR_OR_ZERO(cpu_clk)) { - per_cpu(freq_factor, cpu) = - clk_get_rate(cpu_clk) / 1000; + per_cpu(capacity_freq_ref, cpu) = + clk_get_rate(cpu_clk) / HZ_PER_KHZ; clk_put(cpu_clk); } } else { @@ -411,7 +410,7 @@ init_cpu_capacity_callback(struct notifier_block *nb, cpumask_andnot(cpus_to_visit, cpus_to_visit, policy->related_cpus); for_each_cpu(cpu, policy->related_cpus) - per_cpu(freq_factor, cpu) = policy->cpuinfo.max_freq / 1000; + per_cpu(capacity_freq_ref, cpu) = policy->cpuinfo.max_freq; if (cpumask_empty(cpus_to_visit)) { topology_normalize_cpu_scale(); diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h index a07b510e7dc5..32c24ff4f2a8 100644 --- a/include/linux/arch_topology.h +++ b/include/linux/arch_topology.h @@ -27,6 +27,13 @@ static inline unsigned long topology_get_cpu_scale(int cpu) void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity); +DECLARE_PER_CPU(unsigned long, capacity_freq_ref); + +static inline unsigned long topology_get_freq_ref(int cpu) +{ + return per_cpu(capacity_freq_ref, cpu); +} + DECLARE_PER_CPU(unsigned long, arch_freq_scale); static inline unsigned long topology_get_freq_scale(int cpu) diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h index de545ba85218..a6e04b4a21d7 100644 --- a/include/linux/sched/topology.h +++ b/include/linux/sched/topology.h @@ -279,6 +279,14 @@ void arch_update_thermal_pressure(const struct cpumask *cpus, { } #endif +#ifndef arch_scale_freq_ref +static __always_inline +unsigned int arch_scale_freq_ref(int cpu) +{ + return 0; +} +#endif + static inline int task_node(const struct task_struct *p) { return cpu_to_node(task_cpu(p)); -- 2.33.0

From: Vincent Guittot <vincent.guittot@linaro.org> mainline inclusion from mainline-v6.8-rc6 commit 599457ba15403037b489fe536266a3d5f9efaed7 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ICEXYT CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ---------------------------------------------------------------------- cpuinfo.max_freq can change at runtime because of boost as an example. This implies that the value could be different from the frequency that has been used to compute the capacity of a CPU. The new arch_scale_freq_ref() returns a fixed and coherent frequency that can be used to compute the capacity for a given frequency. [ Also fix a arch_set_freq_scale() newline style wart in <linux/cpufreq.h>. ] Fixes: 1eb5dde674f5 ("cpufreq: CPPC: Add support for frequency invariance") Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20231211104855.558096-3-vincent.guittot@linaro.org Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com> Signed-off-by: Hongye Lin <linhongye@h-partners.com> --- drivers/cpufreq/cpufreq.c | 4 ++-- include/linux/cpufreq.h | 1 + 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 84943895b6dd..6e4cc4dbb840 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -454,7 +454,7 @@ void cpufreq_freq_transition_end(struct cpufreq_policy *policy, arch_set_freq_scale(policy->related_cpus, policy->cur, - policy->cpuinfo.max_freq); + arch_scale_freq_ref(policy->cpu)); spin_lock(&policy->transition_lock); policy->transition_ongoing = false; @@ -2199,7 +2199,7 @@ unsigned int cpufreq_driver_fast_switch(struct cpufreq_policy *policy, policy->cur = freq; arch_set_freq_scale(policy->related_cpus, freq, - policy->cpuinfo.max_freq); + arch_scale_freq_ref(policy->cpu)); cpufreq_stats_record_transition(policy, freq); if (trace_cpu_frequency_enabled()) { diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h index 9e1fb9a5480b..9be837765a99 100644 --- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -1231,6 +1231,7 @@ void arch_set_freq_scale(const struct cpumask *cpus, { } #endif + /* the following are really really optional */ extern struct freq_attr cpufreq_freq_attr_scaling_available_freqs; extern struct freq_attr cpufreq_freq_attr_scaling_boost_freqs; -- 2.33.0

From: Vincent Guittot <vincent.guittot@linaro.org> mainline inclusion from mainline-v6.8-rc6 commit b3edde44e5d4504c23a176819865cd603fd16d6c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ICEXYT CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ---------------------------------------------------------------------- cpuinfo.max_freq can change at runtime because of boost as an example. This implies that the value could be different than the one that has been used when computing the capacity of a CPU. The new arch_scale_freq_ref() returns a fixed and coherent reference frequency that can be used when computing a frequency based on utilization. Use this arch_scale_freq_ref() when available and fallback to policy otherwise. Fixes: 1eb5dde674f5 ("cpufreq: CPPC: Add support for frequency invariance") Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20231211104855.558096-4-vincent.guittot@linaro.org Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com> Signed-off-by: Hongye Lin <linhongye@h-partners.com> --- kernel/sched/cpufreq_schedutil.c | 26 ++++++++++++++++++++++++-- 1 file changed, 24 insertions(+), 2 deletions(-) diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c index 458d359f5991..7f6c9874a200 100644 --- a/kernel/sched/cpufreq_schedutil.c +++ b/kernel/sched/cpufreq_schedutil.c @@ -114,6 +114,28 @@ static void sugov_deferred_update(struct sugov_policy *sg_policy) } } +/** + * get_capacity_ref_freq - get the reference frequency that has been used to + * correlate frequency and compute capacity for a given cpufreq policy. We use + * the CPU managing it for the arch_scale_freq_ref() call in the function. + * @policy: the cpufreq policy of the CPU in question. + * + * Return: the reference CPU frequency to compute a capacity. + */ +static __always_inline +unsigned long get_capacity_ref_freq(struct cpufreq_policy *policy) +{ + unsigned int freq = arch_scale_freq_ref(policy->cpu); + + if (freq) + return freq; + + if (arch_scale_freq_invariant()) + return policy->cpuinfo.max_freq; + + return policy->cur; +} + /** * get_next_freq - Compute a new frequency for a given cpufreq policy. * @sg_policy: schedutil policy object to compute the new frequency for. @@ -140,9 +162,9 @@ static unsigned int get_next_freq(struct sugov_policy *sg_policy, unsigned long util, unsigned long max) { struct cpufreq_policy *policy = sg_policy->policy; - unsigned int freq = arch_scale_freq_invariant() ? - policy->cpuinfo.max_freq : policy->cur; + unsigned int freq; + freq = get_capacity_ref_freq(policy); util = map_util_perf(util); freq = map_util_freq(util, freq, max); -- 2.33.0

From: Vincent Guittot <vincent.guittot@linaro.org> mainline inclusion from mainline-v6.8-rc6 commit 15cbbd1d317e07b4e5c6aca5d4c5579539a82784 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ICEXYT CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ---------------------------------------------------------------------- The last item of a performance domain is not always the performance point that has been used to compute CPU's capacity. This can lead to different target frequency compared with other part of the system like schedutil and would result in wrong energy estimation. A new arch_scale_freq_ref() is available to return a fixed and coherent frequency reference that can be used when computing the CPU's frequency for an level of utilization. Use this function to get this reference frequency. Energy model is never used without defining arch_scale_freq_ref() but can be compiled. Define a default arch_scale_freq_ref() returning 0 in such case. Fixes: 1eb5dde674f5 ("cpufreq: CPPC: Add support for frequency invariance") Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/20231211104855.558096-5-vincent.guittot@linaro.org Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com> Signed-off-by: Hongye Lin <linhongye@h-partners.com> --- include/linux/energy_model.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h index 0091bf129f77..784efdba6703 100644 --- a/include/linux/energy_model.h +++ b/include/linux/energy_model.h @@ -229,7 +229,7 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd, unsigned long max_util, unsigned long sum_util, unsigned long allowed_cpu_cap) { - unsigned long freq, scale_cpu; + unsigned long freq, ref_freq, scale_cpu; struct em_perf_state *ps; int cpu; @@ -246,11 +246,11 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd, */ cpu = cpumask_first(to_cpumask(pd->cpus)); scale_cpu = arch_scale_cpu_capacity(cpu); - ps = &pd->table[pd->nr_perf_states - 1]; + ref_freq = arch_scale_freq_ref(cpu); max_util = map_util_perf(max_util); max_util = min(max_util, allowed_cpu_cap); - freq = map_util_freq(max_util, ps->frequency, scale_cpu); + freq = map_util_freq(max_util, ref_freq, scale_cpu); /* * Find the lowest performance state of the Energy Model above the -- 2.33.0

From: Vincent Guittot <vincent.guittot@linaro.org> mainline inclusion from mainline-v6.8-rc6 commit 5477fa249b56c59c3baa1b237bf083cffa64c84a category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ICEXYT CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ---------------------------------------------------------------------- Save the frequency associated to the performance that has been used when initializing the capacity of CPUs. Also, cppc cpufreq driver can register an artificial energy model. In such case, it needs the frequency for this compute capacity. Fixes: 1eb5dde674f5 ("cpufreq: CPPC: Add support for frequency invariance") Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Pierre Gondois <pierre.gondois@arm.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20231211104855.558096-7-vincent.guittot@linaro.org Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com> Signed-off-by: Hongye Lin <linhongye@h-partners.com> --- drivers/base/arch_topology.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c index 0c9ae5b157b1..1aa76b5c96c2 100644 --- a/drivers/base/arch_topology.c +++ b/drivers/base/arch_topology.c @@ -349,6 +349,7 @@ bool __init topology_parse_cpu_capacity(struct device_node *cpu_node, int cpu) void topology_init_cpu_capacity_cppc(void) { + u64 capacity, capacity_scale = 0; struct cppc_perf_caps perf_caps; int cpu; @@ -365,6 +366,10 @@ void topology_init_cpu_capacity_cppc(void) (perf_caps.highest_perf >= perf_caps.nominal_perf) && (perf_caps.highest_perf >= perf_caps.lowest_perf)) { raw_capacity[cpu] = perf_caps.highest_perf; + capacity_scale = max_t(u64, capacity_scale, raw_capacity[cpu]); + + per_cpu(capacity_freq_ref, cpu) = cppc_perf_to_khz(&perf_caps, raw_capacity[cpu]); + pr_debug("cpu_capacity: CPU%d cpu_capacity=%u (raw).\n", cpu, raw_capacity[cpu]); continue; @@ -375,7 +380,15 @@ void topology_init_cpu_capacity_cppc(void) goto exit; } - topology_normalize_cpu_scale(); + for_each_possible_cpu(cpu) { + capacity = raw_capacity[cpu]; + capacity = div64_u64(capacity << SCHED_CAPACITY_SHIFT, + capacity_scale); + topology_set_cpu_scale(cpu, capacity); + pr_debug("cpu_capacity: CPU%d cpu_capacity=%lu\n", + cpu, topology_get_cpu_scale(cpu)); + } + schedule_work(&update_topology_flags_work); pr_debug("cpu_capacity: cpu_capacity initialization done\n"); -- 2.33.0

From: Vincent Guittot <vincent.guittot@linaro.org> mainline inclusion from mainline-v6.8-rc6 commit 1f023007f5e782bda19ad9104830c404fd622c5d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ICEXYT CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... ---------------------------------------------------------------------- Use the new capacity_ref_freq() method to set the ratio that is used by AMU for computing the arch_scale_freq_capacity(). This helps to keep everything aligned using the same reference for computing CPUs capacity. The default value of the ratio (stored in per_cpu(arch_max_freq_scale)) ensures that arch_scale_freq_capacity() returns max capacity until it is set to its correct value with the cpu capacity and capacity_ref_freq(). Fixes: 1eb5dde674f5 ("cpufreq: CPPC: Add support for frequency invariance") Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20231211104855.558096-8-vincent.guittot@linaro.org Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com> Signed-off-by: Hongye Lin <linhongye@h-partners.com> --- arch/arm64/kernel/topology.c | 26 +++++++++++++------------- drivers/base/arch_topology.c | 12 +++++++++++- include/linux/arch_topology.h | 1 + 3 files changed, 25 insertions(+), 14 deletions(-) diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c index 817d788cd866..1a2c72f3e7f8 100644 --- a/arch/arm64/kernel/topology.c +++ b/arch/arm64/kernel/topology.c @@ -82,7 +82,12 @@ int __init parse_acpi_topology(void) #undef pr_fmt #define pr_fmt(fmt) "AMU: " fmt -static DEFINE_PER_CPU_READ_MOSTLY(unsigned long, arch_max_freq_scale); +/* + * Ensure that amu_scale_freq_tick() will return SCHED_CAPACITY_SCALE until + * the CPU capacity and its associated frequency have been correctly + * initialized. + */ +static DEFINE_PER_CPU_READ_MOSTLY(unsigned long, arch_max_freq_scale) = 1UL << (2 * SCHED_CAPACITY_SHIFT); static DEFINE_PER_CPU(u64, arch_const_cycles_prev); static DEFINE_PER_CPU(u64, arch_core_cycles_prev); static cpumask_var_t amu_fie_cpus; @@ -112,14 +117,14 @@ static inline bool freq_counters_valid(int cpu) return true; } -static int freq_inv_set_max_ratio(int cpu, u64 max_rate, u64 ref_rate) +void freq_inv_set_max_ratio(int cpu, u64 max_rate) { - u64 ratio; + u64 ratio, ref_rate = arch_timer_get_rate(); if (unlikely(!max_rate || !ref_rate)) { - pr_debug("CPU%d: invalid maximum or reference frequency.\n", + WARN_ONCE(1, "CPU%d: invalid maximum or reference frequency.\n", cpu); - return -EINVAL; + return; } /* @@ -139,12 +144,10 @@ static int freq_inv_set_max_ratio(int cpu, u64 max_rate, u64 ref_rate) ratio = div64_u64(ratio, max_rate); if (!ratio) { WARN_ONCE(1, "Reference frequency too low.\n"); - return -EINVAL; + return; } - per_cpu(arch_max_freq_scale, cpu) = (unsigned long)ratio; - - return 0; + WRITE_ONCE(per_cpu(arch_max_freq_scale, cpu), (unsigned long)ratio); } static void amu_scale_freq_tick(void) @@ -195,10 +198,7 @@ static void amu_fie_setup(const struct cpumask *cpus) return; for_each_cpu(cpu, cpus) { - if (!freq_counters_valid(cpu) || - freq_inv_set_max_ratio(cpu, - cpufreq_get_hw_max_freq(cpu) * 1000ULL, - arch_timer_get_rate())) + if (!freq_counters_valid(cpu)) return; } diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c index 1aa76b5c96c2..5aaa0865625d 100644 --- a/drivers/base/arch_topology.c +++ b/drivers/base/arch_topology.c @@ -344,6 +344,10 @@ bool __init topology_parse_cpu_capacity(struct device_node *cpu_node, int cpu) return !ret; } +void __weak freq_inv_set_max_ratio(int cpu, u64 max_rate) +{ +} + #ifdef CONFIG_ACPI_CPPC_LIB #include <acpi/cppc_acpi.h> @@ -381,6 +385,9 @@ void topology_init_cpu_capacity_cppc(void) } for_each_possible_cpu(cpu) { + freq_inv_set_max_ratio(cpu, + per_cpu(capacity_freq_ref, cpu) * HZ_PER_KHZ); + capacity = raw_capacity[cpu]; capacity = div64_u64(capacity << SCHED_CAPACITY_SHIFT, capacity_scale); @@ -422,8 +429,11 @@ init_cpu_capacity_callback(struct notifier_block *nb, cpumask_andnot(cpus_to_visit, cpus_to_visit, policy->related_cpus); - for_each_cpu(cpu, policy->related_cpus) + for_each_cpu(cpu, policy->related_cpus) { per_cpu(capacity_freq_ref, cpu) = policy->cpuinfo.max_freq; + freq_inv_set_max_ratio(cpu, + per_cpu(capacity_freq_ref, cpu) * HZ_PER_KHZ); + } if (cpumask_empty(cpus_to_visit)) { topology_normalize_cpu_scale(); diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h index 32c24ff4f2a8..a63d61ca55af 100644 --- a/include/linux/arch_topology.h +++ b/include/linux/arch_topology.h @@ -99,6 +99,7 @@ void update_siblings_masks(unsigned int cpu); void remove_cpu_topology(unsigned int cpuid); void reset_cpu_topology(void); int parse_acpi_topology(void); +void freq_inv_set_max_ratio(int cpu, u64 max_rate); #endif #endif /* _LINUX_ARCH_TOPOLOGY_H_ */ -- 2.33.0

From: Jie Zhan <zhanjie9@hisilicon.com> driver inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ICEXYT ---------------------------------------------------------------------- Reading perf counters on offline cpus should be expected to fail, e.g. it returns -EFAULT as counters are shown to be 0. Remove the unnecessary warning print on this failure path. Fixes: 1eb5dde674f5 ("cpufreq: CPPC: Add support for frequency invariance") Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com> Signed-off-by: Hongye Lin <linhongye@h-partners.com> --- drivers/cpufreq/cppc_cpufreq.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c index 1f6022e4020e..fc7153423ac5 100644 --- a/drivers/cpufreq/cppc_cpufreq.c +++ b/drivers/cpufreq/cppc_cpufreq.c @@ -176,16 +176,14 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy) init_irq_work(&cppc_fi->irq_work, cppc_irq_work); ret = cppc_get_perf_ctrs(cpu, &cppc_fi->prev_perf_fb_ctrs); - if (ret) { - pr_warn("%s: failed to read perf counters for cpu:%d: %d\n", - __func__, cpu, ret); - + if (ret && cpu_online(cpu)) { /* * Don't abort if the CPU was offline while the driver * was getting registered. */ - if (cpu_online(cpu)) - return; + pr_debug("%s: failed to read perf counters for cpu:%d: %d\n", + __func__, cpu, ret); + return; } } -- 2.33.0

From: Jie Zhan <zhanjie9@hisilicon.com> driver inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ICEXYT ---------------------------------------------------------------------- Perf counters could be 0 if the cpu is in a low-power idle state. Just try it again next time and update the frequency scale when the cpu is active and perf counters successfully return. Also, remove the FIE source on an actual failure. Fixes: 1eb5dde674f5 ("cpufreq: CPPC: Add support for frequency invariance") Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com> Signed-off-by: Hongye Lin <linhongye@h-partners.com> --- drivers/cpufreq/cppc_cpufreq.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c index fc7153423ac5..34d29aeae880 100644 --- a/drivers/cpufreq/cppc_cpufreq.c +++ b/drivers/cpufreq/cppc_cpufreq.c @@ -113,12 +113,23 @@ static void cppc_scale_freq_workfn(struct kthread_work *work) struct cppc_cpudata *cpu_data; unsigned long local_freq_scale; u64 perf; + int ret; cppc_fi = container_of(work, struct cppc_freq_invariance, work); cpu_data = cppc_fi->cpu_data; - if (cppc_get_perf_ctrs(cppc_fi->cpu, &fb_ctrs)) { + ret = cppc_get_perf_ctrs(cppc_fi->cpu, &fb_ctrs); + /* + * Perf counters could be 0 if the cpu is in a low-power idle state. + * Just try it again next time. + */ + if (ret == -EFAULT) + return; + + if (ret) { pr_warn("%s: failed to read perf counters\n", __func__); + topology_clear_scale_freq_source(SCALE_FREQ_SOURCE_CPPC, + cpu_data->shared_cpu_map); return; } -- 2.33.0

driver inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ICEXYT ---------------------------------------------------------------------- Amu fie was set up by a cpufreq policy notifier after the policy was created. This caused some problems: 1. The cpus related to the same policy would all fail to set up amu fie if one of them couldn't pass the freq_counters_valid() check. 2. The cpus fail to set up amu fie would never have a chance to set up again. When boot with maxcpu=1 restrict, the support amu flags of the offline cpus would never be setup. After that, when cpufreq policy was being created, the online cpu might set up amu fie fail because the other cpus related to the same policy couldn't pass the freq_counters_valid() check. Hotplug the offline cpus, since the policy was already created, amu_fie_setup() would never be called again. All cpus couldn't setup amu fie in this situation. After commit 1f023007f5e7 ("arm64/amu: Use capacity_ref_freq() to set AMU ratio"), the max_freq stores in policy data is never needed when setting up amu fie. This indicates that the setting up of amu fie does not depend on the policy any more. So each cpu can set up amu fie separately during hotplug and the problems above will be solved. Fixes: 1eb5dde674f5 ("cpufreq: CPPC: Add support for frequency invariance") Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com> Signed-off-by: Hongye Lin <linhongye@h-partners.com> --- arch/arm64/kernel/topology.c | 53 +++++++++++------------------------- 1 file changed, 16 insertions(+), 37 deletions(-) diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c index 71555edba910..d9a6370c05c5 100644 --- a/arch/arm64/kernel/topology.c +++ b/arch/arm64/kernel/topology.c @@ -14,7 +14,6 @@ #include <linux/acpi.h> #include <linux/arch_topology.h> #include <linux/cacheinfo.h> -#include <linux/cpufreq.h> #include <linux/cpu_smt.h> #include <linux/init.h> #include <linux/percpu.h> @@ -243,52 +242,28 @@ static struct scale_freq_data amu_sfd = { .set_freq_scale = amu_scale_freq_tick, }; -static void amu_fie_setup(const struct cpumask *cpus) +static void amu_fie_setup(unsigned int cpu) { - int cpu; - - /* We are already set since the last insmod of cpufreq driver */ - if (unlikely(cpumask_subset(cpus, amu_fie_cpus))) + if (cpumask_test_cpu(cpu, amu_fie_cpus)) return; - for_each_cpu(cpu, cpus) { - if (!freq_counters_valid(cpu)) - return; - } + if (!freq_counters_valid(cpu)) + return; - cpumask_or(amu_fie_cpus, amu_fie_cpus, cpus); + cpumask_set_cpu(cpu, amu_fie_cpus); topology_set_scale_freq_source(&amu_sfd, amu_fie_cpus); - pr_debug("CPUs[%*pbl]: counters will be used for FIE.", - cpumask_pr_args(cpus)); + pr_debug("CPUs[%u]: counters will be used for FIE.", cpu); } -static int init_amu_fie_callback(struct notifier_block *nb, unsigned long val, - void *data) +static int cpuhp_topology_online(unsigned int cpu) { - struct cpufreq_policy *policy = data; - - if (val == CPUFREQ_CREATE_POLICY) - amu_fie_setup(policy->related_cpus); - - /* - * We don't need to handle CPUFREQ_REMOVE_POLICY event as the AMU - * counters don't have any dependency on cpufreq driver once we have - * initialized AMU support and enabled invariance. The AMU counters will - * keep on working just fine in the absence of the cpufreq driver, and - * for the CPUs for which there are no counters available, the last set - * value of arch_freq_scale will remain valid as that is the frequency - * those CPUs are running at. - */ + amu_fie_setup(cpu); return 0; } -static struct notifier_block init_amu_fie_notifier = { - .notifier_call = init_amu_fie_callback, -}; - static int __init init_amu_fie(void) { int ret; @@ -296,12 +271,16 @@ static int __init init_amu_fie(void) if (!zalloc_cpumask_var(&amu_fie_cpus, GFP_KERNEL)) return -ENOMEM; - ret = cpufreq_register_notifier(&init_amu_fie_notifier, - CPUFREQ_POLICY_NOTIFIER); - if (ret) + ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, + "arm64/topology:online", + cpuhp_topology_online, + NULL); + if (ret < 0) { free_cpumask_var(amu_fie_cpus); + return ret; + } - return ret; + return 0; } core_initcall(init_amu_fie); -- 2.33.0

反馈: 您发送到kernel@openeuler.org的补丁/补丁集,已成功转换为PR! PR链接地址: https://gitee.com/openeuler/kernel/pulls/16700 邮件列表地址:https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/V3P... FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://gitee.com/openeuler/kernel/pulls/16700 Mailing list address: https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/V3P...
participants (2)
-
Lifeng Zheng
-
patchwork bot