[PATCH OLK-6.6 01/12] cpufreq: Support per-policy performance boost

From: Jie Zhan <zhanjie9@hisilicon.com> The boost control currently applies to the whole system. However, users may prefer to boost a subset of cores in order to provide prioritized performance to workloads running on the boosted cores. Enable per-policy boost by adding a 'boost' sysfs interface under each policy path. This can be found at: /sys/devices/system/cpu/cpufreq/policy<*>/boost Same to the global boost switch, writing 1/0 to the per-policy 'boost' enables/disables boost on a cpufreq policy respectively. The user view of global and per-policy boost controls should be: 1. Enabling global boost initially enables boost on all policies, and per-policy boost can then be enabled or disabled individually, given that the platform does support so. 2. Disabling global boost makes the per-policy boost interface illegal. Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com> Reviewed-by: Wei Xu <xuwei5@hisilicon.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- drivers/cpufreq/cpufreq.c | 43 +++++++++++++++++++++++++++++++++++++++ include/linux/cpufreq.h | 3 +++ 2 files changed, 46 insertions(+) diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index c2227be7bad8..0d5942f3f0df 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -81,6 +81,7 @@ static void cpufreq_governor_limits(struct cpufreq_policy *policy); static int cpufreq_set_policy(struct cpufreq_policy *policy, struct cpufreq_governor *new_gov, unsigned int new_pol); +static bool cpufreq_boost_supported(void); /* * Two notifier lists: the "policy" list is involved in the @@ -618,6 +619,40 @@ static ssize_t store_boost(struct kobject *kobj, struct kobj_attribute *attr, } define_one_global_rw(boost); +static ssize_t show_local_boost(struct cpufreq_policy *policy, char *buf) +{ + return sysfs_emit(buf, "%d\n", policy->boost_enabled); +} + +static ssize_t store_local_boost(struct cpufreq_policy *policy, + const char *buf, size_t count) +{ + int ret, enable; + + ret = kstrtoint(buf, 10, &enable); + if (ret || enable < 0 || enable > 1) + return -EINVAL; + + if (!cpufreq_driver->boost_enabled) + return -EINVAL; + + if (policy->boost_enabled == enable) + return count; + + cpus_read_lock(); + ret = cpufreq_driver->set_boost(policy, enable); + cpus_read_unlock(); + + if (ret) + return ret; + + policy->boost_enabled = enable; + + return count; +} + +static struct freq_attr local_boost = __ATTR(boost, 0644, show_local_boost, store_local_boost); + static struct cpufreq_governor *find_governor(const char *str_governor) { struct cpufreq_governor *t; @@ -1057,6 +1092,12 @@ static int cpufreq_add_dev_interface(struct cpufreq_policy *policy) return ret; } + if (cpufreq_boost_supported()) { + ret = sysfs_create_file(&policy->kobj, &local_boost.attr); + if (ret) + return ret; + } + return 0; } @@ -2687,6 +2728,8 @@ int cpufreq_boost_trigger_state(int state) ret = cpufreq_driver->set_boost(policy, state); if (ret) goto err_reset_state; + + policy->boost_enabled = state; } cpus_read_unlock(); diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h index 3e2818678cc3..896ca659e66a 100644 --- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -134,6 +134,9 @@ struct cpufreq_policy { */ bool dvfs_possible_from_any_cpu; + /* Per policy boost enabled flag. */ + bool boost_enabled; + /* Cached frequency lookup from cpufreq_driver_resolve_freq. */ unsigned int cached_target_freq; unsigned int cached_resolved_idx; -- 2.33.0

From: Sibi Sankar <quic_sibis@quicinc.com> stable inclusion from stable-v6.6.23 commit 9d47d2e7f82dec7cab09996bdb9062786eb827f6 bugzilla: https://gitee.com/openeuler/kernel/issues/I9MPZ8 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- [ Upstream commit f37a4d6b4a2c77414e8b9d25dd5ee31537ce9b00 ] In the existing code, per-policy flags don't have any impact i.e. if cpufreq_driver boost is enabled and boost is disabled for one or more of the policies, the cpufreq driver will behave as if boost is enabled. Fix this by incorporating per-policy boost flag in the policy->max computation used in cpufreq_frequency_table_cpuinfo and setting the default per-policy boost to mirror the cpufreq_driver boost flag. Fixes: 218a06a79d9a ("cpufreq: Support per-policy performance boost") Reported-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Dhruva Gole <d-gole@ti.com> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com> Tested-by:Yipeng Zou <zouyipeng@huawei.com> <mailto:zouyipeng@huawei.com> Reviewed-by: Yipeng Zou <zouyipeng@huawei.com> <mailto:zouyipeng@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: ZhangPeng <zhangpeng362@huawei.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- drivers/cpufreq/cpufreq.c | 18 ++++++++++++------ drivers/cpufreq/freq_table.c | 2 +- 2 files changed, 13 insertions(+), 7 deletions(-) diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 0d5942f3f0df..6fdee87494c8 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -639,14 +639,16 @@ static ssize_t store_local_boost(struct cpufreq_policy *policy, if (policy->boost_enabled == enable) return count; + policy->boost_enabled = enable; + cpus_read_lock(); ret = cpufreq_driver->set_boost(policy, enable); cpus_read_unlock(); - if (ret) + if (ret) { + policy->boost_enabled = !policy->boost_enabled; return ret; - - policy->boost_enabled = enable; + } return count; } @@ -1411,6 +1413,9 @@ static int cpufreq_online(unsigned int cpu) goto out_free_policy; } + /* Let the per-policy boost flag mirror the cpufreq_driver boost during init */ + policy->boost_enabled = cpufreq_boost_enabled() && policy_has_boost_freq(policy); + /* * The initialization has succeeded and the policy is online. * If there is a problem with its frequency table, take it @@ -2725,11 +2730,12 @@ int cpufreq_boost_trigger_state(int state) cpus_read_lock(); for_each_active_policy(policy) { + policy->boost_enabled = state; ret = cpufreq_driver->set_boost(policy, state); - if (ret) + if (ret) { + policy->boost_enabled = !policy->boost_enabled; goto err_reset_state; - - policy->boost_enabled = state; + } } cpus_read_unlock(); diff --git a/drivers/cpufreq/freq_table.c b/drivers/cpufreq/freq_table.c index 67e56cf638ef..924b75bcb061 100644 --- a/drivers/cpufreq/freq_table.c +++ b/drivers/cpufreq/freq_table.c @@ -40,7 +40,7 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy, cpufreq_for_each_valid_entry(pos, table) { freq = pos->frequency; - if (!cpufreq_boost_enabled() + if ((!cpufreq_boost_enabled() || !policy->boost_enabled) && (pos->flags & CPUFREQ_BOOST_FREQ)) continue; -- 2.33.0

From: Mario Limonciello <mario.limonciello@amd.com> stable inclusion from stable-v6.6.41 commit e408184365c70f8a273f1a281e7a33416e742687 bugzilla: https://gitee.com/openeuler/kernel/issues/IAHMJO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- commit 102fa9c4b439ca3bd93d13fb53f5b7592d96a109 upstream. The behavior introduced in commit f37a4d6b4a2c ("cpufreq: Fix per-policy boost behavior on SoCs using cpufreq_boost_set_sw()") sets up the boost policy incorrectly when boost has been enabled by the platform firmware initially even if a driver sets the policy up. This is because policy_has_boost_freq() assumes that there is a frequency table set up by the driver and that the boost frequencies are advertised in that table. This assumption doesn't work for acpi-cpufreq or amd-pstate. Only use this check to enable boost if it's not already enabled instead of also disabling it if alreayd enabled. Fixes: f37a4d6b4a2c ("cpufreq: Fix per-policy boost behavior on SoCs using cpufreq_boost_set_sw()") Link: https://patch.msgid.link/20240626204723.6237-1-mario.limonciello@amd.com Reviewed-by: Sibi Sankar <quic_sibis@quicinc.com> Reviewed-by: Dhruva Gole <d-gole@ti.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Suggested-by: Viresh Kumar <viresh.kumar@linaro.org> Suggested-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: ZhangPeng <zhangpeng362@huawei.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- drivers/cpufreq/cpufreq.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 6fdee87494c8..2c0d534627d3 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1414,7 +1414,8 @@ static int cpufreq_online(unsigned int cpu) } /* Let the per-policy boost flag mirror the cpufreq_driver boost during init */ - policy->boost_enabled = cpufreq_boost_enabled() && policy_has_boost_freq(policy); + if (cpufreq_boost_enabled() && policy_has_boost_freq(policy)) + policy->boost_enabled = true; /* * The initialization has succeeded and the policy is online. -- 2.33.0

From: Lifeng Zheng <zhenglifeng1@huawei.com> mainline inclusion from mainline-v6.7-rc5 commit dd016f379ebc2d43a9405742d1a6066577509bd7 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IBQYEH CVE: NA Reference: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi... --------------------------------------------------------------------- In cpufreq_online() of cpufreq.c, the per-policy boost flag is already set to mirror the cpufreq_driver boost during init but using freq_table to judge if the policy has boost frequency. There are two drawbacks to this approach: 1. It doesn't work for the cpufreq drivers that do not use a frequency table. For now, acpi-cpufreq and amd-pstate have to enable boost in policy initialization. And cppc_cpufreq never set policy to boost when going online no matter what the cpufreq_driver boost flag is. 2. If the CPU goes offline when cpufreq_driver boost is enabled and then goes online when cpufreq_driver boost is disabled, the per-policy boost flag will incorrectly remain true. Running set_boost at the end of the online process is a more generic way for all cpufreq drivers. Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com> Link: https://patch.msgid.link/20250117101457.1530653-3-zhenglifeng1@huawei.com Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Xinghai Cen <cenxinghai@h-partners.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- drivers/cpufreq/cpufreq.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 2c0d534627d3..017ae0a585be 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1413,10 +1413,6 @@ static int cpufreq_online(unsigned int cpu) goto out_free_policy; } - /* Let the per-policy boost flag mirror the cpufreq_driver boost during init */ - if (cpufreq_boost_enabled() && policy_has_boost_freq(policy)) - policy->boost_enabled = true; - /* * The initialization has succeeded and the policy is online. * If there is a problem with its frequency table, take it @@ -1569,6 +1565,18 @@ static int cpufreq_online(unsigned int cpu) if (cpufreq_thermal_control_enabled(cpufreq_driver)) policy->cdev = of_cpufreq_cooling_register(policy); + /* Let the per-policy boost flag mirror the cpufreq_driver boost during init */ + if (policy->boost_enabled != cpufreq_boost_enabled()) { + policy->boost_enabled = cpufreq_boost_enabled(); + ret = cpufreq_driver->set_boost(policy, policy->boost_enabled); + if (ret) { + /* If the set_boost fails, the online operation is not affected */ + pr_info("%s: CPU%d: Cannot %s BOOST\n", __func__, policy->cpu, + policy->boost_enabled ? "enable" : "disable"); + policy->boost_enabled = !policy->boost_enabled; + } + } + pr_debug("initialization complete\n"); return 0; -- 2.33.0

From: Lifeng Zheng <zhenglifeng1@huawei.com> mainline inclusion from mainline-v6.7-rc5 commit 03d8b4e76266e11662c5e544854b737843173e2d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IBQYEH CVE: NA Reference: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi... --------------------------------------------------------------------- In policy initialization, policy->max and policy->cpuinfo.max_freq are always set to the value calculated from caps->nominal_perf. This will cause the frequency stay on base frequency even if the policy is already boosted when a CPU is going online. Fix this by using policy->boost_enabled to determine which value should be set. Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://patch.msgid.link/20250117101457.1530653-4-zhenglifeng1@huawei.com [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Xinghai Cen <cenxinghai@h-partners.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- drivers/cpufreq/cppc_cpufreq.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c index 07d644bb71b9..6f6e7d8f5b32 100644 --- a/drivers/cpufreq/cppc_cpufreq.c +++ b/drivers/cpufreq/cppc_cpufreq.c @@ -544,8 +544,8 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy) */ policy->min = cppc_cpufreq_perf_to_khz(cpu_data, caps->lowest_nonlinear_perf); - policy->max = cppc_cpufreq_perf_to_khz(cpu_data, - caps->nominal_perf); + policy->max = cppc_cpufreq_perf_to_khz(cpu_data, policy->boost_enabled ? + caps->highest_perf : caps->nominal_perf); /* * Set cpuinfo.min_freq to Lowest to make the full range of performance @@ -554,8 +554,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy) */ policy->cpuinfo.min_freq = cppc_cpufreq_perf_to_khz(cpu_data, caps->lowest_perf); - policy->cpuinfo.max_freq = cppc_cpufreq_perf_to_khz(cpu_data, - caps->nominal_perf); + policy->cpuinfo.max_freq = policy->max; policy->transition_delay_us = cppc_cpufreq_get_transition_delay_us(cpu); policy->shared_type = cpu_data->shared_type; -- 2.33.0

From: Stuart Hayes <stuart.w.hayes@gmail.com> When acpi-cpufreq is loaded, boost is enabled on every CPU (by setting an MSR) before the driver is registered with cpufreq. This can be very time consuming, because it is done with a CPU hotplug startup callback, and cpuhp_setup_state() schedules the callback (cpufreq_boost_online()) to run on each CPU one at a time, waiting for each to run before calling the next. If cpufreq_register_driver() fails--if, for example, there are no ACPI P-states present--this is wasted time. Since cpufreq already sets up a CPU hotplug startup callback if and when acpi-cpufreq is registered, set the boost MSRs in acpi_cpufreq_cpu_init(), which is called by the cpufreq cpuhp callback. This allows acpi-cpufreq to exit quickly if it is loaded but not needed. On one system with 192 CPUs, this patch speeds up boot by about 30 seconds. Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- drivers/cpufreq/acpi-cpufreq.c | 31 +++---------------------------- 1 file changed, 3 insertions(+), 28 deletions(-) diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c index 28467d83c745..8b5c84c9dc1e 100644 --- a/drivers/cpufreq/acpi-cpufreq.c +++ b/drivers/cpufreq/acpi-cpufreq.c @@ -530,15 +530,6 @@ static void free_acpi_perf_data(void) free_percpu(acpi_perf_data); } -static int cpufreq_boost_online(unsigned int cpu) -{ - /* - * On the CPU_UP path we simply keep the boost-disable flag - * in sync with the current global state. - */ - return boost_set_msr(acpi_cpufreq_driver.boost_enabled); -} - static int cpufreq_boost_down_prep(unsigned int cpu) { /* @@ -892,6 +883,8 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy) if (perf->states[0].core_frequency * 1000 != freq_table[0].frequency) pr_warn(FW_WARN "P-state 0 is not max freq\n"); + set_boost(policy, acpi_cpufreq_driver.boost_enabled); + return result; err_unreg: @@ -911,6 +904,7 @@ static int acpi_cpufreq_cpu_exit(struct cpufreq_policy *policy) pr_debug("%s\n", __func__); + cpufreq_boost_down_prep(policy->cpu); policy->fast_switch_possible = false; policy->driver_data = NULL; acpi_processor_unregister_performance(data->acpi_perf_cpu); @@ -967,25 +961,9 @@ static void __init acpi_cpufreq_boost_init(void) acpi_cpufreq_driver.set_boost = set_boost; acpi_cpufreq_driver.boost_enabled = boost_state(0); - /* - * This calls the online callback on all online cpu and forces all - * MSRs to the same value. - */ - ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "cpufreq/acpi:online", - cpufreq_boost_online, cpufreq_boost_down_prep); - if (ret < 0) { - pr_err("acpi_cpufreq: failed to register hotplug callbacks\n"); - return; - } acpi_cpufreq_online = ret; } -static void acpi_cpufreq_boost_exit(void) -{ - if (acpi_cpufreq_online > 0) - cpuhp_remove_state_nocalls(acpi_cpufreq_online); -} - static int __init acpi_cpufreq_init(void) { int ret; @@ -1027,7 +1005,6 @@ static int __init acpi_cpufreq_init(void) ret = cpufreq_register_driver(&acpi_cpufreq_driver); if (ret) { free_acpi_perf_data(); - acpi_cpufreq_boost_exit(); } return ret; } @@ -1036,8 +1013,6 @@ static void __exit acpi_cpufreq_exit(void) { pr_debug("%s\n", __func__); - acpi_cpufreq_boost_exit(); - cpufreq_unregister_driver(&acpi_cpufreq_driver); free_acpi_perf_data(); -- 2.33.0

From: Stuart Hayes <stuart.w.hayes@gmail.com> Stop trying to set boost MSRs on CPUs that don't support boost. This corrects a bug in the recent patch "Defer setting boost MSRs". Fixes: 13fdbc8b8da6 ("cpufreq: ACPI: Defer setting boost MSRs") Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com> Reported-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- drivers/cpufreq/acpi-cpufreq.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c index 8b5c84c9dc1e..0def3221a2e5 100644 --- a/drivers/cpufreq/acpi-cpufreq.c +++ b/drivers/cpufreq/acpi-cpufreq.c @@ -883,7 +883,8 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy) if (perf->states[0].core_frequency * 1000 != freq_table[0].frequency) pr_warn(FW_WARN "P-state 0 is not max freq\n"); - set_boost(policy, acpi_cpufreq_driver.boost_enabled); + if (acpi_cpufreq_driver.set_boost) + set_boost(policy, acpi_cpufreq_driver.boost_enabled); return result; -- 2.33.0

From: Mario Limonciello <mario.limonciello@amd.com> stable inclusion from stable-v6.6.41 commit 2ca2fd474d862c9b144ea4ed100ee4591f52d574 bugzilla: https://gitee.com/openeuler/kernel/issues/IAHMJO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- commit d92467ad9d9ee63a700934b9228a989ef671d511 upstream. When boost is set for CPUs using acpi-cpufreq, the policy is not updated which can cause boost to be incorrectly not reported. Fixes: 218a06a79d9a ("cpufreq: Support per-policy performance boost") Link: https://patch.msgid.link/20240626204723.6237-2-mario.limonciello@amd.com Suggested-by: Viresh Kumar <viresh.kumar@linaro.org> Suggested-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: ZhangPeng <zhangpeng362@huawei.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- drivers/cpufreq/acpi-cpufreq.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c index 0def3221a2e5..17933355947c 100644 --- a/drivers/cpufreq/acpi-cpufreq.c +++ b/drivers/cpufreq/acpi-cpufreq.c @@ -883,8 +883,10 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy) if (perf->states[0].core_frequency * 1000 != freq_table[0].frequency) pr_warn(FW_WARN "P-state 0 is not max freq\n"); - if (acpi_cpufreq_driver.set_boost) + if (acpi_cpufreq_driver.set_boost) { set_boost(policy, acpi_cpufreq_driver.boost_enabled); + policy->boost_enabled = acpi_cpufreq_driver.boost_enabled; + } return result; -- 2.33.0

From: Lifeng Zheng <zhenglifeng1@huawei.com> mainline inclusion from mainline-v6.7-rc5 commit 2b16c631832df6cf8782fb1fdc7df8a4f03f4f16 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IBQYEH CVE: NA Reference: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi... --------------------------------------------------------------------- At the end of cpufreq_online() in cpufreq.c, set_boost is executed and the per-policy boost flag is set to mirror the cpufreq_driver boost, so it is not necessary to run set_boost in acpi_cpufreq_cpu_init(). Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://patch.msgid.link/20250117101457.1530653-5-zhenglifeng1@huawei.com [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Xinghai Cen <cenxinghai@h-partners.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- drivers/cpufreq/acpi-cpufreq.c | 5 ----- 1 file changed, 5 deletions(-) diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c index 17933355947c..dadf8860cc2d 100644 --- a/drivers/cpufreq/acpi-cpufreq.c +++ b/drivers/cpufreq/acpi-cpufreq.c @@ -883,11 +883,6 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy) if (perf->states[0].core_frequency * 1000 != freq_table[0].frequency) pr_warn(FW_WARN "P-state 0 is not max freq\n"); - if (acpi_cpufreq_driver.set_boost) { - set_boost(policy, acpi_cpufreq_driver.boost_enabled); - policy->boost_enabled = acpi_cpufreq_driver.boost_enabled; - } - return result; err_unreg: -- 2.33.0

From: Aboorva Devarajan <aboorvad@linux.ibm.com> mainline inclusion from mainline-v6.7-rc5 commit 0813fd2e14ca6ecd4e6ba005a9766f08e26020d7 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IBQYEH CVE: NA Reference: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi... --------------------------------------------------------------------- Ensure cpufreq_driver->set_boost is non-NULL before using it in cpufreq_online() to prevent a potential NULL pointer dereference. Reported-by: Gautam Menghani <gautam@linux.ibm.com> Closes: https://lore.kernel.org/all/c9e56c5f54cc33338762c94e9bed7b5a0d5de812.camel@l... Fixes: da59223d340c ("cpufreq: Introduce a more generic way to set default per-policy boost flag") Suggested-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com> Link: https://patch.msgid.link/20250205181347.2079272-1-aboorvad@linux.ibm.com [ rjw: Minor edits in the subject and changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Xinghai Cen <cenxinghai@h-partners.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- drivers/cpufreq/cpufreq.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 017ae0a585be..717eedd7868e 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1566,7 +1566,8 @@ static int cpufreq_online(unsigned int cpu) policy->cdev = of_cpufreq_cooling_register(policy); /* Let the per-policy boost flag mirror the cpufreq_driver boost during init */ - if (policy->boost_enabled != cpufreq_boost_enabled()) { + if (cpufreq_driver->set_boost && + policy->boost_enabled != cpufreq_boost_enabled()) { policy->boost_enabled = cpufreq_boost_enabled(); ret = cpufreq_driver->set_boost(policy, policy->boost_enabled); if (ret) { -- 2.33.0

From: Jie Zhan <zhanjie9@hisilicon.com> mainline inclusion from mainline-v6.14-rc3 commit 3698dd6b139dc37b35a9ad83d9330c1f99666c02 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IC29P3 CVE: NA Reference: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi... --------------------------------------------------------------------- We observed an issue that the CPU frequency can't raise up with a 100% CPU load when NOHZ is off and the 'conservative' governor is selected. 'idle_time' can be negative if it's obtained from get_cpu_idle_time_jiffy() when NOHZ is off. This was found and explained in commit 9485e4ca0b48 ("cpufreq: governor: Fix handling of special cases in dbs_update()"). However, commit 7592019634f8 ("cpufreq: governors: Fix long idle detection logic in load calculation") introduced a comparison between 'idle_time' and 'samling_rate' to detect a long idle interval. While 'idle_time' is converted to int before comparison, it's actually promoted to unsigned again when compared with an unsigned 'sampling_rate'. Hence, this leads to wrong idle interval detection when it's in fact 100% busy and sets policy_dbs->idle_periods to a very large value. 'conservative' adjusts the frequency to minimum because of the large 'idle_periods', such that the frequency can't raise up. 'Ondemand' doesn't use policy_dbs->idle_periods so it fortunately avoids the issue. Correct negative 'idle_time' to 0 before any use of it in dbs_update(). Fixes: 7592019634f8 ("cpufreq: governors: Fix long idle detection logic in load calculation") Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com> Reviewed-by: Chen Yu <yu.c.chen@intel.com> Link: https://patch.msgid.link/20250213035510.2402076-1-zhanjie9@hisilicon.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Xinghai Cen <cenxinghai@h-partners.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- drivers/cpufreq/cpufreq_governor.c | 45 +++++++++++++++--------------- 1 file changed, 23 insertions(+), 22 deletions(-) diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c index 55c80319d268..5981e3ef9ce0 100644 --- a/drivers/cpufreq/cpufreq_governor.c +++ b/drivers/cpufreq/cpufreq_governor.c @@ -145,7 +145,23 @@ unsigned int dbs_update(struct cpufreq_policy *policy) time_elapsed = update_time - j_cdbs->prev_update_time; j_cdbs->prev_update_time = update_time; - idle_time = cur_idle_time - j_cdbs->prev_cpu_idle; + /* + * cur_idle_time could be smaller than j_cdbs->prev_cpu_idle if + * it's obtained from get_cpu_idle_time_jiffy() when NOHZ is + * off, where idle_time is calculated by the difference between + * time elapsed in jiffies and "busy time" obtained from CPU + * statistics. If a CPU is 100% busy, the time elapsed and busy + * time should grow with the same amount in two consecutive + * samples, but in practice there could be a tiny difference, + * making the accumulated idle time decrease sometimes. Hence, + * in this case, idle_time should be regarded as 0 in order to + * make the further process correct. + */ + if (cur_idle_time > j_cdbs->prev_cpu_idle) + idle_time = cur_idle_time - j_cdbs->prev_cpu_idle; + else + idle_time = 0; + j_cdbs->prev_cpu_idle = cur_idle_time; if (ignore_nice) { @@ -162,7 +178,7 @@ unsigned int dbs_update(struct cpufreq_policy *policy) * calls, so the previous load value can be used then. */ load = j_cdbs->prev_load; - } else if (unlikely((int)idle_time > 2 * sampling_rate && + } else if (unlikely(idle_time > 2 * sampling_rate && j_cdbs->prev_load)) { /* * If the CPU had gone completely idle and a task has @@ -189,30 +205,15 @@ unsigned int dbs_update(struct cpufreq_policy *policy) load = j_cdbs->prev_load; j_cdbs->prev_load = 0; } else { - if (time_elapsed >= idle_time) { + if (time_elapsed > idle_time) load = 100 * (time_elapsed - idle_time) / time_elapsed; - } else { - /* - * That can happen if idle_time is returned by - * get_cpu_idle_time_jiffy(). In that case - * idle_time is roughly equal to the difference - * between time_elapsed and "busy time" obtained - * from CPU statistics. Then, the "busy time" - * can end up being greater than time_elapsed - * (for example, if jiffies_64 and the CPU - * statistics are updated by different CPUs), - * so idle_time may in fact be negative. That - * means, though, that the CPU was busy all - * the time (on the rough average) during the - * last sampling interval and 100 can be - * returned as the load. - */ - load = (int)idle_time < 0 ? 100 : 0; - } + else + load = 0; + j_cdbs->prev_load = load; } - if (unlikely((int)idle_time > 2 * sampling_rate)) { + if (unlikely(idle_time > 2 * sampling_rate)) { unsigned int periods = idle_time / sampling_rate; if (periods < idle_periods) -- 2.33.0

反馈: 您发送到kernel@openeuler.org的补丁/补丁集,转换为PR失败! 邮件列表地址:https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/WXY... 失败原因:补丁集缺失封面信息 建议解决方法:请提供补丁集并重新发送您的补丁集到邮件列表 FeedBack: The patch(es) which you have sent to kernel@openeuler.org has been converted to PR failed! Mailing list address: https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/WXY... Failed Reason: the cover of the patches is missing Suggest Solution: please checkout and apply the patches' cover and send all again

From: Lifeng Zheng <zhenglifeng1@huawei.com> mainline inclusion from mainline-v6.7-rc5 commit 1608f0230510489d74a2e24e47054233b7e4678a category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IBQYEH CVE: NA Reference: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi... --------------------------------------------------------------------- It turns out that CPUX will stay on the base frequency after performing these operations: 1. boost all CPUs: echo 1 > /sys/devices/system/cpu/cpufreq/boost 2. offline one CPU: echo 0 > /sys/devices/system/cpu/cpuX/online 3. deboost all CPUs: echo 0 > /sys/devices/system/cpu/cpufreq/boost 4. online CPUX: echo 1 > /sys/devices/system/cpu/cpuX/online 5. boost all CPUs again: echo 1 > /sys/devices/system/cpu/cpufreq/boost This is because max_freq_req of the policy is not updated during the online process, and the value of max_freq_req before the last offline is retained. When the CPU is boosted again, freq_qos_update_request() will do nothing because the old value is the same as the new one. This causes the CPU to stay at the base frequency. Updating max_freq_req in cpufreq_online() will solve this problem. Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://patch.msgid.link/20250117101457.1530653-2-zhenglifeng1@huawei.com [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Xinghai Cen <cenxinghai@h-partners.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- drivers/cpufreq/cpufreq.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 717eedd7868e..99fa68e530ae 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1476,6 +1476,10 @@ static int cpufreq_online(unsigned int cpu) blocking_notifier_call_chain(&cpufreq_policy_notifier_list, CPUFREQ_CREATE_POLICY, policy); + } else { + ret = freq_qos_update_request(policy->max_freq_req, policy->max); + if (ret < 0) + goto out_destroy_policy; } if (cpufreq_driver->get && has_target()) { -- 2.33.0
participants (2)
-
patchwork bot
-
Qi Xi