[PATCH OLK-6.6 00/10] backport cpuidle patches from linux mainline
From: Hongye Lin <linhongye@h-partners.com>

driver inclusion
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/9163

----------------------------------------------------------------------

backport cpuidle patches from linux mainline

Bowen Yu (3):
  update to sysfs_emit() for safer buffer handling
  cpufreq: Update sscanf() to kstrtouint()
  cpufreq: Replace magic number

Hongye Lin (1):
  Revert "cpufreq: CPPC: Don't warn on failing to read perf counters on
    offline cpus"

Jie Zhan (4):
  cpufreq: CPPC: Don't warn if FIE init fails to read counters
  ACPI: CPPC: Factor out and export per-cpu cppc_perf_ctrs_in_pcc_cpu()
  cpufreq: CPPC: Factor out cppc_fie_kworker_init()
  cpufreq: CPPC: Update FIE arch_freq_scale in ticks for non-PCC regs

Pengjie Zhang (1):
  PM / devfreq: use _visible attribute to replace
    create/remove_sysfs_files()

Sumit Gupta (1):
  arm64: topology: Fix false warning in counters_read_on_cpu() for
    same-CPU reads

 arch/arm64/kernel/topology.c   |  21 +++++--
 drivers/acpi/cppc_acpi.c       |  48 ++++++++-------
 drivers/cpufreq/cppc_cpufreq.c | 105 ++++++++++++++++++++++-----------
 drivers/cpufreq/cpufreq.c      |  14 ++---
 drivers/devfreq/devfreq.c      |  99 ++++++++++++++++++-------------
 include/acpi/cppc_acpi.h       |   5 ++
 6 files changed, 181 insertions(+), 111 deletions(-)

--
2.33.0
From: Bowen Yu <yubowen8@huawei.com>

driver inclusion
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/9163

----------------------------------------------------------------------

Replace sprintf() and scnprintf() with sysfs_emit() and sysfs_emit_at()
in the cpufreq core. This ensures safer buffer handling and consistency
with sysfs interfaces.

Fixes: 3ed783c89579 ("smart_grid: cpufreq: introduce smart_grid cpufreq control")
Signed-off-by: Bowen Yu <yubowen8@huawei.com>
Signed-off-by: Hongye Lin <linhongye@h-partners.com>
---
 drivers/cpufreq/cpufreq.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 117591212850..0fef63158f10 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2917,11 +2917,11 @@ static ssize_t show_smart_grid_governor(struct kobject *kobj,
 	mutex_lock(&sg_zone_lock);
 	if (!sg_zone.enable) {
 		mutex_unlock(&sg_zone_lock);
-		return sprintf(buf, "smart_grid governor disable\n");
+		return sysfs_emit(buf, "smart_grid governor disable\n");
 	}
 
 	for (gov_index = 0; gov_index < SMART_GRID_ZONE_NR; gov_index++)
-		len += sprintf(buf + len, "smart_grid-%d: %s\n", gov_index,
+		len += sysfs_emit_at(buf, len, "smart_grid-%d: %s\n", gov_index,
 			       sg_zone.governor_name[gov_index]);
 	mutex_unlock(&sg_zone_lock);
 
@@ -2992,7 +2992,7 @@ define_one_global_rw(smart_grid_governor);
 static ssize_t show_smart_grid_governor_enable(struct kobject *kobj,
 		struct kobj_attribute *attr, char *buf)
 {
-	return sprintf(buf, "%u\n", sg_zone.enable);
+	return sysfs_emit(buf, "%u\n", sg_zone.enable);
 }
 
 static void smart_grid_irq_work(struct irq_work *irq_work)
--
2.33.0
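Reader's note on the patch above: the difference between `sprintf(buf + len, ...)` and an offset-taking, bounded append can be modelled in userspace. This is only a sketch, not the kernel implementation; `emit_at()` and `PAGE_SZ` are hypothetical stand-ins for sysfs_emit_at() and PAGE_SIZE, and the only assumption carried over is that output is clamped to one fixed-size buffer.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define PAGE_SZ 64 /* hypothetical; stands in for PAGE_SIZE, kept small for testing */

/*
 * Userspace model of an offset-based, bounded append. Returns the number of
 * characters actually stored at buf + at and never writes past PAGE_SZ.
 */
static int emit_at(char *buf, int at, const char *fmt, ...)
{
	va_list args;
	int len;

	assert(buf != NULL);
	if (at < 0 || at >= PAGE_SZ)
		return 0; /* invalid offset: write nothing */

	va_start(args, fmt);
	len = vsnprintf(buf + at, (size_t)(PAGE_SZ - at), fmt, args);
	va_end(args);

	if (len < 0)
		return 0;
	/* vsnprintf() reports the untruncated length; clamp to what fit. */
	if (len >= PAGE_SZ - at)
		len = PAGE_SZ - at - 1;
	return len;
}
```

With this shape, a loop of the form `len += emit_at(buf, len, ...)` keeps `len` below the buffer size however many entries are appended, whereas the unbounded form has no such limit.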
From: Bowen Yu <yubowen8@huawei.com>

mainline inclusion
from mainline-v6.16-rc1
commit 1da98dc52b948a6063415d8bae0c60ef89044a8c
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/9163
CVE: NA

Reference: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi...

----------------------------------------------------------------------

In store_scaling_setspeed(), sscanf() is still used to parse the value
written to sysfs. The newer kstrtox() helpers provide more features,
including overflow protection, better error handling, and support for
other numeral systems. It is therefore better to update sscanf() to
kstrtouint().

Signed-off-by: Bowen Yu <yubowen8@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20250519070938.931396-1-yubowen8@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bowen Yu <yubowen8@huawei.com>
Signed-off-by: Hongye Lin <linhongye@h-partners.com>
---
 drivers/cpufreq/cpufreq.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 0fef63158f10..fca922b84490 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -942,9 +942,9 @@ static ssize_t store_scaling_setspeed(struct cpufreq_policy *policy,
 	if (!policy->governor || !policy->governor->store_setspeed)
 		return -EINVAL;
 
-	ret = sscanf(buf, "%u", &freq);
-	if (ret != 1)
-		return -EINVAL;
+	ret = kstrtouint(buf, 0, &freq);
+	if (ret)
+		return ret;
 
 	policy->governor->store_setspeed(policy, freq);
 
--
2.33.0
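Reader's note on the patch above: the behavioural difference can be approximated in userspace. The helper below, `parse_uint_strict()`, is a hypothetical name and only a rough model of strict parsing built on strtoull(); it assumes the semantics the commit message describes (whole-string match, overflow reported as an error, base 0 auto-detecting the numeral system) and is not the kernel's kstrtouint().

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Userspace approximation of strict unsigned parsing: the whole string must
 * be a number (one trailing newline is tolerated) and overflow is an error.
 * Returns 0 on success or a negative errno value.
 */
static int parse_uint_strict(const char *s, unsigned int base, unsigned int *out)
{
	unsigned long long val;
	char *end;

	assert(s != NULL && out != NULL);

	/* strtoull() would silently accept these; a strict parser should not. */
	if (isspace((unsigned char)*s) || *s == '-')
		return -EINVAL;

	errno = 0;
	val = strtoull(s, &end, (int)base);
	if (end == s)
		return -EINVAL; /* no digits at all */
	if (*end == '\n')
		end++; /* sysfs writes usually end with a newline */
	if (*end != '\0')
		return -EINVAL; /* trailing garbage such as "123abc" */
	if (errno == ERANGE || val > UINT_MAX)
		return -ERANGE;

	*out = (unsigned int)val;
	return 0;
}
```

By contrast, `sscanf(buf, "%u", &freq)` reports one successful conversion for input such as "123abc", so trailing garbage goes unnoticed. Note also that with base 0 a leading "0x" or "0" selects hexadecimal or octal, which changes how inputs like "010" are read.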
From: Bowen Yu <yubowen8@huawei.com>

mainline inclusion
from mainline-v6.16-rc1
commit 9c5075fc9d322670d4a82881bf41922a90fe423a
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/9163
CVE: NA

Reference: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi...

----------------------------------------------------------------------

Setting the length of str_governor with a magic number could cause an
overflow if the maximum length increases; it is better to use the
defined macro in this case.

Signed-off-by: Bowen Yu <yubowen8@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20250519070908.930879-1-yubowen8@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bowen Yu <yubowen8@huawei.com>
Signed-off-by: Hongye Lin <linhongye@h-partners.com>
---
 drivers/cpufreq/cpufreq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index fca922b84490..6fb64dece82e 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -831,7 +831,7 @@ static ssize_t show_scaling_governor(struct cpufreq_policy *policy, char *buf)
 static ssize_t store_scaling_governor(struct cpufreq_policy *policy,
 					const char *buf, size_t count)
 {
-	char str_governor[16];
+	char str_governor[CPUFREQ_NAME_LEN];
 	int ret;
 
 	ret = sscanf(buf, "%15s", str_governor);
--
2.33.0
From: Pengjie Zhang <zhangpengjie2@huawei.com>

mainline inclusion
from mainline-v7.1-rc1
commit 943a872fe41a8352d64b20de77d8b707978e5732
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/9163
CVE: NA

Reference: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi...

----------------------------------------------------------------------

Previously, non-generic attributes (polling_interval, timer) used separate
create/delete logic, leading to race conditions during concurrent access in
creation/deletion. Multi-threaded operations also caused inconsistencies
between governor capabilities and attribute states.

1. Use is_visible + sysfs_update_group() to unify management of these
   attributes, eliminating creation/deletion races.
2. Add locks and validation to these attributes, ensuring consistency
   between current governor capabilities and attribute operations in
   multi-threaded environments.

Signed-off-by: Pengjie Zhang <zhangpengjie2@huawei.com>
Signed-off-by: Hongye Lin <linhongye@h-partners.com>
---
 drivers/devfreq/devfreq.c | 99 +++++++++++++++++++++++----------------
 1 file changed, 58 insertions(+), 41 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index 29251141460e..f3275b33b3f5 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -38,6 +38,7 @@
 
 static struct class *devfreq_class;
 static struct dentry *devfreq_debugfs;
+static const struct attribute_group gov_attr_group;
 
 /*
  * devfreq core provides delayed work based load monitoring helper
@@ -785,11 +786,6 @@ static void devfreq_dev_release(struct device *dev)
 	kfree(devfreq);
 }
 
-static void create_sysfs_files(struct devfreq *devfreq,
-				const struct devfreq_governor *gov);
-static void remove_sysfs_files(struct devfreq *devfreq,
-				const struct devfreq_governor *gov);
-
 /**
  * devfreq_add_device() - Add devfreq feature to the device
  * @dev:	the device to add devfreq feature.
@@ -956,7 +952,10 @@ struct devfreq *devfreq_add_device(struct device *dev,
 			__func__);
 		goto err_init;
 	}
-	create_sysfs_files(devfreq, devfreq->governor);
+
+	err = sysfs_update_group(&devfreq->dev.kobj, &gov_attr_group);
+	if (err)
+		goto err_init;
 
 	list_add(&devfreq->node, &devfreq_list);
 
@@ -998,9 +997,7 @@ int devfreq_remove_device(struct devfreq *devfreq)
 	if (devfreq->governor) {
 		devfreq->governor->event_handler(devfreq,
 						 DEVFREQ_GOV_STOP, NULL);
-		remove_sysfs_files(devfreq, devfreq->governor);
 	}
-
 	device_unregister(&devfreq->dev);
 
 	return 0;
@@ -1460,7 +1457,6 @@ static ssize_t governor_store(struct device *dev, struct device_attribute *attr,
 			 __func__, df->governor->name, ret);
 		goto out;
 	}
-	remove_sysfs_files(df, df->governor);
 
 	/*
 	 * Start the new governor and create the specific sysfs files
@@ -1489,7 +1485,7 @@ static ssize_t governor_store(struct device *dev, struct device_attribute *attr,
 	 * Create the sysfs files for the new governor. But if failed to start
 	 * the new governor, restore the sysfs files of previous governor.
 	 */
-	create_sysfs_files(df, df->governor);
+	ret = sysfs_update_group(&df->dev.kobj, &gov_attr_group);
 
 out:
 	mutex_unlock(&devfreq_list_lock);
@@ -1810,14 +1806,16 @@ static struct attribute *devfreq_attrs[] = {
 	&dev_attr_trans_stat.attr,
 	NULL,
 };
-ATTRIBUTE_GROUPS(devfreq);
 
 static ssize_t polling_interval_show(struct device *dev,
 				     struct device_attribute *attr, char *buf)
 {
 	struct devfreq *df = to_devfreq(dev);
 
-	if (!df->profile)
+	guard(mutex)(&devfreq_list_lock);
+
+	if (!df->profile || !df->governor ||
+	    !IS_SUPPORTED_ATTR(df->governor->attrs, POLLING_INTERVAL))
 		return -EINVAL;
 
 	return sprintf(buf, "%d\n", df->profile->polling_ms);
@@ -1831,7 +1829,10 @@ static ssize_t polling_interval_store(struct device *dev,
 	unsigned int value;
 	int ret;
 
-	if (!df->governor)
+	guard(mutex)(&devfreq_list_lock);
+
+	if (!df->governor ||
+	    !IS_SUPPORTED_ATTR(df->governor->attrs, POLLING_INTERVAL))
 		return -EINVAL;
 
 	ret = sscanf(buf, "%u", &value);
@@ -1850,7 +1851,10 @@ static ssize_t timer_show(struct device *dev,
 {
 	struct devfreq *df = to_devfreq(dev);
 
-	if (!df->profile)
+	guard(mutex)(&devfreq_list_lock);
+
+	if (!df->profile || !df->governor ||
+	    !IS_SUPPORTED_ATTR(df->governor->attrs, TIMER))
 		return -EINVAL;
 
 	return sprintf(buf, "%s\n", timer_name[df->profile->timer]);
@@ -1864,7 +1868,10 @@ static ssize_t timer_store(struct device *dev, struct device_attribute *attr,
 	int timer = -1;
 	int ret = 0, i;
 
-	if (!df->governor || !df->profile)
+	guard(mutex)(&devfreq_list_lock);
+
+	if (!df->governor || !df->profile ||
+	    !IS_SUPPORTED_ATTR(df->governor->attrs, TIMER))
 		return -EINVAL;
 
 	ret = sscanf(buf, "%16s", str_timer);
@@ -1908,37 +1915,47 @@ static ssize_t timer_store(struct device *dev, struct device_attribute *attr,
 }
 static DEVICE_ATTR_RW(timer);
 
-#define CREATE_SYSFS_FILE(df, name)					\
-{									\
-	int ret;							\
-	ret = sysfs_create_file(&df->dev.kobj, &dev_attr_##name.attr);	\
-	if (ret < 0) {							\
-		dev_warn(&df->dev,					\
-			"Unable to create attr(%s)\n", "##name");	\
-	}								\
-}									\
+static struct attribute *governor_attrs[] = {
+	&dev_attr_polling_interval.attr,
+	&dev_attr_timer.attr,
+	NULL
+};
 
-/* Create the specific sysfs files which depend on each governor. */
-static void create_sysfs_files(struct devfreq *devfreq,
-				const struct devfreq_governor *gov)
+static umode_t gov_attr_visible(struct kobject *kobj,
+				struct attribute *attr, int n)
 {
-	if (IS_SUPPORTED_ATTR(gov->attrs, POLLING_INTERVAL))
-		CREATE_SYSFS_FILE(devfreq, polling_interval);
-	if (IS_SUPPORTED_ATTR(gov->attrs, TIMER))
-		CREATE_SYSFS_FILE(devfreq, timer);
-}
+	struct device *dev = kobj_to_dev(kobj);
+	struct devfreq *df = to_devfreq(dev);
 
-/* Remove the specific sysfs files which depend on each governor. */
-static void remove_sysfs_files(struct devfreq *devfreq,
-				const struct devfreq_governor *gov)
-{
-	if (IS_SUPPORTED_ATTR(gov->attrs, POLLING_INTERVAL))
-		sysfs_remove_file(&devfreq->dev.kobj,
-				&dev_attr_polling_interval.attr);
-	if (IS_SUPPORTED_ATTR(gov->attrs, TIMER))
-		sysfs_remove_file(&devfreq->dev.kobj, &dev_attr_timer.attr);
+	if (!df->governor || !df->governor->attrs)
+		return 0;
+
+	if (attr == &dev_attr_polling_interval.attr &&
+	    IS_SUPPORTED_ATTR(df->governor->attrs, POLLING_INTERVAL))
+		return attr->mode;
+
+	if (attr == &dev_attr_timer.attr &&
+	    IS_SUPPORTED_ATTR(df->governor->attrs, TIMER))
+		return attr->mode;
+
+	return 0;
 }
 
+static const struct attribute_group devfreq_group = {
+	.attrs = devfreq_attrs,
+};
+
+static const struct attribute_group gov_attr_group = {
+	.attrs = governor_attrs,
+	.is_visible = gov_attr_visible,
+};
+
+static const struct attribute_group *devfreq_groups[] = {
+	&devfreq_group,
+	&gov_attr_group,
+	NULL
+};
+
 /**
  * devfreq_summary_show() - Show the summary of the devfreq devices
  * @s:		seq_file instance to show the summary of devfreq devices
--
2.33.0
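Reader's note on the patch above: the core idea, deriving each attribute's visibility from the current governor through one callback instead of creating and removing files by hand, can be modelled without any kernel headers. Everything in this sketch is hypothetical (the `CAP_*` bits, `struct device_model`, `attr_visible()`, `count_visible()`); it only illustrates the pattern and makes no claim about how sysfs implements it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability bits, mirroring the idea of governor->attrs. */
#define CAP_POLLING_INTERVAL (1u << 0)
#define CAP_TIMER            (1u << 1)

struct attr {
	const char *name;
	unsigned int mode;      /* file mode when visible */
	unsigned int needs_cap; /* capability the governor must advertise */
};

struct governor {
	const char *name;
	unsigned int attrs;
};

struct device_model {
	const struct governor *governor;
};

static const struct attr gov_attrs[] = {
	{ "polling_interval", 0644, CAP_POLLING_INTERVAL },
	{ "timer",            0644, CAP_TIMER },
};

/* Returns the file mode if the attribute should exist, 0 to hide it. */
static unsigned int attr_visible(const struct device_model *dev,
				 const struct attr *a)
{
	assert(dev != NULL && a != NULL);
	if (!dev->governor || !dev->governor->attrs)
		return 0;
	return (dev->governor->attrs & a->needs_cap) ? a->mode : 0;
}

/* Re-evaluates the whole group; a stand-in for updating it after a switch. */
static size_t count_visible(const struct device_model *dev)
{
	size_t i, n = 0;

	for (i = 0; i < sizeof(gov_attrs) / sizeof(gov_attrs[0]); i++)
		if (attr_visible(dev, &gov_attrs[i]))
			n++;
	return n;
}
```

Because visibility is a pure function of the current governor, switching governors needs only one re-evaluation of the group, and there is no window in which a create for the new governor can interleave with a remove for the old one.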
From: Hongye Lin <linhongye@h-partners.com>

driver inclusion
category: cleanup
bugzilla: https://atomgit.com/openeuler/kernel/issues/9163

----------------------------------------------------------------------

This reverts commit d08e86c77069fbbdd7fdbdaa408c198223bc0900.

Fixes: d08e86c77069 ("cpufreq: CPPC: Don't warn on failing to read perf counters on offline cpus")
Signed-off-by: Hongye Lin <linhongye@h-partners.com>
---
 drivers/cpufreq/cppc_cpufreq.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index b0a73531a4ea..073a07cb5939 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -178,14 +178,16 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
 		init_irq_work(&cppc_fi->irq_work, cppc_irq_work);
 
 		ret = cppc_get_perf_ctrs(cpu, &cppc_fi->prev_perf_fb_ctrs);
-		if (ret && cpu_online(cpu)) {
+		if (ret) {
+			pr_warn("%s: failed to read perf counters for cpu:%d: %d\n",
+				__func__, cpu, ret);
+
 			/*
 			 * Don't abort if the CPU was offline while the driver
 			 * was getting registered.
 			 */
-			pr_debug("%s: failed to read perf counters for cpu:%d: %d\n",
-				 __func__, cpu, ret);
-			return;
+			if (cpu_online(cpu))
+				return;
 		}
 	}
 
--
2.33.0
From: Jie Zhan <zhanjie9@hisilicon.com>

mainline inclusion
from mainline-v6.19-rc1
commit 1971b18785d198ae5adbb861136ae5c0f195c14d
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/9163
CVE: NA

Reference: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi...

----------------------------------------------------------------------

During the CPPC FIE initialization, reading perf counters on offline cpus
should be expected to fail. Don't warn on this case.

Also, change the error log level to debug since FIE is optional.

Co-developed-by: Bowen Yu <yubowen8@huawei.com>
Signed-off-by: Bowen Yu <yubowen8@huawei.com> # Changing loglevel to debug
Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com>
[ Viresh: Added back the dropped comment. ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Hongye Lin <linhongye@h-partners.com>
---
 drivers/cpufreq/cppc_cpufreq.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 073a07cb5939..85b405c878d9 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -178,16 +178,15 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
 		init_irq_work(&cppc_fi->irq_work, cppc_irq_work);
 
 		ret = cppc_get_perf_ctrs(cpu, &cppc_fi->prev_perf_fb_ctrs);
-		if (ret) {
-			pr_warn("%s: failed to read perf counters for cpu:%d: %d\n",
-				__func__, cpu, ret);
 
-			/*
-			 * Don't abort if the CPU was offline while the driver
-			 * was getting registered.
-			 */
-			if (cpu_online(cpu))
-				return;
+		/*
+		 * Don't abort as the CPU was offline while the driver was
+		 * getting registered.
+		 */
+		if (ret && cpu_online(cpu)) {
+			pr_debug("%s: failed to read perf counters for cpu:%d: %d\n",
+				 __func__, cpu, ret);
+			return;
 		}
 	}
 
--
2.33.0
From: Jie Zhan <zhanjie9@hisilicon.com>

mainline inclusion
from mainline-v7.0-rc1
commit f9cadb3d56912a70571fdd95f426b757557c465b
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/9163
CVE: NA

Reference: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi...

----------------------------------------------------------------------

Factor out cppc_perf_ctrs_in_pcc_cpu() for checking whether per-cpu CPC
regs are defined in PCC channels, and export it out for further use.

Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com>
Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Hongye Lin <linhongye@h-partners.com>
---
 drivers/acpi/cppc_acpi.c | 48 ++++++++++++++++++++++------------------
 include/acpi/cppc_acpi.h |  5 +++++
 2 files changed, 32 insertions(+), 21 deletions(-)

diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index cdfd189b1932..52522c57231f 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -1356,6 +1356,32 @@ int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps)
 }
 EXPORT_SYMBOL_GPL(cppc_get_perf_caps);
 
+/**
+ * cppc_perf_ctrs_in_pcc_cpu - Check if any perf counters of a CPU are in PCC.
+ * @cpu: CPU on which to check perf counters.
+ *
+ * Return: true if any of the counters are in PCC regions, false otherwise
+ */
+bool cppc_perf_ctrs_in_pcc_cpu(unsigned int cpu)
+{
+	struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpu);
+	struct cpc_register_resource *ref_perf_reg;
+
+	/*
+	 * If reference perf register is not supported then we should use the
+	 * nominal perf value
+	 */
+	ref_perf_reg = &cpc_desc->cpc_regs[REFERENCE_PERF];
+	if (!CPC_SUPPORTED(ref_perf_reg))
+		ref_perf_reg = &cpc_desc->cpc_regs[NOMINAL_PERF];
+
+	return CPC_IN_PCC(&cpc_desc->cpc_regs[DELIVERED_CTR]) ||
+		CPC_IN_PCC(&cpc_desc->cpc_regs[REFERENCE_CTR]) ||
+		CPC_IN_PCC(&cpc_desc->cpc_regs[CTR_WRAP_TIME]) ||
+		CPC_IN_PCC(ref_perf_reg);
+}
+EXPORT_SYMBOL_GPL(cppc_perf_ctrs_in_pcc_cpu);
+
 /**
  * cppc_perf_ctrs_in_pcc - Check if any perf counters are in a PCC region.
  *
@@ -1370,27 +1396,7 @@ bool cppc_perf_ctrs_in_pcc(void)
 	int cpu;
 
 	for_each_online_cpu(cpu) {
-		struct cpc_register_resource *ref_perf_reg;
-		struct cpc_desc *cpc_desc;
-
-		cpc_desc = per_cpu(cpc_desc_ptr, cpu);
-
-		if (CPC_IN_PCC(&cpc_desc->cpc_regs[DELIVERED_CTR]) ||
-		    CPC_IN_PCC(&cpc_desc->cpc_regs[REFERENCE_CTR]) ||
-		    CPC_IN_PCC(&cpc_desc->cpc_regs[CTR_WRAP_TIME]))
-			return true;
-
-
-		ref_perf_reg = &cpc_desc->cpc_regs[REFERENCE_PERF];
-
-		/*
-		 * If reference perf register is not supported then we should
-		 * use the nominal perf value
-		 */
-		if (!CPC_SUPPORTED(ref_perf_reg))
-			ref_perf_reg = &cpc_desc->cpc_regs[NOMINAL_PERF];
-
-		if (CPC_IN_PCC(ref_perf_reg))
+		if (cppc_perf_ctrs_in_pcc_cpu(cpu))
 			return true;
 	}
 
diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
index e8960208f576..5feec732c96d 100644
--- a/include/acpi/cppc_acpi.h
+++ b/include/acpi/cppc_acpi.h
@@ -145,6 +145,7 @@ extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs);
 extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls);
 extern int cppc_set_enable(int cpu, bool enable);
 extern int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps);
+extern bool cppc_perf_ctrs_in_pcc_cpu(unsigned int cpu);
 extern bool cppc_perf_ctrs_in_pcc(void);
 extern unsigned int cppc_perf_to_khz(struct cppc_perf_caps *caps, unsigned int perf);
 extern unsigned int cppc_khz_to_perf(struct cppc_perf_caps *caps, unsigned int freq);
@@ -195,6 +196,10 @@ static inline int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps)
 {
 	return -ENOTSUPP;
 }
+static inline bool cppc_perf_ctrs_in_pcc_cpu(unsigned int cpu)
+{
+	return false;
+}
 static inline bool cppc_perf_ctrs_in_pcc(void)
 {
 	return false;
--
2.33.0
From: Jie Zhan <zhanjie9@hisilicon.com>

mainline inclusion
from mainline-v7.0-rc1
commit 206b6612556398e717b1e293d96992d5ab2b8f32
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/9163
CVE: NA

Reference: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi...

----------------------------------------------------------------------

Factor out the CPPC FIE kworker init in cppc_freq_invariance_init() because
it's a standalone procedure for use when the CPC regs are in PCC channels.

Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Hongye Lin <linhongye@h-partners.com>
---
 drivers/cpufreq/cppc_cpufreq.c | 29 +++++++++++++++++------------
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 85b405c878d9..612adb06b942 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -220,7 +220,7 @@ static void cppc_cpufreq_cpu_fie_exit(struct cpufreq_policy *policy)
 	}
 }
 
-static void __init cppc_freq_invariance_init(void)
+static void cppc_fie_kworker_init(void)
 {
 	struct sched_attr attr = {
 		.size		= sizeof(struct sched_attr),
@@ -237,17 +237,6 @@ static void __init cppc_freq_invariance_init(void)
 	};
 	int ret;
 
-	if (fie_disabled != FIE_ENABLED && fie_disabled != FIE_DISABLED) {
-		fie_disabled = FIE_ENABLED;
-		if (cppc_perf_ctrs_in_pcc()) {
-			pr_info("FIE not enabled on systems with registers in PCC\n");
-			fie_disabled = FIE_DISABLED;
-		}
-	}
-
-	if (fie_disabled)
-		return;
-
 	kworker_fie = kthread_create_worker(0, "cppc_fie");
 	if (IS_ERR(kworker_fie)) {
 		pr_warn("%s: failed to create kworker_fie: %ld\n", __func__,
@@ -265,6 +254,22 @@ static void __init cppc_freq_invariance_init(void)
 	}
 }
 
+static void __init cppc_freq_invariance_init(void)
+{
+	if (fie_disabled != FIE_ENABLED && fie_disabled != FIE_DISABLED) {
+		fie_disabled = FIE_ENABLED;
+		if (cppc_perf_ctrs_in_pcc()) {
+			pr_info("FIE not enabled on systems with registers in PCC\n");
+			fie_disabled = FIE_DISABLED;
+		}
+	}
+
+	if (fie_disabled)
+		return;
+
+	cppc_fie_kworker_init();
+}
+
 static void cppc_freq_invariance_exit(void)
 {
 	if (fie_disabled)
--
2.33.0
From: Jie Zhan <zhanjie9@hisilicon.com>

mainline inclusion
from mainline-v7.0-rc1
commit 997c021abc6eb9cf7df39fa77fa5e666ad55e3a3
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/9163
CVE: NA

Reference: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi...

----------------------------------------------------------------------

Currently, the CPPC Frequency Invariance Engine (FIE) is invoked from the
scheduler tick but defers the update of arch_freq_scale to a separate
thread because cppc_get_perf_ctrs() would sleep if the CPC regs are in PCC.

However, this deferred update mechanism is unnecessary and introduces extra
overhead for non-PCC register spaces (e.g. System Memory or FFH), where
accessing the regs won't sleep and can be safely performed from the tick
context.

Furthermore, with the CPPC FIE registered, it throws repeated warnings of
"cppc_scale_freq_workfn: failed to read perf counters" on our platform with
the CPC regs in System Memory and a power-down idle state enabled. That's
because the remote CPU can be in a power-down idle state, and reading its
perf counters returns 0. Moving the FIE handling back to the scheduler
tick process makes the CPU handle its own perf counters, so it won't be
idle and the issue would be inherently solved.

To address the above issues, update arch_freq_scale directly in ticks for
non-PCC regs and keep the deferred update mechanism for PCC regs.

Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Hongye Lin <linhongye@h-partners.com>
---
 drivers/cpufreq/cppc_cpufreq.c | 77 +++++++++++++++++++++++-----------
 1 file changed, 52 insertions(+), 25 deletions(-)

diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 612adb06b942..def8c835db8e 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -81,32 +81,25 @@ struct fb_ctr_pair {
 };
 
 /**
- * cppc_scale_freq_workfn - CPPC arch_freq_scale updater for frequency invariance
- * @work: The work item.
+ * __cppc_scale_freq_tick - CPPC arch_freq_scale updater for frequency invariance
+ * @cppc_fi: per-cpu CPPC FIE data.
  *
- * The CPPC driver register itself with the topology core to provide its own
+ * The CPPC driver registers itself with the topology core to provide its own
  * implementation (cppc_scale_freq_tick()) of topology_scale_freq_tick() which
  * gets called by the scheduler on every tick.
  *
  * Note that the arch specific counters have higher priority than CPPC counters,
  * if available, though the CPPC driver doesn't need to have any special
  * handling for that.
- *
- * On an invocation of cppc_scale_freq_tick(), we schedule an irq work (since we
- * reach here from hard-irq context), which then schedules a normal work item
- * and cppc_scale_freq_workfn() updates the per_cpu arch_freq_scale variable
- * based on the counter updates since the last tick.
  */
-static void cppc_scale_freq_workfn(struct kthread_work *work)
+static void __cppc_scale_freq_tick(struct cppc_freq_invariance *cppc_fi)
 {
-	struct cppc_freq_invariance *cppc_fi;
 	struct cppc_perf_fb_ctrs fb_ctrs = {0};
 	struct cppc_cpudata *cpu_data;
 	unsigned long local_freq_scale;
 	u64 perf;
 	int ret;
 
-	cppc_fi = container_of(work, struct cppc_freq_invariance, work);
 	cpu_data = cppc_fi->cpu_data;
 
 	ret = cppc_get_perf_ctrs(cppc_fi->cpu, &fb_ctrs);
@@ -138,6 +131,24 @@ static void cppc_scale_freq_workfn(struct kthread_work *work)
 	per_cpu(arch_freq_scale, cppc_fi->cpu) = local_freq_scale;
 }
 
+static void cppc_scale_freq_tick(void)
+{
+	__cppc_scale_freq_tick(&per_cpu(cppc_freq_inv, smp_processor_id()));
+}
+
+static struct scale_freq_data cppc_sftd = {
+	.source = SCALE_FREQ_SOURCE_CPPC,
+	.set_freq_scale = cppc_scale_freq_tick,
+};
+
+static void cppc_scale_freq_workfn(struct kthread_work *work)
+{
+	struct cppc_freq_invariance *cppc_fi;
+
+	cppc_fi = container_of(work, struct cppc_freq_invariance, work);
+	__cppc_scale_freq_tick(cppc_fi);
+}
+
 static void cppc_irq_work(struct irq_work *irq_work)
 {
 	struct cppc_freq_invariance *cppc_fi;
@@ -146,7 +157,14 @@ static void cppc_irq_work(struct irq_work *irq_work)
 	kthread_queue_work(kworker_fie, &cppc_fi->work);
 }
 
-static void cppc_scale_freq_tick(void)
+/*
+ * Reading perf counters may sleep if the CPC regs are in PCC.  Thus, we
+ * schedule an irq work in scale_freq_tick (since we reach here from hard-irq
+ * context), which then schedules a normal work item cppc_scale_freq_workfn()
+ * that updates the per_cpu arch_freq_scale variable based on the counter
+ * updates since the last tick.
+ */
+static void cppc_scale_freq_tick_pcc(void)
 {
 	struct cppc_freq_invariance *cppc_fi = &per_cpu(cppc_freq_inv, smp_processor_id());
 
@@ -157,13 +175,14 @@ static void cppc_scale_freq_tick(void)
 	irq_work_queue(&cppc_fi->irq_work);
 }
 
-static struct scale_freq_data cppc_sftd = {
+static struct scale_freq_data cppc_sftd_pcc = {
 	.source = SCALE_FREQ_SOURCE_CPPC,
-	.set_freq_scale = cppc_scale_freq_tick,
+	.set_freq_scale = cppc_scale_freq_tick_pcc,
 };
 
 static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
 {
+	struct scale_freq_data *sftd = &cppc_sftd;
 	struct cppc_freq_invariance *cppc_fi;
 	int cpu, ret;
 
@@ -174,8 +193,11 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
 		cppc_fi = &per_cpu(cppc_freq_inv, cpu);
 		cppc_fi->cpu = cpu;
 		cppc_fi->cpu_data = policy->driver_data;
-		kthread_init_work(&cppc_fi->work, cppc_scale_freq_workfn);
-		init_irq_work(&cppc_fi->irq_work, cppc_irq_work);
+		if (cppc_perf_ctrs_in_pcc_cpu(cpu)) {
+			kthread_init_work(&cppc_fi->work, cppc_scale_freq_workfn);
+			init_irq_work(&cppc_fi->irq_work, cppc_irq_work);
+			sftd = &cppc_sftd_pcc;
+		}
 
 		ret = cppc_get_perf_ctrs(cpu, &cppc_fi->prev_perf_fb_ctrs);
 
@@ -191,7 +213,7 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
 	}
 
 	/* Register for freq-invariance */
-	topology_set_scale_freq_source(&cppc_sftd, policy->cpus);
+	topology_set_scale_freq_source(sftd, policy->cpus);
 }
 
 /*
@@ -214,6 +236,8 @@ static void cppc_cpufreq_cpu_fie_exit(struct cpufreq_policy *policy)
 	topology_clear_scale_freq_source(SCALE_FREQ_SOURCE_CPPC, policy->related_cpus);
 
 	for_each_cpu(cpu, policy->related_cpus) {
+		if (!cppc_perf_ctrs_in_pcc_cpu(cpu))
+			continue;
 		cppc_fi = &per_cpu(cppc_freq_inv, cpu);
 		irq_work_sync(&cppc_fi->irq_work);
 		kthread_cancel_work_sync(&cppc_fi->work);
@@ -242,6 +266,7 @@ static void cppc_fie_kworker_init(void)
 		pr_warn("%s: failed to create kworker_fie: %ld\n", __func__,
 			PTR_ERR(kworker_fie));
 		fie_disabled = FIE_DISABLED;
+		kworker_fie = NULL;
 		return;
 	}
 
@@ -251,20 +276,24 @@ static void cppc_fie_kworker_init(void)
 			ret);
 		kthread_destroy_worker(kworker_fie);
 		fie_disabled = FIE_DISABLED;
+		kworker_fie = NULL;
 	}
 }
 
 static void __init cppc_freq_invariance_init(void)
 {
-	if (fie_disabled != FIE_ENABLED && fie_disabled != FIE_DISABLED) {
-		fie_disabled = FIE_ENABLED;
-		if (cppc_perf_ctrs_in_pcc()) {
+	bool perf_ctrs_in_pcc = cppc_perf_ctrs_in_pcc();
+
+	if (fie_disabled == FIE_UNSET) {
+		if (perf_ctrs_in_pcc) {
 			pr_info("FIE not enabled on systems with registers in PCC\n");
 			fie_disabled = FIE_DISABLED;
+		} else {
+			fie_disabled = FIE_ENABLED;
 		}
 	}
 
-	if (fie_disabled)
+	if (fie_disabled || !perf_ctrs_in_pcc)
 		return;
 
 	cppc_fie_kworker_init();
@@ -272,10 +301,8 @@ static void __init cppc_freq_invariance_init(void)
 
 static void cppc_freq_invariance_exit(void)
 {
-	if (fie_disabled)
-		return;
-
-	kthread_destroy_worker(kworker_fie);
+	if (kworker_fie)
+		kthread_destroy_worker(kworker_fie);
 }
 
 #else
--
2.33.0
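Reader's note on the patch above: the per-CPU choice between updating in the tick and deferring to a worker reduces to one predicate over where the counter registers live. The sketch below is a simplified userspace model with hypothetical names (`struct cpu_regs`, `pick_update_path()`, `need_worker()`); it ignores the fie_disabled parameter handling and every other driver detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU description: which counter registers live in PCC. */
struct cpu_regs {
	bool delivered_ctr_in_pcc;
	bool reference_ctr_in_pcc;
	bool ctr_wrap_time_in_pcc;
	bool ref_perf_in_pcc;
};

enum update_path {
	UPDATE_IN_TICK,  /* registers can be read without sleeping */
	UPDATE_DEFERRED, /* PCC access may sleep: irq work, then kthread work */
};

/* True if any register needed for the scale computation is in PCC. */
static bool ctrs_in_pcc(const struct cpu_regs *regs)
{
	assert(regs != NULL);
	return regs->delivered_ctr_in_pcc || regs->reference_ctr_in_pcc ||
	       regs->ctr_wrap_time_in_pcc || regs->ref_perf_in_pcc;
}

static enum update_path pick_update_path(const struct cpu_regs *regs)
{
	return ctrs_in_pcc(regs) ? UPDATE_DEFERRED : UPDATE_IN_TICK;
}

/* A worker thread is only worth creating if some CPU needs deferral. */
static bool need_worker(const struct cpu_regs *cpus, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (ctrs_in_pcc(&cpus[i]))
			return true;
	return false;
}
```

The same predicate is what lets the exit path skip cancelling work items for CPUs that never had them initialised.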
From: Sumit Gupta <sumitg@nvidia.com>

mainline inclusion
from mainline-v6.18-rc1
commit df6e4ab654dc482c1d45776257a62ac10e14086c
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/9163
CVE: NA

Reference: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi...

----------------------------------------------------------------------

The counters_read_on_cpu() function warns when called with IRQs disabled
to prevent deadlock in smp_call_function_single(). However, this warning
is spurious when reading counters on the current CPU, since no IPI is
needed for same CPU reads.

Commit 12eb8f4fff24 ("cpufreq: CPPC: Update FIE arch_freq_scale in ticks
for non-PCC regs") changed the CPPC Frequency Invariance Engine to read
AMU counters directly from the scheduler tick for non-PCC register spaces
(like FFH), instead of deferring to a kthread. This means
counters_read_on_cpu() is now called with IRQs disabled from the tick
handler, triggering the warning.

Fix this by restructuring the logic: when IRQs are disabled (tick
context), call the function directly for same-CPU reads. Otherwise use
smp_call_function_single().

Fixes: 997c021abc6e ("cpufreq: CPPC: Update FIE arch_freq_scale in ticks for non-PCC regs")
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Hongye Lin <linhongye@h-partners.com>
---
 arch/arm64/kernel/topology.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index d9a6370c05c5..d78c917b3f9c 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -314,16 +314,25 @@ static inline
 int counters_read_on_cpu(int cpu, smp_call_func_t func, u64 *val)
 {
 	/*
-	 * Abort call on counterless CPU or when interrupts are
-	 * disabled - can lead to deadlock in smp sync call.
+	 * Abort call on counterless CPU.
 	 */
 	if (!cpu_has_amu_feat(cpu))
 		return -EOPNOTSUPP;
 
-	if (WARN_ON_ONCE(irqs_disabled()))
-		return -EPERM;
-
-	smp_call_function_single(cpu, func, val, 1);
+	if (irqs_disabled()) {
+		/*
+		 * When IRQs are disabled (tick path: sched_tick ->
+		 * topology_scale_freq_tick or cppc_scale_freq_tick), only local
+		 * CPU counter reads are allowed. Remote CPU counter read would
+		 * require smp_call_function_single() which is unsafe with IRQs
+		 * disabled.
+		 */
+		if (WARN_ON_ONCE(cpu != smp_processor_id()))
+			return -EPERM;
+		func(val);
+	} else {
+		smp_call_function_single(cpu, func, val, 1);
+	}
 
 	return 0;
 }
--
2.33.0
Feedback:
The patch(es) which you have sent to the kernel@openeuler.org mailing list have been converted to a pull request successfully!
Pull request link: https://atomgit.com/openeuler/kernel/merge_requests/22739
Mailing list address: https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/3FV...