[PATCH OLK-6.6 0/6] Some bugfixes for interference
This series addresses several build and runtime issues discovered in the cgroup interference (IFS) subsystem. Patch 0001~0003 fix some build errors and runtime null-ptr-deref. Patch 0004 adds a fallback to sched_clock when TSC is unavailable, ensuring IFS can still function on systems without TSC. Patch 0005 fixes the missing interference.stat file in v1 hierarchy. Patch 0006 adds a version compatibility check between the IFS subsystem and cgroup to prevent potential mismatches. It would be changed as shown below: ```bash [root@kp91 cpu,cpuacct]# pwd /sys/fs/cgroup/cpu,cpuacct [root@kp91 cpu,cpuacct]# mkdir test [root@kp91 cpu,cpuacct]# mount -t cgroup2 none test [root@kp91 cpu,cpuacct]# [root@kp91 cpu,cpuacct]# cd test/ [root@kp91 test]# cat interference.stat cat: interference.stat: Operation not supported [root@kp91 test]# [root@kp91 test]# dmesg | grep mismatch [ 1060.893809] cgroup version mismatch: subsystem v1, cgroup v2 ``` Tengda Wu (6): interference: Avoid null pointer dereference in cgroup_ifs_add_files interference: Resolve arm32 build error by restricting IFS to x86/arm64 interference: Add dependency on CONFIG_SCHED_INFO to avoid build error interference: Add fallback to sched_clock when TSC is unavailable interference: Fix missing interference.stat file in v1 hierarchy interference: Add version compatibility check between subsystem and cgroup include/linux/cgroup.h | 25 ++++++++++++++------- init/Kconfig | 2 ++ kernel/cgroup/ifs.c | 51 +++++++++++++++++++++++++++++------------- kernel/sched/stats.h | 2 +- 4 files changed, 56 insertions(+), 24 deletions(-) -- 2.34.1
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ICAOAT ----------------------------------------- IFS subsystem disables itself in cgroup_ifs_tsc_init() when TSC is not detected, skipping the initialization of cgroup_ifs_files. However, ifs_enable may remain true in this case, leading to a subsequent call to cgroup_ifs_add_files that uses the uninitialized cgroup_ifs_files and causes a null pointer dereference. Fix this by explicitly setting ifs_enable to false when IFS is disabled. Fixes: 9c735dcd118d ("interference: Use hardware timer counter") Signed-off-by: Tengda Wu <wutengda2@huawei.com> --- kernel/cgroup/ifs.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/cgroup/ifs.c b/kernel/cgroup/ifs.c index e6419899d725..8106e91f1e92 100644 --- a/kernel/cgroup/ifs.c +++ b/kernel/cgroup/ifs.c @@ -340,6 +340,7 @@ static int cgroup_ifs_tsc_init(void) #endif if (!freq) { pr_warn("Disable CGROUP IFS: no constant tsc\n"); + ifs_enable = false; return -1; } -- 2.34.1
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ICAOAT ----------------------------------------- The 64-bit division operations in tdesc_print() and tdesc_init() cause linkage failures on 32-bit ARM architectures, with errors such as: arm-linux-gnueabi-ld: kernel/cgroup/ifs.o: in function `tdesc_print': ifs.c:(.text.unlikely+0xa8): undefined reference to `__aeabi_uldivmod' arm-linux-gnueabi-ld: ifs.c:(.text.unlikely+0xc4): undefined reference to `__aeabi_uldivmod' arm-linux-gnueabi-ld: ifs.c:(.text.unlikely+0x130): undefined reference to `__aeabi_uldivmod' arm-linux-gnueabi-ld: ifs.c:(.text.unlikely+0x14c): undefined reference to `__aeabi_uldivmod' arm-linux-gnueabi-ld: ifs.c:(.text.unlikely+0x190): undefined reference to `__aeabi_uldivmod' This occurs because 32-bit ARM lacks native 64-bit division support and relies on compiler-provided helper functions that may not be available in all kernel build configurations. Fix this by restrict its compilation to x86 and ARM64 architectures where the feature is both usable and safe. Fixes: bc298fe613e2 ("interference: Add distribution histogram support") Signed-off-by: Tengda Wu <wutengda2@huawei.com> --- init/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/init/Kconfig b/init/Kconfig index 6c1f5079467f..704f63fe9e2d 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1390,6 +1390,7 @@ config CGROUP_FILES config CGROUP_IFS bool "Cgroup-based Interference Statistics" default n + depends on X86 || ARM64 select KERNFS select IRQ_TIME_ACCOUTING help -- 2.34.1
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ICAOAT ----------------------------------------- The last_waking member of struct sched_info is only available when CONFIG_SCHED_INFO is enabled. Building the IFS subsystem without this configuration results in the compilation error: 'struct sched_info' has no member named 'last_waking' To resolve this, make CONFIG_CGROUP_IFS depend on CONFIG_SCHED_INFO in the Kconfig definition, ensuring both features are enabled together. Fixes: 4c89400decb9 ("interference: Add sched wakelat interference track support") Signed-off-by: Tengda Wu <wutengda2@huawei.com> --- init/Kconfig | 1 + kernel/sched/stats.h | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/init/Kconfig b/init/Kconfig index 704f63fe9e2d..f94a0c0894fc 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1392,6 +1392,7 @@ config CGROUP_IFS default n depends on X86 || ARM64 select KERNFS + select SCHED_INFO select IRQ_TIME_ACCOUTING help This option will provide online low-overhead interference diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h index 1b736b931247..ca031b9af93a 100644 --- a/kernel/sched/stats.h +++ b/kernel/sched/stats.h @@ -136,7 +136,7 @@ __schedstats_from_se(struct sched_entity *se) #define QOS_THROTTLED 2 #endif -#ifdef CONFIG_CGROUP_IFS +#if defined(CONFIG_CGROUP_IFS) && defined(CONFIG_SCHED_INFO) static inline void ifs_account_rundelay(struct task_struct *task, u64 delta) { /* -- 2.34.1
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/ICAOAT ----------------------------------------- Currently, if TSC is not available, IFS will be directly disabled, which prevents enabling IFS even in environments like qemu where higher overhead is tolerable. To address this, add a fallback logic to cgroup_ifs_time_counter. Even if TSC is unavailable, it can now fall back to sched_clock. Signed-off-by: Tengda Wu <wutengda2@huawei.com> --- include/linux/cgroup.h | 25 +++++++++++++++++-------- kernel/cgroup/ifs.c | 30 +++++++++++++++--------------- 2 files changed, 32 insertions(+), 23 deletions(-) diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h index 794cf8953b7f..05a9cef2edac 100644 --- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -920,6 +920,12 @@ static inline bool cgroup_ifs_enabled(void) return static_branch_unlikely(&cgrp_ifs_enabled); } +DECLARE_STATIC_KEY_TRUE(cgrp_ifs_tsc_available); +static inline bool cgroup_ifs_tsc_available(void) +{ + return static_branch_likely(&cgrp_ifs_tsc_available); +} + static inline struct cgroup_ifs *cgroup_ifs(struct cgroup *cgrp) { return cgroup_ino(cgrp) == 1 ? &cgroup_root_ifs : cgrp->ifs; @@ -955,19 +961,22 @@ static inline void cgroup_ifs_account_delta(struct cgroup_ifs_cpu *ifsc, static inline u64 cgroup_ifs_time_counter(void) { + if (cgroup_ifs_tsc_available()) { #if defined(__aarch64__) - u64 counter; + u64 counter; - asm volatile("mrs %0, cntvct_el0" : "=r" (counter) :: "memory"); - return counter; + asm volatile("mrs %0, cntvct_el0" : "=r" (counter) :: "memory"); + return counter; #elif defined(__x86_64__) - unsigned int lo, hi; + unsigned int lo, hi; - asm volatile("rdtsc" : "=a"(lo), "=d"(hi) :: "memory"); - return ((u64)hi << 32) | lo; -#else - return sched_clock(); + asm volatile("rdtsc" : "=a"(lo), "=d"(hi) :: "memory"); + return ((u64)hi << 32) | lo; #endif + } + + /* fallback */ + return sched_clock(); } static inline void cgroup_ifs_enter_lock(u64 *clock) diff --git a/kernel/cgroup/ifs.c b/kernel/cgroup/ifs.c index 8106e91f1e92..4ffd3b5c9db0 100644 --- a/kernel/cgroup/ifs.c +++ b/kernel/cgroup/ifs.c @@ -49,6 +49,7 @@ struct cgroup_ifs cgroup_root_ifs = { }; DEFINE_STATIC_KEY_FALSE(cgrp_ifs_enabled); +DEFINE_STATIC_KEY_TRUE(cgrp_ifs_tsc_available); #ifdef CONFIG_CGROUP_IFS_DEFAULT_ENABLED static bool ifs_enable = true; @@ -315,7 +316,8 @@ static void tdesc_init(struct ifs_tdesc *desc, u64 freq) static void cgroup_ifs_tdesc_init(void) { tdesc_init(&ifs_tdesc[IFS_TIMER_CLK], NSEC_PER_SEC); - tdesc_init(&ifs_tdesc[IFS_TIMER_TSC], this_cpu_read(ifs_tsc_freq)); + if (cgroup_ifs_tsc_available()) + tdesc_init(&ifs_tdesc[IFS_TIMER_TSC], this_cpu_read(ifs_tsc_freq)); } static u64 tsc_cycles_to_nsec(u64 tsc_cycles) @@ -327,7 +329,7 @@ static u64 tsc_cycles_to_nsec(u64 tsc_cycles) #endif } -static int cgroup_ifs_tsc_init(void) +static void cgroup_ifs_tsc_init(void) { u64 freq = 0; int cpu; @@ -339,15 +341,13 @@ static int cgroup_ifs_tsc_init(void) freq = tsc_khz * 1000; #endif if (!freq) { - pr_warn("Disable CGROUP IFS: no constant tsc\n"); - ifs_enable = false; - return -1; + pr_warn("IFS: no constant tsc, use default clocksource as time source\n"); + static_branch_disable(&cgrp_ifs_tsc_available); + return; } for_each_possible_cpu(cpu) per_cpu(ifs_tsc_freq, cpu) = freq; - - return 0; } static bool should_print(int type) @@ -363,6 +363,11 @@ static bool should_print(int type) return true; } +static bool use_tsc(enum ifs_types t) +{ + return cgroup_ifs_tsc_available() && (t == IFS_SPINLOCK || t == IFS_MUTEX); +} + static int print_sum_time(struct cgroup_ifs *ifs, struct seq_file *seq) { u64 time[NR_IFS_TYPES] = { 0 }; @@ -382,7 +387,7 @@ static int print_sum_time(struct cgroup_ifs *ifs, struct seq_file *seq) for (i = 0; i < NR_IFS_TYPES; i++) { if (!should_print(i)) continue; - if (i == IFS_SPINLOCK || i == IFS_MUTEX) + if (use_tsc(i)) time[i] = tsc_cycles_to_nsec(time[i]); seq_printf(seq, "%-18s%llu\n", ifs_type_name(i), time[i]); } @@ -426,7 +431,7 @@ static int print_hist_count(struct cgroup_ifs *ifs, struct seq_file *seq) if (!should_print(i)) continue; - if (i == IFS_SPINLOCK || i == IFS_MUTEX) + if (use_tsc(i)) desc = &ifs_tdesc[IFS_TIMER_TSC]; else desc = &ifs_tdesc[IFS_TIMER_CLK]; @@ -533,9 +538,7 @@ void cgroup_ifs_init(void) if (!ifs_enable) return; - if (cgroup_ifs_tsc_init() < 0) - return; - + cgroup_ifs_tsc_init(); BUG_ON(cgroup_init_cftypes(NULL, cgroup_ifs_files)); cgroup_ifs_tdesc_init(); } @@ -581,9 +584,6 @@ static __init int cgroup_ifs_enable(void) if (!ifs_enable) return 0; - if (!this_cpu_read(ifs_tsc_freq)) - return 0; - static_branch_enable(&cgrp_ifs_enabled); return 0; } -- 2.34.1
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ICAOAT ----------------------------------------- The 'interference.stat' file was missing from the cgroup v1 hierarchy when booting with cgroup v1 enabled. This occurred because both cgroup_ifs_enable and cgroup_v1_ifs_init were registered using late_initcall_sync. The cgroup_v1_ifs_init function depends on the 'cgrp_ifs_enabled' flag being set by cgroup_ifs_enable. Without an explicit priority distinction between the two late_initcall_sync registrations, cgroup_v1_ifs_init executed before cgroup_ifs_enable during boot, resulting in 'interference.stat' not being added to the v1 directories. Fix this by changing the registration of cgroup_ifs_enable to device_initcall. This ensures cgroup_ifs_enable (Level 6) runs strictly before cgroup_v1_ifs_init (Level 7 / late_initcall_sync), correctly setting 'cgrp_ifs_enabled' before the v1 interfaces are initialized. Fixes: c27ddc47dc1b ("ifs: Defer cgroup ifs enable") Signed-off-by: Tengda Wu <wutengda2@huawei.com> --- kernel/cgroup/ifs.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/kernel/cgroup/ifs.c b/kernel/cgroup/ifs.c index 4ffd3b5c9db0..9d9405e94b6a 100644 --- a/kernel/cgroup/ifs.c +++ b/kernel/cgroup/ifs.c @@ -587,4 +587,14 @@ static __init int cgroup_ifs_enable(void) static_branch_enable(&cgrp_ifs_enabled); return 0; } -late_initcall_sync(cgroup_ifs_enable); + +/* + * Execution Timing Constraints: + * 1. Must be late enough (e.g., after SUBSYS_INITCALL) to avoid the + * intermediate state of the core cgroup subsystem initialization, ensuring + * all internal structures are stable. + * 2. Must execute strictly before cgroup_v1_ifs_init(), which runs at + * LATE_INITCALL_SYNC, as cgroup_v1_ifs_init() relies on 'cgrp_ifs_enabled' + * being set before its execution begins. + */ +device_initcall(cgroup_ifs_enable); -- 2.34.1
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/ICAOAT ----------------------------------------- Add a sanity check to ensure IFS subsystem and its cgroup share the same cgroup version (either both v1 or both v2). This prevents operations that mix incompatible cgroup hierarchies, which could lead to incorrect accounting for interference. Fixes: 28d4e4995d76 ("interference: Add cgroup v1 support for CGROUP_IFS") Signed-off-by: Tengda Wu <wutengda2@huawei.com> --- kernel/cgroup/ifs.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/kernel/cgroup/ifs.c b/kernel/cgroup/ifs.c index 9d9405e94b6a..d7a3e4214143 100644 --- a/kernel/cgroup/ifs.c +++ b/kernel/cgroup/ifs.c @@ -477,6 +477,16 @@ static int cgroup_ifs_show(struct seq_file *seq, void *v) return -EINVAL; } +#ifdef CONFIG_CGROUP_CPUACCT + if ((!cgroup_subsys_on_dfl(cpuacct_cgrp_subsys) && cgroup_on_dfl(cgrp)) || + (cgroup_subsys_on_dfl(cpuacct_cgrp_subsys) && !cgroup_on_dfl(cgrp))) { + pr_info("cgroup version mismatch: subsystem %s, cgroup %s\n", + cgroup_subsys_on_dfl(cpuacct_cgrp_subsys) ? "v2" : "v1", + cgroup_on_dfl(cgrp) ? "v2" : "v1"); + return -EOPNOTSUPP; + } +#endif + ret = print_sum_time(ifs, seq); if (ret) return ret; -- 2.34.1
反馈: 您发送到kernel@openeuler.org的补丁/补丁集,已成功转换为PR! PR链接地址: https://gitee.com/openeuler/kernel/pulls/19636 邮件列表地址:https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/OIX... FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://gitee.com/openeuler/kernel/pulls/19636 Mailing list address: https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/OIX...
participants (2)
-
patchwork bot -
Tengda Wu