Dong Kai (3):
  corelockup: Add support of cpu core hang check
  corelockup: Disable wfi/wfe mode for pmu based nmi
  corelockup: Add detector enable support by cmdline

Li Huafei (1):
  watchdog/corelockup: Support corelockup on X86_64

Xu Qiang (4):
  watchdog/corelockup: Optimized core lockup detection judgment rules
  watchdog/corelockup: Add interface to control the detection sensitivity.
  watchdog/corelockup: Depends on the hardlockup detection switch
  config: Open CONFIG_CORELOCKUP_DETECTOR

 arch/arm64/configs/openeuler_defconfig |   1 +
 include/linux/nmi.h                    |  10 ++
 kernel/watchdog.c                      |  45 +++++-
 kernel/watchdog_hld.c                  | 198 +++++++++++++++++++++++++
 lib/Kconfig.debug                      |   9 ++
 5 files changed, 261 insertions(+), 2 deletions(-)
FeedBack: The patch(es) you sent to the kernel@openeuler.org mailing list have been successfully converted to a pull request!
Pull request link: https://gitee.com/openeuler/kernel/pulls/1922
Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/2...
From: Dong Kai <dongkai11@huawei.com>

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I7WQWT
CVE: NA
--------------------------------
The softlockup and hardlockup detectors only check the status of the CPU they run on. If a CPU core suspends, neither of them works: no valid log is produced even though the CPU is already misbehaving, which causes a lot of follow-on problems in the system. To detect this case, we add the corelockup detector.

First, whether a CPU core can still respond to an NMI is used as the criterion for deciding whether it is suspended. The rest is then simple: each CPU core maintains its own NMI interrupt count and watches the nmi_counts of the next CPU core. If that count stops changing, the watched core can no longer respond to NMIs normally, and we regard it as suspended.

To ensure robustness, the warning is only triggered after more than two consecutive NMIs are lost.
The detection chain is as follows: cpu0->cpu1->...->cpuN->cpu0
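For illustration, here is a minimal standalone C sketch of the miss-count rule above. The plain struct and function names are illustrative stand-ins for the patch's per-cpu variables, not the patch code itself:

#include <stdbool.h>

#define MISS_THRESH 2	/* "more than two" consecutive lost NMIs */

struct watch_state {
	unsigned long nmi_cnt_saved;	/* last count seen on the watched cpu */
	unsigned long nmi_cnt_missed;	/* consecutive stale observations */
};

/* Called from the watcher's NMI; nmi_int is the watched cpu's counter. */
bool is_corelockup(struct watch_state *s, unsigned long nmi_int)
{
	if (s->nmi_cnt_saved != nmi_int) {
		/* counter moved: the watched core handled an NMI */
		s->nmi_cnt_saved = nmi_int;
		s->nmi_cnt_missed = 0;
		return false;
	}
	/* stale again: warn only past the consecutive-miss threshold */
	return ++s->nmi_cnt_missed > MISS_THRESH;
}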
Signed-off-by: Dong Kai <dongkai11@huawei.com>
Reviewed-by: Kuohai Xu <xukuohai@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Conflicts:
	kernel/watchdog.c
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
---
 include/linux/nmi.h   |   6 ++
 kernel/watchdog.c     |  15 +++-
 kernel/watchdog_hld.c | 165 ++++++++++++++++++++++++++++++++++++++++++
 lib/Kconfig.debug     |   8 ++
 4 files changed, 192 insertions(+), 2 deletions(-)
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 468da521f1c5..eec3e1b18a6b 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -125,6 +125,12 @@ static inline int hardlockup_detector_perf_init(void) { return 0; }
 # endif
 #endif
 
+#ifdef CONFIG_CORELOCKUP_DETECTOR
+extern void corelockup_detector_init(void);
+extern void corelockup_detector_online_cpu(unsigned int cpu);
+extern void corelockup_detector_offline_cpu(unsigned int cpu);
+#endif
+
 void watchdog_nmi_stop(void);
 void watchdog_nmi_start(void);
 int watchdog_nmi_probe(void);
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 3c2f54595c32..f2fd70767f5e 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -527,15 +527,23 @@ static void softlockup_start_all(void)
 
 int lockup_detector_online_cpu(unsigned int cpu)
 {
-	if (cpumask_test_cpu(cpu, &watchdog_allowed_mask))
+	if (cpumask_test_cpu(cpu, &watchdog_allowed_mask)) {
 		watchdog_enable(cpu);
+#ifdef CONFIG_CORELOCKUP_DETECTOR
+		corelockup_detector_online_cpu(cpu);
+#endif
+	}
 	return 0;
 }
 
 int lockup_detector_offline_cpu(unsigned int cpu)
 {
-	if (cpumask_test_cpu(cpu, &watchdog_allowed_mask))
+	if (cpumask_test_cpu(cpu, &watchdog_allowed_mask)) {
 		watchdog_disable(cpu);
+#ifdef CONFIG_CORELOCKUP_DETECTOR
+		corelockup_detector_offline_cpu(cpu);
+#endif
+	}
 	return 0;
 }
 
@@ -867,4 +875,7 @@ void __init lockup_detector_init(void)
 		nmi_watchdog_available = true;
 	lockup_detector_setup();
 	watchdog_sysctl_init();
+#ifdef CONFIG_CORELOCKUP_DETECTOR
+	corelockup_detector_init();
+#endif
 }
diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index a3f35067b4d0..ecca12e6599d 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -41,6 +41,163 @@ notrace void __weak arch_touch_nmi_watchdog(void)
 EXPORT_SYMBOL(arch_touch_nmi_watchdog);
 #endif
 
+#ifdef CONFIG_CORELOCKUP_DETECTOR
+/*
+ * The softlockup and hardlockup detector only check the status
+ * of the cpu which it resides. If certain cpu core suspends,
+ * they are both not works. There is no any valid log but the
+ * cpu already abnormal and brings a lot of problems of system.
+ * To detect this case, we add the corelockup detector.
+ *
+ * First we use whether cpu core can responds to nmi as a sectence
+ * to determine if it is suspended. Then things is simple. Per cpu
+ * core maintains it's nmi interrupt counts and detector the
+ * nmi_counts of next cpu core. If the nmi interrupt counts not
+ * changed any more which means it can't respond nmi normally, we
+ * regard it as suspend.
+ *
+ * To ensure robustness, only consecutive lost nmi more than two
+ * times then trigger the warn.
+ *
+ * The detection chain is as following:
+ * cpu0->cpu1->...->cpuN->cpu0
+ *
+ * detector_cpu: the target cpu to detector of current cpu
+ * nmi_interrupts: the nmi counts of current cpu
+ * nmi_cnt_saved: saved nmi counts of detector_cpu
+ * nmi_cnt_missed: the nmi consecutive miss counts of detector_cpu
+ */
+static DEFINE_PER_CPU(unsigned int, detector_cpu);
+static DEFINE_PER_CPU(unsigned long, nmi_interrupts);
+static DEFINE_PER_CPU(unsigned long, nmi_cnt_saved);
+static DEFINE_PER_CPU(unsigned long, nmi_cnt_missed);
+static DEFINE_PER_CPU(bool, core_watchdog_warn);
+
+static void watchdog_nmi_interrupts(void)
+{
+	__this_cpu_inc(nmi_interrupts);
+}
+
+static void corelockup_status_copy(unsigned int from, unsigned int to)
+{
+	per_cpu(nmi_cnt_saved, to) = per_cpu(nmi_cnt_saved, from);
+	per_cpu(nmi_cnt_missed, to) = per_cpu(nmi_cnt_missed, from);
+
+	/* always update detector cpu at the end */
+	per_cpu(detector_cpu, to) = per_cpu(detector_cpu, from);
+}
+
+static void corelockup_status_init(unsigned int cpu, unsigned int target)
+{
+	/*
+	 * initialize saved count to max to avoid unnecessary misjudge
+	 * caused by delay running of nmi on target cpu
+	 */
+	per_cpu(nmi_cnt_saved, cpu) = ULONG_MAX;
+	per_cpu(nmi_cnt_missed, cpu) = 0;
+
+	/* always update detector cpu at the end */
+	per_cpu(detector_cpu, cpu) = target;
+}
+
+void __init corelockup_detector_init(void)
+{
+	unsigned int cpu, next;
+
+	/* detector cpu is set to the next valid logically one */
+	for_each_cpu_and(cpu, &watchdog_cpumask, cpu_online_mask) {
+		next = cpumask_next_and(cpu, &watchdog_cpumask,
+					cpu_online_mask);
+		if (next >= nr_cpu_ids)
+			next = cpumask_first_and(&watchdog_cpumask,
+						 cpu_online_mask);
+		corelockup_status_init(cpu, next);
+	}
+}
+
+/*
+ * Before: first->next
+ * After: first->[new]->next
+ */
+void corelockup_detector_online_cpu(unsigned int cpu)
+{
+	unsigned int first = cpumask_first_and(&watchdog_cpumask,
+					       cpu_online_mask);
+
+	if (WARN_ON(first >= nr_cpu_ids))
+		return;
+
+	/* cpu->next */
+	corelockup_status_copy(first, cpu);
+
+	/* first->cpu */
+	corelockup_status_init(first, cpu);
+}
+
+/*
+ * Before: prev->cpu->next
+ * After: prev->next
+ */
+void corelockup_detector_offline_cpu(unsigned int cpu)
+{
+	unsigned int prev = nr_cpu_ids;
+	unsigned int i;
+
+	/* found prev cpu */
+	for_each_cpu_and(i, &watchdog_cpumask, cpu_online_mask) {
+		if (per_cpu(detector_cpu, i) == cpu) {
+			prev = i;
+			break;
+		}
+	}
+
+	if (WARN_ON(prev == nr_cpu_ids))
+		return;
+
+	/* prev->next */
+	corelockup_status_copy(cpu, prev);
+}
+
+static bool is_corelockup(unsigned int cpu)
+{
+	unsigned long nmi_int = per_cpu(nmi_interrupts, cpu);
+
+	/* skip check if only one cpu online */
+	if (cpu == smp_processor_id())
+		return false;
+
+	if (__this_cpu_read(nmi_cnt_saved) != nmi_int) {
+		__this_cpu_write(nmi_cnt_saved, nmi_int);
+		__this_cpu_write(nmi_cnt_missed, 0);
+		per_cpu(core_watchdog_warn, cpu) = false;
+		return false;
+	}
+
+	__this_cpu_inc(nmi_cnt_missed);
+	if (__this_cpu_read(nmi_cnt_missed) > 2)
+		return true;
+
+	return false;
+}
+NOKPROBE_SYMBOL(is_corelockup);
+
+static void watchdog_corelockup_check(struct pt_regs *regs)
+{
+	unsigned int cpu = __this_cpu_read(detector_cpu);
+
+	if (is_corelockup(cpu)) {
+		if (per_cpu(core_watchdog_warn, cpu) == true)
+			return;
+		pr_emerg("Watchdog detected core LOCKUP on cpu %d\n", cpu);
+
+		if (hardlockup_panic)
+			nmi_panic(regs, "Core LOCKUP");
+
+		per_cpu(core_watchdog_warn, cpu) = true;
+	}
+}
+#endif
+
 #ifdef CONFIG_HARDLOCKUP_CHECK_TIMESTAMP
 static DEFINE_PER_CPU(ktime_t, last_timestamp);
 static DEFINE_PER_CPU(unsigned int, nmi_rearmed);
@@ -108,6 +265,14 @@ static inline bool watchdog_check_timestamp(void)
 
 void watchdog_hardlockup_check(struct pt_regs *regs)
 {
+#ifdef CONFIG_CORELOCKUP_DETECTOR
+	/* Kick nmi interrupts */
+	watchdog_nmi_interrupts();
+
+	/* corelockup check */
+	watchdog_corelockup_check(regs);
+#endif
+
 	if (__this_cpu_read(watchdog_nmi_touch) == true) {
 		__this_cpu_write(watchdog_nmi_touch, false);
 		return;
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 3165775ffcf3..9cbb4ed7f17c 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1071,6 +1071,14 @@ config HARDLOCKUP_DETECTOR
	  chance to run.  The current stack trace is displayed upon detection
	  and the system will stay locked up.
 
+config CORELOCKUP_DETECTOR
+	bool "Detect Core Lockups"
+	depends on HARDLOCKUP_DETECTOR && SOFTLOCKUP_DETECTOR
+	depends on ARM64
+	default n
+	help
+	  Corelockups is used to check whether cpu core hungup or not.
+
 config BOOTPARAM_HARDLOCKUP_PANIC
	bool "Panic (Reboot) On Hard Lockups"
	depends on HARDLOCKUP_DETECTOR
From: Dong Kai <dongkai11@huawei.com>

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I7WQWT
CVE: NA
--------------------------------
When using PMU events as the NMI source, the PMU clock is disabled in wfi/wfe mode and the NMI cannot fire periodically. To minimize misjudgments caused by wfi/wfe, we adopt a simple method: disable wfi/wfe at the right time, with the watchdog hrtimer serving as the baseline.

The watchdog hrtimer is driven by the generic timer and runs at a higher frequency than the NMI. If the watchdog hrtimer stops working, we disable wfi/wfe mode; the PMU NMI should then always respond as long as the CPU core is not suspended.
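As a rough worked example (all numbers are illustrative and depend on configuration): with the default watchdog_thresh of 10 seconds, the watchdog hrtimer fires about every 4 seconds on each CPU, while the perf NMI fires only about once every 10 seconds. A core whose hrtimer count keeps advancing cannot be stuck in wfi/wfe, so a stalled hrtimer count is an earlier and cheaper signal than a stalled NMI count, and it is what gates the wfi/wfe disable here.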
Signed-off-by: Dong Kai <dongkai11@huawei.com>
Reviewed-by: Kuohai Xu <xukuohai@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Conflicts:
	arch/arm64/include/asm/barrier.h
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
---
 arch/arm64/include/asm/barrier.h | 19 +++++++++-
 include/linux/nmi.h              |  2 +
 kernel/watchdog.c                | 12 ++++++
 kernel/watchdog_hld.c            | 63 ++++++++++++++++++++++++++++++++
 4 files changed, 94 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index cf2987464c18..a70cc61ee432 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -17,12 +17,27 @@
 #define nops(n)		asm volatile(__nops(n))
 
 #define sev()		asm volatile("sev" : : : "memory")
-#define wfe()		asm volatile("wfe" : : : "memory")
 #define wfet(val)	asm volatile("msr s0_3_c1_c0_0, %0"	\
			     : : "r" (val) : "memory")
-#define wfi()		asm volatile("wfi" : : : "memory")
 #define wfit(val)	asm volatile("msr s0_3_c1_c0_1, %0"	\
			     : : "r" (val) : "memory")
+#ifdef CONFIG_CORELOCKUP_DETECTOR
+extern unsigned int close_wfi_wfe;
+#define wfe()						\
+	do {						\
+		if (likely(close_wfi_wfe == 0))		\
+			asm volatile("wfe" : : : "memory");	\
+	} while (0)
+#define wfi()						\
+	do {						\
+		if (likely(close_wfi_wfe == 0))		\
+			asm volatile("wfi" : : : "memory");	\
+	} while (0)
+
+#else
+#define wfe()		asm volatile("wfe" : : : "memory")
+#define wfi()		asm volatile("wfi" : : : "memory")
+#endif
 
 #define isb()		asm volatile("isb" : : : "memory")
 #define dmb(opt)	asm volatile("dmb " #opt : : : "memory")
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index eec3e1b18a6b..e750b9ad7a41 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -129,6 +129,8 @@ static inline int hardlockup_detector_perf_init(void) { return 0; }
 extern void corelockup_detector_init(void);
 extern void corelockup_detector_online_cpu(unsigned int cpu);
 extern void corelockup_detector_offline_cpu(unsigned int cpu);
+extern void watchdog_check_hrtimer(void);
+extern unsigned long watchdog_hrtimer_interrupts(unsigned int cpu);
 #endif
 
 void watchdog_nmi_stop(void);
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index f2fd70767f5e..d8ae6a58c6ec 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -352,6 +352,13 @@ static int softlockup_fn(void *data)
	return 0;
 }
 
+#ifdef CONFIG_CORELOCKUP_DETECTOR
+unsigned long watchdog_hrtimer_interrupts(unsigned int cpu)
+{
+	return per_cpu(hrtimer_interrupts, cpu);
+}
+#endif
+
 /* watchdog kicker functions */
 static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
 {
@@ -363,6 +370,11 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
	if (!watchdog_enabled)
		return HRTIMER_NORESTART;
 
+#ifdef CONFIG_CORELOCKUP_DETECTOR
+	/* check hrtimer of detector cpu */
+	watchdog_check_hrtimer();
+#endif
+
	/* kick the hardlockup detector */
	watchdog_interrupt_count();
 
diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index ecca12e6599d..840558756bdd 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -62,16 +62,37 @@ EXPORT_SYMBOL(arch_touch_nmi_watchdog);
  * The detection chain is as following:
  * cpu0->cpu1->...->cpuN->cpu0
  *
+ * When using pmu events as nmi source, the pmu clock is disabled
+ * under wfi/wfe mode. And the nmi can't respond periodically.
+ * To minimize the misjudgment by wfi/wfe, we adopt a simple method
+ * which to disable wfi/wfe at the right time and the watchdog hrtimer
+ * is a good baseline.
+ *
+ * The watchdog hrtimer is based on generate timer and has high freq
+ * than nmi. If watchdog hrtimer not works we disable wfi/wfe mode
+ * then the pmu nmi should always responds as long as the cpu core
+ * not suspend.
+ *
  * detector_cpu: the target cpu to detector of current cpu
  * nmi_interrupts: the nmi counts of current cpu
  * nmi_cnt_saved: saved nmi counts of detector_cpu
  * nmi_cnt_missed: the nmi consecutive miss counts of detector_cpu
+ * hrint_saved: saved hrtimer interrupts of detector_cpu
+ * hrint_missed: the hrtimer consecutive miss counts of detector_cpu
+ * corelockup_cpumask/close_wfi_wfe:
+ *	the cpu mask is set if certain cpu maybe fall in suspend and close
+ *	wfi/wfe mode if any bit is set
  */
 static DEFINE_PER_CPU(unsigned int, detector_cpu);
 static DEFINE_PER_CPU(unsigned long, nmi_interrupts);
 static DEFINE_PER_CPU(unsigned long, nmi_cnt_saved);
 static DEFINE_PER_CPU(unsigned long, nmi_cnt_missed);
 static DEFINE_PER_CPU(bool, core_watchdog_warn);
+static DEFINE_PER_CPU(unsigned long, hrint_saved);
+static DEFINE_PER_CPU(unsigned long, hrint_missed);
+struct cpumask corelockup_cpumask __read_mostly;
+unsigned int close_wfi_wfe;
+static bool pmu_based_nmi;
 
 static void watchdog_nmi_interrupts(void)
 {
@@ -82,6 +103,8 @@ static void corelockup_status_copy(unsigned int from, unsigned int to)
 {
	per_cpu(nmi_cnt_saved, to) = per_cpu(nmi_cnt_saved, from);
	per_cpu(nmi_cnt_missed, to) = per_cpu(nmi_cnt_missed, from);
+	per_cpu(hrint_saved, to) = per_cpu(hrint_saved, from);
+	per_cpu(hrint_missed, to) = per_cpu(hrint_missed, from);
 
	/* always update detector cpu at the end */
	per_cpu(detector_cpu, to) = per_cpu(detector_cpu, from);
@@ -95,6 +118,8 @@ static void corelockup_status_init(unsigned int cpu, unsigned int target)
	 */
	per_cpu(nmi_cnt_saved, cpu) = ULONG_MAX;
	per_cpu(nmi_cnt_missed, cpu) = 0;
+	per_cpu(hrint_saved, cpu) = ULONG_MAX;
+	per_cpu(hrint_missed, cpu) = 0;
 
	/* always update detector cpu at the end */
	per_cpu(detector_cpu, cpu) = target;
@@ -115,6 +140,38 @@ void __init corelockup_detector_init(void)
	}
 }
 
+void watchdog_check_hrtimer(void)
+{
+	unsigned int cpu = __this_cpu_read(detector_cpu);
+	unsigned long hrint = watchdog_hrtimer_interrupts(cpu);
+
+	/*
+	 * The freq of hrtimer is fast than nmi interrupts and
+	 * the core mustn't hangs if hrtimer still working.
+	 * So update the nmi interrupts in hrtimer either to
+	 * improved robustness of nmi counts check.
+	 */
+	watchdog_nmi_interrupts();
+
+	if (!pmu_based_nmi)
+		return;
+
+	if (__this_cpu_read(hrint_saved) != hrint) {
+		__this_cpu_write(hrint_saved, hrint);
+		__this_cpu_write(hrint_missed, 0);
+		cpumask_clear_cpu(cpu, &corelockup_cpumask);
+	} else {
+		__this_cpu_inc(hrint_missed);
+		if (__this_cpu_read(hrint_missed) > 2)
+			cpumask_set_cpu(cpu, &corelockup_cpumask);
+	}
+
+	if (likely(cpumask_empty(&corelockup_cpumask)))
+		close_wfi_wfe = 0;
+	else
+		close_wfi_wfe = 1;
+}
+
 /*
  * Before: first->next
  * After: first->[new]->next
@@ -143,6 +200,9 @@ void corelockup_detector_offline_cpu(unsigned int cpu)
	unsigned int prev = nr_cpu_ids;
	unsigned int i;
 
+	/* clear bitmap */
+	cpumask_clear_cpu(cpu, &corelockup_cpumask);
+
	/* found prev cpu */
	for_each_cpu_and(i, &watchdog_cpumask, cpu_online_mask) {
		if (per_cpu(detector_cpu, i) == cpu) {
@@ -477,6 +537,9 @@ int __init hardlockup_detector_perf_init(void)
		perf_event_release_kernel(this_cpu_read(watchdog_ev));
		this_cpu_write(watchdog_ev, NULL);
	}
+#ifdef CONFIG_CORELOCKUP_DETECTOR
+	pmu_based_nmi = true;
+#endif
	return ret;
 }
 #endif /* CONFIG_HARDLOCKUP_DETECTOR_PERF */
From: Dong Kai <dongkai11@huawei.com>

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I7WQWT
CVE: NA
--------------------------------
Add the cmdline parameter "enable_corelockup_detector" to support enabling the core suspend detector. It is enabled by default within the Ascend feature set.
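For example, enabling it at boot only requires appending the bare token to the kernel command line (the surrounding arguments here are illustrative):

	linux /boot/Image root=/dev/vda1 enable_corelockup_detector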
Signed-off-by: Dong Kai <dongkai11@huawei.com>
Reviewed-by: Kuohai Xu <xukuohai@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Conflicts:
	kernel/watchdog.c
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
---
 include/linux/nmi.h   |  1 +
 kernel/watchdog.c     | 12 ++++++++----
 kernel/watchdog_hld.c | 18 ++++++++++++++----
 3 files changed, 23 insertions(+), 8 deletions(-)
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index e750b9ad7a41..b3229c437264 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -131,6 +131,7 @@ extern void corelockup_detector_online_cpu(unsigned int cpu);
 extern void corelockup_detector_offline_cpu(unsigned int cpu);
 extern void watchdog_check_hrtimer(void);
 extern unsigned long watchdog_hrtimer_interrupts(unsigned int cpu);
+extern bool enable_corelockup_detector;
 #endif
 
 void watchdog_nmi_stop(void);
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index d8ae6a58c6ec..4ef8c343c442 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -372,7 +372,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
 
 #ifdef CONFIG_CORELOCKUP_DETECTOR
	/* check hrtimer of detector cpu */
-	watchdog_check_hrtimer();
+	if (enable_corelockup_detector)
+		watchdog_check_hrtimer();
 #endif
 
	/* kick the hardlockup detector */
@@ -542,7 +543,8 @@ int lockup_detector_online_cpu(unsigned int cpu)
	if (cpumask_test_cpu(cpu, &watchdog_allowed_mask)) {
		watchdog_enable(cpu);
 #ifdef CONFIG_CORELOCKUP_DETECTOR
-		corelockup_detector_online_cpu(cpu);
+		if (enable_corelockup_detector)
+			corelockup_detector_online_cpu(cpu);
 #endif
	}
	return 0;
@@ -553,7 +555,8 @@ int lockup_detector_offline_cpu(unsigned int cpu)
	if (cpumask_test_cpu(cpu, &watchdog_allowed_mask)) {
		watchdog_disable(cpu);
 #ifdef CONFIG_CORELOCKUP_DETECTOR
-		corelockup_detector_offline_cpu(cpu);
+		if (enable_corelockup_detector)
+			corelockup_detector_offline_cpu(cpu);
 #endif
	}
	return 0;
@@ -888,6 +891,7 @@ void __init lockup_detector_init(void)
	lockup_detector_setup();
	watchdog_sysctl_init();
 #ifdef CONFIG_CORELOCKUP_DETECTOR
-	corelockup_detector_init();
+	if (enable_corelockup_detector)
+		corelockup_detector_init();
 #endif
 }
diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index 840558756bdd..fe5bc46f73d0 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -93,6 +93,14 @@ static DEFINE_PER_CPU(unsigned long, hrint_missed);
 struct cpumask corelockup_cpumask __read_mostly;
 unsigned int close_wfi_wfe;
 static bool pmu_based_nmi;
+bool enable_corelockup_detector;
+
+static int __init enable_corelockup_detector_setup(char *str)
+{
+	enable_corelockup_detector = true;
+	return 1;
+}
+__setup("enable_corelockup_detector", enable_corelockup_detector_setup);
 
 static void watchdog_nmi_interrupts(void)
 {
@@ -326,11 +334,13 @@ static inline bool watchdog_check_timestamp(void)
 void watchdog_hardlockup_check(struct pt_regs *regs)
 {
 #ifdef CONFIG_CORELOCKUP_DETECTOR
-	/* Kick nmi interrupts */
-	watchdog_nmi_interrupts();
+	if (enable_corelockup_detector) {
+		/* Kick nmi interrupts */
+		watchdog_nmi_interrupts();
 
-	/* corelockup check */
-	watchdog_corelockup_check(regs);
+		/* corelockup check */
+		watchdog_corelockup_check(regs);
+	}
 #endif
 
	if (__this_cpu_read(watchdog_nmi_touch) == true) {
From: Xu Qiang <xuqiang36@huawei.com>

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I7WQWT
CVE: NA
--------------------------------
Optimize the core lockup detection judgment rules to make them easier to understand.

Core suspension detection is now performed in the hrtimer interrupt handler. The detection condition is that neither the hrtimer interrupt count nor the NMI interrupt count of the watched CPU has been updated for several consecutive checks.
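As a worked example under assumed defaults (watchdog_thresh=10, so the watchdog hrtimer fires about every 4 seconds): each hrtimer tick on the detecting CPU samples both counters of the watched CPU, and a lockup is reported only once both counters have stayed unchanged for more than five consecutive samples, i.e. roughly 24 seconds after the core stops taking interrupts. The numbers are illustrative and depend on configuration.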
Signed-off-by: Xu Qiang <xuqiang36@huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Conflicts:
	arch/arm64/include/asm/barrier.h
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
---
 arch/arm64/include/asm/barrier.h |  19 +-----
 kernel/watchdog_hld.c            | 104 +++++++++----------------------
 2 files changed, 31 insertions(+), 92 deletions(-)
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index a70cc61ee432..cf2987464c18 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -17,27 +17,12 @@
 #define nops(n)		asm volatile(__nops(n))
 
 #define sev()		asm volatile("sev" : : : "memory")
+#define wfe()		asm volatile("wfe" : : : "memory")
 #define wfet(val)	asm volatile("msr s0_3_c1_c0_0, %0"	\
			     : : "r" (val) : "memory")
+#define wfi()		asm volatile("wfi" : : : "memory")
 #define wfit(val)	asm volatile("msr s0_3_c1_c0_1, %0"	\
			     : : "r" (val) : "memory")
-#ifdef CONFIG_CORELOCKUP_DETECTOR
-extern unsigned int close_wfi_wfe;
-#define wfe()						\
-	do {						\
-		if (likely(close_wfi_wfe == 0))		\
-			asm volatile("wfe" : : : "memory");	\
-	} while (0)
-#define wfi()						\
-	do {						\
-		if (likely(close_wfi_wfe == 0))		\
-			asm volatile("wfi" : : : "memory");	\
-	} while (0)
-
-#else
-#define wfe()		asm volatile("wfe" : : : "memory")
-#define wfi()		asm volatile("wfi" : : : "memory")
-#endif
 
 #define isb()		asm volatile("isb" : : : "memory")
 #define dmb(opt)	asm volatile("dmb " #opt : : : "memory")
diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index fe5bc46f73d0..392837224ec1 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -64,14 +64,9 @@ EXPORT_SYMBOL(arch_touch_nmi_watchdog);
  *
  * When using pmu events as nmi source, the pmu clock is disabled
  * under wfi/wfe mode. And the nmi can't respond periodically.
- * To minimize the misjudgment by wfi/wfe, we adopt a simple method
- * which to disable wfi/wfe at the right time and the watchdog hrtimer
- * is a good baseline.
- *
- * The watchdog hrtimer is based on generate timer and has high freq
- * than nmi. If watchdog hrtimer not works we disable wfi/wfe mode
- * then the pmu nmi should always responds as long as the cpu core
- * not suspend.
+ * However, when the core is suspended, the hrtimer interrupt and
+ * NMI interrupt cannot be received. This can be used as the basis
+ * for determining whether the core is suspended.
  *
  * detector_cpu: the target cpu to detector of current cpu
  * nmi_interrupts: the nmi counts of current cpu
@@ -79,20 +74,14 @@ EXPORT_SYMBOL(arch_touch_nmi_watchdog);
  * nmi_cnt_missed: the nmi consecutive miss counts of detector_cpu
  * hrint_saved: saved hrtimer interrupts of detector_cpu
  * hrint_missed: the hrtimer consecutive miss counts of detector_cpu
- * corelockup_cpumask/close_wfi_wfe:
- *	the cpu mask is set if certain cpu maybe fall in suspend and close
- *	wfi/wfe mode if any bit is set
  */
 static DEFINE_PER_CPU(unsigned int, detector_cpu);
 static DEFINE_PER_CPU(unsigned long, nmi_interrupts);
 static DEFINE_PER_CPU(unsigned long, nmi_cnt_saved);
 static DEFINE_PER_CPU(unsigned long, nmi_cnt_missed);
-static DEFINE_PER_CPU(bool, core_watchdog_warn);
 static DEFINE_PER_CPU(unsigned long, hrint_saved);
 static DEFINE_PER_CPU(unsigned long, hrint_missed);
-struct cpumask corelockup_cpumask __read_mostly;
-unsigned int close_wfi_wfe;
-static bool pmu_based_nmi;
+static unsigned long corelockup_allcpu_dumped;
 bool enable_corelockup_detector;
 
 static int __init enable_corelockup_detector_setup(char *str)
@@ -152,6 +141,11 @@ void watchdog_check_hrtimer(void)
 {
	unsigned int cpu = __this_cpu_read(detector_cpu);
	unsigned long hrint = watchdog_hrtimer_interrupts(cpu);
+	unsigned long nmi_int = per_cpu(nmi_interrupts, cpu);
+
+	/* skip check if only one cpu online */
+	if (cpu == smp_processor_id())
+		return;
 
	/*
	 * The freq of hrtimer is fast than nmi interrupts and
	 * the core mustn't hangs if hrtimer still working.
	 * So update the nmi interrupts in hrtimer either to
	 * improved robustness of nmi counts check.
	 */
	watchdog_nmi_interrupts();
 
-	if (!pmu_based_nmi)
-		return;
-
	if (__this_cpu_read(hrint_saved) != hrint) {
		__this_cpu_write(hrint_saved, hrint);
		__this_cpu_write(hrint_missed, 0);
-		cpumask_clear_cpu(cpu, &corelockup_cpumask);
-	} else {
-		__this_cpu_inc(hrint_missed);
-		if (__this_cpu_read(hrint_missed) > 2)
-			cpumask_set_cpu(cpu, &corelockup_cpumask);
+		return;
+	}
+	__this_cpu_inc(hrint_missed);
+
+	if (__this_cpu_read(nmi_cnt_saved) != nmi_int) {
+		__this_cpu_write(nmi_cnt_saved, nmi_int);
+		__this_cpu_write(nmi_cnt_missed, 0);
+		return;
	}
+	__this_cpu_inc(nmi_cnt_missed);
 
-	if (likely(cpumask_empty(&corelockup_cpumask)))
-		close_wfi_wfe = 0;
-	else
-		close_wfi_wfe = 1;
+	if ((__this_cpu_read(hrint_missed) > 5) && (__this_cpu_read(nmi_cnt_missed) > 5)) {
+		pr_emerg("Watchdog detected core LOCKUP on cpu %d\n", cpu);
+
+		if (!test_and_set_bit(0, &corelockup_allcpu_dumped)) {
+			trigger_allbutself_cpu_backtrace();
+			panic("Core LOCKUP");
+		} else {
+			while (1)
+				cpu_relax();
+		}
+	}
 }
 
 /*
@@ -208,9 +210,6 @@ void corelockup_detector_offline_cpu(unsigned int cpu)
	unsigned int prev = nr_cpu_ids;
	unsigned int i;
 
-	/* clear bitmap */
-	cpumask_clear_cpu(cpu, &corelockup_cpumask);
-
	/* found prev cpu */
	for_each_cpu_and(i, &watchdog_cpumask, cpu_online_mask) {
		if (per_cpu(detector_cpu, i) == cpu) {
@@ -225,45 +224,6 @@ void corelockup_detector_offline_cpu(unsigned int cpu)
	/* prev->next */
	corelockup_status_copy(cpu, prev);
 }
-
-static bool is_corelockup(unsigned int cpu)
-{
-	unsigned long nmi_int = per_cpu(nmi_interrupts, cpu);
-
-	/* skip check if only one cpu online */
-	if (cpu == smp_processor_id())
-		return false;
-
-	if (__this_cpu_read(nmi_cnt_saved) != nmi_int) {
-		__this_cpu_write(nmi_cnt_saved, nmi_int);
-		__this_cpu_write(nmi_cnt_missed, 0);
-		per_cpu(core_watchdog_warn, cpu) = false;
-		return false;
-	}
-
-	__this_cpu_inc(nmi_cnt_missed);
-	if (__this_cpu_read(nmi_cnt_missed) > 2)
-		return true;
-
-	return false;
-}
-NOKPROBE_SYMBOL(is_corelockup);
-
-static void watchdog_corelockup_check(struct pt_regs *regs)
-{
-	unsigned int cpu = __this_cpu_read(detector_cpu);
-
-	if (is_corelockup(cpu)) {
-		if (per_cpu(core_watchdog_warn, cpu) == true)
-			return;
-		pr_emerg("Watchdog detected core LOCKUP on cpu %d\n", cpu);
-
-		if (hardlockup_panic)
-			nmi_panic(regs, "Core LOCKUP");
-
-		per_cpu(core_watchdog_warn, cpu) = true;
-	}
-}
 #endif
 
 #ifdef CONFIG_HARDLOCKUP_CHECK_TIMESTAMP
@@ -337,9 +297,6 @@ void watchdog_hardlockup_check(struct pt_regs *regs)
	if (enable_corelockup_detector) {
		/* Kick nmi interrupts */
		watchdog_nmi_interrupts();
-
-		/* corelockup check */
-		watchdog_corelockup_check(regs);
	}
 #endif
 
@@ -547,9 +504,6 @@ int __init hardlockup_detector_perf_init(void)
		perf_event_release_kernel(this_cpu_read(watchdog_ev));
		this_cpu_write(watchdog_ev, NULL);
	}
-#ifdef CONFIG_CORELOCKUP_DETECTOR
-	pmu_based_nmi = true;
-#endif
	return ret;
 }
 #endif /* CONFIG_HARDLOCKUP_DETECTOR_PERF */
From: Xu Qiang <xuqiang36@huawei.com>

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I7WQWT
CVE: NA
--------------------------------
A user-mode interface is added to control the core lockup detection sensitivity.
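Illustrative usage, assuming the entry is registered under the usual "kernel" sysctl namespace like the other watchdog knobs (the accepted range is 3 to 5, per the limits below):

	sysctl -w kernel.corelockup_thresh=3

A lower value reports a suspected core lockup after fewer consecutive missed samples, i.e. sooner but with a higher risk of false positives.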
Signed-off-by: Xu Qiang <xuqiang36@huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Conflicts:
	kernel/sysctl.c
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
---
 include/linux/nmi.h   |  1 +
 kernel/watchdog.c     | 14 ++++++++++++++
 kernel/watchdog_hld.c |  4 +++-
 3 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index b3229c437264..e14216d1d5aa 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -132,6 +132,7 @@ extern void corelockup_detector_offline_cpu(unsigned int cpu);
 extern void watchdog_check_hrtimer(void);
 extern unsigned long watchdog_hrtimer_interrupts(unsigned int cpu);
 extern bool enable_corelockup_detector;
+extern int corelockup_miss_thresh;
 #endif
 
 void watchdog_nmi_stop(void);
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 4ef8c343c442..73d5eb55b112 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -778,6 +778,9 @@ int proc_watchdog_cpumask(struct ctl_table *table, int write,
 }
 
 static const int sixty = 60;
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
+static const int five = 5;
+#endif
 
 static struct ctl_table watchdog_sysctls[] = {
	{
@@ -866,6 +869,17 @@ static struct ctl_table watchdog_sysctls[] = {
		.extra2		= SYSCTL_ONE,
	},
 #endif /* CONFIG_SMP */
+#ifdef CONFIG_CORELOCKUP_DETECTOR
+	{
+		.procname	= "corelockup_thresh",
+		.data		= &corelockup_miss_thresh,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_THREE,
+		.extra2		= (void *)&five,
+	},
+#endif
 #endif
	{}
 };
diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index 392837224ec1..48b815afb946 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -83,6 +83,7 @@ static DEFINE_PER_CPU(unsigned long, hrint_missed);
 static unsigned long corelockup_allcpu_dumped;
 bool enable_corelockup_detector;
+int __read_mostly corelockup_miss_thresh = 5;
 
 static int __init enable_corelockup_detector_setup(char *str)
 {
@@ -169,7 +170,8 @@ void watchdog_check_hrtimer(void)
	}
	__this_cpu_inc(nmi_cnt_missed);
 
-	if ((__this_cpu_read(hrint_missed) > 5) && (__this_cpu_read(nmi_cnt_missed) > 5)) {
+	if ((__this_cpu_read(hrint_missed) > corelockup_miss_thresh)
+	    && (__this_cpu_read(nmi_cnt_missed) > corelockup_miss_thresh)) {
		pr_emerg("Watchdog detected core LOCKUP on cpu %d\n", cpu);
 
		if (!test_and_set_bit(0, &corelockup_allcpu_dumped)) {
From: Xu Qiang <xuqiang36@huawei.com>

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I7WQWT
CVE: NA
--------------------------------
When hard lockup detection is disabled, core lockup detection is not performed.
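This also means the check can be stopped at runtime through the existing NMI watchdog switch; for example (illustrative):

	sysctl -w kernel.nmi_watchdog=0

clears NMI_WATCHDOG_ENABLED in watchdog_enabled, so watchdog_check_hrtimer() now returns before touching the miss counters.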
Signed-off-by: Xu Qiang <xuqiang36@huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
---
 kernel/watchdog_hld.c | 4 ++++
 1 file changed, 4 insertions(+)
diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index 48b815afb946..fba2d5e5061e 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -148,6 +148,10 @@ void watchdog_check_hrtimer(void)
	if (cpu == smp_processor_id())
		return;
 
+	/* return if hard lockup detector is disable */
+	if (!(watchdog_enabled & NMI_WATCHDOG_ENABLED))
+		return;
+
	/*
	 * The freq of hrtimer is fast than nmi interrupts and
	 * the core mustn't hangs if hrtimer still working.
From: Xu Qiang <xuqiang36@huawei.com>

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I7WQWT
CVE: NA
--------------------------------
Signed-off-by: Xu Qiang <xuqiang36@huawei.com>
Reviewed-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Conflicts:
	arch/arm64/configs/openeuler_defconfig
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
---
 arch/arm64/configs/openeuler_defconfig | 1 +
 1 file changed, 1 insertion(+)
diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig
index cad620ac08aa..45efb14b95b5 100644
--- a/arch/arm64/configs/openeuler_defconfig
+++ b/arch/arm64/configs/openeuler_defconfig
@@ -7634,6 +7634,7 @@ CONFIG_SOFTLOCKUP_DETECTOR=y
 # CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
 CONFIG_SDEI_WATCHDOG=y
 CONFIG_HARDLOCKUP_DETECTOR=y
+CONFIG_CORELOCKUP_DETECTOR=y
 CONFIG_DETECT_HUNG_TASK=y
 CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
 # CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I7WQWT
--------------------------------
The corelockup implementation relies only on the softlockup and hardlockup detectors and has no architecture-specific code of its own. So far only ARM64 and X86_64 have been verified, so add X86_64 to the architectures that may enable it.
Also, corelockup only supports multicore systems, making it dependent on CONFIG_SMP.
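On a target where the detector is selectable, the resulting dependency set looks like this (illustrative .config fragment):

	CONFIG_SMP=y
	CONFIG_SOFTLOCKUP_DETECTOR=y
	CONFIG_HARDLOCKUP_DETECTOR=y
	CONFIG_CORELOCKUP_DETECTOR=y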
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
---
 lib/Kconfig.debug | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 9cbb4ed7f17c..b7f84c4abbf5 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1074,7 +1074,8 @@ config HARDLOCKUP_DETECTOR
 config CORELOCKUP_DETECTOR
	bool "Detect Core Lockups"
	depends on HARDLOCKUP_DETECTOR && SOFTLOCKUP_DETECTOR
-	depends on ARM64
+	depends on ARM64 || X86_64
+	depends on SMP
	default n
	help
	  Corelockups is used to check whether cpu core hungup or not.