hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7PN0A
CVE: NA
-------------------------------------------------
BUG reported when setting up the MPAM driver:

[Thu Jul 27 12:15:54 2023] BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:49
[Thu Jul 27 12:15:54 2023] in_atomic(): 0, irqs_disabled(): 128, non_block: 0, pid: 593, name: kworker/72:1
[Thu Jul 27 12:15:54 2023] CPU: 72 PID: 593 Comm: kworker/72:1 Not tainted 5.10.0-03467-g02e1abb0f821-dirty #1
[Thu Jul 27 12:15:54 2023] Hardware name: Huawei TaiShan 2280 V2/BC82AMDDA, BIOS 1.79 08/21/2021
[Thu Jul 27 12:15:54 2023] Loaded X.509 cert 'Build time autogenerated kernel key: 9ae2dc86231d0b23cd114b5ed4089cac17566b0f'
[Thu Jul 27 12:15:54 2023] Workqueue: events mpam_enable
[Thu Jul 27 12:15:54 2023] Load PGP public keys
[Thu Jul 27 12:15:54 2023] Call trace:
[Thu Jul 27 12:15:54 2023]  dump_backtrace+0x0/0x30c
[Thu Jul 27 12:15:54 2023]  show_stack+0x20/0x30
[Thu Jul 27 12:15:54 2023]  dump_stack+0x11c/0x174
[Thu Jul 27 12:15:54 2023]  ___might_sleep+0x15c/0x1a0
[Thu Jul 27 12:15:54 2023]  __might_sleep+0x7c/0x100
[Thu Jul 27 12:15:54 2023]  cpus_read_lock+0x3c/0x110
[Thu Jul 27 12:15:54 2023]  __cpuhp_setup_state+0x3c/0x80
[Thu Jul 27 12:15:54 2023]  mpam_enable+0x148/0x3a4
[Thu Jul 27 12:15:54 2023]  process_one_work+0x3cc/0x984
[Thu Jul 27 12:15:54 2023]  worker_thread+0x2b0/0x71c
[Thu Jul 27 12:15:54 2023]  kthread+0x1e0/0x220
[Thu Jul 27 12:15:54 2023]  ret_from_fork+0x10/0x18
[Thu Jul 27 12:15:54 2023] kmemleak: Kernel memory leak detector initialized (mem pool available: 11650)
[Thu Jul 27 12:15:54 2023] kmemleak: Automatic memory scanning thread started
[Thu Jul 27 12:15:54 2023] cryptd: max_cpu_qlen set to 1000
[Thu Jul 27 12:15:54 2023] Key type encrypted registered
[Thu Jul 27 12:15:54 2023] AppArmor: AppArmor sha1 policy hashing enabled
[Thu Jul 27 12:15:54 2023] integrity: Loading X.509 certificate: UEFI:db

Commit bc9e3f9895ef2 ("arm64/mpam: Fix mpam corrupt when cpu online") reported the 'Bad PC' BUG: concurrently calling cpuhp_remove_state() and cpuhp_setup_state() in mpam_enable() is dangerous because both calls share the same state, mpam_cpuhp_state, which may have changed by the time an interrupt returns.

That commit disabled irqs around cpuhp_setup_state(), but this did not completely solve the problem and it triggers the BUG report above: cpuhp_setup_state() takes cpus_read_lock(), which may sleep, so it must not be called with irqs disabled. Use the lockless atomic_fetch_inc() instead, so that mpam_enable() cannot hot-plug mpam_cpu_online() twice.
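
The guard can be illustrated with a minimal user-space sketch. It assumes C11 <stdatomic.h>, where atomic_fetch_add(&v, 1) plays the role of the kernel's atomic_fetch_inc(); enable_once(), register_hotplug_callbacks() and setup_calls are hypothetical names used only for illustration and are not part of the MPAM driver.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counter: how many times the one-time setup actually ran. */
static int setup_calls;

/* Hypothetical stand-in for the cpuhp callback registration. */
static void register_hotplug_callbacks(void)
{
	setup_calls++;
}

/*
 * User-space analogue of the guard added to mpam_enable().
 * atomic_fetch_add() returns the value held *before* the increment,
 * so exactly one caller observes 0 and proceeds; all later (or
 * concurrent) callers observe a non-zero value and bail out.
 */
static bool enable_once(void)
{
	static atomic_int once = 0;

	if (atomic_fetch_add(&once, 1) != 0)
		return false;	/* setup was already claimed by another caller */

	register_hotplug_callbacks();
	return true;
}
```

Calling enable_once() repeatedly runs the setup only on the first call. No lock is held and irqs are never disabled, so the setup path remains free to sleep.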

Fixes: bc9e3f9895ef2 ("arm64/mpam: Fix mpam corrupt when cpu online")
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
---
 arch/arm64/kernel/mpam/mpam_device.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c
index adf4bc034a51f..9ff8947a2ddf7 100644
--- a/arch/arm64/kernel/mpam/mpam_device.c
+++ b/arch/arm64/kernel/mpam/mpam_device.c
@@ -540,6 +540,7 @@ static void mpam_disable_irqs(void)
 static void mpam_enable(struct work_struct *work)
 {
 	int err;
+	static atomic_t once;
 	unsigned long flags;
 	struct mpam_device *dev;
 	bool all_devices_probed = true;
@@ -557,7 +558,7 @@ static void mpam_enable(struct work_struct *work)
 	}
 	mutex_unlock(&mpam_devices_lock);

-	if (!all_devices_probed)
+	if (!all_devices_probed || atomic_fetch_inc(&once))
 		return;

 	mutex_lock(&mpam_devices_lock);
@@ -596,11 +597,9 @@ static void mpam_enable(struct work_struct *work)
 		pr_err("Failed to setup/init resctrl\n");
 	mutex_unlock(&mpam_devices_lock);

-	local_irq_disable();
 	mpam_cpuhp_state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
 						"mpam:online", mpam_cpu_online,
 						mpam_cpu_offline);
-	local_irq_enable();
 	if (mpam_cpuhp_state <= 0)
 		pr_err("Failed to re-register 'dyn' cpuhp callbacks");
 	mutex_unlock(&mpam_cpuhp_lock);