From: Shuai Xue xueshuai@linux.alibaba.com
stable inclusion from stable-v5.10.150 commit 110146ce8f84f8f0ef6f09d36e7c5936d5e2f0cf category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6D0XA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 415fed694fe11395df56e05022d6e7cee1d39dd3 ]
If an error is detected as a result of user-space process accessing a corrupt memory location, the CPU may take an abort. Then the platform firmware reports kernel via NMI like notifications, e.g. NOTIFY_SEA, NOTIFY_SOFTWARE_DELEGATED, etc.
For NMI like notifications, commit 7f17b4a121d0 ("ACPI: APEI: Kick the memory_failure() queue for synchronous errors") keep track of whether memory_failure() work was queued, and make task_work pending to flush out the queue so that the work is processed before return to user-space.
The code use init_mm to check whether the error occurs in user space:
if (current->mm != &init_mm)
The condition is always true, becase _nobody_ ever has "init_mm" as a real VM any more.
In addition to abort, errors can also be signaled as asynchronous exceptions, such as interrupt and SError. In such case, the interrupted current process could be any kind of thread. When a kernel thread is interrupted, the work ghes_kick_task_work deferred to task_work will never be processed because entry_handler returns to call ret_to_kernel() instead of ret_to_user(). Consequently, the estatus_node alloced from ghes_estatus_pool in ghes_in_nmi_queue_one_entry() will not be freed. After around 200 allocations in our platform, the ghes_estatus_pool will run of memory and ghes_in_nmi_queue_one_entry() returns ENOMEM. As a result, the event failed to be processed.
sdei: event 805 on CPU 113 failed with error: -2
Finally, a lot of unhandled events may cause platform firmware to exceed some threshold and reboot.
The condition should generally just do
if (current->mm)
as described in active_mm.rst documentation.
Then if an asynchronous error is detected when a kernel thread is running, (e.g. when detected by a background scrubber), do not add task_work to it as the original patch intends to do.
Fixes: 7f17b4a121d0 ("ACPI: APEI: Kick the memory_failure() queue for synchronous errors") Signed-off-by: Shuai Xue xueshuai@linux.alibaba.com Reviewed-by: Tony Luck tony.luck@intel.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Jialin Zhang zhangjialin11@huawei.com --- drivers/acpi/apei/ghes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index 9c38c2cdd2fd..85c546533f5e 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -993,7 +993,7 @@ static void ghes_proc_in_irq(struct irq_work *irq_work) ghes_estatus_cache_add(generic, estatus); }
- if (task_work_pending && current->mm != &init_mm) { + if (task_work_pending && current->mm) { estatus_node->task_work.func = ghes_kick_task_work; estatus_node->task_work_cpu = smp_processor_id(); ret = task_work_add(current, &estatus_node->task_work,