[PATCH OLK-5.10 0/2] ARM: fix hash_name() issue
Fix hash_name() issue.

Russell King (Oracle) (2):
  ARM: allow __do_kernel_fault() to report execution of memory faults
  ARM: fix hash_name() fault

 arch/arm/mm/fault.c | 63 +++++++++++++++++++++++++++++++++++----------
 1 file changed, 50 insertions(+), 13 deletions(-)

-- 
2.39.2
From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>

mainline inclusion
from mainline-v6.19-rc1
commit 40b466db1dffb41f0529035c59c5739636d0e5b8
category: other
bugzilla: https://gitee.com/openeuler/kernel/issues/IDA8JM

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...

--------------------------------

Allow __do_kernel_fault() to detect the execution of memory, so we can
provide the same fault message as do_page_fault() would do. This is
required when we split the kernel address fault handling from the
main do_page_fault() code path.

Reviewed-by: Xie Yuanbin <xieyuanbin1@huawei.com>
Tested-by: Xie Yuanbin <xieyuanbin1@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

Conflicts:
	arch/arm/mm/fault.c
[There is a compilation error because the definition of
is_permission_fault() comes after __do_kernel_fault(). Referring to
Russell King's follow-up change:
"Group is_permission_fault() with is_translation_fault(), which is
needed to use is_permission_fault() in __do_kernel_fault().
As this is static inline, there is no need for this to be under
CONFIG_MMU."]
Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
---
 arch/arm/mm/fault.c | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 4c64d90155c1..9e90e1d139e0 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -131,6 +131,19 @@ static inline bool is_translation_fault(unsigned int fsr)
 	return false;
 }
 
+static inline bool is_permission_fault(unsigned int fsr)
+{
+	int fs = fsr_fs(fsr);
+#ifdef CONFIG_ARM_LPAE
+	if ((fs & FS_MMU_NOLL_MASK) == FS_PERM_NOLL)
+		return true;
+#else
+	if (fs == FS_L1_PERM || fs == FS_L2_PERM)
+		return true;
+#endif
+	return false;
+}
+
 static void die_kernel_fault(const char *msg, struct mm_struct *mm,
 			     unsigned long addr, unsigned int fsr,
 			     struct pt_regs *regs)
@@ -165,6 +178,8 @@ __do_kernel_fault(struct mm_struct *mm, unsigned long addr, unsigned int fsr,
 	 */
 	if (addr < PAGE_SIZE) {
 		msg = "NULL pointer dereference";
+	} else if (is_permission_fault(fsr) && fsr & FSR_LNX_PF) {
+		msg = "execution of memory";
 	} else {
 		if (is_translation_fault(fsr) &&
 		    kfence_handle_page_fault(addr, is_write_fault(fsr), regs))
@@ -231,19 +246,6 @@ void do_bad_area(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 #define VM_FAULT_BADMAP		0x010000
 #define VM_FAULT_BADACCESS	0x020000
 
-static inline bool is_permission_fault(unsigned int fsr)
-{
-	int fs = fsr_fs(fsr);
-#ifdef CONFIG_ARM_LPAE
-	if ((fs & FS_MMU_NOLL_MASK) == FS_PERM_NOLL)
-		return true;
-#else
-	if (fs == FS_L1_PERM || fs == FS_L2_PERM)
-		return true;
-#endif
-	return false;
-}
-
 static vm_fault_t __kprobes
 __do_page_fault(struct mm_struct *mm, unsigned long addr, unsigned int flags,
 		unsigned long vma_flags, struct pt_regs *regs)
-- 
2.39.2
From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>

mainline inclusion
from mainline-v6.19-rc1
commit 7733bc7d299d682f2723dc38fc7f370b9bf973e9
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IDA8JM

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...

--------------------------------

Zizhi Wo reports:

"During the execution of hash_name()->load_unaligned_zeropad(), a
 potential memory access beyond the PAGE boundary may occur. For
 example, when the filename length is near the PAGE_SIZE boundary.
 This triggers a page fault, which leads to a call to
 do_page_fault()->mmap_read_trylock(). If we can't acquire the lock,
 we have to fall back to the mmap_read_lock() path, which calls
 might_sleep(). This breaks RCU semantics because path lookup occurs
 under an RCU read-side critical section."

This is seen with CONFIG_DEBUG_ATOMIC_SLEEP=y and CONFIG_KFENCE=y.

Kernel addresses (with the exception of the vectors/kuser helper
page) do not have VMAs associated with them. If the vectors/kuser
helper page faults, then there are two possibilities:

1. if the fault happened while in kernel mode, then we're basically
   dead, because the CPU won't be able to vector through this page
   to handle the fault.
2. if the fault happened while in user mode, that means the page was
   protected from user access, and we want to fault anyway.

Thus, we can handle kernel addresses from any context entirely
separately without going anywhere near the mmap lock. This gives us
an entirely non-sleeping path for all kernel mode kernel address
faults.

As we handle the kernel address faults before interrupts are enabled,
this change has the side effect of improving the branch predictor
hardening, but does not completely solve the issue.

Reported-by: Zizhi Wo <wozizhi@huaweicloud.com>
Reported-by: Xie Yuanbin <xieyuanbin1@huawei.com>
Link: https://lore.kernel.org/r/20251126090505.3057219-1-wozizhi@huaweicloud.com
Reviewed-by: Xie Yuanbin <xieyuanbin1@huawei.com>
Tested-by: Xie Yuanbin <xieyuanbin1@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

Conflicts:
	arch/arm/mm/fault.c
[A simple context conflict, unrelated to this patch.]
Fixes: bfcfaa77bdf0 ("vfs: use 'unsigned long' accesses for dcache name comparison and hashing")
Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
---
 arch/arm/mm/fault.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 9e90e1d139e0..be30efcbaf7d 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -273,6 +273,35 @@ __do_page_fault(struct mm_struct *mm, unsigned long addr, unsigned int flags,
 	return handle_mm_fault(vma, addr & PAGE_MASK, flags, regs);
 }
 
+static int __kprobes
+do_kernel_address_page_fault(struct mm_struct *mm, unsigned long addr,
+			     unsigned int fsr, struct pt_regs *regs)
+{
+	if (user_mode(regs)) {
+		/*
+		 * Fault from user mode for a kernel space address. User mode
+		 * should not be faulting in kernel space, which includes the
+		 * vector/khelper page. Send a SIGSEGV.
+		 */
+		__do_user_fault(addr, fsr, SIGSEGV, SEGV_MAPERR, regs);
+	} else {
+		/*
+		 * Fault from kernel mode. Enable interrupts if they were
+		 * enabled in the parent context. Section (upper page table)
+		 * translation faults are handled via do_translation_fault(),
+		 * so we will only get here for a non-present kernel space
+		 * PTE or PTE permission fault. This may happen in exceptional
+		 * circumstances and need the fixup tables to be walked.
+		 */
+		if (interrupts_enabled(regs))
+			local_irq_enable();
+
+		__do_kernel_fault(mm, addr, fsr, regs);
+	}
+
+	return 0;
+}
+
 static int __kprobes
 do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 {
@@ -285,6 +314,12 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 	if (kprobe_page_fault(regs, fsr))
 		return 0;
 
+	/*
+	 * Handle kernel addresses faults separately, which avoids touching
+	 * the mmap lock from contexts that are not able to sleep.
+	 */
+	if (addr >= TASK_SIZE)
+		return do_kernel_address_page_fault(mm, addr, fsr, regs);
 
 	/* Enable interrupts if they were enabled in the parent context. */
 	if (interrupts_enabled(regs))
-- 
2.39.2
Feedback:
The patch(es) you sent to the kernel@openeuler.org mailing list have been successfully converted to a pull request.
Pull request link: https://gitee.com/openeuler/kernel/pulls/19694
Mailing list address: https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/3OZ...