From: Suren Baghdasaryan <surenb@google.com>
mainline inclusion
from mainline-v6.7
commit 46e714c729c8d1d8110bc0545d7ffe8a759c9dc0
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I8YKL3
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
A test [1] in the Android test suite started failing after [2] was merged. It turns out that after handling a major fault under the per-VMA lock, the process major fault counter does not register that fault as major.

Before [2], read faults were handled under mmap_lock, in which case the FAULT_FLAG_TRIED flag is set before retrying. That in turn causes mm_account_fault() to account the fault as major once the retry completes. With per-VMA locks we often retry because a fault can't be handled without locking the whole mm using mmap_lock, so such retries do not set FAULT_FLAG_TRIED. This logic does not work after [2] because we can now handle read major faults under the per-VMA lock, and upon retry the fact that there was a major fault gets lost.

Fix this by setting FAULT_FLAG_TRIED after retrying under the per-VMA lock if VM_FAULT_MAJOR was returned. Ideally we would use an additional VM_FAULT bit to indicate the reason for the retry (could not handle under per-VMA lock vs. other reason), but this simpler solution seems to work, so keep it simple.
[1] https://cs.android.com/android/platform/superproject/+/master:test/vts-testc...
[2] https://lore.kernel.org/all/20231006195318.4087158-6-willy@infradead.org/
Link: https://lkml.kernel.org/r/20231226214610.109282-1-surenb@google.com
Fixes: 12214eba1992 ("mm: handle read faults under the VMA lock")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
conflict: arch/s390/mm/fault.c
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
---
 arch/arm64/mm/fault.c   | 2 ++
 arch/powerpc/mm/fault.c | 2 ++
 arch/riscv/mm/fault.c   | 2 ++
 arch/s390/mm/fault.c    | 3 +++
 arch/x86/mm/fault.c     | 2 ++
 5 files changed, 11 insertions(+)
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 171810268f6f..c376e58e8cf0 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -633,6 +633,8 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
 		goto done;
 	}
 	count_vm_vma_lock_event(VMA_LOCK_RETRY);
+	if (fault & VM_FAULT_MAJOR)
+		mm_flags |= FAULT_FLAG_TRIED;
 
 	/* Quick path to respond to signals */
 	if (fault_signal_pending(fault, regs)) {
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index b1723094d464..ec23164ad768 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -496,6 +496,8 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
 		goto done;
 	}
 	count_vm_vma_lock_event(VMA_LOCK_RETRY);
+	if (fault & VM_FAULT_MAJOR)
+		flags |= FAULT_FLAG_TRIED;
 
 	if (fault_signal_pending(fault, regs))
 		return user_mode(regs) ? 0 : SIGBUS;
diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index 90d4ba36d1d0..081339ddf47e 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -304,6 +304,8 @@ void handle_page_fault(struct pt_regs *regs)
 		goto done;
 	}
 	count_vm_vma_lock_event(VMA_LOCK_RETRY);
+	if (fault & VM_FAULT_MAJOR)
+		flags |= FAULT_FLAG_TRIED;
 
 	if (fault_signal_pending(fault, regs)) {
 		if (!user_mode(regs))
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index b678295931c3..f5463535013a 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -424,6 +424,9 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
 		goto out;
 	}
 	count_vm_vma_lock_event(VMA_LOCK_RETRY);
+	if (fault & VM_FAULT_MAJOR)
+		flags |= FAULT_FLAG_TRIED;
+
 	/* Quick path to respond to signals */
 	if (fault_signal_pending(fault, regs)) {
 		fault = VM_FAULT_SIGNAL;
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index ab778eac1952..679b09cfe241 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1370,6 +1370,8 @@ void do_user_addr_fault(struct pt_regs *regs,
 		goto done;
 	}
 	count_vm_vma_lock_event(VMA_LOCK_RETRY);
+	if (fault & VM_FAULT_MAJOR)
+		flags |= FAULT_FLAG_TRIED;
 
 	/* Quick path to respond to signals */
 	if (fault_signal_pending(fault, regs)) {