From: Anshuman Khandual <khandual@linux.vnet.ibm.com>
maillist inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I7U78A
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?...
-------------------------------------------
Patch series "arm64: support batched/deferred tlb shootdown during page reclamation/migration", v11.
Though ARM64 has the hardware to do tlb shootdown, the hardware broadcasting is not free. A simple micro benchmark shows that even on a snapdragon 888 with only 8 cores, the overhead of ptep_clear_flush is huge even for paging out one page mapped by only one process:

  5.36%  a.out  [kernel.kallsyms]  [k] ptep_clear_flush
When pages are mapped by multiple processes, or the hardware has more CPUs, the cost becomes even higher due to the bad scalability of tlb shootdown. The same benchmark results in 16.99% CPU consumption on an ARM64 server with around 100 cores, according to the test in patch 4/4.
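For reference, a micro benchmark of this kind can be approximated with a small userspace program along the lines of the sketch below. This is a hypothetical reconstruction, not necessarily the exact program behind the numbers above; the mapping size and iteration count are arbitrary. It repeatedly touches an anonymous mapping and then reclaims it with MADV_PAGEOUT, which exercises the per-page TLB flush path being measured:

#include <string.h>
#include <sys/mman.h>

#define SIZE	(1 * 1024 * 1024)	/* arbitrary working set */

int main(void)
{
	volatile unsigned char *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
					 MAP_SHARED | MAP_ANON, -1, 0);

	if (p == MAP_FAILED)
		return 1;

	memset((void *)p, 0x88, SIZE);

	for (int k = 0; k < 10000; k++) {
		/* fault the pages back in after the previous reclaim pass */
		for (int i = 0; i < SIZE; i += 4096)
			(void)p[i];

		/* reclaim the range, triggering the TLB shootdown being measured */
		madvise((void *)p, SIZE, MADV_PAGEOUT);
	}

	return 0;
}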
This patchset leverages the existing BATCHED_UNMAP_TLB_FLUSH by
1. only sending tlbi instructions in the first stage - arch_tlbbatch_add_mm()
2. waiting for the completion of tlbi by dsb while doing tlbbatch sync in arch_tlbbatch_flush()
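For illustration only, the two stages map roughly onto arm64 as sketched below. This is not the code from patch 4/4; the function signature (notably the uaddr argument) and the exact TLBI variant are assumptions, but the split is the point: the add path broadcasts the invalidation without waiting, and the flush path waits once with a single DSB.

/*
 * Illustrative sketch only -- see patch 4/4 for the real arm64 hooks.
 * The signature below (notably the uaddr argument) is an assumption.
 */
static inline void arch_tlbbatch_add_mm(struct arch_tlbflush_unmap_batch *batch,
					struct mm_struct *mm, unsigned long uaddr)
{
	unsigned long addr = __TLBI_VADDR(uaddr, ASID(mm));

	/* make the PTE update visible before broadcasting the invalidation */
	dsb(ishst);
	/* stage 1: broadcast the per-page TLBI, but do not wait for it */
	__tlbi(vale1is, addr);
	__tlbi_user(vale1is, addr);
}

static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
{
	/* stage 2: one DSB waits for all previously issued TLBIs to finish */
	dsb(ish);
}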
Testing on snapdragon shows the overhead of ptep_clear_flush is removed by the patchset. The micro benchmark becomes 5% faster even for one page mapped by a single process on snapdragon 888.
Since BATCHED_UNMAP_TLB_FLUSH is implemented only on x86, the patchset first does some renaming/extension of the current implementation (patches 1-3), then adds the support on arm64 (patch 4).
This patch (of 4):
The entire scheme of deferred TLB flush in the reclaim path rests on the fact that the cost of refilling TLB entries is less than that of flushing out individual entries by sending IPIs to remote CPUs. But architectures can have different ways to evaluate that. Hence, apart from checking TTU_BATCH_FLUSH in the TTU flags, the rest of the decision should be architecture specific.
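As an illustration of that split (hypothetical sketch, not part of this patch): the x86 hook below only defers when a remote CPU would otherwise need an IPI, whereas an architecture with hardware-broadcast invalidation, such as arm64 later in this series, could make the hook close to unconditional:

/*
 * Hypothetical sketch: with hardware-broadcast TLB invalidation there
 * are no IPIs to avoid, so deferring the flush is essentially always
 * a win (the real arm64 hook in this series may carry extra erratum checks).
 */
static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
{
	return true;
}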
[yangyicong@hisilicon.com: rebase and fix incorrect return value type]
Link: https://lkml.kernel.org/r/20230717131004.12662-1-yangyicong@huawei.com
Link: https://lkml.kernel.org/r/20230717131004.12662-2-yangyicong@huawei.com
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
[https://lore.kernel.org/linuxppc-dev/20171101101735.2318-2-khandual@linux.vn...]
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Barry Song <baohua@kernel.org>
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Punit Agrawal <punit.agrawal@bytedance.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Darren Hart <darren@os.amperecomputing.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: lipeifeng <lipeifeng@oppo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zeng Tao <prime.zeng@hisilicon.com>
Cc: Barry Song <v-songbaohua@oppo.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Nadav Amit <namit@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
---
 arch/x86/include/asm/tlbflush.h | 12 ++++++++++++
 mm/rmap.c                       |  9 +--------
 2 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index fa952eadbc2e..c1d8df34c6c6 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -239,6 +239,18 @@ static inline void flush_tlb_page(struct vm_area_struct *vma, unsigned long a)
 	flush_tlb_mm_range(vma->vm_mm, a, a + PAGE_SIZE, PAGE_SHIFT, false);
 }
 
+static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
+{
+	bool should_defer = false;
+
+	/* If remote CPUs need to be flushed then defer batch the flush */
+	if (cpumask_any_but(mm_cpumask(mm), get_cpu()) < nr_cpu_ids)
+		should_defer = true;
+	put_cpu();
+
+	return should_defer;
+}
+
 static inline u64 inc_mm_tlb_gen(struct mm_struct *mm)
 {
 	/*
diff --git a/mm/rmap.c b/mm/rmap.c
index 3e12d26d8c55..d4803e04b00d 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -652,17 +652,10 @@ static void set_tlb_ubc_flush_pending(struct mm_struct *mm, bool writable)
  */
 static bool should_defer_flush(struct mm_struct *mm, enum ttu_flags flags)
 {
-	bool should_defer = false;
-
 	if (!(flags & TTU_BATCH_FLUSH))
 		return false;
 
-	/* If remote CPUs need to be flushed then defer batch the flush */
-	if (cpumask_any_but(mm_cpumask(mm), get_cpu()) < nr_cpu_ids)
-		should_defer = true;
-	put_cpu();
-
-	return should_defer;
+	return arch_tlbbatch_should_defer(mm);
 }
 
 /*