[PATCH OLK-6.6 0/3] Fix PMD sharing disabled by missing SPLIT_PMD_PTLOCKS
Patch 1: Define CONFIG_SPLIT_PTE_PTLOCKS and CONFIG_SPLIT_PMD_PTLOCKS by turning the old USE_SPLIT_* macros into proper Kconfig options. Patch 2: in the same series with patch 1. Patch 3: new bugfix for patch 1. David Hildenbrand (2): mm: turn USE_SPLIT_PTE_PTLOCKS / USE_SPLIT_PTE_PTLOCKS into Kconfig options powerpc/8xx: document and enforce that split PT locks are not used Guenter Roeck (1): mm: make SPLIT_PTE_PTLOCKS depend on SMP arch/arm/mm/fault-armv.c | 6 +++--- arch/powerpc/mm/pgtable.c | 4 ++++ arch/x86/xen/mmu_pv.c | 7 ++++--- drivers/misc/zcopy/zcopy.c | 4 ++-- include/linux/mm.h | 8 ++++---- include/linux/mm_types.h | 2 +- include/linux/mm_types_task.h | 3 --- kernel/fork.c | 4 ++-- mm/Kconfig | 19 ++++++++++++------- mm/memory.c | 2 +- 10 files changed, 33 insertions(+), 26 deletions(-) -- 2.53.0
From: David Hildenbrand <david@redhat.com> mainline inclusion from mainline-v6.12-rc1 commit 394290cba9664ed3ab80d0e247402102a9b7287a category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/15533 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... -------------------------------- Patch series "mm: split PTE/PMD PT table Kconfig cleanups+clarifications". This series is a follow up to the fixes: "[PATCH v1 0/2] mm/hugetlb: fix hugetlb vs. core-mm PT locking" When working on the fixes, I wondered why 8xx is fine (-> never uses split PT locks) and how PT locking even works properly with PMD page table sharing (-> always requires split PMD PT locks). Let's improve the split PT lock detection, make hugetlb properly depend on it and make 8xx bail out if it would ever get enabled by accident. As an alternative to patch #3 we could extend the Kconfig SPLIT_PTE_PTLOCKS option from patch #2 -- but enforcing it closer to the code that actually implements it feels a bit nicer for documentation purposes, and there is no need to actually disable it because it should always be disabled (!SMP). Did a bunch of cross-compilations to make sure that split PTE/PMD PT locks are still getting used where we would expect them. [1] https://lkml.kernel.org/r/20240725183955.2268884-1-david@redhat.com This patch (of 3): Let's clean that up a bit and prepare for depending on CONFIG_SPLIT_PMD_PTLOCKS in other Kconfig options. More cleanups would be reasonable (like the arch-specific "depends on" for CONFIG_SPLIT_PTE_PTLOCKS), but we'll leave that for another day. Link: https://lkml.kernel.org/r/20240726150728.3159964-1-david@redhat.com Link: https://lkml.kernel.org/r/20240726150728.3159964-2-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Borislav Petkov <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Muchun Song <muchun.song@linux.dev> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Conflicts: kernel/fork.c drivers/misc/zcopy/zcopy.c [Context conflicts] Signed-off-by: Qi Xi <xiqi2@huawei.com> --- arch/arm/mm/fault-armv.c | 6 +++--- arch/x86/xen/mmu_pv.c | 7 ++++--- drivers/misc/zcopy/zcopy.c | 4 ++-- include/linux/mm.h | 8 ++++---- include/linux/mm_types.h | 2 +- include/linux/mm_types_task.h | 3 --- kernel/fork.c | 4 ++-- mm/Kconfig | 18 +++++++++++------- mm/memory.c | 2 +- 9 files changed, 28 insertions(+), 26 deletions(-) diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c index 2286c2ea60ec..831793cd6ff9 100644 --- a/arch/arm/mm/fault-armv.c +++ b/arch/arm/mm/fault-armv.c @@ -61,7 +61,7 @@ static int do_adjust_pte(struct vm_area_struct *vma, unsigned long address, return ret; } -#if USE_SPLIT_PTE_PTLOCKS +#if defined(CONFIG_SPLIT_PTE_PTLOCKS) /* * If we are using split PTE locks, then we need to take the page * lock here. Otherwise we are using shared mm->page_table_lock @@ -80,10 +80,10 @@ static inline void do_pte_unlock(spinlock_t *ptl) { spin_unlock(ptl); } -#else /* !USE_SPLIT_PTE_PTLOCKS */ +#else /* !defined(CONFIG_SPLIT_PTE_PTLOCKS) */ static inline void do_pte_lock(spinlock_t *ptl) {} static inline void do_pte_unlock(spinlock_t *ptl) {} -#endif /* USE_SPLIT_PTE_PTLOCKS */ +#endif /* defined(CONFIG_SPLIT_PTE_PTLOCKS) */ static int adjust_pte(struct vm_area_struct *vma, unsigned long address, unsigned long pfn) diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c index 23f30ca52816..bdb4f1043f17 100644 --- a/arch/x86/xen/mmu_pv.c +++ b/arch/x86/xen/mmu_pv.c @@ -712,7 +712,7 @@ static spinlock_t *xen_pte_lock(struct page *page, struct mm_struct *mm) { spinlock_t *ptl = NULL; -#if USE_SPLIT_PTE_PTLOCKS +#if defined(CONFIG_SPLIT_PTE_PTLOCKS) ptl = ptlock_ptr(page_ptdesc(page)); spin_lock_nest_lock(ptl, &mm->page_table_lock); #endif @@ -1607,7 +1607,8 @@ static inline void xen_alloc_ptpage(struct mm_struct *mm, unsigned long pfn, __set_pfn_prot(pfn, PAGE_KERNEL_RO); - if (level == PT_PTE && USE_SPLIT_PTE_PTLOCKS && !pinned) + if (level == PT_PTE && IS_ENABLED(CONFIG_SPLIT_PTE_PTLOCKS) && + !pinned) __pin_pagetable_pfn(MMUEXT_PIN_L1_TABLE, pfn); xen_mc_issue(XEN_LAZY_MMU); @@ -1635,7 +1636,7 @@ static inline void xen_release_ptpage(unsigned long pfn, unsigned level) if (pinned) { xen_mc_batch(); - if (level == PT_PTE && USE_SPLIT_PTE_PTLOCKS) + if (level == PT_PTE && IS_ENABLED(CONFIG_SPLIT_PTE_PTLOCKS)) __pin_pagetable_pfn(MMUEXT_UNPIN_TABLE, pfn); __set_pfn_prot(pfn, PAGE_KERNEL); diff --git a/drivers/misc/zcopy/zcopy.c b/drivers/misc/zcopy/zcopy.c index a64bc468c46a..6ae77009546b 100644 --- a/drivers/misc/zcopy/zcopy.c +++ b/drivers/misc/zcopy/zcopy.c @@ -65,7 +65,7 @@ static void (*__zcopy_mmu_notifier_arch_invalidate_secondary_tlbs)(struct mm_str static struct kretprobe __kretprobe; -#if USE_SPLIT_PTE_PTLOCKS && ALLOC_SPLIT_PTLOCKS +#if defined(CONFIG_SPLIT_PTE_PTLOCKS) && ALLOC_SPLIT_PTLOCKS static struct kmem_cache *zcopy_page_ptl_cachep; bool ptlock_alloc(struct ptdesc *ptdesc) { @@ -776,7 +776,7 @@ static int register_unexport_func(void) { int ret; -#if USE_SPLIT_PTE_PTLOCKS && ALLOC_SPLIT_PTLOCKS +#if defined(CONFIG_SPLIT_PTE_PTLOCKS) && ALLOC_SPLIT_PTLOCKS zcopy_page_ptl_cachep = (struct kmem_cache *)__kallsyms_lookup_name("page_ptl_cachep"); ret = REGISTER_CHECK(__zcopy_pud_alloc, "__pud_alloc"); diff --git a/include/linux/mm.h b/include/linux/mm.h index 1116be35323e..7d3cac3414e9 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -3108,7 +3108,7 @@ static inline void pagetable_free(struct ptdesc *pt) __free_pages(page, compound_order(page)); } -#if USE_SPLIT_PTE_PTLOCKS +#if defined(CONFIG_SPLIT_PTE_PTLOCKS) #if ALLOC_SPLIT_PTLOCKS void __init ptlock_cache_init(void); bool ptlock_alloc(struct ptdesc *ptdesc); @@ -3159,7 +3159,7 @@ static inline bool ptlock_init(struct ptdesc *ptdesc) return true; } -#else /* !USE_SPLIT_PTE_PTLOCKS */ +#else /* !defined(CONFIG_SPLIT_PTE_PTLOCKS) */ /* * We use mm->page_table_lock to guard all pagetable pages of the mm. */ @@ -3170,7 +3170,7 @@ static inline spinlock_t *pte_lockptr(struct mm_struct *mm, pmd_t *pmd) static inline void ptlock_cache_init(void) {} static inline bool ptlock_init(struct ptdesc *ptdesc) { return true; } static inline void ptlock_free(struct ptdesc *ptdesc) {} -#endif /* USE_SPLIT_PTE_PTLOCKS */ +#endif /* defined(CONFIG_SPLIT_PTE_PTLOCKS) */ static inline bool pagetable_pte_ctor(struct ptdesc *ptdesc) { @@ -3230,7 +3230,7 @@ pte_t *pte_offset_map_nolock(struct mm_struct *mm, pmd_t *pmd, ((unlikely(pmd_none(*(pmd))) && __pte_alloc_kernel(pmd))? \ NULL: pte_offset_kernel(pmd, address)) -#if USE_SPLIT_PMD_PTLOCKS +#if defined(CONFIG_SPLIT_PMD_PTLOCKS) static inline struct page *pmd_pgtable_page(pmd_t *pmd) { diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 0810cf6ec28e..65ab92013542 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -935,7 +935,7 @@ struct mm_struct { #ifdef CONFIG_MMU_NOTIFIER struct mmu_notifier_subscriptions *notifier_subscriptions; #endif -#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS +#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !defined(CONFIG_SPLIT_PMD_PTLOCKS) pgtable_t pmd_huge_pte; /* protected by page_table_lock */ #endif #ifdef CONFIG_NUMA_BALANCING diff --git a/include/linux/mm_types_task.h b/include/linux/mm_types_task.h index aa44fff8bb9d..688e0f8c0943 100644 --- a/include/linux/mm_types_task.h +++ b/include/linux/mm_types_task.h @@ -19,9 +19,6 @@ #include <asm/tlbbatch.h> #endif -#define USE_SPLIT_PTE_PTLOCKS (NR_CPUS >= CONFIG_SPLIT_PTLOCK_CPUS) -#define USE_SPLIT_PMD_PTLOCKS (USE_SPLIT_PTE_PTLOCKS && \ - IS_ENABLED(CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK)) #define ALLOC_SPLIT_PTLOCKS (SPINLOCK_SIZE > BITS_PER_LONG/8) /* diff --git a/kernel/fork.c b/kernel/fork.c index 348cea380ec9..4b71b0e4078c 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -895,7 +895,7 @@ static void check_mm(struct mm_struct *mm) pr_alert("BUG: non-zero pgtables_bytes on freeing mm: %ld\n", mm_pgtables_bytes(mm)); -#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS +#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !defined(CONFIG_SPLIT_PMD_PTLOCKS) VM_BUG_ON_MM(mm->pmd_huge_pte, mm); #endif } @@ -1390,7 +1390,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p, RCU_INIT_POINTER(mm->exe_file, NULL); mmu_notifier_subscriptions_init(mm); init_tlb_flush_pending(mm); -#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS +#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !defined(CONFIG_SPLIT_PMD_PTLOCKS) mm->pmd_huge_pte = NULL; #endif #if defined(CONFIG_DAMON_MEM_SAMPLING) diff --git a/mm/Kconfig b/mm/Kconfig index 98d61de2db7e..88b15c387efd 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -616,17 +616,21 @@ config ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE # at the same time (e.g. copy_page_range()). # DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC spinlock_t also enlarge struct page. # -config SPLIT_PTLOCK_CPUS - int - default "999999" if !MMU - default "999999" if ARM && !CPU_CACHE_VIPT - default "999999" if PARISC && !PA20 - default "999999" if SPARC32 - default "4" +config SPLIT_PTE_PTLOCKS + def_bool y + depends on MMU + depends on NR_CPUS >= 4 + depends on !ARM || CPU_CACHE_VIPT + depends on !PARISC || PA20 + depends on !SPARC32 config ARCH_ENABLE_SPLIT_PMD_PTLOCK bool +config SPLIT_PMD_PTLOCKS + def_bool y + depends on SPLIT_PTE_PTLOCKS && ARCH_ENABLE_SPLIT_PMD_PTLOCK + # # support for memory balloon config MEMORY_BALLOON diff --git a/mm/memory.c b/mm/memory.c index ee5388130f2b..bb0ba6f854c0 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -6971,7 +6971,7 @@ long copy_folio_from_user(struct folio *dst_folio, } #endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_HUGETLBFS */ -#if USE_SPLIT_PTE_PTLOCKS && ALLOC_SPLIT_PTLOCKS +#if defined(CONFIG_SPLIT_PTE_PTLOCKS) && ALLOC_SPLIT_PTLOCKS static struct kmem_cache *page_ptl_cachep; -- 2.53.0
From: David Hildenbrand <david@redhat.com> mainline inclusion from mainline-v6.12-rc1 commit 073ebebd186250038a04be9693240f90f398c4b1 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/15533 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... -------------------------------- Right now, we cannot have split PT locks because 8xx does not support SMP. But for the sake of documentation *why* 8xx is fine regarding what we documented in huge_pte_lockptr(), let's just add code to enforce it at the same time as documenting it. This should also make everybody who wants to copy from the 8xx approach of supporting such unusual ways of mapping hugetlb folios aware that it gets tricky once multiple page tables are involved. Link: https://lkml.kernel.org/r/20240726150728.3159964-4-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Borislav Petkov <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Conflicts: arch/powerpc/mm/pgtable.c [Context conflicts] Signed-off-by: Qi Xi <xiqi2@huawei.com> --- arch/powerpc/mm/pgtable.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c index 79b7b35c4899..c4761792f4d5 100644 --- a/arch/powerpc/mm/pgtable.c +++ b/arch/powerpc/mm/pgtable.c @@ -297,6 +297,10 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma, } #if defined(CONFIG_PPC_8xx) +#if defined(CONFIG_SPLIT_PTE_PTLOCKS) || defined(CONFIG_SPLIT_PMD_PTLOCKS) +/* We need the same lock to protect the PMD table and the two PTE tables. */ +#error "8M hugetlb folios are incompatible with split page table locks" +#endif void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte, unsigned long sz) { -- 2.53.0
From: Guenter Roeck <linux@roeck-us.net> mainline inclusion from mainline-v6.12-rc1 commit a3344078101ceee46d14a93f7e3a3b91a55d215b category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/15533 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... -------------------------------- SPLIT_PTE_PTLOCKS depends on "NR_CPUS >= 4". Unfortunately, that evaluates to true if there is no NR_CPUS configuration option. This results in CONFIG_SPLIT_PTE_PTLOCKS=y for mac_defconfig. This in turn causes the m68k "q800" and "virt" machines to crash in qemu if debugging options are enabled. Making CONFIG_SPLIT_PTE_PTLOCKS dependent on the existence of NR_CPUS does not work since a dependency on the existence of a numeric Kconfig entry always evaluates to false. Example: config HAVE_NO_NR_CPUS def_bool y depends on !NR_CPUS After adding this to a Kconfig file, "make defconfig" includes: $ grep NR_CPUS .config CONFIG_NR_CPUS=64 CONFIG_HAVE_NO_NR_CPUS=y Defining NR_CPUS for m68k does not help either since many architectures define NR_CPUS only for SMP configurations. Make SPLIT_PTE_PTLOCKS depend on SMP instead to solve the problem. Link: https://lkml.kernel.org/r/20240924154205.1491376-1-linux@roeck-us.net Fixes: 394290cba966 ("mm: turn USE_SPLIT_PTE_PTLOCKS / USE_SPLIT_PTE_PTLOCKS into Kconfig options") Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Qi Xi <xiqi2@huawei.com> --- mm/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/mm/Kconfig b/mm/Kconfig index 88b15c387efd..58432d1ebf76 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -619,6 +619,7 @@ config ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE config SPLIT_PTE_PTLOCKS def_bool y depends on MMU + depends on SMP depends on NR_CPUS >= 4 depends on !ARM || CPU_CACHE_VIPT depends on !PARISC || PA20 -- 2.53.0
反馈: 您发送到kernel@openeuler.org的补丁/补丁集,已成功转换为PR! PR链接地址: https://atomgit.com/openeuler/kernel/merge_requests/23700 邮件列表地址:https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/KIH... FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://atomgit.com/openeuler/kernel/merge_requests/23700 Mailing list address: https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/KIH...
participants (2)
-
patchwork bot -
Qi Xi