[PATCH OLK-6.6 0/4] Fixes CVE-2026-23100
Fixes CVE-2026-23100. David Hildenbrand (Red Hat) (3): mm/hugetlb: fix hugetlb_pmd_shared() mm/hugetlb: fix two comments related to huge_pmd_unshare() mm/rmap: fix two comments related to huge_pmd_unshare() Jane Chu (1): mm/hugetlb: fix copy_hugetlb_page_range() to use ->pt_share_count include/linux/hugetlb.h | 2 +- include/linux/mm_types.h | 5 +++++ mm/hugetlb.c | 39 +++++++++++++-------------------------- mm/rmap.c | 20 ++++---------------- 4 files changed, 23 insertions(+), 43 deletions(-) -- 2.25.1
From: Jane Chu <jane.chu@oracle.com> stable inclusion from stable-v6.6.127 commit 8c9a1b0710510c3cc25526c815ca798705cb1936 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13632 CVE: CVE-2026-23100 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- commit 14967a9c7d247841b0312c48dcf8cd29e55a4cc8 upstream. commit 59d9094df3d79 ("mm: hugetlb: independent PMD page table shared count") introduced ->pt_share_count dedicated to hugetlb PMD share count tracking, but omitted fixing copy_hugetlb_page_range(), leaving the function relying on page_count() for tracking that no longer works. When lazy page table copy for hugetlb is disabled, that is, revert commit bcd51a3c679d ("hugetlb: lazy page table copies in fork()") fork()'ing with hugetlb PMD sharing quickly lockup - [ 239.446559] watchdog: BUG: soft lockup - CPU#75 stuck for 27s! [ 239.446611] RIP: 0010:native_queued_spin_lock_slowpath+0x7e/0x2e0 [ 239.446631] Call Trace: [ 239.446633] <TASK> [ 239.446636] _raw_spin_lock+0x3f/0x60 [ 239.446639] copy_hugetlb_page_range+0x258/0xb50 [ 239.446645] copy_page_range+0x22b/0x2c0 [ 239.446651] dup_mmap+0x3e2/0x770 [ 239.446654] dup_mm.constprop.0+0x5e/0x230 [ 239.446657] copy_process+0xd17/0x1760 [ 239.446660] kernel_clone+0xc0/0x3e0 [ 239.446661] __do_sys_clone+0x65/0xa0 [ 239.446664] do_syscall_64+0x82/0x930 [ 239.446668] ? count_memcg_events+0xd2/0x190 [ 239.446671] ? syscall_trace_enter+0x14e/0x1f0 [ 239.446676] ? syscall_exit_work+0x118/0x150 [ 239.446677] ? arch_exit_to_user_mode_prepare.constprop.0+0x9/0xb0 [ 239.446681] ? clear_bhb_loop+0x30/0x80 [ 239.446684] ? clear_bhb_loop+0x30/0x80 [ 239.446686] entry_SYSCALL_64_after_hwframe+0x76/0x7e There are two options to resolve the potential latent issue: 1. warn against PMD sharing in copy_hugetlb_page_range(), 2. fix it. This patch opts for the second option. While at it, simplify the comment, the details are not actually relevant anymore. Link: https://lkml.kernel.org/r/20250916004520.1604530-1-jane.chu@oracle.com Fixes: 59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count") Signed-off-by: Jane Chu <jane.chu@oracle.com> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Acked-by: Oscar Salvador <osalvador@suse.de> Acked-by: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Liu Shixin <liushixin2@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Hildenbrand (Arm) <david@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ze Zuo <zuoze1@huawei.com> --- include/linux/mm_types.h | 5 +++++ mm/hugetlb.c | 15 +++++---------- 2 files changed, 10 insertions(+), 10 deletions(-) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index b25cc93103c3..0810cf6ec28e 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -534,6 +534,11 @@ static inline int ptdesc_pmd_pts_count(struct ptdesc *ptdesc) { return atomic_read(&ptdesc->pt_share_count); } + +static inline bool ptdesc_pmd_is_shared(struct ptdesc *ptdesc) +{ + return !!ptdesc_pmd_pts_count(ptdesc); +} #else static inline void ptdesc_pmd_pts_init(struct ptdesc *ptdesc) { diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 1a98b599c0b0..0e07b61f9277 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5510,18 +5510,13 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, break; } - /* - * If the pagetables are shared don't copy or take references. - * - * dst_pte == src_pte is the common case of src/dest sharing. - * However, src could have 'unshared' and dst shares with - * another vma. So page_count of ptep page is checked instead - * to reliably determine whether pte is shared. - */ - if (page_count(virt_to_page(dst_pte)) > 1) { +#ifdef CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING + /* If the pagetables are shared, there is nothing to do */ + if (ptdesc_pmd_is_shared(virt_to_ptdesc(dst_pte))) { addr |= last_addr_mask; continue; } +#endif dst_ptl = huge_pte_lock(h, dst, dst_pte); src_ptl = huge_pte_lockptr(h, src, src_pte); @@ -7541,7 +7536,7 @@ int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma, hugetlb_vma_assert_locked(vma); if (sz != PMD_SIZE) return 0; - if (!ptdesc_pmd_pts_count(virt_to_ptdesc(ptep))) + if (!ptdesc_pmd_is_shared(virt_to_ptdesc(ptep))) return 0; pud_clear(pud); -- 2.25.1
From: "David Hildenbrand (Red Hat)" <david@kernel.org> stable inclusion from stable-v6.6.127 commit 51dcf459845fd28f5a0d83d408a379b274ec5cc5 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13632 CVE: CVE-2026-23100 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... -------------------------------- commit ca1a47cd3f5f4c46ca188b1c9a27af87d1ab2216 upstream. Patch series "mm/hugetlb: fixes for PMD table sharing (incl. using mmu_gather)", v3. One functional fix, one performance regression fix, and two related comment fixes. I cleaned up my prototype I recently shared [1] for the performance fix, deferring most of the cleanups I had in the prototype to a later point. While doing that I identified the other things. The goal of this patch set is to be backported to stable trees "fairly" easily. At least patch #1 and #4. Patch #1 fixes hugetlb_pmd_shared() not detecting any sharing Patch #2 + #3 are simple comment fixes that patch #4 interacts with. Patch #4 is a fix for the reported performance regression due to excessive IPI broadcasts during fork()+exit(). The last patch is all about TLB flushes, IPIs and mmu_gather. Read: complicated There are plenty of cleanups in the future to be had + one reasonable optimization on x86. But that's all out of scope for this series. Runtime tested, with a focus on fixing the performance regression using the original reproducer [2] on x86. This patch (of 4): We switched from (wrongly) using the page count to an independent shared count. Now, shared page tables have a refcount of 1 (excluding speculative references) and instead use ptdesc->pt_share_count to identify sharing. We didn't convert hugetlb_pmd_shared(), so right now, we would never detect a shared PMD table as such, because sharing/unsharing no longer touches the refcount of a PMD table. Page migration, like mbind() or migrate_pages() would allow for migrating folios mapped into such shared PMD tables, even though the folios are not exclusive. In smaps we would account them as "private" although they are "shared", and we would be wrongly setting the PM_MMAP_EXCLUSIVE in the pagemap interface. Fix it by properly using ptdesc_pmd_is_shared() in hugetlb_pmd_shared(). Link: https://lkml.kernel.org/r/20251223214037.580860-1-david@kernel.org Link: https://lkml.kernel.org/r/20251223214037.580860-2-david@kernel.org Link: https://lore.kernel.org/all/8cab934d-4a56-44aa-b641-bfd7e23bd673@kernel.org/ [1] Link: https://lore.kernel.org/all/8cab934d-4a56-44aa-b641-bfd7e23bd673@kernel.org/ [2] Fixes: 59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count") Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Rik van Riel <riel@surriel.com> Reviewed-by: Lance Yang <lance.yang@linux.dev> Tested-by: Lance Yang <lance.yang@linux.dev> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Oscar Salvador <osalvador@suse.de> Cc: Liu Shixin <liushixin2@huawei.com> Cc: "Uschakow, Stanislav" <suschako@amazon.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Hildenbrand (Arm) <david@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ze Zuo <zuoze1@huawei.com> --- include/linux/hugetlb.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 230fd4b2fedf..e4e243bad311 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -1339,7 +1339,7 @@ static inline __init void hugetlb_cma_reserve(int order) #ifdef CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING static inline bool hugetlb_pmd_shared(pte_t *pte) { - return page_count(virt_to_page(pte)) > 1; + return ptdesc_pmd_is_shared(virt_to_ptdesc(pte)); } #else static inline bool hugetlb_pmd_shared(pte_t *pte) -- 2.25.1
From: "David Hildenbrand (Red Hat)" <david@kernel.org> mainline inclusion from mainline-v6.19-rc7 commit 3937027caecb4f8251e82dd857ba1d749bb5a428 category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13632 CVE: CVE-2026-23100 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... -------------------------------- Ever since we stopped using the page count to detect shared PMD page tables, these comments are outdated. The only reason we have to flush the TLB early is because once we drop the i_mmap_rwsem, the previously shared page table could get freed (to then get reallocated and used for other purpose). So we really have to flush the TLB before that could happen. So let's simplify the comments a bit. The "If we unshared PMDs, the TLB flush was not recorded in mmu_gather." part introduced as in commit a4a118f2eead ("hugetlbfs: flush TLBs correctly after huge_pmd_unshare") was confusing: sure it is recorded in the mmu_gather, otherwise tlb_flush_mmu_tlbonly() wouldn't do anything. So let's drop that comment while at it as well. We'll centralize these comments in a single helper as we rework the code next. Link: https://lkml.kernel.org/r/20251223214037.580860-3-david@kernel.org Fixes: 59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count") Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Rik van Riel <riel@surriel.com> Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Cc: Liu Shixin <liushixin2@huawei.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: "Uschakow, Stanislav" <suschako@amazon.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Conflicts: mm/hugetlb.c [Conficts due to context.] Signed-off-by: Ze Zuo <zuoze1@huawei.com> --- mm/hugetlb.c | 24 ++++++++---------------- 1 file changed, 8 insertions(+), 16 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 0e07b61f9277..4ab735e18efa 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5861,17 +5861,10 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, tlb_end_vma(tlb, vma); /* - * If we unshared PMDs, the TLB flush was not recorded in mmu_gather. We - * could defer the flush until now, since by holding i_mmap_rwsem we - * guaranteed that the last refernece would not be dropped. But we must - * do the flushing before we return, as otherwise i_mmap_rwsem will be - * dropped and the last reference to the shared PMDs page might be - * dropped as well. - * - * In theory we could defer the freeing of the PMD pages as well, but - * huge_pmd_unshare() relies on the exact page_count for the PMD page to - * detect sharing, so we cannot defer the release of the page either. - * Instead, do flush now. + * There is nothing protecting a previously-shared page table that we + * unshared through huge_pmd_unshare() from getting freed after we + * release i_mmap_rwsem, so flush the TLB now. If huge_pmd_unshare() + * succeeded, flush the range corresponding to the pud. */ if (force_flush) tlb_flush_mmu_tlbonly(tlb); @@ -7157,11 +7150,10 @@ long hugetlb_change_protection(struct vm_area_struct *vma, cond_resched(); } /* - * Must flush TLB before releasing i_mmap_rwsem: x86's huge_pmd_unshare - * may have cleared our pud entry and done put_page on the page table: - * once we release i_mmap_rwsem, another task can do the final put_page - * and that page table be reused and filled with junk. If we actually - * did unshare a page of pmds, flush the range corresponding to the pud. + * There is nothing protecting a previously-shared page table that we + * unshared through huge_pmd_unshare() from getting freed after we + * release i_mmap_rwsem, so flush the TLB now. If huge_pmd_unshare() + * succeeded, flush the range corresponding to the pud. */ if (shared_pmd) flush_hugetlb_tlb_range(vma, range.start, range.end); -- 2.25.1
From: "David Hildenbrand (Red Hat)" <david@kernel.org> mainline inclusion from mainline-v6.19-rc7 commit a8682d500f691b6dfaa16ae1502d990aeb86e8be category: bugfix bugzilla: https://atomgit.com/src-openeuler/kernel/issues/13632 CVE: CVE-2026-23100 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... -------------------------------- PMD page table unsharing no longer touches the refcount of a PMD page table. Also, it is not about dropping the refcount of a "PMD page" but the "PMD page table". Let's just simplify by saying that the PMD page table was unmapped, consequently also unmapping the folio that was mapped into this page. This code should be deduplicated in the future. Link: https://lkml.kernel.org/r/20251223214037.580860-4-david@kernel.org Fixes: 59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count") Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Rik van Riel <riel@surriel.com> Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Oscar Salvador <osalvador@suse.de> Cc: Liu Shixin <liushixin2@huawei.com> Cc: Harry Yoo <harry.yoo@oracle.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: "Uschakow, Stanislav" <suschako@amazon.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ze Zuo <zuoze1@huawei.com> --- mm/rmap.c | 20 ++++---------------- 1 file changed, 4 insertions(+), 16 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index 2fea90f048af..5949fe733818 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1754,14 +1754,8 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, flush_tlb_range(vma, range.start, range.end); /* - * The ref count of the PMD page was - * dropped which is part of the way map - * counting is done for shared PMDs. - * Return 'true' here. When there is - * no other sharing, huge_pmd_unshare - * returns false and we will unmap the - * actual page and drop map count - * to zero. + * The PMD table was unmapped, + * consequently unmapping the folio. */ goto walk_done; } @@ -2138,14 +2132,8 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma, range.start, range.end); /* - * The ref count of the PMD page was - * dropped which is part of the way map - * counting is done for shared PMDs. - * Return 'true' here. When there is - * no other sharing, huge_pmd_unshare - * returns false and we will unmap the - * actual page and drop map count - * to zero. + * The PMD table was unmapped, + * consequently unmapping the folio. */ page_vma_mapped_walk_done(&pvmw); break; -- 2.25.1
反馈: 您发送到kernel@openeuler.org的补丁/补丁集,已成功转换为PR! PR链接地址: https://atomgit.com/openeuler/kernel/merge_requests/21070 邮件列表地址:https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/5WW... FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://atomgit.com/openeuler/kernel/merge_requests/21070 Mailing list address: https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/5WW...
participants (2)
-
patchwork bot -
Ze Zuo