mainline inclusion
from mainline-v7.0-rc6
commit 4c5e7f0fcd592801c9cc18f29f80fbee84eb8669
category: bugfix
bugzilla: https://gitcode.com/openeuler/kernel/issues/8836
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...

--------------------------------

On an arm64 server, we found that the folio obtained from a migration
entry is not locked in softleaf_to_folio(). The issue triggers when mTHP
splitting races with zap_nonpresent_ptes(), and the root cause is a
missing memory barrier in softleaf_to_folio(). The race is as follows:

CPU0                                      CPU1
deferred_split_scan()                     zap_nonpresent_ptes()
  lock folio
  split_folio()
    unmap_folio()
      change ptes to migration entries
    __split_folio_to_order()              softleaf_to_folio()
      set flags (including PG_locked)
          for tail pages                    folio = pfn_folio(softleaf_to_pfn(entry))
      smp_wmb()
                                            VM_WARN_ON_ONCE(!folio_test_locked(folio))
      prep_compound_page() for tail pages

In __split_folio_to_order(), smp_wmb() guarantees that the page flags of
the tail pages are visible before a tail page becomes non-compound. That
smp_wmb() should be paired with an smp_rmb() in softleaf_to_folio(),
which is missing. As a result, if zap_nonpresent_ptes() accesses a
migration entry that stores a tail pfn, softleaf_to_folio() may observe
the updated compound_head of the tail page before it observes
page->flags.

To fix it, add the missing smp_rmb() when the softleaf entry is a
migration entry, in both softleaf_to_folio() and softleaf_to_page().
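As an illustration only (not part of the patch), the publish/observe ordering the fix relies on can be sketched in userspace C11, with release/acquire standing in for smp_wmb()/smp_rmb(). All names here are hypothetical stand-ins: `flags` models page->flags (PG_locked), `published` models the tail page becoming observable as a new head page.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a tail page's flags word. */
struct tail_page {
	unsigned long flags;		/* written before publication */
};

static struct tail_page tail;
static atomic_int published;		/* models the compound_head update */

/* Writer side, analogous to __split_folio_to_order(). */
static void split_publish(void)
{
	tail.flags = 1;			/* set the PG_locked-like flag first */
	/*
	 * smp_wmb() analog: the release store guarantees the flags write
	 * above is visible to any reader that sees published == 1.
	 */
	atomic_store_explicit(&published, 1, memory_order_release);
}

/* Reader side, analogous to the added smp_rmb() in the lookup helper. */
static int observe_locked(void)
{
	/*
	 * smp_rmb() analog: the acquire load pairs with the release
	 * above; seeing published == 1 implies seeing flags == 1.
	 * Without the acquire, this check could observe flags == 0.
	 */
	if (atomic_load_explicit(&published, memory_order_acquire))
		return tail.flags == 1;
	return -1;			/* not yet published */
}
```

On a single thread the ordering is trivially satisfied; the pairing only matters when writer and reader run on different CPUs, which is exactly the split vs. zap race above.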
[tujinjiang@huawei.com: update function name and comments]
Link: https://lkml.kernel.org/r/20260321075214.3305564-1-tujinjiang@huawei.com
Link: https://lkml.kernel.org/r/20260319012541.4158561-1-tujinjiang@huawei.com
Fixes: e9b61f19858a ("thp: reintroduce split_huge_page()")
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Barry Song <baohua@kernel.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nanyong Sun <sunnanyong@huawei.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Conflicts:
	include/linux/leafops.h
	include/linux/swapops.h
	mm/filemap.c
[migration entry hasn't been renamed to softleaf entry. Add new helper
migration_entry_to_compound_page().]
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
---
 include/linux/swapops.h | 31 ++++++++++++++++++++++++++++---
 mm/filemap.c            |  2 +-
 2 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index e749c4c86b26..ed33367fb6a6 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -194,17 +194,42 @@ static inline unsigned long migration_entry_to_pfn(swp_entry_t entry)
 	return swp_offset(entry);
 }
 
-static inline struct page *migration_entry_to_page(swp_entry_t entry)
+static inline void migration_entry_sync_page(struct page *head)
 {
-	struct page *p = pfn_to_page(swp_offset(entry));
+	/*
+	 * Ensure we do not race with split, which might alter tail pages into new
+	 * head pages and thus result in observing an unlocked page.
+	 * This matches the write barrier in __split_huge_page_tail().
+	 */
+	smp_rmb();
+
 	/*
 	 * Any use of migration entries may only occur while the
 	 * corresponding page is locked
 	 */
-	BUG_ON(!PageLocked(compound_head(p)));
+	BUG_ON(!PageLocked(head));
+}
+
+static inline struct page *migration_entry_to_page(swp_entry_t entry)
+{
+	struct page *p = pfn_to_page(swp_offset(entry));
+
+	migration_entry_sync_page(compound_head(p));
+
 	return p;
 }
 
+static inline struct page *migration_entry_to_compound_page(swp_entry_t entry)
+{
+	struct page *p = pfn_to_page(swp_offset(entry));
+	struct page *head;
+
+	head = compound_head(p);
+	migration_entry_sync_page(head);
+
+	return head;
+}
+
 static inline void make_migration_entry_read(swp_entry_t *entry)
 {
 	*entry = swp_entry(SWP_MIGRATION_READ, swp_offset(*entry));
diff --git a/mm/filemap.c b/mm/filemap.c
index 18e304ce6229..c2932db70212 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1334,7 +1334,7 @@ void migration_entry_wait_on_locked(swp_entry_t entry, pte_t *ptep,
 	bool delayacct = false;
 	unsigned long pflags = 0;
 	wait_queue_head_t *q;
-	struct page *page = compound_head(migration_entry_to_page(entry));
+	struct page *page = migration_entry_to_compound_page(entry);
 
 	q = page_waitqueue(page);
 	if (!PageUptodate(page) && PageWorkingset(page)) {
-- 
2.43.0