This series optimizes fork() and unmap/zap with PTE-mapped THP by batching operations on consecutive PTEs that map the same large folio.
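For illustration only, here is a minimal user-space sketch of the batching idea. It is not the kernel implementation: the real helpers (e.g. folio_pte_batch() in mm/memory.c) operate on actual pte_t values and also have to ignore dirty/accessed/soft-dirty (and optionally writable) bits. All names below are invented for this example.

/*
 * Sketch of the core idea: given a run of page-table entries, count how
 * many consecutive entries map consecutive PFNs of the same large folio,
 * so fork()/zap can process them as one batch instead of one PTE at a time.
 */
#include <stdio.h>
#include <stddef.h>

/* Stand-in for a present PTE: just the PFN it maps. */
struct fake_pte {
	unsigned long pfn;
};

/*
 * Count how many of the 'max_nr' entries starting at 'ptep' map PFNs that
 * immediately follow ptep[0], i.e. form one physically contiguous run.
 */
static size_t pte_batch_len(const struct fake_pte *ptep, size_t max_nr)
{
	unsigned long expected = ptep[0].pfn;
	size_t nr = 0;

	while (nr < max_nr && ptep[nr].pfn == expected) {
		nr++;
		expected++;	/* analogous to stepping with pte_next_pfn() */
	}
	return nr;
}

int main(void)
{
	/* Four contiguous pages of one folio, then an unrelated page. */
	struct fake_pte ptes[] = { {100}, {101}, {102}, {103}, {512} };
	size_t nr = pte_batch_len(ptes, sizeof(ptes) / sizeof(ptes[0]));

	/* Prints 4: those entries could be copied or zapped in one batch. */
	printf("batch length: %zu\n", nr);
	return 0;
}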
Catalin Marinas (1):
      arm64: Mark the 'addr' argument to set_ptes() and __set_pte_at() as unused

David Hildenbrand (24):
      arm/pgtable: define PFN_PTE_SHIFT
      nios2/pgtable: define PFN_PTE_SHIFT
      powerpc/pgtable: define PFN_PTE_SHIFT
      riscv/pgtable: define PFN_PTE_SHIFT
      s390/pgtable: define PFN_PTE_SHIFT
      sparc/pgtable: define PFN_PTE_SHIFT
      mm/pgtable: make pte_next_pfn() independent of set_ptes()
      arm/mm: use pte_next_pfn() in set_ptes()
      powerpc/mm: use pte_next_pfn() in set_ptes()
      mm/memory: factor out copying the actual PTE in copy_present_pte()
      mm/memory: pass PTE to copy_present_pte()
      mm/memory: optimize fork() with PTE-mapped THP
      mm/memory: ignore dirty/accessed/soft-dirty bits in folio_pte_batch()
      mm/memory: ignore writable bit in folio_pte_batch()
      mm/memory: factor out zapping of present pte into zap_present_pte()
      mm/memory: handle !page case in zap_present_pte() separately
      mm/memory: further separate anon and pagecache folio handling in zap_present_pte()
      mm/memory: factor out zapping folio pte into zap_present_folio_pte()
      mm/mmu_gather: pass "delay_rmap" instead of encoded page to __tlb_remove_page_size()
      mm/mmu_gather: define ENCODED_PAGE_FLAG_DELAY_RMAP
      mm/mmu_gather: add tlb_remove_tlb_entries()
      mm/mmu_gather: add __tlb_remove_folio_pages()
      mm/mmu_gather: improve cond_resched() handling with large folios and expensive page freeing
      mm/memory: optimize unmap/zap with PTE-mapped THP

Kefeng Wang (7):
      s390: use pfn_swap_entry_folio() in ptep_zap_swap_entry()
      mm: use pfn_swap_entry_folio() in __split_huge_pmd_locked()
      mm: use pfn_swap_entry_to_folio() in zap_huge_pmd()
      mm: use pfn_swap_entry_folio() in copy_nonpresent_pte()
      mm: convert to should_zap_page() to should_zap_folio()
      mm: convert mm_counter() to take a folio
      mm: convert mm_counter_file() to take a folio

Matthew Wilcox (Oracle) (2):
      mm: add pfn_swap_entry_folio()
      mprotect: use pfn_swap_entry_folio

Peter Xu (1):
      mm/memory: fix missing pte marker for !page on pte zaps

Ryan Roberts (2):
      arm64/mm: Hoist synchronization out of set_ptes() loop
      arm64/mm: make set_ptes() robust when OAs cross 48-bit boundary
 arch/arm/include/asm/pgtable.h      |   2 +
 arch/arm/mm/mmu.c                   |   2 +-
 arch/arm64/include/asm/mte.h        |   4 +-
 arch/arm64/include/asm/pgtable.h    |  58 ++--
 arch/arm64/kernel/mte.c             |   4 +-
 arch/nios2/include/asm/pgtable.h    |   2 +
 arch/powerpc/include/asm/pgtable.h  |   2 +
 arch/powerpc/include/asm/tlb.h      |   2 +
 arch/powerpc/mm/pgtable.c           |   5 +-
 arch/riscv/include/asm/pgtable.h    |   2 +
 arch/s390/include/asm/pgtable.h     |   2 +
 arch/s390/include/asm/tlb.h         |  30 +-
 arch/s390/mm/pgtable.c              |   4 +-
 arch/sparc/include/asm/pgtable_64.h |   2 +
 include/asm-generic/tlb.h           |  44 ++-
 include/linux/mm.h                  |  12 +-
 include/linux/mm_types.h            |  37 ++-
 include/linux/pgtable.h             | 103 ++++++-
 include/linux/swapops.h             |  13 +
 kernel/events/uprobes.c             |   2 +-
 mm/filemap.c                        |   2 +-
 mm/huge_memory.c                    |  23 +-
 mm/khugepaged.c                     |   4 +-
 mm/memory.c                         | 421 ++++++++++++++++++++--------
 mm/mmu_gather.c                     | 111 ++++++--
 mm/mprotect.c                       |   4 +-
 mm/rmap.c                           |  10 +-
 mm/swap.c                           |  12 +-
 mm/swap_state.c                     |  15 +-
 mm/userfaultfd.c                    |   2 +-
 30 files changed, 718 insertions(+), 218 deletions(-)