mm: add huge pfnmap support for remap_pfn_range()

Overview
========

This patch series adds huge page support for remap_pfn_range(),
automatically creating huge mappings when the prerequisites are satisfied
(size, alignment, architecture support, etc.) and falling back to normal
page mappings otherwise.

This work builds on Peter Xu's previous efforts on huge pfnmap support [0].

TODO
====

- Add PUD-level huge page support. Currently, only PMD-level huge pages
  are supported.
- Consider the logic related to vmap_page_range() and extract reusable
  common code.

Tests Done
==========

- Cross-build tests.

- Performance tests with a custom device driver implementing mmap() with
  remap_pfn_range():

  - lat_mem_rd benchmark modified to use mmap(device_fd) instead of
    malloc() shows around 40% improvement in memory access latency with
    huge page support compared to normal page mappings.

    numactl -C 0 lat_mem_rd -t 4096M (stride=64)

    Memory Size (MB)  Without Huge Mapping  With Huge Mapping  Improvement
    ----------------  --------------------  -----------------  -----------
               64.00            148.858 ns         100.780 ns        32.3%
              128.00            164.745 ns         103.537 ns        37.2%
              256.00            169.907 ns         103.179 ns        39.3%
              512.00            171.285 ns         103.072 ns        39.8%
             1024.00            173.054 ns         103.055 ns        40.4%
             2048.00            172.820 ns         103.091 ns        40.3%
             4096.00            172.877 ns         103.115 ns        40.4%

  - Custom memory copy operations on mmap(device_fd) show around 18%
    performance improvement with huge page support compared to normal
    page mappings.
    numactl -C 0 memcpy_test (memory copy performance test)

    Memory Size (MB)  Without Huge Mapping  With Huge Mapping  Improvement
    ----------------  --------------------  -----------------  -----------
             1024.00              95.76 ms           77.91 ms        18.6%
             2048.00             190.87 ms          155.64 ms        18.5%
             4096.00             380.84 ms          311.45 ms        18.2%

[0] https://lore.kernel.org/all/20240826204353.2228736-2-peterx@redhat.com/T/#u

David Hildenbrand (1):
  mm/huge_memory: check pmd_special() only after pmd_present()

Peter Xu (11):
  mm: introduce ARCH_SUPPORTS_HUGE_PFNMAP and special bits to pmd/pud
  mm: drop is_huge_zero_pud()
  mm: mark special bits for huge pfn mappings when inject
  mm: allow THP orders for PFNMAPs
  mm/gup: detect huge pfnmap entries in gup-fast
  mm/pagewalk: check pfnmap for folio_walk_start()
  mm/fork: accept huge pfnmap entries
  mm: always define pxx_pgprot()
  mm/x86: support large pfn mappings
  mm/arm64: support large pfn mappings
  arm64: mm: Drop dead code for pud special bit handling

Yin Tirui (2):
  pgtable: add pte_clrhuge() implementation
  mm: introduce remap_pfn_range_try_pmd() for PMD-level hugepage mapping

 arch/arm/include/asm/pgtable-3level.h        |   1 +
 arch/arm64/Kconfig                           |   1 +
 arch/arm64/include/asm/pgtable.h             |  38 ++++++
 arch/loongarch/include/asm/pgtable.h         |   6 +
 arch/mips/include/asm/pgtable.h              |   6 +
 arch/powerpc/include/asm/book3s/32/pgtable.h |   5 +
 arch/powerpc/include/asm/book3s/64/pgtable.h |   7 +-
 arch/powerpc/include/asm/nohash/32/pte-8xx.h |   7 ++
 arch/powerpc/include/asm/nohash/pgtable.h    |   7 ++
 arch/powerpc/include/asm/pgtable.h           |   1 +
 arch/riscv/include/asm/pgtable.h             |   5 +
 arch/s390/include/asm/pgtable.h              |   6 +
 arch/sparc/include/asm/pgtable_64.h          |   6 +
 arch/sw_64/include/asm/pgtable.h             |   6 +
 arch/x86/Kconfig                             |   1 +
 arch/x86/include/asm/pgtable.h               |  80 +++++++-----
 fs/dax.c                                     |   2 +-
 include/linux/huge_mm.h                      |  16 +--
 include/linux/mm.h                           |  26 ++++
 include/linux/pgtable.h                      |  14 ++-
 mm/Kconfig                                   |  13 ++
 mm/gup.c                                     |   6 +
 mm/huge_memory.c                             |  93 ++++++++++----
 mm/memory.c                                  | 126 +++++++++++++++----
 24 files changed, 383 insertions(+), 96
deletions(-)

-- 
2.43.0