This patchset implements replication of userspace translation tables and
private read-only data for AArch64. It is intended to improve latency and
memory bandwidth by reducing cross-NUMA memory accesses. openEuler 25.03 is
used as the baseline.

The current implementation supports the following functionality:

1. Per-NUMA node replication of userspace translation tables and private
   read-only data. We replicate only __private read-only__ data to avoid
   having to maintain coherence and consistency between replicas.
   Translation tables, in turn, can be replicated for any kind of
   underlying data.

2. Userspace replication can be enabled for a particular process via procfs
   or for a group of processes via the memory cgroup.

3. 4K and 64K pages are supported.

4. Several table replication policies are supported:
   0 - table replication disabled;
   1 - table replication enabled only for data that might be replicated,
       e.g. private read-only data;
   2 - table replication enabled for any data.
   Tables are replicated during page faults.

   Several data replication policies are supported as well:
   0 - data replication disabled;
   1 - eligible data is replicated on NUMA faults via the NUMA balancer;
   2 - eligible data that has already been faulted in is replicated
       immediately; further data is replicated on NUMA faults. There is no
       difference between policies 1 and 2 if the policy is enabled at the
       start of a process;
   3 - eligible data (with the related page tables) is replicated
       immediately after being populated, whether by page faults or by
       other means.
   Data replication may be enabled (!= 0) only when table replication is
   enabled (!= 0).

5. Replicated data pages cannot be KSM, migration or swap/reclaim
   candidates by design. For all other pages these mechanisms continue to
   work with replicated translation tables.

Known problems:

1. The current implementation doesn't support huge pages, so the kernel has
   to be built with huge pages disabled for user replication to work.
   Huge page support will be added in the near future.

2. The mremap syscall doesn't work with replicated memory yet.

3. page_idle, uprobes and userfaultfd support replicated translation
   tables, but not replicated data. Use these features with care when
   userspace replication is enabled.

4. When translation tables are replicated during page faults, there must be
   enough free memory on __each__ NUMA node for the table allocations.
   Otherwise the OOM killer will be triggered.

The problems above are mostly unrelated to the workloads expected to benefit
from the user replication feature, and such workloads will work properly
with the feature enabled.

Gadeev Dmitry (1):
  mm: Support NUMA-aware replication of read-only data and translation tables of user space applications with different policies

 arch/arm64/include/asm/numa_replication.h |    3 +
 arch/arm64/mm/init.c                      |    2 +-
 arch/arm64/mm/pgd.c                       |   13 +-
 fs/exec.c                                 |    7 +
 fs/proc/base.c                            |  149 ++
 fs/proc/task_mmu.c                        |  112 +-
 include/asm-generic/pgalloc.h             |   19 +-
 include/asm-generic/tlb.h                 |   22 +
 include/linux/cgroup.h                    |    1 +
 include/linux/gfp_types.h                 |   12 +-
 include/linux/memcontrol.h                |    4 +
 include/linux/mm.h                        |   77 +-
 include/linux/mm_inline.h                 |    5 +
 include/linux/mm_types.h                  |   70 +-
 include/linux/numa_kernel_replication.h   |  237 ++-
 include/linux/numa_user_replication.h     |  827 ++++++++
 include/linux/page-flags.h                |   18 +-
 include/trace/events/mmflags.h            |   10 +-
 include/uapi/asm-generic/mman-common.h    |    3 +
 kernel/cgroup/cgroup.c                    |    2 +-
 kernel/events/uprobes.c                   |    5 +-
 kernel/fork.c                             |   18 +
 kernel/sched/fair.c                       |    8 +-
 mm/Kconfig                                |   13 +
 mm/Makefile                               |    1 +
 mm/gup.c                                  |   15 +-
 mm/ksm.c                                  |   15 +-
 mm/madvise.c                              |   19 +-
 mm/memcontrol.c                           |  181 +-
 mm/memory.c                               |  566 +++++-
 mm/mempolicy.c                            |    5 +
 mm/migrate.c                              |   11 +-
 mm/migrate_device.c                       |   17 +-
 mm/mlock.c                                |   22 +
 mm/mmap.c                                 |   26 +
 mm/mmu_gather.c                           |   55 +-
 mm/mprotect.c                             |  424 ++--
 mm/mremap.c                               |   97 +-
 mm/numa_kernel_replication.c              |    5 +-
 mm/numa_user_replication.c                | 2242 +++++++++++++++++++++
 mm/page_alloc.c                           |    8 +-
 mm/page_idle.c                            |    3 +-
 mm/page_vma_mapped.c                      |    3 +-
 mm/rmap.c                                 |   41 +-
 mm/swap.c                                 |    7 +-
 mm/swapfile.c                             |    3 +-
 mm/userfaultfd.c                          |    7 +-
 mm/userswap.c                             |   11 +-
 48 files changed, 4979 insertions(+), 442 deletions(-)
 create mode 100644 include/linux/numa_user_replication.h
 create mode 100644 mm/numa_user_replication.c

--
2.53.0
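
For illustration only, the rule from the policy list in this letter (data
replication may be non-zero only when table replication is non-zero) could be
checked roughly as sketched below. The enum and function names are
hypothetical and are not taken from the patch; only the numeric policy values
come from the description in this letter.

```c
#include <stdbool.h>

/* Hypothetical names; the numeric values mirror the table policy list. */
enum table_repl_policy {
	TABLE_REPL_NONE    = 0,	/* table replication disabled */
	TABLE_REPL_MINIMAL = 1,	/* only for data that might be replicated */
	TABLE_REPL_ALL     = 2,	/* for any data */
};

/* Hypothetical names; the numeric values mirror the data policy list. */
enum data_repl_policy {
	DATA_REPL_NONE        = 0,	/* data replication disabled */
	DATA_REPL_ON_FAULT    = 1,	/* replicate on NUMA faults */
	DATA_REPL_MAPPED_NOW  = 2,	/* replicate faulted-in data now, rest on NUMA faults */
	DATA_REPL_ON_POPULATE = 3,	/* replicate right after population */
};

/*
 * Return true if the pair of policies is consistent with the rule that
 * data replication requires table replication to be enabled.
 */
static bool repl_policy_pair_valid(int table_policy, int data_policy)
{
	if (table_policy < TABLE_REPL_NONE || table_policy > TABLE_REPL_ALL)
		return false;
	if (data_policy < DATA_REPL_NONE || data_policy > DATA_REPL_ON_POPULATE)
		return false;
	/* Data replication (!= 0) needs table replication (!= 0). */
	if (data_policy != DATA_REPL_NONE && table_policy == TABLE_REPL_NONE)
		return false;
	return true;
}
```

Under this sketch, a (table, data) pair such as (1, 3) is accepted, while
(0, 1) is rejected because data replication is requested without table
replication.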