From: Yang Shi <yang@os.amperecomputing.com>
mainline inclusion
from mainline-v6.8-rc3
commit c4608d1bf7c6536d1a3d233eb21e50678681564e
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I9H84X
CVE: NA
-------------------------------------------------
commit efa7df3e3bb5 ("mm: align larger anonymous mappings on THP boundaries") incurred a regression for the stress-ng pthread benchmark [1]. This is because THPs are now much more likely to be allocated for a pthread's stack area than before. A pthread's stack area is allocated by mmap without the VM_GROWSDOWN or VM_GROWSUP flag, so the kernel can't tell whether it is a stack area or not.

The MAP_STACK flag is used to mark the stack area, but it is a no-op on Linux. Map MAP_STACK to VM_NOHUGEPAGE to prevent THP from being allocated for such stack areas.
With this change the stack area looks like:

fffd18e10000-fffd19610000 rw-p 00000000 00:00 0
Size:               8192 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Rss:                  12 kB
Pss:                  12 kB
Pss_Dirty:            12 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:        12 kB
Referenced:           12 kB
Anonymous:            12 kB
KSM:                   0 kB
LazyFree:              0 kB
AnonHugePages:         0 kB
ShmemPmdMapped:        0 kB
FilePmdMapped:         0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
Locked:                0 kB
THPeligible:           0
VmFlags: rd wr mr mw me ac nh
The "nh" flag is set.
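
The check above can be reproduced from userspace. The following is a minimal sketch, not part of the patch: map_stack_region() and vma_has_vmflag() are hypothetical helper names, and the code assumes /proc/self/smaps is readable and reports a VmFlags line. It maps an anonymous region with MAP_STACK, much as a thread library would for a thread stack, and looks for a given flag in the VmFlags line of the VMA that contains the mapping.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map len bytes the way a thread stack is
 * typically mapped. Returns MAP_FAILED on error.
 */
void *map_stack_region(size_t len)
{
	return mmap(NULL, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
}

/*
 * Hypothetical helper: returns 1 if the VMA containing addr lists
 * flag in its VmFlags line, 0 if it does not, and -1 if
 * /proc/self/smaps cannot be read or no matching VmFlags line is found.
 */
int vma_has_vmflag(const void *addr, const char *flag)
{
	FILE *f = fopen("/proc/self/smaps", "r");
	char line[1024];
	unsigned long start, end, target = (unsigned long)addr;
	int in_vma = 0, result = -1;

	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f)) {
		/* A VMA header line starts with "start-end" in hex. */
		if (sscanf(line, "%lx-%lx", &start, &end) == 2) {
			in_vma = (target >= start && target < end);
			continue;
		}
		if (in_vma && strncmp(line, "VmFlags:", 8) == 0) {
			char *tok = strtok(line + 8, " \t\n");

			result = 0;
			for (; tok; tok = strtok(NULL, " \t\n")) {
				if (strcmp(tok, flag) == 0) {
					result = 1;
					break;
				}
			}
			break;
		}
	}
	fclose(f);
	return result;
}
```

Calling vma_has_vmflag(map_stack_region(8UL << 20), "nh") is expected to return 1 on a kernel that carries this change and 0 on one that does not; the result depends on the running kernel, so the sketch makes no claim about any particular system.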
[1] https://lore.kernel.org/linux-mm/202312192310.56367035-oliver.sang@intel.com...
Link: https://lkml.kernel.org/r/20231221065943.2803551-2-shy828301@gmail.com
Fixes: efa7df3e3bb5 ("mm: align larger anonymous mappings on THP boundaries")
Signed-off-by: Yang Shi <yang@os.amperecomputing.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Tested-by: Oliver Sang <oliver.sang@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christopher Lameter <cl@linux.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit c4608d1bf7c6536d1a3d233eb21e50678681564e)
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 include/linux/mman.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/mman.h b/include/linux/mman.h
index 40d94411d492..dc7048824be8 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -156,6 +156,7 @@ calc_vm_flag_bits(unsigned long flags)
 	return _calc_vm_trans(flags, MAP_GROWSDOWN,  VM_GROWSDOWN ) |
 	       _calc_vm_trans(flags, MAP_LOCKED,     VM_LOCKED    ) |
 	       _calc_vm_trans(flags, MAP_SYNC,       VM_SYNC      ) |
+	       _calc_vm_trans(flags, MAP_STACK,      VM_NOHUGEPAGE) |
 	       arch_calc_vm_flag_bits(flags);
 }
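
For readers unfamiliar with the flag translation, the effect of the added line can be illustrated in userspace. This is only a sketch under stated assumptions: the DEMO_* constants are illustrative values assumed from common architectures, demo_vm_trans() and demo_calc_vm_flag_bits() are hypothetical names, and a plain conditional stands in for the kernel's _calc_vm_trans macro, mimicking only its observable result for single-bit flags.

```c
/* Illustrative bit values, assumed from common architectures. */
#define DEMO_MAP_GROWSDOWN 0x00000100UL
#define DEMO_MAP_STACK     0x00020000UL
#define DEMO_VM_GROWSDOWN  0x00000100UL
#define DEMO_VM_NOHUGEPAGE 0x40000000UL

/*
 * Hypothetical stand-in for _calc_vm_trans: if the single bit `from`
 * is set in x, yield the single bit `to`; otherwise yield 0.
 */
unsigned long demo_vm_trans(unsigned long x, unsigned long from,
			    unsigned long to)
{
	return (x & from) ? to : 0;
}

/* Translate mmap-style flags into VMA-style flags (demo subset only). */
unsigned long demo_calc_vm_flag_bits(unsigned long flags)
{
	return demo_vm_trans(flags, DEMO_MAP_GROWSDOWN, DEMO_VM_GROWSDOWN) |
	       demo_vm_trans(flags, DEMO_MAP_STACK, DEMO_VM_NOHUGEPAGE);
}
```

With the MAP_STACK entry present, any mapping request carrying MAP_STACK yields VM_NOHUGEPAGE in the resulting flags, which is what makes the "nh" flag appear in smaps.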