[PATCH OLK-6.6] mempolicy: optimize queue_folios_pte_range by PTE batching

From: Dev Jain <dev.jain@arm.com>

mainline inclusion
from mainline-v6.16-rc1
commit 4a34c584d8cd13d2b721d21cf629f77c60bfb4a4
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/ICF4UI
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...

-------------------------------------------

After the check for queue_folio_required(), the code only cares about the
folio in the for loop, i.e. the PTEs are redundant. Therefore, optimize
this loop by skipping over a PTE batch mapping the same folio.

With a test program migrating pages of the calling process, which includes
a mapped VMA of size 4GB with pte-mapped large folios of order-9, and
migrating once back and forth between node-0 and node-1, the average
execution time reduces from 7.5 to 4 seconds, giving an approx 47% speedup.

Link: https://lkml.kernel.org/r/20250416053048.96479-1-dev.jain@arm.com
Signed-off-by: Dev Jain <dev.jain@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Yang Shi <yang@os.amperecomputing.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Conflicts:
	mm/mempolicy.c
[qp->nr_failed is introduced in 1cb5d11a370f, which is not merged.]
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
---
 mm/mempolicy.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 88f0bb008efd..1587efaf777e 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -508,6 +508,7 @@ static int queue_folios_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
 static int queue_folios_pte_range(pmd_t *pmd, unsigned long addr,
 			unsigned long end, struct mm_walk *walk)
 {
+	const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
 	struct vm_area_struct *vma = walk->vma;
 	struct folio *folio;
 	struct queue_pages *qp = walk->private;
@@ -515,6 +516,7 @@ static int queue_folios_pte_range(pmd_t *pmd, unsigned long addr,
 	pte_t *pte, *mapped_pte;
 	pte_t ptent;
 	spinlock_t *ptl;
+	int max_nr, nr;
 
 	ptl = pmd_trans_huge_lock(pmd, vma);
 	if (ptl)
@@ -525,13 +527,19 @@ static int queue_folios_pte_range(pmd_t *pmd, unsigned long addr,
 		walk->action = ACTION_AGAIN;
 		return 0;
 	}
-	for (; addr != end; pte++, addr += PAGE_SIZE) {
+	for (; addr != end; pte += nr, addr += nr * PAGE_SIZE) {
+		max_nr = (end - addr) >> PAGE_SHIFT;
+		nr = 1;
 		ptent = ptep_get(pte);
 		if (!pte_present(ptent))
 			continue;
 		folio = vm_normal_folio(vma, addr, ptent);
 		if (!folio || folio_is_zone_device(folio))
 			continue;
+		if (folio_test_large(folio) && max_nr != 1)
+			nr = folio_pte_batch(folio, addr, pte, ptent,
+					     max_nr, fpb_flags,
+					     NULL, NULL, NULL);
 		/*
 		 * vm_normal_folio() filters out zero pages, but there might
 		 * still be reserved folios to skip, perhaps in a VDSO.
-- 
2.25.1
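[Editor's illustration, not part of the patch.] The diff above changes the walker so that once one PTE of a large folio has been inspected, the loop advances past the whole batch of PTEs mapping that same folio instead of re-examining each one. Below is a minimal, hedged user-space C sketch of that loop transformation only; it is not kernel code, and the names (folio_of, entry_batch, process_folio, NENTRIES) are invented for the example rather than taken from mm/mempolicy.c or the real folio_pte_batch() signature.

/*
 * User-space analogue of the batched walk: each array slot stands in for
 * one 4 KiB PTE, and its value identifies the backing "folio". Consecutive
 * slots with the same value are handled as one batch, and the loop index
 * jumps ahead by the batch size, mirroring "pte += nr, addr += nr * PAGE_SIZE".
 */
#include <stdio.h>

#define NENTRIES 16

/* Pretend page table: the value in each slot names the folio it maps. */
static const int folio_of[NENTRIES] = {
	0, 0, 0, 0,		/* order-2 folio mapped by 4 consecutive entries */
	1,			/* small folio */
	2, 2, 2, 2, 2, 2, 2, 2,	/* order-3 folio mapped by 8 consecutive entries */
	3, 4, 5,		/* small folios */
};

/* Count how many consecutive entries, starting at i, map the same folio. */
static int entry_batch(int i, int max_nr)
{
	int nr = 1;

	while (nr < max_nr && folio_of[i + nr] == folio_of[i])
		nr++;
	return nr;
}

/* Stand-in for the per-folio work done inside the walker loop. */
static void process_folio(int folio, int nr)
{
	printf("folio %d handled once, covering %d entries\n", folio, nr);
}

int main(void)
{
	int nr;

	/* Mirrors the patched loop: advance by nr instead of by 1. */
	for (int i = 0; i < NENTRIES; i += nr) {
		nr = entry_batch(i, NENTRIES - i);
		process_folio(folio_of[i], nr);
	}
	return 0;
}

Built with a plain C compiler, the sketch prints one line per folio rather than one per entry, which is the effect the patch has on queue_folios_pte_range(): the per-folio work (the queue_folio_required() handling) runs once per large folio instead of once per constituent PTE.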

FeedBack: The patch(es) which you have sent to the kernel@openeuler.org mailing list have been converted to a pull request successfully!
Pull request link: https://gitee.com/openeuler/kernel/pulls/16709
Mailing list address: https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/SLO...
participants (2)
- patchwork bot
- Tong Tiangen