From: Ma Wupeng <mawupeng1@huawei.com>
Backport patches for the THP deferred list.
Jianxing Wang (1):
  mm/mmu_gather: limit free batch count and add schedule point in tlb_batch_pages_flush

Kirill A. Shutemov (1):
  mm, thp: do not queue fully unmapped pages for deferred split

Yin Fengwei (1):
  THP: avoid lock when check whether THP is in deferred list
 mm/huge_memory.c | 17 ++++++++++++-----
 mm/mmu_gather.c  | 16 ++++++++++++++--
 mm/rmap.c        | 14 ++++++++++----
 3 files changed, 36 insertions(+), 11 deletions(-)
From: "Kirill A. Shutemov" kirill@shutemov.name
mainline inclusion
from mainline-v5.5-rc1
commit f1fe80d4ae3396cf3665bd6dc77f4004c1c2e9f8
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I9NU9F
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Adding fully unmapped pages into deferred split queue is not productive: these pages are about to be freed or they are pinned and cannot be split anyway.
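As a reader's aid, the decision the patch introduces reduces to a three-way case on nr, the number of subpages that became unmapped. A minimal stand-alone sketch in plain C (maybe_queue_split() and queue_for_deferred_split() are hypothetical stand-ins for the rmap code and deferred_split_huge_page(), not kernel APIs):

```c
#define HPAGE_PMD_NR 512	/* subpages per PMD-sized THP, e.g. x86-64 with 4K pages */

/* hypothetical stub standing in for deferred_split_huge_page() */
static void queue_for_deferred_split(void) { }

/* nr: how many subpages became fully unmapped during this compound unmap */
static void maybe_queue_split(unsigned int nr)
{
	if (nr == 0)
		return;	/* every subpage is still mapped via PTEs: nothing to split */
	if (nr == HPAGE_PMD_NR)
		return;	/* fully unmapped: about to be freed (or pinned), split is pointless */
	queue_for_deferred_split();	/* partially unmapped: a split can free unused subpages */
}
```

This is exactly the `if (nr && nr < HPAGE_PMD_NR)` condition the diff below adds.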
Link: http://lkml.kernel.org/r/20190913091849.11151-1-kirill.shutemov@linux.intel....
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
---
 mm/rmap.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index c336dacfac52..bf26f9c8edac 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1281,12 +1281,20 @@ static void page_remove_anon_compound_rmap(struct page *page)
 	if (TestClearPageDoubleMap(page)) {
 		/*
 		 * Subpages can be mapped with PTEs too. Check how many of
-		 * themi are still mapped.
+		 * them are still mapped.
 		 */
 		for (i = 0, nr = 0; i < HPAGE_PMD_NR; i++) {
 			if (atomic_add_negative(-1, &page[i]._mapcount))
 				nr++;
 		}
+
+		/*
+		 * Queue the page for deferred split if at least one small
+		 * page of the compound page is unmapped, but at least one
+		 * small page is still mapped.
+		 */
+		if (nr && nr < HPAGE_PMD_NR)
+			deferred_split_huge_page(page);
 	} else {
 		nr = HPAGE_PMD_NR;
 	}
@@ -1294,10 +1302,8 @@ static void page_remove_anon_compound_rmap(struct page *page)
 	if (unlikely(PageMlocked(page)))
 		clear_page_mlock(page);
 
-	if (nr) {
+	if (nr)
 		__mod_node_page_state(page_pgdat(page), NR_ANON_MAPPED, -nr);
-		deferred_split_huge_page(page);
-	}
 }
 
 /**
From: Jianxing Wang <wangjianxing@loongson.cn>
mainline inclusion
from mainline-v5.19-rc1
commit b191c9bc334a936775843867485c207e23b30e1b
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I9NU9F
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Freeing a large list of pages can starve rcu_sched on non-preemptible kernels. However, free_unref_page_list() cannot call cond_resched() itself, since it may be called from interrupt or atomic context, and with CONFIG_PREEMPTION=n atomic context cannot even be detected.

The issue was detected in a guest with 200% KVM CPU overcommit; the warning did not appear on the host running the same application. The patch is clearly needed for the guest kernel, though it is not certain the host needs it.

To reproduce: set up two virtual machines on one host, each with the same number of CPUs as the host and half of its memory, then run ltpstress.sh in each VM; the RCU stall warning will appear. The kernel must be running with preemption disabled; if dynamic preemption is enabled, append 'preempt=none' to the kernel command line. The issue was reproduced on a Loongson machine (32 cores, 128G memory) and on a ProLiant DL380 Gen9 (x86 E5-2680, 28 cores, 64G memory).

The TLB flush batch count depends on PAGE_SIZE and grows too large when PAGE_SIZE > 4K, so limit the free batch count to 512 and add a scheduling point in tlb_batch_pages_flush() (a user-space sketch of this batching pattern follows the trace below).
rcu: rcu_sched kthread starved for 5359 jiffies! g454793 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=19
[...]
Call Trace:
   free_unref_page_list+0x19c/0x270
   release_pages+0x3cc/0x498
   tlb_flush_mmu_free+0x44/0x70
   zap_pte_range+0x450/0x738
   unmap_page_range+0x108/0x240
   unmap_vmas+0x74/0xf0
   unmap_region+0xb0/0x120
   do_munmap+0x264/0x438
   vm_munmap+0x58/0xa0
   sys_munmap+0x10/0x20
   syscall_common+0x24/0x38
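The fix below follows a generic pattern: walk the batch in bounded chunks and offer the scheduler a chance to run between chunks. A self-contained user-space sketch of that pattern, with sched_yield() standing in for cond_resched() and malloc()/free() in place of page freeing (names and sizes other than the 512 limit are illustrative):

```c
#include <sched.h>
#include <stdlib.h>

#define FREE_BATCH 512	/* the cap the patch chooses */

/* stand-in for free_pages_and_swap_cache(): release nr entries of pages[] */
static void free_chunk(void **pages, unsigned int nr)
{
	for (unsigned int i = 0; i < nr; i++)
		free(pages[i]);
}

/* release a large batch in bounded chunks, yielding between chunks */
static void free_all(void **pages, unsigned int total)
{
	while (total) {
		unsigned int nr = total < FREE_BATCH ? total : FREE_BATCH;

		free_chunk(pages, nr);
		pages += nr;
		total -= nr;

		sched_yield();	/* user-space analogue of cond_resched() */
	}
}

int main(void)
{
	enum { N = 2000 };
	static void *pages[N];

	for (unsigned int i = 0; i < N; i++)
		pages[i] = malloc(64);
	free_all(pages, N);	/* four chunks: 512 + 512 + 512 + 464 */
	return 0;
}
```

The intent, per the commit message, is to keep the work done between scheduling points bounded even when PAGE_SIZE, and hence each batch, is large.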
Link: https://lkml.kernel.org/r/20220317072857.2635262-1-wangjianxing@loongson.cn
Signed-off-by: Jianxing Wang <wangjianxing@loongson.cn>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
---
 mm/mmu_gather.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index a44cf211ffee..2b3f6967176f 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -71,8 +71,20 @@ void tlb_flush_mmu_free(struct mmu_gather *tlb)
 	tlb_table_flush(tlb);
 #endif
 	for (batch = &tlb->local; batch && batch->nr; batch = batch->next) {
-		free_pages_and_swap_cache(batch->pages, batch->nr);
-		batch->nr = 0;
+		struct page **pages = batch->pages;
+
+		do {
+			/*
+			 * limit free batch count when PAGE_SIZE > 4K
+			 */
+			unsigned int nr = min(512U, batch->nr);
+
+			free_pages_and_swap_cache(pages, nr);
+			pages += nr;
+			batch->nr -= nr;
+
+			cond_resched();
+		} while (batch->nr);
 	}
 	tlb->active = &tlb->local;
 }
From: Yin Fengwei <fengwei.yin@intel.com>
mainline inclusion
from mainline-v6.5-rc1
commit deedad80f660af8199ea3b3f70939f2d226b9154
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I9NU9F
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
free_transhuge_page() acquires the split queue lock and only then checks whether the THP was ever added to the deferred list. This causes heavy deferred split queue lock contention.

It is safe to check whether the THP is on the deferred list without holding the deferred queue lock in free_transhuge_page(): by the time free_transhuge_page() runs, nothing can still be trying to add the folio to _deferred_list.
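The change is a double-checked test: an unlocked emptiness check (annotated with data_race() in the patch) filters out the common case, and the lock is taken, with a recheck, only when the page may actually be queued. A user-space sketch of the same pattern using pthreads and a kernel-style self-linked list node (names are illustrative, not kernel APIs; user space has no data_race() annotation, so the unlocked read is simply a tolerated benign race here):

```c
#include <pthread.h>
#include <stdbool.h>

/* circular doubly-linked node, self-linked when not queued (like an empty list_head) */
struct node {
	struct node *prev, *next;
};

static pthread_mutex_t split_queue_lock = PTHREAD_MUTEX_INITIALIZER;

static bool on_list(const struct node *n)
{
	return n->next != n;
}

/* remove n from the queue, but take the lock only if n was ever queued */
static void remove_if_queued(struct node *n)
{
	if (!on_list(n))	/* unlocked fast path: most pages were never queued */
		return;

	pthread_mutex_lock(&split_queue_lock);
	if (on_list(n)) {	/* recheck under the lock */
		n->prev->next = n->next;
		n->next->prev = n->prev;
		n->next = n->prev = n;	/* back to self-linked (empty) */
	}
	pthread_mutex_unlock(&split_queue_lock);
}
```

The recheck under the lock is what keeps the fast path safe: a concurrent remover may have emptied the list between the unlocked test and the lock acquisition.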
Running the page_fault1 test of will-it-scale, with order-2 folios for anonymous mappings and 96 processes on an Ice Lake 48C/96T test box, we could see the 61% split_queue_lock contention:

  -   63.02%     0.01%  page_fault1_pro  [kernel.kallsyms]   [k] free_transhuge_page
     - 63.01% free_transhuge_page
        + 62.91% _raw_spin_lock_irqsave
With this patch applied, the split_queue_lock contention is less than 1%.
Link: https://lkml.kernel.org/r/20230429082759.1600796-2-fengwei.yin@intel.com
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Conflicts:
	mm/huge_memory.c
[Ma Wupeng: mainline uses folios and the ds_queue refactoring, which leads to conflicts]
Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
---
 mm/huge_memory.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8b7086cfd1ed..b2e39b947126 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2907,13 +2907,20 @@ void free_transhuge_page(struct page *page)
 	struct deferred_split ds_queue;
 	unsigned long flags;
 
+	/*
+	 * At this point, there is no one trying to add the folio to
+	 * deferred_list. If folio is not in deferred_list, it's safe
+	 * to check without acquiring the split_queue_lock.
+	 */
 	get_deferred_split_queue(page, &ds_queue);
-	spin_lock_irqsave(ds_queue.split_queue_lock, flags);
-	if (!list_empty(page_deferred_list(page))) {
-		(*ds_queue.split_queue_len)--;
-		list_del(page_deferred_list(page));
+	if (data_race(!list_empty(page_deferred_list(page)))) {
+		spin_lock_irqsave(ds_queue.split_queue_lock, flags);
+		if (!list_empty(page_deferred_list(page))) {
+			(*ds_queue.split_queue_len)--;
+			list_del(page_deferred_list(page));
+		}
+		spin_unlock_irqrestore(ds_queue.split_queue_lock, flags);
 	}
-	spin_unlock_irqrestore(ds_queue.split_queue_lock, flags);
 	free_compound_page(page);
 }
 
Feedback: The patches you sent to the kernel@openeuler.org mailing list have been converted to a pull request successfully!
Pull request link: https://gitee.com/openeuler/kernel/pulls/7052
Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/7...