[PATCH OLK-6.6 0/3] bugfix for memory reliable
Fix three bugs for memory reliable.

Wupeng Ma (3):
  mm: mem_reliable: fix task reliable counter during fork
  mm: mem_reliable: use precise count during limit check
  mm: mem_reliable: prevent excessive direct reclaim loops

 include/linux/mem_reliable.h | 10 +++++++---
 mm/memory.c                  |  4 +++-
 mm/page_alloc.c              |  3 +++
 3 files changed, 13 insertions(+), 4 deletions(-)

-- 
2.43.0
[PATCH OLK-6.6 1/3] mm: mem_reliable: fix task reliable counter during fork

hulk inclusion
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/5441

--------------------------------

Commit 8fc2546f8508 ("proc: mem_reliable: Count reliable memory usage
of reliable tasks") introduced accounting of the reliable memory
allocated by reliable user tasks, which requires the reliable counter
to be updated alongside each rss update. However, commit 1741ac635805
("mm/memory: optimize fork() with PTE-mapped THP") introduced PTE
batching when consecutive present PTEs map consecutive pages of the
same large folio and share identical bits aside from the PFNs. That
commit modifies rss without updating memory reliable's counter, which
leaves the counter imbalanced during fork with THP disabled.

Fix this by updating memory reliable's counter right after the rss
updates.

Fixes: 1741ac635805 ("mm/memory: optimize fork() with PTE-mapped THP")
Signed-off-by: Wupeng Ma <mawupeng1@huawei.com>
---
 mm/memory.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index a6d146d684e8..cefcc97c7f51 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -935,6 +935,7 @@ copy_present_page(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma
 	folio_add_new_anon_rmap(new_folio, dst_vma, addr, RMAP_EXCLUSIVE);
 	folio_add_lru_vma(new_folio, dst_vma);
 	rss[MM_ANONPAGES]++;
+	add_reliable_folio_counter(new_folio, dst_vma->vm_mm, 1);
 
 	/* All done, just insert the new page copy in the child */
 	pte = mk_pte(&new_folio->page, dst_vma->vm_page_prot);
@@ -1019,6 +1020,7 @@ copy_present_ptes(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma
 			folio_dup_file_rmap_ptes(folio, page, nr);
 			rss[mm_counter_file(folio)] += nr;
 		}
+		add_reliable_folio_counter(folio, dst_vma->vm_mm, nr);
 		if (any_writable)
 			pte = pte_mkwrite(pte, src_vma);
 		__copy_present_ptes(dst_vma, src_vma, dst_pte, src_pte, pte,
@@ -1046,8 +1048,8 @@ copy_present_ptes(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma
 	} else {
 		folio_dup_file_rmap_pte(folio, page);
 		rss[mm_counter_file(folio)]++;
-		add_reliable_folio_counter(folio, dst_vma->vm_mm, 1);
 	}
+	add_reliable_folio_counter(folio, dst_vma->vm_mm, 1);
 
 copy_pte:
 	__copy_present_ptes(dst_vma, src_vma, dst_pte, src_pte, pte, addr, 1);
-- 
2.43.0
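As an aside, the accounting invariant this patch restores can be shown
with a tiny userspace toy: every rss update on a fork path must be
mirrored in the reliable counter, or the two drift apart. The sketch
below is illustrative only; rss_pages, reliable_pages and
account_pages() are hypothetical names, not the kernel implementation.

	#include <stdio.h>

	static long rss_pages;		/* stand-in for rss[MM_ANONPAGES] etc. */
	static long reliable_pages;	/* stand-in for the reliable counter   */

	/* Paired update: the pattern the batched-PTE path was missing. */
	static void account_pages(long nr)
	{
		rss_pages += nr;
		reliable_pages += nr;	/* add_reliable_folio_counter() analogue */
	}

	int main(void)
	{
		account_pages(1);	/* copy_present_page(): one page      */
		account_pages(16);	/* copy_present_ptes(): a batch of nr */
		printf("rss=%ld reliable=%ld balanced=%s\n", rss_pages,
		       reliable_pages, rss_pages == reliable_pages ? "yes" : "no");
		return 0;
	}

Dropping either increment from account_pages() reproduces the imbalance
the changelog describes.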
[PATCH OLK-6.6 2/3] mm: mem_reliable: use precise count during limit check

hulk inclusion
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/5441

--------------------------------

If the fallback for memory reliable is enabled, direct reclaim will be
used if the task's reliable memory limit is reached and pages need to
be released.

However, percpu_counter_read_positive() provides a fast but imprecise
counter reading. During limit enforcement, this inaccuracy may cause
the observed usage to appear significantly larger than the actual
value. As a result, even repeated constrained reclaim attempts may
fail to bring memory usage below the limit, eventually leading to OOM.

To avoid this issue, use an accurate counter check when determining
whether the reliable memory limit has been exceeded.

Fixes: 200321e8a69e ("mm: mem_reliable: Add limiting the usage of reliable memory")
Signed-off-by: Wupeng Ma <mawupeng1@huawei.com>
---
 include/linux/mem_reliable.h | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/include/linux/mem_reliable.h b/include/linux/mem_reliable.h
index 1e928ff69d99..9047918e1331 100644
--- a/include/linux/mem_reliable.h
+++ b/include/linux/mem_reliable.h
@@ -169,11 +169,15 @@ static inline void shmem_reliable_folio_add(struct folio *folio, int nr_page)
 		percpu_counter_add(&shmem_reliable_pages, nr_page);
 }
 
-
 static inline bool reliable_mem_limit_check(unsigned long nr_page)
 {
-	return (task_reliable_used_pages() + nr_page) <=
-	       (task_reliable_limit >> PAGE_SHIFT);
+	s64 nr_task_pages;
+
+	/* limit check need precise counter, use sum rather than read */
+	nr_task_pages = percpu_counter_sum_positive(&pagecache_reliable_pages);
+	nr_task_pages += percpu_counter_sum_positive(&anon_reliable_pages);
+
+	return (nr_task_pages + nr_page) <= (task_reliable_limit >> PAGE_SHIFT);
 }
 
 static inline bool mem_reliable_should_reclaim(void)
-- 
2.43.0
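Why the fast read can overshoot is easy to model outside the kernel.
The toy below mimics a percpu counter: per-CPU deltas below the batch
size stay local, so a fast read of the flushed total can report stale,
larger usage than an exact sum. All names here (toy_percpu_counter,
toy_read, toy_sum) are hypothetical analogues of
percpu_counter_read_positive()/percpu_counter_sum_positive(), not the
kernel code.

	#include <stdio.h>

	#define NR_CPUS	4
	#define BATCH	32

	struct toy_percpu_counter {
		long count;		/* flushed global total      */
		long pcpu[NR_CPUS];	/* unflushed per-CPU deltas  */
	};

	static void toy_add(struct toy_percpu_counter *c, int cpu, long n)
	{
		c->pcpu[cpu] += n;
		if (c->pcpu[cpu] >= BATCH || c->pcpu[cpu] <= -BATCH) {
			c->count += c->pcpu[cpu];	/* flush to global */
			c->pcpu[cpu] = 0;
		}
	}

	/* Fast but imprecise: ignores unflushed per-CPU deltas. */
	static long toy_read(struct toy_percpu_counter *c)
	{
		return c->count > 0 ? c->count : 0;
	}

	/* Slow but exact: folds in every per-CPU delta. */
	static long toy_sum(struct toy_percpu_counter *c)
	{
		long sum = c->count;

		for (int cpu = 0; cpu < NR_CPUS; cpu++)
			sum += c->pcpu[cpu];
		return sum > 0 ? sum : 0;
	}

	int main(void)
	{
		struct toy_percpu_counter c = { 0 };

		toy_add(&c, 0, 4 * BATCH);		/* flushed allocations */
		for (int cpu = 0; cpu < NR_CPUS; cpu++)
			toy_add(&c, cpu, -(BATCH - 1));	/* unflushed frees */

		/* read reports the stale high total; sum sees the frees */
		printf("read=%ld sum=%ld\n", toy_read(&c), toy_sum(&c));
		return 0;
	}

Here the fast read reports 128 pages while only 4 are actually in use,
which is exactly the shape of overshoot that keeps the limit check
failing no matter how much is reclaimed.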
[PATCH OLK-6.6 3/3] mm: mem_reliable: prevent excessive direct reclaim loops

hulk inclusion
category: bugfix
bugzilla: https://atomgit.com/openeuler/kernel/issues/5441

--------------------------------

In our testing, under memory pressure, it becomes difficult to trigger
an OOM (out-of-memory) kill when regular memory allocation competes
concurrently with driver GFP_ATOMIC allocations. This occurs because
GFP_ATOMIC allocations can succeed from as low as 3/8 of the min
watermark. In such a situation, even if direct reclaim can free a
small number of pages, regular memory requests may still fail because
the current watermark remains too high for them.

In the following scenario:

restart:
	__alloc_pages_direct_reclaim	// reclaims a small number of pages
	should_reclaim_retry		// watermark too high for user
					// resets no_progress_loops to zero and
					// goes to restart
	goto restart;

In our production environment, we encountered a case where a memory
allocation performed __alloc_pages_direct_reclaim 257 times without
either succeeding or triggering OOM, causing service freezes.

To avoid meaningless continuous direct reclaim loops, skip the
watermark check for non-mirrored zones if memory reliable is enabled.

Fixes: e0fb8bd67d0a ("mm: mem_reliable: Introduce memory reliable")
Signed-off-by: Wupeng Ma <mawupeng1@huawei.com>
---
 mm/page_alloc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c701f1f4675e..b00c583a28dc 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4027,6 +4027,9 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
 		unsigned long min_wmark = min_wmark_pages(zone);
 		bool wmark;
 
+		if (skip_non_mirrored_zone(gfp_mask, z))
+			continue;
+
 		available = reclaimable = zone_reclaimable_pages(zone);
 		available += zone_page_state_snapshot(zone, NR_FREE_PAGES);
 
-- 
2.43.0
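The retry cycle from the changelog can be condensed into a toy model:
each pass of direct reclaim frees a few pages, a concurrent GFP_ATOMIC
consumer drains them, and the watermark check never passes, so the
loop spins instead of declaring OOM. This is a simplified sketch with
hypothetical numbers (WMARK, RECLAIMED), not the mm/page_alloc.c
logic.

	#include <stdio.h>
	#include <stdbool.h>

	#define WMARK		1000	/* pages needed to pass the check */
	#define RECLAIMED	8	/* pages freed per reclaim pass   */

	static long free_pages = 100;	/* kept low by GFP_ATOMIC users   */

	static bool wmark_ok(long pages)
	{
		return pages >= WMARK;
	}

	int main(void)
	{
		int attempts = 0;

		/* Mirrors the restart cycle above: reclaim makes just
		 * enough progress to reset no_progress_loops, so neither
		 * success nor OOM is ever reached. */
		while (!wmark_ok(free_pages)) {
			free_pages += RECLAIMED;  /* __alloc_pages_direct_reclaim */
			free_pages -= RECLAIMED;  /* concurrent GFP_ATOMIC drain  */
			if (++attempts >= 257)	  /* figure seen in production    */
				break;
		}
		printf("gave up after %d direct reclaim loops\n", attempts);
		return 0;
	}

Skipping the watermark check for non-mirrored zones breaks this cycle
by letting should_reclaim_retry() stop counting zones the reliable
allocation can never satisfy.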
FeedBack: The patch(es) which you have sent to kernel@openeuler.org
mailing list has been converted to a pull request successfully!
Pull request link:
https://atomgit.com/openeuler/kernel/merge_requests/19909
Mailing list address:
https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/YIB...