merge mainline patches into OLK-6.6
Jeongjun Park (1): mm: migrate: annotate data-race in migrate_folio_unmap()
Kefeng Wang (1): tmpfs: don't enable large folios if not supported
Pei Li (1): mm: ignore data-race in __swap_writepage
Sergey Senozhatsky (1): mm: Kconfig: fixup zsmalloc configuration
Wei Xu (1): mm/mglru: reset page lru tier bits when activating
Zeng Jingxiang (1): mm/vmscan: wake up flushers conditionally to avoid cgroup OOM
 include/linux/mm_inline.h | 15 ++++++++++++++-
 include/linux/mmzone.h    |  2 ++
 mm/Kconfig                |  2 +-
 mm/migrate.c              |  2 +-
 mm/page_io.c              |  7 ++++++-
 mm/shmem.c                |  5 ++++-
 mm/vmscan.c               | 33 ++++++++++++++++++++++++++-------
 7 files changed, 54 insertions(+), 12 deletions(-)
From: Pei Li <peili.dev@gmail.com>
mainline inclusion
from mainline-v6.11-rc1
commit 7b7aca6d7c0f9b2d9400bfc57cb2b23cfbd5134d
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IBET92
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Syzbot reported a possible data race:
BUG: KCSAN: data-race in __swap_writepage / scan_swap_map_slots
read-write to 0xffff888102fca610 of 8 bytes by task 7106 on cpu 1
read to 0xffff888102fca610 of 8 bytes by task 7080 on cpu 0
While __swap_writepage() is reading sis->flags, scan_swap_map_slots() is concurrently updating it with SWP_SCANNING.
value changed: 0x0000000000008083 -> 0x0000000000004083.
While this field can be updated non-atomically, the update never affects SWP_SYNCHRONOUS_IO, so we consider this data race safe.
This was possibly introduced by commit 3222d8c2a7f8 ("block: remove ->rw_page"), which added this if branch.
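For readers unfamiliar with the annotation: data_race() (include/linux/compiler.h) marks a plain, intentionally unsynchronized access so that KCSAN does not report it, without adding any ordering or atomicity. A minimal sketch of the pattern, using a hypothetical helper name rather than code from this patch:

	/*
	 * Sketch only: a racy read of si->flags that is tolerated because the
	 * concurrently-updated bits (e.g. SWP_SCANNING) are not the ones tested.
	 */
	static bool swap_bdev_synchronous(struct swap_info_struct *si)
	{
		/* data_race() silences KCSAN; it does not add synchronization. */
		return data_race(si->flags & SWP_SYNCHRONOUS_IO);
	}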
Link: https://lkml.kernel.org/r/20240711-bug13-v1-1-cea2b8ae8d76@gmail.com
Fixes: 3222d8c2a7f8 ("block: remove ->rw_page")
Signed-off-by: Pei Li <peili.dev@gmail.com>
Reported-by: syzbot+da25887cc13da6bf3b8c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=da25887cc13da6bf3b8c
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Conflicts:
	mm/page_io.c
[Because HULK-6.6 doesn't merge mainline patch b99b4e0d9d7f29b428bacd7a61188b2abf340c1e ("mm: pass a folio to __swap_writepage()")]
Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com>
---
 mm/page_io.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/mm/page_io.c b/mm/page_io.c
index f2b3c4eed688..1887cd2093e3 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -381,7 +381,12 @@ void __swap_writepage(struct page *page, struct writeback_control *wbc)
 	 */
 	if (data_race(sis->flags & SWP_FS_OPS))
 		swap_writepage_fs(page, wbc);
-	else if (sis->flags & SWP_SYNCHRONOUS_IO)
+	/*
+	 * ->flags can be updated non-atomicially (scan_swap_map_slots),
+	 * but that will never affect SWP_SYNCHRONOUS_IO, so the data_race
+	 * is safe.
+	 */
+	else if (data_race(sis->flags & SWP_SYNCHRONOUS_IO))
 		swap_writepage_bdev_sync(page, wbc, sis);
 	else
 		swap_writepage_bdev_async(page, wbc, sis);
From: Sergey Senozhatsky <senozhatsky@chromium.org>
mainline inclusion
from mainline-v6.12-rc1
commit 5ad7a998ba921e37d7e373a2d70d7da401b27f94
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IBET92
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
zsmalloc is not exclusive to zswap. Commit b3fbd58fcbb1 ("mm: Kconfig: simplify zswap configuration") made CONFIG_ZSMALLOC only visible when CONFIG_ZSWAP is selected, which makes it impossible to menuconfig zsmalloc-specific features (stats, chain-size, etc.) on systems that use ZRAM but don't have ZSWAP enabled.
Make the zsmalloc prompt visible when either ZRAM or ZSWAP is enabled.
Link: https://lkml.kernel.org/r/20240903040143.1580705-1-senozhatsky@chromium.org
Fixes: b3fbd58fcbb1 ("mm: Kconfig: simplify zswap configuration")
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com>
---
 mm/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/Kconfig b/mm/Kconfig
index 782c43f08e8f..1bf38f9f72db 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -190,7 +190,7 @@ config Z3FOLD
 
 config ZSMALLOC
 	tristate
-	prompt "N:1 compression allocator (zsmalloc)" if ZSWAP
+	prompt "N:1 compression allocator (zsmalloc)" if (ZSWAP || ZRAM)
 	depends on MMU
 	help
 	  zsmalloc is a slab-based memory allocator designed to store
From: Jeongjun Park <aha310510@gmail.com>
mainline inclusion
from mainline-v6.12-rc1
commit 8001070cfbec5cd4ea00b8b48ea51df91122f265
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IBET92
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
I found a report from syzbot [1]
The report shows that the value can change, but in reality the value returned by __folio_test_movable() cannot change, because migration holds a reference on the folio.
Therefore, it is appropriate to add an annotation to make KCSAN ignore this data race.
[1]
==================================================================
BUG: KCSAN: data-race in __filemap_remove_folio / migrate_pages_batch
write to 0xffffea0004b81dd8 of 8 bytes by task 6348 on cpu 0:
 page_cache_delete mm/filemap.c:153 [inline]
 __filemap_remove_folio+0x1ac/0x2c0 mm/filemap.c:233
 filemap_remove_folio+0x6b/0x1f0 mm/filemap.c:265
 truncate_inode_folio+0x42/0x50 mm/truncate.c:178
 shmem_undo_range+0x25b/0xa70 mm/shmem.c:1028
 shmem_truncate_range mm/shmem.c:1144 [inline]
 shmem_evict_inode+0x14d/0x530 mm/shmem.c:1272
 evict+0x2f0/0x580 fs/inode.c:731
 iput_final fs/inode.c:1883 [inline]
 iput+0x42a/0x5b0 fs/inode.c:1909
 dentry_unlink_inode+0x24f/0x260 fs/dcache.c:412
 __dentry_kill+0x18b/0x4c0 fs/dcache.c:615
 dput+0x5c/0xd0 fs/dcache.c:857
 __fput+0x3fb/0x6d0 fs/file_table.c:439
 ____fput+0x1c/0x30 fs/file_table.c:459
 task_work_run+0x13a/0x1a0 kernel/task_work.c:228
 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
 exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
 __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
 syscall_exit_to_user_mode+0xbe/0x130 kernel/entry/common.c:218
 do_syscall_64+0xd6/0x1c0 arch/x86/entry/common.c:89
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
read to 0xffffea0004b81dd8 of 8 bytes by task 6342 on cpu 1:
 __folio_test_movable include/linux/page-flags.h:699 [inline]
 migrate_folio_unmap mm/migrate.c:1199 [inline]
 migrate_pages_batch+0x24c/0x1940 mm/migrate.c:1797
 migrate_pages_sync mm/migrate.c:1963 [inline]
 migrate_pages+0xff1/0x1820 mm/migrate.c:2072
 do_mbind mm/mempolicy.c:1390 [inline]
 kernel_mbind mm/mempolicy.c:1533 [inline]
 __do_sys_mbind mm/mempolicy.c:1607 [inline]
 __se_sys_mbind+0xf76/0x1160 mm/mempolicy.c:1603
 __x64_sys_mbind+0x78/0x90 mm/mempolicy.c:1603
 x64_sys_call+0x2b4d/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:238
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xc9/0x1c0 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
value changed: 0xffff888127601078 -> 0x0000000000000000
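For context, __folio_test_movable() inspects folio->mapping, the field the two stacks above race on; because the migration path holds a folio reference, the movable encoding in its low bits cannot change underneath it. A simplified sketch of the v6.6-era helper (include/linux/page-flags.h), shown for reference only:

	/*
	 * A folio is "movable" (non-LRU migratable) when the low bits of its
	 * ->mapping pointer carry the PAGE_MAPPING_MOVABLE tag.
	 */
	static __always_inline bool __folio_test_movable(const struct folio *folio)
	{
		return ((unsigned long)folio->mapping & PAGE_MAPPING_FLAGS) ==
				PAGE_MAPPING_MOVABLE;
	}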
Link: https://lkml.kernel.org/r/20240924130053.107490-1-aha310510@gmail.com
Fixes: 7e2a5e5ab217 ("mm: migrate: use __folio_test_movable()")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Zi Yan <ziy@nvidia.com>
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com>
---
 mm/migrate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index 05538e2edd1b..557c496c3f92 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1137,7 +1137,7 @@ static int migrate_folio_unmap(new_folio_t get_new_folio,
 	int rc = -EAGAIN;
 	int old_page_state = 0;
 	struct anon_vma *anon_vma = NULL;
-	bool is_lru = !__folio_test_movable(src);
+	bool is_lru = data_race(!__folio_test_movable(src));
 	bool locked = false;
 	bool dst_locked = false;
 
From: Kefeng Wang <wangkefeng.wang@huawei.com>
mainline inclusion
from mainline-v6.13-rc1
commit 5a90c155defa684f3a21f68c3f8e40c056e6114c
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IBET92
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
tmpfs can support large folios, but there are some configurable options (mount options and runtime deny/force) to enable/disable large folio allocation, so there is a performance issue when performing writes without large folios. The issue is similar to commit 4e527d5841e2 ("iomap: fault in smaller chunks for non-large folio mappings").
Since 'deny' is for emergencies and 'force' is for testing, the performance issue should not be a problem in real production environments, so don't call mapping_set_large_folios() in __shmem_get_inode() when large folios are disabled with the mount huge=never option (the default policy).
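For reference, mapping_set_large_folios() only sets a capability bit on the address_space, so skipping it under huge=never simply keeps the mapping on order-0 folios. A sketch of the helper as it looks in the v6.6-era include/linux/pagemap.h, shown for context and not part of this patch:

	/* Mark an address_space as capable of holding folios larger than order 0. */
	static inline void mapping_set_large_folios(struct address_space *mapping)
	{
		__set_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
	}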
Link: https://lkml.kernel.org/r/20241017141742.1169404-1-wangkefeng.wang@huawei.co...
Fixes: 9aac777aaf94 ("filemap: Convert generic_perform_write() to support large folios")
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com>
---
 mm/shmem.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index f9e48c353a9d..63e261f7fe87 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2805,7 +2805,10 @@ static struct inode *__shmem_get_inode(struct mnt_idmap *idmap,
 		cache_no_acl(inode);
 		if (sbinfo->noswap)
 			mapping_set_unevictable(inode->i_mapping);
-		mapping_set_large_folios(inode->i_mapping);
+
+		/* Don't consider 'deny' for emergencies and 'force' for testing */
+		if (sbinfo->huge)
+			mapping_set_large_folios(inode->i_mapping);
 
 		switch (mode & S_IFMT) {
 		default:
From: Wei Xu <weixugc@google.com>
mainline inclusion
from mainline-v6.13-rc1
commit f1001f3d3b6868998cab73d10fda1a5c99ddf963
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IBET92
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
When a folio is activated, lru_gen_add_folio() moves the folio to the youngest generation. But unlike folio_update_gen()/folio_inc_gen(), lru_gen_add_folio() doesn't reset the folio lru tier bits (LRU_REFS_MASK | LRU_REFS_FLAGS). This inconsistency can affect how pages are aged via folio_mark_accessed() (e.g. fd accesses), though no user visible impact related to this has been detected yet.
Note that lru_gen_add_folio() cannot clear PG_workingset if the activation is due to workingset refault, otherwise PSI accounting will be skipped. So fix lru_gen_add_folio() to clear the lru tier bits other than PG_workingset when activating a folio, and also clear all the lru tier bits when a folio is activated via folio_activate() in lru_gen_look_around().
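The fix relies on set_mask_bits(), which atomically clears the bits in 'mask' and then sets the bits in 'bits'. A non-atomic sketch of its effect follows; the real helper in include/linux/bitops.h performs the same update inside a cmpxchg loop, and the function name here is illustrative only:

	/* Illustrative, non-atomic equivalent of set_mask_bits(addr, mask, bits). */
	static unsigned long sketch_set_mask_bits(unsigned long *addr,
						  unsigned long mask,
						  unsigned long bits)
	{
		unsigned long old = *addr;

		*addr = (old & ~mask) | bits;	/* clear 'mask' bits, then set 'bits' */
		return old;			/* return the previous value */
	}

So clearing the lru tier bits amounts to set_mask_bits(&folio->flags, LRU_REFS_MASK | LRU_REFS_FLAGS, 0), which is exactly what the new folio_clear_lru_refs() helper below wraps.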
Link: https://lkml.kernel.org/r/20241017181528.3358821-1-weixugc@google.com
Fixes: 018ee47f1489 ("mm: multi-gen LRU: exploit locality in rmap")
Signed-off-by: Wei Xu <weixugc@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Jan Alexander Steffens <heftig@archlinux.org>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com>
---
 include/linux/mm_inline.h | 15 ++++++++++++++-
 include/linux/mmzone.h    |  2 ++
 mm/vmscan.c               |  8 ++++----
 3 files changed, 20 insertions(+), 5 deletions(-)
diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 386e1eaac1fa..24049f0124a5 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -155,6 +155,11 @@ static inline int folio_lru_refs(struct folio *folio)
 	return ((flags & LRU_REFS_MASK) >> LRU_REFS_PGOFF) + workingset;
 }
 
+static inline void folio_clear_lru_refs(struct folio *folio)
+{
+	set_mask_bits(&folio->flags, LRU_REFS_MASK | LRU_REFS_FLAGS, 0);
+}
+
 static inline int folio_lru_gen(struct folio *folio)
 {
 	unsigned long flags = READ_ONCE(folio->flags);
@@ -224,6 +229,7 @@ static inline bool lru_gen_add_folio(struct lruvec *lruvec, struct folio *folio,
 {
 	unsigned long seq;
 	unsigned long flags;
+	unsigned long mask;
 	int gen = folio_lru_gen(folio);
 	int type = folio_is_file_lru(folio);
 	int zone = folio_zonenum(folio);
@@ -259,7 +265,14 @@ static inline bool lru_gen_add_folio(struct lruvec *lruvec, struct folio *folio,
 	gen = lru_gen_from_seq(seq);
 	flags = (gen + 1UL) << LRU_GEN_PGOFF;
 	/* see the comment on MIN_NR_GENS about PG_active */
-	set_mask_bits(&folio->flags, LRU_GEN_MASK | BIT(PG_active), flags);
+	mask = LRU_GEN_MASK;
+	/*
+	 * Don't clear PG_workingset here because it can affect PSI accounting
+	 * if the activation is due to workingset refault.
+	 */
+	if (folio_test_active(folio))
+		mask |= LRU_REFS_MASK | BIT(PG_referenced) | BIT(PG_active);
+	set_mask_bits(&folio->flags, mask, flags);
 
 	lru_gen_update_size(lruvec, folio, -1, gen);
 	/* for folio_rotate_reclaimable() */
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 3cee238de7c8..412167d6fb41 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -404,6 +404,8 @@ enum {
 	NR_LRU_GEN_CAPS
 };
 
+#define LRU_REFS_FLAGS	(BIT(PG_referenced) | BIT(PG_workingset))
+
 #define MIN_LRU_BATCH		BITS_PER_LONG
 #define MAX_LRU_BATCH		(MIN_LRU_BATCH * 64)
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 10acf5b339a7..63e6f72031e9 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3284,8 +3284,6 @@ static bool should_clear_pmd_young(void)
  *                          shorthand helpers
  ******************************************************************************/
 
-#define LRU_REFS_FLAGS	(BIT(PG_referenced) | BIT(PG_workingset))
-
 #define DEFINE_MAX_SEQ(lruvec) \
 	unsigned long max_seq = READ_ONCE((lruvec)->lrugen.max_seq)
 
@@ -4787,8 +4785,10 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
 		old_gen = folio_lru_gen(folio);
 		if (old_gen < 0)
 			folio_set_referenced(folio);
-		else if (old_gen != new_gen)
+		else if (old_gen != new_gen) {
+			folio_clear_lru_refs(folio);
 			folio_activate(folio);
+		}
 	}
 
 	arch_leave_lazy_mmu_mode();
@@ -5040,7 +5040,7 @@ static bool isolate_folio(struct lruvec *lruvec, struct folio *folio, struct sca
 
 	/* see the comment on MAX_NR_TIERS */
 	if (!folio_test_referenced(folio))
-		set_mask_bits(&folio->flags, LRU_REFS_MASK | LRU_REFS_FLAGS, 0);
+		folio_clear_lru_refs(folio);
 
 	/* for shrink_folio_list() */
 	folio_clear_reclaim(folio);
From: Zeng Jingxiang <linuszeng@tencent.com>
mainline inclusion
from mainline-v6.13-rc1
commit 1bc542c6a0d1444559ab75823a89a94d244bf933
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IBET92
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Commit 14aa8b2d5c2e ("mm/mglru: don't sync disk for each aging cycle") removed the flusher wakeup from the MGLRU page reclamation path, which increases the likelihood of triggering OOM when reclamation encounters many dirty pages under MGLRU.
This leads to premature OOM if there are too many dirty pages in the cgroup:

Killed
dd invoked oom-killer: gfp_mask=0x101cca(GFP_HIGHUSER_MOVABLE|__GFP_WRITE), order=0, oom_score_adj=0
Call Trace:
 <TASK>
 dump_stack_lvl+0x5f/0x80
 dump_stack+0x14/0x20
 dump_header+0x46/0x1b0
 oom_kill_process+0x104/0x220
 out_of_memory+0x112/0x5a0
 mem_cgroup_out_of_memory+0x13b/0x150
 try_charge_memcg+0x44f/0x5c0
 charge_memcg+0x34/0x50
 __mem_cgroup_charge+0x31/0x90
 filemap_add_folio+0x4b/0xf0
 __filemap_get_folio+0x1a4/0x5b0
 ? srso_return_thunk+0x5/0x5f
 ? __block_commit_write+0x82/0xb0
 ext4_da_write_begin+0xe5/0x270
 generic_perform_write+0x134/0x2b0
 ext4_buffered_write_iter+0x57/0xd0
 ext4_file_write_iter+0x76/0x7d0
 ? selinux_file_permission+0x119/0x150
 ? srso_return_thunk+0x5/0x5f
 ? srso_return_thunk+0x5/0x5f
 vfs_write+0x30c/0x440
 ksys_write+0x65/0xe0
 __x64_sys_write+0x1e/0x30
 x64_sys_call+0x11c2/0x1d50
 do_syscall_64+0x47/0x110
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
memory: usage 308224kB, limit 308224kB, failcnt 2589
swap: usage 0kB, limit 9007199254740988kB, failcnt 0
...
file_dirty 303247360
file_writeback 0
...
oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=test,mems_allowed=0,oom_memcg=/test,task_memcg=/test,task=dd,pid=4404,uid=0
Memory cgroup out of memory: Killed process 4404 (dd) total-vm:10512kB, anon-rss:1152kB, file-rss:1824kB, shmem-rss:0kB, UID:0 pgtables:76kB oom_score_adj:0
The flusher wake up was removed to decrease SSD wearing, but if we are seeing all dirty folios at the tail of an LRU, not waking up the flusher could lead to thrashing easily. So wake it up when a memcg is about to OOM due to dirty caches.
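Concretely, the check added to try_to_shrink_lruvec() below fires only when every file folio taken from the coldest generation was dirty but not yet queued for writeback, i.e. reclaim cannot make progress without help from the flusher. In simplified form (the wrapper function name is illustrative; the patch open-codes this check):

	static void maybe_wake_flushers(struct scan_control *sc)
	{
		/* All file folios taken were unqueued dirty: reclaim is stuck. */
		if (sc->nr.unqueued_dirty &&
		    sc->nr.unqueued_dirty == sc->nr.file_taken)
			wakeup_flusher_threads(WB_REASON_VMSCAN);
	}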
I ran the kernel build test [1] against V6, with -j16 in a 1G memcg, on my local branch:
Without the patch(10 times):
user 1449.394
system 368.78 372.58 363.03 362.31 360.84 372.70 368.72 364.94 373.51 366.58 (avg 367.399)
real 164.883
With the V6 patch(10 times):
user 1447.525
system 360.87 360.63 372.39 364.09 368.49 365.15 359.93 362.04 359.72 354.60 (avg 362.79)
real 164.514
The test results show about a 1% performance improvement with this patch, which is most likely noise.
Link: https://lkml.kernel.org/r/20241026115714.1437435-1-jingxiangzeng.cas@gmail.c...
Link: https://lore.kernel.org/all/CACePvbV4L-gRN9UKKuUnksfVJjOTq_5Sti2-e=pb_w51kuc... [1]
Fixes: 14aa8b2d5c2e ("mm/mglru: don't sync disk for each aging cycle")
Suggested-by: Wei Xu <weixugc@google.com>
Signed-off-by: Zeng Jingxiang <linuszeng@tencent.com>
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Wei Xu <weixugc@google.com>
Tested-by: Chris Li <chrisl@kernel.org>
Cc: T.J. Mercier <tjmercier@google.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com>
---
 mm/vmscan.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c index 63e6f72031e9..8e605361b714 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4953,6 +4953,7 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c int tier_idx) { bool success; + bool dirty, writeback; int gen = folio_lru_gen(folio); int type = folio_is_file_lru(folio); int zone = folio_zonenum(folio); @@ -5007,9 +5008,17 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c return true; }
+ dirty = folio_test_dirty(folio); + writeback = folio_test_writeback(folio); + if (type == LRU_GEN_FILE && dirty) { + sc->nr.file_taken += delta; + if (!writeback) + sc->nr.unqueued_dirty += delta; + } + /* waiting for writeback */ - if (folio_test_locked(folio) || folio_test_writeback(folio) || - (type == LRU_GEN_FILE && folio_test_dirty(folio))) { + if (folio_test_locked(folio) || writeback || + (type == LRU_GEN_FILE && dirty)) { gen = folio_inc_gen(lruvec, folio, true); list_move(&folio->lru, &lrugen->folios[gen][type][zone]); return true; @@ -5125,7 +5134,8 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc, trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, MAX_LRU_BATCH, scanned, skipped, isolated, type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON); - + if (type == LRU_GEN_FILE) + sc->nr.file_taken += isolated; /* * There might not be eligible folios due to reclaim_idx. Check the * remaining to prevent livelock if it's not making progress. @@ -5254,6 +5264,7 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap return scanned; retry: reclaimed = shrink_folio_list(&list, pgdat, sc, &stat, false); + sc->nr.unqueued_dirty += stat.nr_unqueued_dirty; sc->nr_reclaimed += reclaimed; trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id, scanned, reclaimed, &stat, sc->priority, @@ -5467,6 +5478,13 @@ static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc) cond_resched(); }
+ /* + * If too many file cache in the coldest generation can't be evicted + * due to being dirty, wake up the flusher. + */ + if (sc->nr.unqueued_dirty && sc->nr.unqueued_dirty == sc->nr.file_taken) + wakeup_flusher_threads(WB_REASON_VMSCAN); + /* whether this lruvec should be rotated */ return nr_to_scan < 0; } @@ -6588,6 +6606,7 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc) bool reclaimable = false;
if (lru_gen_enabled() && root_reclaim(sc)) { + memset(&sc->nr, 0, sizeof(sc->nr)); lru_gen_shrink_node(pgdat, sc); return; }
FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://gitee.com/openeuler/kernel/pulls/14611 Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/D...