From: Johannes Weiner hannes@cmpxchg.org
mainline inclusion
from mainline-v6.1-rc5
commit 82e60d00b753bb5cfecce22b8e952436b14d02a3
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I8BCRJ
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
When psi annotations were added to btrfs compression reads, the psi state tracking over add_ra_bio_pages and btrfs_submit_compressed_read was faulty. A pressure state, once entered, is never left. This results in incorrectly elevated pressure, which triggers OOM kills.
pflags record the *previous* memstall state when we enter a new one. The code tried to initialize pflags to 1, and then optimize the leave call when we either didn't enter a memstall, or were already inside a nested stall. However, there can be multiple PageWorkingset pages in the bio, at which point it's that path itself that enters repeatedly and overwrites pflags. This causes us to miss the exit.
Enter the stall only once if needed, then unwind correctly.
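The failure mode can be reproduced in a small userspace model. This is only a sketch: in_memstall, mock_memstall_enter() and mock_memstall_leave() are hypothetical stand-ins that assume the save-previous-state behaviour described above, and buggy_submit() and fixed_submit() are illustrative names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-task "inside a memstall" state. */
static bool in_memstall;

/* Model of enter: record the previous state, then enter if not nested. */
void mock_memstall_enter(unsigned long *flags)
{
	*flags = in_memstall;
	if (*flags)
		return;		/* nested call: already stalled */
	in_memstall = true;
}

/* Model of leave: only the outermost section clears the state. */
void mock_memstall_leave(unsigned long *flags)
{
	if (*flags)
		return;
	in_memstall = false;
}

/* Old pattern: enter once per workingset page, leave only if !pflags. */
bool buggy_submit(int nr_workingset_pages)
{
	unsigned long pflags = 1;
	int i;

	in_memstall = false;
	for (i = 0; i < nr_workingset_pages; i++)
		mock_memstall_enter(&pflags);	/* 2nd call overwrites pflags */
	if (!pflags)
		mock_memstall_leave(&pflags);
	return in_memstall;	/* true: the stall was never left */
}

/* New pattern: enter at most once, remember that we did, then unwind. */
bool fixed_submit(int nr_workingset_pages)
{
	unsigned long pflags;
	int memstall = 0;
	int i;

	in_memstall = false;
	for (i = 0; i < nr_workingset_pages; i++) {
		if (!memstall) {
			mock_memstall_enter(&pflags);
			memstall = 1;
		}
	}
	if (memstall)
		mock_memstall_leave(&pflags);
	return in_memstall;
}
```

In this model a single workingset page is handled correctly by both variants; the old pattern leaks the stall only once a second page re-enters and overwrites pflags.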
erofs has the same problem, fix that up too. And move the memstall exit past submit_bio() to restore submit accounting originally added by b8e24a9300b0 ("block: annotate refault stalls from IO submission").
Link: https://lkml.kernel.org/r/Y2UHRqthNUwuIQGS@cmpxchg.org
Fixes: 4088a47e78f9 ("btrfs: add manual PSI accounting for compressed reads")
Fixes: 99486c511f68 ("erofs: add manual PSI accounting for the compressed address space")
Fixes: 118f3663fbc6 ("block: remove PSI accounting from the bio layer")
Link: https://lore.kernel.org/r/d20a0a85-e415-cf78-27f9-77dd7a94bc8d@leemhuis.info...
Signed-off-by: Johannes Weiner hannes@cmpxchg.org
Reported-by: Thorsten Leemhuis linux@leemhuis.info
Tested-by: Thorsten Leemhuis linux@leemhuis.info
Cc: Chao Yu chao@kernel.org
Cc: Chris Mason clm@fb.com
Cc: Christoph Hellwig hch@lst.de
Cc: David Sterba dsterba@suse.com
Cc: Gao Xiang xiang@kernel.org
Cc: Jens Axboe axboe@kernel.dk
Cc: Josef Bacik josef@toxicpanda.com
Cc: Suren Baghdasaryan surenb@google.com
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Conflicts: fs/btrfs/compression.c
Signed-off-by: Liu Shixin liushixin2@huawei.com
---
 fs/btrfs/compression.c | 14 ++++++++------
 fs/erofs/zdata.c       | 18 +++++++++++-------
 2 files changed, 19 insertions(+), 13 deletions(-)
diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 2c04dd767995e..a3ecff725688a 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -502,7 +502,7 @@ static u64 bio_end_offset(struct bio *bio)
 static noinline int add_ra_bio_pages(struct inode *inode,
 				     u64 compressed_end,
 				     struct compressed_bio *cb,
-				     unsigned long *pflags)
+				     int *memstall, unsigned long *pflags)
 {
 	unsigned long end_index;
 	unsigned long pg_index;
@@ -551,8 +551,10 @@ static noinline int add_ra_bio_pages(struct inode *inode,
 			goto next;
 		}
 
-		if (PageWorkingset(page))
+		if (!*memstall && PageWorkingset(page)) {
 			psi_memstall_enter(pflags);
+			*memstall = 1;
+		}
 
 		end = last_offset + PAGE_SIZE - 1;
 		/*
@@ -635,8 +637,8 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
 	u64 cur_disk_byte = (u64)bio->bi_iter.bi_sector << 9;
 	u64 em_len;
 	u64 em_start;
-	/* Initialize to 1 to make skip psi_memstall_leave unless needed */
-	unsigned long pflags = 1;
+	unsigned long pflags;
+	int memstall = 0;
 	struct extent_map *em;
 	blk_status_t ret = BLK_STS_RESOURCE;
 	int faili = 0;
@@ -695,7 +697,7 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
 	faili = nr_pages - 1;
 	cb->nr_pages = nr_pages;
 
-	add_ra_bio_pages(inode, em_start + em_len, cb, &pflags);
+	add_ra_bio_pages(inode, em_start + em_len, cb, &memstall, &pflags);
 
 	/* include any pages we added in add_ra-bio_pages */
 	cb->len = bio->bi_iter.bi_size;
@@ -774,7 +776,7 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
 		bio_endio(comp_bio);
 	}
 
-	if (!pflags)
+	if (memstall)
 		psi_memstall_leave(&pflags);
 
 	return 0;
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
index 004274bceae68..92c41cdf256e3 100644
--- a/fs/erofs/zdata.c
+++ b/fs/erofs/zdata.c
@@ -1166,8 +1166,8 @@ static void z_erofs_submit_queue(struct super_block *sb,
 	pgoff_t last_index;
 	unsigned int nr_bios = 0;
 	struct bio *bio = NULL;
-	/* initialize to 1 to make skip psi_memstall_leave unless needed */
-	unsigned long pflags = 1;
+	unsigned long pflags;
+	int memstall = 0;
 
 	bi_private = jobqueueset_init(sb, q, fgq, force_fg);
 	qtail[JQ_BYPASS] = &q[JQ_BYPASS]->head;
@@ -1206,14 +1206,18 @@ static void z_erofs_submit_queue(struct super_block *sb,
 
 			if (bio && cur != last_index + 1) {
 submit_bio_retry:
-				if (!pflags)
-					psi_memstall_leave(&pflags);
 				submit_bio(bio);
+				if (memstall) {
+					psi_memstall_leave(&pflags);
+					memstall = 0;
+				}
 				bio = NULL;
 			}
 
-			if (unlikely(PageWorkingset(page)))
+			if (unlikely(PageWorkingset(page)) && !memstall) {
 				psi_memstall_enter(&pflags);
+				memstall = 1;
+			}
 
 			if (!bio) {
 				bio = bio_alloc(GFP_NOIO, BIO_MAX_PAGES);
@@ -1243,9 +1247,9 @@ static void z_erofs_submit_queue(struct super_block *sb,
 	} while (owned_head != Z_EROFS_PCLUSTER_TAIL);
 
 	if (bio) {
-		if (!pflags)
-			psi_memstall_leave(&pflags);
 		submit_bio(bio);
+		if (memstall)
+			psi_memstall_leave(&pflags);
 	}
 
 	/*