Josef Bacik (1):
  btrfs: do not WARN_ON() if we have PageError set

Qu Wenruo (2):
  btrfs: handle sectorsize < PAGE_SIZE case for extent buffer accessors
  btrfs: support page uptodate assertions in subpage mode

 fs/btrfs/ctree.c        |   3 +-
 fs/btrfs/ctree.h        |  38 ++++++++++++-
 fs/btrfs/extent_io.c    | 116 +++++++++++++++++++++++++++-------------
 fs/btrfs/struct-funcs.c |  18 ++++---
 4 files changed, 127 insertions(+), 48 deletions(-)
From: Qu Wenruo wqu@suse.com
mainline inclusion
from mainline-v5.11-rc1
commit 884b07d0f4f7e09d8312008fed04e01d9d2270dc
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IALPSO
CVE: CVE-2022-48902
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------
To support sectorsize < PAGE_SIZE case, we need to take extra care of extent buffer accessors.
Since sectorsize is smaller than PAGE_SIZE, one page can contain multiple tree blocks, so we must use eb->start to determine the real offset to read/write for extent buffer accessors.
This patch introduces two helpers to do this:
- get_eb_page_index()
  This is to calculate the index to access extent_buffer::pages.
  It's just a simple wrapper around "start >> PAGE_SHIFT".

  For sectorsize == PAGE_SIZE case, nothing is changed.
  For sectorsize < PAGE_SIZE case, we always get index as 0, and
  the existing page shift also works.

- get_eb_offset_in_page()
  This is to calculate the offset to access extent_buffer::pages.
  This needs to take extent_buffer::start into consideration.

  For sectorsize == PAGE_SIZE case, extent_buffer::start is always
  aligned to PAGE_SIZE, thus adding extent_buffer::start to
  offset_in_page() won't change the result.
  For sectorsize < PAGE_SIZE case, adding extent_buffer::start gives
  us the correct offset to access.
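
The arithmetic behind the two helpers can be tried out in userspace. The following is only a sketch: the sketch_* names, the reduced struct, and the fixed 64K page size are assumptions made for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: model a system with 64K pages. */
#define SKETCH_PAGE_SHIFT	16
#define SKETCH_PAGE_SIZE	(1UL << SKETCH_PAGE_SHIFT)

/* Hypothetical stand-in for struct extent_buffer, reduced to two fields. */
struct sketch_eb {
	uint64_t start;		/* logical bytenr of the tree block */
	unsigned long len;	/* size of the tree block (nodesize) */
};

/* Userspace equivalent of offset_in_page(): keep the low PAGE_SHIFT bits. */
static inline unsigned long sketch_offset_in_page(uint64_t addr)
{
	return (unsigned long)(addr & (SKETCH_PAGE_SIZE - 1));
}

/* Mirrors get_eb_offset_in_page(): add eb->start before masking. */
static inline size_t sketch_eb_offset_in_page(const struct sketch_eb *eb,
					      unsigned long offset)
{
	assert(offset < eb->len);
	return sketch_offset_in_page(eb->start + offset);
}

/* Mirrors get_eb_page_index(): a plain shift by the page shift. */
static inline unsigned long sketch_eb_page_index(unsigned long offset)
{
	return offset >> SKETCH_PAGE_SHIFT;
}
```

With these assumptions, a 16K tree block starting at bytenr 0x14000 sits 0x4000 bytes into its 64K page, so offset 100 inside the block maps to page index 0 and in-page offset 0x4000 + 100. A block whose start is page aligned, such as 0x20000, maps offset 100 to in-page offset 100, the same result the old offset_in_page(offset) gave.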
This patch will touch the following parts to cover all extent buffer accessors:
- BTRFS_SETGET_HEADER_FUNCS()
- read_extent_buffer()
- read_extent_buffer_to_user()
- memcmp_extent_buffer()
- write_extent_buffer_chunk_tree_uuid()
- write_extent_buffer_fsid()
- write_extent_buffer()
- memzero_extent_buffer()
- copy_extent_buffer_full()
- copy_extent_buffer()
- memcpy_extent_buffer()
- memmove_extent_buffer()
- btrfs_get_token_##bits()
- btrfs_get_##bits()
- btrfs_set_token_##bits()
- btrfs_set_##bits()
- generic_bin_search()
Signed-off-by: Goldwyn Rodrigues rgoldwyn@suse.com
Signed-off-by: Qu Wenruo wqu@suse.com
Reviewed-by: David Sterba dsterba@suse.com
Signed-off-by: David Sterba dsterba@suse.com
Conflicts:
  fs/btrfs/extent_io.c
[commit 74ee79142c0a ("btrfs: reset destination buffer when read_extent_buffer() gets invalid range") from v6.6-rc4 is already backported, and context from read_extent_buffer() is changed]
Signed-off-by: Yu Kuai yukuai3@huawei.com
---
 fs/btrfs/ctree.c        |  3 +-
 fs/btrfs/ctree.h        | 38 ++++++++++++++++++++++--
 fs/btrfs/extent_io.c    | 64 ++++++++++++++++++++++++-----------------
 fs/btrfs/struct-funcs.c | 18 ++++++------
 4 files changed, 85 insertions(+), 38 deletions(-)
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 814f2f07e74c..8b584190633c 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -1748,9 +1748,10 @@ static noinline int generic_bin_search(struct extent_buffer *eb,
 		oip = offset_in_page(offset);
 
 		if (oip + key_size <= PAGE_SIZE) {
-			const unsigned long idx = offset >> PAGE_SHIFT;
+			const unsigned long idx = get_eb_page_index(offset);
 			char *kaddr = page_address(eb->pages[idx]);
 
+			oip = get_eb_offset_in_page(eb, offset);
 			tmp = (struct btrfs_disk_key *)(kaddr + oip);
 		} else {
 			read_extent_buffer(eb, &unaligned, offset, key_size);
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 482842ac397b..6af951d503e3 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1496,13 +1496,14 @@ static inline void btrfs_set_token_##name(struct btrfs_map_token *token,\
 #define BTRFS_SETGET_HEADER_FUNCS(name, type, member, bits)		\
 static inline u##bits btrfs_##name(const struct extent_buffer *eb)	\
 {									\
-	const type *p = page_address(eb->pages[0]);			\
+	const type *p = page_address(eb->pages[0]) +			\
+			offset_in_page(eb->start);			\
 	return get_unaligned_le##bits(&p->member);			\
 }									\
 static inline void btrfs_set_##name(const struct extent_buffer *eb,	\
 				    u##bits val)			\
 {									\
-	type *p = page_address(eb->pages[0]);				\
+	type *p = page_address(eb->pages[0]) + offset_in_page(eb->start); \
 	put_unaligned_le##bits(val, &p->member);			\
 }
 
@@ -3295,6 +3296,39 @@ static inline void assertfail(const char *expr, const char* file, int line) { }
 #define ASSERT(expr)	(void)(expr)
 #endif
 
+/*
+ * Get the correct offset inside the page of extent buffer.
+ *
+ * @eb:		target extent buffer
+ * @start:	offset inside the extent buffer
+ *
+ * Will handle both sectorsize == PAGE_SIZE and sectorsize < PAGE_SIZE cases.
+ */
+static inline size_t get_eb_offset_in_page(const struct extent_buffer *eb,
+					   unsigned long offset)
+{
+	/*
+	 * For sectorsize == PAGE_SIZE case, eb->start will always be aligned
+	 * to PAGE_SIZE, thus adding it won't cause any difference.
+	 *
+	 * For sectorsize < PAGE_SIZE, we must only read the data that belongs
+	 * to the eb, thus we have to take the eb->start into consideration.
+	 */
+	return offset_in_page(offset + eb->start);
+}
+
+static inline unsigned long get_eb_page_index(unsigned long offset)
+{
+	/*
+	 * For sectorsize == PAGE_SIZE case, plain >> PAGE_SHIFT is enough.
+	 *
+	 * For sectorsize < PAGE_SIZE case, we only support 64K PAGE_SIZE,
+	 * and have ensured that all tree blocks are contained in one page,
+	 * thus we always get index == 0.
+	 */
+	return offset >> PAGE_SHIFT;
+}
+
 /*
  * Use that for functions that are conditionally exported for sanity tests but
  * otherwise static
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 104e2f1fe4f7..4dc2ef740220 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -5632,7 +5632,7 @@ void read_extent_buffer(const struct extent_buffer *eb, void *dstv,
 	struct page *page;
 	char *kaddr;
 	char *dst = (char *)dstv;
-	unsigned long i = start >> PAGE_SHIFT;
+	unsigned long i = get_eb_page_index(start);
 
 	if (check_eb_range(eb, start, len)) {
 		/*
@@ -5643,7 +5643,7 @@ void read_extent_buffer(const struct extent_buffer *eb, void *dstv,
 		return;
 	}
 
-	offset = offset_in_page(start);
+	offset = get_eb_offset_in_page(eb, start);
 
 	while (len > 0) {
 		page = eb->pages[i];
@@ -5668,13 +5668,13 @@ int read_extent_buffer_to_user_nofault(const struct extent_buffer *eb,
 	struct page *page;
 	char *kaddr;
 	char __user *dst = (char __user *)dstv;
-	unsigned long i = start >> PAGE_SHIFT;
+	unsigned long i = get_eb_page_index(start);
 	int ret = 0;
 
 	WARN_ON(start > eb->len);
 	WARN_ON(start + len > eb->start + eb->len);
 
-	offset = offset_in_page(start);
+	offset = get_eb_offset_in_page(eb, start);
 
 	while (len > 0) {
 		page = eb->pages[i];
@@ -5703,13 +5703,13 @@ int memcmp_extent_buffer(const struct extent_buffer *eb, const void *ptrv,
 	struct page *page;
 	char *kaddr;
 	char *ptr = (char *)ptrv;
-	unsigned long i = start >> PAGE_SHIFT;
+	unsigned long i = get_eb_page_index(start);
 	int ret = 0;
 
 	if (check_eb_range(eb, start, len))
 		return -EINVAL;
 
-	offset = offset_in_page(start);
+	offset = get_eb_offset_in_page(eb, start);
 
 	while (len > 0) {
 		page = eb->pages[i];
@@ -5735,7 +5735,7 @@ void write_extent_buffer_chunk_tree_uuid(const struct extent_buffer *eb,
 	char *kaddr;
 
 	WARN_ON(!PageUptodate(eb->pages[0]));
-	kaddr = page_address(eb->pages[0]);
+	kaddr = page_address(eb->pages[0]) + get_eb_offset_in_page(eb, 0);
 	memcpy(kaddr + offsetof(struct btrfs_header, chunk_tree_uuid), srcv,
 			BTRFS_FSID_SIZE);
 }
@@ -5745,7 +5745,7 @@ void write_extent_buffer_fsid(const struct extent_buffer *eb, const void *srcv)
 	char *kaddr;
 
 	WARN_ON(!PageUptodate(eb->pages[0]));
-	kaddr = page_address(eb->pages[0]);
+	kaddr = page_address(eb->pages[0]) + get_eb_offset_in_page(eb, 0);
 	memcpy(kaddr + offsetof(struct btrfs_header, fsid), srcv,
 			BTRFS_FSID_SIZE);
 }
@@ -5758,12 +5758,12 @@ void write_extent_buffer(const struct extent_buffer *eb, const void *srcv,
 	struct page *page;
 	char *kaddr;
 	char *src = (char *)srcv;
-	unsigned long i = start >> PAGE_SHIFT;
+	unsigned long i = get_eb_page_index(start);
 
 	if (check_eb_range(eb, start, len))
 		return;
 
-	offset = offset_in_page(start);
+	offset = get_eb_offset_in_page(eb, start);
 
 	while (len > 0) {
 		page = eb->pages[i];
@@ -5787,12 +5787,12 @@ void memzero_extent_buffer(const struct extent_buffer *eb, unsigned long start,
 	size_t offset;
 	struct page *page;
 	char *kaddr;
-	unsigned long i = start >> PAGE_SHIFT;
+	unsigned long i = get_eb_page_index(start);
 
 	if (check_eb_range(eb, start, len))
 		return;
 
-	offset = offset_in_page(start);
+	offset = get_eb_offset_in_page(eb, start);
 
 	while (len > 0) {
 		page = eb->pages[i];
@@ -5816,10 +5816,20 @@ void copy_extent_buffer_full(const struct extent_buffer *dst,
 
 	ASSERT(dst->len == src->len);
 
-	num_pages = num_extent_pages(dst);
-	for (i = 0; i < num_pages; i++)
-		copy_page(page_address(dst->pages[i]),
-				page_address(src->pages[i]));
+	if (dst->fs_info->sectorsize == PAGE_SIZE) {
+		num_pages = num_extent_pages(dst);
+		for (i = 0; i < num_pages; i++)
+			copy_page(page_address(dst->pages[i]),
+				  page_address(src->pages[i]));
+	} else {
+		size_t src_offset = get_eb_offset_in_page(src, 0);
+		size_t dst_offset = get_eb_offset_in_page(dst, 0);
+
+		ASSERT(src->fs_info->sectorsize < PAGE_SIZE);
+		memcpy(page_address(dst->pages[0]) + dst_offset,
+		       page_address(src->pages[0]) + src_offset,
+		       src->len);
+	}
 }
 
 void copy_extent_buffer(const struct extent_buffer *dst,
@@ -5832,7 +5842,7 @@ void copy_extent_buffer(const struct extent_buffer *dst,
 	size_t offset;
 	struct page *page;
 	char *kaddr;
-	unsigned long i = dst_offset >> PAGE_SHIFT;
+	unsigned long i = get_eb_page_index(dst_offset);
 
 	if (check_eb_range(dst, dst_offset, len) ||
 	    check_eb_range(src, src_offset, len))
@@ -5840,7 +5850,7 @@ void copy_extent_buffer(const struct extent_buffer *dst,
 
 	WARN_ON(src->len != dst_len);
 
-	offset = offset_in_page(dst_offset);
+	offset = get_eb_offset_in_page(dst, dst_offset);
 
 	while (len > 0) {
 		page = dst->pages[i];
@@ -5884,7 +5894,7 @@ static inline void eb_bitmap_offset(const struct extent_buffer *eb,
 	 * the bitmap item in the extent buffer + the offset of the byte in the
 	 * bitmap item.
 	 */
-	offset = start + byte_offset;
+	offset = start + offset_in_page(eb->start) + byte_offset;
 
 	*page_index = offset >> PAGE_SHIFT;
 	*page_offset = offset_in_page(offset);
@@ -6038,11 +6048,11 @@ void memcpy_extent_buffer(const struct extent_buffer *dst,
 		return;
 
 	while (len > 0) {
-		dst_off_in_page = offset_in_page(dst_offset);
-		src_off_in_page = offset_in_page(src_offset);
+		dst_off_in_page = get_eb_offset_in_page(dst, dst_offset);
+		src_off_in_page = get_eb_offset_in_page(dst, src_offset);
 
-		dst_i = dst_offset >> PAGE_SHIFT;
-		src_i = src_offset >> PAGE_SHIFT;
+		dst_i = get_eb_page_index(dst_offset);
+		src_i = get_eb_page_index(src_offset);
 
 		cur = min(len, (unsigned long)(PAGE_SIZE -
 					       src_off_in_page));
@@ -6078,11 +6088,11 @@ void memmove_extent_buffer(const struct extent_buffer *dst,
 		return;
 	}
 	while (len > 0) {
-		dst_i = dst_end >> PAGE_SHIFT;
-		src_i = src_end >> PAGE_SHIFT;
+		dst_i = get_eb_page_index(dst_end);
+		src_i = get_eb_page_index(src_end);
 
-		dst_off_in_page = offset_in_page(dst_end);
-		src_off_in_page = offset_in_page(src_end);
+		dst_off_in_page = get_eb_offset_in_page(dst, dst_end);
+		src_off_in_page = get_eb_offset_in_page(dst, src_end);
 
 		cur = min_t(unsigned long, len, src_off_in_page + 1);
 		cur = min(cur, dst_off_in_page + 1);
diff --git a/fs/btrfs/struct-funcs.c b/fs/btrfs/struct-funcs.c
index c46be27be700..8260f8bb3ff0 100644
--- a/fs/btrfs/struct-funcs.c
+++ b/fs/btrfs/struct-funcs.c
@@ -57,8 +57,9 @@ u##bits btrfs_get_token_##bits(struct btrfs_map_token *token,		\
 			       const void *ptr, unsigned long off)	\
 {									\
 	const unsigned long member_offset = (unsigned long)ptr + off;	\
-	const unsigned long idx = member_offset >> PAGE_SHIFT;		\
-	const unsigned long oip = offset_in_page(member_offset);	\
+	const unsigned long idx = get_eb_page_index(member_offset);	\
+	const unsigned long oip = get_eb_offset_in_page(token->eb,	\
+							member_offset);	\
 	const int size = sizeof(u##bits);				\
 	u8 lebytes[sizeof(u##bits)];					\
 	const int part = PAGE_SIZE - oip;				\
@@ -85,8 +86,8 @@ u##bits btrfs_get_##bits(const struct extent_buffer *eb,		\
 			 const void *ptr, unsigned long off)		\
 {									\
 	const unsigned long member_offset = (unsigned long)ptr + off;	\
-	const unsigned long oip = offset_in_page(member_offset);	\
-	const unsigned long idx = member_offset >> PAGE_SHIFT;		\
+	const unsigned long oip = get_eb_offset_in_page(eb, member_offset); \
+	const unsigned long idx = get_eb_page_index(member_offset);	\
 	char *kaddr = page_address(eb->pages[idx]);			\
 	const int size = sizeof(u##bits);				\
 	const int part = PAGE_SIZE - oip;				\
@@ -106,8 +107,9 @@ void btrfs_set_token_##bits(struct btrfs_map_token *token,		\
 			    u##bits val)				\
 {									\
 	const unsigned long member_offset = (unsigned long)ptr + off;	\
-	const unsigned long idx = member_offset >> PAGE_SHIFT;		\
-	const unsigned long oip = offset_in_page(member_offset);	\
+	const unsigned long idx = get_eb_page_index(member_offset);	\
+	const unsigned long oip = get_eb_offset_in_page(token->eb,	\
+							member_offset);	\
 	const int size = sizeof(u##bits);				\
 	u8 lebytes[sizeof(u##bits)];					\
 	const int part = PAGE_SIZE - oip;				\
@@ -136,8 +138,8 @@ void btrfs_set_##bits(const struct extent_buffer *eb, void *ptr,	\
 		      unsigned long off, u##bits val)			\
 {									\
 	const unsigned long member_offset = (unsigned long)ptr + off;	\
-	const unsigned long oip = offset_in_page(member_offset);	\
-	const unsigned long idx = member_offset >> PAGE_SHIFT;		\
+	const unsigned long oip = get_eb_offset_in_page(eb, member_offset); \
+	const unsigned long idx = get_eb_page_index(member_offset);	\
 	char *kaddr = page_address(eb->pages[idx]);			\
 	const int size = sizeof(u##bits);				\
 	const int part = PAGE_SIZE - oip;				\
From: Qu Wenruo wqu@suse.com
mainline inclusion
from mainline-v5.13-rc1
commit b8f957715eae0490ceca13da43d43e9f1eba39ac
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IALPSO
CVE: CVE-2022-48902
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------
There are quite a few assert checks on page uptodate in extent buffer write accessors. They ensure the destination page is already uptodate.
This is fine for regular sector size case, but not for subpage case, as for subpage we only mark the page uptodate if the page contains no hole and all its extent buffers are uptodate.
So instead of checking PageUptodate(), for subpage case we check the uptodate bitmap of btrfs_subpage structure.
To make the check more elegant, introduce a helper, assert_eb_page_uptodate() to do the check for both subpage and regular sector size cases.
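
The difference between the page-level flag and the per-range check can be modelled in userspace. This is a sketch under stated assumptions: 64K pages, 4K sectors, and a hypothetical sketch_subpage struct holding a 16-bit uptodate bitmap. It does not reproduce the real btrfs_subpage layout or the signature of btrfs_subpage_test_uptodate().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumptions: 64K pages and 4K sectors, so 16 sectors per page. */
#define SKETCH_PAGE_SIZE	65536u
#define SKETCH_SECTORSIZE	4096u

/* Hypothetical per-page state: one uptodate bit per sector. */
struct sketch_subpage {
	uint64_t page_start;		/* bytenr of the first byte in the page */
	uint16_t uptodate_bitmap;
};

/* Build the bitmap mask covering [start, start + len) inside the page. */
static inline uint16_t sketch_range_mask(const struct sketch_subpage *sp,
					 uint64_t start, uint32_t len)
{
	uint32_t first, nbits;

	assert(start >= sp->page_start);
	assert(start + len <= sp->page_start + SKETCH_PAGE_SIZE);
	assert((start - sp->page_start) % SKETCH_SECTORSIZE == 0);
	assert(len > 0 && len % SKETCH_SECTORSIZE == 0);

	first = (uint32_t)(start - sp->page_start) / SKETCH_SECTORSIZE;
	nbits = len / SKETCH_SECTORSIZE;
	return (uint16_t)(((1u << nbits) - 1u) << first);
}

/* Mark every sector of the range uptodate. */
static inline void sketch_set_uptodate(struct sketch_subpage *sp,
				       uint64_t start, uint32_t len)
{
	sp->uptodate_bitmap |= sketch_range_mask(sp, start, len);
}

/* Range check: true only if every sector of the range is uptodate. */
static inline bool sketch_test_uptodate(const struct sketch_subpage *sp,
					uint64_t start, uint32_t len)
{
	uint16_t mask = sketch_range_mask(sp, start, len);

	return (sp->uptodate_bitmap & mask) == mask;
}

/* Page-level view: uptodate only when all 16 sectors are. */
static inline bool sketch_page_fully_uptodate(const struct sketch_subpage *sp)
{
	return sp->uptodate_bitmap == 0xffff;
}
```

In this model, marking one 16K extent buffer at offset 16K of the page sets bits 4 to 7 only. The range check for that buffer passes while the page-level check still fails, which is the situation where a plain PageUptodate() assertion would warn spuriously.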
The following functions are involved:
- write_extent_buffer_chunk_tree_uuid()
- write_extent_buffer_fsid()
- write_extent_buffer()
- memzero_extent_buffer()
- copy_extent_buffer()
- extent_buffer_test_bit()
- extent_buffer_bitmap_set()
- extent_buffer_bitmap_clear()
Signed-off-by: Qu Wenruo wqu@suse.com
Reviewed-by: David Sterba dsterba@suse.com
Signed-off-by: David Sterba dsterba@suse.com
Signed-off-by: Yu Kuai yukuai3@huawei.com
---
 fs/btrfs/extent_io.c | 42 ++++++++++++++++++++++++++++++++----------
 1 file changed, 32 insertions(+), 10 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 4dc2ef740220..f41a4d02612c 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -5729,12 +5729,34 @@ int memcmp_extent_buffer(const struct extent_buffer *eb, const void *ptrv,
 	return ret;
 }
 
+/*
+ * Check that the extent buffer is uptodate.
+ *
+ * For regular sector size == PAGE_SIZE case, check if @page is uptodate.
+ * For subpage case, check if the range covered by the eb has EXTENT_UPTODATE.
+ */
+static void assert_eb_page_uptodate(const struct extent_buffer *eb,
+				    struct page *page)
+{
+	struct btrfs_fs_info *fs_info = eb->fs_info;
+
+	if (fs_info->sectorsize < PAGE_SIZE) {
+		bool uptodate;
+
+		uptodate = btrfs_subpage_test_uptodate(fs_info, page,
+						       eb->start, eb->len);
+		WARN_ON(!uptodate);
+	} else {
+		WARN_ON(!PageUptodate(page));
+	}
+}
+
 void write_extent_buffer_chunk_tree_uuid(const struct extent_buffer *eb,
 		const void *srcv)
 {
 	char *kaddr;
 
-	WARN_ON(!PageUptodate(eb->pages[0]));
+	assert_eb_page_uptodate(eb, eb->pages[0]);
 	kaddr = page_address(eb->pages[0]) + get_eb_offset_in_page(eb, 0);
 	memcpy(kaddr + offsetof(struct btrfs_header, chunk_tree_uuid), srcv,
 			BTRFS_FSID_SIZE);
@@ -5744,7 +5766,7 @@ void write_extent_buffer_fsid(const struct extent_buffer *eb, const void *srcv)
 {
 	char *kaddr;
 
-	WARN_ON(!PageUptodate(eb->pages[0]));
+	assert_eb_page_uptodate(eb, eb->pages[0]);
 	kaddr = page_address(eb->pages[0]) + get_eb_offset_in_page(eb, 0);
 	memcpy(kaddr + offsetof(struct btrfs_header, fsid), srcv,
 			BTRFS_FSID_SIZE);
@@ -5767,7 +5789,7 @@ void write_extent_buffer(const struct extent_buffer *eb, const void *srcv,
 
 	while (len > 0) {
 		page = eb->pages[i];
-		WARN_ON(!PageUptodate(page));
+		assert_eb_page_uptodate(eb, page);
 
 		cur = min(len, PAGE_SIZE - offset);
 		kaddr = page_address(page);
@@ -5796,7 +5818,7 @@ void memzero_extent_buffer(const struct extent_buffer *eb, unsigned long start,
 
 	while (len > 0) {
 		page = eb->pages[i];
-		WARN_ON(!PageUptodate(page));
+		assert_eb_page_uptodate(eb, page);
 
 		cur = min(len, PAGE_SIZE - offset);
 		kaddr = page_address(page);
@@ -5854,7 +5876,7 @@ void copy_extent_buffer(const struct extent_buffer *dst,
 
 	while (len > 0) {
 		page = dst->pages[i];
-		WARN_ON(!PageUptodate(page));
+		assert_eb_page_uptodate(dst, page);
 
 		cur = min(len, (unsigned long)(PAGE_SIZE - offset));
 
@@ -5916,7 +5938,7 @@ int extent_buffer_test_bit(const struct extent_buffer *eb, unsigned long start,
 
 	eb_bitmap_offset(eb, start, nr, &i, &offset);
 	page = eb->pages[i];
-	WARN_ON(!PageUptodate(page));
+	assert_eb_page_uptodate(eb, page);
 	kaddr = page_address(page);
 	return 1U & (kaddr[offset] >> (nr & (BITS_PER_BYTE - 1)));
 }
@@ -5941,7 +5963,7 @@ void extent_buffer_bitmap_set(const struct extent_buffer *eb, unsigned long star
 
 	eb_bitmap_offset(eb, start, pos, &i, &offset);
 	page = eb->pages[i];
-	WARN_ON(!PageUptodate(page));
+	assert_eb_page_uptodate(eb, page);
 	kaddr = page_address(page);
 
 	while (len >= bits_to_set) {
@@ -5952,7 +5974,7 @@ void extent_buffer_bitmap_set(const struct extent_buffer *eb, unsigned long star
 		if (++offset >= PAGE_SIZE && len > 0) {
 			offset = 0;
 			page = eb->pages[++i];
-			WARN_ON(!PageUptodate(page));
+			assert_eb_page_uptodate(eb, page);
 			kaddr = page_address(page);
 		}
 	}
@@ -5984,7 +6006,7 @@ void extent_buffer_bitmap_clear(const struct extent_buffer *eb,
 
 	eb_bitmap_offset(eb, start, pos, &i, &offset);
 	page = eb->pages[i];
-	WARN_ON(!PageUptodate(page));
+	assert_eb_page_uptodate(eb, page);
 	kaddr = page_address(page);
 
 	while (len >= bits_to_clear) {
@@ -5995,7 +6017,7 @@ void extent_buffer_bitmap_clear(const struct extent_buffer *eb,
 		if (++offset >= PAGE_SIZE && len > 0) {
 			offset = 0;
 			page = eb->pages[++i];
-			WARN_ON(!PageUptodate(page));
+			assert_eb_page_uptodate(eb, page);
 			kaddr = page_address(page);
 		}
 	}
From: Josef Bacik josef@toxicpanda.com
mainline inclusion
from mainline-v5.17-rc7
commit a50e1fcbc9b85fd4e95b89a75c0884cb032a3e06
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IALPSO
CVE: CVE-2022-48902
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-------------------------------------------
Whenever we do any extent buffer operations we call assert_eb_page_uptodate() to complain loudly if we're operating on a non-uptodate page. Our overnight tests caught this warning earlier this week:
  WARNING: CPU: 1 PID: 553508 at fs/btrfs/extent_io.c:6849 assert_eb_page_uptodate+0x3f/0x50
  CPU: 1 PID: 553508 Comm: kworker/u4:13 Tainted: G        W         5.17.0-rc3+ #564
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
  Workqueue: btrfs-cache btrfs_work_helper
  RIP: 0010:assert_eb_page_uptodate+0x3f/0x50
  RSP: 0018:ffffa961440a7c68 EFLAGS: 00010246
  RAX: 0017ffffc0002112 RBX: ffffe6e74453f9c0 RCX: 0000000000001000
  RDX: ffffe6e74467c887 RSI: ffffe6e74453f9c0 RDI: ffff8d4c5efc2fc0
  RBP: 0000000000000d56 R08: ffff8d4d4a224000 R09: 0000000000000000
  R10: 00015817fa9d1ef0 R11: 000000000000000c R12: 00000000000007b1
  R13: ffff8d4c5efc2fc0 R14: 0000000001500000 R15: 0000000001cb1000
  FS:  0000000000000000(0000) GS:ffff8d4dbbd00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007ff31d3448d8 CR3: 0000000118be8004 CR4: 0000000000370ee0
  Call Trace:

   extent_buffer_test_bit+0x3f/0x70
   free_space_test_bit+0xa6/0xc0
   load_free_space_tree+0x1f6/0x470
   caching_thread+0x454/0x630
   ? rcu_read_lock_sched_held+0x12/0x60
   ? rcu_read_lock_sched_held+0x12/0x60
   ? rcu_read_lock_sched_held+0x12/0x60
   ? lock_release+0x1f0/0x2d0
   btrfs_work_helper+0xf2/0x3e0
   ? lock_release+0x1f0/0x2d0
   ? finish_task_switch.isra.0+0xf9/0x3a0
   process_one_work+0x26d/0x580
   ? process_one_work+0x580/0x580
   worker_thread+0x55/0x3b0
   ? process_one_work+0x580/0x580
   kthread+0xf0/0x120
   ? kthread_complete_and_exit+0x20/0x20
   ret_from_fork+0x1f/0x30
This was partially fixed by c2e39305299f01 ("btrfs: clear extent buffer uptodate when we fail to write it"), however all that fix did was keep us from finding extent buffers after a failed writeout. It didn't keep us from continuing to use a buffer that we already had found.
In this case we're searching the commit root to cache the block group, so we can start committing the transaction and switch the commit root and then start writing. After the switch we can look up an extent buffer that hasn't been written yet and start processing that block group. Then we fail to write that block out and clear Uptodate on the page, and then we start spewing these errors.
Normally we're protected by the tree lock to a certain degree here. If we read a block we have that block read locked, and we block the writer from locking the block before we submit it for the write. However this isn't necessarily foolproof, because the read could happen before we do the submit_bio and after we locked and unlocked the extent buffer.
Also in this particular case we have path->skip_locking set, so that won't save us here. We'll simply get a block that was valid when we read it, but became invalid while we were using it.
What we really want is to catch the case where we've "read" a block but it's not marked Uptodate. On read we ClearPageError(), so if we're !Uptodate and !Error we know we didn't do the right thing for reading the page.
Fix this by checking !Uptodate && !Error. This way we will not complain if our buffer gets invalidated while we're using it, and we'll maintain the spirit of the check, which is to make sure we have a fully in-cache block while we're messing with it.
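
The change in the warning condition can be written out as two small predicates. This is an illustrative sketch only: the sketch_page_state struct and both function names are hypothetical, and the assumption that a failed writeout leaves the page !Uptodate with Error set follows the reasoning in this commit message rather than code shown in the patch.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the two page flags the check looks at. */
struct sketch_page_state {
	bool uptodate;
	bool error;
};

/* Old condition: warn whenever the page is not uptodate. */
static inline bool sketch_old_should_warn(struct sketch_page_state s)
{
	return !s.uptodate;
}

/*
 * New condition: warn only when the page is neither uptodate nor in
 * error, i.e. a block we believe we read but that was never marked
 * uptodate.
 */
static inline bool sketch_new_should_warn(struct sketch_page_state s)
{
	return !s.uptodate && !s.error;
}
```

Under that assumption, a buffer invalidated by a failed write while a reader still holds it has uptodate == false and error == true: the old predicate warns and the new one stays quiet. A page with both flags clear still triggers the warning in both versions.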
CC: stable@vger.kernel.org # 5.4+
Signed-off-by: Josef Bacik josef@toxicpanda.com
Signed-off-by: David Sterba dsterba@suse.com
Signed-off-by: Yu Kuai yukuai3@huawei.com
---
 fs/btrfs/extent_io.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index f41a4d02612c..888659f6eac8 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -5740,14 +5740,24 @@ static void assert_eb_page_uptodate(const struct extent_buffer *eb,
 {
 	struct btrfs_fs_info *fs_info = eb->fs_info;
 
+	/*
+	 * If we are using the commit root we could potentially clear a page
+	 * Uptodate while we're using the extent buffer that we've previously
+	 * looked up.  We don't want to complain in this case, as the page was
+	 * valid before, we just didn't write it out.  Instead we want to catch
+	 * the case where we didn't actually read the block properly, which
+	 * would have !PageUptodate && !PageError, as we clear PageError before
+	 * reading.
+	 */
 	if (fs_info->sectorsize < PAGE_SIZE) {
-		bool uptodate;
+		bool uptodate, error;
 
 		uptodate = btrfs_subpage_test_uptodate(fs_info, page,
 						       eb->start, eb->len);
-		WARN_ON(!uptodate);
+		error = btrfs_subpage_test_error(fs_info, page, eb->start, eb->len);
+		WARN_ON(!uptodate && !error);
 	} else {
-		WARN_ON(!PageUptodate(page));
+		WARN_ON(!PageUptodate(page) && !PageError(page));
 	}
 }
 
FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://gitee.com/openeuler/kernel/pulls/11401 Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/5...