This patch set fixes two corruption problems:
Long Li (1): xfs: shutdown to ensure submits buffers on LSN boundaries
yangerkun (1): xfs: shutdown xfs once inode double free
 fs/xfs/libxfs/xfs_ialloc.c | 6 +++++-
 fs/xfs/xfs_log_recover.c   | 6 +++++-
 2 files changed, 10 insertions(+), 2 deletions(-)

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I8RN63
CVE: NA
--------------------------------
While performing an I/O fault injection test, I caught the following data corruption report:
 XFS (dm-0): Internal error ltbno + ltlen > bno at line 1957 of file fs/xfs/libxfs/xfs_alloc.c. Caller xfs_free_ag_extent+0x79c/0x1130
 CPU: 3 PID: 33 Comm: kworker/3:0 Not tainted 6.5.0-rc7-next-20230825-00001-g7f8666926889 #214
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
 Workqueue: xfs-inodegc/dm-0 xfs_inodegc_worker
 Call Trace:
  <TASK>
  dump_stack_lvl+0x50/0x70
  xfs_corruption_error+0x134/0x150
  xfs_free_ag_extent+0x7d3/0x1130
  __xfs_free_extent+0x201/0x3c0
  xfs_trans_free_extent+0x29b/0xa10
  xfs_extent_free_finish_item+0x2a/0xb0
  xfs_defer_finish_noroll+0x8d1/0x1b40
  xfs_defer_finish+0x21/0x200
  xfs_itruncate_extents_flags+0x1cb/0x650
  xfs_free_eofblocks+0x18f/0x250
  xfs_inactive+0x485/0x570
  xfs_inodegc_worker+0x207/0x530
  process_scheduled_works+0x24a/0xe10
  worker_thread+0x5ac/0xc60
  kthread+0x2cd/0x3c0
  ret_from_fork+0x4a/0x80
  ret_from_fork_asm+0x11/0x20
  </TASK>
 XFS (dm-0): Corruption detected. Unmount and run xfs_repair
After analyzing the disk image, we found that the corruption was triggered because the same extent was recorded in both the inode and the AGF btrees. After a long period of reproduction and analysis, we found that the root cause was that the AGF btree block was not recovered.

Consider the following scenario. Transaction A and Transaction B are in the same record, so they share the same LSN1. If the buf item in Transaction A has been recovered, then the buf item in Transaction B cannot be recovered, because log recovery skips items with a metadata LSN >= the current LSN of the recovery item. If Transaction B also contains an inode item that records Extent X, then Extent X will be recorded in both the inode and the AGF btree block after Transaction B is recovered.
  |------------Record (LSN1)------------------|---Record (LSN2)---|
  |----------Trans A------------|-------------Trans B-------------|
  |     Buf Item(Extent X)      | Buf Item / Inode item(Extent X) |
  |     Extent X is freed       |     Extent X is allocated       |
After commit 12818d24db8a ("xfs: rework log recovery to submit buffers on LSN boundaries"), log recovery submits buffers on LSN boundaries. This avoids the above problem on normal paths, but it is not guaranteed on error paths. Consider the following process: if an error is encountered after the buf item in Transaction A is recovered and before the buf item in Transaction B is recovered, the buffers that have already been added to buffer_list are still submitted, which violates the rule of submitting on LSN boundaries. The buf item in Transaction B then cannot be recovered on the next mount, because the current LSN equals the metadata LSN.
  xlog_do_recovery_pass
    xlog_recover_process
      xlog_recover_process_data
        ...
          xlog_recover_buf_commit_pass2
            xlog_recover_do_reg_buffer  //recover buf item in Trans A
            xfs_buf_delwri_queue(bp, buffer_list)
        ...
        ====> Encountered error and returned
        ...
          xlog_recover_buf_commit_pass2
            xlog_recover_do_reg_buffer  //recover buf item in Trans B
            xfs_buf_delwri_queue(bp, buffer_list)
    if (!list_empty(&buffer_list))
      xfs_buf_delwri_submit(&buffer_list);  //submit regardless of error
To make sure that buffers are submitted only on LSN boundaries on error paths as well, we need to check the error status before submitting the buffers that have been added from the last record processed. If an error has occurred, shut down the filesystem before the submit.
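
For readers who want to step through the scenario, here is a minimal userspace sketch of it. It is a toy model and not kernel code: every toy_* name is hypothetical, a single buffer stands in for the AGF btree block, and the model assumes that a write submitted after shutdown never reaches the disk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy userspace model, not kernel code; all toy_* names are hypothetical. */
struct toy_disk_buf {
	uint64_t	lsn;	/* LSN stamped in the on-disk metadata */
	int		data;	/* stand-in for the AGF btree block contents */
};

struct toy_buf_item {
	uint64_t	lsn;	/* LSN of the record that holds the item */
	int		data;	/* block contents logged by the transaction */
};

/*
 * Run one recovery pass over items[0..nitems). An error is injected just
 * before item number fail_before is replayed (pass -1 for no error).
 * Returns 0 on success or -1 on the injected error.
 */
static int toy_recovery_pass(struct toy_disk_buf *disk,
			     const struct toy_buf_item *items, int nitems,
			     int fail_before, bool shutdown_on_error)
{
	int		mem_data = disk->data;	/* in-memory copy of the buffer */
	uint64_t	mem_lsn = disk->lsn;
	bool		queued = false;		/* buffer sits on buffer_list */
	int		error = 0;

	for (int i = 0; i < nitems; i++) {
		if (i == fail_before) {
			error = -1;
			break;
		}
		/* Skip rule: metadata LSN >= item LSN means "already replayed". */
		if (disk->lsn >= items[i].lsn)
			continue;
		mem_data = items[i].data;
		mem_lsn = items[i].lsn;
		queued = true;
	}

	/*
	 * Submit the queued buffer. Assumption of this model: once the
	 * filesystem is shut down, the write never reaches the disk.
	 */
	if (queued && !(error && shutdown_on_error)) {
		disk->data = mem_data;
		disk->lsn = mem_lsn;
	}
	return error;
}
```

With items A = {lsn 1, data 10} and B = {lsn 1, data 20}, an error injected before B and shutdown_on_error set to false, the first pass leaves the disk at {lsn 1, data 10}, and a second, error-free pass skips both items, so B is lost. With shutdown_on_error set to true the disk is left untouched by the failed pass, and the second pass replays both items and ends at {lsn 1, data 20}.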
Signed-off-by: Long Li leo.lilong@huawei.com
---
 fs/xfs/xfs_log_recover.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 7606ce475088..c45c54f73238 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -3220,8 +3220,12 @@ xlog_do_recovery_pass(
 	 * Submit buffers that have been added from the last record processed,
 	 * regardless of error status.
 	 */
-	if (!list_empty(&buffer_list))
+	if (!list_empty(&buffer_list)) {
+		if (error)
+			xfs_force_shutdown(log->l_mp, SHUTDOWN_META_IO_ERROR);
+
 		error2 = xfs_buf_delwri_submit(&buffer_list);
+	}
 
 	if (error && first_bad)
 		*first_bad = rhead_blk;
From: yangerkun yangerkun@huawei.com
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I8RNQD
CVE: NA
--------------------------------
If the same inode is freed twice in xfs_difree_inobt, ir_freecount no longer matches ir_free (ir_freecount is incremented twice, but ir_free changes only once), and a later xfs_inobt_get_rec will complain about the mismatch between ir_freecount and ir_free. If xfs_inobt_get_rec is called while processing the AGI unlinked lists, the xfs mount fails.

We have not found the root cause of why the same inode is freed twice, but we really should reject this case so that the damage does not spread.
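
The mismatch can be reproduced with a small userspace sketch. It is a toy model under stated assumptions, not the kernel implementation: the toy_* names and the TOY_EFSCORRUPTED value are hypothetical stand-ins, sparse inode chunks are ignored, and the consistency rule is simplified to "number of set bits in ir_free equals ir_freecount".

```c
#include <assert.h>
#include <stdint.h>

#define TOY_INODES_PER_CHUNK	64
#define TOY_EFSCORRUPTED	(-117)	/* hypothetical stand-in error code */

/* Toy model of an inobt record; field names follow the text above. */
struct toy_inobt_rec {
	uint64_t	ir_free;	/* bit set => inode in chunk is free */
	int		ir_freecount;	/* number of free inodes in chunk */
};

/* Count the set bits in the free mask. */
static int toy_popcount64(uint64_t v)
{
	int n = 0;

	for (; v; v &= v - 1)
		n++;
	return n;
}

/* Unchecked free: what happens when nothing rejects the second free. */
static void toy_difree_unchecked(struct toy_inobt_rec *rec, unsigned int off)
{
	assert(off < TOY_INODES_PER_CHUNK);
	rec->ir_free |= UINT64_C(1) << off;	/* no change on second free */
	rec->ir_freecount++;			/* but the count still grows */
}

/* Checked free: reject an inode whose free bit is already set. */
static int toy_difree_checked(struct toy_inobt_rec *rec, unsigned int off)
{
	assert(off < TOY_INODES_PER_CHUNK);
	if (rec->ir_free & (UINT64_C(1) << off))
		return TOY_EFSCORRUPTED;
	rec->ir_free |= UINT64_C(1) << off;
	rec->ir_freecount++;
	return 0;
}

/* Simplified consistency rule that a later record lookup would enforce. */
static int toy_rec_is_consistent(const struct toy_inobt_rec *rec)
{
	return toy_popcount64(rec->ir_free) == rec->ir_freecount;
}
```

Calling toy_difree_unchecked twice with the same offset leaves one bit set and a count of 2, so toy_rec_is_consistent fails. The checked variant returns the error on the second call and leaves the record consistent.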
Signed-off-by: yangerkun yangerkun@huawei.com
Signed-off-by: Long Li leo.lilong@huawei.com
---
 fs/xfs/libxfs/xfs_ialloc.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
index b83e54c70906..a51e2de3eb5a 100644
--- a/fs/xfs/libxfs/xfs_ialloc.c
+++ b/fs/xfs/libxfs/xfs_ialloc.c
@@ -1970,7 +1970,11 @@ xfs_difree_inobt(
 	 */
 	off = agino - rec.ir_startino;
 	ASSERT(off >= 0 && off < XFS_INODES_PER_CHUNK);
-	ASSERT(!(rec.ir_free & XFS_INOBT_MASK(off)));
+
+	if (XFS_IS_CORRUPT(mp, rec.ir_free & XFS_INOBT_MASK(off))) {
+		error = -EFSCORRUPTED;
+		goto error0;
+	}
 	/*
 	 * Mark the inode free & increment the count.
 	 */
FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://gitee.com/openeuler/kernel/pulls/3608 Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/3...