[PATCH OLK-6.6 0/9] ext4: correct behaviors under errors=remount-ro mode

Baokun Li (8): sb_writers: fix kabi broken Revert "ext4: don't set SB_RDONLY after filesystem errors" Revert "ext4: replace read-only check for shutdown check in mmp code" Revert "ext4: drop read-only check from ext4_force_commit()" Revert "ext4: drop read-only check in ext4_write_inode()" Revert "ext4: drop read-only check in ext4_init_inode_table()" Revert "ext4: warn on read-only filesystem in ext4_journal_check_start()" Revert "ext4: drop EXT4_MF_FS_ABORTED flag" Jan Kara (1): fs: fix a hungtask problem when freeze/unfreeze fs fs/ext4/ext4.h | 1 + fs/ext4/ext4_jbd2.c | 3 +-- fs/ext4/fsync.c | 7 ++++--- fs/ext4/ialloc.c | 6 ++++++ fs/ext4/inode.c | 11 +++++++---- fs/ext4/mballoc.c | 4 ++-- fs/ext4/mmp.c | 2 +- fs/ext4/super.c | 21 +++++++++++++-------- fs/super.c | 9 ++++++++- include/linux/fs.h | 5 ++++- 10 files changed, 47 insertions(+), 22 deletions(-) -- 2.46.1

From: Jan Kara <jack@suse.cz> maillist inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IBEQJ3 Reference: https://lore.kernel.org/r/20210104160457.GG4018@quack2.suse.cz/ -------------------------------- We found the following deadlock when running xfstests generic/390 with ext4 filesystem, and simutaneously offlining/onlining the disk we tested. It will cause a deadlock whose call trace is like this: fsstress D 0 11672 11625 0x00000080 Call Trace: ? __schedule+0x2fc/0x930 ? filename_parentat+0x10b/0x1a0 schedule+0x28/0x70 rwsem_down_read_failed+0x102/0x1c0 ? __percpu_down_read+0x93/0xb0 __percpu_down_read+0x93/0xb0 __sb_start_write+0x5f/0x70 mnt_want_write+0x20/0x50 do_renameat2+0x1f3/0x550 __x64_sys_rename+0x1c/0x20 do_syscall_64+0x5b/0x1b0 entry_SYSCALL_64_after_hwframe+0x65/0xca The root cause is that when ext4 hits IO error due to disk being offline, it will switch itself into read-only state. When it is frozen at that moment, following thaw_super() call will not unlock percpu freeze semaphores (as the fs is read-only) causing the deadlock. Fix the problem by tracking whether the superblock was read-only at the time we were freezing it. Reported-and-tested-by: Shijie Luo <luoshijie1@huawei.com> Closes: https://lore.kernel.org/r/20201226095641.17290-1-luoshijie1@huawei.com/ Fixes: 18e9e5104fcd ("Introduce freeze_super and thaw_super for the fsfreeze ioctl") Signed-off-by: geruijun <geruijun@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> --- fs/super.c | 9 ++++++++- include/linux/fs.h | 2 ++ 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/fs/super.c b/fs/super.c index b142e71eb8df..42c9754e3add 100644 --- a/fs/super.c +++ b/fs/super.c @@ -1996,11 +1996,13 @@ int freeze_super(struct super_block *sb, enum freeze_holder who) /* Nothing to do really... */ sb->s_writers.freeze_holders |= who; sb->s_writers.frozen = SB_FREEZE_COMPLETE; + sb->s_writers.frozen_ro = 1; wake_up_var(&sb->s_writers.frozen); super_unlock_excl(sb); return 0; } + sb->s_writers.frozen_ro = 0; sb->s_writers.frozen = SB_FREEZE_WRITE; /* Release s_umount to preserve sb_start_write -> s_umount ordering */ super_unlock_excl(sb); @@ -2082,7 +2084,12 @@ static int thaw_super_locked(struct super_block *sb, enum freeze_holder who) return -EINVAL; } - if (sb_rdonly(sb)) { + /* + * Was the fs frozen in read-only state? Note that we cannot just check + * sb_rdonly(sb) as the filesystem might have switched to read-only + * state due to internal errors or so. + */ + if (sb->s_writers.frozen_ro) { sb->s_writers.freeze_holders &= ~who; sb->s_writers.frozen = SB_UNFROZEN; wake_up_var(&sb->s_writers.frozen); diff --git a/include/linux/fs.h b/include/linux/fs.h index 0efc8bea182d..8c0cdb73a838 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1225,6 +1225,8 @@ enum { struct sb_writers { unsigned short frozen; /* Is sb frozen? */ + unsigned short frozen_ro; /* Was sb read-only + * when frozen? */ unsigned short freeze_holders; /* Who froze fs? */ struct percpu_rw_semaphore rw_sem[SB_FREEZE_LEVELS]; }; -- 2.46.1

hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IBEQJ3 -------------------------------- Fix the broken KABI by splitting the unsigned short frozen field into two unsigned char fields using KABI_REPLACE2(). Fixes: 8cb6859cb789 ("[Backport] fs: fix a hungtask problem when freeze/unfreeze fs") Signed-off-by: Baokun Li <libaokun1@huawei.com> --- include/linux/fs.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/include/linux/fs.h b/include/linux/fs.h index 8c0cdb73a838..78a064db2de3 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1224,8 +1224,9 @@ enum { #define SB_FREEZE_LEVELS (SB_FREEZE_COMPLETE - 1) struct sb_writers { - unsigned short frozen; /* Is sb frozen? */ - unsigned short frozen_ro; /* Was sb read-only + KABI_REPLACE2(unsigned short frozen, + unsigned char frozen, /* Is sb frozen? */ + unsigned char frozen_ro) /* Was sb read-only * when frozen? */ unsigned short freeze_holders; /* Who froze fs? */ struct percpu_rw_semaphore rw_sem[SB_FREEZE_LEVELS]; -- 2.46.1

hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IBEQJ3 -------------------------------- This reverts commit 88a5f8ce0c0bce7d85d112a8ca819557d428e50f. Because this patch causes the file system to be displayed as rw by mount when an error is triggered in 'errors=remount-ro' mode. The issue this patch was intended to fix has been resolved by 'fs: fix a hungtask problem when freeze/unfreeze fs', therefore it can be safely reverted. Fixes: 88a5f8ce0c0b ("ext4: don't set SB_RDONLY after filesystem errors") Signed-off-by: Baokun Li <libaokun1@huawei.com> --- fs/ext4/super.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 84bee249657d..92e3476d4fe0 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -821,12 +821,11 @@ static void ext4_handle_error(struct super_block *sb, bool force_ro, int error, ext4_msg(sb, KERN_CRIT, "Remounting filesystem read-only"); /* - * EXT4_FLAGS_SHUTDOWN was set which stops all filesystem - * modifications. We don't set SB_RDONLY because that requires - * sb->s_umount semaphore and setting it without proper remount - * procedure is confusing code such as freeze_super() leading to - * deadlocks and other problems. + * Make sure updated value of ->s_mount_flags will be visible before + * ->s_flags update */ + smp_wmb(); + sb->s_flags |= SB_RDONLY; #ifdef CONFIG_EXT4_ERROR_REPORT out: ext4_netlink_send_info(sb, force_ro ? 2 : 1); -- 2.46.1

hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IBEQJ3 -------------------------------- This reverts commit 1e1566b9c85fbd6150657ea17f50fd42b9166d31. We are about to revert 95257987a638 ("ext4: drop EXT4_MF_FS_ABORTED flag"), therefore the changes to the sb_rdonly check are being reverted here. Fixes: 1e1566b9c85f ("ext4: replace read-only check for shutdown check in mmp code") Signed-off-by: Baokun Li <libaokun1@huawei.com> --- fs/ext4/mmp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/ext4/mmp.c b/fs/ext4/mmp.c index d64c04ed061a..55d5a3f1c131 100644 --- a/fs/ext4/mmp.c +++ b/fs/ext4/mmp.c @@ -162,7 +162,7 @@ static int kmmpd(void *data) memcpy(mmp->mmp_nodename, init_utsname()->nodename, sizeof(mmp->mmp_nodename)); - while (!kthread_should_stop() && !ext4_forced_shutdown(sb)) { + while (!kthread_should_stop() && !sb_rdonly(sb)) { if (!ext4_has_feature_mmp(sb)) { ext4_warning(sb, "kmmpd being stopped since MMP feature" " has been disabled."); -- 2.46.1

hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IBEQJ3 -------------------------------- This reverts commit 889860e452d7436ca72018b8a03cbd89c38d6384. We are about to revert 95257987a638 ("ext4: drop EXT4_MF_FS_ABORTED flag"), therefore the changes to the sb_rdonly check are being reverted here. Fixes: 889860e452d7 ("ext4: drop read-only check from ext4_force_commit()") Signed-off-by: Baokun Li <libaokun1@huawei.com> --- fs/ext4/super.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 92e3476d4fe0..0ce332f118d2 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -6436,7 +6436,13 @@ static int ext4_clear_journal_err(struct super_block *sb, */ int ext4_force_commit(struct super_block *sb) { - return ext4_journal_force_commit(EXT4_SB(sb)->s_journal); + journal_t *journal; + + if (sb_rdonly(sb)) + return 0; + + journal = EXT4_SB(sb)->s_journal; + return ext4_journal_force_commit(journal); } static int ext4_sync_fs(struct super_block *sb, int wait) -- 2.46.1

hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IBEQJ3 -------------------------------- This reverts commit f1128084b40e520bea8bb32b3ff4d03745ab7e64. We are about to revert 95257987a638 ("ext4: drop EXT4_MF_FS_ABORTED flag"), therefore the changes to the sb_rdonly check are being reverted here. Fixes: f1128084b40e ("ext4: drop read-only check in ext4_write_inode()") Signed-off-by: Baokun Li <libaokun1@huawei.com> --- fs/ext4/inode.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index f626f1c509c2..d28defcb2403 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -5809,7 +5809,8 @@ int ext4_write_inode(struct inode *inode, struct writeback_control *wbc) { int err; - if (WARN_ON_ONCE(current->flags & PF_MEMALLOC)) + if (WARN_ON_ONCE(current->flags & PF_MEMALLOC) || + sb_rdonly(inode->i_sb)) return 0; if (unlikely(ext4_forced_shutdown(inode->i_sb))) -- 2.46.1

hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IBEQJ3 -------------------------------- This reverts commit ffb6844e28ef6b9d76bee378774d7afbc3db6da9. We are about to revert 95257987a638 ("ext4: drop EXT4_MF_FS_ABORTED flag"), therefore the changes to the sb_rdonly check are being reverted here. Fixes: ffb6844e28ef ("ext4: drop read-only check in ext4_init_inode_table()") Signed-off-by: Baokun Li <libaokun1@huawei.com> --- fs/ext4/ialloc.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c index d6440dffe02b..182252546f87 100644 --- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -1533,6 +1533,12 @@ int ext4_init_inode_table(struct super_block *sb, ext4_group_t group, int num, ret = 0, used_blks = 0; unsigned long used_inos = 0; + /* This should not happen, but just to be sure check this */ + if (sb_rdonly(sb)) { + ret = 1; + goto out; + } + gdp = ext4_get_group_desc(sb, group, &group_desc_bh); if (!gdp || !grp) goto out; -- 2.46.1

hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IBEQJ3 -------------------------------- This reverts commit e7fc2b31e04c46c9e2098bba710c9951c6b968af. We are about to revert 95257987a638 ("ext4: drop EXT4_MF_FS_ABORTED flag"), therefore the changes to the sb_rdonly check are being reverted here. Fixes: e7fc2b31e04c ("ext4: warn on read-only filesystem in ext4_journal_check_start()") Signed-off-by: Baokun Li <libaokun1@huawei.com> --- fs/ext4/ext4_jbd2.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c index cac0eb5068d3..938bc8e157de 100644 --- a/fs/ext4/ext4_jbd2.c +++ b/fs/ext4/ext4_jbd2.c @@ -71,9 +71,8 @@ static int ext4_journal_check_start(struct super_block *sb) if (unlikely(ext4_forced_shutdown(sb))) return -EIO; - if (WARN_ON_ONCE(sb_rdonly(sb))) + if (sb_rdonly(sb)) return -EROFS; - WARN_ON(sb->s_writers.frozen == SB_FREEZE_COMPLETE); journal = EXT4_SB(sb)->s_journal; /* -- 2.46.1

hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IBEQJ3 -------------------------------- This reverts commit 95257987a6387f02970eda707e55a06cce734e18. When an error is triggered in 'errors=remount-ro' mode, the file system becomes read-only, meaning it should be readable but not writable. The reverted patch set EXT4_FLAGS_SHUTDOWN, which caused read operations to also return -EIO. Therefore, this patch was reverted to avoid this mechanism change. Fixes: 95257987a638 ("ext4: drop EXT4_MF_FS_ABORTED flag") Signed-off-by: Baokun Li <libaokun1@huawei.com> --- fs/ext4/ext4.h | 1 + fs/ext4/fsync.c | 7 ++++--- fs/ext4/inode.c | 8 +++++--- fs/ext4/mballoc.c | 4 ++-- fs/ext4/super.c | 4 ++-- 5 files changed, 14 insertions(+), 10 deletions(-) diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 03bfb7f0a4a0..70d51641058a 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -1821,6 +1821,7 @@ static inline int ext4_valid_inum(struct super_block *sb, unsigned long ino) */ enum { EXT4_MF_MNTDIR_SAMPLED, + EXT4_MF_FS_ABORTED, /* Fatal error detected */ EXT4_MF_FC_INELIGIBLE /* Fast commit ineligible */ }; diff --git a/fs/ext4/fsync.c b/fs/ext4/fsync.c index b40d3b29f7e5..bffc1d0994f5 100644 --- a/fs/ext4/fsync.c +++ b/fs/ext4/fsync.c @@ -131,6 +131,7 @@ int ext4_sync_file(struct file *file, loff_t start, loff_t end, int datasync) int ret = 0, err; bool needs_barrier = false; struct inode *inode = file->f_mapping->host; + struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); if (unlikely(ext4_forced_shutdown(inode->i_sb))) return -EIO; @@ -140,14 +141,14 @@ int ext4_sync_file(struct file *file, loff_t start, loff_t end, int datasync) trace_ext4_sync_file_enter(file, datasync); if (sb_rdonly(inode->i_sb)) { - /* Make sure that we read updated s_ext4_flags value */ + /* Make sure that we read updated s_mount_flags value */ smp_rmb(); - if (ext4_forced_shutdown(inode->i_sb)) + if (ext4_test_mount_flag(inode->i_sb, EXT4_MF_FS_ABORTED)) ret = -EROFS; goto out; } - if (!EXT4_SB(inode->i_sb)->s_journal) { + if (!sbi->s_journal) { ret = ext4_fsync_nojournal(file, start, end, datasync, &needs_barrier); if (needs_barrier) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index d28defcb2403..f48d52df35d6 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -2293,7 +2293,8 @@ static int mpage_map_and_submit_extent(handle_t *handle, if (err < 0) { struct super_block *sb = inode->i_sb; - if (ext4_forced_shutdown(sb)) + if (ext4_forced_shutdown(sb) || + ext4_test_mount_flag(sb, EXT4_MF_FS_ABORTED)) goto invalidate_dirty_pages; /* * Let the uper layers retry transient errors. @@ -2616,13 +2617,14 @@ static int ext4_do_writepages(struct mpage_da_data *mpd) * If the filesystem has aborted, it is read-only, so return * right away instead of dumping stack traces later on that * will obscure the real source of the problem. We test - * fs shutdown state instead of sb->s_flag's SB_RDONLY because + * EXT4_MF_FS_ABORTED instead of sb->s_flag's SB_RDONLY because * the latter could be true if the filesystem is mounted * read-only, and in that case, ext4_writepages should * *never* be called, so if that ever happens, we would want * the stack trace. */ - if (unlikely(ext4_forced_shutdown(mapping->host->i_sb))) { + if (unlikely(ext4_forced_shutdown(mapping->host->i_sb) || + ext4_test_mount_flag(inode->i_sb, EXT4_MF_FS_ABORTED))) { ret = -EROFS; goto out_writepages; } diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 4eb689237cf2..1bf9df63b73a 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -5740,7 +5740,7 @@ static inline void ext4_mb_show_pa(struct super_block *sb) { ext4_group_t i, ngroups; - if (ext4_forced_shutdown(sb)) + if (ext4_test_mount_flag(sb, EXT4_MF_FS_ABORTED)) return; ngroups = ext4_get_groups_count(sb); @@ -5774,7 +5774,7 @@ static void ext4_mb_show_ac(struct ext4_allocation_context *ac) { struct super_block *sb = ac->ac_sb; - if (ext4_forced_shutdown(sb)) + if (ext4_test_mount_flag(sb, EXT4_MF_FS_ABORTED)) return; mb_debug(sb, "Can't allocate:" diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 0ce332f118d2..ac8e05722c53 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -779,7 +779,7 @@ static void ext4_handle_error(struct super_block *sb, bool force_ro, int error, WARN_ON_ONCE(1); if (!continue_fs && !sb_rdonly(sb)) { - set_bit(EXT4_FLAGS_SHUTDOWN, &EXT4_SB(sb)->s_ext4_flags); + ext4_set_mount_flag(sb, EXT4_MF_FS_ABORTED); if (journal) jbd2_journal_abort(journal, -EIO); } @@ -6684,7 +6684,7 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb) flush_work(&sbi->s_sb_upd_work); if ((bool)(fc->sb_flags & SB_RDONLY) != sb_rdonly(sb)) { - if (ext4_forced_shutdown(sb)) { + if (ext4_test_mount_flag(sb, EXT4_MF_FS_ABORTED)) { err = -EROFS; goto restore_opts; } -- 2.46.1

反馈: 您发送到kernel@openeuler.org的补丁/补丁集,已成功转换为PR! PR链接地址: https://gitee.com/openeuler/kernel/pulls/14475 邮件列表地址:https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/N... FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://gitee.com/openeuler/kernel/pulls/14475 Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/N...
participants (2)
-
Baokun Li
-
patchwork bot