From: Hyunwoo Kim <imv4bel@gmail.com>

mainline inclusion
from mainline-v6.0-rc5
commit 9cb636b5f6a8cc6d1b50809ec8f8d33ae0c84c95
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5QI0W
CVE: CVE-2022-40307
---------------------------
A race condition may occur if the user calls close() from another thread while a write() is in progress on the EFI capsule device node.
The race is between the efi_capsule_write() and efi_capsule_flush() functions of efi_capsule_fops, and it ultimately results in a use-after-free (UAF).
To fix this, free the pages in efi_capsule_release() instead of efi_capsule_flush().
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Link: https://lore.kernel.org/all/20220907102920.GA88602@ubuntu/
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Xia Longlong <xialonglong1@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
---
 drivers/firmware/efi/capsule-loader.c | 31 ++++++---------------------
 1 file changed, 7 insertions(+), 24 deletions(-)
diff --git a/drivers/firmware/efi/capsule-loader.c b/drivers/firmware/efi/capsule-loader.c
index 96688986da56..94aae1e67c99 100644
--- a/drivers/firmware/efi/capsule-loader.c
+++ b/drivers/firmware/efi/capsule-loader.c
@@ -243,29 +243,6 @@ static ssize_t efi_capsule_write(struct file *file, const char __user *buff,
 	return ret;
 }
 
-/**
- * efi_capsule_flush - called by file close or file flush
- * @file: file pointer
- * @id: not used
- *
- * If a capsule is being partially uploaded then calling this function
- * will be treated as upload termination and will free those completed
- * buffer pages and -ECANCELED will be returned.
- **/
-static int efi_capsule_flush(struct file *file, fl_owner_t id)
-{
-	int ret = 0;
-	struct capsule_info *cap_info = file->private_data;
-
-	if (cap_info->index > 0) {
-		pr_err("capsule upload not complete\n");
-		efi_free_all_buff_pages(cap_info);
-		ret = -ECANCELED;
-	}
-
-	return ret;
-}
-
 /**
  * efi_capsule_release - called by file close
  * @inode: not used
@@ -278,6 +255,13 @@ static int efi_capsule_release(struct inode *inode, struct file *file)
 {
 	struct capsule_info *cap_info = file->private_data;
 
+	if (cap_info->index > 0 &&
+	    (cap_info->header.headersize == 0 ||
+	     cap_info->count < cap_info->total_size)) {
+		pr_err("capsule upload not complete\n");
+		efi_free_all_buff_pages(cap_info);
+	}
+
 	kfree(cap_info->pages);
 	kfree(cap_info->phys);
 	kfree(file->private_data);
@@ -325,7 +309,6 @@ static const struct file_operations efi_capsule_fops = {
 	.owner = THIS_MODULE,
 	.open = efi_capsule_open,
 	.write = efi_capsule_write,
-	.flush = efi_capsule_flush,
 	.release = efi_capsule_release,
 	.llseek = no_llseek,
 };
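The condition added to efi_capsule_release() can be read in isolation. The following is a minimal userspace sketch, not kernel code: struct capsule_state and upload_incomplete() are hypothetical names that flatten the few fields of struct capsule_info the check reads, and the field meanings in the comments are assumptions drawn from how the patch uses them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the fields of struct capsule_info that the
 * release-time check reads. This is NOT the kernel definition.
 */
struct capsule_state {
	int index;          /* buffer pages filled so far */
	size_t headersize;  /* assumed 0 until the capsule header is parsed */
	size_t count;       /* bytes received so far */
	size_t total_size;  /* capsule size announced by the header */
};

/*
 * Mirrors the condition in the patched efi_capsule_release(): the upload
 * has started (index > 0) but either the header was never parsed or fewer
 * bytes than announced have arrived.
 */
static bool upload_incomplete(const struct capsule_state *s)
{
	assert(s != NULL);
	return s->index > 0 &&
	       (s->headersize == 0 || s->count < s->total_size);
}
```

Because release() runs only once the last reference to the file is dropped, doing the cleanup there rather than in flush() means it cannot run concurrently with a write() on the same file, which is the property the commit message relies on.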
From: Zhihao Cheng <chengzhihao1@huawei.com>

hulk inclusion
category: bugfix
bugzilla: 187046, https://gitee.com/openeuler/kernel/issues/I5QH0X
CVE: NA
--------------------------------
Following process:
 Init: v2_read_file_info: <3> dqi_free_blk 0 dqi_free_entry 5 dqi_blks 6

 Step 1. chown bin f_a -> dquot_acquire -> v2_write_dquot:
  qtree_write_dquot
   do_insert_tree
    find_free_dqentry
     get_free_dqblk
      write_blk(info->dqi_blocks) // info->dqi_blocks = 6, failure. The
           content in physical block (corresponding to blk 6) is random.

 Step 2. chown root f_a -> dquot_transfer -> dqput_all -> dqput ->
         ext4_release_dquot -> v2_release_dquot -> qtree_delete_dquot:
  dquot_release
   remove_tree
    free_dqentry
     put_free_dqblk(6)
      info->dqi_free_blk = blk    // info->dqi_free_blk = 6

 Step 3. drop cache (buffer head for block 6 is released)

 Step 4. chown bin f_b -> dquot_acquire -> commit_dqblk -> v2_write_dquot:
  qtree_write_dquot
   do_insert_tree
    find_free_dqentry
     get_free_dqblk
      dh = (struct qt_disk_dqdbheader *)buf
      blk = info->dqi_free_blk     // 6
      ret = read_blk(info, blk, buf)  // The content of buf is random
      info->dqi_free_blk = le32_to_cpu(dh->dqdh_next_free)  // random blk

 Step 5. chown bin f_c -> notify_change -> ext4_setattr -> dquot_transfer:
  dquot = dqget -> acquire_dquot -> ext4_acquire_dquot -> dquot_acquire ->
          commit_dqblk -> v2_write_dquot -> dq_insert_tree:
   do_insert_tree
    find_free_dqentry
     get_free_dqblk
      blk = info->dqi_free_blk    // If blk < 0 and blk is not an error
                                     code, it will be returned as dquot

  transfer_to[USRQUOTA] = dquot  // A random negative value
  __dquot_transfer(transfer_to)
   dquot_add_inodes(transfer_to[cnt])
    spin_lock(&dquot->dq_dqb_lock)  // page fault

This leads to a kernel page fault:
 Quota error (device sda): qtree_write_dquot: Error -8000 occurred
 while creating quota
 BUG: unable to handle page fault for address: ffffffffffffe120
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 Oops: 0002 [#1] PREEMPT SMP
 CPU: 0 PID: 5974 Comm: chown Not tainted 6.0.0-rc1-00004
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
 RIP: 0010:_raw_spin_lock+0x3a/0x90
 Call Trace:
  dquot_add_inodes+0x28/0x270
  __dquot_transfer+0x377/0x840
  dquot_transfer+0xde/0x540
  ext4_setattr+0x405/0x14d0
  notify_change+0x68e/0x9f0
  chown_common+0x300/0x430
  __x64_sys_fchownat+0x29/0x40
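Step 5 notes that a negative value which is not an error code ends up being used as a dquot pointer. The sketch below is a userspace re-creation, not kernel code, of the ERR_PTR()/IS_ERR() convention, assuming the kernel's MAX_ERRNO of 4095. The helper names err_ptr() and is_err() are made up for illustration. It suggests why a value such as the -8000 in the log would slip past an IS_ERR()-style test and be treated as a valid pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same value as the kernel's MAX_ERRNO (assumption stated in the lead-in). */
#define MAX_ERRNO 4095

/* Encode a negative error number in a pointer, as ERR_PTR() does. */
static void *err_ptr(long error)
{
	assert(error < 0);
	return (void *)(intptr_t)error;
}

/*
 * Only the top MAX_ERRNO addresses are treated as encoded errors,
 * following the idea behind the kernel's IS_ERR().
 */
static bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
```

With these helpers, is_err(err_ptr(-5)) is true, while is_err(err_ptr(-8000)) is false, so a caller that only tests for error pointers would go on to dereference the bogus address.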
To avoid accessing an invalid quota memory address, this patch checks the next/prev free block numbers read from the quota file.

A reproducer is available at [Link].
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216372
Fixes: 1da177e4c3f4152 ("Linux-2.6.12-rc2")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
---
 fs/quota/quota_tree.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)
diff --git a/fs/quota/quota_tree.c b/fs/quota/quota_tree.c
index 833cd3e3758b..3b5d2429b29c 100644
--- a/fs/quota/quota_tree.c
+++ b/fs/quota/quota_tree.c
@@ -79,6 +79,35 @@ static ssize_t write_blk(struct qtree_mem_dqinfo *info, uint blk, char *buf)
 	return ret;
 }
 
+static inline int do_check_range(struct super_block *sb, uint val, uint max_val)
+{
+	if (val >= max_val) {
+		quota_error(sb, "Getting block too big (%u >= %u)",
+			    val, max_val);
+		return -EUCLEAN;
+	}
+
+	return 0;
+}
+
+static int check_free_block(struct qtree_mem_dqinfo *info,
+			    struct qt_disk_dqdbheader *dh)
+{
+	int err = 0;
+	uint nextblk, prevblk;
+
+	nextblk = le32_to_cpu(dh->dqdh_next_free);
+	err = do_check_range(info->dqi_sb, nextblk, info->dqi_blocks);
+	if (err)
+		return err;
+	prevblk = le32_to_cpu(dh->dqdh_prev_free);
+	err = do_check_range(info->dqi_sb, prevblk, info->dqi_blocks);
+	if (err)
+		return err;
+
+	return err;
+}
+
 /* Remove empty block from list and return it */
 static int get_free_dqblk(struct qtree_mem_dqinfo *info)
 {
@@ -93,6 +122,9 @@ static int get_free_dqblk(struct qtree_mem_dqinfo *info)
 		ret = read_blk(info, blk, buf);
 		if (ret < 0)
 			goto out_buf;
+		ret = check_free_block(info, dh);
+		if (ret)
+			goto out_buf;
 		info->dqi_free_blk = le32_to_cpu(dh->dqdh_next_free);
 	}
 	else {
@@ -240,6 +272,9 @@ static uint find_free_dqentry(struct qtree_mem_dqinfo *info,
 		*err = read_blk(info, blk, buf);
 		if (*err < 0)
 			goto out_buf;
+		*err = check_free_block(info, dh);
+		if (*err)
+			goto out_buf;
 	} else {
 		blk = get_free_dqblk(info);
 		if ((int)blk < 0) {
@@ -432,6 +467,9 @@ static int free_dqentry(struct qtree_mem_dqinfo *info, struct dquot *dquot,
 		goto out_buf;
 	}
 	dh = (struct qt_disk_dqdbheader *)buf;
+	ret = check_free_block(info, dh);
+	if (ret)
+		goto out_buf;
 	le16_add_cpu(&dh->dqdh_entries, -1);
 	if (!le16_to_cpu(dh->dqdh_entries)) {	/* Block got free? */
 		ret = remove_free_dqentry(info, buf, blk);
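The idea behind check_free_block() is to decode the two on-disk list pointers and reject either one if it points at or beyond the end of the quota file. Below is a minimal userspace sketch of that idea, not the kernel implementation: struct free_links, load_le32(), check_range() and check_free_links() are hypothetical names, and EUCLEAN is given a fallback definition only so the sketch builds on systems that lack it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#ifndef EUCLEAN
#define EUCLEAN 117 /* fallback for non-Linux hosts; illustrative only */
#endif

/*
 * Hypothetical stand-in for the two free-list pointers of
 * struct qt_disk_dqdbheader, kept as raw little-endian bytes.
 */
struct free_links {
	uint8_t next_free[4];
	uint8_t prev_free[4];
};

/* Decode a little-endian 32-bit value regardless of host byte order. */
static uint32_t load_le32(const uint8_t b[4])
{
	return (uint32_t)b[0] | (uint32_t)b[1] << 8 |
	       (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
}

/* Valid block numbers are 0 .. max_val - 1. */
static int check_range(uint32_t val, uint32_t max_val)
{
	return val >= max_val ? -EUCLEAN : 0;
}

/* Reject a header whose next or prev pointer lies outside the file. */
static int check_free_links(const struct free_links *h, uint32_t nblocks)
{
	int err;

	assert(h != NULL);
	err = check_range(load_le32(h->next_free), nblocks);
	if (err)
		return err;
	return check_range(load_le32(h->prev_free), nblocks);
}
```

A header read from a block with random content, as in Step 4 of the commit message, would very likely fail this check and the caller could bail out before storing the bogus value in dqi_free_blk.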
From: Zhihao Cheng <chengzhihao1@huawei.com>

hulk inclusion
category: bugfix
bugzilla: 187046, https://gitee.com/openeuler/kernel/issues/I5QH0X
CVE: NA
--------------------------------
Clean up all block-checking sites by replacing them with the helper function do_check_range().
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
---
 fs/quota/quota_tree.c | 28 ++++++++++++----------------
 1 file changed, 12 insertions(+), 16 deletions(-)
diff --git a/fs/quota/quota_tree.c b/fs/quota/quota_tree.c
index 3b5d2429b29c..5389cabc6f20 100644
--- a/fs/quota/quota_tree.c
+++ b/fs/quota/quota_tree.c
@@ -79,11 +79,12 @@ static ssize_t write_blk(struct qtree_mem_dqinfo *info, uint blk, char *buf)
 	return ret;
 }
 
-static inline int do_check_range(struct super_block *sb, uint val, uint max_val)
+static inline int do_check_range(struct super_block *sb, uint val,
+				 uint min_val, uint max_val)
 {
-	if (val >= max_val) {
-		quota_error(sb, "Getting block too big (%u >= %u)",
-			    val, max_val);
+	if (val < min_val || val >= max_val) {
+		quota_error(sb, "Getting block %u out of range %u-%u",
+			    val, min_val, max_val);
 		return -EUCLEAN;
 	}
 
@@ -97,11 +98,11 @@ static int check_free_block(struct qtree_mem_dqinfo *info,
 	uint nextblk, prevblk;
 
 	nextblk = le32_to_cpu(dh->dqdh_next_free);
-	err = do_check_range(info->dqi_sb, nextblk, info->dqi_blocks);
+	err = do_check_range(info->dqi_sb, nextblk, 0, info->dqi_blocks);
 	if (err)
 		return err;
 	prevblk = le32_to_cpu(dh->dqdh_prev_free);
-	err = do_check_range(info->dqi_sb, prevblk, info->dqi_blocks);
+	err = do_check_range(info->dqi_sb, prevblk, 0, info->dqi_blocks);
 	if (err)
 		return err;
 
@@ -526,12 +527,10 @@ static int remove_tree(struct qtree_mem_dqinfo *info, struct dquot *dquot,
 		goto out_buf;
 	}
 	newblk = le32_to_cpu(ref[get_index(info, dquot->dq_id, depth)]);
-	if (newblk < QT_TREEOFF || newblk >= info->dqi_blocks) {
-		quota_error(dquot->dq_sb, "Getting block too big (%u >= %u)",
-			    newblk, info->dqi_blocks);
-		ret = -EUCLEAN;
+	ret = do_check_range(dquot->dq_sb, newblk, QT_TREEOFF,
+			     info->dqi_blocks);
+	if (ret)
 		goto out_buf;
-	}
 
 	if (depth == info->dqi_qtree_depth - 1) {
 		ret = free_dqentry(info, dquot, newblk);
@@ -632,12 +631,9 @@ static loff_t find_tree_dqentry(struct qtree_mem_dqinfo *info,
 	blk = le32_to_cpu(ref[get_index(info, dquot->dq_id, depth)]);
 	if (!blk)	/* No reference? */
 		goto out_buf;
-	if (blk < QT_TREEOFF || blk >= info->dqi_blocks) {
-		quota_error(dquot->dq_sb, "Getting block too big (%u >= %u)",
-			    blk, info->dqi_blocks);
-		ret = -EUCLEAN;
+	ret = do_check_range(dquot->dq_sb, blk, QT_TREEOFF, info->dqi_blocks);
+	if (ret)
 		goto out_buf;
-	}
 
 	if (depth < info->dqi_qtree_depth - 1)
 		ret = find_tree_dqentry(info, dquot, blk, depth+1);
From: Zhihao Cheng <chengzhihao1@huawei.com>

hulk inclusion
category: bugfix
bugzilla: 187046, https://gitee.com/openeuler/kernel/issues/I5QH0X
CVE: NA
--------------------------------
Do more sanity checking (e.g. dqdh_entries, block numbers) on the content read from the quota file, which helps prevent corrupting the quota file.
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
---
 fs/quota/quota_tree.c | 43 +++++++++++++++++++++++++++++++++----------
 1 file changed, 33 insertions(+), 10 deletions(-)
diff --git a/fs/quota/quota_tree.c b/fs/quota/quota_tree.c
index 5389cabc6f20..3c600ed3781a 100644
--- a/fs/quota/quota_tree.c
+++ b/fs/quota/quota_tree.c
@@ -79,12 +79,12 @@ static ssize_t write_blk(struct qtree_mem_dqinfo *info, uint blk, char *buf)
 	return ret;
 }
 
-static inline int do_check_range(struct super_block *sb, uint val,
-				 uint min_val, uint max_val)
+static inline int do_check_range(struct super_block *sb, const char *val_name,
+				 uint val, uint min_val, uint max_val)
 {
 	if (val < min_val || val >= max_val) {
-		quota_error(sb, "Getting block %u out of range %u-%u",
-			    val, min_val, max_val);
+		quota_error(sb, "Getting %s %u out of range %u-%u",
+			    val_name, val, min_val, max_val);
 		return -EUCLEAN;
 	}
 
@@ -98,11 +98,13 @@ static int check_free_block(struct qtree_mem_dqinfo *info,
 	uint nextblk, prevblk;
 
 	nextblk = le32_to_cpu(dh->dqdh_next_free);
-	err = do_check_range(info->dqi_sb, nextblk, 0, info->dqi_blocks);
+	err = do_check_range(info->dqi_sb, "dqdh_next_free", nextblk, 0,
+			     info->dqi_blocks);
 	if (err)
 		return err;
 	prevblk = le32_to_cpu(dh->dqdh_prev_free);
-	err = do_check_range(info->dqi_sb, prevblk, 0, info->dqi_blocks);
+	err = do_check_range(info->dqi_sb, "dqdh_prev_free", prevblk, 0,
+			     info->dqi_blocks);
 	if (err)
 		return err;
 
@@ -276,6 +278,11 @@ static uint find_free_dqentry(struct qtree_mem_dqinfo *info,
 		*err = check_free_block(info, dh);
 		if (*err)
 			goto out_buf;
+		*err = do_check_range(info->dqi_sb, "dqdh_entries",
+				      le16_to_cpu(dh->dqdh_entries), 0,
+				      qtree_dqstr_in_blk(info));
+		if (*err)
+			goto out_buf;
 	} else {
 		blk = get_free_dqblk(info);
 		if ((int)blk < 0) {
@@ -357,6 +364,10 @@ static int do_insert_tree(struct qtree_mem_dqinfo *info, struct dquot *dquot,
 	}
 	ref = (__le32 *)buf;
 	newblk = le32_to_cpu(ref[get_index(info, dquot->dq_id, depth)]);
+	ret = do_check_range(dquot->dq_sb, "block", newblk, 0,
+			     info->dqi_blocks);
+	if (ret)
+		goto out_buf;
 	if (!newblk)
 		newson = 1;
 	if (depth == info->dqi_qtree_depth - 1) {
@@ -469,6 +480,11 @@ static int free_dqentry(struct qtree_mem_dqinfo *info, struct dquot *dquot,
 	}
 	dh = (struct qt_disk_dqdbheader *)buf;
 	ret = check_free_block(info, dh);
+	if (ret)
+		goto out_buf;
+	ret = do_check_range(info->dqi_sb, "dqdh_entries",
+			     le16_to_cpu(dh->dqdh_entries), 1,
+			     qtree_dqstr_in_blk(info) + 1);
 	if (ret)
 		goto out_buf;
 	le16_add_cpu(&dh->dqdh_entries, -1);
@@ -527,7 +543,7 @@ static int remove_tree(struct qtree_mem_dqinfo *info, struct dquot *dquot,
 		goto out_buf;
 	}
 	newblk = le32_to_cpu(ref[get_index(info, dquot->dq_id, depth)]);
-	ret = do_check_range(dquot->dq_sb, newblk, QT_TREEOFF,
+	ret = do_check_range(dquot->dq_sb, "block", newblk, QT_TREEOFF,
 			     info->dqi_blocks);
 	if (ret)
 		goto out_buf;
@@ -631,7 +647,8 @@ static loff_t find_tree_dqentry(struct qtree_mem_dqinfo *info,
 	blk = le32_to_cpu(ref[get_index(info, dquot->dq_id, depth)]);
 	if (!blk)	/* No reference? */
 		goto out_buf;
-	ret = do_check_range(dquot->dq_sb, blk, QT_TREEOFF, info->dqi_blocks);
+	ret = do_check_range(dquot->dq_sb, "block", blk, QT_TREEOFF,
+			     info->dqi_blocks);
 	if (ret)
 		goto out_buf;
 
@@ -747,7 +764,13 @@ static int find_next_id(struct qtree_mem_dqinfo *info, qid_t *id,
 		goto out_buf;
 	}
 	for (i = __get_index(info, *id, depth); i < epb; i++) {
-		if (ref[i] == cpu_to_le32(0)) {
+		uint blk_no = le32_to_cpu(ref[i]);
+
+		ret = do_check_range(info->dqi_sb, "block", blk_no, 0,
+				     info->dqi_blocks);
+		if (ret)
+			goto out_buf;
+		if (blk_no == 0) {
 			*id += level_inc;
 			continue;
 		}
@@ -755,7 +778,7 @@ static int find_next_id(struct qtree_mem_dqinfo *info, qid_t *id,
 			ret = 0;
 			goto out_buf;
 		}
-		ret = find_next_id(info, id, le32_to_cpu(ref[i]), depth + 1);
+		ret = find_next_id(info, id, blk_no, depth + 1);
 		if (ret != -ENOENT)
 			break;
 	}
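The two dqdh_entries checks in this patch use different bounds: 0 up to (but excluding) the per-block capacity in find_free_dqentry(), and 1 up to and including the capacity in free_dqentry(). One plausible reading, which is an interpretation rather than something the commit message states, is that a block taken from the free-entry list must still have room for one more entry, while a block an entry is removed from must hold at least one. The userspace sketch below illustrates the half-open range check with that reading; all function names are hypothetical, and per_block stands in for the value returned by qtree_dqstr_in_blk().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#ifndef EUCLEAN
#define EUCLEAN 117 /* fallback for non-Linux hosts; illustrative only */
#endif

/* Accept val only if min_val <= val < max_val, naming the field on error. */
static int check_named_range(const char *name, unsigned int val,
			     unsigned int min_val, unsigned int max_val)
{
	assert(name != NULL);
	if (val < min_val || val >= max_val) {
		fprintf(stderr, "Getting %s %u out of range %u-%u\n",
			name, val, min_val, max_val);
		return -EUCLEAN;
	}
	return 0;
}

/* Before inserting: the block must not already be full. */
static int check_entries_before_insert(unsigned int entries,
				       unsigned int per_block)
{
	return check_named_range("dqdh_entries", entries, 0, per_block);
}

/* Before freeing: the block must hold between 1 and per_block entries. */
static int check_entries_before_free(unsigned int entries,
				     unsigned int per_block)
{
	return check_named_range("dqdh_entries", entries, 1, per_block + 1);
}
```

Passing the field name into the helper is what lets a single function report "dqdh_next_free", "dqdh_prev_free", "dqdh_entries" or "block" in the error message, as the patched do_check_range() does.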
From: Baokun Li <libaokun1@huawei.com>

hulk inclusion
category: bugfix
bugzilla: 187600, https://gitee.com/openeuler/kernel/issues/I5SV2U
CVE: NA
--------------------------------
If the start of the insert range falls in the hole between two ext4_extent_idx entries, the lblk of every ext4_extent under the previous ext4_extent_idx is less than start. The "extent" variable is then advanced past the end of the leaf, which triggers the following UAF:

==================================================================
BUG: KASAN: use-after-free in ext4_ext_shift_extents+0x257/0x790
Read of size 4 at addr ffff88819807a008 by task fallocate/8010
CPU: 3 PID: 8010 Comm: fallocate Tainted: G E 5.10.0+ #492
Call Trace:
 dump_stack+0x7d/0xa3
 print_address_description.constprop.0+0x1e/0x220
 kasan_report.cold+0x67/0x7f
 ext4_ext_shift_extents+0x257/0x790
 ext4_insert_range+0x5b6/0x700
 ext4_fallocate+0x39e/0x3d0
 vfs_fallocate+0x26f/0x470
 ksys_fallocate+0x3a/0x70
 __x64_sys_fallocate+0x4f/0x60
 do_syscall_64+0x33/0x40
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
==================================================================

For right shifts, we can divide them into the following situations:

1. When the first ee_block of ext4_extent_idx is greater than or equal to
   start, make right shifts directly from the first ee_block.
   1) If it is greater than start, we need to continue searching in the
      previous ext4_extent_idx.
   2) If it is equal to start, we can exit the loop (iterator=NULL).

2. When the first ee_block of ext4_extent_idx is less than start, then
   traverse from the last extent to find the first extent whose ee_block
   is less than start.
   1) If extent is still the last extent after traversal, it means that
      the last ee_block of ext4_extent_idx is less than start, that is,
      start is located in the hole between idx and (idx+1), so we can
      exit the loop directly (break) without right shifts.
   2) Otherwise, make right shifts at the corresponding position of the
      found extent, and then exit the loop (iterator=NULL).
Fixes: 331573febb6a ("ext4: Add support FALLOC_FL_INSERT_RANGE for fallocate")
Cc: stable@vger.kernel.org
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
---
 fs/ext4/extents.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 8e5ed3e315cd..e3e0cdb20627 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -5472,6 +5472,7 @@ ext4_ext_shift_extents(struct inode *inode, handle_t *handle,
 	 * and it is decreased till we reach start.
 	 */
 again:
+	ret = 0;
 	if (SHIFT == SHIFT_LEFT)
 		iterator = &start;
 	else
@@ -5515,14 +5516,21 @@ ext4_ext_shift_extents(struct inode *inode, handle_t *handle,
 					ext4_ext_get_actual_len(extent);
 		} else {
 			extent = EXT_FIRST_EXTENT(path[depth].p_hdr);
-			if (le32_to_cpu(extent->ee_block) > 0)
+			if (le32_to_cpu(extent->ee_block) > start)
 				*iterator = le32_to_cpu(extent->ee_block) - 1;
-			else
-				/* Beginning is reached, end of the loop */
+			else if (le32_to_cpu(extent->ee_block) == start)
 				iterator = NULL;
-			/* Update path extent in case we need to stop */
-			while (le32_to_cpu(extent->ee_block) < start)
+			else {
+				extent = EXT_LAST_EXTENT(path[depth].p_hdr);
+				while (le32_to_cpu(extent->ee_block) >= start)
+					extent--;
+
+				if (extent == EXT_LAST_EXTENT(path[depth].p_hdr))
+					break;
+
 				extent++;
+				iterator = NULL;
+			}
 			path[depth].p_ext = extent;
 		}
 		ret = ext4_ext_shift_path_extents(path, shift, inode,
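The case analysis in the commit message can be modelled on a plain array. The sketch below is a userspace illustration, not kernel code: it represents one leaf as a sorted array of extent start blocks, and plan_right_shift(), struct shift_plan and the enum values are hypothetical names. It returns which of the described outcomes applies and, where relevant, the index of the first extent to shift.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum shift_action {
	SHIFT_AND_CONTINUE, /* case 1.1: shift whole leaf, visit previous idx */
	SHIFT_AND_STOP,     /* cases 1.2 and 2.2: shift, then leave the loop */
	NOTHING_TO_SHIFT    /* case 2.1: start lies in the hole after the leaf */
};

struct shift_plan {
	enum shift_action action;
	size_t first_index; /* first extent to shift; unused if nothing to shift */
};

/* ee_block[] holds the start blocks of one leaf in ascending order. */
static struct shift_plan plan_right_shift(const uint32_t *ee_block, size_t n,
					  uint32_t start)
{
	struct shift_plan plan = { NOTHING_TO_SHIFT, 0 };
	size_t i;

	assert(ee_block != NULL && n > 0);

	if (ee_block[0] > start) {
		plan.action = SHIFT_AND_CONTINUE;
		return plan;
	}
	if (ee_block[0] == start) {
		plan.action = SHIFT_AND_STOP;
		return plan;
	}

	/* ee_block[0] < start here, so the backward scan stops at i >= 0. */
	i = n - 1;
	while (ee_block[i] >= start)
		i--;

	if (i == n - 1)
		return plan; /* every extent is below start */

	plan.action = SHIFT_AND_STOP;
	plan.first_index = i + 1;
	return plan;
}
```

The key difference from the pre-patch loop is the direction of the scan: walking backwards from the last extent is bounded by the first extent, which is already known to be below start, so the cursor cannot run off the end of the leaf.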
From: Li Lingfeng <lilingfeng3@huawei.com>

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5SR8X
CVE: NA
--------------------------------
======================================================
WARNING: possible circular locking dependency detected
4.18.0+ #4 Tainted: G ---------r- -
------------------------------------------------------
dmsetup/923 is trying to acquire lock:
000000008d8170dd (kn->count#184){++++}, at: kernfs_remove+0x24/0x40 fs/kernfs/dir.c:1354

but task is already holding lock:
000000003377330b (slab_mutex){+.+.}, at: kmem_cache_destroy+0xec/0x320 mm/slab_common.c:928

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (slab_mutex){+.+.}:
       __mutex_lock_common kernel/locking/mutex.c:925 [inline]
       __mutex_lock+0x105/0x11a0 kernel/locking/mutex.c:1072
       slab_attr_store+0x6d/0xe0 mm/slub.c:5526
       sysfs_kf_write+0x10f/0x170 fs/sysfs/file.c:139
       kernfs_fop_write+0x290/0x440 fs/kernfs/file.c:316
       __vfs_write+0x81/0x100 fs/read_write.c:485
       vfs_write+0x184/0x4c0 fs/read_write.c:549
       ksys_write+0xc6/0x1a0 fs/read_write.c:598
       do_syscall_64+0xca/0x5a0 arch/x86/entry/common.c:298
       entry_SYSCALL_64_after_hwframe+0x6a/0xdf

-> #0 (kn->count#184){++++}:
       lock_acquire+0x10f/0x340 kernel/locking/lockdep.c:3868
       kernfs_drain fs/kernfs/dir.c:467 [inline]
       __kernfs_remove fs/kernfs/dir.c:1320 [inline]
       __kernfs_remove+0x6d0/0x890 fs/kernfs/dir.c:1279
       kernfs_remove+0x24/0x40 fs/kernfs/dir.c:1354
       sysfs_remove_dir+0xb6/0xf0 fs/sysfs/dir.c:99
       kobject_del.part.1+0x35/0xe0 lib/kobject.c:573
       kobject_del+0x1b/0x30 lib/kobject.c:569
       shutdown_cache+0x17f/0x310 mm/slab_common.c:592
       kmem_cache_destroy+0x263/0x320 mm/slab_common.c:943
       bio_put_slab block/bio.c:152 [inline]
       bioset_exit+0x20d/0x330 block/bio.c:1916
       cleanup_mapped_device+0x64/0x360 drivers/md/dm.c:1903
       free_dev+0xbc/0x240 drivers/md/dm.c:2058
       __dm_destroy+0x317/0x490 drivers/md/dm.c:2426
       dm_hash_remove_all+0x8f/0x250 drivers/md/dm-ioctl.c:314
       remove_all+0x4d/0x90 drivers/md/dm-ioctl.c:471
       ctl_ioctl+0x426/0x910 drivers/md/dm-ioctl.c:1870
       dm_ctl_ioctl+0x23/0x30 drivers/md/dm-ioctl.c:1892
       vfs_ioctl fs/ioctl.c:46 [inline]
       file_ioctl fs/ioctl.c:509 [inline]
       do_vfs_ioctl+0x1a5/0x1100 fs/ioctl.c:696
       ksys_ioctl+0x7c/0xa0 fs/ioctl.c:713
       __do_sys_ioctl fs/ioctl.c:720 [inline]
       __se_sys_ioctl fs/ioctl.c:718 [inline]
       __x64_sys_ioctl+0x74/0xb0 fs/ioctl.c:718
       do_syscall_64+0xca/0x5a0 arch/x86/entry/common.c:298
       entry_SYSCALL_64_after_hwframe+0x6a/0xdf

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(slab_mutex);
                               lock(kn->count#184);
                               lock(slab_mutex);
  lock(kn->count#184);
A potential deadlock may occur when a slab attribute file under /sys/kernel/slab/xxx/ is removed and written at the same time.
The lock order in the remove path is: slab_mutex --> kn->count
The lock order in the write path is: kn->count --> slab_mutex
Fix this by replacing mutex_lock with mutex_trylock in slab_attr_store, so that a contended write returns -EBUSY instead of blocking.
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
---
 mm/slub.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/mm/slub.c b/mm/slub.c
index 4bc29bcd0d5d..f9b39a3718d0 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5560,7 +5560,10 @@ static ssize_t slab_attr_store(struct kobject *kobj,
 	if (slab_state >= FULL && err >= 0 && is_root_cache(s)) {
 		struct kmem_cache *c;
 
-		mutex_lock(&slab_mutex);
+		if (!mutex_trylock(&slab_mutex)) {
+			pr_warn("slab file is busy\n");
+			return -EBUSY;
+		}
 		if (s->max_attr_size < len)
 			s->max_attr_size = len;
 
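The trylock pattern used in slab_attr_store() has a direct userspace analogue. The following is a minimal sketch with POSIX threads, not kernel code: cache_lock stands in for slab_mutex, and store_attr_len() and max_attr_size are hypothetical names that only loosely mirror the patched function. Note that pthread_mutex_trylock() returns 0 on success, whereas the kernel's mutex_trylock() returns nonzero on success.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Stand-in for slab_mutex; a default (non-recursive) mutex. */
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;
static size_t max_attr_size;

/*
 * Record the largest attribute length seen so far. If the lock is
 * contended, give up with -EBUSY instead of blocking, so this path can
 * never wait on a holder that is itself waiting for us.
 */
static int store_attr_len(size_t len)
{
	int rc;

	if (pthread_mutex_trylock(&cache_lock) != 0)
		return -EBUSY;

	if (max_attr_size < len)
		max_attr_size = len;

	rc = pthread_mutex_unlock(&cache_lock);
	assert(rc == 0);
	(void)rc;
	return 0;
}
```

The trade-off is that callers now have to cope with a spurious -EBUSY: a write that previously would have waited briefly for the mutex can fail and need to be retried.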