From: Li Lingfeng <lilingfeng3@huawei.com>

mainline inclusion
from mainline-v6.4-rc8
commit cb65b282c9640c27d3129e2e04b711ce1b352838
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7FIUX
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
----------------------------------------
Must check pmd->fail_io before using pmd->data_sm since pmd->data_sm may be destroyed by other processes.
   P1(kworker)                             P2(message)
do_worker
 process_prepared
  process_prepared_discard_passdown_pt2
   dm_pool_dec_data_range
                                    pool_message
                                     commit
                                      dm_pool_commit_metadata
                                        ↓
                                       // commit failed
                                      metadata_operation_failed
                                       abort_transaction
                                        dm_pool_abort_metadata
                                         __open_or_format_metadata
                                           ↓
                                          dm_sm_disk_open
                                            ↓
                                           // open failed
                                           // pmd->data_sm is NULL
    dm_sm_dec_blocks
      ↓
     // try to access pmd->data_sm --> UAF
As shown above, if both dm_pool_commit_metadata() and dm_pool_abort_metadata() fail in the pool_message path, the kworker may trigger a use-after-free (UAF).
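The guard described above can be sketched in userspace as follows. This is a minimal illustration only, not the kernel code: the toy_* types and functions are hypothetical stand-ins for struct dm_pool_metadata and the space-map helpers, and the pmd->root_lock locking that the real patch relies on is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the data space map: one refcount per block. */
struct toy_space_map {
	uint32_t *counts;
	size_t nr_blocks;
};

/* Hypothetical stand-in for struct dm_pool_metadata. */
struct toy_pool_metadata {
	bool fail_io;                  /* set once metadata I/O has failed */
	struct toy_space_map *data_sm; /* may be gone after a failed abort */
};

static int toy_sm_dec_block(struct toy_space_map *sm, uint64_t b)
{
	assert(sm != NULL); /* callers must never pass a destroyed map */
	if (b >= sm->nr_blocks || sm->counts[b] == 0)
		return -EINVAL;
	sm->counts[b]--;
	return 0;
}

/*
 * Same shape as the patched dm_pool_dec_data_range(): check fail_io
 * first and only then touch data_sm. Locking is elided in this sketch.
 */
int toy_dec_data_range(struct toy_pool_metadata *pmd, uint64_t b, uint64_t e)
{
	int r = 0;

	if (pmd->fail_io)
		return -EINVAL; /* data_sm must not be dereferenced here */

	for (; b != e; b++) {
		r = toy_sm_dec_block(pmd->data_sm, b);
		if (r)
			break;
	}
	return r;
}
```

Once fail_io is set, the range helper returns -EINVAL without dereferencing data_sm, which is the behaviour the patch adds to all three callers.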
Fixes: be500ed721a6 ("dm space maps: improve performance with inc/dec on ranges of blocks")
Cc: stable@vger.kernel.org
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Conflicts: drivers/md/dm-thin-metadata.c
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
---
 drivers/md/dm-thin-metadata.c | 34 ++++++++++++++++++++++------------
 1 file changed, 22 insertions(+), 12 deletions(-)
diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c
index e37f468cdbf0..bd9466dc9400 100644
--- a/drivers/md/dm-thin-metadata.c
+++ b/drivers/md/dm-thin-metadata.c
@@ -1711,13 +1711,15 @@ int dm_thin_remove_range(struct dm_thin_device *td,

 int dm_pool_block_is_shared(struct dm_pool_metadata *pmd, dm_block_t b, bool *result)
 {
-	int r;
+	int r = -EINVAL;
 	uint32_t ref_count;

 	down_read(&pmd->root_lock);
-	r = dm_sm_get_count(pmd->data_sm, b, &ref_count);
-	if (!r)
-		*result = (ref_count > 1);
+	if (!pmd->fail_io) {
+		r = dm_sm_get_count(pmd->data_sm, b, &ref_count);
+		if (!r)
+			*result = (ref_count > 1);
+	}
 	up_read(&pmd->root_lock);

 	return r;
@@ -1728,10 +1730,14 @@ int dm_pool_inc_data_range(struct dm_pool_metadata *pmd, dm_block_t b, dm_block_
 	int r = 0;

 	down_write(&pmd->root_lock);
-	for (; b != e; b++) {
-		r = dm_sm_inc_block(pmd->data_sm, b);
-		if (r)
-			break;
+	if (!pmd->fail_io) {
+		for (; b != e; b++) {
+			r = dm_sm_inc_block(pmd->data_sm, b);
+			if (r)
+				break;
+		}
+	} else {
+		r = -EINVAL;
 	}
 	up_write(&pmd->root_lock);

@@ -1743,10 +1749,14 @@ int dm_pool_dec_data_range(struct dm_pool_metadata *pmd, dm_block_t b, dm_block_
 	int r = 0;

 	down_write(&pmd->root_lock);
-	for (; b != e; b++) {
-		r = dm_sm_dec_block(pmd->data_sm, b);
-		if (r)
-			break;
+	if (!pmd->fail_io) {
+		for (; b != e; b++) {
+			r = dm_sm_dec_block(pmd->data_sm, b);
+			if (r)
+				break;
+		}
+	} else {
+		r = -EINVAL;
 	}
 	up_write(&pmd->root_lock);

From: Li Lingfeng <lilingfeng3@huawei.com>

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7FI78
--------------------------------
This reverts commit 90d1a836f017faf27c24265773171997485075ce.
Simply aborting IO when the map is not ready is not appropriate. Revert that change so the IO can be requeued instead, which keeps this tree consistent with the upstream community kernel.
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
---
 drivers/md/dm-rq.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 3bd805f7ce85..46bba3de378c 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -752,15 +752,8 @@ static blk_status_t dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,

 	if (unlikely(!ti)) {
 		int srcu_idx;
-		struct dm_table *map;
+		struct dm_table *map = dm_get_live_table(md, &srcu_idx);

-		map = dm_get_live_table(md, &srcu_idx);
-		if (!map) {
-			DMERR_LIMIT("%s: mapping table unavailable, erroring io",
-				    dm_device_name(md));
-			dm_put_live_table(md, srcu_idx);
-			return BLK_STS_IOERR;
-		}
 		ti = dm_table_find_target(map, 0);
 		dm_put_live_table(md, srcu_idx);
 	}

From: Mike Snitzer <snitzer@redhat.com>

mainline inclusion
from mainline-v5.18-rc1
commit fa247089de9936a46e290d4724cb5f0b845600f5
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7FI78
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
----------------------------------------
Update both bio-based and request-based DM to requeue IO if the mapping table is not available.

The race of IO being submitted before the DM device is ready is narrow, yet it is possible during the initial table load because the DM device's request_queue is created beforehand. It is therefore best to requeue IO to handle this unlikely case.
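The requeue behaviour can be pictured with a small userspace model. This is a sketch under assumptions, not the block layer: toy_queue_rq(), toy_submit_with_retry(), and the TOY_STS_* values are hypothetical names that only mimic the role of dm_mq_queue_rq() returning BLK_STS_RESOURCE while no table is loaded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes; TOY_STS_RESOURCE plays the "requeue me" role. */
enum toy_status { TOY_STS_OK, TOY_STS_RESOURCE };

struct toy_table { int nr_targets; };
struct toy_device { struct toy_table *map; }; /* NULL until a table is loaded */

/* Ask for a requeue instead of failing the IO when no table is live. */
enum toy_status toy_queue_rq(const struct toy_device *md)
{
	if (md->map == NULL)
		return TOY_STS_RESOURCE;
	return TOY_STS_OK;
}

/*
 * Model of a caller that keeps resubmitting. The table "arrives" once
 * more than load_after attempts have been made. Returns the attempt
 * number that succeeded, or -1 if the table never showed up in time.
 */
int toy_submit_with_retry(struct toy_device *md, struct toy_table *pending,
			  int load_after, int max_tries)
{
	assert(max_tries > 0);
	for (int attempt = 1; attempt <= max_tries; attempt++) {
		if (attempt > load_after)
			md->map = pending;
		if (toy_queue_rq(md) == TOY_STS_OK)
			return attempt;
	}
	return -1;
}
```

With load_after = 2 and max_tries = 5 the third attempt succeeds. If the table is never loaded, the model gives up with -1, whereas a real requeue has no such retry bound.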
Reported-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
---
 drivers/md/dm-rq.c | 7 ++++++-
 drivers/md/dm.c    | 5 +++--
 2 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 46bba3de378c..288064e94e52 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -752,8 +752,13 @@ static blk_status_t dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,

 	if (unlikely(!ti)) {
 		int srcu_idx;
-		struct dm_table *map = dm_get_live_table(md, &srcu_idx);
+		struct dm_table *map;

+		map = dm_get_live_table(md, &srcu_idx);
+		if (unlikely(!map)) {
+			dm_put_live_table(md, srcu_idx);
+			return BLK_STS_RESOURCE;
+		}
 		ti = dm_table_find_target(map, 0);
 		dm_put_live_table(md, srcu_idx);
 	}
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index ea1baea3a11d..326b3ea2a21f 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1781,8 +1781,9 @@ static blk_qc_t dm_make_request(struct request_queue *q, struct bio *bio)

 	map = dm_get_live_table(md, &srcu_idx);

-	/* if we're suspended, we have to queue this io for later */
-	if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags))) {
+	/* If suspended, or map not yet available, queue this IO for later */
+	if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags)) ||
+	    unlikely(!map)) {
 		dm_put_live_table(md, srcu_idx);

 		if (!(bio->bi_opf & REQ_RAHEAD))

From: Li Lingfeng <lilingfeng3@huawei.com>

mainline inclusion
from mainline-v6.4-rc1
commit 38d11da522aacaa05898c734a1cec86f1e611129
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7FI5Z
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
----------------------------------------
Commit fa247089de99 ("dm: requeue IO if mapping table not yet available") added a detection of whether the mapping table is available in the IO submission process. If the mapping table is unavailable, it returns BLK_STS_RESOURCE and requeues the IO. This can lead to the following deadlock problem:
dm create                                      mount
ioctl(DM_DEV_CREATE_CMD)
ioctl(DM_TABLE_LOAD_CMD)
                               do_mount
                                vfs_get_tree
                                 ext4_get_tree
                                  get_tree_bdev
                                   sget_fc
                                    alloc_super
                                     // got &s->s_umount
                                     down_write_nested(&s->s_umount, ...);
                                   ext4_fill_super
                                    ext4_load_super
                                     ext4_read_bh
                                      submit_bio
                                      // submit and wait io end
ioctl(DM_DEV_SUSPEND_CMD)
dev_suspend
 do_resume
  dm_suspend
   __dm_suspend
    lock_fs
     freeze_bdev
      get_active_super
       grab_super
        // wait for &s->s_umount
        down_write(&s->s_umount);
  dm_swap_table
   __bind
    // set md->map(can't get here)
IO will be continuously requeued while the lock is held, because the mapping table is NULL. At the same time, the mapping table cannot be set because the lock is not available. Bio-based DM has the same problem as request-based DM.

Simply aborting IO when the mapping table is not available is not appropriate. Instead, clear DM_SUSPEND_LOCKFS_FLAG (the same effect as passing DM_SKIP_LOCKFS_FLAG) when the mapping table is NULL; this allows the DM table to be loaded and the IO to be submitted upon resume.
Fixes: fa247089de99 ("dm: requeue IO if mapping table not yet available")
Cc: stable@vger.kernel.org
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Conflicts: drivers/md/dm-ioctl.c
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
---
 drivers/md/dm-ioctl.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index c8c27d23bb45..3af982ed8424 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1034,9 +1034,13 @@ static int do_resume(struct dm_ioctl *param)

 	/* Do we need to load a new map ? */
 	if (new_map) {
+		int srcu_idx;
+
 		/* Suspend if it isn't already suspended */
-		if (param->flags & DM_SKIP_LOCKFS_FLAG)
+		old_map = dm_get_live_table(md, &srcu_idx);
+		if (param->flags & DM_SKIP_LOCKFS_FLAG || !old_map)
 			suspend_flags &= ~DM_SUSPEND_LOCKFS_FLAG;
+		dm_put_live_table(md, srcu_idx);
 		if (param->flags & DM_NOFLUSH_FLAG)
 			suspend_flags |= DM_SUSPEND_NOFLUSH_FLAG;
 		if (!dm_suspended_md(md))

From: Li Lingfeng <lilingfeng3@huawei.com>

mainline inclusion
from mainline-v6.4-rc8
commit 2760904d895279f87196f0fa9ec570c79fe6a2e4
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7FI5Z
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
----------------------------------------
As described in commit 38d11da522aa ("dm: don't lock fs when the map is NULL in process of resume"), a deadlock may be triggered between do_resume() and do_mount().
This commit preserves the fix from commit 38d11da522aa but moves it to where it also serves to fix a similar deadlock between do_suspend() and do_mount(). It does so, if the active map is NULL, by clearing DM_SUSPEND_LOCKFS_FLAG in dm_suspend() which is called by both do_suspend() and do_resume().
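The effect of moving the check into dm_suspend() can be sketched in userspace as below. The toy_* names and flag values are hypothetical stand-ins, chosen only to show the flag masking; they are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits standing in for DM_SUSPEND_*_FLAG. */
#define TOY_SUSPEND_LOCKFS_FLAG  (1u << 0)
#define TOY_SUSPEND_NOFLUSH_FLAG (1u << 1)

struct toy_table { int nr_targets; };
struct toy_device { struct toy_table *map; }; /* NULL: no active table yet */

/*
 * Decide the flags the suspend path would really use. With no active
 * map there is nothing to freeze safely, so the lockfs request is
 * dropped; every other flag is passed through untouched.
 */
unsigned toy_effective_suspend_flags(const struct toy_device *md,
				     unsigned suspend_flags)
{
	assert(md != NULL);
	if (md->map == NULL)
		suspend_flags &= ~TOY_SUSPEND_LOCKFS_FLAG;
	return suspend_flags;
}
```

Because the masking lives in the common suspend helper of this model, any caller gets it for free, which mirrors why the commit places the check in dm_suspend() rather than in do_resume() alone.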
Fixes: 38d11da522aa ("dm: don't lock fs when the map is NULL in process of resume")
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Conflicts: drivers/md/dm-ioctl.c
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
---
 drivers/md/dm-ioctl.c | 6 +-----
 drivers/md/dm.c       | 4 ++++
 2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index 3af982ed8424..c8c27d23bb45 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1034,13 +1034,9 @@ static int do_resume(struct dm_ioctl *param)

 	/* Do we need to load a new map ? */
 	if (new_map) {
-		int srcu_idx;
-
 		/* Suspend if it isn't already suspended */
-		old_map = dm_get_live_table(md, &srcu_idx);
-		if (param->flags & DM_SKIP_LOCKFS_FLAG || !old_map)
+		if (param->flags & DM_SKIP_LOCKFS_FLAG)
 			suspend_flags &= ~DM_SUSPEND_LOCKFS_FLAG;
-		dm_put_live_table(md, srcu_idx);
 		if (param->flags & DM_NOFLUSH_FLAG)
 			suspend_flags |= DM_SUSPEND_NOFLUSH_FLAG;
 		if (!dm_suspended_md(md))
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 326b3ea2a21f..0aa6fd33abf1 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2747,6 +2747,10 @@ int dm_suspend(struct mapped_device *md, unsigned suspend_flags)
 	}

 	map = rcu_dereference_protected(md->map, lockdep_is_held(&md->suspend_lock));
+	if (!map) {
+		/* avoid deadlock with fs/namespace.c:do_mount() */
+		suspend_flags &= ~DM_SUSPEND_LOCKFS_FLAG;
+	}

 	r = __dm_suspend(md, map, suspend_flags, TASK_INTERRUPTIBLE, DMF_SUSPENDED);
 	if (r)

From: David Sloan <david.sloan@eideticom.com>

mainline inclusion
from mainline-v6.0-rc3
commit 5e8daf906f890560df430d30617c692a794acb73
category: bugfix
bugzilla: 188015, https://gitee.com/openeuler/kernel/issues/I6OERX
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id...
--------------------------------
A race condition still exists when removing and re-creating md devices in test cases. However, it is only seen on some setups.
The race condition was tracked down to a reference still being held to the kobject by the rdev in the md_rdev_misc_wq which will be released in rdev_delayed_delete().
md_alloc() waits for previous deletions by waiting on the md_misc_wq, but the md_rdev_misc_wq may still be holding a reference to a recently removed device.
To fix this, also flush the md_rdev_misc_wq in md_alloc().
Signed-off-by: David Sloan <david.sloan@eideticom.com>
[logang@deltatee.com: rewrote commit message]
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Song Liu <song@kernel.org>

Conflict: drivers/md/md.c

Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
---
 drivers/md/md.c | 1 +
 1 file changed, 1 insertion(+)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 5a5e1f1fdb52..d16fdfa1aada 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5462,6 +5462,7 @@ static int md_alloc(dev_t dev, char *name)
 	 * completely removed (mddev_delayed_delete).
 	 */
 	flush_workqueue(md_misc_wq);
+	flush_workqueue(md_rdev_misc_wq);

 	mutex_lock(&disks_mutex);
 	error = -EEXIST;