Aborting IO when the map is not ready is not appropriate. Revert that change and requeue the IO instead, to stay consistent with upstream. Then fix the deadlock introduced by the requeue patch.
v1->v2: add patch 38d11da522aa "dm: don't lock fs when the map is NULL in process of resume"
Li Lingfeng (3):
  Revert "dm: make sure dm_table is binded before queue request"
  dm: don't lock fs when the map is NULL in process of resume
  dm: don't lock fs when the map is NULL during suspend or resume

Mike Snitzer (1):
  dm: requeue IO if mapping table not yet available

 drivers/md/dm-rq.c |  6 ++----
 drivers/md/dm.c    | 15 +++++++--------
 2 files changed, 9 insertions(+), 12 deletions(-)
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7FI78
--------------------------------
This reverts commit 90d1a836f017faf27c24265773171997485075ce.
Aborting IO when the map is not ready is not appropriate. Revert this commit so that a later patch can requeue the IO instead, consistent with upstream.
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
---
 drivers/md/dm-rq.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 8980f129b31f..0fe032147c09 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -500,15 +500,8 @@ static blk_status_t dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,

 	if (unlikely(!ti)) {
 		int srcu_idx;
-		struct dm_table *map;
-
-		map = dm_get_live_table(md, &srcu_idx);
-		if (!map) {
-			DMERR_LIMIT("%s: mapping table unavailable, erroring io",
-				    dm_device_name(md));
-			dm_put_live_table(md, srcu_idx);
-			return BLK_STS_IOERR;
-		}
+		struct dm_table *map = dm_get_live_table(md, &srcu_idx);
+
 		ti = dm_table_find_target(map, 0);
 		dm_put_live_table(md, srcu_idx);
 	}
From: Mike Snitzer <snitzer@redhat.com>

mainline inclusion
from mainline-v5.18-rc1
commit fa247089de9936a46e290d4724cb5f0b845600f5
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7FI78
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
----------------------------------------
Update both bio-based and request-based DM to requeue IO if the mapping table is not available.

The race in which IO is submitted before the DM device is ready is narrow, yet it is possible during the initial table load because the DM device's request_queue is created first. It is therefore best to requeue IO to handle this unlikely case.
Reported-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
---
 drivers/md/dm-rq.c |  7 ++++++-
 drivers/md/dm.c    | 11 +++--------
 2 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 0fe032147c09..31b6cc71ee96 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -500,8 +500,13 @@ static blk_status_t dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,

 	if (unlikely(!ti)) {
 		int srcu_idx;
-		struct dm_table *map = dm_get_live_table(md, &srcu_idx);
+		struct dm_table *map;

+		map = dm_get_live_table(md, &srcu_idx);
+		if (unlikely(!map)) {
+			dm_put_live_table(md, srcu_idx);
+			return BLK_STS_RESOURCE;
+		}
 		ti = dm_table_find_target(map, 0);
 		dm_put_live_table(md, srcu_idx);
 	}
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 1cb2a84f2403..3649ee4d9000 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1684,15 +1684,10 @@ static blk_qc_t dm_submit_bio(struct bio *bio)
 	struct dm_table *map;

 	map = dm_get_live_table(md, &srcu_idx);
-	if (unlikely(!map)) {
-		DMERR_LIMIT("%s: mapping table unavailable, erroring io",
-			    dm_device_name(md));
-		bio_io_error(bio);
-		goto out;
-	}

-	/* If suspended, queue this IO for later */
-	if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags))) {
+	/* If suspended, or map not yet available, queue this IO for later */
+	if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags)) ||
+	    unlikely(!map)) {
 		if (bio->bi_opf & REQ_NOWAIT)
 			bio_wouldblock_error(bio);
 		else if (bio->bi_opf & REQ_RAHEAD)
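
To illustrate the behavioural difference between erroring and requeueing, here is a minimal userspace sketch. It is not kernel code: the `model_*` names, the status enum, and the retry loop are hypothetical stand-ins for `blk_status_t`, `struct mapped_device`, and the block layer's requeue handling. It only shows that a request which sees no table is retried rather than failed, and that it completes once a table has been bound.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel's blk_status_t values; the numbers are arbitrary. */
enum model_status { MODEL_STS_OK, MODEL_STS_RESOURCE, MODEL_STS_IOERR };

/* Hypothetical miniature of a mapped device: only the live-table pointer. */
struct model_md {
	const void *map;	/* NULL until a table has been bound */
};

/* Mirrors the patched check: no table means "try again later", not an error. */
enum model_status model_queue_rq(const struct model_md *md)
{
	if (md->map == NULL)
		return MODEL_STS_RESOURCE;
	return MODEL_STS_OK;
}

/*
 * Hypothetical dispatch loop: resubmit the request while the driver reports
 * MODEL_STS_RESOURCE. The table is bound just before attempt number
 * bind_after (counting from zero). Returns the number of attempts needed,
 * or -1 if max_attempts was exhausted first.
 */
int model_dispatch(struct model_md *md, int bind_after, int max_attempts)
{
	static const int table = 1;	/* any non-NULL address will do */
	int attempts = 0;

	assert(max_attempts > 0);
	while (attempts < max_attempts) {
		if (attempts == bind_after)
			md->map = &table;	/* table load has finished */
		attempts++;
		if (model_queue_rq(md) == MODEL_STS_OK)
			return attempts;
	}
	return -1;
}
```

With the reverted behaviour the first attempt would have returned an IO error; in this model the request simply waits until the table appears. The real block layer applies its own delay and requeue policy, which this sketch does not attempt to reproduce.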
mainline inclusion
from mainline-v6.4-rc1
commit 38d11da522aacaa05898c734a1cec86f1e611129
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7FI78
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
----------------------------------------
Commit fa247089de99 ("dm: requeue IO if mapping table not yet available") added a detection of whether the mapping table is available in the IO submission process. If the mapping table is unavailable, it returns BLK_STS_RESOURCE and requeues the IO. This can lead to the following deadlock problem:
dm create                                      mount
ioctl(DM_DEV_CREATE_CMD)
ioctl(DM_TABLE_LOAD_CMD)
                               do_mount
                                vfs_get_tree
                                 ext4_get_tree
                                  get_tree_bdev
                                   sget_fc
                                    alloc_super
                                     // got &s->s_umount
                                     down_write_nested(&s->s_umount, ...);
                                   ext4_fill_super
                                    ext4_load_super
                                     ext4_read_bh
                                      submit_bio
                                      // submit and wait io end
ioctl(DM_DEV_SUSPEND_CMD)
dev_suspend
 do_resume
  dm_suspend
   __dm_suspend
    lock_fs
     freeze_bdev
      get_active_super
       grab_super
        // wait for &s->s_umount
        down_write(&s->s_umount);
  dm_swap_table
   __bind
    // set md->map(can't get here)
The IO is requeued indefinitely while the mounting task holds s_umount, because the mapping table is still NULL. At the same time, the mapping table cannot be set, because the resume path is blocked waiting for that same lock. Bio-based DM has the same problem as request-based DM.

Simply aborting IO when the mapping table is not available is not appropriate. Instead, clear DM_SUSPEND_LOCKFS_FLAG when the mapping table is NULL, so that lock_fs() is skipped. This allows the DM table to be loaded and the IO to be submitted upon resume.
Fixes: fa247089de99 ("dm: requeue IO if mapping table not yet available")
Cc: stable@vger.kernel.org
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Conflicts:
	drivers/md/dm-ioctl.c
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
---
 drivers/md/dm-ioctl.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index 7db47cd26634..82ea53463216 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1070,9 +1070,13 @@ static int do_resume(struct dm_ioctl *param)

 	/* Do we need to load a new map ? */
 	if (new_map) {
+		int srcu_idx;
+
 		/* Suspend if it isn't already suspended */
-		if (param->flags & DM_SKIP_LOCKFS_FLAG)
+		old_map = dm_get_live_table(md, &srcu_idx);
+		if (param->flags & DM_SKIP_LOCKFS_FLAG || !old_map)
 			suspend_flags &= ~DM_SUSPEND_LOCKFS_FLAG;
+		dm_put_live_table(md, srcu_idx);
 		if (param->flags & DM_NOFLUSH_FLAG)
 			suspend_flags |= DM_SUSPEND_NOFLUSH_FLAG;
 		if (!dm_suspended_md(md))
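
The circular wait described in the commit message can be modelled with a small single-threaded state machine. This is a hedged sketch only: `model_state` and the `model_*` functions are hypothetical, the real paths involve `s_umount`, `lock_fs()` and `__bind()` in the kernel, and releasing the lock once the IO completes is a simplification of what the mount path does. The model shows that with the lockfs step enabled neither side can make progress, while skipping it lets the table be bound and the IO finish.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical condensed state of the two tasks in the trace above. */
struct model_state {
	bool s_umount_held;	/* taken by the mounting task in alloc_super() */
	bool map_bound;		/* true once the table has been bound */
	bool io_done;		/* the superblock read has completed */
};

/* One step of the resume path; returns true if it made progress. */
bool model_resume_step(struct model_state *st, bool lockfs)
{
	if (st->map_bound)
		return false;		/* nothing left to do */
	if (lockfs && st->s_umount_held)
		return false;		/* would block waiting for s_umount */
	st->map_bound = true;		/* table swap and bind succeed */
	return true;
}

/* One step of the mount path; returns true if it made progress. */
bool model_mount_step(struct model_state *st)
{
	if (st->io_done)
		return false;
	if (!st->map_bound)
		return false;		/* IO is requeued, lock stays held */
	st->io_done = true;
	st->s_umount_held = false;	/* simplification: released after the read */
	return true;
}

/* Returns true if the mount IO completes, false if both tasks are stuck. */
bool model_runs_to_completion(bool lockfs, int max_rounds)
{
	struct model_state st = { true, false, false };

	for (int i = 0; i < max_rounds && !st.io_done; i++) {
		bool a = model_resume_step(&st, lockfs);
		bool b = model_mount_step(&st);

		if (!a && !b)
			return false;	/* no progress on either side: deadlock */
	}
	return st.io_done;
}
```

Calling `model_runs_to_completion(true, 100)` reports the stuck state, and `model_runs_to_completion(false, 100)` reports completion, which corresponds to clearing DM_SUSPEND_LOCKFS_FLAG when no table is live.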
mainline inclusion
from mainline-v6.4-rc8
commit 2760904d895279f87196f0fa9ec570c79fe6a2e4
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7FI78
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
----------------------------------------
As described in commit 38d11da522aa ("dm: don't lock fs when the map is NULL in process of resume"), a deadlock may be triggered between do_resume() and do_mount().
This commit preserves the fix from commit 38d11da522aa but moves it to where it also serves to fix a similar deadlock between do_suspend() and do_mount(). It does so, if the active map is NULL, by clearing DM_SUSPEND_LOCKFS_FLAG in dm_suspend() which is called by both do_suspend() and do_resume().
Fixes: 38d11da522aa ("dm: don't lock fs when the map is NULL in process of resume")
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>

Conflicts:
	drivers/md/dm-ioctl.c
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
---
 drivers/md/dm-ioctl.c | 6 +-----
 drivers/md/dm.c       | 4 ++++
 2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index 82ea53463216..7db47cd26634 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1070,13 +1070,9 @@ static int do_resume(struct dm_ioctl *param)

 	/* Do we need to load a new map ? */
 	if (new_map) {
-		int srcu_idx;
-
 		/* Suspend if it isn't already suspended */
-		old_map = dm_get_live_table(md, &srcu_idx);
-		if (param->flags & DM_SKIP_LOCKFS_FLAG || !old_map)
+		if (param->flags & DM_SKIP_LOCKFS_FLAG)
 			suspend_flags &= ~DM_SUSPEND_LOCKFS_FLAG;
-		dm_put_live_table(md, srcu_idx);
 		if (param->flags & DM_NOFLUSH_FLAG)
 			suspend_flags |= DM_SUSPEND_NOFLUSH_FLAG;
 		if (!dm_suspended_md(md))
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 3649ee4d9000..3a49fbed974a 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2602,6 +2602,10 @@ int dm_suspend(struct mapped_device *md, unsigned suspend_flags)
 	}

 	map = rcu_dereference_protected(md->map, lockdep_is_held(&md->suspend_lock));
+	if (!map) {
+		/* avoid deadlock with fs/namespace.c:do_mount() */
+		suspend_flags &= ~DM_SUSPEND_LOCKFS_FLAG;
+	}

 	r = __dm_suspend(md, map, suspend_flags, TASK_INTERRUPTIBLE, DMF_SUSPENDED);
 	if (r)
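
The flag handling that this patch moves into dm_suspend() can be sketched in isolation as below. This is an illustrative userspace model, not the kernel implementation: the `MODEL_*` bit values and `model_effective_suspend_flags()` are hypothetical names chosen for the example. The point is that a single check on the live map covers every caller, so the suspend path and the resume path both skip the filesystem lock when no table is bound.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit values; assumed for this sketch, not taken from the headers. */
#define MODEL_SUSPEND_LOCKFS_FLAG	(1u << 0)
#define MODEL_SUSPEND_NOFLUSH_FLAG	(1u << 1)

/*
 * Model of the check added to dm_suspend(): when no table is live there is
 * no filesystem to freeze, so drop the lockfs request and leave every other
 * flag untouched.
 */
unsigned model_effective_suspend_flags(const void *map, unsigned suspend_flags)
{
	if (map == NULL)
		suspend_flags &= ~MODEL_SUSPEND_LOCKFS_FLAG;
	return suspend_flags;
}
```

Because the check keys only on the map pointer, callers do not need to know whether a table has been loaded yet; they pass the flags they would normally use.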
FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://gitee.com/openeuler/kernel/pulls/1345 Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/2...