[PATCH OLK-5.10 0/4] block: fix UAF in blk_mq_has_sqsched()

blk_mq_has_sqsched() dereferences q->elevator and its members without
any protection. If the elevator is torn down concurrently (for example by
a scheduler switch through sysfs or by disk release), the memory it points
to may already have been freed, which results in a use-after-free.

This series introduces a QUEUE_FLAG_SQ_SCHED queue flag that is set or
cleared during scheduler initialization. Callers test the flag instead of
inspecting the scheduler's features at run time, so q->elevator no longer
has to be accessed on that path and the UAF is resolved.

Bart Van Assche (1):
  block: Decode all flag names in the debugfs output

Christoph Hellwig (1):
  block: remove QUEUE_FLAG_DEAD

Ming Lei (2):
  blk-mq: protect q->elevator by ->sysfs_lock in blk_mq_elv_switch_none
  blk-mq: avoid to touch q->elevator without any protection

 block/bfq-iosched.c               |  3 +++
 block/blk-core.c                  |  2 --
 block/blk-mq-debugfs.c            |  9 ++++-----
 block/blk-mq-sched.c              |  1 +
 block/blk-mq.c                    | 22 +++++-----------------
 block/kyber-iosched.c             |  3 ++-
 block/mq-deadline.c               |  3 +++
 drivers/block/mtip32xx/mtip32xx.c |  2 +-
 include/linux/blkdev.h            |  4 ++--
 include/linux/elevator.h          |  2 --
 10 files changed, 21 insertions(+), 30 deletions(-)

-- 
2.39.2

From: Christoph Hellwig <hch@lst.de>

mainline inclusion
from mainline-v6.0-rc1
commit 1f90307e5f0d7bc9a336ead528f616a5df8e5944
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IBY0UQ

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...

------------------

Disallow setting the blk-mq state on any queue that is already dying as
setting the state even then is a bad idea, and remove the now unused
QUEUE_FLAG_DEAD flag.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20220619060552.1850436-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Conflicts:
	block/blk-core.c
	[Context conflicts.]
	block/blk-mq-debugfs.c
	[Context conflicts.]
	include/linux/blkdev.h
	[Context conflicts.]
	drivers/block/mtip32xx/mtip32xx.c
	[Due to not merging commit e8b58ef09e84 ("mtip32xx: fix device
	removal").]
Signed-off-by: Zheng Qixing <zhengqixing@huawei.com>
---
 block/blk-core.c                  | 2 --
 block/blk-mq-debugfs.c            | 8 +++-----
 drivers/block/mtip32xx/mtip32xx.c | 2 +-
 include/linux/blkdev.h            | 2 --
 4 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index bf3bfc3ed339..15702561b470 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -635,8 +635,6 @@ void blk_cleanup_queue(struct request_queue *q)
 	 */
 	blk_freeze_queue(q);
 
-	blk_queue_flag_set(QUEUE_FLAG_DEAD, q);
-
 	/* for synchronous bio-based driver finish in-flight integrity i/o */
 	blk_flush_integrity();
 
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index de587a442a90..eacb6d5d2085 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -114,7 +114,6 @@ static const char *const blk_queue_flag_name[] = {
 	QUEUE_FLAG_NAME(ADD_RANDOM),
 	QUEUE_FLAG_NAME(SECERASE),
 	QUEUE_FLAG_NAME(SAME_FORCE),
-	QUEUE_FLAG_NAME(DEAD),
 	QUEUE_FLAG_NAME(INIT_DONE),
 	QUEUE_FLAG_NAME(STABLE_WRITES),
 	QUEUE_FLAG_NAME(POLL),
@@ -152,11 +151,10 @@ static ssize_t queue_state_write(void *data, const char __user *buf,
 	char opbuf[16] = { }, *op;
 
 	/*
-	 * The "state" attribute is removed after blk_cleanup_queue() has called
-	 * blk_mq_free_queue(). Return if QUEUE_FLAG_DEAD has been set to avoid
-	 * triggering a use-after-free.
+	 * The "state" attribute is removed when the queue is removed. Don't
+	 * allow setting the state on a dying queue to avoid a use-after-free.
 	 */
-	if (blk_queue_dead(q))
+	if (blk_queue_dying(q))
 		return -ENOENT;
 
 	if (count >= sizeof(opbuf)) {
diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c
index 153e2cdecb4d..6d2b211def2f 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -149,7 +149,7 @@ static bool mtip_check_surprise_removal(struct pci_dev *pdev)
 	if (vendor_id == 0xFFFF) {
 		dd->sr = true;
 		if (dd->queue)
-			blk_queue_flag_set(QUEUE_FLAG_DEAD, dd->queue);
+			blk_queue_flag_set(QUEUE_FLAG_DYING, dd->queue);
 		else
 			dev_warn(&dd->pdev->dev,
 				"%s: dd->queue is NULL\n", __func__);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f27a0916a75e..b6da530894b4 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -642,7 +642,6 @@ struct request_queue {
 #define QUEUE_FLAG_ADD_RANDOM	10	/* Contributes to random pool */
 #define QUEUE_FLAG_SECERASE	11	/* supports secure erase */
 #define QUEUE_FLAG_SAME_FORCE	12	/* force complete on same CPU */
-#define QUEUE_FLAG_DEAD		13	/* queue tear-down finished */
 #define QUEUE_FLAG_INIT_DONE	14	/* queue is initialized */
 #define QUEUE_FLAG_STABLE_WRITES 15	/* don't modify blks until WB is done */
 #define QUEUE_FLAG_POLL		16	/* IO polling enabled if set */
@@ -674,7 +673,6 @@ bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
 
 #define blk_queue_stopped(q)	test_bit(QUEUE_FLAG_STOPPED, &(q)->queue_flags)
 #define blk_queue_dying(q)	test_bit(QUEUE_FLAG_DYING, &(q)->queue_flags)
-#define blk_queue_dead(q)	test_bit(QUEUE_FLAG_DEAD, &(q)->queue_flags)
 #define blk_queue_init_done(q)	test_bit(QUEUE_FLAG_INIT_DONE, &(q)->queue_flags)
 #define blk_queue_nomerges(q)	test_bit(QUEUE_FLAG_NOMERGES, &(q)->queue_flags)
 #define blk_queue_noxmerges(q)	\
-- 
2.39.2
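The check that the patch above moves from "dead" to "dying" can be
illustrated with a small userspace model. This is only a sketch: the
demo_* names, the "run"/"stop" operations and the flag layout are
hypothetical stand-ins, not the kernel's definitions. It shows a state
write being refused with -ENOENT once the queue is marked dying.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical bit number; the kernel's QUEUE_FLAG_DYING may differ. */
enum { DEMO_FLAG_DYING = 1 };

struct demo_queue {
	atomic_ulong flags;	/* models q->queue_flags */
	bool running;		/* state toggled by the write handler */
};

static void demo_queue_init(struct demo_queue *q)
{
	atomic_init(&q->flags, 0);
	q->running = false;
}

static void demo_mark_dying(struct demo_queue *q)
{
	atomic_fetch_or(&q->flags, 1UL << DEMO_FLAG_DYING);
}

static bool demo_queue_dying(struct demo_queue *q)
{
	return (atomic_load(&q->flags) & (1UL << DEMO_FLAG_DYING)) != 0;
}

/*
 * Model of a debugfs-style state write: returns 0 on success,
 * -ENOENT if the queue is dying, -EINVAL for an unknown operation.
 */
static int demo_state_write(struct demo_queue *q, const char *op)
{
	assert(q && op);

	/* Refuse early so nothing below touches a queue being torn down. */
	if (demo_queue_dying(q))
		return -ENOENT;

	if (strcmp(op, "run") == 0) {
		q->running = true;
		return 0;
	}
	if (strcmp(op, "stop") == 0) {
		q->running = false;
		return 0;
	}
	return -EINVAL;
}
```

In this model, a caller that marks the queue dying before teardown starts
gets the same protection the removed "dead" flag provided, only earlier.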

From: Ming Lei <ming.lei@redhat.com>

mainline inclusion
from mainline-v5.19-rc3
commit 5fd7a84a09e640016fe106dd3e992f5210e23dc7
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IBY0UQ

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...

------------------

elevator can be tore down by sysfs switch interface or disk release, so
hold ->sysfs_lock before referring to q->elevator, then potential
use-after-free can be avoided.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20220616014401.817001-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Conflicts:
	block/blk-mq.c
	[Context conflicts.]
Signed-off-by: Zheng Qixing <zhengqixing@huawei.com>
---
 block/blk-mq.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 5a9c02d0199c..a45115237e43 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3924,12 +3924,14 @@ static bool blk_mq_elv_switch_none(struct list_head *head,
 	if (!qe)
 		return false;
 
+	/* q->elevator needs protection from ->sysfs_lock */
+	mutex_lock(&q->sysfs_lock);
+
 	INIT_LIST_HEAD(&qe->node);
 	qe->q = q;
 	qe->type = q->elevator->type;
 	list_add(&qe->node, head);
 
-	mutex_lock(&q->sysfs_lock);
 	/*
 	 * After elevator_switch, the previous elevator_queue will be
 	 * released by elevator_release. The reference of the io scheduler
-- 
2.39.2
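The ordering change in the patch above (take the lock first, then read
q->elevator) can be sketched in userspace as follows. Everything here is
an assumption made for illustration: the demo_* names are hypothetical,
a C11 atomic_flag spinlock stands in for the kernel's sysfs_lock mutex,
and the module reference handling of the real function is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_elevator {
	const char *type_name;	/* points at a string literal in this model */
};

struct demo_queue {
	atomic_flag sysfs_lock;		/* spinlock stand-in for a mutex */
	struct demo_elevator *elevator;	/* may be freed by a concurrent switch */
};

static void demo_queue_init(struct demo_queue *q, const char *type_name)
{
	atomic_flag_clear(&q->sysfs_lock);
	q->elevator = malloc(sizeof(*q->elevator));
	assert(q->elevator);
	q->elevator->type_name = type_name;
}

static void demo_lock(struct demo_queue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->sysfs_lock,
						 memory_order_acquire))
		;	/* spin until the holder releases the lock */
}

static void demo_unlock(struct demo_queue *q)
{
	atomic_flag_clear_explicit(&q->sysfs_lock, memory_order_release);
}

/*
 * Remember the current scheduler's name and switch to "none".
 * The pointer is read only while the lock is held, so a concurrent
 * switch cannot free it between the read and the teardown.
 */
static const char *demo_switch_none(struct demo_queue *q)
{
	const char *saved = NULL;

	demo_lock(q);
	if (q->elevator) {
		saved = q->elevator->type_name;
		free(q->elevator);
		q->elevator = NULL;
	}
	demo_unlock(q);
	return saved;
}
```

Reading the pointer before taking the lock, as the pre-patch code did,
would leave a window in which another holder of the lock frees it.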

From: Ming Lei <ming.lei@redhat.com>

mainline inclusion
from mainline-v5.19-rc3
commit 4d337cebcb1c27d9b48c48b9a98e939d4552d584
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IBY0UQ

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...

------------------

q->elevator is referred in blk_mq_has_sqsched() without any protection,
no .q_usage_counter is held, no queue srcu and rcu read lock is held,
so potential use-after-free may be triggered.

Fix the issue by adding one queue flag for checking if the elevator
uses single queue style dispatch. Meantime the elevator feature flag of
ELEVATOR_F_MQ_AWARE isn't needed any more.

Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220616014401.817001-3-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Conflicts:
	block/blk-mq.c
	[Due to not merging commit 4f481208749a ("blk-mq: prepare for
	implementing hctx table via xarray").]
	block/mq-deadline.c
	[Due to not merging commit 3e9a99eba058 ("block/mq-deadline: Rename
	dd_init_queue() and dd_exit_queue()").]
	include/linux/blkdev.h
	[Context conflicts.]
	include/linux/elevator.h
	[Due to not merging commit 2e9bc3465ac5 ("block: move elevator.h to
	block/").]
Signed-off-by: Zheng Qixing <zhengqixing@huawei.com>
---
 block/bfq-iosched.c      |  3 +++
 block/blk-mq-sched.c     |  1 +
 block/blk-mq.c           | 18 ++----------------
 block/kyber-iosched.c    |  3 ++-
 block/mq-deadline.c      |  3 +++
 include/linux/blkdev.h   |  2 ++
 include/linux/elevator.h |  2 --
 7 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 9811bbd6558e..1f840f8ed9af 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -6573,6 +6573,9 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
 	bfq_init_root_group(bfqd->root_group, bfqd);
 	bfq_init_entity(&bfqd->oom_bfqq.entity, bfqd->root_group);
 
+	/* We dispatch from request queue wide instead of hw queue */
+	blk_queue_flag_set(QUEUE_FLAG_SQ_SCHED, q);
+
 	wbt_disable_default(q);
 	return 0;
 
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 0a37d15caae4..d5c5bd38f7bc 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -590,6 +590,7 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
 	int ret;
 
 	if (!e) {
+		blk_queue_flag_clear(QUEUE_FLAG_SQ_SCHED, q);
 		q->elevator = NULL;
 		q->nr_requests = q->tag_set->queue_depth;
 		return 0;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index a45115237e43..f94adf15bf53 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1790,20 +1790,6 @@ void blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
 }
 EXPORT_SYMBOL(blk_mq_run_hw_queue);
 
-/*
- * Is the request queue handled by an IO scheduler that does not respect
- * hardware queues when dispatching?
- */
-static bool blk_mq_has_sqsched(struct request_queue *q)
-{
-	struct elevator_queue *e = q->elevator;
-
-	if (e && e->type->ops.dispatch_request &&
-	    !(e->type->elevator_features & ELEVATOR_F_MQ_AWARE))
-		return true;
-	return false;
-}
-
 /*
  * Return prefered queue to dispatch from (if any) for non-mq aware IO
  * scheduler.
@@ -1837,7 +1823,7 @@ void blk_mq_run_hw_queues(struct request_queue *q, bool async)
 	int i;
 
 	sq_hctx = NULL;
-	if (blk_mq_has_sqsched(q))
+	if (blk_queue_sq_sched(q))
 		sq_hctx = blk_mq_get_sq_hctx(q);
 	queue_for_each_hw_ctx(q, hctx, i) {
 		if (blk_mq_hctx_stopped(hctx))
@@ -1865,7 +1851,7 @@ void blk_mq_delay_run_hw_queues(struct request_queue *q, unsigned long msecs)
 	int i;
 
 	sq_hctx = NULL;
-	if (blk_mq_has_sqsched(q))
+	if (blk_queue_sq_sched(q))
 		sq_hctx = blk_mq_get_sq_hctx(q);
 	queue_for_each_hw_ctx(q, hctx, i) {
 		if (blk_mq_hctx_stopped(hctx))
diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c
index fbf8f7e00241..f4c97fd0d50e 100644
--- a/block/kyber-iosched.c
+++ b/block/kyber-iosched.c
@@ -417,6 +417,8 @@ static int kyber_init_sched(struct request_queue *q, struct elevator_type *e)
 
 	blk_stat_enable_accounting(q);
 
+	blk_queue_flag_clear(QUEUE_FLAG_SQ_SCHED, q);
+
 	eq->elevator_data = kqd;
 	q->elevator = eq;
 
@@ -1028,7 +1030,6 @@ static struct elevator_type kyber_sched = {
 #endif
 	.elevator_attrs = kyber_sched_attrs,
 	.elevator_name = "kyber",
-	.elevator_features = ELEVATOR_F_MQ_AWARE,
 	.elevator_owner = THIS_MODULE,
 };
 
diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 42b6e9dbe7c7..bc78e996bacb 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -432,6 +432,9 @@ static int dd_init_queue(struct request_queue *q, struct elevator_type *e)
 	spin_lock_init(&dd->zone_lock);
 	INIT_LIST_HEAD(&dd->dispatch);
 
+	/* We dispatch from request queue wide instead of hw queue */
+	blk_queue_flag_set(QUEUE_FLAG_SQ_SCHED, q);
+
 	q->elevator = eq;
 	return 0;
 }
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index b6da530894b4..49578094b500 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -642,6 +642,7 @@ struct request_queue {
 #define QUEUE_FLAG_ADD_RANDOM	10	/* Contributes to random pool */
 #define QUEUE_FLAG_SECERASE	11	/* supports secure erase */
 #define QUEUE_FLAG_SAME_FORCE	12	/* force complete on same CPU */
+#define QUEUE_FLAG_SQ_SCHED	13	/* single queue style io dispatch */
 #define QUEUE_FLAG_INIT_DONE	14	/* queue is initialized */
 #define QUEUE_FLAG_STABLE_WRITES 15	/* don't modify blks until WB is done */
 #define QUEUE_FLAG_POLL		16	/* IO polling enabled if set */
@@ -707,6 +708,7 @@ bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
 #define blk_queue_fua(q)	test_bit(QUEUE_FLAG_FUA, &(q)->queue_flags)
 #define blk_queue_registered(q)	test_bit(QUEUE_FLAG_REGISTERED, &(q)->queue_flags)
 #define blk_queue_nowait(q)	test_bit(QUEUE_FLAG_NOWAIT, &(q)->queue_flags)
+#define blk_queue_sq_sched(q)	test_bit(QUEUE_FLAG_SQ_SCHED, &(q)->queue_flags)
 
 extern void blk_set_pm_only(struct request_queue *q);
 extern void blk_clear_pm_only(struct request_queue *q);
diff --git a/include/linux/elevator.h b/include/linux/elevator.h
index 1363b5858486..820563e85c41 100644
--- a/include/linux/elevator.h
+++ b/include/linux/elevator.h
@@ -188,8 +188,6 @@ extern struct request *elv_rb_find(struct rb_root *, sector_t);
 
 /* Supports zoned block devices sequential write constraint */
 #define ELEVATOR_F_ZBD_SEQ_WRITE	(1U << 0)
-/* Supports scheduling on multiple hardware queues */
-#define ELEVATOR_F_MQ_AWARE		(1U << 1)
 
 #endif /* CONFIG_BLOCK */
 #endif
-- 
2.39.2
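The idea behind QUEUE_FLAG_SQ_SCHED can be shown with a reduced userspace
model. It is a sketch under stated assumptions: the demo_* types and the
single_queue field are hypothetical (in the patch each scheduler's init
function sets or clears the flag itself), and C11 atomics stand in for
the kernel's bit operations. The point it illustrates is that the hot
path reads a bit stored in the queue and never follows the scheduler
pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the bit number chosen in the patch; the rest is hypothetical. */
enum { DEMO_FLAG_SQ_SCHED = 13 };

struct demo_sched_type {
	const char *name;
	bool single_queue;	/* dispatches queue-wide, not per hw queue */
};

struct demo_queue {
	atomic_ulong flags;			/* models q->queue_flags */
	const struct demo_sched_type *sched;	/* slow path only */
};

/* Slow path: record the dispatch style in the queue's own flag word. */
static void demo_init_sched(struct demo_queue *q,
			    const struct demo_sched_type *t)
{
	if (t && t->single_queue)
		atomic_fetch_or(&q->flags, 1UL << DEMO_FLAG_SQ_SCHED);
	else
		atomic_fetch_and(&q->flags, ~(1UL << DEMO_FLAG_SQ_SCHED));
	q->sched = t;
}

/* Hot path: a plain bit test, with no dereference of q->sched. */
static bool demo_queue_sq_sched(struct demo_queue *q)
{
	return (atomic_load(&q->flags) & (1UL << DEMO_FLAG_SQ_SCHED)) != 0;
}
```

Because the flag lives in the queue, its lifetime matches the queue's, so
the test stays valid even while the scheduler object is being replaced.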

From: Bart Van Assche <bvanassche@acm.org>

mainline inclusion
from mainline-v6.5-rc1
commit d5fb8726f1dea70543a93ab1d7332857f157b7f3
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IBY0UQ

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...

------------------

See also:
* Commit 4d337cebcb1c ("blk-mq: avoid to touch q->elevator without any
  protection").
* Commit 414dd48e882c ("blk-mq: add tagset quiesce interface").

Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230518222708.1190867-1-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Conflicts:
	block/blk-mq-debugfs.c
	[Due to not merging commit 414dd48e882c ("blk-mq: add tagset quiesce
	interface") and commit 3222d8c2a7f8 ("block: remove ->rw_page").]
Signed-off-by: Zheng Qixing <zhengqixing@huawei.com>
---
 block/blk-mq-debugfs.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index eacb6d5d2085..8aab3ef2d31f 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -130,6 +130,7 @@ static const char *const blk_queue_flag_name[] = {
 	QUEUE_FLAG_NAME(RQ_ALLOC_TIME),
 	QUEUE_FLAG_NAME(HCTX_ACTIVE),
 	QUEUE_FLAG_NAME(NOWAIT),
+	QUEUE_FLAG_NAME(SQ_SCHED),
 	QUEUE_FLAG_NAME(DISPATCH_ASYNC),
 };
 #undef QUEUE_FLAG_NAME
-- 
2.39.2
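Why a flag needs an entry in the name table can be seen in a small
userspace model of such a decoder. The sketch below is an assumption
made for illustration: the demo_* names and the flag set are
hypothetical, and only the designated-initializer table pattern is taken
from the diff above. A bit with a table entry is printed by name, while
a bit without one falls back to its number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical flag set; UNNAMED deliberately has no table entry. */
enum {
	DEMO_FLAG_STOPPED,
	DEMO_FLAG_DYING,
	DEMO_FLAG_SQ_SCHED,
	DEMO_FLAG_UNNAMED,
	DEMO_FLAG_MAX
};

#define DEMO_FLAG_NAME(name) [DEMO_FLAG_##name] = #name
static const char *const demo_flag_name[] = {
	DEMO_FLAG_NAME(STOPPED),
	DEMO_FLAG_NAME(DYING),
	DEMO_FLAG_NAME(SQ_SCHED),
};
#undef DEMO_FLAG_NAME

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Write the set bits of 'flags' into 'buf' separated by '|'.
 * Returns the number of characters written, or -1 if 'buf' is too small.
 */
static int demo_flags_show(char *buf, size_t len, unsigned long flags)
{
	size_t used = 0;

	if (len == 0)
		return -1;
	buf[0] = '\0';

	for (int bit = 0; bit < DEMO_FLAG_MAX; bit++) {
		const char *sep = used ? "|" : "";
		const char *name = NULL;
		int n;

		if (!(flags & (1UL << bit)))
			continue;
		if ((size_t)bit < DEMO_ARRAY_SIZE(demo_flag_name))
			name = demo_flag_name[bit];

		if (name)
			n = snprintf(buf + used, len - used, "%s%s", sep, name);
		else	/* no table entry: fall back to the bit number */
			n = snprintf(buf + used, len - used, "%s%d", sep, bit);
		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}
```

With DYING, SQ_SCHED and UNNAMED set, this model produces
"DYING|SQ_SCHED|3": the last flag is still visible but no longer
self-describing, which is the gap the one-line patch closes for SQ_SCHED.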

Feedback:
The patch(es) you sent to the kernel@openeuler.org mailing list have been converted to a pull request successfully!
Pull request link: https://gitee.com/openeuler/kernel/pulls/15941
Mailing list address: https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/6B7...
participants (2): patchwork bot, Zheng Qixing