[PATCH openEuler-1.0-LTS 1/3] block: fix race between adding/removing rq qos and normal IO

From: Ming Lei <ming.lei@redhat.com> mainline inclusion from mainline-v5.14 commit 2cafe29a8d03f02a3d16193bdaae2f3e82a423f9 category: bugfix bugzilla: 182135 CVE: NA ------------------------------------------------- Yi reported several kernel panics on: [16687.001777] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 ... [16687.163549] pc : __rq_qos_track+0x38/0x60 or [ 997.690455] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020 ... [ 997.850347] pc : __rq_qos_done+0x2c/0x50 Turns out it is caused by race between adding rq qos(wbt) and normal IO because rq_qos_add can be run when IO is being submitted, fix this issue by freezing queue before adding/deleting rq qos to queue. rq_qos_exit() needn't to freeze queue because it is called after queue has been frozen. iolatency calls rq_qos_add() during allocating queue, so freezing won't add delay because queue usage refcount works at atomic mode at that time. iocost calls rq_qos_add() when writing cgroup attribute file, that is fine to freeze queue at that time since we usually freeze queue when storing to queue sysfs attribute, meantime iocost only exists on the root cgroup. wbt_init calls it in blk_register_queue() and queue sysfs attribute store(queue_wb_lat_store() when write it 1st time in case of !BLK_WBT_MQ), the following patch will speedup the queue freezing in wbt_init. Reported-by: Yi Zhang <yi.zhang@redhat.com> Cc: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Yi Zhang <yi.zhang@redhat.com> Link: https://lore.kernel.org/r/20210609015822.103433-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Conflict: block/blk-rq-qos.h Signed-off-by: Laibin Qiu <qiulaibin@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> --- block/blk-rq-qos.h | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/block/blk-rq-qos.h b/block/blk-rq-qos.h index 98caba3e962ee..c99b2facd9ffd 100644 --- a/block/blk-rq-qos.h +++ b/block/blk-rq-qos.h @@ -6,6 +6,7 @@ #include <linux/blk_types.h> #include <linux/atomic.h> #include <linux/wait.h> +#include <linux/blk-mq.h> enum rq_qos_id { RQ_QOS_WBT, @@ -74,20 +75,43 @@ static inline void rq_wait_init(struct rq_wait *rq_wait) static inline void rq_qos_add(struct request_queue *q, struct rq_qos *rqos) { + /* + * No IO can be in-flight when adding rqos, so freeze queue, which + * is fine since we only support rq_qos for blk-mq queue. + * + * Reuse ->queue_lock for protecting against other concurrent + * rq_qos adding/deleting + */ + blk_mq_freeze_queue(q); + + spin_lock_irq(q->queue_lock); rqos->next = q->rq_qos; q->rq_qos = rqos; + spin_unlock_irq(q->queue_lock); + + blk_mq_unfreeze_queue(q); } static inline void rq_qos_del(struct request_queue *q, struct rq_qos *rqos) { struct rq_qos **cur; + /* + * See comment in rq_qos_add() about freezing queue & using + * ->queue_lock. + */ + blk_mq_freeze_queue(q); + + spin_lock_irq(q->queue_lock); for (cur = &q->rq_qos; *cur; cur = &(*cur)->next) { if (*cur == rqos) { *cur = rqos->next; break; } } + spin_unlock_irq(q->queue_lock); + + blk_mq_unfreeze_queue(q); } bool rq_wait_inc_below(struct rq_wait *rq_wait, unsigned int limit); -- 2.25.1

From: Ming Lei <ming.lei@redhat.com> mainline inclusion from mainline-v5.14 commit a72c374f97a4c7b2f9dde5144c867fec4bdcd798 category: bugfix bugzilla: 182135 CVE: NA ------------------------------------------------- Mark queue init done when everything is done well in blk_register_queue(), so that wbt_enable_default() can be run quickly without any RCU period involved since adding rq qos requires to freeze queue. Also no any side effect by delaying to mark queue init done. Reported-by: Yi Zhang <yi.zhang@redhat.com> Cc: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Yi Zhang <yi.zhang@redhat.com> Link: https://lore.kernel.org/r/20210609015822.103433-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Conflict: block/blk-sysfs.c Signed-off-by: Laibin Qiu <qiulaibin@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> --- block/blk-sysfs.c | 31 ++++++++++++++++--------------- 1 file changed, 16 insertions(+), 15 deletions(-) diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 9687bff0244fb..4167f6a5c3f86 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -902,21 +902,6 @@ int blk_register_queue(struct gendisk *disk) kobject_name(&dev->kobj)); queue_flag_set_unlocked(QUEUE_FLAG_REGISTERED, q); - /* - * SCSI probing may synchronously create and destroy a lot of - * request_queues for non-existent devices. Shutting down a fully - * functional queue takes measureable wallclock time as RCU grace - * periods are involved. To avoid excessive latency in these - * cases, a request_queue starts out in a degraded mode which is - * faster to shut down and is made fully functional here as - * request_queues for non-existent devices never get registered. - */ - if (!blk_queue_init_done(q)) { - queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q); - percpu_ref_switch_to_percpu(&q->q_usage_counter); - blk_queue_bypass_end(q); - } - ret = blk_trace_init_sysfs(dev); if (ret) return ret; @@ -956,6 +941,22 @@ int blk_register_queue(struct gendisk *disk) ret = 0; unlock: mutex_unlock(&q->sysfs_lock); + + /* + * SCSI probing may synchronously create and destroy a lot of + * request_queues for non-existent devices. Shutting down a fully + * functional queue takes measureable wallclock time as RCU grace + * periods are involved. To avoid excessive latency in these + * cases, a request_queue starts out in a degraded mode which is + * faster to shut down and is made fully functional here as + * request_queues for non-existent devices never get registered. + */ + if (!blk_queue_init_done(q)) { + queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q); + percpu_ref_switch_to_percpu(&q->q_usage_counter); + blk_queue_bypass_end(q); + } + return ret; } EXPORT_SYMBOL_GPL(blk_register_queue); -- 2.25.1

From: Laibin Qiu <qiulaibin@huawei.com> hulk inclusion category: bugfix bugzilla: 182135 CVE: NA ------------------------------------------------- Add header file will generates kabi change problem. Clean up by __GENKSYMS__. Signed-off-by: Laibin Qiu <qiulaibin@huawei.com> Reviewed-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> --- block/blk-rq-qos.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/block/blk-rq-qos.h b/block/blk-rq-qos.h index c99b2facd9ffd..d5daed923c725 100644 --- a/block/blk-rq-qos.h +++ b/block/blk-rq-qos.h @@ -6,7 +6,9 @@ #include <linux/blk_types.h> #include <linux/atomic.h> #include <linux/wait.h> +#ifndef __GENKSYMS__ #include <linux/blk-mq.h> +#endif enum rq_qos_id { RQ_QOS_WBT, -- 2.25.1
participants (1)
-
Yang Yingliang