From: Ming Lei ming.lei@redhat.com
mainline inclusion from mainline-v5.14 commit 2cafe29a8d03f02a3d16193bdaae2f3e82a423f9 category: bugfix bugzilla: 182135 CVE: NA
-------------------------------------------------
Yi reported several kernel panics on:
[16687.001777] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 ... [16687.163549] pc : __rq_qos_track+0x38/0x60
or
[ 997.690455] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020 ... [ 997.850347] pc : __rq_qos_done+0x2c/0x50
Turns out it is caused by race between adding rq qos(wbt) and normal IO because rq_qos_add can be run when IO is being submitted, fix this issue by freezing queue before adding/deleting rq qos to queue.
rq_qos_exit() needn't to freeze queue because it is called after queue has been frozen.
iolatency calls rq_qos_add() during allocating queue, so freezing won't add delay because queue usage refcount works at atomic mode at that time.
iocost calls rq_qos_add() when writing cgroup attribute file, that is fine to freeze queue at that time since we usually freeze queue when storing to queue sysfs attribute, meantime iocost only exists on the root cgroup.
wbt_init calls it in blk_register_queue() and queue sysfs attribute store(queue_wb_lat_store() when write it 1st time in case of !BLK_WBT_MQ), the following patch will speedup the queue freezing in wbt_init.
Reported-by: Yi Zhang yi.zhang@redhat.com Cc: Bart Van Assche bvanassche@acm.org Signed-off-by: Ming Lei ming.lei@redhat.com Reviewed-by: Bart Van Assche bvanassche@acm.org Tested-by: Yi Zhang yi.zhang@redhat.com Link: https://lore.kernel.org/r/20210609015822.103433-2-ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk
Conflict: block/blk-rq-qos.h Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com --- block/blk-rq-qos.h | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)
diff --git a/block/blk-rq-qos.h b/block/blk-rq-qos.h index 98caba3e962ee..c99b2facd9ffd 100644 --- a/block/blk-rq-qos.h +++ b/block/blk-rq-qos.h @@ -6,6 +6,7 @@ #include <linux/blk_types.h> #include <linux/atomic.h> #include <linux/wait.h> +#include <linux/blk-mq.h>
enum rq_qos_id { RQ_QOS_WBT, @@ -74,20 +75,43 @@ static inline void rq_wait_init(struct rq_wait *rq_wait)
static inline void rq_qos_add(struct request_queue *q, struct rq_qos *rqos) { + /* + * No IO can be in-flight when adding rqos, so freeze queue, which + * is fine since we only support rq_qos for blk-mq queue. + * + * Reuse ->queue_lock for protecting against other concurrent + * rq_qos adding/deleting + */ + blk_mq_freeze_queue(q); + + spin_lock_irq(q->queue_lock); rqos->next = q->rq_qos; q->rq_qos = rqos; + spin_unlock_irq(q->queue_lock); + + blk_mq_unfreeze_queue(q); }
static inline void rq_qos_del(struct request_queue *q, struct rq_qos *rqos) { struct rq_qos **cur;
+ /* + * See comment in rq_qos_add() about freezing queue & using + * ->queue_lock. + */ + blk_mq_freeze_queue(q); + + spin_lock_irq(q->queue_lock); for (cur = &q->rq_qos; *cur; cur = &(*cur)->next) { if (*cur == rqos) { *cur = rqos->next; break; } } + spin_unlock_irq(q->queue_lock); + + blk_mq_unfreeze_queue(q); }
bool rq_wait_inc_below(struct rq_wait *rq_wait, unsigned int limit);
From: Ming Lei ming.lei@redhat.com
mainline inclusion from mainline-v5.14 commit a72c374f97a4c7b2f9dde5144c867fec4bdcd798 category: bugfix bugzilla: 182135 CVE: NA
-------------------------------------------------
Mark queue init done when everything is done well in blk_register_queue(), so that wbt_enable_default() can be run quickly without any RCU period involved since adding rq qos requires to freeze queue.
Also no any side effect by delaying to mark queue init done.
Reported-by: Yi Zhang yi.zhang@redhat.com Cc: Bart Van Assche bvanassche@acm.org Signed-off-by: Ming Lei ming.lei@redhat.com Reviewed-by: Bart Van Assche bvanassche@acm.org Tested-by: Yi Zhang yi.zhang@redhat.com Link: https://lore.kernel.org/r/20210609015822.103433-3-ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk
Conflict: block/blk-sysfs.c Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com --- block/blk-sysfs.c | 31 ++++++++++++++++--------------- 1 file changed, 16 insertions(+), 15 deletions(-)
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 9687bff0244fb..4167f6a5c3f86 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -902,21 +902,6 @@ int blk_register_queue(struct gendisk *disk) kobject_name(&dev->kobj)); queue_flag_set_unlocked(QUEUE_FLAG_REGISTERED, q);
- /* - * SCSI probing may synchronously create and destroy a lot of - * request_queues for non-existent devices. Shutting down a fully - * functional queue takes measureable wallclock time as RCU grace - * periods are involved. To avoid excessive latency in these - * cases, a request_queue starts out in a degraded mode which is - * faster to shut down and is made fully functional here as - * request_queues for non-existent devices never get registered. - */ - if (!blk_queue_init_done(q)) { - queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q); - percpu_ref_switch_to_percpu(&q->q_usage_counter); - blk_queue_bypass_end(q); - } - ret = blk_trace_init_sysfs(dev); if (ret) return ret; @@ -956,6 +941,22 @@ int blk_register_queue(struct gendisk *disk) ret = 0; unlock: mutex_unlock(&q->sysfs_lock); + + /* + * SCSI probing may synchronously create and destroy a lot of + * request_queues for non-existent devices. Shutting down a fully + * functional queue takes measureable wallclock time as RCU grace + * periods are involved. To avoid excessive latency in these + * cases, a request_queue starts out in a degraded mode which is + * faster to shut down and is made fully functional here as + * request_queues for non-existent devices never get registered. + */ + if (!blk_queue_init_done(q)) { + queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q); + percpu_ref_switch_to_percpu(&q->q_usage_counter); + blk_queue_bypass_end(q); + } + return ret; } EXPORT_SYMBOL_GPL(blk_register_queue);
From: Laibin Qiu qiulaibin@huawei.com
hulk inclusion category: bugfix bugzilla: 182135 CVE: NA
------------------------------------------------- Add header file will generates kabi change problem. Clean up by __GENKSYMS__. Signed-off-by: Laibin Qiu qiulaibin@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com --- block/blk-rq-qos.h | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/block/blk-rq-qos.h b/block/blk-rq-qos.h index c99b2facd9ffd..d5daed923c725 100644 --- a/block/blk-rq-qos.h +++ b/block/blk-rq-qos.h @@ -6,7 +6,9 @@ #include <linux/blk_types.h> #include <linux/atomic.h> #include <linux/wait.h> +#ifndef __GENKSYMS__ #include <linux/blk-mq.h> +#endif
enum rq_qos_id { RQ_QOS_WBT,