From: Ming Lei <ming.lei@redhat.com>

mainline inclusion
from mainline-v5.4-rc1
commit f9934a80f91dba8c7029ba7601459e41ea7770aa
category: bugfix
bugzilla: NA
CVE: NA
Link: https://gitee.com/openeuler/kernel/issues/I1WGZE

-------------------------------------------------

blk-mq may schedule a queue's complete function to run on a remote CPU via IPI, but provides no way to synchronize with the request's complete fn. The current queue freeze interface can't provide this synchronization because aborted requests stay in blk-mq queues during EH.

In some drivers' EH (such as NVMe), a hardware queue's resources may be freed and re-allocated. If a completed request's complete fn finally runs after the hardware queue's resources have been released, a kernel crash will be triggered.

Prepare for fixing this kind of issue by introducing blk_mq_tagset_wait_completed_request().
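
The polling scheme used by the new helper can be illustrated with a small user-space sketch. Everything here (fake_request, fake_tagset, fake_busy_iter, fake_run_one_completion, fake_wait_completed) is a hypothetical stand-in rather than the blk-mq API, and it assumes a request leaves the "completed" state once its complete fn has run. Where the kernel helper sleeps with msleep(5) between rounds, the sketch finishes one pending completion per round so that it terminates deterministically.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical request states, loosely modeled on blk-mq's request states. */
enum rq_state { RQ_IDLE, RQ_IN_FLIGHT, RQ_COMPLETE };

struct fake_request {
	enum rq_state state;
};

struct fake_tagset {
	struct fake_request *rqs;
	unsigned nr;
};

typedef void (*fake_iter_fn)(struct fake_request *rq, void *data);

/* Call fn on every non-idle request (stand-in for a busy-tag iterator). */
static void fake_busy_iter(struct fake_tagset *set, fake_iter_fn fn, void *data)
{
	assert(set && fn);
	for (unsigned i = 0; i < set->nr; i++)
		if (set->rqs[i].state != RQ_IDLE)
			fn(&set->rqs[i], data);
}

/* Count requests marked completed whose complete fn has not finished yet. */
static void count_completed(struct fake_request *rq, void *data)
{
	unsigned *count = data;

	if (rq->state == RQ_COMPLETE)
		(*count)++;
}

/* Simulate one remote complete fn finishing: COMPLETE -> IDLE. */
static void fake_run_one_completion(struct fake_tagset *set)
{
	for (unsigned i = 0; i < set->nr; i++) {
		if (set->rqs[i].state == RQ_COMPLETE) {
			set->rqs[i].state = RQ_IDLE;
			return;
		}
	}
}

/*
 * Poll until no request is left in the completed state.
 * Returns the number of polling rounds that found pending completions.
 */
static unsigned fake_wait_completed(struct fake_tagset *set)
{
	unsigned rounds = 0;

	while (true) {
		unsigned count = 0;

		fake_busy_iter(set, count_completed, &count);
		if (!count)
			break;
		/* The kernel helper would sleep here; we simulate progress. */
		fake_run_one_completion(set);
		rounds++;
	}
	return rounds;
}
```

As in the kernel helper, the loop only terminates if no new completions can arrive, which is why the real function has to be called after all IO queues are shut down.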

Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Chao Leng <lengchao@huawei.com>
Reviewed-by: Jike Cheng <chengjike.cheng@huawei.com>
Conflicts:
	block/blk-mq-tag.c
	include/linux/blk-mq.h
[lrz: remain return type to void as 4.19]
Signed-off-by: Ruozhu Li <liruozhu@huawei.com>
Signed-off-by: Lijie <lijie34@huawei.com>
Reviewed-by: Tao Hou <houtao1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
---
 block/blk-mq-tag.c     | 31 +++++++++++++++++++++++++++++++
 include/linux/blk-mq.h |  1 +
 2 files changed, 32 insertions(+)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 323bbca53a17..ce7f7188625e 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -9,6 +9,7 @@
 #include <linux/module.h>
 
 #include <linux/blk-mq.h>
+#include <linux/delay.h>
 #include "blk.h"
 #include "blk-mq.h"
 #include "blk-mq-tag.h"
@@ -328,6 +329,36 @@ void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 }
 EXPORT_SYMBOL(blk_mq_tagset_busy_iter);
 
+static void blk_mq_tagset_count_completed_rqs(struct request *rq,
+		void *data, bool reserved)
+{
+	unsigned *count = data;
+
+	if (blk_mq_request_completed(rq))
+		(*count)++;
+}
+
+/**
+ * blk_mq_tagset_wait_completed_request - wait until all completed req's
+ * complete funtion is run
+ * @tagset:	Tag set to drain completed request
+ *
+ * Note: This function has to be run after all IO queues are shutdown
+ */
+void blk_mq_tagset_wait_completed_request(struct blk_mq_tag_set *tagset)
+{
+	while (true) {
+		unsigned count = 0;
+
+		blk_mq_tagset_busy_iter(tagset,
+				blk_mq_tagset_count_completed_rqs, &count);
+		if (!count)
+			break;
+		msleep(5);
+	}
+}
+EXPORT_SYMBOL(blk_mq_tagset_wait_completed_request);
+
 static void __blk_mq_queue_tag_busy_iter(struct request_queue *q,
 		busy_iter_fn *fn, void *priv, bool inflight)
 {
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index b414cad68024..d26edab21d5c 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -316,6 +316,7 @@ void blk_mq_run_hw_queues(struct request_queue *q, bool async);
 void blk_mq_delay_run_hw_queues(struct request_queue *q, unsigned long msecs);
 void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 		busy_tag_iter_fn *fn, void *priv);
+void blk_mq_tagset_wait_completed_request(struct blk_mq_tag_set *tagset);
 void blk_mq_freeze_queue(struct request_queue *q);
 void blk_mq_unfreeze_queue(struct request_queue *q);
 void blk_freeze_queue_start(struct request_queue *q);