From: Moshe Shemesh moshe@nvidia.com
stable inclusion from stable-v6.6.11 commit 04ebb29dc9aabd4c71f5f796e9fa7fee7c477456 bugzilla: https://gitee.com/openeuler/kernel/issues/I99TJK
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit a53e215f90079f617360439b1b6284820731e34c ]
The cited patch tries to ensure no pending works on the mkey cache workqueue by disabling adding new works and call flush_workqueue(). But this workqueue also has delayed works which might still be pending the delay time to be queued.
Add cancel_delayed_work() for the delayed works which waits to be queued and then the flush_workqueue() will flush all works which are already queued and running.
Fixes: 374012b00457 ("RDMA/mlx5: Fix mkey cache possible deadlock on cleanup") Link: https://lore.kernel.org/r/b8722f14e7ed81452f791764a26d2ed4cfa11478.169825617... Signed-off-by: Moshe Shemesh moshe@nvidia.com Signed-off-by: Leon Romanovsky leonro@nvidia.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: ZhangPeng zhangpeng362@huawei.com --- drivers/infiniband/hw/mlx5/mr.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c index 8a3762d9ff58..e0629898c3c0 100644 --- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -1026,11 +1026,13 @@ void mlx5_mkey_cache_cleanup(struct mlx5_ib_dev *dev) return;
mutex_lock(&dev->cache.rb_lock); + cancel_delayed_work(&dev->cache.remove_ent_dwork); for (node = rb_first(root); node; node = rb_next(node)) { ent = rb_entry(node, struct mlx5_cache_ent, node); xa_lock_irq(&ent->mkeys); ent->disabled = true; xa_unlock_irq(&ent->mkeys); + cancel_delayed_work(&ent->dwork); } mutex_unlock(&dev->cache.rb_lock);