From: Douglas Miller doug.miller@cornelisnetworks.com
stable inclusion from stable-v5.10.111 commit 5f54364ff6cfcd14cddf5441c4a490bb28dd69f7 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5GL1Z
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit 2bbac98d0930e8161b1957dc0ec99de39ade1b3c upstream.
Under certain conditions, such as MPI_Abort, the hfi1 cleanup code may represent the last reference held on the task mm. hfi1_mmu_rb_unregister() then drops the last reference and the mm is freed before the final use in hfi1_release_user_pages(). A new task may allocate the mm structure while it is still being used, resulting in problems. One manifestation is corruption of the mmap_sem counter leading to a hang in down_write(). Another is corruption of an mm struct that is in use by another task.
Fixes: 3d2a9d642512 ("IB/hfi1: Ensure correct mm is used at all times") Link: https://lore.kernel.org/r/20220408133523.122165.72975.stgit@awfm-01.cornelis... Cc: stable@vger.kernel.org Signed-off-by: Douglas Miller doug.miller@cornelisnetworks.com Signed-off-by: Dennis Dalessandro dennis.dalessandro@cornelisnetworks.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Reviewed-by: Wei Li liwei391@huawei.com --- drivers/infiniband/hw/hfi1/mmu_rb.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.c b/drivers/infiniband/hw/hfi1/mmu_rb.c index d213f65d4cdd..ed8a96ae61ce 100644 --- a/drivers/infiniband/hw/hfi1/mmu_rb.c +++ b/drivers/infiniband/hw/hfi1/mmu_rb.c @@ -121,6 +121,9 @@ void hfi1_mmu_rb_unregister(struct mmu_rb_handler *handler) unsigned long flags; struct list_head del_list;
+ /* Prevent freeing of mm until we are completely finished. */ + mmgrab(handler->mn.mm); + /* Unregister first so we don't get any more notifications. */ mmu_notifier_unregister(&handler->mn, handler->mn.mm);
@@ -143,6 +146,9 @@ void hfi1_mmu_rb_unregister(struct mmu_rb_handler *handler)
do_remove(handler, &del_list);
+ /* Now the mm may be freed. */ + mmdrop(handler->mn.mm); + kfree(handler); }