From: Juan Zhou zhoujuan51@h-partners.com
Some bugfixes and cleanups.
Juan Zhou (1):
  RDMA/hns: Fix incorrect variable usage in scc_attr_is_visible()

Junxian Huang (1):
  RDMA/hns: Fix a potential Sleep-in-Atomic-Context

wenglianfa (3):
  RDMA/hns: Add mutex_destroy() to destroy the mutex
  RDMA/hns: Fix scc_param delay_work to execute after sysfs shutdown
  RDMA/hns: Fix null pointer when alloc_scc_param() fails
 drivers/infiniband/hw/hns/hns_roce_bond.c   |  2 ++
 drivers/infiniband/hw/hns/hns_roce_device.h |  4 +--
 drivers/infiniband/hw/hns/hns_roce_main.c   | 13 ++++++--
 drivers/infiniband/hw/hns/hns_roce_sysfs.c  | 36 ++++++++++-----------
 4 files changed, 32 insertions(+), 23 deletions(-)
From: Junxian Huang huangjunxian6@hisilicon.com
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I9GZX2
----------------------------------------------------------------------
hns_roce_get_bond_netdev() takes bond_mutex, and acquiring a mutex may sleep. hns_roce_query_port() calls it while holding the iboe.lock spinlock, which may lead to a Sleep-in-Atomic-Context.
Since hns_roce_get_bond_netdev() doesn't involve iboe, move the call out of the critical section of iboe.lock.
Fixes: 2004b3f9092a ("RDMA/hns: Support RoCE bonding")
Signed-off-by: Junxian Huang huangjunxian6@hisilicon.com
Signed-off-by: Juan Zhou zhoujuan51@h-partners.com
---
 drivers/infiniband/hw/hns/hns_roce_main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index 905fd20a4..eef826f5b 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -316,9 +316,10 @@ static int hns_roce_query_port(struct ib_device *ib_dev, u32 port_num,
 	if (ret)
 		ibdev_warn(ib_dev, "failed to get speed, ret = %d.\n", ret);
 
+	net_dev = hr_dev->hw->get_bond_netdev(hr_dev);
+
 	spin_lock_irqsave(&hr_dev->iboe.lock, flags);
 
-	net_dev = hr_dev->hw->get_bond_netdev(hr_dev);
 	if (!net_dev)
 		net_dev = get_hr_netdev(hr_dev, port);
 	if (!net_dev) {
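
The reordering in this patch can be illustrated with a small userspace model. This is only a sketch: every toy_* name is hypothetical and not part of the driver, the "spinlock" is just a flag, and the model only counts how often a call that may sleep runs inside the critical section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the driver. */
struct toy_dev {
	bool in_atomic;          /* true while the modelled iboe.lock is held */
	int sleep_in_atomic;     /* times a sleeping call ran under the lock */
	const char *bond_netdev; /* NULL when no bond is active */
	const char *port_netdev; /* fallback netdev of the port */
};

static void toy_spin_lock(struct toy_dev *d)
{
	assert(!d->in_atomic);   /* the model does not allow nested locking */
	d->in_atomic = true;
}

static void toy_spin_unlock(struct toy_dev *d)
{
	d->in_atomic = false;
}

/* Models get_bond_netdev(): it takes a mutex, so it may sleep. */
static const char *toy_get_bond_netdev(struct toy_dev *d)
{
	if (d->in_atomic)
		d->sleep_in_atomic++;
	return d->bond_netdev;
}

/* Old ordering: the sleeping call sits inside the critical section. */
static const char *toy_query_port_old(struct toy_dev *d)
{
	const char *net_dev;

	toy_spin_lock(d);
	net_dev = toy_get_bond_netdev(d);
	if (!net_dev)
		net_dev = d->port_netdev;
	toy_spin_unlock(d);
	return net_dev;
}

/* New ordering: the sleeping call happens before the lock is taken. */
static const char *toy_query_port_new(struct toy_dev *d)
{
	const char *net_dev;

	net_dev = toy_get_bond_netdev(d);
	toy_spin_lock(d);
	if (!net_dev)
		net_dev = d->port_netdev;
	toy_spin_unlock(d);
	return net_dev;
}
```

Both orderings return the same netdev; only the old one bumps sleep_in_atomic, which is the condition the patch removes.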
From: wenglianfa wenglianfa@huawei.com
driver inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I9GZX2
----------------------------------------------------------------------
Add mutex_destroy() to destroy the mutex.
Signed-off-by: wenglianfa wenglianfa@huawei.com
Signed-off-by: Juan Zhou zhoujuan51@h-partners.com
---
 drivers/infiniband/hw/hns/hns_roce_bond.c | 2 ++
 drivers/infiniband/hw/hns/hns_roce_main.c | 1 +
 2 files changed, 3 insertions(+)

diff --git a/drivers/infiniband/hw/hns/hns_roce_bond.c b/drivers/infiniband/hw/hns/hns_roce_bond.c
index 146eeb7f4..4b2b5538c 100644
--- a/drivers/infiniband/hw/hns/hns_roce_bond.c
+++ b/drivers/infiniband/hw/hns/hns_roce_bond.c
@@ -629,6 +629,7 @@ int hns_roce_cleanup_bond(struct hns_roce_bond_group *bond_grp)
 
 	completion_no_waiter = completion_done(&bond_grp->bond_work_done);
 	complete(&bond_grp->bond_work_done);
+	mutex_destroy(&bond_grp->bond_mutex);
 	if (completion_no_waiter)
 		kfree(bond_grp);
 
@@ -780,6 +781,7 @@ static struct hns_roce_bond_group *hns_roce_alloc_bond_grp(struct hns_roce_dev *
 	if (ret) {
 		ibdev_err(&main_hr_dev->ib_dev,
 			  "failed to alloc bond ID, ret = %d.\n", ret);
+		mutex_destroy(&bond_grp->bond_mutex);
 		kfree(bond_grp);
 		return NULL;
 	}
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index eef826f5b..992e6dfaa 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -1315,6 +1315,7 @@ err_uar_table_free:
 	if (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_CQ_RECORD_DB ||
 	    hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_QP_RECORD_DB)
 		mutex_destroy(&hr_dev->pgdir_mutex);
+	mutex_destroy(&hr_dev->uctx_list_mutex);
 
 	return ret;
 }
From: wenglianfa wenglianfa@huawei.com
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I9GZX2
----------------------------------------------------------------------
After sysfs is disabled, the scc delay_work may still be executed, causing a use-after-free (UAF). To fix it, call cancel_delayed_work_sync() so that the scc delay_work has either been canceled or has finished executing before scc_param is freed.
Fixes: 41da9cd8456d ("RDMA/hns: Support congestion control algorithm parameter configuration")
Signed-off-by: wenglianfa wenglianfa@huawei.com
---
 drivers/infiniband/hw/hns/hns_roce_device.h |  4 +--
 drivers/infiniband/hw/hns/hns_roce_main.c   |  9 ++++--
 drivers/infiniband/hw/hns/hns_roce_sysfs.c  | 31 ++++++++++-----------
 3 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 538e3ed03..80870c278 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -1452,8 +1452,6 @@ struct hns_user_mmap_entry *
 hns_roce_user_mmap_entry_insert(struct ib_ucontext *ucontext, u64 address,
 				size_t length,
 				enum hns_roce_mmap_type mmap_type);
-void hns_roce_register_sysfs(struct hns_roce_dev *hr_dev);
-void hns_roce_unregister_sysfs(struct hns_roce_dev *hr_dev);
 void hns_roce_add_unfree_umem(struct hns_roce_user_db_page *user_page,
 			      struct hns_roce_dev *hr_dev);
 void hns_roce_free_unfree_umem(struct hns_roce_dev *hr_dev);
@@ -1461,4 +1459,6 @@ void hns_roce_add_unfree_mtr(struct hns_roce_mtr_node *pos,
 			     struct hns_roce_dev *hr_dev,
 			     struct hns_roce_mtr *mtr);
 void hns_roce_free_unfree_mtr(struct hns_roce_dev *hr_dev);
+int hns_roce_alloc_scc_param(struct hns_roce_dev *hr_dev);
+void hns_roce_dealloc_scc_param(struct hns_roce_dev *hr_dev);
 #endif /* _HNS_ROCE_DEVICE_H */
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index 992e6dfaa..ab43b6688 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -1444,16 +1444,21 @@ int hns_roce_init(struct hns_roce_dev *hr_dev)
 		}
 	}
 
+	ret = hns_roce_alloc_scc_param(hr_dev);
+	if (ret)
+		dev_err(hr_dev->dev, "alloc scc param failed, ret = %d!\n",
+			ret);
+
 	ret = hns_roce_register_device(hr_dev);
 	if (ret)
 		goto error_failed_register_device;
 
-	hns_roce_register_sysfs(hr_dev);
 	hns_roce_register_debugfs(hr_dev);
 
 	return 0;
 
 error_failed_register_device:
+	hns_roce_dealloc_scc_param(hr_dev);
 	if (hr_dev->hw->hw_exit)
 		hr_dev->hw->hw_exit(hr_dev);
 
@@ -1483,8 +1488,8 @@ error_failed_alloc_dfx_cnt:
 
 void hns_roce_exit(struct hns_roce_dev *hr_dev, bool bond_cleanup)
 {
-	hns_roce_unregister_sysfs(hr_dev);
 	hns_roce_unregister_device(hr_dev, bond_cleanup);
+	hns_roce_dealloc_scc_param(hr_dev);
 	hns_roce_unregister_debugfs(hr_dev);
 
 	if (hr_dev->hw->hw_exit)
diff --git a/drivers/infiniband/hw/hns/hns_roce_sysfs.c b/drivers/infiniband/hw/hns/hns_roce_sysfs.c
index a9708d28a..110d558a5 100644
--- a/drivers/infiniband/hw/hns/hns_roce_sysfs.c
+++ b/drivers/infiniband/hw/hns/hns_roce_sysfs.c
@@ -33,7 +33,7 @@ static void get_default_scc_param(struct hns_roce_dev *hr_dev)
 	}
 }
 
-static int alloc_scc_param(struct hns_roce_dev *hr_dev)
+int hns_roce_alloc_scc_param(struct hns_roce_dev *hr_dev)
 {
 	struct hns_roce_scc_param *scc_param;
 	int i;
@@ -56,6 +56,19 @@ static int alloc_scc_param(struct hns_roce_dev *hr_dev)
 
 	return 0;
 }
+void hns_roce_dealloc_scc_param(struct hns_roce_dev *hr_dev)
+{
+	int i;
+
+	if (!hr_dev->scc_param)
+		return;
+
+	for (i = 0; i < HNS_ROCE_SCC_ALGO_TOTAL; i++)
+		cancel_delayed_work_sync(&hr_dev->scc_param[i].scc_cfg_dwork);
+
+	kvfree(hr_dev->scc_param);
+	hr_dev->scc_param = NULL;
+}
 
 struct hns_port_cc_attr {
 	struct ib_port_attribute port_attr;
@@ -328,19 +341,3 @@ const struct attribute_group *hns_attr_port_groups[] = {
 	&dip_cc_param_group,
 	NULL,
 };
-
-void hns_roce_register_sysfs(struct hns_roce_dev *hr_dev)
-{
-	int ret;
-
-	ret = alloc_scc_param(hr_dev);
-	if (ret)
-		dev_err(hr_dev->dev, "alloc scc param failed, ret = %d!\n",
-			ret);
-}
-
-void hns_roce_unregister_sysfs(struct hns_roce_dev *hr_dev)
-{
-	if (hr_dev->scc_param)
-		kvfree(hr_dev->scc_param);
-}
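
The teardown ordering introduced by hns_roce_dealloc_scc_param() (cancel every pending work item first, free the parameter array afterwards) can be sketched in userspace as follows. All toy_* names are hypothetical; the queue array is a stand-in for the workqueue's references to pending work, and clearing a slot stands in for cancel_delayed_work_sync(). It is a model of the ordering under those assumptions, not of the kernel workqueue API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_ALGO_TOTAL 4

/* Hypothetical stand-in for one algorithm's parameter block. */
struct toy_scc_param {
	int value;
};

struct toy_dev {
	struct toy_scc_param *scc_param;
	/* Models the workqueue's references to pending work items. */
	struct toy_scc_param *queue[TOY_ALGO_TOTAL];
};

static int toy_alloc_scc_param(struct toy_dev *d)
{
	d->scc_param = calloc(TOY_ALGO_TOTAL, sizeof(*d->scc_param));
	return d->scc_param ? 0 : -1;
}

/* Models a sysfs store: record the value and queue the delayed work. */
static void toy_store(struct toy_dev *d, int algo, int value)
{
	assert(algo >= 0 && algo < TOY_ALGO_TOTAL);
	d->scc_param[algo].value = value;
	d->queue[algo] = &d->scc_param[algo];
}

/* Models the workqueue firing; returns how many work items ran. */
static int toy_run_pending(struct toy_dev *d)
{
	int i, ran = 0;

	for (i = 0; i < TOY_ALGO_TOTAL; i++) {
		if (!d->queue[i])
			continue;
		(void)d->queue[i]->value; /* the work dereferences the param */
		d->queue[i] = NULL;
		ran++;
	}
	return ran;
}

/* Cancel all pending work before the memory it points at is freed. */
static void toy_dealloc_scc_param(struct toy_dev *d)
{
	int i;

	if (!d->scc_param)
		return;

	for (i = 0; i < TOY_ALGO_TOTAL; i++)
		d->queue[i] = NULL; /* stand-in for cancel_delayed_work_sync() */

	free(d->scc_param);
	d->scc_param = NULL;
}
```

If the free happened first, a later toy_run_pending() would dereference freed memory, which is the UAF described in the commit message. Setting scc_param to NULL also makes a second dealloc call harmless.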
From: wenglianfa wenglianfa@huawei.com
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I9GZX2
----------------------------------------------------------------------
A failure of alloc_scc_param() does not make sysfs unavailable. In that case hr_dev->scc_param is NULL, and if the user triggers scc_attr_show() or scc_attr_store(), hr_dev->scc_param is dereferenced and a null pointer error is reported. To fix it, make the scc_param attributes invisible when alloc_scc_param() fails.
Fixes: 41da9cd8456d ("RDMA/hns: Support congestion control algorithm parameter configuration")
Signed-off-by: wenglianfa wenglianfa@huawei.com
Signed-off-by: Juan Zhou zhoujuan51@h-partners.com
---
 drivers/infiniband/hw/hns/hns_roce_sysfs.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/infiniband/hw/hns/hns_roce_sysfs.c b/drivers/infiniband/hw/hns/hns_roce_sysfs.c
index 110d558a5..8429076d8 100644
--- a/drivers/infiniband/hw/hns/hns_roce_sysfs.c
+++ b/drivers/infiniband/hw/hns/hns_roce_sysfs.c
@@ -171,6 +171,9 @@ static umode_t scc_attr_is_visible(struct kobject *kobj,
 	struct ib_device *ibdev = ib_port_sysfs_get_ibdev_kobj(kobj, &port_num);
 	struct hns_roce_dev *hr_dev = to_hr_dev(ibdev);
 
+	if (!hr_dev->scc_param)
+		return 0;
+
 	if (hr_dev->is_vf ||
 	    !(hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_QP_FLOW_CTRL))
 		return 0;
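
A minimal userspace sketch of the gating idea: when the visibility callback returns mode 0 the attribute file is not created, so the show path can never be reached with a NULL scc_param. The toy_* names are hypothetical, and toy_show() returning -1 merely models "the file does not exist".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a device whose scc_param allocation may fail. */
struct toy_dev {
	int *scc_param; /* NULL when the allocation failed */
};

/* Mode 0 hides the attribute; 0644 exposes it read/write. */
static unsigned short toy_is_visible(const struct toy_dev *d)
{
	if (!d->scc_param)
		return 0;
	return 0644;
}

/* Only reachable for visible attributes, so the dereference is safe. */
static int toy_show(const struct toy_dev *d, int *out)
{
	assert(out != NULL);
	if (!toy_is_visible(d))
		return -1; /* models the attribute file not existing */
	*out = d->scc_param[0];
	return 0;
}
```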
From: Juan Zhou zhoujuan51@h-partners.com
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I9GZX2
----------------------------------------------------------------------
The supported algorithm capabilities should be checked, instead of the default algorithm type.
Fixes: 41da9cd8456d ("RDMA/hns: Support congestion control algorithm parameter configuration")
Signed-off-by: Juan Zhou zhoujuan51@h-partners.com
---
 drivers/infiniband/hw/hns/hns_roce_sysfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_sysfs.c b/drivers/infiniband/hw/hns/hns_roce_sysfs.c
index 8429076d8..d36f05ac5 100644
--- a/drivers/infiniband/hw/hns/hns_roce_sysfs.c
+++ b/drivers/infiniband/hw/hns/hns_roce_sysfs.c
@@ -178,7 +178,7 @@ static umode_t scc_attr_is_visible(struct kobject *kobj,
 	    !(hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_QP_FLOW_CTRL))
 		return 0;
 
-	if (!(hr_dev->caps.default_cong_type & (1 << scc_attr->algo_type)))
+	if (!(hr_dev->caps.cong_cap & (1 << scc_attr->algo_type)))
 		return 0;
 
 	return 0644;
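
To illustrate why the field matters, here is a small sketch that assumes cong_cap is a bitmask of supported algorithms while default_cong_type describes only the single default selection. That layout is an assumption made for illustration, and the toy_* and TOY_* names are hypothetical, not the driver's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical algorithm indices; bit n of a mask refers to algorithm n. */
enum toy_algo { TOY_ALGO_A, TOY_ALGO_B, TOY_ALGO_C, TOY_ALGO_TOTAL };

struct toy_caps {
	unsigned int cong_cap;          /* assumed: every supported algorithm */
	unsigned int default_cong_type; /* assumed: only the default algorithm */
};

/* Old check: only the default algorithm's attributes become visible. */
static bool toy_visible_old(const struct toy_caps *c, enum toy_algo a)
{
	assert(a < TOY_ALGO_TOTAL);
	return (c->default_cong_type & (1u << a)) != 0;
}

/* New check: every supported algorithm's attributes become visible. */
static bool toy_visible_new(const struct toy_caps *c, enum toy_algo a)
{
	assert(a < TOY_ALGO_TOTAL);
	return (c->cong_cap & (1u << a)) != 0;
}
```

With A and B supported and A as the default, the old check hides B's parameters even though the hardware supports B; the new check exposes both and still hides the unsupported C.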