[OLK-6.6 0/5] Some bugfixes and cleanups

From: Juan Zhou <zhoujuan51@h-partners.com>

Some bugfixes and cleanups.

Juan Zhou (1):
  RDMA/hns: Fix incorrect variable usage in scc_attr_is_visible()

Junxian Huang (1):
  RDMA/hns: Fix a potential Sleep-in-Atomic-Context

wenglianfa (3):
  RDMA/hns: Add mutex_destroy() to destroy the mutex
  RDMA/hns : Fix scc_param delay_work to execute after sysfs shutdown
  RDMA/hns : Fix null pointer when alloc_scc_param() fails

 drivers/infiniband/hw/hns/hns_roce_bond.c   |  2 ++
 drivers/infiniband/hw/hns/hns_roce_device.h |  4 +--
 drivers/infiniband/hw/hns/hns_roce_main.c   | 13 ++++++--
 drivers/infiniband/hw/hns/hns_roce_sysfs.c  | 36 ++++++++++-----------
 4 files changed, 32 insertions(+), 23 deletions(-)

--
2.30.0

From: Junxian Huang <huangjunxian6@hisilicon.com>

driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I9GZX2

----------------------------------------------------------------------

In hns_roce_get_bond_netdev(), a bond_mutex is locked. This may lead to
a potential Sleep-in-Atomic-Context along with the iboe.lock in
hns_roce_query_port().

Since hns_roce_get_bond_netdev() doesn't involve iboe, move the call out
of the critical section of iboe.lock.

Fixes: 2004b3f9092a ("RDMA/hns: Support RoCE bonding")
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Signed-off-by: Juan Zhou <zhoujuan51@h-partners.com>
---
 drivers/infiniband/hw/hns/hns_roce_main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index 905fd20a4..eef826f5b 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -316,9 +316,10 @@ static int hns_roce_query_port(struct ib_device *ib_dev, u32 port_num,
 	if (ret)
 		ibdev_warn(ib_dev, "failed to get speed, ret = %d.\n", ret);
 
+	net_dev = hr_dev->hw->get_bond_netdev(hr_dev);
+
 	spin_lock_irqsave(&hr_dev->iboe.lock, flags);
 
-	net_dev = hr_dev->hw->get_bond_netdev(hr_dev);
 	if (!net_dev)
 		net_dev = get_hr_netdev(hr_dev, port);
 	if (!net_dev) {
--
2.30.0
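
Note on the ordering in this patch: a minimal userspace sketch of why the mutex-taking helper has to run before the spinlock is acquired. This is not driver code. Every name here (bond_dev_model, spin_lock_model(), get_bond_netdev_model(), query_port_model()) is a hypothetical stand-in, and the assert() only plays the role that might_sleep() plays in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical stand-ins, not the hns driver's API. */
struct netdev_model { int ifindex; };

struct bond_dev_model {
	int atomic_depth;                 /* > 0 while the modelled spinlock is held */
	struct netdev_model *bond_netdev; /* NULL when no bond is active */
	struct netdev_model *port_netdev; /* fallback per-port device */
};

/* Model spin_lock_irqsave()/spin_unlock_irqrestore(): enter/leave atomic context. */
static void spin_lock_model(struct bond_dev_model *d)   { d->atomic_depth++; }
static void spin_unlock_model(struct bond_dev_model *d) { d->atomic_depth--; }

/* Models a mutex-taking helper: it may sleep, so it must not run atomically. */
static struct netdev_model *get_bond_netdev_model(struct bond_dev_model *d)
{
	assert(d->atomic_depth == 0); /* plays the role of might_sleep() */
	return d->bond_netdev;
}

/* Ordering after the fix: sleeping call first, spinlock second. */
static int query_port_model(struct bond_dev_model *d)
{
	struct netdev_model *nd = get_bond_netdev_model(d);
	int ifindex;

	spin_lock_model(d);
	if (!nd)
		nd = d->port_netdev;          /* fall back to the port's own netdev */
	ifindex = nd ? nd->ifindex : -1;
	spin_unlock_model(d);

	return ifindex;
}
```

Moving the get_bond_netdev_model() call below spin_lock_model() would trip the assert, which is the modelled equivalent of the Sleep-in-Atomic-Context warning the patch avoids.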

From: wenglianfa <wenglianfa@huawei.com>

driver inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I9GZX2

----------------------------------------------------------------------

Add mutex_destroy() to destroy the mutex.

Signed-off-by: wenglianfa <wenglianfa@huawei.com>
Signed-off-by: Juan Zhou <zhoujuan51@h-partners.com>
---
 drivers/infiniband/hw/hns/hns_roce_bond.c | 2 ++
 drivers/infiniband/hw/hns/hns_roce_main.c | 1 +
 2 files changed, 3 insertions(+)

diff --git a/drivers/infiniband/hw/hns/hns_roce_bond.c b/drivers/infiniband/hw/hns/hns_roce_bond.c
index 146eeb7f4..4b2b5538c 100644
--- a/drivers/infiniband/hw/hns/hns_roce_bond.c
+++ b/drivers/infiniband/hw/hns/hns_roce_bond.c
@@ -629,6 +629,7 @@ int hns_roce_cleanup_bond(struct hns_roce_bond_group *bond_grp)
 
 	completion_no_waiter = completion_done(&bond_grp->bond_work_done);
 	complete(&bond_grp->bond_work_done);
+	mutex_destroy(&bond_grp->bond_mutex);
 	if (completion_no_waiter)
 		kfree(bond_grp);
 
@@ -780,6 +781,7 @@ static struct hns_roce_bond_group *hns_roce_alloc_bond_grp(struct hns_roce_dev *
 	if (ret) {
 		ibdev_err(&main_hr_dev->ib_dev,
 			  "failed to alloc bond ID, ret = %d.\n", ret);
+		mutex_destroy(&bond_grp->bond_mutex);
 		kfree(bond_grp);
 		return NULL;
 	}
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index eef826f5b..992e6dfaa 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -1315,6 +1315,7 @@ err_uar_table_free:
 	if (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_CQ_RECORD_DB ||
 	    hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_QP_RECORD_DB)
 		mutex_destroy(&hr_dev->pgdir_mutex);
+	mutex_destroy(&hr_dev->uctx_list_mutex);
 
 	return ret;
 }
--
2.30.0

From: wenglianfa <wenglianfa@huawei.com>

driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I9GZX2

----------------------------------------------------------------------

After sysfs is disabled, the scc delay_work may still be executed,
causing a use-after-free (UAF). To fix it, call
cancel_delayed_work_sync() before freeing scc_param, which ensures that
each scc delay_work has either been canceled or has finished executing.

Fixes: 41da9cd8456d ("RDMA/hns: Support congestion control algorithm parameter configuration")
Signed-off-by: wenglianfa <wenglianfa@huawei.com>
---
 drivers/infiniband/hw/hns/hns_roce_device.h |  4 +--
 drivers/infiniband/hw/hns/hns_roce_main.c   |  9 ++++--
 drivers/infiniband/hw/hns/hns_roce_sysfs.c  | 31 ++++++++++-----------
 3 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 538e3ed03..80870c278 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -1452,8 +1452,6 @@ struct hns_user_mmap_entry *
 hns_roce_user_mmap_entry_insert(struct ib_ucontext *ucontext, u64 address,
 				size_t length,
 				enum hns_roce_mmap_type mmap_type);
-void hns_roce_register_sysfs(struct hns_roce_dev *hr_dev);
-void hns_roce_unregister_sysfs(struct hns_roce_dev *hr_dev);
 void hns_roce_add_unfree_umem(struct hns_roce_user_db_page *user_page,
 			      struct hns_roce_dev *hr_dev);
 void hns_roce_free_unfree_umem(struct hns_roce_dev *hr_dev);
@@ -1461,4 +1459,6 @@ void hns_roce_add_unfree_mtr(struct hns_roce_mtr_node *pos,
 			     struct hns_roce_dev *hr_dev,
 			     struct hns_roce_mtr *mtr);
 void hns_roce_free_unfree_mtr(struct hns_roce_dev *hr_dev);
+int hns_roce_alloc_scc_param(struct hns_roce_dev *hr_dev);
+void hns_roce_dealloc_scc_param(struct hns_roce_dev *hr_dev);
 #endif /* _HNS_ROCE_DEVICE_H */
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index 992e6dfaa..ab43b6688 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -1444,16 +1444,21 @@ int hns_roce_init(struct hns_roce_dev *hr_dev)
 		}
 	}
 
+	ret = hns_roce_alloc_scc_param(hr_dev);
+	if (ret)
+		dev_err(hr_dev->dev, "alloc scc param failed, ret = %d!\n",
+			ret);
+
 	ret = hns_roce_register_device(hr_dev);
 	if (ret)
 		goto error_failed_register_device;
 
-	hns_roce_register_sysfs(hr_dev);
 	hns_roce_register_debugfs(hr_dev);
 
 	return 0;
 
 error_failed_register_device:
+	hns_roce_dealloc_scc_param(hr_dev);
 	if (hr_dev->hw->hw_exit)
 		hr_dev->hw->hw_exit(hr_dev);
 
@@ -1483,8 +1488,8 @@ error_failed_alloc_dfx_cnt:
 
 void hns_roce_exit(struct hns_roce_dev *hr_dev, bool bond_cleanup)
 {
-	hns_roce_unregister_sysfs(hr_dev);
 	hns_roce_unregister_device(hr_dev, bond_cleanup);
+	hns_roce_dealloc_scc_param(hr_dev);
 	hns_roce_unregister_debugfs(hr_dev);
 
 	if (hr_dev->hw->hw_exit)
diff --git a/drivers/infiniband/hw/hns/hns_roce_sysfs.c b/drivers/infiniband/hw/hns/hns_roce_sysfs.c
index a9708d28a..110d558a5 100644
--- a/drivers/infiniband/hw/hns/hns_roce_sysfs.c
+++ b/drivers/infiniband/hw/hns/hns_roce_sysfs.c
@@ -33,7 +33,7 @@ static void get_default_scc_param(struct hns_roce_dev *hr_dev)
 	}
 }
 
-static int alloc_scc_param(struct hns_roce_dev *hr_dev)
+int hns_roce_alloc_scc_param(struct hns_roce_dev *hr_dev)
 {
 	struct hns_roce_scc_param *scc_param;
 	int i;
@@ -56,6 +56,19 @@ static int alloc_scc_param(struct hns_roce_dev *hr_dev)
 
 	return 0;
 }
+void hns_roce_dealloc_scc_param(struct hns_roce_dev *hr_dev)
+{
+	int i;
+
+	if (!hr_dev->scc_param)
+		return;
+
+	for (i = 0; i < HNS_ROCE_SCC_ALGO_TOTAL; i++)
+		cancel_delayed_work_sync(&hr_dev->scc_param[i].scc_cfg_dwork);
+
+	kvfree(hr_dev->scc_param);
+	hr_dev->scc_param = NULL;
+}
 
 struct hns_port_cc_attr {
 	struct ib_port_attribute port_attr;
@@ -328,19 +341,3 @@ const struct attribute_group *hns_attr_port_groups[] = {
 	&dip_cc_param_group,
 	NULL,
 };
-
-void hns_roce_register_sysfs(struct hns_roce_dev *hr_dev)
-{
-	int ret;
-
-	ret = alloc_scc_param(hr_dev);
-	if (ret)
-		dev_err(hr_dev->dev, "alloc scc param failed, ret = %d!\n",
-			ret);
-}
-
-void hns_roce_unregister_sysfs(struct hns_roce_dev *hr_dev)
-{
-	if (hr_dev->scc_param)
-		kvfree(hr_dev->scc_param);
-}
--
2.30.0

From: wenglianfa <wenglianfa@huawei.com>

driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I9GZX2

----------------------------------------------------------------------

The failure of alloc_scc_param() does not cause sysfs to be unavailable.
In this case hr_dev->scc_param is NULL, so if the user calls
scc_attr_show()/scc_attr_store(), hr_dev->scc_param is dereferenced and
a null pointer error is reported.

To fix it, make scc_param invisible when alloc_scc_param() fails.

Fixes: 41da9cd8456d ("RDMA/hns: Support congestion control algorithm parameter configuration")
Signed-off-by: wenglianfa <wenglianfa@huawei.com>
Signed-off-by: Juan Zhou <zhoujuan51@h-partners.com>
---
 drivers/infiniband/hw/hns/hns_roce_sysfs.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/infiniband/hw/hns/hns_roce_sysfs.c b/drivers/infiniband/hw/hns/hns_roce_sysfs.c
index 110d558a5..8429076d8 100644
--- a/drivers/infiniband/hw/hns/hns_roce_sysfs.c
+++ b/drivers/infiniband/hw/hns/hns_roce_sysfs.c
@@ -171,6 +171,9 @@ static umode_t scc_attr_is_visible(struct kobject *kobj,
 	struct ib_device *ibdev = ib_port_sysfs_get_ibdev_kobj(kobj, &port_num);
 	struct hns_roce_dev *hr_dev = to_hr_dev(ibdev);
 
+	if (!hr_dev->scc_param)
+		return 0;
+
 	if (hr_dev->is_vf ||
 	    !(hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_QP_FLOW_CTRL))
 		return 0;
--
2.30.0
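
Note on the visibility guard in this patch: a small userspace sketch of the pattern, assuming the attribute's read/write handlers are reachable only while the visibility callback returns a non-zero mode. The types and functions (scc_dev_model, scc_attr_visible_model(), scc_attr_show_model()) are hypothetical stand-ins, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures. */
struct scc_param_model { unsigned int value; };

struct scc_dev_model {
	struct scc_param_model *scc_param; /* NULL if allocation failed */
	int is_vf;
};

/* Mirrors the shape of an is_visible() callback: 0 hides the attribute. */
static unsigned int scc_attr_visible_model(const struct scc_dev_model *d)
{
	if (!d->scc_param)
		return 0;                 /* nothing to show or store into */
	if (d->is_vf)
		return 0;
	return 0644;
}

/* A show()-style reader; only reachable when the attribute is visible. */
static unsigned int scc_attr_show_model(const struct scc_dev_model *d)
{
	assert(d->scc_param != NULL); /* guaranteed by the visibility check */
	return d->scc_param->value;
}
```

Hiding the attribute keeps the NULL check in one place instead of repeating it in every show/store handler.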

From: Juan Zhou <zhoujuan51@h-partners.com>

driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I9GZX2

----------------------------------------------------------------------

The supported algorithm capabilities should be checked, instead of the
default algorithm type.

Fixes: 41da9cd8456d ("RDMA/hns: Support congestion control algorithm parameter configuration")
Signed-off-by: Juan Zhou <zhoujuan51@h-partners.com>
---
 drivers/infiniband/hw/hns/hns_roce_sysfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_sysfs.c b/drivers/infiniband/hw/hns/hns_roce_sysfs.c
index 8429076d8..d36f05ac5 100644
--- a/drivers/infiniband/hw/hns/hns_roce_sysfs.c
+++ b/drivers/infiniband/hw/hns/hns_roce_sysfs.c
@@ -178,7 +178,7 @@ static umode_t scc_attr_is_visible(struct kobject *kobj,
 	    !(hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_QP_FLOW_CTRL))
 		return 0;
 
-	if (!(hr_dev->caps.default_cong_type & (1 << scc_attr->algo_type)))
+	if (!(hr_dev->caps.cong_cap & (1 << scc_attr->algo_type)))
 		return 0;
 
 	return 0644;
--
2.30.0
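
Note on the check changed in this patch: a short sketch of how testing the default type differs from testing the supported set. It assumes the default is encoded as a single bit in the same bit space as the capability mask, which the original `&` test suggests but this series does not confirm. The enum values and the names cong_caps_model, visible_by_cap() and visible_by_default() are hypothetical.

```c
#include <assert.h>

/* Hypothetical algorithm indices; the driver's real enum may differ. */
enum algo_model { ALGO_A, ALGO_B, ALGO_C, ALGO_TOTAL };

struct cong_caps_model {
	unsigned int cong_cap;          /* bitmask: every supported algorithm */
	unsigned int default_cong_type; /* assumed: a single bit, the default */
};

/* Visibility keyed on the supported set, as in the patched check. */
static unsigned int visible_by_cap(const struct cong_caps_model *c,
				   unsigned int algo)
{
	assert(algo < ALGO_TOTAL);
	return (c->cong_cap & (1U << algo)) ? 0644 : 0;
}

/* Visibility keyed on the default only, as in the pre-patch check. */
static unsigned int visible_by_default(const struct cong_caps_model *c,
				       unsigned int algo)
{
	assert(algo < ALGO_TOTAL);
	return (c->default_cong_type & (1U << algo)) ? 0644 : 0;
}
```

With ALGO_A and ALGO_B supported and ALGO_A as the default, visible_by_default() hides the parameters of ALGO_B even though the device supports it, while visible_by_cap() exposes both and still hides the unsupported ALGO_C.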