[PATCH OLK-5.10] RDMA/hns: Optimize HW performance by limiting ACK request frequency

driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/ICFUBY

----------------------------------------------------------------------

ACK_REQ_FREQ indicates the number of packets (after MTU fragmentation)
HW sends before setting an ACK request. When MTU is greater than or
equal to 1024, the current ACK_REQ_FREQ value causes HW to request an
ACK for every MTU fragment. The processing of a large number of ACKs
severely impacts HW performance when sending large size payloads.

Limit the ACK request frequency by increasing ACK_REQ_FREQ to optimize
HW performance.

There are several constraints for ACK_REQ_FREQ:
1. mtu * (2 ^ ACK_REQ_FREQ) should not be too large, otherwise it may
   cause some unexpected retries when sending large payload. 4K is a
   recommended value.
2. ACK_REQ_FREQ should be larger than or equal to LP_PKTN_INI. But we
   don't need to add a check since the calculation here already
   guarantees this.
3. ACK_REQ_FREQ must be equal to LP_PKTN_INI when using LDCP or HC3
   congestion control algorithm.
Fixes: 64307761e707 ("RDMA/hns: Modify the value of long message loopback slice")
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
---
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 3813b318f1c3..4d284e020178 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -5136,6 +5136,7 @@ static int modify_qp_init_to_rtr(struct ib_qp *ibqp,
 	dma_addr_t trrl_ba;
 	dma_addr_t irrl_ba;
 	enum ib_mtu ib_mtu;
+	u8 ack_req_freq;
 	const u8 *smac;
 	u8 lp_pktn_ini;
 	u64 *mtts;
@@ -5240,8 +5241,24 @@ static int modify_qp_init_to_rtr(struct ib_qp *ibqp,
 	hr_reg_write(context, QPC_LP_PKTN_INI, lp_pktn_ini);
 	hr_reg_clear(qpc_mask, QPC_LP_PKTN_INI);
 
-	/* ACK_REQ_FREQ should be larger than or equal to LP_PKTN_INI */
-	hr_reg_write(context, QPC_ACK_REQ_FREQ, lp_pktn_ini);
+#define MAX_ACK_REQ_MSG_LEN	4096
+	/*
+	 * There are several constraints for ACK_REQ_FREQ:
+	 * 1. mtu * (2 ^ ACK_REQ_FREQ) should not be too large, otherwise
+	 *    it may cause some unexpected retries when sending large
+	 *    payload. 4K is a recommended value.
+	 * 2. ACK_REQ_FREQ should be larger than or equal to LP_PKTN_INI.
+	 *    But we don't need to add a check since the calculation here
+	 *    already guarantees this.
+	 * 3. ACK_REQ_FREQ must be equal to LP_PKTN_INI when using LDCP
+	 *    or HC3 congestion control algorithm.
+	 */
+	if (hr_qp->congest_type == HNS_ROCE_CREATE_QP_FLAGS_LDCP ||
+	    hr_qp->congest_type == HNS_ROCE_CREATE_QP_FLAGS_HC3)
+		ack_req_freq = lp_pktn_ini;
+	else
+		ack_req_freq = ilog2(MAX_ACK_REQ_MSG_LEN / mtu);
+	hr_reg_write(context, QPC_ACK_REQ_FREQ, ack_req_freq);
 	hr_reg_clear(qpc_mask, QPC_ACK_REQ_FREQ);
 
 	hr_reg_clear(qpc_mask, QPC_RX_REQ_PSN_ERR);
-- 
2.33.0
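
For reference, the arithmetic in the non-LDCP/HC3 branch can be checked outside the kernel. The following is only a userspace sketch, not driver code: ilog2_u32(), calc_ack_req_freq() and bytes_per_ack_req() are hypothetical helpers introduced here for illustration, ilog2_u32() is a stand-in for the kernel's ilog2(), and the MTU is assumed to be given in bytes and to be a power of two no larger than 4096.

```c
#include <assert.h>

#define MAX_ACK_REQ_MSG_LEN 4096

/* Userspace stand-in for the kernel's ilog2(): floor(log2(v)) for v > 0. */
static unsigned int ilog2_u32(unsigned int v)
{
	unsigned int r = 0;

	assert(v > 0);
	while (v >>= 1)
		r++;
	return r;
}

/*
 * Hypothetical helper mirroring the "else" branch of the patch.
 * Assumption: mtu is in bytes and 0 < mtu <= MAX_ACK_REQ_MSG_LEN.
 */
static unsigned int calc_ack_req_freq(unsigned int mtu)
{
	assert(mtu > 0 && mtu <= MAX_ACK_REQ_MSG_LEN);
	return ilog2_u32(MAX_ACK_REQ_MSG_LEN / mtu);
}

/* Constraint 1: number of bytes sent per ACK request, mtu * 2^ACK_REQ_FREQ. */
static unsigned int bytes_per_ack_req(unsigned int mtu)
{
	return mtu << calc_ack_req_freq(mtu);
}
```

Under these assumptions, MTUs of 256, 512, 1024, 2048 and 4096 bytes give ACK_REQ_FREQ values of 4, 3, 2, 1 and 0 respectively, and mtu * (2 ^ ACK_REQ_FREQ) stays at the recommended 4K in every case. QPs using LDCP or HC3 are not covered by this sketch, since the patch keeps ACK_REQ_FREQ equal to LP_PKTN_INI for them.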

FeedBack: The patch(es) you sent to the kernel@openeuler.org mailing list have been successfully converted to a pull request.
Pull request link: https://gitee.com/openeuler/kernel/pulls/16739
Mailing list address: https://mailweb.openeuler.org/archives/list/kernel@openeuler.org/message/DPI...