High-performance-network
[PATCH rdma-core 1/2] libhns: Fix wrong WQE data in new post send API when QP wraps around
by Junxian Huang 30 Dec '25
The modification in the fixes commit should also be applied to the
new post send API.
Fixes: 15adbcf23df2 ("libhns: Fix wrong WQE data when QP wraps around")
Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
---
providers/hns/hns_roce_u_hw_v2.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
index fd266b10c..932d778e3 100644
--- a/providers/hns/hns_roce_u_hw_v2.c
+++ b/providers/hns/hns_roce_u_hw_v2.c
@@ -2111,6 +2111,7 @@ init_rc_wqe(struct hns_roce_qp *qp, uint64_t wr_id, unsigned int opcode)
wqe_idx = qp->sq.head & (qp->sq.wqe_cnt - 1);
wqe = get_send_wqe(qp, wqe_idx);
+ wqe->byte_4 = 0;
hr_reg_write(wqe, RCWQE_OPCODE, opcode);
hr_reg_write_bool(wqe, RCWQE_CQE, send_flags & IBV_SEND_SIGNALED);
hr_reg_write_bool(wqe, RCWQE_FENCE, send_flags & IBV_SEND_FENCE);
@@ -2453,6 +2454,7 @@ init_ud_wqe(struct hns_roce_qp *qp, uint64_t wr_id, unsigned int opcode)
wqe_idx = qp->sq.head & (qp->sq.wqe_cnt - 1);
wqe = get_send_wqe(qp, wqe_idx);
+ wqe->rsv_opcode = 0;
hr_reg_write(wqe, UDWQE_OPCODE, opcode);
hr_reg_write_bool(wqe, UDWQE_CQE, send_flags & IBV_SEND_SIGNALED);
hr_reg_write_bool(wqe, UDWQE_SE, send_flags & IBV_SEND_SOLICITED);
--
2.33.0
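The bug class fixed above is worth spelling out: a send queue is a power-of-two ring, so `head & (wqe_cnt - 1)` maps a monotonically increasing head index back onto the same slots once it wraps, and any field that is only OR-ed into (rather than cleared first) inherits stale bits from the WQE that previously occupied the slot. The following is a minimal standalone sketch of that hazard; the struct and field names are illustrative, not the real hns_roce WQE layout.

```c
#include <assert.h>
#include <stdint.h>

/* A send queue modeled as a power-of-two ring; the same slot is reused
 * once the head index wraps. Field names here are hypothetical, not the
 * real hns WQE layout. */
struct fake_wqe {
    uint32_t byte_4;   /* opcode and flag bits packed into one word */
};

#define WQE_CNT 4              /* must be a power of two */
static struct fake_wqe sq[WQE_CNT];

static struct fake_wqe *get_wqe(uint32_t head)
{
    return &sq[head & (WQE_CNT - 1)];  /* cheap modulo for a power-of-two ring */
}

/* Buggy init: only ORs in the opcode, so stale flag bits left by the
 * slot's previous occupant survive a wrap-around. */
static void init_wqe_buggy(struct fake_wqe *w, uint32_t opcode)
{
    w->byte_4 |= opcode;
}

/* Fixed init, mirroring the patch: clear the word before writing it. */
static void init_wqe_fixed(struct fake_wqe *w, uint32_t opcode)
{
    w->byte_4 = 0;
    w->byte_4 |= opcode;
}
```

With `head == WQE_CNT` the index wraps back to slot 0, and only the fixed variant produces a clean opcode word.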
27 Nov '25
Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
---
...testpmd-handle-IEEE1588-init-failure.patch | 59 ++
...3fwd-add-option-to-set-Rx-burst-size.patch | 248 +++++++
...v-fix-queue-crash-with-generic-pipel.patch | 108 +++
...dd-Tx-burst-size-configuration-optio.patch | 338 +++++++++
...t-hns3-remove-duplicate-struct-field.patch | 260 +++++++
0119-net-hns3-refactor-DCB-module.patch | 296 ++++++++
...-net-hns3-parse-max-TC-number-for-VF.patch | 73 ++
...-support-multi-TCs-capability-for-VF.patch | 172 +++++
...ns3-fix-queue-TC-configuration-on-VF.patch | 109 +++
...pport-multi-TCs-configuration-for-VF.patch | 681 ++++++++++++++++++
...pp-testpmd-avoid-crash-in-DCB-config.patch | 46 ++
...testpmd-show-all-DCB-priority-TC-map.patch | 38 +
...d-relax-number-of-TCs-in-DCB-command.patch | 54 ++
...euse-RSS-config-when-configuring-DCB.patch | 93 +++
...stpmd-add-prio-tc-map-in-DCB-command.patch | 296 ++++++++
...add-queue-restriction-in-DCB-command.patch | 264 +++++++
...p-testpmd-add-command-to-disable-DCB.patch | 158 ++++
0131-examples-l3fwd-force-link-speed.patch | 87 +++
...xamples-l3fwd-power-force-link-speed.patch | 80 ++
0133-config-arm-add-HiSilicon-HIP12.patch | 94 +++
0134-app-testpmd-fix-DCB-Tx-port.patch | 51 ++
0135-app-testpmd-fix-DCB-Rx-queues.patch | 35 +
...support-specify-TCs-when-DCB-forward.patch | 254 +++++++
...d-support-multi-cores-process-one-TC.patch | 292 ++++++++
dpdk.spec | 54 +-
25 files changed, 4239 insertions(+), 1 deletion(-)
create mode 100644 0114-app-testpmd-handle-IEEE1588-init-failure.patch
create mode 100644 0115-examples-l3fwd-add-option-to-set-Rx-burst-size.patch
create mode 100644 0116-examples-eventdev-fix-queue-crash-with-generic-pipel.patch
create mode 100644 0117-examples-l3fwd-add-Tx-burst-size-configuration-optio.patch
create mode 100644 0118-net-hns3-remove-duplicate-struct-field.patch
create mode 100644 0119-net-hns3-refactor-DCB-module.patch
create mode 100644 0120-net-hns3-parse-max-TC-number-for-VF.patch
create mode 100644 0121-net-hns3-support-multi-TCs-capability-for-VF.patch
create mode 100644 0122-net-hns3-fix-queue-TC-configuration-on-VF.patch
create mode 100644 0123-net-hns3-support-multi-TCs-configuration-for-VF.patch
create mode 100644 0124-app-testpmd-avoid-crash-in-DCB-config.patch
create mode 100644 0125-app-testpmd-show-all-DCB-priority-TC-map.patch
create mode 100644 0126-app-testpmd-relax-number-of-TCs-in-DCB-command.patch
create mode 100644 0127-app-testpmd-reuse-RSS-config-when-configuring-DCB.patch
create mode 100644 0128-app-testpmd-add-prio-tc-map-in-DCB-command.patch
create mode 100644 0129-app-testpmd-add-queue-restriction-in-DCB-command.patch
create mode 100644 0130-app-testpmd-add-command-to-disable-DCB.patch
create mode 100644 0131-examples-l3fwd-force-link-speed.patch
create mode 100644 0132-examples-l3fwd-power-force-link-speed.patch
create mode 100644 0133-config-arm-add-HiSilicon-HIP12.patch
create mode 100644 0134-app-testpmd-fix-DCB-Tx-port.patch
create mode 100644 0135-app-testpmd-fix-DCB-Rx-queues.patch
create mode 100644 0136-app-testpmd-support-specify-TCs-when-DCB-forward.patch
create mode 100644 0137-app-testpmd-support-multi-cores-process-one-TC.patch
diff --git a/0114-app-testpmd-handle-IEEE1588-init-failure.patch b/0114-app-testpmd-handle-IEEE1588-init-failure.patch
new file mode 100644
index 0000000..479ae2a
--- /dev/null
+++ b/0114-app-testpmd-handle-IEEE1588-init-failure.patch
@@ -0,0 +1,59 @@
+From 79d85f1f563526c39150082f6eb6d3515e0fcc93 Mon Sep 17 00:00:00 2001
+From: Dengdui Huang <huangdengdui(a)huawei.com>
+Date: Sat, 30 Mar 2024 15:44:09 +0800
+Subject: [PATCH 01/24] app/testpmd: handle IEEE1588 init failure
+
+[ upstream commit 80071a1c8ed669298434c56efe4ca0839f2a970e ]
+
+When the port's timestamping function fails to initialize
+(for example, when the device does not support PTP), the packets
+received by the hardware do not contain the timestamp.
+In this case, IEEE1588 packet forwarding should not start.
+This patch fixes it.
+
+Additionally, add a failure message when disabling PTP fails.
+
+Fixes: a78040c990cb ("app/testpmd: update forward engine beginning")
+Cc: stable(a)dpdk.org
+
+Signed-off-by: Dengdui Huang <huangdengdui(a)huawei.com>
+Acked-by: Aman Singh <aman.deep.singh(a)intel.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ app/test-pmd/ieee1588fwd.c | 15 ++++++++++++---
+ 1 file changed, 12 insertions(+), 3 deletions(-)
+
+diff --git a/app/test-pmd/ieee1588fwd.c b/app/test-pmd/ieee1588fwd.c
+index 386d9f1..52ae551 100644
+--- a/app/test-pmd/ieee1588fwd.c
++++ b/app/test-pmd/ieee1588fwd.c
+@@ -197,14 +197,23 @@ ieee1588_packet_fwd(struct fwd_stream *fs)
+ static int
+ port_ieee1588_fwd_begin(portid_t pi)
+ {
+- rte_eth_timesync_enable(pi);
+- return 0;
++ int ret;
++
++ ret = rte_eth_timesync_enable(pi);
++ if (ret)
++ printf("Port %u enable PTP failed, ret = %d\n", pi, ret);
++
++ return ret;
+ }
+
+ static void
+ port_ieee1588_fwd_end(portid_t pi)
+ {
+- rte_eth_timesync_disable(pi);
++ int ret;
++
++ ret = rte_eth_timesync_disable(pi);
++ if (ret)
++ printf("Port %u disable PTP failed, ret = %d\n", pi, ret);
+ }
+
+ static void
+--
+2.33.0
+
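The core of the 0114 patch is an error-propagation pattern: the fwd-begin hook previously discarded the enable call's status, so forwarding started even when PTP init failed. A standalone sketch of that pattern follows; `fake_timesync_enable()` is a stub standing in for `rte_eth_timesync_enable()`, and the port behavior is an assumption for illustration.

```c
#include <stdio.h>

/* Stub for rte_eth_timesync_enable(): pretend only port 0 supports PTP. */
static int fake_timesync_enable(unsigned int port)
{
    return (port == 0) ? 0 : -95;
}

/* Begin hook in the style of the patched port_ieee1588_fwd_begin():
 * report the failure and propagate it instead of returning 0
 * unconditionally, so forwarding does not start without timestamps. */
static int fwd_begin(unsigned int port)
{
    int ret = fake_timesync_enable(port);
    if (ret)
        printf("Port %u enable PTP failed, ret = %d\n", port, ret);
    return ret;
}
```

The caller can now refuse to enter the forwarding loop whenever `fwd_begin()` returns non-zero.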
diff --git a/0115-examples-l3fwd-add-option-to-set-Rx-burst-size.patch b/0115-examples-l3fwd-add-option-to-set-Rx-burst-size.patch
new file mode 100644
index 0000000..dfbca3b
--- /dev/null
+++ b/0115-examples-l3fwd-add-option-to-set-Rx-burst-size.patch
@@ -0,0 +1,248 @@
+From e686a6cad028ebfaadefbdf942cf27f7612fbef5 Mon Sep 17 00:00:00 2001
+From: Jie Hai <haijie1(a)huawei.com>
+Date: Fri, 18 Oct 2024 09:08:51 +0800
+Subject: [PATCH 02/24] examples/l3fwd: add option to set Rx burst size
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+[ upstream commit d5c4897ecfb2540dc4990d9b367ddbe5013d0e66 ]
+
+Currently, the Rx burst size is fixed at MAX_PKT_BURST (32). This
+parameter needs to be tunable in some performance optimization
+scenarios, so an option '--burst' is added to set the burst size
+explicitly. The default value is DEFAULT_PKT_BURST (32) and the
+maximum value is MAX_PKT_BURST (512).
+
+Signed-off-by: Jie Hai <haijie1(a)huawei.com>
+Acked-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Acked-by: Huisong Li <lihuisong(a)huawei.com>
+Acked-by: Morten Brørup <mb(a)smartsharesystems.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ examples/l3fwd/l3fwd.h | 7 +++--
+ examples/l3fwd/l3fwd_acl.c | 2 +-
+ examples/l3fwd/l3fwd_em.c | 2 +-
+ examples/l3fwd/l3fwd_fib.c | 2 +-
+ examples/l3fwd/l3fwd_lpm.c | 2 +-
+ examples/l3fwd/main.c | 60 ++++++++++++++++++++++++++++++++++++--
+ 6 files changed, 67 insertions(+), 8 deletions(-)
+
+diff --git a/examples/l3fwd/l3fwd.h b/examples/l3fwd/l3fwd.h
+index e7ae0e5..bb73edd 100644
+--- a/examples/l3fwd/l3fwd.h
++++ b/examples/l3fwd/l3fwd.h
+@@ -23,10 +23,11 @@
+ #define RX_DESC_DEFAULT 1024
+ #define TX_DESC_DEFAULT 1024
+
+-#define MAX_PKT_BURST 32
++#define DEFAULT_PKT_BURST 32
++#define MAX_PKT_BURST 512
+ #define BURST_TX_DRAIN_US 100 /* TX drain every ~100us */
+
+-#define MEMPOOL_CACHE_SIZE 256
++#define MEMPOOL_CACHE_SIZE RTE_MEMPOOL_CACHE_MAX_SIZE
+ #define MAX_RX_QUEUE_PER_LCORE 16
+
+ #define VECTOR_SIZE_DEFAULT MAX_PKT_BURST
+@@ -115,6 +116,8 @@ extern struct acl_algorithms acl_alg[];
+
+ extern uint32_t max_pkt_len;
+
++extern uint32_t nb_pkt_per_burst;
++
+ /* Send burst of packets on an output interface */
+ static inline int
+ send_burst(struct lcore_conf *qconf, uint16_t n, uint16_t port)
+diff --git a/examples/l3fwd/l3fwd_acl.c b/examples/l3fwd/l3fwd_acl.c
+index 401692b..89be104 100644
+--- a/examples/l3fwd/l3fwd_acl.c
++++ b/examples/l3fwd/l3fwd_acl.c
+@@ -1055,7 +1055,7 @@ acl_main_loop(__rte_unused void *dummy)
+ portid = qconf->rx_queue_list[i].port_id;
+ queueid = qconf->rx_queue_list[i].queue_id;
+ nb_rx = rte_eth_rx_burst(portid, queueid,
+- pkts_burst, MAX_PKT_BURST);
++ pkts_burst, nb_pkt_per_burst);
+
+ if (nb_rx > 0) {
+ struct acl_search_t acl_search;
+diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c
+index 00796f1..9a37019 100644
+--- a/examples/l3fwd/l3fwd_em.c
++++ b/examples/l3fwd/l3fwd_em.c
+@@ -652,7 +652,7 @@ em_main_loop(__rte_unused void *dummy)
+ portid = qconf->rx_queue_list[i].port_id;
+ queueid = qconf->rx_queue_list[i].queue_id;
+ nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst,
+- MAX_PKT_BURST);
++ nb_pkt_per_burst);
+ if (nb_rx == 0)
+ continue;
+
+diff --git a/examples/l3fwd/l3fwd_fib.c b/examples/l3fwd/l3fwd_fib.c
+index 6a21984..5a55b35 100644
+--- a/examples/l3fwd/l3fwd_fib.c
++++ b/examples/l3fwd/l3fwd_fib.c
+@@ -239,7 +239,7 @@ fib_main_loop(__rte_unused void *dummy)
+ portid = qconf->rx_queue_list[i].port_id;
+ queueid = qconf->rx_queue_list[i].queue_id;
+ nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst,
+- MAX_PKT_BURST);
++ nb_pkt_per_burst);
+ if (nb_rx == 0)
+ continue;
+
+diff --git a/examples/l3fwd/l3fwd_lpm.c b/examples/l3fwd/l3fwd_lpm.c
+index a484a33..c3df879 100644
+--- a/examples/l3fwd/l3fwd_lpm.c
++++ b/examples/l3fwd/l3fwd_lpm.c
+@@ -206,7 +206,7 @@ lpm_main_loop(__rte_unused void *dummy)
+ portid = qconf->rx_queue_list[i].port_id;
+ queueid = qconf->rx_queue_list[i].queue_id;
+ nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst,
+- MAX_PKT_BURST);
++ nb_pkt_per_burst);
+ if (nb_rx == 0)
+ continue;
+
+diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
+index 3bf28ae..258a235 100644
+--- a/examples/l3fwd/main.c
++++ b/examples/l3fwd/main.c
+@@ -14,6 +14,7 @@
+ #include <getopt.h>
+ #include <signal.h>
+ #include <stdbool.h>
++#include <assert.h>
+
+ #include <rte_common.h>
+ #include <rte_vect.h>
+@@ -53,8 +54,10 @@
+
+ #define MAX_LCORE_PARAMS 1024
+
++static_assert(MEMPOOL_CACHE_SIZE >= MAX_PKT_BURST, "MAX_PKT_BURST should be at most MEMPOOL_CACHE_SIZE");
+ uint16_t nb_rxd = RX_DESC_DEFAULT;
+ uint16_t nb_txd = TX_DESC_DEFAULT;
++uint32_t nb_pkt_per_burst = DEFAULT_PKT_BURST;
+
+ /**< Ports set in promiscuous mode off by default. */
+ static int promiscuous_on;
+@@ -395,6 +398,7 @@ print_usage(const char *prgname)
+ " --config (port,queue,lcore)[,(port,queue,lcore)]"
+ " [--rx-queue-size NPKTS]"
+ " [--tx-queue-size NPKTS]"
++ " [--burst NPKTS]"
+ " [--eth-dest=X,MM:MM:MM:MM:MM:MM]"
+ " [--max-pkt-len PKTLEN]"
+ " [--no-numa]"
+@@ -420,6 +424,8 @@ print_usage(const char *prgname)
+ " Default: %d\n"
+ " --tx-queue-size NPKTS: Tx queue size in decimal\n"
+ " Default: %d\n"
++ " --burst NPKTS: Burst size in decimal\n"
++ " Default: %d\n"
+ " --eth-dest=X,MM:MM:MM:MM:MM:MM: Ethernet destination for port X\n"
+ " --max-pkt-len PKTLEN: maximum packet length in decimal (64-9600)\n"
+ " --no-numa: Disable numa awareness\n"
+@@ -449,7 +455,7 @@ print_usage(const char *prgname)
+ " another is route entry at while line leads with character '%c'.\n"
+ " --rule_ipv6=FILE: Specify the ipv6 rules entries file.\n"
+ " --alg: ACL classify method to use, one of: %s.\n\n",
+- prgname, RX_DESC_DEFAULT, TX_DESC_DEFAULT,
++ prgname, RX_DESC_DEFAULT, TX_DESC_DEFAULT, DEFAULT_PKT_BURST,
+ ACL_LEAD_CHAR, ROUTE_LEAD_CHAR, alg);
+ }
+
+@@ -662,6 +668,49 @@ parse_lookup(const char *optarg)
+ return 0;
+ }
+
++static void
++parse_pkt_burst(const char *optarg)
++{
++ struct rte_eth_dev_info dev_info;
++ unsigned long pkt_burst;
++ uint16_t burst_size;
++ char *end = NULL;
++ int ret;
++
++ /* parse decimal string */
++ pkt_burst = strtoul(optarg, &end, 10);
++ if ((optarg[0] == '\0') || (end == NULL) || (*end != '\0'))
++ return;
++
++ if (pkt_burst > MAX_PKT_BURST) {
++ RTE_LOG(INFO, L3FWD, "User provided burst must be <= %d. Using default value %d\n",
++ MAX_PKT_BURST, nb_pkt_per_burst);
++ return;
++ } else if (pkt_burst > 0) {
++ nb_pkt_per_burst = (uint32_t)pkt_burst;
++ return;
++ }
++
++ /* If user gives a value of zero, query the PMD for its recommended Rx burst size. */
++ ret = rte_eth_dev_info_get(0, &dev_info);
++ if (ret != 0)
++ return;
++ burst_size = dev_info.default_rxportconf.burst_size;
++ if (burst_size == 0) {
++ RTE_LOG(INFO, L3FWD, "PMD does not recommend a burst size. Using default value %d. "
++ "User provided value must be in [1, %d]\n",
++ nb_pkt_per_burst, MAX_PKT_BURST);
++ return;
++ } else if (burst_size > MAX_PKT_BURST) {
++ RTE_LOG(INFO, L3FWD, "PMD recommended burst size %d exceeds maximum value %d. "
++ "Using default value %d\n",
++ burst_size, MAX_PKT_BURST, nb_pkt_per_burst);
++ return;
++ }
++ nb_pkt_per_burst = burst_size;
++ RTE_LOG(INFO, L3FWD, "Using PMD-provided burst value %d\n", burst_size);
++}
++
+ #define MAX_JUMBO_PKT_LEN 9600
+
+ static const char short_options[] =
+@@ -693,6 +742,7 @@ static const char short_options[] =
+ #define CMD_LINE_OPT_RULE_IPV4 "rule_ipv4"
+ #define CMD_LINE_OPT_RULE_IPV6 "rule_ipv6"
+ #define CMD_LINE_OPT_ALG "alg"
++#define CMD_LINE_OPT_PKT_BURST "burst"
+
+ enum {
+ /* long options mapped to a short option */
+@@ -721,7 +771,8 @@ enum {
+ CMD_LINE_OPT_LOOKUP_NUM,
+ CMD_LINE_OPT_ENABLE_VECTOR_NUM,
+ CMD_LINE_OPT_VECTOR_SIZE_NUM,
+- CMD_LINE_OPT_VECTOR_TMO_NS_NUM
++ CMD_LINE_OPT_VECTOR_TMO_NS_NUM,
++ CMD_LINE_OPT_PKT_BURST_NUM,
+ };
+
+ static const struct option lgopts[] = {
+@@ -748,6 +799,7 @@ static const struct option lgopts[] = {
+ {CMD_LINE_OPT_RULE_IPV4, 1, 0, CMD_LINE_OPT_RULE_IPV4_NUM},
+ {CMD_LINE_OPT_RULE_IPV6, 1, 0, CMD_LINE_OPT_RULE_IPV6_NUM},
+ {CMD_LINE_OPT_ALG, 1, 0, CMD_LINE_OPT_ALG_NUM},
++ {CMD_LINE_OPT_PKT_BURST, 1, 0, CMD_LINE_OPT_PKT_BURST_NUM},
+ {NULL, 0, 0, 0}
+ };
+
+@@ -836,6 +888,10 @@ parse_args(int argc, char **argv)
+ parse_queue_size(optarg, &nb_txd, 0);
+ break;
+
++ case CMD_LINE_OPT_PKT_BURST_NUM:
++ parse_pkt_burst(optarg);
++ break;
++
+ case CMD_LINE_OPT_ETH_DEST_NUM:
+ parse_eth_dest(optarg);
+ break;
+--
+2.33.0
+
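The `parse_pkt_burst()` helper in the 0115 patch follows a common option-parsing shape: accept a plain decimal value within bounds, and silently keep the current default on malformed or out-of-range input. A self-contained sketch of that shape is below; the PMD-query fallback for a value of zero is omitted, and the function name and limits are illustrative.

```c
#include <stdint.h>
#include <stdlib.h>

/* Parse a decimal burst-size option in [1, maxval]; keep the caller's
 * default on malformed or out-of-range input, mirroring the strtoul
 * validation used in parse_pkt_burst(). */
static uint32_t parse_burst(const char *arg, uint32_t defval, uint32_t maxval)
{
    char *end = NULL;
    unsigned long v = strtoul(arg, &end, 10);

    if (arg[0] == '\0' || end == NULL || *end != '\0')
        return defval;              /* not a plain decimal number */
    if (v == 0 || v > maxval)
        return defval;              /* outside the accepted range */
    return (uint32_t)v;
}
```

Checking `*end != '\0'` rejects trailing junk such as `"32x"`, which a bare `atoi()` would silently accept.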
diff --git a/0116-examples-eventdev-fix-queue-crash-with-generic-pipel.patch b/0116-examples-eventdev-fix-queue-crash-with-generic-pipel.patch
new file mode 100644
index 0000000..f530b79
--- /dev/null
+++ b/0116-examples-eventdev-fix-queue-crash-with-generic-pipel.patch
@@ -0,0 +1,108 @@
+From 0d4fffdc3eae64a9d3a59bdcb6e327e7c85ef637 Mon Sep 17 00:00:00 2001
+From: Chengwen Feng <fengchengwen(a)huawei.com>
+Date: Wed, 18 Sep 2024 06:41:42 +0000
+Subject: [PATCH 03/24] examples/eventdev: fix queue crash with generic
+ pipeline
+
+[ upstream commit f6f2307931c90d924405ea44b0b4be9d3d01bd17 ]
+
+There was a segmentation fault when executing eventdev_pipeline with
+command [1] with ConnectX-5 NIC card:
+
+0x000000000079208c in rte_eth_tx_buffer (tx_pkt=0x16f8ed300, buffer=0x100,
+ queue_id=11, port_id=0) at
+ ../lib/ethdev/rte_ethdev.h:6636
+txa_service_tx (txa=0x17b19d080, ev=0xffffffffe500, n=4) at
+ ../lib/eventdev/rte_event_eth_tx_adapter.c:631
+0x0000000000792234 in txa_service_func (args=0x17b19d080) at
+ ../lib/eventdev/rte_event_eth_tx_adapter.c:666
+0x00000000008b0784 in service_runner_do_callback (s=0x17fffe100,
+ cs=0x17ffb5f80, service_idx=2) at
+ ../lib/eal/common/rte_service.c:405
+0x00000000008b0ad8 in service_run (i=2, cs=0x17ffb5f80,
+ service_mask=18446744073709551615, s=0x17fffe100,
+ serialize_mt_unsafe=0) at
+ ../lib/eal/common/rte_service.c:441
+0x00000000008b0c68 in rte_service_run_iter_on_app_lcore (id=2,
+ serialize_mt_unsafe=0) at
+ ../lib/eal/common/rte_service.c:477
+0x000000000057bcc4 in schedule_devices (lcore_id=0) at
+ ../examples/eventdev_pipeline/pipeline_common.h:138
+0x000000000057ca94 in worker_generic_burst (arg=0x17b131e80) at
+ ../examples/eventdev_pipeline/
+ pipeline_worker_generic.c:83
+0x00000000005794a8 in main (argc=11, argv=0xfffffffff470) at
+ ../examples/eventdev_pipeline/main.c:449
+
+The root cause is that the queue_id (11) is invalid. The queue_id comes
+from mbuf.hash.txadapter.txq, which may be pre-written by the NIC driver
+when receiving packets (e.g. by pre-writing the mbuf.hash.fdir.hi field).
+
+Because this example enables only one ethdev queue, fix it by resetting
+txq to zero in the first worker stage.
+
+[1] dpdk-eventdev_pipeline -l 0-48 --vdev event_sw0 -- -r1 -t1 -e1 -w ff0
+ -s5 -n0 -c32 -W1000 -D
+
+Fixes: 81fb40f95c82 ("examples/eventdev: add generic worker pipeline")
+Cc: stable(a)dpdk.org
+
+Signed-off-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Signed-off-by: Chenxingyu Wang <wangchenxingyu(a)huawei.com>
+Acked-by: Pavan Nikhilesh <pbhagavatula(a)marvell.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ .mailmap | 1 +
+ examples/eventdev_pipeline/pipeline_worker_generic.c | 12 ++++++++----
+ 2 files changed, 9 insertions(+), 4 deletions(-)
+
+diff --git a/.mailmap b/.mailmap
+index ab0742a..7725e1c 100644
+--- a/.mailmap
++++ b/.mailmap
+@@ -224,6 +224,7 @@ Cheng Liu <liucheng11(a)huawei.com>
+ Cheng Peng <cheng.peng5(a)zte.com.cn>
+ Chengwen Feng <fengchengwen(a)huawei.com>
+ Chenmin Sun <chenmin.sun(a)intel.com>
++Chenxingyu Wang <wangchenxingyu(a)huawei.com>
+ Chenxu Di <chenxux.di(a)intel.com>
+ Chenyu Huang <chenyux.huang(a)intel.com>
+ Cheryl Houser <chouser(a)vmware.com>
+diff --git a/examples/eventdev_pipeline/pipeline_worker_generic.c b/examples/eventdev_pipeline/pipeline_worker_generic.c
+index 783f68c..831d7fd 100644
+--- a/examples/eventdev_pipeline/pipeline_worker_generic.c
++++ b/examples/eventdev_pipeline/pipeline_worker_generic.c
+@@ -38,10 +38,12 @@ worker_generic(void *arg)
+ }
+ received++;
+
+- /* The first worker stage does classification */
+- if (ev.queue_id == cdata.qid[0])
++ /* The first worker stage does classification and sets txq. */
++ if (ev.queue_id == cdata.qid[0]) {
+ ev.flow_id = ev.mbuf->hash.rss
+ % cdata.num_fids;
++ rte_event_eth_tx_adapter_txq_set(ev.mbuf, 0);
++ }
+
+ ev.queue_id = cdata.next_qid[ev.queue_id];
+ ev.op = RTE_EVENT_OP_FORWARD;
+@@ -96,10 +98,12 @@ worker_generic_burst(void *arg)
+
+ for (i = 0; i < nb_rx; i++) {
+
+- /* The first worker stage does classification */
+- if (events[i].queue_id == cdata.qid[0])
++ /* The first worker stage does classification and sets txq. */
++ if (events[i].queue_id == cdata.qid[0]) {
+ events[i].flow_id = events[i].mbuf->hash.rss
+ % cdata.num_fids;
++ rte_event_eth_tx_adapter_txq_set(events[i].mbuf, 0);
++ }
+
+ events[i].queue_id = cdata.next_qid[events[i].queue_id];
+ events[i].op = RTE_EVENT_OP_FORWARD;
+--
+2.33.0
+
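The root cause in the 0116 patch is union aliasing inside the mbuf: the Tx adapter's queue id shares storage with fields a driver may scribble on during Rx, so an unsanitized `txq` can point at a queue that was never configured. The sketch below models that overlap with a deliberately simplified union; it is illustrative, not the real `rte_mbuf` layout.

```c
#include <stdint.h>

/* Simplified model of the overlapping mbuf hash storage: the driver may
 * pre-write fdir_hi on Rx, and the Tx adapter later reads txq from the
 * same bytes. This union is an illustration, not rte_mbuf's definition. */
union hash_area {
    uint32_t fdir_hi;     /* driver scratch, may be written on packet Rx */
    uint16_t txq;         /* Tx adapter reads this as the Tx queue id */
};

struct fake_mbuf {
    union hash_area hash;
};

/* The fix: the first worker stage overwrites txq with a valid queue id
 * (0, since the example configures a single ethdev queue). */
static void first_stage(struct fake_mbuf *m)
{
    m->hash.txq = 0;
}
```

A driver write such as `fdir_hi = 0x000b000b` leaves `txq == 11` — exactly the invalid queue id seen in the backtrace — until the first stage resets it.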
diff --git a/0117-examples-l3fwd-add-Tx-burst-size-configuration-optio.patch b/0117-examples-l3fwd-add-Tx-burst-size-configuration-optio.patch
new file mode 100644
index 0000000..35b50bf
--- /dev/null
+++ b/0117-examples-l3fwd-add-Tx-burst-size-configuration-optio.patch
@@ -0,0 +1,338 @@
+From a384e977287431b4e845924405cef27eb93dc442 Mon Sep 17 00:00:00 2001
+From: Sivaprasad Tummala <sivaprasad.tummala(a)amd.com>
+Date: Thu, 6 Nov 2025 14:16:31 +0000
+Subject: [PATCH 04/24] examples/l3fwd: add Tx burst size configuration option
+
+[ upstream commit 79375d1015b308234e8b6955671a296394249f9b ]
+
+Previously, the Tx burst size in l3fwd was fixed at 256, which could
+lead to suboptimal performance in certain scenarios.
+
+This patch introduces separate --rx-burst and --tx-burst options to
+explicitly configure Rx and Tx burst sizes. By default, the Tx burst
+size now matches the Rx burst size for better efficiency and pipeline
+balance.
+
+Fixes: d5c4897ecfb2 ("examples/l3fwd: add option to set Rx burst size")
+
+Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala(a)amd.com>
+Tested-by: Venkat Kumar Ande <venkatkumar.ande(a)amd.com>
+Tested-by: Dengdui Huang <huangdengdui(a)huawei.com>
+Tested-by: Pavan Nikhilesh <pbhagavatula(a)marvell.com>
+Acked-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ doc/guides/sample_app_ug/l3_forward.rst | 6 ++
+ examples/l3fwd/l3fwd.h | 10 +---
+ examples/l3fwd/l3fwd_acl.c | 2 +-
+ examples/l3fwd/l3fwd_common.h | 5 +-
+ examples/l3fwd/l3fwd_em.c | 2 +-
+ examples/l3fwd/l3fwd_fib.c | 2 +-
+ examples/l3fwd/l3fwd_lpm.c | 2 +-
+ examples/l3fwd/main.c | 80 +++++++++++++++----------
+ 8 files changed, 67 insertions(+), 42 deletions(-)
+
+diff --git a/doc/guides/sample_app_ug/l3_forward.rst b/doc/guides/sample_app_ug/l3_forward.rst
+index 1cc2c1d..22014cc 100644
+--- a/doc/guides/sample_app_ug/l3_forward.rst
++++ b/doc/guides/sample_app_ug/l3_forward.rst
+@@ -78,6 +78,8 @@ The application has a number of command line options::
+ [-P]
+ [--lookup LOOKUP_METHOD]
+ --config(port,queue,lcore)[,(port,queue,lcore)]
++ [--rx-burst NPKTS]
++ [--tx-burst NPKTS]
+ [--eth-dest=X,MM:MM:MM:MM:MM:MM]
+ [--max-pkt-len PKTLEN]
+ [--no-numa]
+@@ -114,6 +116,10 @@ Where,
+
+ * ``--config (port,queue,lcore)[,(port,queue,lcore)]:`` Determines which queues from which ports are mapped to which cores.
+
++* ``--rx-burst NPKTS:`` Optional, Rx burst size in decimal (default 32).
++
++* ``--tx-burst NPKTS:`` Optional, Tx burst size in decimal (default 32).
++
+ * ``--eth-dest=X,MM:MM:MM:MM:MM:MM:`` Optional, ethernet destination for port X.
+
+ * ``--max-pkt-len:`` Optional, maximum packet length in decimal (64-9600).
+diff --git a/examples/l3fwd/l3fwd.h b/examples/l3fwd/l3fwd.h
+index bb73edd..7c36ab2 100644
+--- a/examples/l3fwd/l3fwd.h
++++ b/examples/l3fwd/l3fwd.h
+@@ -32,10 +32,6 @@
+
+ #define VECTOR_SIZE_DEFAULT MAX_PKT_BURST
+ #define VECTOR_TMO_NS_DEFAULT 1E6 /* 1ms */
+-/*
+- * Try to avoid TX buffering if we have at least MAX_TX_BURST packets to send.
+- */
+-#define MAX_TX_BURST (MAX_PKT_BURST / 2)
+
+ #define NB_SOCKETS 8
+
+@@ -116,7 +112,7 @@ extern struct acl_algorithms acl_alg[];
+
+ extern uint32_t max_pkt_len;
+
+-extern uint32_t nb_pkt_per_burst;
++extern uint32_t rx_burst_size;
+
+ /* Send burst of packets on an output interface */
+ static inline int
+@@ -151,8 +147,8 @@ send_single_packet(struct lcore_conf *qconf,
+ len++;
+
+ /* enough pkts to be sent */
+- if (unlikely(len == MAX_PKT_BURST)) {
+- send_burst(qconf, MAX_PKT_BURST, port);
++ if (unlikely(len == rx_burst_size)) {
++ send_burst(qconf, rx_burst_size, port);
+ len = 0;
+ }
+
+diff --git a/examples/l3fwd/l3fwd_acl.c b/examples/l3fwd/l3fwd_acl.c
+index 89be104..58218da 100644
+--- a/examples/l3fwd/l3fwd_acl.c
++++ b/examples/l3fwd/l3fwd_acl.c
+@@ -1055,7 +1055,7 @@ acl_main_loop(__rte_unused void *dummy)
+ portid = qconf->rx_queue_list[i].port_id;
+ queueid = qconf->rx_queue_list[i].queue_id;
+ nb_rx = rte_eth_rx_burst(portid, queueid,
+- pkts_burst, nb_pkt_per_burst);
++ pkts_burst, rx_burst_size);
+
+ if (nb_rx > 0) {
+ struct acl_search_t acl_search;
+diff --git a/examples/l3fwd/l3fwd_common.h b/examples/l3fwd/l3fwd_common.h
+index 224b1c0..9b9bdf6 100644
+--- a/examples/l3fwd/l3fwd_common.h
++++ b/examples/l3fwd/l3fwd_common.h
+@@ -18,6 +18,9 @@
+ /* Minimum value of IPV4 total length (20B) in network byte order. */
+ #define IPV4_MIN_LEN_BE (sizeof(struct rte_ipv4_hdr) << 8)
+
++extern uint32_t rx_burst_size;
++extern uint32_t tx_burst_size;
++
+ /*
+ * From http://www.rfc-editor.org/rfc/rfc1812.txt section 5.2.2:
+ * - The IP version number must be 4.
+@@ -64,7 +67,7 @@ send_packetsx4(struct lcore_conf *qconf, uint16_t port, struct rte_mbuf *m[],
+ * If TX buffer for that queue is empty, and we have enough packets,
+ * then send them straightway.
+ */
+- if (num >= MAX_TX_BURST && len == 0) {
++ if (num >= tx_burst_size && len == 0) {
+ n = rte_eth_tx_burst(port, qconf->tx_queue_id[port], m, num);
+ if (unlikely(n < num)) {
+ do {
+diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c
+index 9a37019..75e89e6 100644
+--- a/examples/l3fwd/l3fwd_em.c
++++ b/examples/l3fwd/l3fwd_em.c
+@@ -652,7 +652,7 @@ em_main_loop(__rte_unused void *dummy)
+ portid = qconf->rx_queue_list[i].port_id;
+ queueid = qconf->rx_queue_list[i].queue_id;
+ nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst,
+- nb_pkt_per_burst);
++ rx_burst_size);
+ if (nb_rx == 0)
+ continue;
+
+diff --git a/examples/l3fwd/l3fwd_fib.c b/examples/l3fwd/l3fwd_fib.c
+index 5a55b35..3f905f9 100644
+--- a/examples/l3fwd/l3fwd_fib.c
++++ b/examples/l3fwd/l3fwd_fib.c
+@@ -239,7 +239,7 @@ fib_main_loop(__rte_unused void *dummy)
+ portid = qconf->rx_queue_list[i].port_id;
+ queueid = qconf->rx_queue_list[i].queue_id;
+ nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst,
+- nb_pkt_per_burst);
++ rx_burst_size);
+ if (nb_rx == 0)
+ continue;
+
+diff --git a/examples/l3fwd/l3fwd_lpm.c b/examples/l3fwd/l3fwd_lpm.c
+index c3df879..40c365e 100644
+--- a/examples/l3fwd/l3fwd_lpm.c
++++ b/examples/l3fwd/l3fwd_lpm.c
+@@ -206,7 +206,7 @@ lpm_main_loop(__rte_unused void *dummy)
+ portid = qconf->rx_queue_list[i].port_id;
+ queueid = qconf->rx_queue_list[i].queue_id;
+ nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst,
+- nb_pkt_per_burst);
++ rx_burst_size);
+ if (nb_rx == 0)
+ continue;
+
+diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
+index 258a235..be5b5d8 100644
+--- a/examples/l3fwd/main.c
++++ b/examples/l3fwd/main.c
+@@ -57,7 +57,8 @@
+ static_assert(MEMPOOL_CACHE_SIZE >= MAX_PKT_BURST, "MAX_PKT_BURST should be at most MEMPOOL_CACHE_SIZE");
+ uint16_t nb_rxd = RX_DESC_DEFAULT;
+ uint16_t nb_txd = TX_DESC_DEFAULT;
+-uint32_t nb_pkt_per_burst = DEFAULT_PKT_BURST;
++uint32_t rx_burst_size = DEFAULT_PKT_BURST;
++uint32_t tx_burst_size = DEFAULT_PKT_BURST;
+
+ /**< Ports set in promiscuous mode off by default. */
+ static int promiscuous_on;
+@@ -398,7 +399,8 @@ print_usage(const char *prgname)
+ " --config (port,queue,lcore)[,(port,queue,lcore)]"
+ " [--rx-queue-size NPKTS]"
+ " [--tx-queue-size NPKTS]"
+- " [--burst NPKTS]"
++ " [--rx-burst NPKTS]"
++ " [--tx-burst NPKTS]"
+ " [--eth-dest=X,MM:MM:MM:MM:MM:MM]"
+ " [--max-pkt-len PKTLEN]"
+ " [--no-numa]"
+@@ -424,7 +426,9 @@ print_usage(const char *prgname)
+ " Default: %d\n"
+ " --tx-queue-size NPKTS: Tx queue size in decimal\n"
+ " Default: %d\n"
+- " --burst NPKTS: Burst size in decimal\n"
++ " --rx-burst NPKTS: RX Burst size in decimal\n"
++ " Default: %d\n"
++ " --tx-burst NPKTS: TX Burst size in decimal\n"
+ " Default: %d\n"
+ " --eth-dest=X,MM:MM:MM:MM:MM:MM: Ethernet destination for port X\n"
+ " --max-pkt-len PKTLEN: maximum packet length in decimal (64-9600)\n"
+@@ -455,8 +459,8 @@ print_usage(const char *prgname)
+ " another is route entry at while line leads with character '%c'.\n"
+ " --rule_ipv6=FILE: Specify the ipv6 rules entries file.\n"
+ " --alg: ACL classify method to use, one of: %s.\n\n",
+- prgname, RX_DESC_DEFAULT, TX_DESC_DEFAULT, DEFAULT_PKT_BURST,
+- ACL_LEAD_CHAR, ROUTE_LEAD_CHAR, alg);
++ prgname, RX_DESC_DEFAULT, TX_DESC_DEFAULT, DEFAULT_PKT_BURST, DEFAULT_PKT_BURST,
++ MEMPOOL_CACHE_SIZE, ACL_LEAD_CHAR, ROUTE_LEAD_CHAR, alg);
+ }
+
+ static int
+@@ -669,7 +673,7 @@ parse_lookup(const char *optarg)
+ }
+
+ static void
+-parse_pkt_burst(const char *optarg)
++parse_pkt_burst(const char *optarg, bool is_rx_burst, uint32_t *burst_sz)
+ {
+ struct rte_eth_dev_info dev_info;
+ unsigned long pkt_burst;
+@@ -684,31 +688,38 @@ parse_pkt_burst(const char *optarg)
+
+ if (pkt_burst > MAX_PKT_BURST) {
+ RTE_LOG(INFO, L3FWD, "User provided burst must be <= %d. Using default value %d\n",
+- MAX_PKT_BURST, nb_pkt_per_burst);
++ MAX_PKT_BURST, *burst_sz);
+ return;
+ } else if (pkt_burst > 0) {
+- nb_pkt_per_burst = (uint32_t)pkt_burst;
++ *burst_sz = (uint32_t)pkt_burst;
+ return;
+ }
+
+- /* If user gives a value of zero, query the PMD for its recommended Rx burst size. */
+- ret = rte_eth_dev_info_get(0, &dev_info);
+- if (ret != 0)
+- return;
+- burst_size = dev_info.default_rxportconf.burst_size;
+- if (burst_size == 0) {
+- RTE_LOG(INFO, L3FWD, "PMD does not recommend a burst size. Using default value %d. "
+- "User provided value must be in [1, %d]\n",
+- nb_pkt_per_burst, MAX_PKT_BURST);
+- return;
+- } else if (burst_size > MAX_PKT_BURST) {
+- RTE_LOG(INFO, L3FWD, "PMD recommended burst size %d exceeds maximum value %d. "
+- "Using default value %d\n",
+- burst_size, MAX_PKT_BURST, nb_pkt_per_burst);
+- return;
++ if (is_rx_burst) {
++ /* If user gives a value of zero, query the PMD for its recommended
++ * Rx burst size.
++ */
++ ret = rte_eth_dev_info_get(0, &dev_info);
++ if (ret != 0)
++ return;
++ burst_size = dev_info.default_rxportconf.burst_size;
++ if (burst_size == 0) {
++ RTE_LOG(INFO, L3FWD, "PMD does not recommend a burst size. Using default value %d. "
++ "User provided value must be in [1, %d]\n",
++ rx_burst_size, MAX_PKT_BURST);
++ return;
++ } else if (burst_size > MAX_PKT_BURST) {
++ RTE_LOG(INFO, L3FWD, "PMD recommended burst size %d exceeds maximum value %d. "
++ "Using default value %d\n",
++ burst_size, MAX_PKT_BURST, rx_burst_size);
++ return;
++ }
++ *burst_sz = burst_size;
++ RTE_LOG(INFO, L3FWD, "Using PMD-provided RX burst value %d\n", burst_size);
++ } else {
++ RTE_LOG(INFO, L3FWD, "User provided TX burst is 0. Using default value %d\n",
++ *burst_sz);
+ }
+- nb_pkt_per_burst = burst_size;
+- RTE_LOG(INFO, L3FWD, "Using PMD-provided burst value %d\n", burst_size);
+ }
+
+ #define MAX_JUMBO_PKT_LEN 9600
+@@ -742,7 +753,8 @@ static const char short_options[] =
+ #define CMD_LINE_OPT_RULE_IPV4 "rule_ipv4"
+ #define CMD_LINE_OPT_RULE_IPV6 "rule_ipv6"
+ #define CMD_LINE_OPT_ALG "alg"
+-#define CMD_LINE_OPT_PKT_BURST "burst"
++#define CMD_LINE_OPT_PKT_RX_BURST "rx-burst"
++#define CMD_LINE_OPT_PKT_TX_BURST "tx-burst"
+
+ enum {
+ /* long options mapped to a short option */
+@@ -772,7 +784,8 @@ enum {
+ CMD_LINE_OPT_ENABLE_VECTOR_NUM,
+ CMD_LINE_OPT_VECTOR_SIZE_NUM,
+ CMD_LINE_OPT_VECTOR_TMO_NS_NUM,
+- CMD_LINE_OPT_PKT_BURST_NUM,
++ CMD_LINE_OPT_PKT_RX_BURST_NUM,
++ CMD_LINE_OPT_PKT_TX_BURST_NUM,
+ };
+
+ static const struct option lgopts[] = {
+@@ -799,7 +812,8 @@ static const struct option lgopts[] = {
+ {CMD_LINE_OPT_RULE_IPV4, 1, 0, CMD_LINE_OPT_RULE_IPV4_NUM},
+ {CMD_LINE_OPT_RULE_IPV6, 1, 0, CMD_LINE_OPT_RULE_IPV6_NUM},
+ {CMD_LINE_OPT_ALG, 1, 0, CMD_LINE_OPT_ALG_NUM},
+- {CMD_LINE_OPT_PKT_BURST, 1, 0, CMD_LINE_OPT_PKT_BURST_NUM},
++ {CMD_LINE_OPT_PKT_RX_BURST, 1, 0, CMD_LINE_OPT_PKT_RX_BURST_NUM},
++ {CMD_LINE_OPT_PKT_TX_BURST, 1, 0, CMD_LINE_OPT_PKT_TX_BURST_NUM},
+ {NULL, 0, 0, 0}
+ };
+
+@@ -888,8 +902,12 @@ parse_args(int argc, char **argv)
+ parse_queue_size(optarg, &nb_txd, 0);
+ break;
+
+- case CMD_LINE_OPT_PKT_BURST_NUM:
+- parse_pkt_burst(optarg);
++ case CMD_LINE_OPT_PKT_RX_BURST_NUM:
++ parse_pkt_burst(optarg, true, &rx_burst_size);
++ break;
++
++ case CMD_LINE_OPT_PKT_TX_BURST_NUM:
++ parse_pkt_burst(optarg, false, &tx_burst_size);
+ break;
+
+ case CMD_LINE_OPT_ETH_DEST_NUM:
+@@ -1613,6 +1631,8 @@ main(int argc, char **argv)
+ if (ret < 0)
+ rte_exit(EXIT_FAILURE, "Invalid L3FWD parameters\n");
+
++ RTE_LOG(INFO, L3FWD, "Using Rx burst %u Tx burst %u\n", rx_burst_size, tx_burst_size);
++
+ /* Setup function pointers for lookup method. */
+ setup_l3fwd_lookup_tables();
+
+--
+2.33.0
+
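Editorial note on the l3fwd change above: the burst-size resolution it implements (user value wins; otherwise take the PMD recommendation unless it is absent or out of range) can be sketched as a standalone helper. Names, the `MAX_PKT_BURST` value, and the fallback behaviour here are illustrative assumptions, not the actual l3fwd code:

```c
#include <stdint.h>

#define MAX_PKT_BURST 512  /* assumption: stands in for the l3fwd limit */

/*
 * Resolve the effective burst size: an explicit user value wins; when the
 * user passes 0, prefer the PMD's recommended value, falling back to the
 * current default when the PMD recommends nothing or exceeds the maximum.
 */
static uint16_t
resolve_burst_size(uint16_t user_value, uint16_t pmd_recommended,
		   uint16_t current_default)
{
	if (user_value != 0)
		return user_value;          /* explicit user choice wins */
	if (pmd_recommended == 0)
		return current_default;     /* PMD has no recommendation */
	if (pmd_recommended > MAX_PKT_BURST)
		return current_default;     /* out of range, keep default */
	return pmd_recommended;
}
```

In the real patch the PMD recommendation comes from `rte_eth_dev_info_get()` via `dev_info.default_rxportconf.burst_size`; this sketch only models the selection logic.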
diff --git a/0118-net-hns3-remove-duplicate-struct-field.patch b/0118-net-hns3-remove-duplicate-struct-field.patch
new file mode 100644
index 0000000..48f16b3
--- /dev/null
+++ b/0118-net-hns3-remove-duplicate-struct-field.patch
@@ -0,0 +1,260 @@
+From 90219bcbe62357d12707e244239b1e00912c2e9a Mon Sep 17 00:00:00 2001
+From: Chengwen Feng <fengchengwen(a)huawei.com>
+Date: Tue, 1 Jul 2025 17:10:00 +0800
+Subject: [PATCH 05/24] net/hns3: remove duplicate struct field
+
+[ upstream commit 3f09e50b5c23ff3a06e89f944e9e1e4cb37faacb ]
+
+Both struct hns3_hw and hns3_hw.dcb_info have a num_tc field with the
+same meaning. To improve code readability, remove the num_tc field
+from struct hns3_hw.
+
+Signed-off-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Signed-off-by: Dengdui Huang <huangdengdui(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ drivers/net/hns3/hns3_dcb.c | 44 ++++++++++++-------------------
+ drivers/net/hns3/hns3_dump.c | 2 +-
+ drivers/net/hns3/hns3_ethdev.c | 4 +--
+ drivers/net/hns3/hns3_ethdev.h | 3 +--
+ drivers/net/hns3/hns3_ethdev_vf.c | 2 +-
+ drivers/net/hns3/hns3_tm.c | 6 ++---
+ 6 files changed, 25 insertions(+), 36 deletions(-)
+
+diff --git a/drivers/net/hns3/hns3_dcb.c b/drivers/net/hns3/hns3_dcb.c
+index 915e4eb..ee7502d 100644
+--- a/drivers/net/hns3/hns3_dcb.c
++++ b/drivers/net/hns3/hns3_dcb.c
+@@ -623,7 +623,7 @@ hns3_set_rss_size(struct hns3_hw *hw, uint16_t nb_rx_q)
+ uint16_t used_rx_queues;
+ uint16_t i;
+
+- rx_qnum_per_tc = nb_rx_q / hw->num_tc;
++ rx_qnum_per_tc = nb_rx_q / hw->dcb_info.num_tc;
+ if (rx_qnum_per_tc > hw->rss_size_max) {
+ hns3_err(hw, "rx queue number of per tc (%u) is greater than "
+ "value (%u) hardware supported.",
+@@ -631,11 +631,11 @@ hns3_set_rss_size(struct hns3_hw *hw, uint16_t nb_rx_q)
+ return -EINVAL;
+ }
+
+- used_rx_queues = hw->num_tc * rx_qnum_per_tc;
++ used_rx_queues = hw->dcb_info.num_tc * rx_qnum_per_tc;
+ if (used_rx_queues != nb_rx_q) {
+ hns3_err(hw, "rx queue number (%u) configured must be an "
+ "integral multiple of valid tc number (%u).",
+- nb_rx_q, hw->num_tc);
++ nb_rx_q, hw->dcb_info.num_tc);
+ return -EINVAL;
+ }
+ hw->alloc_rss_size = rx_qnum_per_tc;
+@@ -665,12 +665,12 @@ hns3_tc_queue_mapping_cfg(struct hns3_hw *hw, uint16_t nb_tx_q)
+ uint16_t tx_qnum_per_tc;
+ uint8_t i;
+
+- tx_qnum_per_tc = nb_tx_q / hw->num_tc;
+- used_tx_queues = hw->num_tc * tx_qnum_per_tc;
++ tx_qnum_per_tc = nb_tx_q / hw->dcb_info.num_tc;
++ used_tx_queues = hw->dcb_info.num_tc * tx_qnum_per_tc;
+ if (used_tx_queues != nb_tx_q) {
+ hns3_err(hw, "tx queue number (%u) configured must be an "
+ "integral multiple of valid tc number (%u).",
+- nb_tx_q, hw->num_tc);
++ nb_tx_q, hw->dcb_info.num_tc);
+ return -EINVAL;
+ }
+
+@@ -678,7 +678,7 @@ hns3_tc_queue_mapping_cfg(struct hns3_hw *hw, uint16_t nb_tx_q)
+ hw->tx_qnum_per_tc = tx_qnum_per_tc;
+ for (i = 0; i < HNS3_MAX_TC_NUM; i++) {
+ tc_queue = &hw->tc_queue[i];
+- if (hw->hw_tc_map & BIT(i) && i < hw->num_tc) {
++ if (hw->hw_tc_map & BIT(i) && i < hw->dcb_info.num_tc) {
+ tc_queue->enable = true;
+ tc_queue->tqp_offset = i * hw->tx_qnum_per_tc;
+ tc_queue->tqp_count = hw->tx_qnum_per_tc;
+@@ -720,15 +720,15 @@ hns3_queue_to_tc_mapping(struct hns3_hw *hw, uint16_t nb_rx_q, uint16_t nb_tx_q)
+ {
+ int ret;
+
+- if (nb_rx_q < hw->num_tc) {
++ if (nb_rx_q < hw->dcb_info.num_tc) {
+ hns3_err(hw, "number of Rx queues(%u) is less than number of TC(%u).",
+- nb_rx_q, hw->num_tc);
++ nb_rx_q, hw->dcb_info.num_tc);
+ return -EINVAL;
+ }
+
+- if (nb_tx_q < hw->num_tc) {
++ if (nb_tx_q < hw->dcb_info.num_tc) {
+ hns3_err(hw, "number of Tx queues(%u) is less than number of TC(%u).",
+- nb_tx_q, hw->num_tc);
++ nb_tx_q, hw->dcb_info.num_tc);
+ return -EINVAL;
+ }
+
+@@ -739,15 +739,6 @@ hns3_queue_to_tc_mapping(struct hns3_hw *hw, uint16_t nb_rx_q, uint16_t nb_tx_q)
+ return hns3_tc_queue_mapping_cfg(hw, nb_tx_q);
+ }
+
+-static int
+-hns3_dcb_update_tc_queue_mapping(struct hns3_hw *hw, uint16_t nb_rx_q,
+- uint16_t nb_tx_q)
+-{
+- hw->num_tc = hw->dcb_info.num_tc;
+-
+- return hns3_queue_to_tc_mapping(hw, nb_rx_q, nb_tx_q);
+-}
+-
+ int
+ hns3_dcb_info_init(struct hns3_hw *hw)
+ {
+@@ -1028,7 +1019,7 @@ hns3_q_to_qs_map(struct hns3_hw *hw)
+ uint32_t i, j;
+ int ret;
+
+- for (i = 0; i < hw->num_tc; i++) {
++ for (i = 0; i < hw->dcb_info.num_tc; i++) {
+ tc_queue = &hw->tc_queue[i];
+ for (j = 0; j < tc_queue->tqp_count; j++) {
+ q_id = tc_queue->tqp_offset + j;
+@@ -1053,7 +1044,7 @@ hns3_pri_q_qs_cfg(struct hns3_hw *hw)
+ return -EINVAL;
+
+ /* Cfg qs -> pri mapping */
+- for (i = 0; i < hw->num_tc; i++) {
++ for (i = 0; i < hw->dcb_info.num_tc; i++) {
+ ret = hns3_qs_to_pri_map_cfg(hw, i, i);
+ if (ret) {
+ hns3_err(hw, "qs_to_pri mapping fail: %d", ret);
+@@ -1448,8 +1439,8 @@ hns3_dcb_info_cfg(struct hns3_adapter *hns)
+ for (i = 0; i < HNS3_MAX_USER_PRIO; i++)
+ hw->dcb_info.prio_tc[i] = dcb_rx_conf->dcb_tc[i];
+
+- ret = hns3_dcb_update_tc_queue_mapping(hw, hw->data->nb_rx_queues,
+- hw->data->nb_tx_queues);
++ ret = hns3_queue_to_tc_mapping(hw, hw->data->nb_rx_queues,
++ hw->data->nb_tx_queues);
+ if (ret)
+ hns3_err(hw, "update tc queue mapping failed, ret = %d.", ret);
+
+@@ -1635,8 +1626,7 @@ hns3_dcb_init(struct hns3_hw *hw)
+ */
+ default_tqp_num = RTE_MIN(hw->rss_size_max,
+ hw->tqps_num / hw->dcb_info.num_tc);
+- ret = hns3_dcb_update_tc_queue_mapping(hw, default_tqp_num,
+- default_tqp_num);
++ ret = hns3_queue_to_tc_mapping(hw, default_tqp_num, default_tqp_num);
+ if (ret) {
+ hns3_err(hw,
+ "update tc queue mapping failed, ret = %d.",
+@@ -1673,7 +1663,7 @@ hns3_update_queue_map_configure(struct hns3_adapter *hns)
+ if ((uint32_t)mq_mode & RTE_ETH_MQ_RX_DCB_FLAG)
+ return 0;
+
+- ret = hns3_dcb_update_tc_queue_mapping(hw, nb_rx_q, nb_tx_q);
++ ret = hns3_queue_to_tc_mapping(hw, nb_rx_q, nb_tx_q);
+ if (ret) {
+ hns3_err(hw, "failed to update tc queue mapping, ret = %d.",
+ ret);
+diff --git a/drivers/net/hns3/hns3_dump.c b/drivers/net/hns3/hns3_dump.c
+index 8411835..947526e 100644
+--- a/drivers/net/hns3/hns3_dump.c
++++ b/drivers/net/hns3/hns3_dump.c
+@@ -914,7 +914,7 @@ hns3_is_link_fc_mode(struct hns3_adapter *hns)
+ if (hw->current_fc_status == HNS3_FC_STATUS_PFC)
+ return false;
+
+- if (hw->num_tc > 1 && !pf->support_multi_tc_pause)
++ if (hw->dcb_info.num_tc > 1 && !pf->support_multi_tc_pause)
+ return false;
+
+ return true;
+diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
+index f74fad4..035ebb1 100644
+--- a/drivers/net/hns3/hns3_ethdev.c
++++ b/drivers/net/hns3/hns3_ethdev.c
+@@ -5440,7 +5440,7 @@ hns3_flow_ctrl_set(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf)
+ return -EOPNOTSUPP;
+ }
+
+- if (hw->num_tc > 1 && !pf->support_multi_tc_pause) {
++ if (hw->dcb_info.num_tc > 1 && !pf->support_multi_tc_pause) {
+ hns3_err(hw, "in multi-TC scenarios, MAC pause is not supported.");
+ return -EOPNOTSUPP;
+ }
+@@ -5517,7 +5517,7 @@ hns3_get_dcb_info(struct rte_eth_dev *dev, struct rte_eth_dcb_info *dcb_info)
+ for (i = 0; i < dcb_info->nb_tcs; i++)
+ dcb_info->tc_bws[i] = hw->dcb_info.pg_info[0].tc_dwrr[i];
+
+- for (i = 0; i < hw->num_tc; i++) {
++ for (i = 0; i < hw->dcb_info.num_tc; i++) {
+ dcb_info->tc_queue.tc_rxq[0][i].base = hw->alloc_rss_size * i;
+ dcb_info->tc_queue.tc_txq[0][i].base =
+ hw->tc_queue[i].tqp_offset;
+diff --git a/drivers/net/hns3/hns3_ethdev.h b/drivers/net/hns3/hns3_ethdev.h
+index d856285..532d44b 100644
+--- a/drivers/net/hns3/hns3_ethdev.h
++++ b/drivers/net/hns3/hns3_ethdev.h
+@@ -133,7 +133,7 @@ struct hns3_tc_info {
+ };
+
+ struct hns3_dcb_info {
+- uint8_t num_tc;
++ uint8_t num_tc; /* Total number of enabled TCs */
+ uint8_t num_pg; /* It must be 1 if vNET-Base schd */
+ uint8_t pg_dwrr[HNS3_PG_NUM];
+ uint8_t prio_tc[HNS3_MAX_USER_PRIO];
+@@ -537,7 +537,6 @@ struct hns3_hw {
+ uint16_t rss_ind_tbl_size;
+ uint16_t rss_key_size;
+
+- uint8_t num_tc; /* Total number of enabled TCs */
+ uint8_t hw_tc_map;
+ enum hns3_fc_mode requested_fc_mode; /* FC mode requested by user */
+ struct hns3_dcb_info dcb_info;
+diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
+index 465280d..3e0bb9d 100644
+--- a/drivers/net/hns3/hns3_ethdev_vf.c
++++ b/drivers/net/hns3/hns3_ethdev_vf.c
+@@ -854,7 +854,7 @@ hns3vf_get_basic_info(struct hns3_hw *hw)
+
+ basic_info = (struct hns3_basic_info *)resp_msg;
+ hw->hw_tc_map = basic_info->hw_tc_map;
+- hw->num_tc = hns3vf_get_num_tc(hw);
++ hw->dcb_info.num_tc = hns3vf_get_num_tc(hw);
+ hw->pf_vf_if_version = basic_info->pf_vf_if_version;
+ hns3vf_update_caps(hw, basic_info->caps);
+
+diff --git a/drivers/net/hns3/hns3_tm.c b/drivers/net/hns3/hns3_tm.c
+index d969164..387b37c 100644
+--- a/drivers/net/hns3/hns3_tm.c
++++ b/drivers/net/hns3/hns3_tm.c
+@@ -519,13 +519,13 @@ hns3_tm_tc_node_add(struct rte_eth_dev *dev, uint32_t node_id,
+
+ if (node_id >= pf->tm_conf.nb_nodes_max - 1 ||
+ node_id < pf->tm_conf.nb_leaf_nodes_max ||
+- hns3_tm_calc_node_tc_no(&pf->tm_conf, node_id) >= hw->num_tc) {
++ hns3_tm_calc_node_tc_no(&pf->tm_conf, node_id) >= hw->dcb_info.num_tc) {
+ error->type = RTE_TM_ERROR_TYPE_NODE_ID;
+ error->message = "invalid tc node ID";
+ return -EINVAL;
+ }
+
+- if (pf->tm_conf.nb_tc_node >= hw->num_tc) {
++ if (pf->tm_conf.nb_tc_node >= hw->dcb_info.num_tc) {
+ error->type = RTE_TM_ERROR_TYPE_NODE_ID;
+ error->message = "too many TCs";
+ return -EINVAL;
+@@ -974,7 +974,7 @@ hns3_tm_configure_check(struct hns3_hw *hw, struct rte_tm_error *error)
+ }
+
+ if (hns3_tm_calc_node_tc_no(tm_conf, tm_node->id) >=
+- hw->num_tc) {
++ hw->dcb_info.num_tc) {
+ error->type = RTE_TM_ERROR_TYPE_NODE_ID;
+ error->message = "node's TC not exist";
+ return false;
+--
+2.33.0
+
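The hns3 checks touched by the patch above (`hns3_set_rss_size`, `hns3_tc_queue_mapping_cfg`) require the Rx/Tx queue count to be an integral multiple of the TC count so every TC gets the same number of queues. A self-contained sketch of that validation, with illustrative names rather than the driver's API:

```c
#include <stdint.h>

/*
 * Return the number of queues per TC when nb_q divides evenly across
 * num_tc, otherwise 0 to signal an invalid configuration (mirrors the
 * "integral multiple of valid tc number" check in the hns3 driver).
 */
static uint16_t
queues_per_tc(uint16_t nb_q, uint8_t num_tc)
{
	if (num_tc == 0 || nb_q % num_tc != 0)
		return 0;
	return nb_q / num_tc;
}
```

The driver additionally caps the per-TC count at `hw->rss_size_max`; that bound is omitted here for brevity.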
diff --git a/0119-net-hns3-refactor-DCB-module.patch b/0119-net-hns3-refactor-DCB-module.patch
new file mode 100644
index 0000000..06fd5aa
--- /dev/null
+++ b/0119-net-hns3-refactor-DCB-module.patch
@@ -0,0 +1,296 @@
+From 9e24d82acdc382dfd6113a6e8a798f04f5a6f3b6 Mon Sep 17 00:00:00 2001
+From: Chengwen Feng <fengchengwen(a)huawei.com>
+Date: Tue, 1 Jul 2025 17:10:01 +0800
+Subject: [PATCH 06/24] net/hns3: refactor DCB module
+
+[ upstream commit c90c52d7a9028cca0686b799a7614c988d8b9b42 ]
+
+The DCB-related fields are spread across multiple structures; this
+patch moves them into struct hns3_dcb_info.
+
+Signed-off-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Signed-off-by: Dengdui Huang <huangdengdui(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ drivers/net/hns3/hns3_dcb.c | 13 +++++------
+ drivers/net/hns3/hns3_ethdev.c | 38 +++++++++++++++----------------
+ drivers/net/hns3/hns3_ethdev.h | 8 +++----
+ drivers/net/hns3/hns3_ethdev_vf.c | 4 ++--
+ drivers/net/hns3/hns3_rss.c | 8 +++----
+ 5 files changed, 34 insertions(+), 37 deletions(-)
+
+diff --git a/drivers/net/hns3/hns3_dcb.c b/drivers/net/hns3/hns3_dcb.c
+index ee7502d..76f597e 100644
+--- a/drivers/net/hns3/hns3_dcb.c
++++ b/drivers/net/hns3/hns3_dcb.c
+@@ -678,7 +678,7 @@ hns3_tc_queue_mapping_cfg(struct hns3_hw *hw, uint16_t nb_tx_q)
+ hw->tx_qnum_per_tc = tx_qnum_per_tc;
+ for (i = 0; i < HNS3_MAX_TC_NUM; i++) {
+ tc_queue = &hw->tc_queue[i];
+- if (hw->hw_tc_map & BIT(i) && i < hw->dcb_info.num_tc) {
++ if (hw->dcb_info.hw_tc_map & BIT(i) && i < hw->dcb_info.num_tc) {
+ tc_queue->enable = true;
+ tc_queue->tqp_offset = i * hw->tx_qnum_per_tc;
+ tc_queue->tqp_count = hw->tx_qnum_per_tc;
+@@ -762,7 +762,7 @@ hns3_dcb_info_init(struct hns3_hw *hw)
+ if (i != 0)
+ continue;
+
+- hw->dcb_info.pg_info[i].tc_bit_map = hw->hw_tc_map;
++ hw->dcb_info.pg_info[i].tc_bit_map = hw->dcb_info.hw_tc_map;
+ for (k = 0; k < hw->dcb_info.num_tc; k++)
+ hw->dcb_info.pg_info[i].tc_dwrr[k] = BW_MAX_PERCENT;
+ }
+@@ -1395,15 +1395,14 @@ static int
+ hns3_dcb_info_cfg(struct hns3_adapter *hns)
+ {
+ struct rte_eth_dcb_rx_conf *dcb_rx_conf;
+- struct hns3_pf *pf = &hns->pf;
+ struct hns3_hw *hw = &hns->hw;
+ uint8_t tc_bw, bw_rest;
+ uint8_t i, j;
+ int ret;
+
+ dcb_rx_conf = &hw->data->dev_conf.rx_adv_conf.dcb_rx_conf;
+- pf->local_max_tc = (uint8_t)dcb_rx_conf->nb_tcs;
+- pf->pfc_max = (uint8_t)dcb_rx_conf->nb_tcs;
++ hw->dcb_info.local_max_tc = (uint8_t)dcb_rx_conf->nb_tcs;
++ hw->dcb_info.pfc_max = (uint8_t)dcb_rx_conf->nb_tcs;
+
+ /* Config pg0 */
+ memset(hw->dcb_info.pg_info, 0,
+@@ -1412,7 +1411,7 @@ hns3_dcb_info_cfg(struct hns3_adapter *hns)
+ hw->dcb_info.pg_info[0].pg_id = 0;
+ hw->dcb_info.pg_info[0].pg_sch_mode = HNS3_SCH_MODE_DWRR;
+ hw->dcb_info.pg_info[0].bw_limit = hw->max_tm_rate;
+- hw->dcb_info.pg_info[0].tc_bit_map = hw->hw_tc_map;
++ hw->dcb_info.pg_info[0].tc_bit_map = hw->dcb_info.hw_tc_map;
+
+ /* Each tc has same bw for valid tc by default */
+ tc_bw = BW_MAX_PERCENT / hw->dcb_info.num_tc;
+@@ -1482,7 +1481,7 @@ hns3_dcb_info_update(struct hns3_adapter *hns, uint8_t num_tc)
+ bit_map = 1;
+ hw->dcb_info.num_tc = 1;
+ }
+- hw->hw_tc_map = bit_map;
++ hw->dcb_info.hw_tc_map = bit_map;
+
+ return hns3_dcb_info_cfg(hns);
+ }
+diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
+index 035ebb1..8c4f38c 100644
+--- a/drivers/net/hns3/hns3_ethdev.c
++++ b/drivers/net/hns3/hns3_ethdev.c
+@@ -1876,7 +1876,6 @@ hns3_check_mq_mode(struct rte_eth_dev *dev)
+ enum rte_eth_rx_mq_mode rx_mq_mode = dev->data->dev_conf.rxmode.mq_mode;
+ enum rte_eth_tx_mq_mode tx_mq_mode = dev->data->dev_conf.txmode.mq_mode;
+ struct hns3_hw *hw = HNS3_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+- struct hns3_pf *pf = HNS3_DEV_PRIVATE_TO_PF(dev->data->dev_private);
+ struct rte_eth_dcb_rx_conf *dcb_rx_conf;
+ struct rte_eth_dcb_tx_conf *dcb_tx_conf;
+ uint8_t num_tc;
+@@ -1894,9 +1893,9 @@ hns3_check_mq_mode(struct rte_eth_dev *dev)
+ dcb_rx_conf = &dev->data->dev_conf.rx_adv_conf.dcb_rx_conf;
+ dcb_tx_conf = &dev->data->dev_conf.tx_adv_conf.dcb_tx_conf;
+ if ((uint32_t)rx_mq_mode & RTE_ETH_MQ_RX_DCB_FLAG) {
+- if (dcb_rx_conf->nb_tcs > pf->tc_max) {
++ if (dcb_rx_conf->nb_tcs > hw->dcb_info.tc_max) {
+ hns3_err(hw, "nb_tcs(%u) > max_tc(%u) driver supported.",
+- dcb_rx_conf->nb_tcs, pf->tc_max);
++ dcb_rx_conf->nb_tcs, hw->dcb_info.tc_max);
+ return -EINVAL;
+ }
+
+@@ -2837,25 +2836,25 @@ hns3_get_board_configuration(struct hns3_hw *hw)
+ return ret;
+ }
+
+- pf->tc_max = cfg.tc_num;
+- if (pf->tc_max > HNS3_MAX_TC_NUM || pf->tc_max < 1) {
++ hw->dcb_info.tc_max = cfg.tc_num;
++ if (hw->dcb_info.tc_max > HNS3_MAX_TC_NUM || hw->dcb_info.tc_max < 1) {
+ PMD_INIT_LOG(WARNING,
+ "Get TC num(%u) from flash, set TC num to 1",
+- pf->tc_max);
+- pf->tc_max = 1;
++ hw->dcb_info.tc_max);
++ hw->dcb_info.tc_max = 1;
+ }
+
+ /* Dev does not support DCB */
+ if (!hns3_dev_get_support(hw, DCB)) {
+- pf->tc_max = 1;
+- pf->pfc_max = 0;
++ hw->dcb_info.tc_max = 1;
++ hw->dcb_info.pfc_max = 0;
+ } else
+- pf->pfc_max = pf->tc_max;
++ hw->dcb_info.pfc_max = hw->dcb_info.tc_max;
+
+ hw->dcb_info.num_tc = 1;
+ hw->alloc_rss_size = RTE_MIN(hw->rss_size_max,
+ hw->tqps_num / hw->dcb_info.num_tc);
+- hns3_set_bit(hw->hw_tc_map, 0, 1);
++ hns3_set_bit(hw->dcb_info.hw_tc_map, 0, 1);
+ pf->tx_sch_mode = HNS3_FLAG_TC_BASE_SCH_MODE;
+
+ pf->wanted_umv_size = cfg.umv_space;
+@@ -3025,7 +3024,7 @@ hns3_tx_buffer_calc(struct hns3_hw *hw, struct hns3_pkt_buf_alloc *buf_alloc)
+ for (i = 0; i < HNS3_MAX_TC_NUM; i++) {
+ priv = &buf_alloc->priv_buf[i];
+
+- if (hw->hw_tc_map & BIT(i)) {
++ if (hw->dcb_info.hw_tc_map & BIT(i)) {
+ if (total_size < pf->tx_buf_size)
+ return -ENOMEM;
+
+@@ -3076,7 +3075,7 @@ hns3_get_tc_num(struct hns3_hw *hw)
+ uint8_t i;
+
+ for (i = 0; i < HNS3_MAX_TC_NUM; i++)
+- if (hw->hw_tc_map & BIT(i))
++ if (hw->dcb_info.hw_tc_map & BIT(i))
+ cnt++;
+ return cnt;
+ }
+@@ -3136,7 +3135,7 @@ hns3_get_no_pfc_priv_num(struct hns3_hw *hw,
+
+ for (i = 0; i < HNS3_MAX_TC_NUM; i++) {
+ priv = &buf_alloc->priv_buf[i];
+- if (hw->hw_tc_map & BIT(i) &&
++ if (hw->dcb_info.hw_tc_map & BIT(i) &&
+ !(hw->dcb_info.hw_pfc_map & BIT(i)) && priv->enable)
+ cnt++;
+ }
+@@ -3235,7 +3234,7 @@ hns3_rx_buf_calc_all(struct hns3_hw *hw, bool max,
+ priv->wl.high = 0;
+ priv->buf_size = 0;
+
+- if (!(hw->hw_tc_map & BIT(i)))
++ if (!(hw->dcb_info.hw_tc_map & BIT(i)))
+ continue;
+
+ priv->enable = 1;
+@@ -3274,7 +3273,7 @@ hns3_drop_nopfc_buf_till_fit(struct hns3_hw *hw,
+ for (i = HNS3_MAX_TC_NUM - 1; i >= 0; i--) {
+ priv = &buf_alloc->priv_buf[i];
+ mask = BIT((uint8_t)i);
+- if (hw->hw_tc_map & mask &&
++ if (hw->dcb_info.hw_tc_map & mask &&
+ !(hw->dcb_info.hw_pfc_map & mask)) {
+ /* Clear the no pfc TC private buffer */
+ priv->wl.low = 0;
+@@ -3311,7 +3310,7 @@ hns3_drop_pfc_buf_till_fit(struct hns3_hw *hw,
+ for (i = HNS3_MAX_TC_NUM - 1; i >= 0; i--) {
+ priv = &buf_alloc->priv_buf[i];
+ mask = BIT((uint8_t)i);
+- if (hw->hw_tc_map & mask && hw->dcb_info.hw_pfc_map & mask) {
++ if (hw->dcb_info.hw_tc_map & mask && hw->dcb_info.hw_pfc_map & mask) {
+ /* Reduce the number of pfc TC with private buffer */
+ priv->wl.low = 0;
+ priv->enable = 0;
+@@ -3369,7 +3368,7 @@ hns3_only_alloc_priv_buff(struct hns3_hw *hw,
+ priv->wl.high = 0;
+ priv->buf_size = 0;
+
+- if (!(hw->hw_tc_map & BIT(i)))
++ if (!(hw->dcb_info.hw_tc_map & BIT(i)))
+ continue;
+
+ priv->enable = 1;
+@@ -5502,13 +5501,12 @@ static int
+ hns3_get_dcb_info(struct rte_eth_dev *dev, struct rte_eth_dcb_info *dcb_info)
+ {
+ struct hns3_hw *hw = HNS3_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+- struct hns3_pf *pf = HNS3_DEV_PRIVATE_TO_PF(dev->data->dev_private);
+ enum rte_eth_rx_mq_mode mq_mode = dev->data->dev_conf.rxmode.mq_mode;
+ int i;
+
+ rte_spinlock_lock(&hw->lock);
+ if ((uint32_t)mq_mode & RTE_ETH_MQ_RX_DCB_FLAG)
+- dcb_info->nb_tcs = pf->local_max_tc;
++ dcb_info->nb_tcs = hw->dcb_info.local_max_tc;
+ else
+ dcb_info->nb_tcs = 1;
+
+diff --git a/drivers/net/hns3/hns3_ethdev.h b/drivers/net/hns3/hns3_ethdev.h
+index 532d44b..ef39d81 100644
+--- a/drivers/net/hns3/hns3_ethdev.h
++++ b/drivers/net/hns3/hns3_ethdev.h
+@@ -133,7 +133,11 @@ struct hns3_tc_info {
+ };
+
+ struct hns3_dcb_info {
++ uint8_t tc_max; /* max number of tc driver supported */
+ uint8_t num_tc; /* Total number of enabled TCs */
++ uint8_t hw_tc_map;
++ uint8_t local_max_tc; /* max number of local tc */
++ uint8_t pfc_max;
+ uint8_t num_pg; /* It must be 1 if vNET-Base schd */
+ uint8_t pg_dwrr[HNS3_PG_NUM];
+ uint8_t prio_tc[HNS3_MAX_USER_PRIO];
+@@ -537,7 +541,6 @@ struct hns3_hw {
+ uint16_t rss_ind_tbl_size;
+ uint16_t rss_key_size;
+
+- uint8_t hw_tc_map;
+ enum hns3_fc_mode requested_fc_mode; /* FC mode requested by user */
+ struct hns3_dcb_info dcb_info;
+ enum hns3_fc_status current_fc_status; /* current flow control status */
+@@ -833,9 +836,6 @@ struct hns3_pf {
+ uint16_t mps; /* Max packet size */
+
+ uint8_t tx_sch_mode;
+- uint8_t tc_max; /* max number of tc driver supported */
+- uint8_t local_max_tc; /* max number of local tc */
+- uint8_t pfc_max;
+ uint16_t pause_time;
+ bool support_fc_autoneg; /* support FC autonegotiate */
+ bool support_multi_tc_pause;
+diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
+index 3e0bb9d..b713327 100644
+--- a/drivers/net/hns3/hns3_ethdev_vf.c
++++ b/drivers/net/hns3/hns3_ethdev_vf.c
+@@ -830,7 +830,7 @@ hns3vf_get_num_tc(struct hns3_hw *hw)
+ uint32_t i;
+
+ for (i = 0; i < HNS3_MAX_TC_NUM; i++) {
+- if (hw->hw_tc_map & BIT(i))
++ if (hw->dcb_info.hw_tc_map & BIT(i))
+ num_tc++;
+ }
+ return num_tc;
+@@ -853,7 +853,7 @@ hns3vf_get_basic_info(struct hns3_hw *hw)
+ }
+
+ basic_info = (struct hns3_basic_info *)resp_msg;
+- hw->hw_tc_map = basic_info->hw_tc_map;
++ hw->dcb_info.hw_tc_map = basic_info->hw_tc_map;
+ hw->dcb_info.num_tc = hns3vf_get_num_tc(hw);
+ hw->pf_vf_if_version = basic_info->pf_vf_if_version;
+ hns3vf_update_caps(hw, basic_info->caps);
+diff --git a/drivers/net/hns3/hns3_rss.c b/drivers/net/hns3/hns3_rss.c
+index 3eae4ca..508b3e2 100644
+--- a/drivers/net/hns3/hns3_rss.c
++++ b/drivers/net/hns3/hns3_rss.c
+@@ -940,13 +940,13 @@ hns3_set_rss_tc_mode_entry(struct hns3_hw *hw, uint8_t *tc_valid,
+ * has to enable the unused TC by using TC0 queue
+ * mapping configuration.
+ */
+- tc_valid[i] = (hw->hw_tc_map & BIT(i)) ?
+- !!(hw->hw_tc_map & BIT(i)) : 1;
++ tc_valid[i] = (hw->dcb_info.hw_tc_map & BIT(i)) ?
++ !!(hw->dcb_info.hw_tc_map & BIT(i)) : 1;
+ tc_size[i] = roundup_size;
+- tc_offset[i] = (hw->hw_tc_map & BIT(i)) ?
++ tc_offset[i] = (hw->dcb_info.hw_tc_map & BIT(i)) ?
+ rss_size * i : 0;
+ } else {
+- tc_valid[i] = !!(hw->hw_tc_map & BIT(i));
++ tc_valid[i] = !!(hw->dcb_info.hw_tc_map & BIT(i));
+ tc_size[i] = tc_valid[i] ? roundup_size : 0;
+ tc_offset[i] = tc_valid[i] ? rss_size * i : 0;
+ }
+--
+2.33.0
+
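After the refactor above, the VF derives `dcb_info.num_tc` by counting the set bits in `dcb_info.hw_tc_map` (see `hns3vf_get_num_tc`). The popcount loop can be sketched independently of the driver; this is an illustrative equivalent, not the driver's code:

```c
#include <stdint.h>

/* Count enabled TCs in an 8-bit TC bitmap (one bit per TC), as the
 * driver does when deriving num_tc from hw_tc_map. */
static uint8_t
count_enabled_tcs(uint8_t hw_tc_map)
{
	uint8_t num_tc = 0;

	while (hw_tc_map != 0) {
		num_tc += hw_tc_map & 1u;
		hw_tc_map >>= 1;
	}
	return num_tc;
}
```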
diff --git a/0120-net-hns3-parse-max-TC-number-for-VF.patch b/0120-net-hns3-parse-max-TC-number-for-VF.patch
new file mode 100644
index 0000000..9d590f4
--- /dev/null
+++ b/0120-net-hns3-parse-max-TC-number-for-VF.patch
@@ -0,0 +1,73 @@
+From ae67f91f9ca6deb981d595000e06936b966e710d Mon Sep 17 00:00:00 2001
+From: Chengwen Feng <fengchengwen(a)huawei.com>
+Date: Tue, 1 Jul 2025 17:10:02 +0800
+Subject: [PATCH 07/24] net/hns3: parse max TC number for VF
+
+[ upstream commit 4bbf4f689cd029dac9fdf0e5e6dc63dc15be4629 ]
+
+The mailbox message HNS3_MBX_GET_BASIC_INFO can obtain the maximum
+number of TCs of the device. Since the VF did not support multiple
+TCs, this field was not saved.
+
+Now that the VF needs to support multiple TCs, this field must be
+saved.
+
+This commit also supports dumping the TC info.
+
+Signed-off-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Signed-off-by: Dengdui Huang <huangdengdui(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ drivers/net/hns3/hns3_dump.c | 2 ++
+ drivers/net/hns3/hns3_ethdev_vf.c | 1 +
+ drivers/net/hns3/hns3_mbx.h | 2 +-
+ 3 files changed, 4 insertions(+), 1 deletion(-)
+
+diff --git a/drivers/net/hns3/hns3_dump.c b/drivers/net/hns3/hns3_dump.c
+index 947526e..16e7db7 100644
+--- a/drivers/net/hns3/hns3_dump.c
++++ b/drivers/net/hns3/hns3_dump.c
+@@ -209,6 +209,7 @@ hns3_get_device_basic_info(FILE *file, struct rte_eth_dev *dev)
+ " - Device Base Info:\n"
+ "\t -- name: %s\n"
+ "\t -- adapter_state=%s\n"
++ "\t -- tc_max=%u tc_num=%u\n"
+ "\t -- nb_rx_queues=%u nb_tx_queues=%u\n"
+ "\t -- total_tqps_num=%u tqps_num=%u intr_tqps_num=%u\n"
+ "\t -- rss_size_max=%u alloc_rss_size=%u tx_qnum_per_tc=%u\n"
+@@ -221,6 +222,7 @@ hns3_get_device_basic_info(FILE *file, struct rte_eth_dev *dev)
+ "\t -- intr_conf: lsc=%u rxq=%u\n",
+ dev->data->name,
+ hns3_get_adapter_state_name(hw->adapter_state),
++ hw->dcb_info.tc_max, hw->dcb_info.num_tc,
+ dev->data->nb_rx_queues, dev->data->nb_tx_queues,
+ hw->total_tqps_num, hw->tqps_num, hw->intr_tqps_num,
+ hw->rss_size_max, hw->alloc_rss_size, hw->tx_qnum_per_tc,
+diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
+index b713327..f06e06f 100644
+--- a/drivers/net/hns3/hns3_ethdev_vf.c
++++ b/drivers/net/hns3/hns3_ethdev_vf.c
+@@ -853,6 +853,7 @@ hns3vf_get_basic_info(struct hns3_hw *hw)
+ }
+
+ basic_info = (struct hns3_basic_info *)resp_msg;
++ hw->dcb_info.tc_max = basic_info->tc_max;
+ hw->dcb_info.hw_tc_map = basic_info->hw_tc_map;
+ hw->dcb_info.num_tc = hns3vf_get_num_tc(hw);
+ hw->pf_vf_if_version = basic_info->pf_vf_if_version;
+diff --git a/drivers/net/hns3/hns3_mbx.h b/drivers/net/hns3/hns3_mbx.h
+index 2b6cb8f..705e776 100644
+--- a/drivers/net/hns3/hns3_mbx.h
++++ b/drivers/net/hns3/hns3_mbx.h
+@@ -53,7 +53,7 @@ enum HNS3_MBX_OPCODE {
+
+ struct hns3_basic_info {
+ uint8_t hw_tc_map;
+- uint8_t rsv;
++ uint8_t tc_max;
+ uint16_t pf_vf_if_version;
+ /* capabilities of VF dependent on PF */
+ uint32_t caps;
+--
+2.33.0
+
diff --git a/0121-net-hns3-support-multi-TCs-capability-for-VF.patch b/0121-net-hns3-support-multi-TCs-capability-for-VF.patch
new file mode 100644
index 0000000..1c3b400
--- /dev/null
+++ b/0121-net-hns3-support-multi-TCs-capability-for-VF.patch
@@ -0,0 +1,172 @@
+From 6c1d20c2842ccecc2174c0d668772b5ec4e2128c Mon Sep 17 00:00:00 2001
+From: Chengwen Feng <fengchengwen(a)huawei.com>
+Date: Tue, 1 Jul 2025 17:10:03 +0800
+Subject: [PATCH 08/24] net/hns3: support multi-TCs capability for VF
+
+[ upstream commit 95dc6d361143508077e3f3635c170d69126f8faa ]
+
+The VF multi-TC feature depends on the firmware and the PF driver;
+the capability is set only when:
+1) The firmware reports the VF multi-TC flag.
+2) The PF driver reports the VF multi-TC flag.
+3) The PF driver supports the query-TC-info mailbox message.
+
+Signed-off-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Signed-off-by: Dengdui Huang <huangdengdui(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ drivers/net/hns3/hns3_cmd.c | 5 ++++-
+ drivers/net/hns3/hns3_cmd.h | 2 ++
+ drivers/net/hns3/hns3_dump.c | 3 ++-
+ drivers/net/hns3/hns3_ethdev.h | 1 +
+ drivers/net/hns3/hns3_ethdev_vf.c | 33 +++++++++++++++++++++++++++++++
+ drivers/net/hns3/hns3_mbx.h | 7 +++++++
+ 6 files changed, 49 insertions(+), 2 deletions(-)
+
+diff --git a/drivers/net/hns3/hns3_cmd.c b/drivers/net/hns3/hns3_cmd.c
+index 62da6dd..6955afb 100644
+--- a/drivers/net/hns3/hns3_cmd.c
++++ b/drivers/net/hns3/hns3_cmd.c
+@@ -482,7 +482,8 @@ static void
+ hns3_parse_capability(struct hns3_hw *hw,
+ struct hns3_query_version_cmd *cmd)
+ {
+- uint32_t caps = rte_le_to_cpu_32(cmd->caps[0]);
++ uint64_t caps = ((uint64_t)rte_le_to_cpu_32(cmd->caps[1]) << 32) |
++ rte_le_to_cpu_32(cmd->caps[0]);
+
+ if (hns3_get_bit(caps, HNS3_CAPS_FD_QUEUE_REGION_B))
+ hns3_set_bit(hw->capability, HNS3_DEV_SUPPORT_FD_QUEUE_REGION_B,
+@@ -524,6 +525,8 @@ hns3_parse_capability(struct hns3_hw *hw,
+ hns3_set_bit(hw->capability, HNS3_DEV_SUPPORT_FC_AUTO_B, 1);
+ if (hns3_get_bit(caps, HNS3_CAPS_GRO_B))
+ hns3_set_bit(hw->capability, HNS3_DEV_SUPPORT_GRO_B, 1);
++ if (hns3_get_bit(caps, HNS3_CAPS_VF_MULTI_TCS_B))
++ hns3_set_bit(hw->capability, HNS3_DEV_SUPPORT_VF_MULTI_TCS_B, 1);
+ }
+
+ static uint32_t
+diff --git a/drivers/net/hns3/hns3_cmd.h b/drivers/net/hns3/hns3_cmd.h
+index 4d707c1..86169f5 100644
+--- a/drivers/net/hns3/hns3_cmd.h
++++ b/drivers/net/hns3/hns3_cmd.h
+@@ -325,6 +325,7 @@ enum HNS3_CAPS_BITS {
+ HNS3_CAPS_TM_B = 19,
+ HNS3_CAPS_GRO_B = 20,
+ HNS3_CAPS_FC_AUTO_B = 30,
++ HNS3_CAPS_VF_MULTI_TCS_B = 34,
+ };
+
+ /* Capabilities of VF dependent on the PF */
+@@ -334,6 +335,7 @@ enum HNS3VF_CAPS_BITS {
+ * in kernel side PF.
+ */
+ HNS3VF_CAPS_VLAN_FLT_MOD_B = 0,
++ HNS3VF_CAPS_MULTI_TCS_B = 1,
+ };
+
+ enum HNS3_API_CAP_BITS {
+diff --git a/drivers/net/hns3/hns3_dump.c b/drivers/net/hns3/hns3_dump.c
+index 16e7db7..c8da7e1 100644
+--- a/drivers/net/hns3/hns3_dump.c
++++ b/drivers/net/hns3/hns3_dump.c
+@@ -105,7 +105,8 @@ hns3_get_dev_feature_capability(FILE *file, struct hns3_hw *hw)
+ {HNS3_DEV_SUPPORT_TM_B, "TM"},
+ {HNS3_DEV_SUPPORT_VF_VLAN_FLT_MOD_B, "VF VLAN FILTER MOD"},
+ {HNS3_DEV_SUPPORT_FC_AUTO_B, "FC AUTO"},
+- {HNS3_DEV_SUPPORT_GRO_B, "GRO"}
++ {HNS3_DEV_SUPPORT_GRO_B, "GRO"},
++ {HNS3_DEV_SUPPORT_VF_MULTI_TCS_B, "VF MULTI TCS"},
+ };
+ uint32_t i;
+
+diff --git a/drivers/net/hns3/hns3_ethdev.h b/drivers/net/hns3/hns3_ethdev.h
+index ef39d81..c8acd28 100644
+--- a/drivers/net/hns3/hns3_ethdev.h
++++ b/drivers/net/hns3/hns3_ethdev.h
+@@ -918,6 +918,7 @@ enum hns3_dev_cap {
+ HNS3_DEV_SUPPORT_VF_VLAN_FLT_MOD_B,
+ HNS3_DEV_SUPPORT_FC_AUTO_B,
+ HNS3_DEV_SUPPORT_GRO_B,
++ HNS3_DEV_SUPPORT_VF_MULTI_TCS_B,
+ };
+
+ #define hns3_dev_get_support(hw, _name) \
+diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
+index f06e06f..52859a8 100644
+--- a/drivers/net/hns3/hns3_ethdev_vf.c
++++ b/drivers/net/hns3/hns3_ethdev_vf.c
+@@ -815,12 +815,45 @@ hns3vf_get_queue_info(struct hns3_hw *hw)
+ return hns3vf_check_tqp_info(hw);
+ }
+
++static void
++hns3vf_update_multi_tcs_cap(struct hns3_hw *hw, uint32_t pf_multi_tcs_bit)
++{
++ uint8_t resp_msg[HNS3_MBX_MAX_RESP_DATA_SIZE];
++ struct hns3_vf_to_pf_msg req;
++ int ret;
++
++ if (!hns3_dev_get_support(hw, VF_MULTI_TCS))
++ return;
++
++ if (pf_multi_tcs_bit == 0) {
++ /*
++	 * Early PF driver versions may not report
++	 * HNS3VF_CAPS_MULTI_TCS_B when the VF queries basic info, so
++	 * clear the corresponding capability bit.
++ */
++ hns3_set_bit(hw->capability, HNS3_DEV_SUPPORT_VF_MULTI_TCS_B, 0);
++ return;
++ }
++
++ /*
++	 * Early PF driver versions may also report HNS3VF_CAPS_MULTI_TCS_B
++	 * when the VF queries basic info, but they don't support the query
++	 * TC info mailbox message, so clear the corresponding capability bit.
++ */
++ hns3vf_mbx_setup(&req, HNS3_MBX_GET_TC, HNS3_MBX_GET_PRIO_MAP);
++ ret = hns3vf_mbx_send(hw, &req, true, resp_msg, sizeof(resp_msg));
++ if (ret)
++ hns3_set_bit(hw->capability, HNS3_DEV_SUPPORT_VF_MULTI_TCS_B, 0);
++}
++
+ static void
+ hns3vf_update_caps(struct hns3_hw *hw, uint32_t caps)
+ {
+ if (hns3_get_bit(caps, HNS3VF_CAPS_VLAN_FLT_MOD_B))
+ hns3_set_bit(hw->capability,
+ HNS3_DEV_SUPPORT_VF_VLAN_FLT_MOD_B, 1);
++
++ hns3vf_update_multi_tcs_cap(hw, hns3_get_bit(caps, HNS3VF_CAPS_MULTI_TCS_B));
+ }
+
+ static int
+diff --git a/drivers/net/hns3/hns3_mbx.h b/drivers/net/hns3/hns3_mbx.h
+index 705e776..97fbc4c 100644
+--- a/drivers/net/hns3/hns3_mbx.h
++++ b/drivers/net/hns3/hns3_mbx.h
+@@ -48,6 +48,9 @@ enum HNS3_MBX_OPCODE {
+
+ HNS3_MBX_HANDLE_VF_TBL = 38, /* (VF -> PF) store/clear hw cfg tbl */
+ HNS3_MBX_GET_RING_VECTOR_MAP, /* (VF -> PF) get ring-to-vector map */
++
++ HNS3_MBX_GET_TC = 47, /* (VF -> PF) get tc info of PF configured */
++
+ HNS3_MBX_PUSH_LINK_STATUS = 201, /* (IMP -> PF) get port link status */
+ };
+
+@@ -59,6 +62,10 @@ struct hns3_basic_info {
+ uint32_t caps;
+ };
+
++enum hns3_mbx_get_tc_subcode {
++ HNS3_MBX_GET_PRIO_MAP = 0, /* query priority to tc map */
++};
++
+ /* below are per-VF mac-vlan subcodes */
+ enum hns3_mbx_mac_vlan_subcode {
+ HNS3_MBX_MAC_VLAN_UC_MODIFY = 0, /* modify UC mac addr */
+--
+2.33.0
+
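The `hns3_parse_capability` change above widens the caps word to 64 bits so that capability bits at position 32 and above, such as HNS3_CAPS_VF_MULTI_TCS_B (bit 34), become reachable. The assembly of the two 32-bit words can be sketched in plain C (the real driver additionally applies `rte_le_to_cpu_32` to each word first, which this sketch omits):

```c
#include <stdint.h>

/* Combine the two 32-bit capability words into one 64-bit field so
 * capability bits >= 32 (e.g. bit 34 for VF multi-TC) can be tested. */
static uint64_t
caps_to_u64(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Test a single capability bit in the combined 64-bit field. */
static int
cap_bit_set(uint64_t caps, unsigned int bit)
{
	return (int)((caps >> bit) & 1u);
}
```

Bit 34 of the combined field corresponds to bit 2 of the high word, which is why the 32-bit parsing could never see it.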
diff --git a/0122-net-hns3-fix-queue-TC-configuration-on-VF.patch b/0122-net-hns3-fix-queue-TC-configuration-on-VF.patch
new file mode 100644
index 0000000..c25930f
--- /dev/null
+++ b/0122-net-hns3-fix-queue-TC-configuration-on-VF.patch
@@ -0,0 +1,109 @@
+From 290aef514a1d17ad7e99b73f98539995caf0c1b3 Mon Sep 17 00:00:00 2001
+From: Chengwen Feng <fengchengwen(a)huawei.com>
+Date: Tue, 1 Jul 2025 17:09:59 +0800
+Subject: [PATCH 09/24] net/hns3: fix queue TC configuration on VF
+
+[ upstream commit a542f48bc0ec83c296ae01ad691479c17caf99b5 ]
+
+The VF cannot configure the queue-to-TC mapping by directly writing
+the register. Instead, the mapping must be modified with a firmware
+command.
+
+Fixes: bba636698316 ("net/hns3: support Rx/Tx and related operations")
+Cc: stable(a)dpdk.org
+
+Signed-off-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Signed-off-by: Dengdui Huang <huangdengdui(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ drivers/net/hns3/hns3_cmd.h | 8 ++++++++
+ drivers/net/hns3/hns3_rxtx.c | 26 +++++++++++++++++++++-----
+ 2 files changed, 29 insertions(+), 5 deletions(-)
+
+diff --git a/drivers/net/hns3/hns3_cmd.h b/drivers/net/hns3/hns3_cmd.h
+index 86169f5..2a2ec15 100644
+--- a/drivers/net/hns3/hns3_cmd.h
++++ b/drivers/net/hns3/hns3_cmd.h
+@@ -178,6 +178,7 @@ enum hns3_opcode_type {
+
+ /* TQP commands */
+ HNS3_OPC_QUERY_TX_STATUS = 0x0B03,
++ HNS3_OPC_TQP_TX_QUEUE_TC = 0x0B04,
+ HNS3_OPC_QUERY_RX_STATUS = 0x0B13,
+ HNS3_OPC_CFG_COM_TQP_QUEUE = 0x0B20,
+ HNS3_OPC_RESET_TQP_QUEUE = 0x0B22,
+@@ -972,6 +973,13 @@ struct hns3_reset_tqp_queue_cmd {
+ uint8_t rsv[19];
+ };
+
++struct hns3vf_tx_ring_tc_cmd {
++ uint16_t tqp_id;
++ uint16_t rsv1;
++ uint8_t tc_id;
++ uint8_t rsv2[19];
++};
++
+ #define HNS3_CFG_RESET_MAC_B 3
+ #define HNS3_CFG_RESET_FUNC_B 7
+ #define HNS3_CFG_RESET_RCB_B 1
+diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
+index 393512b..6beb91c 100644
+--- a/drivers/net/hns3/hns3_rxtx.c
++++ b/drivers/net/hns3/hns3_rxtx.c
+@@ -1176,12 +1176,14 @@ hns3_init_txq(struct hns3_tx_queue *txq)
+ hns3_init_tx_queue_hw(txq);
+ }
+
+-static void
++static int
+ hns3_init_tx_ring_tc(struct hns3_adapter *hns)
+ {
++ struct hns3_cmd_desc desc;
++ struct hns3vf_tx_ring_tc_cmd *req = (struct hns3vf_tx_ring_tc_cmd *)desc.data;
+ struct hns3_hw *hw = &hns->hw;
+ struct hns3_tx_queue *txq;
+- int i, num;
++ int i, num, ret;
+
+ for (i = 0; i < HNS3_MAX_TC_NUM; i++) {
+ struct hns3_tc_queue_info *tc_queue = &hw->tc_queue[i];
+@@ -1196,9 +1198,24 @@ hns3_init_tx_ring_tc(struct hns3_adapter *hns)
+ if (txq == NULL)
+ continue;
+
+- hns3_write_dev(txq, HNS3_RING_TX_TC_REG, tc_queue->tc);
++ if (!hns->is_vf) {
++ hns3_write_dev(txq, HNS3_RING_TX_TC_REG, tc_queue->tc);
++ continue;
++ }
++
++ hns3_cmd_setup_basic_desc(&desc, HNS3_OPC_TQP_TX_QUEUE_TC, false);
++ req->tqp_id = rte_cpu_to_le_16(num);
++ req->tc_id = tc_queue->tc;
++ ret = hns3_cmd_send(hw, &desc, 1);
++ if (ret != 0) {
++ hns3_err(hw, "config Tx queue (%u)'s TC failed! ret = %d.",
++ num, ret);
++ return ret;
++ }
+ }
+ }
++
++ return 0;
+ }
+
+ static int
+@@ -1274,9 +1291,8 @@ hns3_init_tx_queues(struct hns3_adapter *hns)
+ txq = (struct hns3_tx_queue *)hw->fkq_data.tx_queues[i];
+ hns3_init_txq(txq);
+ }
+- hns3_init_tx_ring_tc(hns);
+
+- return 0;
++ return hns3_init_tx_ring_tc(hns);
+ }
+
+ /*
+--
+2.33.0
+
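The loop in hns3_init_tx_ring_tc() above walks each TC's tqp_offset/tqp_count window to find the queues belonging to it. The inverse lookup, i.e. which TC a given queue maps to, can be sketched as follows (hypothetical helper using a simplified struct rather than the driver's hns3_tc_queue_info):

```c
#include <stdint.h>

struct tc_queue_info {
	uint16_t tqp_offset; /* first queue of this TC */
	uint16_t tqp_count;  /* number of queues in this TC */
};

/* Return the TC index owning queue `qid`, or -1 if unmapped. */
static int queue_to_tc(const struct tc_queue_info *tcs, int num_tc,
		       uint16_t qid)
{
	for (int i = 0; i < num_tc; i++) {
		if (qid >= tcs[i].tqp_offset &&
		    qid < (uint16_t)(tcs[i].tqp_offset + tcs[i].tqp_count))
			return i;
	}
	return -1;
}
```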
diff --git a/0123-net-hns3-support-multi-TCs-configuration-for-VF.patch b/0123-net-hns3-support-multi-TCs-configuration-for-VF.patch
new file mode 100644
index 0000000..67a83f7
--- /dev/null
+++ b/0123-net-hns3-support-multi-TCs-configuration-for-VF.patch
@@ -0,0 +1,681 @@
+From 04c5d5addc5f94134ea729d9d3e07a1b0185fdf7 Mon Sep 17 00:00:00 2001
+From: Chengwen Feng <fengchengwen(a)huawei.com>
+Date: Tue, 1 Jul 2025 17:10:04 +0800
+Subject: [PATCH 10/24] net/hns3: support multi-TCs configuration for VF
+
+[ upstream commit fd89a25eb8112e0a6ff821a8f19e92b9d95082bc ]
+
+If the VF has the multi-TC capability, applications can configure
+multiple TCs through the DCB interface. Because the VF does not have
+its own ETS and PFC components, the constraints are as follows:
+
+1. The DCB configuration (struct rte_eth_dcb_rx_conf and
+ rte_eth_dcb_tx_conf) must be the same as that of the PF.
+2. VF does not support RTE_ETH_DCB_PFC_SUPPORT configuration.
+
+Signed-off-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Signed-off-by: Dengdui Huang <huangdengdui(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ drivers/net/hns3/hns3_dcb.c | 106 ++++++++++++
+ drivers/net/hns3/hns3_dcb.h | 4 +
+ drivers/net/hns3/hns3_dump.c | 6 +-
+ drivers/net/hns3/hns3_ethdev.c | 98 +-----------
+ drivers/net/hns3/hns3_ethdev_vf.c | 257 ++++++++++++++++++++++++++++--
+ drivers/net/hns3/hns3_mbx.h | 39 +++++
+ 6 files changed, 402 insertions(+), 108 deletions(-)
+
+diff --git a/drivers/net/hns3/hns3_dcb.c b/drivers/net/hns3/hns3_dcb.c
+index 76f597e..c1a8542 100644
+--- a/drivers/net/hns3/hns3_dcb.c
++++ b/drivers/net/hns3/hns3_dcb.c
+@@ -1800,3 +1800,109 @@ hns3_fc_enable(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf)
+
+ return ret;
+ }
++
++int
++hns3_get_dcb_info(struct rte_eth_dev *dev, struct rte_eth_dcb_info *dcb_info)
++{
++ struct hns3_hw *hw = HNS3_DEV_PRIVATE_TO_HW(dev->data->dev_private);
++ enum rte_eth_rx_mq_mode mq_mode = dev->data->dev_conf.rxmode.mq_mode;
++ struct hns3_adapter *hns = HNS3_DEV_HW_TO_ADAPTER(hw);
++ int i;
++
++ if (hns->is_vf && !hns3_dev_get_support(hw, VF_MULTI_TCS))
++ return -ENOTSUP;
++
++ rte_spinlock_lock(&hw->lock);
++ if ((uint32_t)mq_mode & RTE_ETH_MQ_RX_DCB_FLAG)
++ dcb_info->nb_tcs = hw->dcb_info.local_max_tc;
++ else
++ dcb_info->nb_tcs = 1;
++
++ for (i = 0; i < HNS3_MAX_USER_PRIO; i++)
++ dcb_info->prio_tc[i] = hw->dcb_info.prio_tc[i];
++ for (i = 0; i < dcb_info->nb_tcs; i++)
++ dcb_info->tc_bws[i] = hw->dcb_info.pg_info[0].tc_dwrr[i];
++
++ for (i = 0; i < hw->dcb_info.num_tc; i++) {
++ dcb_info->tc_queue.tc_rxq[0][i].base = hw->alloc_rss_size * i;
++ dcb_info->tc_queue.tc_txq[0][i].base =
++ hw->tc_queue[i].tqp_offset;
++ dcb_info->tc_queue.tc_rxq[0][i].nb_queue = hw->alloc_rss_size;
++ dcb_info->tc_queue.tc_txq[0][i].nb_queue =
++ hw->tc_queue[i].tqp_count;
++ }
++ rte_spinlock_unlock(&hw->lock);
++
++ return 0;
++}
++
++int
++hns3_check_dev_mq_mode(struct rte_eth_dev *dev)
++{
++ enum rte_eth_rx_mq_mode rx_mq_mode = dev->data->dev_conf.rxmode.mq_mode;
++ enum rte_eth_tx_mq_mode tx_mq_mode = dev->data->dev_conf.txmode.mq_mode;
++ struct hns3_hw *hw = HNS3_DEV_PRIVATE_TO_HW(dev->data->dev_private);
++ struct hns3_adapter *hns = HNS3_DEV_HW_TO_ADAPTER(hw);
++ struct rte_eth_dcb_rx_conf *dcb_rx_conf;
++ struct rte_eth_dcb_tx_conf *dcb_tx_conf;
++ uint8_t num_tc;
++ int max_tc = 0;
++ int i;
++
++ if (((uint32_t)rx_mq_mode & RTE_ETH_MQ_RX_VMDQ_FLAG) ||
++ (tx_mq_mode == RTE_ETH_MQ_TX_VMDQ_DCB ||
++ tx_mq_mode == RTE_ETH_MQ_TX_VMDQ_ONLY)) {
++ hns3_err(hw, "VMDQ is not supported, rx_mq_mode = %d, tx_mq_mode = %d.",
++ rx_mq_mode, tx_mq_mode);
++ return -EOPNOTSUPP;
++ }
++
++ dcb_rx_conf = &dev->data->dev_conf.rx_adv_conf.dcb_rx_conf;
++ dcb_tx_conf = &dev->data->dev_conf.tx_adv_conf.dcb_tx_conf;
++ if ((uint32_t)rx_mq_mode & RTE_ETH_MQ_RX_DCB_FLAG) {
++ if (dcb_rx_conf->nb_tcs > hw->dcb_info.tc_max) {
++ hns3_err(hw, "nb_tcs(%u) > max_tc(%u) driver supported.",
++ dcb_rx_conf->nb_tcs, hw->dcb_info.tc_max);
++ return -EINVAL;
++ }
++
++ /*
++ * The PF driver supports only four or eight TCs. But the
++ * number of TCs supported by the VF driver is flexible,
++ * therefore, only the number of TCs in the PF is verified.
++ */
++ if (!hns->is_vf && !(dcb_rx_conf->nb_tcs == HNS3_4_TCS ||
++ dcb_rx_conf->nb_tcs == HNS3_8_TCS)) {
++ hns3_err(hw, "on RTE_ETH_MQ_RX_DCB_RSS mode, "
++ "nb_tcs(%d) != %d or %d in rx direction.",
++ dcb_rx_conf->nb_tcs, HNS3_4_TCS, HNS3_8_TCS);
++ return -EINVAL;
++ }
++
++ if (dcb_rx_conf->nb_tcs != dcb_tx_conf->nb_tcs) {
++ hns3_err(hw, "num_tcs(%d) of tx is not equal to rx(%d)",
++ dcb_tx_conf->nb_tcs, dcb_rx_conf->nb_tcs);
++ return -EINVAL;
++ }
++
++ for (i = 0; i < HNS3_MAX_USER_PRIO; i++) {
++ if (dcb_rx_conf->dcb_tc[i] != dcb_tx_conf->dcb_tc[i]) {
++ hns3_err(hw, "dcb_tc[%d] = %u in rx direction, "
++ "is not equal to one in tx direction.",
++ i, dcb_rx_conf->dcb_tc[i]);
++ return -EINVAL;
++ }
++ if (dcb_rx_conf->dcb_tc[i] > max_tc)
++ max_tc = dcb_rx_conf->dcb_tc[i];
++ }
++
++ num_tc = max_tc + 1;
++ if (num_tc > dcb_rx_conf->nb_tcs) {
++ hns3_err(hw, "max num_tc(%u) mapped > nb_tcs(%u)",
++ num_tc, dcb_rx_conf->nb_tcs);
++ return -EINVAL;
++ }
++ }
++
++ return 0;
++}
+diff --git a/drivers/net/hns3/hns3_dcb.h b/drivers/net/hns3/hns3_dcb.h
+index d5bb5ed..552e9c3 100644
+--- a/drivers/net/hns3/hns3_dcb.h
++++ b/drivers/net/hns3/hns3_dcb.h
+@@ -215,4 +215,8 @@ int hns3_update_queue_map_configure(struct hns3_adapter *hns);
+ int hns3_port_shaper_update(struct hns3_hw *hw, uint32_t speed);
+ uint8_t hns3_txq_mapped_tc_get(struct hns3_hw *hw, uint16_t txq_no);
+
++int hns3_get_dcb_info(struct rte_eth_dev *dev, struct rte_eth_dcb_info *dcb_info);
++
++int hns3_check_dev_mq_mode(struct rte_eth_dev *dev);
++
+ #endif /* HNS3_DCB_H */
+diff --git a/drivers/net/hns3/hns3_dump.c b/drivers/net/hns3/hns3_dump.c
+index c8da7e1..5bd1a45 100644
+--- a/drivers/net/hns3/hns3_dump.c
++++ b/drivers/net/hns3/hns3_dump.c
+@@ -210,7 +210,7 @@ hns3_get_device_basic_info(FILE *file, struct rte_eth_dev *dev)
+ " - Device Base Info:\n"
+ "\t -- name: %s\n"
+ "\t -- adapter_state=%s\n"
+- "\t -- tc_max=%u tc_num=%u\n"
++ "\t -- tc_max=%u tc_num=%u dwrr[%u %u %u %u]\n"
+ "\t -- nb_rx_queues=%u nb_tx_queues=%u\n"
+ "\t -- total_tqps_num=%u tqps_num=%u intr_tqps_num=%u\n"
+ "\t -- rss_size_max=%u alloc_rss_size=%u tx_qnum_per_tc=%u\n"
+@@ -224,6 +224,10 @@ hns3_get_device_basic_info(FILE *file, struct rte_eth_dev *dev)
+ dev->data->name,
+ hns3_get_adapter_state_name(hw->adapter_state),
+ hw->dcb_info.tc_max, hw->dcb_info.num_tc,
++ hw->dcb_info.pg_info[0].tc_dwrr[0],
++ hw->dcb_info.pg_info[0].tc_dwrr[1],
++ hw->dcb_info.pg_info[0].tc_dwrr[2],
++ hw->dcb_info.pg_info[0].tc_dwrr[3],
+ dev->data->nb_rx_queues, dev->data->nb_tx_queues,
+ hw->total_tqps_num, hw->tqps_num, hw->intr_tqps_num,
+ hw->rss_size_max, hw->alloc_rss_size, hw->tx_qnum_per_tc,
+diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
+index 8c4f38c..d8cb5ce 100644
+--- a/drivers/net/hns3/hns3_ethdev.c
++++ b/drivers/net/hns3/hns3_ethdev.c
+@@ -1870,71 +1870,6 @@ hns3_remove_mc_mac_addr(struct hns3_hw *hw, struct rte_ether_addr *mac_addr)
+ return ret;
+ }
+
+-static int
+-hns3_check_mq_mode(struct rte_eth_dev *dev)
+-{
+- enum rte_eth_rx_mq_mode rx_mq_mode = dev->data->dev_conf.rxmode.mq_mode;
+- enum rte_eth_tx_mq_mode tx_mq_mode = dev->data->dev_conf.txmode.mq_mode;
+- struct hns3_hw *hw = HNS3_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+- struct rte_eth_dcb_rx_conf *dcb_rx_conf;
+- struct rte_eth_dcb_tx_conf *dcb_tx_conf;
+- uint8_t num_tc;
+- int max_tc = 0;
+- int i;
+-
+- if (((uint32_t)rx_mq_mode & RTE_ETH_MQ_RX_VMDQ_FLAG) ||
+- (tx_mq_mode == RTE_ETH_MQ_TX_VMDQ_DCB ||
+- tx_mq_mode == RTE_ETH_MQ_TX_VMDQ_ONLY)) {
+- hns3_err(hw, "VMDQ is not supported, rx_mq_mode = %d, tx_mq_mode = %d.",
+- rx_mq_mode, tx_mq_mode);
+- return -EOPNOTSUPP;
+- }
+-
+- dcb_rx_conf = &dev->data->dev_conf.rx_adv_conf.dcb_rx_conf;
+- dcb_tx_conf = &dev->data->dev_conf.tx_adv_conf.dcb_tx_conf;
+- if ((uint32_t)rx_mq_mode & RTE_ETH_MQ_RX_DCB_FLAG) {
+- if (dcb_rx_conf->nb_tcs > hw->dcb_info.tc_max) {
+- hns3_err(hw, "nb_tcs(%u) > max_tc(%u) driver supported.",
+- dcb_rx_conf->nb_tcs, hw->dcb_info.tc_max);
+- return -EINVAL;
+- }
+-
+- if (!(dcb_rx_conf->nb_tcs == HNS3_4_TCS ||
+- dcb_rx_conf->nb_tcs == HNS3_8_TCS)) {
+- hns3_err(hw, "on RTE_ETH_MQ_RX_DCB_RSS mode, "
+- "nb_tcs(%d) != %d or %d in rx direction.",
+- dcb_rx_conf->nb_tcs, HNS3_4_TCS, HNS3_8_TCS);
+- return -EINVAL;
+- }
+-
+- if (dcb_rx_conf->nb_tcs != dcb_tx_conf->nb_tcs) {
+- hns3_err(hw, "num_tcs(%d) of tx is not equal to rx(%d)",
+- dcb_tx_conf->nb_tcs, dcb_rx_conf->nb_tcs);
+- return -EINVAL;
+- }
+-
+- for (i = 0; i < HNS3_MAX_USER_PRIO; i++) {
+- if (dcb_rx_conf->dcb_tc[i] != dcb_tx_conf->dcb_tc[i]) {
+- hns3_err(hw, "dcb_tc[%d] = %u in rx direction, "
+- "is not equal to one in tx direction.",
+- i, dcb_rx_conf->dcb_tc[i]);
+- return -EINVAL;
+- }
+- if (dcb_rx_conf->dcb_tc[i] > max_tc)
+- max_tc = dcb_rx_conf->dcb_tc[i];
+- }
+-
+- num_tc = max_tc + 1;
+- if (num_tc > dcb_rx_conf->nb_tcs) {
+- hns3_err(hw, "max num_tc(%u) mapped > nb_tcs(%u)",
+- num_tc, dcb_rx_conf->nb_tcs);
+- return -EINVAL;
+- }
+- }
+-
+- return 0;
+-}
+-
+ static int
+ hns3_bind_ring_with_vector(struct hns3_hw *hw, uint16_t vector_id, bool en,
+ enum hns3_ring_type queue_type, uint16_t queue_id)
+@@ -2033,7 +1968,7 @@ hns3_check_dev_conf(struct rte_eth_dev *dev)
+ struct rte_eth_conf *conf = &dev->data->dev_conf;
+ int ret;
+
+- ret = hns3_check_mq_mode(dev);
++ ret = hns3_check_dev_mq_mode(dev);
+ if (ret)
+ return ret;
+
+@@ -5497,37 +5432,6 @@ hns3_priority_flow_ctrl_set(struct rte_eth_dev *dev,
+ return ret;
+ }
+
+-static int
+-hns3_get_dcb_info(struct rte_eth_dev *dev, struct rte_eth_dcb_info *dcb_info)
+-{
+- struct hns3_hw *hw = HNS3_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+- enum rte_eth_rx_mq_mode mq_mode = dev->data->dev_conf.rxmode.mq_mode;
+- int i;
+-
+- rte_spinlock_lock(&hw->lock);
+- if ((uint32_t)mq_mode & RTE_ETH_MQ_RX_DCB_FLAG)
+- dcb_info->nb_tcs = hw->dcb_info.local_max_tc;
+- else
+- dcb_info->nb_tcs = 1;
+-
+- for (i = 0; i < HNS3_MAX_USER_PRIO; i++)
+- dcb_info->prio_tc[i] = hw->dcb_info.prio_tc[i];
+- for (i = 0; i < dcb_info->nb_tcs; i++)
+- dcb_info->tc_bws[i] = hw->dcb_info.pg_info[0].tc_dwrr[i];
+-
+- for (i = 0; i < hw->dcb_info.num_tc; i++) {
+- dcb_info->tc_queue.tc_rxq[0][i].base = hw->alloc_rss_size * i;
+- dcb_info->tc_queue.tc_txq[0][i].base =
+- hw->tc_queue[i].tqp_offset;
+- dcb_info->tc_queue.tc_rxq[0][i].nb_queue = hw->alloc_rss_size;
+- dcb_info->tc_queue.tc_txq[0][i].nb_queue =
+- hw->tc_queue[i].tqp_count;
+- }
+- rte_spinlock_unlock(&hw->lock);
+-
+- return 0;
+-}
+-
+ static int
+ hns3_reinit_dev(struct hns3_adapter *hns)
+ {
+diff --git a/drivers/net/hns3/hns3_ethdev_vf.c b/drivers/net/hns3/hns3_ethdev_vf.c
+index 52859a8..a207bbf 100644
+--- a/drivers/net/hns3/hns3_ethdev_vf.c
++++ b/drivers/net/hns3/hns3_ethdev_vf.c
+@@ -379,6 +379,236 @@ hns3vf_bind_ring_with_vector(struct hns3_hw *hw, uint16_t vector_id,
+ return ret;
+ }
+
++static int
++hns3vf_set_multi_tc(struct hns3_hw *hw, const struct hns3_mbx_tc_config *config)
++{
++ struct hns3_mbx_tc_config *payload;
++ struct hns3_vf_to_pf_msg req;
++ int ret;
++
++ hns3vf_mbx_setup(&req, HNS3_MBX_SET_TC, 0);
++ payload = (struct hns3_mbx_tc_config *)req.data;
++ memcpy(payload, config, sizeof(*payload));
++ payload->prio_tc_map = rte_cpu_to_le_32(config->prio_tc_map);
++ ret = hns3vf_mbx_send(hw, &req, true, NULL, 0);
++ if (ret)
++ hns3_err(hw, "failed to set multi-tc, ret = %d.", ret);
++
++ return ret;
++}
++
++static int
++hns3vf_unset_multi_tc(struct hns3_hw *hw)
++{
++ struct hns3_mbx_tc_config *payload;
++ struct hns3_vf_to_pf_msg req;
++ int ret;
++
++ hns3vf_mbx_setup(&req, HNS3_MBX_SET_TC, 0);
++ payload = (struct hns3_mbx_tc_config *)req.data;
++ payload->tc_dwrr[0] = HNS3_ETS_DWRR_MAX;
++ payload->num_tc = 1;
++ ret = hns3vf_mbx_send(hw, &req, true, NULL, 0);
++ if (ret)
++ hns3_err(hw, "failed to unset multi-tc, ret = %d.", ret);
++
++ return ret;
++}
++
++static int
++hns3vf_check_multi_tc_config(struct rte_eth_dev *dev, const struct hns3_mbx_tc_config *info)
++{
++ struct rte_eth_dcb_rx_conf *rx_conf = &dev->data->dev_conf.rx_adv_conf.dcb_rx_conf;
++ struct hns3_hw *hw = HNS3_DEV_PRIVATE_TO_HW(dev->data->dev_private);
++ uint32_t prio_tc_map = info->prio_tc_map;
++ uint8_t map;
++ int i;
++
++ if (rx_conf->nb_tcs != info->num_tc) {
++ hns3_err(hw, "num_tcs(%d) is not equal to PF config(%u)!",
++ rx_conf->nb_tcs, info->num_tc);
++ return -EINVAL;
++ }
++
++ for (i = 0; i < HNS3_MAX_USER_PRIO; i++) {
++ map = prio_tc_map & HNS3_MBX_PRIO_MASK;
++ prio_tc_map >>= HNS3_MBX_PRIO_SHIFT;
++ if (rx_conf->dcb_tc[i] != map) {
++ hns3_err(hw, "dcb_tc[%d] = %u is not equal to PF config(%u)!",
++ i, rx_conf->dcb_tc[i], map);
++ return -EINVAL;
++ }
++ }
++
++ return 0;
++}
++
++static int
++hns3vf_get_multi_tc_info(struct hns3_hw *hw, struct hns3_mbx_tc_config *info)
++{
++ uint8_t resp_msg[HNS3_MBX_MAX_RESP_DATA_SIZE];
++ struct hns3_mbx_tc_prio_map *map = (struct hns3_mbx_tc_prio_map *)resp_msg;
++ struct hns3_mbx_tc_ets_info *ets = (struct hns3_mbx_tc_ets_info *)resp_msg;
++ struct hns3_vf_to_pf_msg req;
++ int i, ret;
++
++ memset(info, 0, sizeof(*info));
++
++ hns3vf_mbx_setup(&req, HNS3_MBX_GET_TC, HNS3_MBX_GET_PRIO_MAP);
++ ret = hns3vf_mbx_send(hw, &req, true, resp_msg, sizeof(resp_msg));
++ if (ret) {
++ hns3_err(hw, "failed to get multi-tc prio map, ret = %d.", ret);
++ return ret;
++ }
++ info->prio_tc_map = rte_le_to_cpu_32(map->prio_tc_map);
++
++ hns3vf_mbx_setup(&req, HNS3_MBX_GET_TC, HNS3_MBX_GET_ETS_INFO);
++ ret = hns3vf_mbx_send(hw, &req, true, resp_msg, sizeof(resp_msg));
++ if (ret) {
++ hns3_err(hw, "failed to get multi-tc ETS info, ret = %d.", ret);
++ return ret;
++ }
++ for (i = 0; i < HNS3_MAX_TC_NUM; i++) {
++ if (ets->sch_mode[i] == HNS3_ETS_SCHED_MODE_INVALID)
++ continue;
++ info->tc_dwrr[i] = ets->sch_mode[i];
++ info->num_tc++;
++ if (ets->sch_mode[i] > 0)
++ info->tc_sch_mode |= 1u << i;
++ }
++
++ return 0;
++}
++
++static void
++hns3vf_update_dcb_info(struct hns3_hw *hw, const struct hns3_mbx_tc_config *info)
++{
++ uint32_t prio_tc_map;
++ uint8_t map;
++ int i;
++
++ hw->dcb_info.local_max_tc = hw->dcb_info.num_tc;
++ hw->dcb_info.hw_tc_map = (1u << hw->dcb_info.num_tc) - 1u;
++ memset(hw->dcb_info.pg_info[0].tc_dwrr, 0, sizeof(hw->dcb_info.pg_info[0].tc_dwrr));
++
++ if (hw->dcb_info.num_tc == 1) {
++ memset(hw->dcb_info.prio_tc, 0, sizeof(hw->dcb_info.prio_tc));
++ hw->dcb_info.pg_info[0].tc_dwrr[0] = HNS3_ETS_DWRR_MAX;
++ return;
++ }
++
++ if (info == NULL)
++ return;
++
++ prio_tc_map = info->prio_tc_map;
++ for (i = 0; i < HNS3_MAX_TC_NUM; i++) {
++ map = prio_tc_map & HNS3_MBX_PRIO_MASK;
++ prio_tc_map >>= HNS3_MBX_PRIO_SHIFT;
++ hw->dcb_info.prio_tc[i] = map;
++ }
++ for (i = 0; i < hw->dcb_info.num_tc; i++)
++ hw->dcb_info.pg_info[0].tc_dwrr[i] = info->tc_dwrr[i];
++}
++
++static int
++hns3vf_setup_dcb(struct rte_eth_dev *dev)
++{
++ struct hns3_hw *hw = HNS3_DEV_PRIVATE_TO_HW(dev->data->dev_private);
++ struct hns3_mbx_tc_config info;
++ int ret;
++
++ if (!hns3_dev_get_support(hw, VF_MULTI_TCS)) {
++ hns3_err(hw, "this port does not support dcb configurations.");
++ return -ENOTSUP;
++ }
++
++ if (dev->data->dev_conf.dcb_capability_en & RTE_ETH_DCB_PFC_SUPPORT) {
++ hns3_err(hw, "VF does not support PFC!");
++ return -ENOTSUP;
++ }
++
++ ret = hns3vf_get_multi_tc_info(hw, &info);
++ if (ret)
++ return ret;
++
++ ret = hns3vf_check_multi_tc_config(dev, &info);
++ if (ret)
++ return ret;
++
++ /*
++ * If multiple-TCs have been configured, cancel the configuration
++ * first. Otherwise, the configuration will fail.
++ */
++ if (hw->dcb_info.num_tc > 1) {
++ ret = hns3vf_unset_multi_tc(hw);
++ if (ret)
++ return ret;
++ hw->dcb_info.num_tc = 1;
++ hns3vf_update_dcb_info(hw, NULL);
++ }
++
++ ret = hns3vf_set_multi_tc(hw, &info);
++ if (ret)
++ return ret;
++
++ hw->dcb_info.num_tc = info.num_tc;
++ hns3vf_update_dcb_info(hw, &info);
++
++ return hns3_queue_to_tc_mapping(hw, hw->data->nb_rx_queues, hw->data->nb_rx_queues);
++}
++
++static int
++hns3vf_unset_dcb(struct rte_eth_dev *dev)
++{
++ struct hns3_hw *hw = HNS3_DEV_PRIVATE_TO_HW(dev->data->dev_private);
++ int ret;
++
++ if (hw->dcb_info.num_tc > 1) {
++ ret = hns3vf_unset_multi_tc(hw);
++ if (ret)
++ return ret;
++ }
++
++ hw->dcb_info.num_tc = 1;
++ hns3vf_update_dcb_info(hw, NULL);
++
++ return hns3_queue_to_tc_mapping(hw, hw->data->nb_rx_queues, hw->data->nb_rx_queues);
++}
++
++static int
++hns3vf_config_dcb(struct rte_eth_dev *dev)
++{
++ struct rte_eth_conf *conf = &dev->data->dev_conf;
++ uint32_t rx_mq_mode = conf->rxmode.mq_mode;
++ int ret;
++
++ if (rx_mq_mode & RTE_ETH_MQ_RX_DCB_FLAG)
++ ret = hns3vf_setup_dcb(dev);
++ else
++ ret = hns3vf_unset_dcb(dev);
++
++ return ret;
++}
++
++static int
++hns3vf_check_dev_conf(struct rte_eth_dev *dev)
++{
++ struct hns3_hw *hw = HNS3_DEV_PRIVATE_TO_HW(dev->data->dev_private);
++ struct rte_eth_conf *conf = &dev->data->dev_conf;
++ int ret;
++
++ ret = hns3_check_dev_mq_mode(dev);
++ if (ret)
++ return ret;
++
++ if (conf->link_speeds & RTE_ETH_LINK_SPEED_FIXED) {
++ hns3_err(hw, "setting link speed/duplex not supported");
++ ret = -EINVAL;
++ }
++
++ return ret;
++}
++
+ static int
+ hns3vf_dev_configure(struct rte_eth_dev *dev)
+ {
+@@ -412,11 +642,13 @@ hns3vf_dev_configure(struct rte_eth_dev *dev)
+ }
+
+ hw->adapter_state = HNS3_NIC_CONFIGURING;
+- if (conf->link_speeds & RTE_ETH_LINK_SPEED_FIXED) {
+- hns3_err(hw, "setting link speed/duplex not supported");
+- ret = -EINVAL;
++ ret = hns3vf_check_dev_conf(dev);
++ if (ret)
++ goto cfg_err;
++
++ ret = hns3vf_config_dcb(dev);
++ if (ret)
+ goto cfg_err;
+- }
+
+ /* When RSS is not configured, redirect the packet queue 0 */
+ if ((uint32_t)mq_mode & RTE_ETH_MQ_RX_RSS_FLAG) {
+@@ -1496,6 +1728,15 @@ hns3vf_init_vf(struct rte_eth_dev *eth_dev)
+ return ret;
+ }
+
++static void
++hns3vf_notify_uninit(struct hns3_hw *hw)
++{
++ struct hns3_vf_to_pf_msg req;
++
++ hns3vf_mbx_setup(&req, HNS3_MBX_VF_UNINIT, 0);
++ (void)hns3vf_mbx_send(hw, &req, false, NULL, 0);
++}
++
+ static void
+ hns3vf_uninit_vf(struct rte_eth_dev *eth_dev)
+ {
+@@ -1515,6 +1756,7 @@ hns3vf_uninit_vf(struct rte_eth_dev *eth_dev)
+ rte_intr_disable(pci_dev->intr_handle);
+ hns3_intr_unregister(pci_dev->intr_handle, hns3vf_interrupt_handler,
+ eth_dev);
++ (void)hns3vf_notify_uninit(hw);
+ hns3_cmd_uninit(hw);
+ hns3_cmd_destroy_queue(hw);
+ hw->io_base = NULL;
+@@ -1652,14 +1894,8 @@ static int
+ hns3vf_do_start(struct hns3_adapter *hns, bool reset_queue)
+ {
+ struct hns3_hw *hw = &hns->hw;
+- uint16_t nb_rx_q = hw->data->nb_rx_queues;
+- uint16_t nb_tx_q = hw->data->nb_tx_queues;
+ int ret;
+
+- ret = hns3_queue_to_tc_mapping(hw, nb_rx_q, nb_tx_q);
+- if (ret)
+- return ret;
+-
+ hns3_enable_rxd_adv_layout(hw);
+
+ ret = hns3_init_queues(hns, reset_queue);
+@@ -2240,6 +2476,7 @@ static const struct eth_dev_ops hns3vf_eth_dev_ops = {
+ .vlan_filter_set = hns3vf_vlan_filter_set,
+ .vlan_offload_set = hns3vf_vlan_offload_set,
+ .get_reg = hns3_get_regs,
++ .get_dcb_info = hns3_get_dcb_info,
+ .dev_supported_ptypes_get = hns3_dev_supported_ptypes_get,
+ .tx_done_cleanup = hns3_tx_done_cleanup,
+ .eth_dev_priv_dump = hns3_eth_dev_priv_dump,
+diff --git a/drivers/net/hns3/hns3_mbx.h b/drivers/net/hns3/hns3_mbx.h
+index 97fbc4c..1a8c2df 100644
+--- a/drivers/net/hns3/hns3_mbx.h
++++ b/drivers/net/hns3/hns3_mbx.h
+@@ -9,6 +9,8 @@
+
+ #include <rte_spinlock.h>
+
++#include "hns3_cmd.h"
++
+ enum HNS3_MBX_OPCODE {
+ HNS3_MBX_RESET = 0x01, /* (VF -> PF) assert reset */
+ HNS3_MBX_ASSERTING_RESET, /* (PF -> VF) PF is asserting reset */
+@@ -45,11 +47,13 @@ enum HNS3_MBX_OPCODE {
+ HNS3_MBX_PUSH_VLAN_INFO = 34, /* (PF -> VF) push port base vlan */
+
+ HNS3_MBX_PUSH_PROMISC_INFO = 36, /* (PF -> VF) push vf promisc info */
++ HNS3_MBX_VF_UNINIT, /* (VF -> PF) vf is uninitializing */
+
+ HNS3_MBX_HANDLE_VF_TBL = 38, /* (VF -> PF) store/clear hw cfg tbl */
+ HNS3_MBX_GET_RING_VECTOR_MAP, /* (VF -> PF) get ring-to-vector map */
+
+ HNS3_MBX_GET_TC = 47, /* (VF -> PF) get tc info of PF configured */
++ HNS3_MBX_SET_TC, /* (VF -> PF) set tc */
+
+ HNS3_MBX_PUSH_LINK_STATUS = 201, /* (IMP -> PF) get port link status */
+ };
+@@ -64,8 +68,43 @@ struct hns3_basic_info {
+
+ enum hns3_mbx_get_tc_subcode {
+ HNS3_MBX_GET_PRIO_MAP = 0, /* query priority to tc map */
++ HNS3_MBX_GET_ETS_INFO, /* query ets info */
++};
++
++struct hns3_mbx_tc_prio_map {
++ /*
++ * Each four bits correspond to one priority's TC.
++ * Bit0-3 correspond to priority-0's TC, bit4-7 correspond to
++ * priority-1's TC, and so on.
++ */
++ uint32_t prio_tc_map;
+ };
+
++#define HNS3_ETS_SCHED_MODE_INVALID 255
++#define HNS3_ETS_DWRR_MAX 100
++struct hns3_mbx_tc_ets_info {
++ uint8_t sch_mode[HNS3_MAX_TC_NUM]; /* 1~100: DWRR, 0: SP; 255-invalid */
++};
++
++#define HNS3_MBX_PRIO_SHIFT 4
++#define HNS3_MBX_PRIO_MASK 0xFu
++struct __rte_packed_begin hns3_mbx_tc_config {
++ /*
++ * Each four bits correspond to one priority's TC.
++ * Bit0-3 correspond to priority-0's TC, bit4-7 correspond to
++ * priority-1's TC, and so on.
++ */
++ uint32_t prio_tc_map;
++ uint8_t tc_dwrr[HNS3_MAX_TC_NUM];
++ uint8_t num_tc;
++ /*
++ * Each bit correspond to one TC's scheduling mode, 0 means SP
++ * scheduling mode, 1 means DWRR scheduling mode.
++ * Bit0 corresponds to TC0, bit1 corresponds to TC1, and so on.
++ */
++ uint8_t tc_sch_mode;
++} __rte_packed_end;
++
+ /* below are per-VF mac-vlan subcodes */
+ enum hns3_mbx_mac_vlan_subcode {
+ HNS3_MBX_MAC_VLAN_UC_MODIFY = 0, /* modify UC mac addr */
+--
+2.33.0
+
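The prio_tc_map field above packs one TC per priority in 4-bit nibbles (HNS3_MBX_PRIO_SHIFT = 4, HNS3_MBX_PRIO_MASK = 0xF), which is how both hns3vf_check_multi_tc_config() and hns3vf_update_dcb_info() decode it. A standalone sketch of that decoding (hypothetical helper name):

```c
#include <stdint.h>

#define PRIO_SHIFT 4u
#define PRIO_MASK  0xFu

/* Extract the TC assigned to `prio` from the packed 4-bit-per-priority
 * map: bits 0-3 hold priority 0's TC, bits 4-7 priority 1's TC, etc. */
static uint8_t prio_to_tc(uint32_t prio_tc_map, unsigned int prio)
{
	return (uint8_t)((prio_tc_map >> (prio * PRIO_SHIFT)) & PRIO_MASK);
}
```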
diff --git a/0124-app-testpmd-avoid-crash-in-DCB-config.patch b/0124-app-testpmd-avoid-crash-in-DCB-config.patch
new file mode 100644
index 0000000..1ae38e6
--- /dev/null
+++ b/0124-app-testpmd-avoid-crash-in-DCB-config.patch
@@ -0,0 +1,46 @@
+From f0e5cd4941e2b6e95c86e27de5c93d3ba5c3c096 Mon Sep 17 00:00:00 2001
+From: Chengwen Feng <fengchengwen(a)huawei.com>
+Date: Thu, 20 Feb 2025 15:06:51 +0800
+Subject: [PATCH 11/24] app/testpmd: avoid crash in DCB config
+
+[ upstream commit d646e219b34ffc4d531f3703fc317e7cff9a25ae ]
+
+The "port config dcb ..." command will segment fault when input with
+invalid port id, this patch fixes it.
+
+Fixes: 9b53e542e9e1 ("app/testpmd: add priority flow control")
+Cc: stable(a)dpdk.org
+
+Signed-off-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ app/test-pmd/cmdline.c | 6 ++++++
+ 1 file changed, 6 insertions(+)
+
+diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
+index 8ef116c..7eba675 100644
+--- a/app/test-pmd/cmdline.c
++++ b/app/test-pmd/cmdline.c
+@@ -3201,6 +3201,9 @@ cmd_config_dcb_parsed(void *parsed_result,
+ uint8_t pfc_en;
+ int ret;
+
++ if (port_id_is_invalid(port_id, ENABLED_WARN))
++ return;
++
+ port = &ports[port_id];
+ /** Check if the port is not started **/
+ if (port->port_status != RTE_PORT_STOPPED) {
+@@ -6401,6 +6404,9 @@ cmd_priority_flow_ctrl_set_parsed(void *parsed_result,
+ int rx_fc_enable, tx_fc_enable;
+ int ret;
+
++ if (port_id_is_invalid(res->port_id, ENABLED_WARN))
++ return;
++
+ /*
+ * Rx on/off, flow control is enabled/disabled on RX side. This can indicate
+ * the RTE_ETH_FC_TX_PAUSE, Transmit pause frame at the Rx side.
+--
+2.33.0
+
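Both hunks in this fix follow the same guard-clause pattern: validate the user-supplied port id before it is used to index the ports[] array. A minimal sketch of that check (hypothetical, not testpmd's actual port_id_is_invalid() implementation):

```c
#include <stdint.h>

/* Reject out-of-range port ids before they index a port array. */
static int port_id_invalid(uint16_t port_id, uint16_t nb_ports)
{
	return port_id >= nb_ports;
}
```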
diff --git a/0125-app-testpmd-show-all-DCB-priority-TC-map.patch b/0125-app-testpmd-show-all-DCB-priority-TC-map.patch
new file mode 100644
index 0000000..a32819f
--- /dev/null
+++ b/0125-app-testpmd-show-all-DCB-priority-TC-map.patch
@@ -0,0 +1,38 @@
+From 4bff87cadadf0912b34e4bcb3436ddd6f2f8a59b Mon Sep 17 00:00:00 2001
+From: Chengwen Feng <fengchengwen(a)huawei.com>
+Date: Thu, 20 Feb 2025 15:06:50 +0800
+Subject: [PATCH 12/24] app/testpmd: show all DCB priority TC map
+
+[ upstream commit 164d7ac277bba10b27dd96821536e6b4a71cfebf ]
+
+Currently, the "show port dcb_tc" command displays the mapping for only
+the first nb_tcs priorities. This patch fixes it by showing every
+priority's TC mapping.
+
+Fixes: cd80f411a7e7 ("app/testpmd: add command to display DCB info")
+Cc: stable(a)dpdk.org
+
+Signed-off-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ app/test-pmd/config.c | 4 ++--
+ 1 file changed, 2 insertions(+), 2 deletions(-)
+
+diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
+index 2c4dedd..0722cc2 100644
+--- a/app/test-pmd/config.c
++++ b/app/test-pmd/config.c
+@@ -6911,8 +6911,8 @@ port_dcb_info_display(portid_t port_id)
+ printf("\n TC : ");
+ for (i = 0; i < dcb_info.nb_tcs; i++)
+ printf("\t%4d", i);
+- printf("\n Priority : ");
+- for (i = 0; i < dcb_info.nb_tcs; i++)
++ printf("\n Prio2TC : ");
++ for (i = 0; i < RTE_ETH_DCB_NUM_USER_PRIORITIES; i++)
+ printf("\t%4d", dcb_info.prio_tc[i]);
+ printf("\n BW percent :");
+ for (i = 0; i < dcb_info.nb_tcs; i++)
+--
+2.33.0
+
diff --git a/0126-app-testpmd-relax-number-of-TCs-in-DCB-command.patch b/0126-app-testpmd-relax-number-of-TCs-in-DCB-command.patch
new file mode 100644
index 0000000..b8f012d
--- /dev/null
+++ b/0126-app-testpmd-relax-number-of-TCs-in-DCB-command.patch
@@ -0,0 +1,54 @@
+From 2c6b9d89f89d05ccb26f20c55bfd90b6b08b7132 Mon Sep 17 00:00:00 2001
+From: Chengwen Feng <fengchengwen(a)huawei.com>
+Date: Thu, 24 Apr 2025 14:17:46 +0800
+Subject: [PATCH 13/24] app/testpmd: relax number of TCs in DCB command
+
+[ upstream commit 5f2695ee948ddaf36050f2d6b58a3437248c1663 ]
+
+Currently, the "port config 0 dcb ..." command only supports 4 or 8
+TCs. Other numbers of TCs may be used in real applications.
+
+This commit removes that restriction.
+
+Fixes: 900550de04a7 ("app/testpmd: add dcb support")
+Cc: stable(a)dpdk.org
+
+Signed-off-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ app/test-pmd/cmdline.c | 4 ++--
+ doc/guides/testpmd_app_ug/testpmd_funcs.rst | 2 +-
+ 2 files changed, 3 insertions(+), 3 deletions(-)
+
+diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
+index 7eba675..8cea88c 100644
+--- a/app/test-pmd/cmdline.c
++++ b/app/test-pmd/cmdline.c
+@@ -3211,9 +3211,9 @@ cmd_config_dcb_parsed(void *parsed_result,
+ return;
+ }
+
+- if ((res->num_tcs != RTE_ETH_4_TCS) && (res->num_tcs != RTE_ETH_8_TCS)) {
++ if (res->num_tcs <= 1 || res->num_tcs > RTE_ETH_8_TCS) {
+ fprintf(stderr,
+- "The invalid number of traffic class, only 4 or 8 allowed.\n");
++ "Invalid number of traffic classes, only 2~8 allowed.\n");
+ return;
+ }
+
+diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+index 227188f..c07b62d 100644
+--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
++++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+@@ -2142,7 +2142,7 @@ Set the DCB mode for an individual port::
+
+ testpmd> port config (port_id) dcb vt (on|off) (traffic_class) pfc (on|off)
+
+-The traffic class should be 4 or 8.
++The number of traffic classes can be 2~8.
+
+ port config - Burst
+ ~~~~~~~~~~~~~~~~~~~
+--
+2.33.0
+
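The relaxed check above accepts any TC count from 2 to RTE_ETH_8_TCS instead of only 4 or 8. A sketch of the new condition as a predicate (hypothetical helper mirroring `num_tcs <= 1 || num_tcs > RTE_ETH_8_TCS`):

```c
#define ETH_8_TCS 8 /* stand-in for RTE_ETH_8_TCS */

/* Valid DCB TC counts after the relaxation: 2..8 inclusive. */
static int num_tcs_valid(int num_tcs)
{
	return !(num_tcs <= 1 || num_tcs > ETH_8_TCS);
}
```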
diff --git a/0127-app-testpmd-reuse-RSS-config-when-configuring-DCB.patch b/0127-app-testpmd-reuse-RSS-config-when-configuring-DCB.patch
new file mode 100644
index 0000000..353a93b
--- /dev/null
+++ b/0127-app-testpmd-reuse-RSS-config-when-configuring-DCB.patch
@@ -0,0 +1,93 @@
+From 5cc8fdb356b74e3c8b7a8ec83ac33b6c2ff5fc45 Mon Sep 17 00:00:00 2001
+From: Min Zhou <zhoumin(a)loongson.cn>
+Date: Wed, 20 Nov 2024 17:37:46 +0800
+Subject: [PATCH 14/24] app/testpmd: reuse RSS config when configuring DCB
+
+In testpmd, the port must be stopped before DCB can be configured.
+However, some PMDs, such as ixgbe, perform a hardware reset while the
+port is stopped, and that reset can clear the RSS configuration held in
+hardware registers. The RSS configuration set during testpmd
+initialization is then lost, so RSS and DCB cannot be enabled at the
+same time on an Intel 82599 NIC.
+
+This patch uses the software-cached RSS configuration instead of reading
+it back from the hardware registers when configuring DCB.
+
+Signed-off-by: Min Zhou <zhoumin(a)loongson.cn>
+Acked-by: Stephen Hemminger <stephen(a)networkplumber.org>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ app/test-pmd/testpmd.c | 26 ++++++--------------------
+ 1 file changed, 6 insertions(+), 20 deletions(-)
+
+diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
+index 9e4e99e..e93214d 100644
+--- a/app/test-pmd/testpmd.c
++++ b/app/test-pmd/testpmd.c
+@@ -4298,15 +4298,11 @@ const uint16_t vlan_tags[] = {
+ 24, 25, 26, 27, 28, 29, 30, 31
+ };
+
+-static int
+-get_eth_dcb_conf(portid_t pid, struct rte_eth_conf *eth_conf,
+- enum dcb_mode_enable dcb_mode,
+- enum rte_eth_nb_tcs num_tcs,
+- uint8_t pfc_en)
++static void
++get_eth_dcb_conf(struct rte_eth_conf *eth_conf, enum dcb_mode_enable dcb_mode,
++ enum rte_eth_nb_tcs num_tcs, uint8_t pfc_en)
+ {
+ uint8_t i;
+- int32_t rc;
+- struct rte_eth_rss_conf rss_conf;
+
+ /*
+ * Builds up the correct configuration for dcb+vt based on the vlan tags array
+@@ -4348,12 +4344,6 @@ get_eth_dcb_conf(portid_t pid, struct rte_eth_conf *eth_conf,
+ struct rte_eth_dcb_tx_conf *tx_conf =
+ ð_conf->tx_adv_conf.dcb_tx_conf;
+
+- memset(&rss_conf, 0, sizeof(struct rte_eth_rss_conf));
+-
+- rc = rte_eth_dev_rss_hash_conf_get(pid, &rss_conf);
+- if (rc != 0)
+- return rc;
+-
+ rx_conf->nb_tcs = num_tcs;
+ tx_conf->nb_tcs = num_tcs;
+
+@@ -4365,7 +4355,6 @@ get_eth_dcb_conf(portid_t pid, struct rte_eth_conf *eth_conf,
+ eth_conf->rxmode.mq_mode =
+ (enum rte_eth_rx_mq_mode)
+ (rx_mq_mode & RTE_ETH_MQ_RX_DCB_RSS);
+- eth_conf->rx_adv_conf.rss_conf = rss_conf;
+ eth_conf->txmode.mq_mode = RTE_ETH_MQ_TX_DCB;
+ }
+
+@@ -4374,8 +4363,6 @@ get_eth_dcb_conf(portid_t pid, struct rte_eth_conf *eth_conf,
+ RTE_ETH_DCB_PG_SUPPORT | RTE_ETH_DCB_PFC_SUPPORT;
+ else
+ eth_conf->dcb_capability_en = RTE_ETH_DCB_PG_SUPPORT;
+-
+- return 0;
+ }
+
+ int
+@@ -4398,10 +4385,9 @@ init_port_dcb_config(portid_t pid,
+ /* retain the original device configuration. */
+ memcpy(&port_conf, &rte_port->dev_conf, sizeof(struct rte_eth_conf));
+
+- /*set configuration of DCB in vt mode and DCB in non-vt mode*/
+- retval = get_eth_dcb_conf(pid, &port_conf, dcb_mode, num_tcs, pfc_en);
+- if (retval < 0)
+- return retval;
++ /* set configuration of DCB in vt mode and DCB in non-vt mode */
++ get_eth_dcb_conf(&port_conf, dcb_mode, num_tcs, pfc_en);
++
+ port_conf.rxmode.offloads |= RTE_ETH_RX_OFFLOAD_VLAN_FILTER;
+ /* remove RSS HASH offload for DCB in vt mode */
+ if (port_conf.rxmode.mq_mode == RTE_ETH_MQ_RX_VMDQ_DCB) {
+--
+2.33.0
+
diff --git a/0128-app-testpmd-add-prio-tc-map-in-DCB-command.patch b/0128-app-testpmd-add-prio-tc-map-in-DCB-command.patch
new file mode 100644
index 0000000..ad4ea47
--- /dev/null
+++ b/0128-app-testpmd-add-prio-tc-map-in-DCB-command.patch
@@ -0,0 +1,296 @@
+From ebb9eb84c710366d9a42a95e2e4168eb3b2b027a Mon Sep 17 00:00:00 2001
+From: Chengwen Feng <fengchengwen(a)huawei.com>
+Date: Thu, 24 Apr 2025 14:17:47 +0800
+Subject: [PATCH 15/24] app/testpmd: add prio-tc map in DCB command
+
+[ upstream commit 601576ae6699b31460f35816be54a63c34f54377 ]
+
+Currently, the "port config 0 dcb ..." command configures the prio-tc
+map by a remainder operation, i.e. prio-tc = prio % nb_tcs.
+
+This commit introduces an optional parameter "prio-tc" which behaves the
+same as the kernel dcb ets tool. The new command:
+
+ port config 0 dcb vt off 4 pfc off prio-tc 0:1 1:2 2:3 ...
+
+If this parameter is not specified, the default prio-tc map is
+used.
+
+Signed-off-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ app/test-pmd/cmdline.c | 119 ++++++++++++++++++--
+ app/test-pmd/testpmd.c | 21 ++--
+ app/test-pmd/testpmd.h | 4 +-
+ doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 +-
+ 4 files changed, 125 insertions(+), 22 deletions(-)
+
+diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
+index 8cea88c..3178040 100644
+--- a/app/test-pmd/cmdline.c
++++ b/app/test-pmd/cmdline.c
+@@ -3186,19 +3186,111 @@ struct cmd_config_dcb {
+ cmdline_fixed_string_t vt_en;
+ uint8_t num_tcs;
+ cmdline_fixed_string_t pfc;
+- cmdline_fixed_string_t pfc_en;
++ cmdline_multi_string_t token_str;
+ };
+
++static int
++parse_dcb_token_prio_tc(char *param_str[], int param_num,
++ uint8_t prio_tc[RTE_ETH_DCB_NUM_USER_PRIORITIES],
++ uint8_t *prio_tc_en)
++{
++ unsigned long prio, tc;
++ int prio_tc_maps = 0;
++ char *param, *end;
++ int i;
++
++ for (i = 0; i < param_num; i++) {
++ param = param_str[i];
++ prio = strtoul(param, &end, 10);
++ if (prio >= RTE_ETH_DCB_NUM_USER_PRIORITIES) {
++ fprintf(stderr, "Bad Argument: invalid PRIO %lu\n", prio);
++ return -1;
++ }
++ if ((*end != ':') || (strlen(end + 1) == 0)) {
++ fprintf(stderr, "Bad Argument: invalid PRIO:TC format %s\n", param);
++ return -1;
++ }
++ tc = strtoul(end + 1, &end, 10);
++ if (tc >= RTE_ETH_8_TCS) {
++ fprintf(stderr, "Bad Argument: invalid TC %lu\n", tc);
++ return -1;
++ }
++ if (*end != '\0') {
++ fprintf(stderr, "Bad Argument: invalid PRIO:TC format %s\n", param);
++ return -1;
++ }
++ prio_tc[prio] = tc;
++ prio_tc_maps++;
++	}
++
++ if (prio_tc_maps == 0) {
++ fprintf(stderr, "Bad Argument: no PRIO:TC provided\n");
++ return -1;
++ }
++ *prio_tc_en = 1;
++
++ return 0;
++}
++
++static int
++parse_dcb_token_value(char *token_str,
++ uint8_t *pfc_en,
++ uint8_t prio_tc[RTE_ETH_DCB_NUM_USER_PRIORITIES],
++ uint8_t *prio_tc_en)
++{
++#define MAX_TOKEN_NUM 128
++ char *split_str[MAX_TOKEN_NUM];
++ int split_num = 0;
++ char *token;
++
++ /* split multiple token to split str. */
++ do {
++ token = strtok_r(token_str, " \f\n\r\t\v", &token_str);
++ if (token == NULL)
++ break;
++ if (split_num >= MAX_TOKEN_NUM) {
++ fprintf(stderr, "Bad Argument: too much argument\n");
++ return -1;
++ }
++ split_str[split_num++] = token;
++ } while (1);
++
++ /* parse fixed parameter "pfc-en" first. */
++ token = split_str[0];
++ if (strcmp(token, "on") == 0)
++ *pfc_en = 1;
++ else if (strcmp(token, "off") == 0)
++ *pfc_en = 0;
++ else {
++ fprintf(stderr, "Bad Argument: pfc-en must be on or off\n");
++ return -EINVAL;
++ }
++
++ if (split_num == 1)
++ return 0;
++
++ /* start parse optional parameter. */
++ token = split_str[1];
++ if (strcmp(token, "prio-tc") != 0) {
++ fprintf(stderr, "Bad Argument: unknown token %s\n", token);
++ return -1;
++ }
++
++ return parse_dcb_token_prio_tc(&split_str[2], split_num - 2, prio_tc, prio_tc_en);
++}
++
+ static void
+ cmd_config_dcb_parsed(void *parsed_result,
+ __rte_unused struct cmdline *cl,
+ __rte_unused void *data)
+ {
++ uint8_t prio_tc[RTE_ETH_DCB_NUM_USER_PRIORITIES] = {0};
+ struct cmd_config_dcb *res = parsed_result;
+ struct rte_eth_dcb_info dcb_info;
+ portid_t port_id = res->port_id;
++ uint8_t prio_tc_en = 0;
+ struct rte_port *port;
+- uint8_t pfc_en;
++ uint8_t pfc_en = 0;
+ int ret;
+
+ if (port_id_is_invalid(port_id, ENABLED_WARN))
+@@ -3230,20 +3322,19 @@ cmd_config_dcb_parsed(void *parsed_result,
+ return;
+ }
+
+- if (!strncmp(res->pfc_en, "on", 2))
+- pfc_en = 1;
+- else
+- pfc_en = 0;
++ ret = parse_dcb_token_value(res->token_str, &pfc_en, prio_tc, &prio_tc_en);
++ if (ret != 0)
++ return;
+
+ /* DCB in VT mode */
+ if (!strncmp(res->vt_en, "on", 2))
+ ret = init_port_dcb_config(port_id, DCB_VT_ENABLED,
+ (enum rte_eth_nb_tcs)res->num_tcs,
+- pfc_en);
++ pfc_en, prio_tc, prio_tc_en);
+ else
+ ret = init_port_dcb_config(port_id, DCB_ENABLED,
+ (enum rte_eth_nb_tcs)res->num_tcs,
+- pfc_en);
++ pfc_en, prio_tc, prio_tc_en);
+ if (ret != 0) {
+ fprintf(stderr, "Cannot initialize network ports.\n");
+ return;
+@@ -3270,13 +3361,17 @@ static cmdline_parse_token_num_t cmd_config_dcb_num_tcs =
+ TOKEN_NUM_INITIALIZER(struct cmd_config_dcb, num_tcs, RTE_UINT8);
+ static cmdline_parse_token_string_t cmd_config_dcb_pfc =
+ TOKEN_STRING_INITIALIZER(struct cmd_config_dcb, pfc, "pfc");
+-static cmdline_parse_token_string_t cmd_config_dcb_pfc_en =
+- TOKEN_STRING_INITIALIZER(struct cmd_config_dcb, pfc_en, "on#off");
++static cmdline_parse_token_string_t cmd_config_dcb_token_str =
++ TOKEN_STRING_INITIALIZER(struct cmd_config_dcb, token_str, TOKEN_STRING_MULTI);
+
+ static cmdline_parse_inst_t cmd_config_dcb = {
+ .f = cmd_config_dcb_parsed,
+ .data = NULL,
+- .help_str = "port config <port-id> dcb vt on|off <num_tcs> pfc on|off",
++ .help_str = "port config <port-id> dcb vt on|off <num_tcs> pfc on|off prio-tc PRIO-MAP\n"
++ "where PRIO-MAP: [ PRIO-MAP ] PRIO-MAPPING\n"
++ " PRIO-MAPPING := PRIO:TC\n"
++ " PRIO: { 0 .. 7 }\n"
++ " TC: { 0 .. 7 }",
+ .tokens = {
+ (void *)&cmd_config_dcb_port,
+ (void *)&cmd_config_dcb_config,
+@@ -3286,7 +3381,7 @@ static cmdline_parse_inst_t cmd_config_dcb = {
+ (void *)&cmd_config_dcb_vt_en,
+ (void *)&cmd_config_dcb_num_tcs,
+ (void *)&cmd_config_dcb_pfc,
+- (void *)&cmd_config_dcb_pfc_en,
++ (void *)&cmd_config_dcb_token_str,
+ NULL,
+ },
+ };
+diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
+index e93214d..fa3cd37 100644
+--- a/app/test-pmd/testpmd.c
++++ b/app/test-pmd/testpmd.c
+@@ -4300,9 +4300,10 @@ const uint16_t vlan_tags[] = {
+
+ static void
+ get_eth_dcb_conf(struct rte_eth_conf *eth_conf, enum dcb_mode_enable dcb_mode,
+- enum rte_eth_nb_tcs num_tcs, uint8_t pfc_en)
++ enum rte_eth_nb_tcs num_tcs, uint8_t pfc_en,
++ uint8_t prio_tc[RTE_ETH_DCB_NUM_USER_PRIORITIES], uint8_t prio_tc_en)
+ {
+- uint8_t i;
++ uint8_t dcb_tc_val, i;
+
+ /*
+ * Builds up the correct configuration for dcb+vt based on the vlan tags array
+@@ -4329,8 +4330,9 @@ get_eth_dcb_conf(struct rte_eth_conf *eth_conf, enum dcb_mode_enable dcb_mode,
+ 1 << (i % vmdq_rx_conf->nb_queue_pools);
+ }
+ for (i = 0; i < RTE_ETH_DCB_NUM_USER_PRIORITIES; i++) {
+- vmdq_rx_conf->dcb_tc[i] = i % num_tcs;
+- vmdq_tx_conf->dcb_tc[i] = i % num_tcs;
++ dcb_tc_val = prio_tc_en ? prio_tc[i] : i % num_tcs;
++ vmdq_rx_conf->dcb_tc[i] = dcb_tc_val;
++ vmdq_tx_conf->dcb_tc[i] = dcb_tc_val;
+ }
+
+ /* set DCB mode of RX and TX of multiple queues */
+@@ -4348,8 +4350,9 @@ get_eth_dcb_conf(struct rte_eth_conf *eth_conf, enum dcb_mode_enable dcb_mode,
+ tx_conf->nb_tcs = num_tcs;
+
+ for (i = 0; i < RTE_ETH_DCB_NUM_USER_PRIORITIES; i++) {
+- rx_conf->dcb_tc[i] = i % num_tcs;
+- tx_conf->dcb_tc[i] = i % num_tcs;
++ dcb_tc_val = prio_tc_en ? prio_tc[i] : i % num_tcs;
++ rx_conf->dcb_tc[i] = dcb_tc_val;
++ tx_conf->dcb_tc[i] = dcb_tc_val;
+ }
+
+ eth_conf->rxmode.mq_mode =
+@@ -4369,7 +4372,9 @@ int
+ init_port_dcb_config(portid_t pid,
+ enum dcb_mode_enable dcb_mode,
+ enum rte_eth_nb_tcs num_tcs,
+- uint8_t pfc_en)
++ uint8_t pfc_en,
++ uint8_t prio_tc[RTE_ETH_DCB_NUM_USER_PRIORITIES],
++ uint8_t prio_tc_en)
+ {
+ struct rte_eth_conf port_conf;
+ struct rte_port *rte_port;
+@@ -4386,7 +4391,7 @@ init_port_dcb_config(portid_t pid,
+ memcpy(&port_conf, &rte_port->dev_conf, sizeof(struct rte_eth_conf));
+
+ /* set configuration of DCB in vt mode and DCB in non-vt mode */
+- get_eth_dcb_conf(&port_conf, dcb_mode, num_tcs, pfc_en);
++ get_eth_dcb_conf(&port_conf, dcb_mode, num_tcs, pfc_en, prio_tc, prio_tc_en);
+
+ port_conf.rxmode.offloads |= RTE_ETH_RX_OFFLOAD_VLAN_FILTER;
+ /* remove RSS HASH offload for DCB in vt mode */
+diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
+index 9b10a9e..6b8ff28 100644
+--- a/app/test-pmd/testpmd.h
++++ b/app/test-pmd/testpmd.h
+@@ -1120,7 +1120,9 @@ uint8_t port_is_bonding_member(portid_t member_pid);
+
+ int init_port_dcb_config(portid_t pid, enum dcb_mode_enable dcb_mode,
+ enum rte_eth_nb_tcs num_tcs,
+- uint8_t pfc_en);
++ uint8_t pfc_en,
++ uint8_t prio_tc[RTE_ETH_DCB_NUM_USER_PRIORITIES],
++ uint8_t prio_tc_en);
+ int start_port(portid_t pid);
+ void stop_port(portid_t pid);
+ void close_port(portid_t pid);
+diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+index c07b62d..c60fd15 100644
+--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
++++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+@@ -2140,9 +2140,10 @@ port config - DCB
+
+ Set the DCB mode for an individual port::
+
+- testpmd> port config (port_id) dcb vt (on|off) (traffic_class) pfc (on|off)
++ testpmd> port config (port_id) dcb vt (on|off) (traffic_class) pfc (on|off) prio-tc (prio-tc)
+
+ The traffic class could be 2~8.
++The prio-tc field here is optional, if not specified then the prio-tc use default configuration.
+
+ port config - Burst
+ ~~~~~~~~~~~~~~~~~~~
+--
+2.33.0
+
diff --git a/0129-app-testpmd-add-queue-restriction-in-DCB-command.patch b/0129-app-testpmd-add-queue-restriction-in-DCB-command.patch
new file mode 100644
index 0000000..1aab2e7
--- /dev/null
+++ b/0129-app-testpmd-add-queue-restriction-in-DCB-command.patch
@@ -0,0 +1,264 @@
+From fb99db310dca2b93a1f50fcaa8c46226e84ae411 Mon Sep 17 00:00:00 2001
+From: Chengwen Feng <fengchengwen(a)huawei.com>
+Date: Thu, 24 Apr 2025 14:17:48 +0800
+Subject: [PATCH 16/24] app/testpmd: add queue restriction in DCB command
+
+[ upstream commit 2169699b15fc4cf317108f86d5039a7e8055d024 ]
+
+In some test scenarios, users want to test DCB by specifying the number
+of Rx/Tx queues. But the "port config 0 dcb ..." command will auto
+adjust Rx/Tx queue number.
+
+This patch introduces an optional parameter "keep-qnum" which ensures
+that the "port config 0 dcb ..." command does not adjust the Rx/Tx queue
+numbers. The new command:
+
+ port config 0 dcb vt off 4 pfc off keep-qnum
+
+If this parameter is not specified, the Rx/Tx queue numbers are adjusted
+by default.
+
+Signed-off-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ app/test-pmd/cmdline.c | 83 ++++++++++++++++++---
+ app/test-pmd/testpmd.c | 42 ++++++-----
+ app/test-pmd/testpmd.h | 3 +-
+ doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 +-
+ 4 files changed, 98 insertions(+), 33 deletions(-)
+
+diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
+index 3178040..f42b806 100644
+--- a/app/test-pmd/cmdline.c
++++ b/app/test-pmd/cmdline.c
+@@ -3232,14 +3232,47 @@ parse_dcb_token_prio_tc(char *param_str[], int param_num,
+ return 0;
+ }
+
++#define DCB_TOKEN_PRIO_TC "prio-tc"
++#define DCB_TOKEN_KEEP_QNUM "keep-qnum"
++
++static int
++parse_dcb_token_find(char *split_str[], int split_num, int *param_num)
++{
++ int i;
++
++ if (strcmp(split_str[0], DCB_TOKEN_KEEP_QNUM) == 0) {
++ *param_num = 0;
++ return 0;
++ }
++
++ if (strcmp(split_str[0], DCB_TOKEN_PRIO_TC) != 0) {
++ fprintf(stderr, "Bad Argument: unknown token %s\n", split_str[0]);
++ return -EINVAL;
++ }
++
++ for (i = 1; i < split_num; i++) {
++ if ((strcmp(split_str[i], DCB_TOKEN_PRIO_TC) != 0) &&
++ (strcmp(split_str[i], DCB_TOKEN_KEEP_QNUM) != 0))
++ continue;
++ /* find another optional parameter, then exit. */
++ break;
++ }
++
++ *param_num = i - 1;
++
++ return 0;
++}
++
+ static int
+ parse_dcb_token_value(char *token_str,
+ uint8_t *pfc_en,
+ uint8_t prio_tc[RTE_ETH_DCB_NUM_USER_PRIORITIES],
+- uint8_t *prio_tc_en)
++ uint8_t *prio_tc_en,
++ uint8_t *keep_qnum)
+ {
+ #define MAX_TOKEN_NUM 128
+ char *split_str[MAX_TOKEN_NUM];
++ int param_num, start, ret;
+ int split_num = 0;
+ char *token;
+
+@@ -3270,13 +3303,40 @@ parse_dcb_token_value(char *token_str,
+ return 0;
+
+ /* start parse optional parameter. */
+- token = split_str[1];
+- if (strcmp(token, "prio-tc") != 0) {
+- fprintf(stderr, "Bad Argument: unknown token %s\n", token);
+- return -1;
+- }
++ start = 1;
++ do {
++ param_num = 0;
++ ret = parse_dcb_token_find(&split_str[start], split_num - start, ¶m_num);
++ if (ret != 0)
++ return ret;
+
+- return parse_dcb_token_prio_tc(&split_str[2], split_num - 2, prio_tc, prio_tc_en);
++ token = split_str[start];
++ if (strcmp(token, DCB_TOKEN_PRIO_TC) == 0) {
++ if (*prio_tc_en == 1) {
++ fprintf(stderr, "Bad Argument: detect multiple %s token\n",
++ DCB_TOKEN_PRIO_TC);
++ return -1;
++ }
++ ret = parse_dcb_token_prio_tc(&split_str[start + 1], param_num, prio_tc,
++ prio_tc_en);
++ if (ret != 0)
++ return ret;
++ } else {
++ /* this must be keep-qnum. */
++ if (*keep_qnum == 1) {
++ fprintf(stderr, "Bad Argument: detect multiple %s token\n",
++ DCB_TOKEN_KEEP_QNUM);
++ return -1;
++ }
++ *keep_qnum = 1;
++ }
++
++ start += param_num + 1;
++ if (start >= split_num)
++ break;
++ } while (1);
++
++ return 0;
+ }
+
+ static void
+@@ -3289,6 +3349,7 @@ cmd_config_dcb_parsed(void *parsed_result,
+ struct rte_eth_dcb_info dcb_info;
+ portid_t port_id = res->port_id;
+ uint8_t prio_tc_en = 0;
++ uint8_t keep_qnum = 0;
+ struct rte_port *port;
+ uint8_t pfc_en = 0;
+ int ret;
+@@ -3322,7 +3383,7 @@ cmd_config_dcb_parsed(void *parsed_result,
+ return;
+ }
+
+- ret = parse_dcb_token_value(res->token_str, &pfc_en, prio_tc, &prio_tc_en);
++ ret = parse_dcb_token_value(res->token_str, &pfc_en, prio_tc, &prio_tc_en, &keep_qnum);
+ if (ret != 0)
+ return;
+
+@@ -3330,11 +3391,11 @@ cmd_config_dcb_parsed(void *parsed_result,
+ if (!strncmp(res->vt_en, "on", 2))
+ ret = init_port_dcb_config(port_id, DCB_VT_ENABLED,
+ (enum rte_eth_nb_tcs)res->num_tcs,
+- pfc_en, prio_tc, prio_tc_en);
++ pfc_en, prio_tc, prio_tc_en, keep_qnum);
+ else
+ ret = init_port_dcb_config(port_id, DCB_ENABLED,
+ (enum rte_eth_nb_tcs)res->num_tcs,
+- pfc_en, prio_tc, prio_tc_en);
++ pfc_en, prio_tc, prio_tc_en, keep_qnum);
+ if (ret != 0) {
+ fprintf(stderr, "Cannot initialize network ports.\n");
+ return;
+@@ -3367,7 +3428,7 @@ static cmdline_parse_token_string_t cmd_config_dcb_token_str =
+ static cmdline_parse_inst_t cmd_config_dcb = {
+ .f = cmd_config_dcb_parsed,
+ .data = NULL,
+- .help_str = "port config <port-id> dcb vt on|off <num_tcs> pfc on|off prio-tc PRIO-MAP\n"
++ .help_str = "port config <port-id> dcb vt on|off <num_tcs> pfc on|off prio-tc PRIO-MAP keep-qnum\n"
+ "where PRIO-MAP: [ PRIO-MAP ] PRIO-MAPPING\n"
+ " PRIO-MAPPING := PRIO:TC\n"
+ " PRIO: { 0 .. 7 }\n"
+diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
+index fa3cd37..0f8d8a1 100644
+--- a/app/test-pmd/testpmd.c
++++ b/app/test-pmd/testpmd.c
+@@ -4374,7 +4374,8 @@ init_port_dcb_config(portid_t pid,
+ enum rte_eth_nb_tcs num_tcs,
+ uint8_t pfc_en,
+ uint8_t prio_tc[RTE_ETH_DCB_NUM_USER_PRIORITIES],
+- uint8_t prio_tc_en)
++ uint8_t prio_tc_en,
++ uint8_t keep_qnum)
+ {
+ struct rte_eth_conf port_conf;
+ struct rte_port *rte_port;
+@@ -4422,26 +4423,27 @@ init_port_dcb_config(portid_t pid,
+ return -1;
+ }
+
+- /* Assume the ports in testpmd have the same dcb capability
+- * and has the same number of rxq and txq in dcb mode
+- */
+- if (dcb_mode == DCB_VT_ENABLED) {
+- if (rte_port->dev_info.max_vfs > 0) {
+- nb_rxq = rte_port->dev_info.nb_rx_queues;
+- nb_txq = rte_port->dev_info.nb_tx_queues;
+- } else {
+- nb_rxq = rte_port->dev_info.max_rx_queues;
+- nb_txq = rte_port->dev_info.max_tx_queues;
+- }
+- } else {
+- /*if vt is disabled, use all pf queues */
+- if (rte_port->dev_info.vmdq_pool_base == 0) {
+- nb_rxq = rte_port->dev_info.max_rx_queues;
+- nb_txq = rte_port->dev_info.max_tx_queues;
++ if (keep_qnum == 0) {
++ /* Assume the ports in testpmd have the same dcb capability
++ * and has the same number of rxq and txq in dcb mode
++ */
++ if (dcb_mode == DCB_VT_ENABLED) {
++ if (rte_port->dev_info.max_vfs > 0) {
++ nb_rxq = rte_port->dev_info.nb_rx_queues;
++ nb_txq = rte_port->dev_info.nb_tx_queues;
++ } else {
++ nb_rxq = rte_port->dev_info.max_rx_queues;
++ nb_txq = rte_port->dev_info.max_tx_queues;
++ }
+ } else {
+- nb_rxq = (queueid_t)num_tcs;
+- nb_txq = (queueid_t)num_tcs;
+-
++ /*if vt is disabled, use all pf queues */
++ if (rte_port->dev_info.vmdq_pool_base == 0) {
++ nb_rxq = rte_port->dev_info.max_rx_queues;
++ nb_txq = rte_port->dev_info.max_tx_queues;
++ } else {
++ nb_rxq = (queueid_t)num_tcs;
++ nb_txq = (queueid_t)num_tcs;
++ }
+ }
+ }
+ rx_free_thresh = 64;
+diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
+index 6b8ff28..4e12073 100644
+--- a/app/test-pmd/testpmd.h
++++ b/app/test-pmd/testpmd.h
+@@ -1122,7 +1122,8 @@ int init_port_dcb_config(portid_t pid, enum dcb_mode_enable dcb_mode,
+ enum rte_eth_nb_tcs num_tcs,
+ uint8_t pfc_en,
+ uint8_t prio_tc[RTE_ETH_DCB_NUM_USER_PRIORITIES],
+- uint8_t prio_tc_en);
++ uint8_t prio_tc_en,
++ uint8_t keep_qnum);
+ int start_port(portid_t pid);
+ void stop_port(portid_t pid);
+ void close_port(portid_t pid);
+diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+index c60fd15..f265e45 100644
+--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
++++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+@@ -2140,10 +2140,11 @@ port config - DCB
+
+ Set the DCB mode for an individual port::
+
+- testpmd> port config (port_id) dcb vt (on|off) (traffic_class) pfc (on|off) prio-tc (prio-tc)
++ testpmd> port config (port_id) dcb vt (on|off) (traffic_class) pfc (on|off) prio-tc (prio-tc) keep-qnum
+
+ The traffic class could be 2~8.
+ The prio-tc field here is optional, if not specified then the prio-tc use default configuration.
++The keep-qnum field here is also optional, if specified then don't adjust Rx/Tx queue number.
+
+ port config - Burst
+ ~~~~~~~~~~~~~~~~~~~
+--
+2.33.0
+
diff --git a/0130-app-testpmd-add-command-to-disable-DCB.patch b/0130-app-testpmd-add-command-to-disable-DCB.patch
new file mode 100644
index 0000000..b428a9d
--- /dev/null
+++ b/0130-app-testpmd-add-command-to-disable-DCB.patch
@@ -0,0 +1,158 @@
+From e8dc9121f8f870512219dada8b6859aa528a371b Mon Sep 17 00:00:00 2001
+From: Chengwen Feng <fengchengwen(a)huawei.com>
+Date: Thu, 24 Apr 2025 14:17:49 +0800
+Subject: [PATCH 17/24] app/testpmd: add command to disable DCB
+
+[ upstream commit 0ecbf93f50018e552ea3aa401129ef6075c1b36b ]
+
+After the "port config 0 dcb ..." command is invoked, no command is
+available to disable DCB.
+
+This commit allows DCB to be disabled when num_tcs is 1, so the user can
+disable DCB with the command:
+ port config 0 dcb vt off 1 pfc off
+
+Signed-off-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ app/test-pmd/cmdline.c | 4 +-
+ app/test-pmd/testpmd.c | 58 ++++++++++++++-------
+ doc/guides/testpmd_app_ug/testpmd_funcs.rst | 2 +-
+ 3 files changed, 43 insertions(+), 21 deletions(-)
+
+diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
+index f42b806..332d7b3 100644
+--- a/app/test-pmd/cmdline.c
++++ b/app/test-pmd/cmdline.c
+@@ -3364,9 +3364,9 @@ cmd_config_dcb_parsed(void *parsed_result,
+ return;
+ }
+
+- if (res->num_tcs <= 1 || res->num_tcs > RTE_ETH_8_TCS) {
++ if (res->num_tcs < 1 || res->num_tcs > RTE_ETH_8_TCS) {
+ fprintf(stderr,
+- "The invalid number of traffic class, only 2~8 allowed.\n");
++ "The invalid number of traffic class, only 1~8 allowed.\n");
+ return;
+ }
+
+diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
+index 0f8d8a1..5557314 100644
+--- a/app/test-pmd/testpmd.c
++++ b/app/test-pmd/testpmd.c
+@@ -4368,6 +4368,22 @@ get_eth_dcb_conf(struct rte_eth_conf *eth_conf, enum dcb_mode_enable dcb_mode,
+ eth_conf->dcb_capability_en = RTE_ETH_DCB_PG_SUPPORT;
+ }
+
++static void
++clear_eth_dcb_conf(portid_t pid, struct rte_eth_conf *eth_conf)
++{
++ uint32_t i;
++
++ eth_conf->rxmode.mq_mode &= ~(RTE_ETH_MQ_RX_DCB | RTE_ETH_MQ_RX_VMDQ_DCB);
++ eth_conf->txmode.mq_mode = RTE_ETH_MQ_TX_NONE;
++ eth_conf->dcb_capability_en = 0;
++ if (dcb_config) {
++ /* Unset VLAN filter configuration if already config DCB. */
++ eth_conf->rxmode.offloads &= ~RTE_ETH_RX_OFFLOAD_VLAN_FILTER;
++ for (i = 0; i < RTE_DIM(vlan_tags); i++)
++ rx_vft_set(pid, vlan_tags[i], 0);
++ }
++}
++
+ int
+ init_port_dcb_config(portid_t pid,
+ enum dcb_mode_enable dcb_mode,
+@@ -4391,16 +4407,19 @@ init_port_dcb_config(portid_t pid,
+ /* retain the original device configuration. */
+ memcpy(&port_conf, &rte_port->dev_conf, sizeof(struct rte_eth_conf));
+
+- /* set configuration of DCB in vt mode and DCB in non-vt mode */
+- get_eth_dcb_conf(&port_conf, dcb_mode, num_tcs, pfc_en, prio_tc, prio_tc_en);
+-
+- port_conf.rxmode.offloads |= RTE_ETH_RX_OFFLOAD_VLAN_FILTER;
+- /* remove RSS HASH offload for DCB in vt mode */
+- if (port_conf.rxmode.mq_mode == RTE_ETH_MQ_RX_VMDQ_DCB) {
+- port_conf.rxmode.offloads &= ~RTE_ETH_RX_OFFLOAD_RSS_HASH;
+- for (i = 0; i < nb_rxq; i++)
+- rte_port->rxq[i].conf.offloads &=
+- ~RTE_ETH_RX_OFFLOAD_RSS_HASH;
++ if (num_tcs > 1) {
++ /* set configuration of DCB in vt mode and DCB in non-vt mode */
++ get_eth_dcb_conf(&port_conf, dcb_mode, num_tcs, pfc_en, prio_tc, prio_tc_en);
++ port_conf.rxmode.offloads |= RTE_ETH_RX_OFFLOAD_VLAN_FILTER;
++ /* remove RSS HASH offload for DCB in vt mode */
++ if (port_conf.rxmode.mq_mode == RTE_ETH_MQ_RX_VMDQ_DCB) {
++ port_conf.rxmode.offloads &= ~RTE_ETH_RX_OFFLOAD_RSS_HASH;
++ for (i = 0; i < nb_rxq; i++)
++ rte_port->rxq[i].conf.offloads &=
++ ~RTE_ETH_RX_OFFLOAD_RSS_HASH;
++ }
++ } else {
++ clear_eth_dcb_conf(pid, &port_conf);
+ }
+
+ /* re-configure the device . */
+@@ -4415,7 +4434,8 @@ init_port_dcb_config(portid_t pid,
+ /* If dev_info.vmdq_pool_base is greater than 0,
+ * the queue id of vmdq pools is started after pf queues.
+ */
+- if (dcb_mode == DCB_VT_ENABLED &&
++ if (num_tcs > 1 &&
++ dcb_mode == DCB_VT_ENABLED &&
+ rte_port->dev_info.vmdq_pool_base > 0) {
+ fprintf(stderr,
+ "VMDQ_DCB multi-queue mode is nonsensical for port %d.\n",
+@@ -4423,7 +4443,7 @@ init_port_dcb_config(portid_t pid,
+ return -1;
+ }
+
+- if (keep_qnum == 0) {
++ if (num_tcs > 1 && keep_qnum == 0) {
+ /* Assume the ports in testpmd have the same dcb capability
+ * and has the same number of rxq and txq in dcb mode
+ */
+@@ -4451,19 +4471,21 @@ init_port_dcb_config(portid_t pid,
+ memcpy(&rte_port->dev_conf, &port_conf, sizeof(struct rte_eth_conf));
+
+ rxtx_port_config(pid);
+- /* VLAN filter */
+- rte_port->dev_conf.rxmode.offloads |= RTE_ETH_RX_OFFLOAD_VLAN_FILTER;
+- for (i = 0; i < RTE_DIM(vlan_tags); i++)
+- rx_vft_set(pid, vlan_tags[i], 1);
++ if (num_tcs > 1) {
++ /* VLAN filter */
++ rte_port->dev_conf.rxmode.offloads |= RTE_ETH_RX_OFFLOAD_VLAN_FILTER;
++ for (i = 0; i < RTE_DIM(vlan_tags); i++)
++ rx_vft_set(pid, vlan_tags[i], 1);
++ }
+
+ retval = eth_macaddr_get_print_err(pid, &rte_port->eth_addr);
+ if (retval != 0)
+ return retval;
+
+- rte_port->dcb_flag = 1;
++ rte_port->dcb_flag = num_tcs > 1 ? 1 : 0;
+
+ /* Enter DCB configuration status */
+- dcb_config = 1;
++ dcb_config = num_tcs > 1 ? 1 : 0;
+
+ return 0;
+ }
+diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+index f265e45..e816c81 100644
+--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
++++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+@@ -2142,7 +2142,7 @@ Set the DCB mode for an individual port::
+
+ testpmd> port config (port_id) dcb vt (on|off) (traffic_class) pfc (on|off) prio-tc (prio-tc) keep-qnum
+
+-The traffic class could be 2~8.
++The traffic class could be 1~8, if the value is 1, DCB is disabled.
+ The prio-tc field here is optional, if not specified then the prio-tc use default configuration.
+ The keep-qnum field here is also optional, if specified then don't adjust Rx/Tx queue number.
+
+--
+2.33.0
+
diff --git a/0131-examples-l3fwd-force-link-speed.patch b/0131-examples-l3fwd-force-link-speed.patch
new file mode 100644
index 0000000..c4a660b
--- /dev/null
+++ b/0131-examples-l3fwd-force-link-speed.patch
@@ -0,0 +1,87 @@
+From f45d2fe457138ef75dc43aa8171d9473313b7ca7 Mon Sep 17 00:00:00 2001
+From: Dengdui Huang <huangdengdui(a)huawei.com>
+Date: Wed, 27 Aug 2025 09:31:05 +0800
+Subject: [PATCH 18/24] examples/l3fwd: force link speed
+
+[ upstream commit 2001c8eaf4efb94173410644cf29cbaa62a0ac83 ]
+
+Currently, l3fwd starts in auto-negotiation mode, but the link may fail
+to come up when auto-negotiation is not supported. Therefore, it is
+necessary to support starting a port with a specified speed.
+
+Additionally, this patch does not support changing the duplex mode,
+so speeds such as 10M and 100M cannot be configured using this method.
+
+Signed-off-by: Dengdui Huang <huangdengdui(a)huawei.com>
+Reviewed-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ examples/l3fwd/main.c | 17 ++++++++++++++++-
+ 1 file changed, 16 insertions(+), 1 deletion(-)
+
+diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
+index be5b5d8..066e7c8 100644
+--- a/examples/l3fwd/main.c
++++ b/examples/l3fwd/main.c
+@@ -422,6 +422,7 @@ print_usage(const char *prgname)
+ " Accepted: em (Exact Match), lpm (Longest Prefix Match), fib (Forwarding Information Base),\n"
+ " acl (Access Control List)\n"
+ " --config (port,queue,lcore): Rx queue configuration\n"
++ " --eth-link-speed: force link speed\n"
+ " --rx-queue-size NPKTS: Rx queue size in decimal\n"
+ " Default: %d\n"
+ " --tx-queue-size NPKTS: Tx queue size in decimal\n"
+@@ -732,6 +733,7 @@ static const char short_options[] =
+ ;
+
+ #define CMD_LINE_OPT_CONFIG "config"
++#define CMD_LINK_OPT_ETH_LINK_SPEED "eth-link-speed"
+ #define CMD_LINE_OPT_RX_QUEUE_SIZE "rx-queue-size"
+ #define CMD_LINE_OPT_TX_QUEUE_SIZE "tx-queue-size"
+ #define CMD_LINE_OPT_ETH_DEST "eth-dest"
+@@ -763,6 +765,7 @@ enum {
+ * conflict with short options */
+ CMD_LINE_OPT_MIN_NUM = 256,
+ CMD_LINE_OPT_CONFIG_NUM,
++ CMD_LINK_OPT_ETH_LINK_SPEED_NUM,
+ CMD_LINE_OPT_RX_QUEUE_SIZE_NUM,
+ CMD_LINE_OPT_TX_QUEUE_SIZE_NUM,
+ CMD_LINE_OPT_ETH_DEST_NUM,
+@@ -790,6 +793,7 @@ enum {
+
+ static const struct option lgopts[] = {
+ {CMD_LINE_OPT_CONFIG, 1, 0, CMD_LINE_OPT_CONFIG_NUM},
++ {CMD_LINK_OPT_ETH_LINK_SPEED, 1, 0, CMD_LINK_OPT_ETH_LINK_SPEED_NUM},
+ {CMD_LINE_OPT_RX_QUEUE_SIZE, 1, 0, CMD_LINE_OPT_RX_QUEUE_SIZE_NUM},
+ {CMD_LINE_OPT_TX_QUEUE_SIZE, 1, 0, CMD_LINE_OPT_TX_QUEUE_SIZE_NUM},
+ {CMD_LINE_OPT_ETH_DEST, 1, 0, CMD_LINE_OPT_ETH_DEST_NUM},
+@@ -845,6 +849,7 @@ parse_args(int argc, char **argv)
+ uint8_t eth_rx_q = 0;
+ struct l3fwd_event_resources *evt_rsrc = l3fwd_get_eventdev_rsrc();
+ #endif
++ int speed_num;
+
+ argvopt = argv;
+
+@@ -893,7 +898,17 @@ parse_args(int argc, char **argv)
+ }
+ lcore_params = 1;
+ break;
+-
++ case CMD_LINK_OPT_ETH_LINK_SPEED_NUM:
++ speed_num = atoi(optarg);
++ if ((speed_num == RTE_ETH_SPEED_NUM_10M) ||
++ (speed_num == RTE_ETH_SPEED_NUM_100M)) {
++ fprintf(stderr, "Unsupported fixed speed\n");
++ print_usage(prgname);
++ return -1;
++ }
++ if (speed_num >= 0 && rte_eth_speed_bitflag(speed_num, 0) > 0)
++ port_conf.link_speeds = rte_eth_speed_bitflag(speed_num, 0);
++ break;
+ case CMD_LINE_OPT_RX_QUEUE_SIZE_NUM:
+ parse_queue_size(optarg, &nb_rxd, 1);
+ break;
+--
+2.33.0
+
diff --git a/0132-examples-l3fwd-power-force-link-speed.patch b/0132-examples-l3fwd-power-force-link-speed.patch
new file mode 100644
index 0000000..57f1f09
--- /dev/null
+++ b/0132-examples-l3fwd-power-force-link-speed.patch
@@ -0,0 +1,80 @@
+From 2239ed372f161db4f729c983511b2f7ab4ca0a6c Mon Sep 17 00:00:00 2001
+From: Dengdui Huang <huangdengdui(a)huawei.com>
+Date: Wed, 27 Aug 2025 09:31:06 +0800
+Subject: [PATCH 19/24] examples/l3fwd-power: force link speed
+
+[ upstream commit 2001c8eaf4efb94173410644cf29cbaa62a0ac83 ]
+
+Currently, l3fwd-power starts in auto-negotiation mode, but the link may
+fail to come up when auto-negotiation is not supported. Therefore, it is
+necessary to support starting a port with a specified speed.
+
+Additionally, this patch does not support changing the duplex mode,
+so speeds such as 10M and 100M cannot be configured using this method.
+
+Signed-off-by: Dengdui Huang <huangdengdui(a)huawei.com>
+Reviewed-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ examples/l3fwd-power/main.c | 18 ++++++++++++++++++
+ 1 file changed, 18 insertions(+)
+
+diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
+index 9c0dcd3..cb5e90c 100644
+--- a/examples/l3fwd-power/main.c
++++ b/examples/l3fwd-power/main.c
+@@ -1503,6 +1503,7 @@ print_usage(const char *prgname)
+ " -U: set min/max frequency for uncore to maximum value\n"
+ " -i (frequency index): set min/max frequency for uncore to specified frequency index\n"
+ " --config (port,queue,lcore): rx queues configuration\n"
++ " --eth-link-speed: force link speed\n"
+ " --high-perf-cores CORELIST: list of high performance cores\n"
+ " --perf-config: similar as config, cores specified as indices"
+ " for bins containing high or regular performance cores\n"
+@@ -1741,12 +1742,14 @@ parse_pmd_mgmt_config(const char *name)
+ #define CMD_LINE_OPT_PAUSE_DURATION "pause-duration"
+ #define CMD_LINE_OPT_SCALE_FREQ_MIN "scale-freq-min"
+ #define CMD_LINE_OPT_SCALE_FREQ_MAX "scale-freq-max"
++#define CMD_LINK_OPT_ETH_LINK_SPEED "eth-link-speed"
+
+ /* Parse the argument given in the command line of the application */
+ static int
+ parse_args(int argc, char **argv)
+ {
+ int opt, ret;
++ int speed_num;
+ char **argvopt;
+ int option_index;
+ char *prgname = argv[0];
+@@ -1765,6 +1768,7 @@ parse_args(int argc, char **argv)
+ {CMD_LINE_OPT_PAUSE_DURATION, 1, 0, 0},
+ {CMD_LINE_OPT_SCALE_FREQ_MIN, 1, 0, 0},
+ {CMD_LINE_OPT_SCALE_FREQ_MAX, 1, 0, 0},
++ {CMD_LINK_OPT_ETH_LINK_SPEED, 1, 0, 0},
+ {NULL, 0, 0, 0}
+ };
+
+@@ -1935,6 +1939,20 @@ parse_args(int argc, char **argv)
+ scale_freq_max = parse_int(optarg);
+ }
+
++ if (!strncmp(lgopts[option_index].name,
++ CMD_LINK_OPT_ETH_LINK_SPEED,
++ sizeof(CMD_LINK_OPT_ETH_LINK_SPEED))) {
++ speed_num = atoi(optarg);
++ if ((speed_num == RTE_ETH_SPEED_NUM_10M) ||
++ (speed_num == RTE_ETH_SPEED_NUM_100M)) {
++ fprintf(stderr, "Unsupported fixed speed\n");
++ print_usage(prgname);
++ return -1;
++ }
++ if (speed_num >= 0 && rte_eth_speed_bitflag(speed_num, 0) > 0)
++ port_conf.link_speeds = rte_eth_speed_bitflag(speed_num, 0);
++ }
++
+ break;
+
+ default:
+--
+2.33.0
+
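The option handling above rejects the two sub-gigabit fixed speeds before converting the value to a link-speed flag. As a standalone sketch of that check (plain C with stand-in constants mirroring RTE_ETH_SPEED_NUM_10M/100M, not the real DPDK headers):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in values mirroring RTE_ETH_SPEED_NUM_10M/100M. */
#define SPEED_NUM_10M   10
#define SPEED_NUM_100M  100

/* Mirror of the --eth-link-speed check above: 10M/100M fixed speeds
 * are refused, any other non-negative value may then be converted to
 * a link-speed bitflag. */
static bool eth_link_speed_acceptable(int speed_num)
{
    if (speed_num == SPEED_NUM_10M || speed_num == SPEED_NUM_100M)
        return false;            /* "Unsupported fixed speed" */
    return speed_num >= 0;
}
```

In the patch itself, accepted values are additionally validated through rte_eth_speed_bitflag() before being written to port_conf.link_speeds.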
diff --git a/0133-config-arm-add-HiSilicon-HIP12.patch b/0133-config-arm-add-HiSilicon-HIP12.patch
new file mode 100644
index 0000000..6fe7266
--- /dev/null
+++ b/0133-config-arm-add-HiSilicon-HIP12.patch
@@ -0,0 +1,94 @@
+From 21711b7deb6315f8394b7284a50a75e756fad0fa Mon Sep 17 00:00:00 2001
+From: Chengwen Feng <fengchengwen(a)huawei.com>
+Date: Wed, 29 Oct 2025 09:16:26 +0800
+Subject: [PATCH 20/24] config/arm: add HiSilicon HIP12
+
+[ upstream commit a054de204b0b937dd976d0390fbb03353745e7cb ]
+
+Adding support for HiSilicon HIP12 platform.
+
+Signed-off-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Acked-by: Huisong Li <lihuisong(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ config/arm/arm64_hip12_linux_gcc | 17 +++++++++++++++++
+ config/arm/meson.build | 18 ++++++++++++++++++
+ 2 files changed, 35 insertions(+)
+ create mode 100644 config/arm/arm64_hip12_linux_gcc
+
+diff --git a/config/arm/arm64_hip12_linux_gcc b/config/arm/arm64_hip12_linux_gcc
+new file mode 100644
+index 0000000..949093d
+--- /dev/null
++++ b/config/arm/arm64_hip12_linux_gcc
+@@ -0,0 +1,17 @@
++[binaries]
++c = ['ccache', 'aarch64-linux-gnu-gcc']
++cpp = ['ccache', 'aarch64-linux-gnu-g++']
++ar = 'aarch64-linux-gnu-gcc-ar'
++strip = 'aarch64-linux-gnu-strip'
++pkgconfig = 'aarch64-linux-gnu-pkg-config'
++pkg-config = 'aarch64-linux-gnu-pkg-config'
++pcap-config = ''
++
++[host_machine]
++system = 'linux'
++cpu_family = 'aarch64'
++cpu = 'armv8.5-a'
++endian = 'little'
++
++[properties]
++platform = 'hip12'
+diff --git a/config/arm/meson.build b/config/arm/meson.build
+index 7c8fcb8..303b7ca 100644
+--- a/config/arm/meson.build
++++ b/config/arm/meson.build
+@@ -233,6 +233,15 @@ implementer_hisilicon = {
+ ['RTE_MAX_LCORE', 1280],
+ ['RTE_MAX_NUMA_NODES', 16]
+ ]
++ },
++ '0xd06': {
++ 'mcpu': 'mcpu_hip12',
++ 'flags': [
++ ['RTE_MACHINE', '"hip12"'],
++ ['RTE_ARM_FEATURE_ATOMICS', true],
++ ['RTE_MAX_LCORE', 1280],
++ ['RTE_MAX_NUMA_NODES', 16]
++ ]
+ }
+ }
+ }
+@@ -436,6 +445,13 @@ soc_hip10 = {
+ 'numa': true
+ }
+
++soc_hip12 = {
++ 'description': 'HiSilicon HIP12',
++ 'implementer': '0x48',
++ 'part_number': '0xd06',
++ 'numa': true
++}
++
+ soc_kunpeng920 = {
+ 'description': 'HiSilicon Kunpeng 920',
+ 'implementer': '0x48',
+@@ -537,6 +553,7 @@ tys2500: Phytium TengYun S2500
+ graviton2: AWS Graviton2
+ graviton3: AWS Graviton3
+ hip10: HiSilicon HIP10
++hip12: HiSilicon HIP12
+ kunpeng920: HiSilicon Kunpeng 920
+ kunpeng930: HiSilicon Kunpeng 930
+ n1sdp: Arm Neoverse N1SDP
+@@ -568,6 +585,7 @@ socs = {
+ 'graviton2': soc_graviton2,
+ 'graviton3': soc_graviton3,
+ 'hip10': soc_hip10,
++ 'hip12': soc_hip12,
+ 'kunpeng920': soc_kunpeng920,
+ 'kunpeng930': soc_kunpeng930,
+ 'n1sdp': soc_n1sdp,
+--
+2.33.0
+
diff --git a/0134-app-testpmd-fix-DCB-Tx-port.patch b/0134-app-testpmd-fix-DCB-Tx-port.patch
new file mode 100644
index 0000000..f17b135
--- /dev/null
+++ b/0134-app-testpmd-fix-DCB-Tx-port.patch
@@ -0,0 +1,51 @@
+From 64f53c7016c0480acd0103a533328f070acc47ef Mon Sep 17 00:00:00 2001
+From: Chengwen Feng <fengchengwen(a)huawei.com>
+Date: Thu, 6 Nov 2025 08:29:19 +0800
+Subject: [PATCH 21/24] app/testpmd: fix DCB Tx port
+
+[ upstream commit 47012b7cbf78531e99b6ab3faa3a69e941ddbaa0 ]
+
+The txp may be invalid (e.g. when starting with only one port, txp is
+still set to 1); this commit fixes it by getting txp from the
+fwd_topology_tx_port_get() function.
+
+An added benefit is that the DCB test also supports '--port-topology'
+parameter.
+
+Fixes: 1a572499beb6 ("app/testpmd: setup DCB forwarding based on traffic class")
+Cc: stable(a)dpdk.org
+
+Signed-off-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ app/test-pmd/config.c | 7 ++-----
+ 1 file changed, 2 insertions(+), 5 deletions(-)
+
+diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
+index 0722cc2..d71b398 100644
+--- a/app/test-pmd/config.c
++++ b/app/test-pmd/config.c
+@@ -4919,7 +4919,7 @@ dcb_fwd_config_setup(void)
+ /* reinitialize forwarding streams */
+ init_fwd_streams();
+ sm_id = 0;
+- txp = 1;
++ txp = fwd_topology_tx_port_get(rxp);
+ /* get the dcb info on the first RX and TX ports */
+ (void)rte_eth_dev_get_dcb_info(fwd_ports_ids[rxp], &rxp_dcb_info);
+ (void)rte_eth_dev_get_dcb_info(fwd_ports_ids[txp], &txp_dcb_info);
+@@ -4967,11 +4967,8 @@ dcb_fwd_config_setup(void)
+ rxp++;
+ if (rxp >= nb_fwd_ports)
+ return;
++ txp = fwd_topology_tx_port_get(rxp);
+ /* get the dcb information on next RX and TX ports */
+- if ((rxp & 0x1) == 0)
+- txp = (portid_t) (rxp + 1);
+- else
+- txp = (portid_t) (rxp - 1);
+ rte_eth_dev_get_dcb_info(fwd_ports_ids[rxp], &rxp_dcb_info);
+ rte_eth_dev_get_dcb_info(fwd_ports_ids[txp], &txp_dcb_info);
+ }
+--
+2.33.0
+
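For reference, the hard-coded pairing removed above is equivalent to flipping the low bit of the port id; fwd_topology_tx_port_get() generalizes this so the DCB path also honors --port-topology. A standalone sketch of the old behavior:

```c
#include <assert.h>
#include <stdint.h>

/* The removed code paired ports as (0,1), (2,3), ...: an even rx port
 * transmits on rx+1, an odd one on rx-1 -- i.e. txp = rxp ^ 1, which
 * is invalid when a pair's second port does not exist. */
static uint16_t paired_tx_port(uint16_t rxp)
{
    return rxp ^ 1;
}
```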
diff --git a/0135-app-testpmd-fix-DCB-Rx-queues.patch b/0135-app-testpmd-fix-DCB-Rx-queues.patch
new file mode 100644
index 0000000..a052b89
--- /dev/null
+++ b/0135-app-testpmd-fix-DCB-Rx-queues.patch
@@ -0,0 +1,35 @@
+From 27c05f7ee5c1e0567e602862961db082542b9b44 Mon Sep 17 00:00:00 2001
+From: Chengwen Feng <fengchengwen(a)huawei.com>
+Date: Thu, 6 Nov 2025 08:29:20 +0800
+Subject: [PATCH 22/24] app/testpmd: fix DCB Rx queues
+
+[ upstream commit 32387caaa00660ebe35be25f2371edb0069cc80a ]
+
+The nb_rx_queue should be taken from rxp_dcb_info, not txp_dcb_info;
+this commit fixes it.
+
+Fixes: 1a572499beb6 ("app/testpmd: setup DCB forwarding based on traffic class")
+Cc: stable(a)dpdk.org
+
+Signed-off-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ app/test-pmd/config.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
+index d71b398..c65586b 100644
+--- a/app/test-pmd/config.c
++++ b/app/test-pmd/config.c
+@@ -4937,7 +4937,7 @@ dcb_fwd_config_setup(void)
+ fwd_lcores[lc_id]->stream_idx;
+ rxq = rxp_dcb_info.tc_queue.tc_rxq[i][tc].base;
+ txq = txp_dcb_info.tc_queue.tc_txq[i][tc].base;
+- nb_rx_queue = txp_dcb_info.tc_queue.tc_rxq[i][tc].nb_queue;
++ nb_rx_queue = rxp_dcb_info.tc_queue.tc_rxq[i][tc].nb_queue;
+ nb_tx_queue = txp_dcb_info.tc_queue.tc_txq[i][tc].nb_queue;
+ for (j = 0; j < nb_rx_queue; j++) {
+ struct fwd_stream *fs;
+--
+2.33.0
+
diff --git a/0136-app-testpmd-support-specify-TCs-when-DCB-forward.patch b/0136-app-testpmd-support-specify-TCs-when-DCB-forward.patch
new file mode 100644
index 0000000..bd57ad5
--- /dev/null
+++ b/0136-app-testpmd-support-specify-TCs-when-DCB-forward.patch
@@ -0,0 +1,254 @@
+From d1caf16b597ccd08ee72765d7027bb3a9ea172c6 Mon Sep 17 00:00:00 2001
+From: Chengwen Feng <fengchengwen(a)huawei.com>
+Date: Tue, 11 Nov 2025 17:13:02 +0800
+Subject: [PATCH 23/24] app/testpmd: support specify TCs when DCB forward
+
+[ upstream commit 48077248013eb2b52e020cf2eb103a314d794e81 ]
+
+This commit supports specifying TCs for DCB forwarding, via the command:
+
+ set dcb fwd_tc (tc_mask)
+
+Background: when the DCB function is tested based on txonly forwarding,
+only some TCs are expected to generate traffic; this command specifies
+which TCs are used.
+
+Signed-off-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Acked-by: Huisong Li <lihuisong(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ app/test-pmd/cmdline.c | 57 +++++++++++++++++++++
+ app/test-pmd/config.c | 50 +++++++++++++++++-
+ app/test-pmd/testpmd.c | 6 +++
+ app/test-pmd/testpmd.h | 3 ++
+ doc/guides/testpmd_app_ug/testpmd_funcs.rst | 8 +++
+ 5 files changed, 122 insertions(+), 2 deletions(-)
+
+diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
+index 332d7b3..c8a8ecd 100644
+--- a/app/test-pmd/cmdline.c
++++ b/app/test-pmd/cmdline.c
+@@ -488,6 +488,9 @@ static void cmd_help_long_parsed(void *parsed_result,
+ "set fwd (%s)\n"
+ " Set packet forwarding mode.\n\n"
+
++ "set dcb fwd_tc (tc_mask)\n"
++ " Set dcb forwarding on specify TCs, if bit-n in tc-mask is 1, then TC-n's forwarding is enabled\n\n"
++
+ "mac_addr add (port_id) (XX:XX:XX:XX:XX:XX)\n"
+ " Add a MAC address on port_id.\n\n"
+
+@@ -5944,6 +5947,59 @@ static void cmd_set_fwd_retry_mode_init(void)
+ token_struct->string_data.str = token;
+ }
+
++/* *** set dcb forward TCs *** */
++struct cmd_set_dcb_fwd_tc_result {
++ cmdline_fixed_string_t set;
++ cmdline_fixed_string_t dcb;
++ cmdline_fixed_string_t fwd_tc;
++ uint8_t tc_mask;
++};
++
++static void cmd_set_dcb_fwd_tc_parsed(void *parsed_result,
++ __rte_unused struct cmdline *cl,
++ __rte_unused void *data)
++{
++ struct cmd_set_dcb_fwd_tc_result *res = parsed_result;
++ int i;
++ if (res->tc_mask == 0) {
++ fprintf(stderr, "TC mask should not be zero!\n");
++ return;
++ }
++ printf("Enabled DCB forwarding TC list:");
++ dcb_fwd_tc_mask = res->tc_mask;
++ for (i = 0; i < RTE_ETH_8_TCS; i++) {
++ if (dcb_fwd_tc_mask & (1u << i))
++ printf(" %d", i);
++ }
++ printf("\n");
++}
++
++static cmdline_parse_token_string_t cmd_set_dcb_fwd_tc_set =
++ TOKEN_STRING_INITIALIZER(struct cmd_set_dcb_fwd_tc_result,
++ set, "set");
++static cmdline_parse_token_string_t cmd_set_dcb_fwd_tc_dcb =
++ TOKEN_STRING_INITIALIZER(struct cmd_set_dcb_fwd_tc_result,
++ dcb, "dcb");
++static cmdline_parse_token_string_t cmd_set_dcb_fwd_tc_fwdtc =
++ TOKEN_STRING_INITIALIZER(struct cmd_set_dcb_fwd_tc_result,
++ fwd_tc, "fwd_tc");
++static cmdline_parse_token_num_t cmd_set_dcb_fwd_tc_tcmask =
++ TOKEN_NUM_INITIALIZER(struct cmd_set_dcb_fwd_tc_result,
++ tc_mask, RTE_UINT8);
++
++static cmdline_parse_inst_t cmd_set_dcb_fwd_tc = {
++ .f = cmd_set_dcb_fwd_tc_parsed,
++ .data = NULL,
++ .help_str = "config DCB forwarding on specify TCs, if bit-n in tc-mask is 1, then TC-n's forwarding is enabled, and vice versa.",
++ .tokens = {
++ (void *)&cmd_set_dcb_fwd_tc_set,
++ (void *)&cmd_set_dcb_fwd_tc_dcb,
++ (void *)&cmd_set_dcb_fwd_tc_fwdtc,
++ (void *)&cmd_set_dcb_fwd_tc_tcmask,
++ NULL,
++ },
++};
++
+ /* *** SET BURST TX DELAY TIME RETRY NUMBER *** */
+ struct cmd_set_burst_tx_retry_result {
+ cmdline_fixed_string_t set;
+@@ -13318,6 +13374,7 @@ static cmdline_parse_ctx_t builtin_ctx[] = {
+ (cmdline_parse_inst_t *)&cmd_set_fwd_mask,
+ (cmdline_parse_inst_t *)&cmd_set_fwd_mode,
+ (cmdline_parse_inst_t *)&cmd_set_fwd_retry_mode,
++ (cmdline_parse_inst_t *)&cmd_set_dcb_fwd_tc,
+ (cmdline_parse_inst_t *)&cmd_set_burst_tx_retry,
+ (cmdline_parse_inst_t *)&cmd_set_promisc_mode_one,
+ (cmdline_parse_inst_t *)&cmd_set_promisc_mode_all,
+diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
+index c65586b..4735dfa 100644
+--- a/app/test-pmd/config.c
++++ b/app/test-pmd/config.c
+@@ -4853,12 +4853,48 @@ get_fwd_port_total_tc_num(void)
+
+ for (i = 0; i < nb_fwd_ports; i++) {
+ (void)rte_eth_dev_get_dcb_info(fwd_ports_ids[i], &dcb_info);
+- total_tc_num += dcb_info.nb_tcs;
++ total_tc_num += rte_popcount32(dcb_fwd_tc_mask & ((1u << dcb_info.nb_tcs) - 1));
+ }
+
+ return total_tc_num;
+ }
+
++static void
++dcb_fwd_tc_update_dcb_info(struct rte_eth_dcb_info *org_dcb_info)
++{
++ struct rte_eth_dcb_info dcb_info = {0};
++ uint32_t i, vmdq_idx;
++ uint32_t tc = 0;
++
++ if (dcb_fwd_tc_mask == DEFAULT_DCB_FWD_TC_MASK)
++ return;
++
++ /*
++ * Use compress scheme to update dcb-info.
++ * E.g. If org_dcb_info->nb_tcs is 4 and dcb_fwd_tc_mask is 0x8, it
++ * means only enable TC3, then the new dcb-info's nb_tcs is set to
++ * 1, and also move corresponding tc_rxq and tc_txq info to new
++ * index.
++ */
++ for (i = 0; i < org_dcb_info->nb_tcs; i++) {
++ if (!(dcb_fwd_tc_mask & (1u << i)))
++ continue;
++ for (vmdq_idx = 0; vmdq_idx < RTE_ETH_MAX_VMDQ_POOL; vmdq_idx++) {
++ dcb_info.tc_queue.tc_rxq[vmdq_idx][tc].base =
++ org_dcb_info->tc_queue.tc_rxq[vmdq_idx][i].base;
++ dcb_info.tc_queue.tc_rxq[vmdq_idx][tc].nb_queue =
++ org_dcb_info->tc_queue.tc_rxq[vmdq_idx][i].nb_queue;
++ dcb_info.tc_queue.tc_txq[vmdq_idx][tc].base =
++ org_dcb_info->tc_queue.tc_txq[vmdq_idx][i].base;
++ dcb_info.tc_queue.tc_txq[vmdq_idx][tc].nb_queue =
++ org_dcb_info->tc_queue.tc_txq[vmdq_idx][i].nb_queue;
++ }
++ tc++;
++ }
++ dcb_info.nb_tcs = tc;
++ *org_dcb_info = dcb_info;
++}
++
+ /**
+ * For the DCB forwarding test, each core is assigned on each traffic class.
+ *
+@@ -4908,11 +4944,17 @@ dcb_fwd_config_setup(void)
+ }
+ }
+
++ total_tc_num = get_fwd_port_total_tc_num();
++ if (total_tc_num == 0) {
++ fprintf(stderr, "Error: total forwarding TC num is zero!\n");
++ cur_fwd_config.nb_fwd_lcores = 0;
++ return;
++ }
++
+ cur_fwd_config.nb_fwd_lcores = (lcoreid_t) nb_fwd_lcores;
+ cur_fwd_config.nb_fwd_ports = nb_fwd_ports;
+ cur_fwd_config.nb_fwd_streams =
+ (streamid_t) (nb_rxq * cur_fwd_config.nb_fwd_ports);
+- total_tc_num = get_fwd_port_total_tc_num();
+ if (cur_fwd_config.nb_fwd_lcores > total_tc_num)
+ cur_fwd_config.nb_fwd_lcores = total_tc_num;
+
+@@ -4922,7 +4964,9 @@ dcb_fwd_config_setup(void)
+ txp = fwd_topology_tx_port_get(rxp);
+ /* get the dcb info on the first RX and TX ports */
+ (void)rte_eth_dev_get_dcb_info(fwd_ports_ids[rxp], &rxp_dcb_info);
++ dcb_fwd_tc_update_dcb_info(&rxp_dcb_info);
+ (void)rte_eth_dev_get_dcb_info(fwd_ports_ids[txp], &txp_dcb_info);
++ dcb_fwd_tc_update_dcb_info(&txp_dcb_info);
+
+ for (lc_id = 0; lc_id < cur_fwd_config.nb_fwd_lcores; lc_id++) {
+ fwd_lcores[lc_id]->stream_nb = 0;
+@@ -4970,7 +5014,9 @@ dcb_fwd_config_setup(void)
+ txp = fwd_topology_tx_port_get(rxp);
+ /* get the dcb information on next RX and TX ports */
+ rte_eth_dev_get_dcb_info(fwd_ports_ids[rxp], &rxp_dcb_info);
++ dcb_fwd_tc_update_dcb_info(&rxp_dcb_info);
+ rte_eth_dev_get_dcb_info(fwd_ports_ids[txp], &txp_dcb_info);
++ dcb_fwd_tc_update_dcb_info(&txp_dcb_info);
+ }
+ }
+
+diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
+index 5557314..770cb40 100644
+--- a/app/test-pmd/testpmd.c
++++ b/app/test-pmd/testpmd.c
+@@ -206,6 +206,12 @@ struct fwd_engine * fwd_engines[] = {
+ NULL,
+ };
+
++/*
++ * Bitmask for control DCB forwarding for TCs.
++ * If bit-n in tc-mask is 1, then TC-n's forwarding is enabled, and vice versa.
++ */
++uint8_t dcb_fwd_tc_mask = DEFAULT_DCB_FWD_TC_MASK;
++
+ struct rte_mempool *mempools[RTE_MAX_NUMA_NODES * MAX_SEGS_BUFFER_SPLIT];
+ uint16_t mempool_flags;
+
+diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
+index 4e12073..c22d673 100644
+--- a/app/test-pmd/testpmd.h
++++ b/app/test-pmd/testpmd.h
+@@ -464,6 +464,9 @@ extern cmdline_parse_inst_t cmd_show_set_raw_all;
+ extern cmdline_parse_inst_t cmd_set_flex_is_pattern;
+ extern cmdline_parse_inst_t cmd_set_flex_spec_pattern;
+
++#define DEFAULT_DCB_FWD_TC_MASK 0xFF
++extern uint8_t dcb_fwd_tc_mask;
++
+ extern uint16_t mempool_flags;
+
+ /**
+diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+index e816c81..83006aa 100644
+--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
++++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+@@ -1838,6 +1838,14 @@ during the flow rule creation::
+
+ Otherwise the default index ``0`` is used.
+
++set dcb fwd_tc
++~~~~~~~~~~~~~~
++
++Config DCB forwarding on specify TCs, if bit-n in tc-mask is 1, then TC-n's
++forwarding is enabled, and vice versa::
++
++ testpmd> set dcb fwd_tc (tc_mask)
++
+ Port Functions
+ --------------
+
+--
+2.33.0
+
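The "compress scheme" in dcb_fwd_tc_update_dcb_info() above can be illustrated with a standalone sketch (plain C, single pool, queue bases only, rather than the full rte_eth_dcb_info layout):

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TCS 8

/* Keep only the TCs enabled in tc_mask, moving their queue bases down
 * to consecutive indices; returns the new TC count. With 4 TCs and
 * mask 0x8 (TC3 only), TC3's base moves to index 0 and the count is 1. */
static unsigned int compress_tcs(uint8_t tc_mask,
                                 const uint16_t base_in[MAX_TCS],
                                 unsigned int nb_tcs,
                                 uint16_t base_out[MAX_TCS])
{
    unsigned int i, tc = 0;

    for (i = 0; i < nb_tcs; i++) {
        if (!(tc_mask & (1u << i)))
            continue;
        base_out[tc++] = base_in[i];
    }
    return tc;
}
```

The patch applies the same walk to all RTE_ETH_MAX_VMDQ_POOL pools and to both the Rx and Tx queue tables.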
diff --git a/0137-app-testpmd-support-multi-cores-process-one-TC.patch b/0137-app-testpmd-support-multi-cores-process-one-TC.patch
new file mode 100644
index 0000000..db46b6c
--- /dev/null
+++ b/0137-app-testpmd-support-multi-cores-process-one-TC.patch
@@ -0,0 +1,292 @@
+From 56209344f3eb31a960c38afa986bbb8a6072f838 Mon Sep 17 00:00:00 2001
+From: Chengwen Feng <fengchengwen(a)huawei.com>
+Date: Tue, 11 Nov 2025 17:13:03 +0800
+Subject: [PATCH 24/24] app/testpmd: support multi-cores process one TC
+
+[ upstream commit fca6f2910345c25a5050a0b586e0d324ca616cbb ]
+
+Currently, one TC can be processed by only one core; when there are a
+large number of small packets, this core becomes a bottleneck.
+
+This commit supports multiple cores processing one TC, via the command:
+
+ set dcb fwd_tc_cores (tc_cores)
+
+Signed-off-by: Chengwen Feng <fengchengwen(a)huawei.com>
+Acked-by: Huisong Li <lihuisong(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ app/test-pmd/cmdline.c | 48 ++++++++++++
+ app/test-pmd/config.c | 85 ++++++++++++++++-----
+ app/test-pmd/testpmd.c | 9 +++
+ app/test-pmd/testpmd.h | 1 +
+ doc/guides/testpmd_app_ug/testpmd_funcs.rst | 8 ++
+ 5 files changed, 134 insertions(+), 17 deletions(-)
+
+diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
+index c8a8ecd..275df67 100644
+--- a/app/test-pmd/cmdline.c
++++ b/app/test-pmd/cmdline.c
+@@ -6000,6 +6000,53 @@ static cmdline_parse_inst_t cmd_set_dcb_fwd_tc = {
+ },
+ };
+
++/* *** set dcb forward cores per TC *** */
++struct cmd_set_dcb_fwd_tc_cores_result {
++ cmdline_fixed_string_t set;
++ cmdline_fixed_string_t dcb;
++ cmdline_fixed_string_t fwd_tc_cores;
++ uint8_t tc_cores;
++};
++
++static void cmd_set_dcb_fwd_tc_cores_parsed(void *parsed_result,
++ __rte_unused struct cmdline *cl,
++ __rte_unused void *data)
++{
++ struct cmd_set_dcb_fwd_tc_cores_result *res = parsed_result;
++ if (res->tc_cores == 0) {
++ fprintf(stderr, "Cores per-TC should not be zero!\n");
++ return;
++ }
++ dcb_fwd_tc_cores = res->tc_cores;
++ printf("Set cores-per-TC: %u\n", dcb_fwd_tc_cores);
++}
++
++static cmdline_parse_token_string_t cmd_set_dcb_fwd_tc_cores_set =
++ TOKEN_STRING_INITIALIZER(struct cmd_set_dcb_fwd_tc_cores_result,
++ set, "set");
++static cmdline_parse_token_string_t cmd_set_dcb_fwd_tc_cores_dcb =
++ TOKEN_STRING_INITIALIZER(struct cmd_set_dcb_fwd_tc_cores_result,
++ dcb, "dcb");
++static cmdline_parse_token_string_t cmd_set_dcb_fwd_tc_cores_fwdtccores =
++ TOKEN_STRING_INITIALIZER(struct cmd_set_dcb_fwd_tc_cores_result,
++ fwd_tc_cores, "fwd_tc_cores");
++static cmdline_parse_token_num_t cmd_set_dcb_fwd_tc_cores_tccores =
++ TOKEN_NUM_INITIALIZER(struct cmd_set_dcb_fwd_tc_cores_result,
++ tc_cores, RTE_UINT8);
++
++static cmdline_parse_inst_t cmd_set_dcb_fwd_tc_cores = {
++ .f = cmd_set_dcb_fwd_tc_cores_parsed,
++ .data = NULL,
++ .help_str = "config DCB forwarding cores per-TC, 1-means one core process all queues of a TC.",
++ .tokens = {
++ (void *)&cmd_set_dcb_fwd_tc_cores_set,
++ (void *)&cmd_set_dcb_fwd_tc_cores_dcb,
++ (void *)&cmd_set_dcb_fwd_tc_cores_fwdtccores,
++ (void *)&cmd_set_dcb_fwd_tc_cores_tccores,
++ NULL,
++ },
++};
++
+ /* *** SET BURST TX DELAY TIME RETRY NUMBER *** */
+ struct cmd_set_burst_tx_retry_result {
+ cmdline_fixed_string_t set;
+@@ -13375,6 +13422,7 @@ static cmdline_parse_ctx_t builtin_ctx[] = {
+ (cmdline_parse_inst_t *)&cmd_set_fwd_mode,
+ (cmdline_parse_inst_t *)&cmd_set_fwd_retry_mode,
+ (cmdline_parse_inst_t *)&cmd_set_dcb_fwd_tc,
++ (cmdline_parse_inst_t *)&cmd_set_dcb_fwd_tc_cores,
+ (cmdline_parse_inst_t *)&cmd_set_burst_tx_retry,
+ (cmdline_parse_inst_t *)&cmd_set_promisc_mode_one,
+ (cmdline_parse_inst_t *)&cmd_set_promisc_mode_all,
+diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
+index 4735dfa..53809d9 100644
+--- a/app/test-pmd/config.c
++++ b/app/test-pmd/config.c
+@@ -4844,6 +4844,36 @@ rss_fwd_config_setup(void)
+ }
+ }
+
++static int
++dcb_fwd_check_cores_per_tc(void)
++{
++ struct rte_eth_dcb_info dcb_info = {0};
++ uint32_t port, tc, vmdq_idx;
++
++ if (dcb_fwd_tc_cores == 1)
++ return 0;
++
++ for (port = 0; port < nb_fwd_ports; port++) {
++ (void)rte_eth_dev_get_dcb_info(fwd_ports_ids[port], &dcb_info);
++ for (tc = 0; tc < dcb_info.nb_tcs; tc++) {
++ for (vmdq_idx = 0; vmdq_idx < RTE_ETH_MAX_VMDQ_POOL; vmdq_idx++) {
++ if (dcb_info.tc_queue.tc_rxq[vmdq_idx][tc].nb_queue == 0)
++ break;
++ /* make sure nb_rx_queue can be divisible. */
++ if (dcb_info.tc_queue.tc_rxq[vmdq_idx][tc].nb_queue %
++ dcb_fwd_tc_cores)
++ return -1;
++ /* make sure nb_tx_queue can be divisible. */
++ if (dcb_info.tc_queue.tc_txq[vmdq_idx][tc].nb_queue %
++ dcb_fwd_tc_cores)
++ return -1;
++ }
++ }
++ }
++
++ return 0;
++}
++
+ static uint16_t
+ get_fwd_port_total_tc_num(void)
+ {
+@@ -4896,14 +4926,17 @@ dcb_fwd_tc_update_dcb_info(struct rte_eth_dcb_info *org_dcb_info)
+ }
+
+ /**
+- * For the DCB forwarding test, each core is assigned on each traffic class.
++ * For the DCB forwarding test, each core is assigned on each traffic class
++ * by default:
++ * Each core is assigned a multi-stream, each stream being composed of
++ * a RX queue to poll on a RX port for input messages, associated with
++ * a TX queue of a TX port where to send forwarded packets. All RX and
++ * TX queues are mapping to the same traffic class.
++ * If VMDQ and DCB co-exist, each traffic class on different POOLs share
++ * the same core.
+ *
+- * Each core is assigned a multi-stream, each stream being composed of
+- * a RX queue to poll on a RX port for input messages, associated with
+- * a TX queue of a TX port where to send forwarded packets. All RX and
+- * TX queues are mapping to the same traffic class.
+- * If VMDQ and DCB co-exist, each traffic class on different POOLs share
+- * the same core
++ * If the user sets cores-per-TC to another value (e.g. 2), multiple
++ * cores will process one TC.
+ */
+ static void
+ dcb_fwd_config_setup(void)
+@@ -4911,9 +4944,10 @@ dcb_fwd_config_setup(void)
+ struct rte_eth_dcb_info rxp_dcb_info, txp_dcb_info;
+ portid_t txp, rxp = 0;
+ queueid_t txq, rxq = 0;
+- lcoreid_t lc_id;
++ lcoreid_t lc_id, target_lcores;
+ uint16_t nb_rx_queue, nb_tx_queue;
+ uint16_t i, j, k, sm_id = 0;
++ uint16_t sub_core_idx = 0;
+ uint16_t total_tc_num;
+ struct rte_port *port;
+ uint8_t tc = 0;
+@@ -4944,6 +4978,13 @@ dcb_fwd_config_setup(void)
+ }
+ }
+
++ ret = dcb_fwd_check_cores_per_tc();
++ if (ret != 0) {
++ fprintf(stderr, "Error: check forwarding cores-per-TC failed!\n");
++ cur_fwd_config.nb_fwd_lcores = 0;
++ return;
++ }
++
+ total_tc_num = get_fwd_port_total_tc_num();
+ if (total_tc_num == 0) {
+ fprintf(stderr, "Error: total forwarding TC num is zero!\n");
+@@ -4951,12 +4992,17 @@ dcb_fwd_config_setup(void)
+ return;
+ }
+
+- cur_fwd_config.nb_fwd_lcores = (lcoreid_t) nb_fwd_lcores;
++ target_lcores = (lcoreid_t)total_tc_num * (lcoreid_t)dcb_fwd_tc_cores;
++ if (nb_fwd_lcores < target_lcores) {
++ fprintf(stderr, "Error: the number of forwarding cores is insufficient!\n");
++ cur_fwd_config.nb_fwd_lcores = 0;
++ return;
++ }
++
++ cur_fwd_config.nb_fwd_lcores = target_lcores;
+ cur_fwd_config.nb_fwd_ports = nb_fwd_ports;
+ cur_fwd_config.nb_fwd_streams =
+ (streamid_t) (nb_rxq * cur_fwd_config.nb_fwd_ports);
+- if (cur_fwd_config.nb_fwd_lcores > total_tc_num)
+- cur_fwd_config.nb_fwd_lcores = total_tc_num;
+
+ /* reinitialize forwarding streams */
+ init_fwd_streams();
+@@ -4979,10 +5025,12 @@ dcb_fwd_config_setup(void)
+ break;
+ k = fwd_lcores[lc_id]->stream_nb +
+ fwd_lcores[lc_id]->stream_idx;
+- rxq = rxp_dcb_info.tc_queue.tc_rxq[i][tc].base;
+- txq = txp_dcb_info.tc_queue.tc_txq[i][tc].base;
+- nb_rx_queue = rxp_dcb_info.tc_queue.tc_rxq[i][tc].nb_queue;
+- nb_tx_queue = txp_dcb_info.tc_queue.tc_txq[i][tc].nb_queue;
++ nb_rx_queue = rxp_dcb_info.tc_queue.tc_rxq[i][tc].nb_queue /
++ dcb_fwd_tc_cores;
++ nb_tx_queue = txp_dcb_info.tc_queue.tc_txq[i][tc].nb_queue /
++ dcb_fwd_tc_cores;
++ rxq = rxp_dcb_info.tc_queue.tc_rxq[i][tc].base + nb_rx_queue * sub_core_idx;
++ txq = txp_dcb_info.tc_queue.tc_txq[i][tc].base + nb_tx_queue * sub_core_idx;
+ for (j = 0; j < nb_rx_queue; j++) {
+ struct fwd_stream *fs;
+
+@@ -4994,11 +5042,14 @@ dcb_fwd_config_setup(void)
+ fs->peer_addr = fs->tx_port;
+ fs->retry_enabled = retry_enabled;
+ }
+- fwd_lcores[lc_id]->stream_nb +=
+- rxp_dcb_info.tc_queue.tc_rxq[i][tc].nb_queue;
++ sub_core_idx++;
++ fwd_lcores[lc_id]->stream_nb += nb_rx_queue;
+ }
+ sm_id = (streamid_t) (sm_id + fwd_lcores[lc_id]->stream_nb);
++ if (sub_core_idx < dcb_fwd_tc_cores)
++ continue;
+
++ sub_core_idx = 0;
+ tc++;
+ if (tc < rxp_dcb_info.nb_tcs)
+ continue;
+diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
+index 770cb40..f665f00 100644
+--- a/app/test-pmd/testpmd.c
++++ b/app/test-pmd/testpmd.c
+@@ -211,6 +211,15 @@ struct fwd_engine * fwd_engines[] = {
+ * If bit-n in tc-mask is 1, then TC-n's forwarding is enabled, and vice versa.
+ */
+ uint8_t dcb_fwd_tc_mask = DEFAULT_DCB_FWD_TC_MASK;
++/*
++ * Poll cores per TC when DCB forwarding.
++ * E.g. 1 indicates that one core process all queues of a TC.
++ * 2 indicates that two cores process all queues of a TC. If there
++ * is a TC with 8 queues, then [0, 3] belong to first core, and
++ * [4, 7] belong to second core.
++ * ...
++ */
++uint8_t dcb_fwd_tc_cores = 1;
+
+ struct rte_mempool *mempools[RTE_MAX_NUMA_NODES * MAX_SEGS_BUFFER_SPLIT];
+ uint16_t mempool_flags;
+diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
+index c22d673..06f432a 100644
+--- a/app/test-pmd/testpmd.h
++++ b/app/test-pmd/testpmd.h
+@@ -466,6 +466,7 @@ extern cmdline_parse_inst_t cmd_set_flex_spec_pattern;
+
+ #define DEFAULT_DCB_FWD_TC_MASK 0xFF
+ extern uint8_t dcb_fwd_tc_mask;
++extern uint8_t dcb_fwd_tc_cores;
+
+ extern uint16_t mempool_flags;
+
+diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+index 83006aa..fc63587 100644
+--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
++++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+@@ -1846,6 +1846,14 @@ forwarding is enabled, and vice versa::
+
+ testpmd> set dcb fwd_tc (tc_mask)
+
++set dcb fwd_tc_cores
++~~~~~~~~~~~~~~~~~~~~
++
++Config DCB forwarding cores per-TC, 1-means one core process all queues of a TC,
++2-means two cores process all queues of a TC, and so on::
++
++ testpmd> set dcb fwd_tc_cores (tc_cores)
++
+ Port Functions
+ --------------
+
+--
+2.33.0
+
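The rxq/txq arithmetic in dcb_fwd_config_setup() above amounts to slicing each TC's queue range evenly across dcb_fwd_tc_cores cores; dcb_fwd_check_cores_per_tc() guarantees the division is exact. A standalone sketch:

```c
#include <assert.h>
#include <stdint.h>

/* Core sub_core_idx (0-based) of tc_cores polls the queues
 * [base + count * idx, base + count * (idx + 1)), where
 * count = nb_queue / tc_cores. E.g. 8 queues across 2 cores:
 * core 0 gets [0, 3] and core 1 gets [4, 7]. */
static void tc_queue_slice(uint16_t base, uint16_t nb_queue,
                           uint8_t tc_cores, uint8_t sub_core_idx,
                           uint16_t *first, uint16_t *count)
{
    *count = nb_queue / tc_cores;
    *first = base + (uint16_t)(*count * sub_core_idx);
}
```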
diff --git a/dpdk.spec b/dpdk.spec
index f150946..6d3750c 100644
--- a/dpdk.spec
+++ b/dpdk.spec
@@ -11,7 +11,7 @@
Name: dpdk
Version: 23.11
-Release: 37
+Release: 38
URL: http://dpdk.org
Source: https://fast.dpdk.org/rel/dpdk-%{version}.tar.xz
@@ -146,6 +146,31 @@ Patch6110: 0110-net-hns3-fix-overwrite-mbuf-in-vector-path.patch
Patch6111: 0111-net-hns3-fix-unrelease-VLAN-resource-when-init-fail.patch
Patch6112: 0112-net-hns3-fix-VLAN-tag-loss-for-short-tunnel-frame.patch
Patch6113: 0113-app-testpmd-fix-L4-protocol-retrieval-from-L3-header.patch
+Patch6114: 0114-app-testpmd-handle-IEEE1588-init-failure.patch
+Patch6115: 0115-examples-l3fwd-add-option-to-set-Rx-burst-size.patch
+Patch6116: 0116-examples-eventdev-fix-queue-crash-with-generic-pipel.patch
+Patch6117: 0117-examples-l3fwd-add-Tx-burst-size-configuration-optio.patch
+Patch6118: 0118-net-hns3-remove-duplicate-struct-field.patch
+Patch6119: 0119-net-hns3-refactor-DCB-module.patch
+Patch6120: 0120-net-hns3-parse-max-TC-number-for-VF.patch
+Patch6121: 0121-net-hns3-support-multi-TCs-capability-for-VF.patch
+Patch6122: 0122-net-hns3-fix-queue-TC-configuration-on-VF.patch
+Patch6123: 0123-net-hns3-support-multi-TCs-configuration-for-VF.patch
+Patch6124: 0124-app-testpmd-avoid-crash-in-DCB-config.patch
+Patch6125: 0125-app-testpmd-show-all-DCB-priority-TC-map.patch
+Patch6126: 0126-app-testpmd-relax-number-of-TCs-in-DCB-command.patch
+Patch6127: 0127-app-testpmd-reuse-RSS-config-when-configuring-DCB.patch
+Patch6128: 0128-app-testpmd-add-prio-tc-map-in-DCB-command.patch
+Patch6129: 0129-app-testpmd-add-queue-restriction-in-DCB-command.patch
+Patch6130: 0130-app-testpmd-add-command-to-disable-DCB.patch
+Patch6131: 0131-examples-l3fwd-force-link-speed.patch
+Patch6132: 0132-examples-l3fwd-power-force-link-speed.patch
+Patch6133: 0133-config-arm-add-HiSilicon-HIP12.patch
+Patch6134: 0134-app-testpmd-fix-DCB-Tx-port.patch
+Patch6135: 0135-app-testpmd-fix-DCB-Rx-queues.patch
+Patch6136: 0136-app-testpmd-support-specify-TCs-when-DCB-forward.patch
+Patch6137: 0137-app-testpmd-support-multi-cores-process-one-TC.patch
+
BuildRequires: meson
BuildRequires: python3-pyelftools
@@ -350,6 +375,33 @@ fi
/usr/sbin/depmod
%changelog
+* Thu Nov 27 2025 huangdonghua <huangdonghua3(a)h-partners.com> - 23.11-38
+ Backport VF multi-TC support patches and other fixes:
+ - app/testpmd: handle IEEE1588 init failure
+ - examples/l3fwd: add option to set Rx burst size
+ - examples/eventdev: fix queue crash with generic pipeline
+ - examples/l3fwd: add Tx burst size configuration option
+ - net/hns3: remove duplicate struct field
+ - net/hns3: refactor DCB module
+ - net/hns3: parse max TC number for VF
+ - net/hns3: support multi-TCs capability for VF
+ - net/hns3: fix queue TC configuration on VF
+ - net/hns3: support multi-TCs configuration for VF
+ - app/testpmd: avoid crash in DCB config
+ - app/testpmd: show all DCB priority TC map
+ - app/testpmd: relax number of TCs in DCB command
+ - app/testpmd: reuse RSS config when configuring DCB
+ - app/testpmd: add prio-tc map in DCB command
+ - app/testpmd: add queue restriction in DCB command
+ - app/testpmd: add command to disable DCB
+ - examples/l3fwd: force link speed
+ - examples/l3fwd-power: force link speed
+ - config/arm: add HiSilicon HIP12
+ - app/testpmd: fix DCB Tx port
+ - app/testpmd: fix DCB Rx queues
+ - app/testpmd: support specify TCs when DCB forward
+ - app/testpmd: support multi-cores process one TC
+
* Wed Nov 05 2025 huangdonghua <huangdonghua3(a)h-partners.com> - 23.11-37
Fix unrelease VLAN resource and L4 protocol retrieval from L3 header:
- net/hns3: fix unrelease VLAN resource when init fail
--
2.33.0
Minor fix and cleanup.
Junxian Huang (2):
libhns: Fix wrong WQE data when QP wraps around
libhns: Clean up an extra blank line
providers/hns/hns_roce_u_hw_v2.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--
2.33.0
25 Jul '25
Support roce_dfx_sta query
Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
---
...-roce-Fix-array-out-of-bounds-access.patch | 110 ++++++++
...l-roce-Support-to-print-u64-reg_data.patch | 198 ++++++++++++++
...d-roce_dfx_sta-cmd-for-RoCE-DFX-stat.patch | 248 ++++++++++++++++++
hikptool.spec | 8 +-
4 files changed, 563 insertions(+), 1 deletion(-)
create mode 100644 0100-hikptool-roce-Fix-array-out-of-bounds-access.patch
create mode 100644 0101-hikptool-roce-Support-to-print-u64-reg_data.patch
create mode 100644 0102-hikptool-roce-Add-roce_dfx_sta-cmd-for-RoCE-DFX-stat.patch
diff --git a/0100-hikptool-roce-Fix-array-out-of-bounds-access.patch b/0100-hikptool-roce-Fix-array-out-of-bounds-access.patch
new file mode 100644
index 0000000..9b02414
--- /dev/null
+++ b/0100-hikptool-roce-Fix-array-out-of-bounds-access.patch
@@ -0,0 +1,110 @@
+From e4888f49e7d72d23e72537849141901fa2227440 Mon Sep 17 00:00:00 2001
+From: wenglianfa <wenglianfa(a)huawei.com>
+Date: Tue, 20 May 2025 20:37:22 +0800
+Subject: [PATCH 100/102] hikptool/roce: Fix array out-of-bounds access
+
+cur_block_num may be greater than reg data num. As
+a result, out-of-bounds access to the reg_data.offset or
+reg_data.data array may occur during memcpy().
+
+Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
+---
+ net/roce/roce_ext_common/hikp_roce_ext_common.c | 10 ++++++++++
+ net/roce/roce_scc/hikp_roce_scc.c | 10 ++++++----
+ net/roce/roce_trp/hikp_roce_trp.c | 12 +++++++-----
+ 3 files changed, 23 insertions(+), 9 deletions(-)
+
+diff --git a/net/roce/roce_ext_common/hikp_roce_ext_common.c b/net/roce/roce_ext_common/hikp_roce_ext_common.c
+index 9c844f4..fda2cf8 100644
+--- a/net/roce/roce_ext_common/hikp_roce_ext_common.c
++++ b/net/roce/roce_ext_common/hikp_roce_ext_common.c
+@@ -96,6 +96,7 @@ static int hikp_roce_ext_get_res(enum roce_cmd_type cmd_type,
+ struct reg_data *reg = &output->reg;
+ struct hikp_cmd_ret *cmd_ret;
+ uint32_t remain_block;
++ size_t reg_data_size;
+ size_t cur_size;
+ int ret;
+
+@@ -144,6 +145,15 @@ static int hikp_roce_ext_get_res(enum roce_cmd_type cmd_type,
+ }
+
+ cur_size = res_head->cur_block_num * sizeof(uint32_t);
++ /*calculates the size of reg_data in the roce_ext_res_param structure.*/
++ reg_data_size = cmd_ret->rsp_data_num * sizeof(uint32_t) - sizeof(struct roce_ext_head);
++ if (cur_size + reg_array_length * sizeof(uint32_t) > reg_data_size) {
++ printf("hikptool roce_%s cur size error, cur_size: %zu, reg_data_size: %zu.\n",
++ cmd_name, cur_size, reg_data_size);
++ ret = -EINVAL;
++ hikp_roce_ext_reg_data_free(reg);
++ goto get_data_error;
++ }
+ memcpy(reg->offset + block_id,
+ (uint32_t *)&roce_ext_res->reg_data, cur_size);
+ memcpy(reg->data + block_id,
+diff --git a/net/roce/roce_scc/hikp_roce_scc.c b/net/roce/roce_scc/hikp_roce_scc.c
+index 67a2a1e..d8aee47 100644
+--- a/net/roce/roce_scc/hikp_roce_scc.c
++++ b/net/roce/roce_scc/hikp_roce_scc.c
+@@ -169,9 +169,10 @@ static int hikp_roce_scc_get_total_data_num(struct roce_scc_head *res_head,
+ }
+
+ cur_size = roce_scc_res->head.cur_block_num * sizeof(uint32_t);
+- if (cur_size > max_size) {
++ if (cur_size > max_size || roce_scc_res->head.cur_block_num > ROCE_HIKP_SCC_REG_NUM) {
+ printf("hikptool roce_scc log data copy size error, "
+- "data size: 0x%zx, max size: 0x%zx\n", cur_size, max_size);
++ "data size: 0x%zx, max size: 0x%zx, block_num: 0x%x\n",
++ cur_size, max_size, roce_scc_res->head.cur_block_num);
+ ret = -EINVAL;
+ goto get_data_error;
+ }
+@@ -204,10 +205,11 @@ static int hikp_roce_scc_get_next_data(struct roce_scc_head *res_head,
+
+ roce_scc_res = (struct roce_scc_res_param *)cmd_ret->rsp_data;
+ cur_size = roce_scc_res->head.cur_block_num * sizeof(uint32_t);
+- if (cur_size > data_size) {
++ if (cur_size > data_size || roce_scc_res->head.cur_block_num > ROCE_HIKP_SCC_REG_NUM) {
+ hikp_cmd_free(&cmd_ret);
+ printf("hikptool roce_scc next log data copy size error, "
+- "data size: 0x%zx, max size: 0x%zx\n", cur_size, data_size);
++ "data size: 0x%zx, max size: 0x%zx, block_num: 0x%x\n",
++ cur_size, data_size, roce_scc_res->head.cur_block_num);
+ return -EINVAL;
+ }
+ memcpy(*offset, roce_scc_res->reg_data.offset, cur_size);
+diff --git a/net/roce/roce_trp/hikp_roce_trp.c b/net/roce/roce_trp/hikp_roce_trp.c
+index 67dfb8e..8b34409 100644
+--- a/net/roce/roce_trp/hikp_roce_trp.c
++++ b/net/roce/roce_trp/hikp_roce_trp.c
+@@ -192,9 +192,10 @@ static int hikp_roce_trp_get_total_data_num(struct roce_trp_head *res_head,
+ }
+
+ cur_size = roce_trp_res->head.cur_block_num * sizeof(uint32_t);
+- if (cur_size > max_size) {
++ if (cur_size > max_size || roce_trp_res->head.cur_block_num > ROCE_HIKP_TRP_REG_NUM) {
+ printf("hikptool roce_trp log data copy size error, "
+- "data size: 0x%zx, max size: 0x%zx\n", cur_size, max_size);
++ "data size: 0x%zx, max size: 0x%zx, block_num: 0x%x\n",
++ cur_size, max_size, roce_trp_res->head.cur_block_num);
+ hikp_roce_trp_reg_data_free(offset, data);
+ ret = -EINVAL;
+ goto get_data_error;
+@@ -229,10 +230,11 @@ static int hikp_roce_trp_get_next_data(struct roce_trp_head *res_head,
+ roce_trp_res = (struct roce_trp_res_param *)cmd_ret->rsp_data;
+ cur_size = roce_trp_res->head.cur_block_num * sizeof(uint32_t);
+
+- if (cur_size > data_size) {
+- hikp_cmd_free(&cmd_ret);
++ if (cur_size > data_size || roce_trp_res->head.cur_block_num > ROCE_HIKP_TRP_REG_NUM) {
+ printf("hikptool roce_trp next log data copy size error, "
+- "data size: 0x%zx, max size: 0x%zx\n", cur_size, data_size);
++ "data size: 0x%zx, max size: 0x%zx, block_num: 0x%x\n",
++ cur_size, data_size, roce_trp_res->head.cur_block_num);
++ hikp_cmd_free(&cmd_ret);
+ return -EINVAL;
+ }
+ memcpy(*offset, roce_trp_res->reg_data.offset, cur_size);
+--
+2.33.0
+
diff --git a/0101-hikptool-roce-Support-to-print-u64-reg_data.patch b/0101-hikptool-roce-Support-to-print-u64-reg_data.patch
new file mode 100644
index 0000000..1e09945
--- /dev/null
+++ b/0101-hikptool-roce-Support-to-print-u64-reg_data.patch
@@ -0,0 +1,198 @@
+From e680dee0da5cc54d3a71076ecb49f7de88feb62a Mon Sep 17 00:00:00 2001
+From: wenglianfa <wenglianfa(a)huawei.com>
+Date: Thu, 3 Jul 2025 17:25:08 +0800
+Subject: [PATCH 101/102] hikptool/roce: Support to print u64 reg_data
+
+Support to print u64 reg_data.
+
+Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
+---
+ .../roce_ext_common/hikp_roce_ext_common.c | 61 +++++++++++++------
+ .../roce_ext_common/hikp_roce_ext_common.h | 26 +++++++-
+ 2 files changed, 66 insertions(+), 21 deletions(-)
+
+diff --git a/net/roce/roce_ext_common/hikp_roce_ext_common.c b/net/roce/roce_ext_common/hikp_roce_ext_common.c
+index fda2cf8..c225ec8 100644
+--- a/net/roce/roce_ext_common/hikp_roce_ext_common.c
++++ b/net/roce/roce_ext_common/hikp_roce_ext_common.c
+@@ -12,6 +12,7 @@
+ */
+
+ #include "hikp_roce_ext_common.h"
++#include <stddef.h>
+
+ static void hikp_roce_ext_reg_data_free(struct reg_data *reg)
+ {
+@@ -95,9 +96,11 @@ static int hikp_roce_ext_get_res(enum roce_cmd_type cmd_type,
+ struct roce_ext_res_param *roce_ext_res;
+ struct reg_data *reg = &output->reg;
+ struct hikp_cmd_ret *cmd_ret;
++ size_t reg_data_offset;
+ uint32_t remain_block;
+- size_t reg_data_size;
+- size_t cur_size;
++ size_t offset_size;
++ size_t data_size;
++ void *dst_data;
+ int ret;
+
+ /* reg_array_length greater than or equal to 0 ensures that cmd_name
+@@ -117,6 +120,7 @@ static int hikp_roce_ext_get_res(enum roce_cmd_type cmd_type,
+
+ if (block_id == 0) {
+ res_head->total_block_num = roce_ext_res->head.total_block_num;
++ res_head->flags = roce_ext_res->head.flags;
+ if (!res_head->total_block_num) {
+ printf("hikptool roce_%s total_block_num error!\n",
+ cmd_name);
+@@ -124,10 +128,12 @@ static int hikp_roce_ext_get_res(enum roce_cmd_type cmd_type,
+ goto get_data_error;
+ }
+ reg->offset = (uint32_t *)calloc(res_head->total_block_num, sizeof(uint32_t));
+- reg->data = (uint32_t *)calloc(res_head->total_block_num, sizeof(uint32_t));
++ output->per_val_size = res_head->flags & ROCE_HIKP_DATA_U64_FLAG ?
++ sizeof(uint64_t) : sizeof(uint32_t);
++ reg->data = calloc(res_head->total_block_num, output->per_val_size);
+ if ((reg->offset == NULL) || (reg->data == NULL)) {
+- printf("hikptool roce_%s alloc log memmory 0x%zx failed!\n",
+- cmd_name, res_head->total_block_num * sizeof(uint32_t));
++ printf("hikptool roce_%s alloc log memmory failed!\n",
++ cmd_name);
+ ret = -ENOMEM;
+ hikp_roce_ext_reg_data_free(reg);
+ goto get_data_error;
+@@ -144,20 +150,32 @@ static int hikp_roce_ext_get_res(enum roce_cmd_type cmd_type,
+ goto get_data_error;
+ }
+
+- cur_size = res_head->cur_block_num * sizeof(uint32_t);
+- /*calculates the size of reg_data in the roce_ext_res_param structure.*/
+- reg_data_size = cmd_ret->rsp_data_num * sizeof(uint32_t) - sizeof(struct roce_ext_head);
+- if (cur_size + reg_array_length * sizeof(uint32_t) > reg_data_size) {
+- printf("hikptool roce_%s cur size error, cur_size: %zu, reg_data_size: %zu.\n",
+- cmd_name, cur_size, reg_data_size);
++ /*
++ * The data structure `roce_ext_res_param_u64` returned by the
++ * firmware is 8-byte aligned, so the offset of the `reg_data`
++ * member needs to be adjusted accordingly.
++ */
++ if (res_head->flags & ROCE_HIKP_DATA_U64_FLAG)
++ reg_data_offset = offsetof(struct roce_ext_res_param_u64, reg_data);
++ else
++ reg_data_offset = offsetof(struct roce_ext_res_param, reg_data);
++
++ offset_size = res_head->cur_block_num * sizeof(uint32_t);
++ data_size = res_head->cur_block_num * output->per_val_size;
++ dst_data = reg->data_u32 + block_id * output->per_val_size / sizeof(uint32_t);
++ /* Avoid memcpy out-of-bounds. */
++ if ((reg_data_offset + data_size) / sizeof(uint32_t) + reg_array_length > cmd_ret->rsp_data_num) {
++ printf("hikptool roce_%s cur size error, data_size: %zu, rsp_data_num: %u.\n",
++ cmd_name, data_size, cmd_ret->rsp_data_num);
+ ret = -EINVAL;
+ hikp_roce_ext_reg_data_free(reg);
+ goto get_data_error;
+ }
+ memcpy(reg->offset + block_id,
+- (uint32_t *)&roce_ext_res->reg_data, cur_size);
+- memcpy(reg->data + block_id,
+- (uint32_t *)&roce_ext_res->reg_data + reg_array_length, cur_size);
++ (uint32_t *)&roce_ext_res->head + reg_data_offset / sizeof(uint32_t),
++ offset_size);
++ memcpy(dst_data, (uint32_t *)&roce_ext_res->head + reg_data_offset
++ / sizeof(uint32_t) + reg_array_length, data_size);
+
+ get_data_error:
+ hikp_cmd_free(&cmd_ret);
+@@ -172,15 +190,20 @@ static void hikp_roce_ext_print(enum roce_cmd_type cmd_type,
+ const char *cmd_name = get_cmd_name(cmd_type);
+ uint8_t arr_len = output->reg_name.arr_len;
+ uint32_t *offset = output->reg.offset;
+- uint32_t *data = output->reg.data;
++ struct reg_data *reg = &output->reg;
++ const char *name;
+ uint32_t i;
+
+ printf("**************%s INFO*************\n", cmd_name);
+ printf("%-40s[addr_offset] : reg_data\n", "reg_name");
+- for (i = 0; i < total_block_num; i++)
+- printf("%-40s[0x%08X] : 0x%08X\n",
+- i < arr_len ? reg_name[i] : "",
+- offset[i], data[i]);
++ for (i = 0; i < total_block_num; i++) {
++ name = i < arr_len ? reg_name[i] : "";
++ printf("%-40s[0x%08X] : ", name, offset[i]);
++ if (output->res_head.flags & ROCE_HIKP_DATA_U64_FLAG)
++ printf("0x%016lX\n", reg->data_u64[i]);
++ else
++ printf("0x%08X\n", reg->data_u32[i]);
++ }
+ printf("************************************\n");
+ }
+
+diff --git a/net/roce/roce_ext_common/hikp_roce_ext_common.h b/net/roce/roce_ext_common/hikp_roce_ext_common.h
+index 8568556..6f04024 100644
+--- a/net/roce/roce_ext_common/hikp_roce_ext_common.h
++++ b/net/roce/roce_ext_common/hikp_roce_ext_common.h
+@@ -17,6 +17,7 @@
+ #include "hikp_net_lib.h"
+
+ #define ROCE_MAX_REG_NUM (NET_MAX_REQ_DATA_NUM - 1)
++#define ROCE_MAX_U64_REG_NUM 18
+
+ #define ROCE_HIKP_CAEP_REG_NUM_EXT ROCE_MAX_REG_NUM
+ #define ROCE_HIKP_GMV_REG_NUM_EXT ROCE_MAX_REG_NUM
+@@ -30,11 +31,15 @@
+ #define ROCE_HIKP_RST_REG_NUM ROCE_MAX_REG_NUM
+ #define ROCE_HIKP_GLOBAL_CFG_REG_NUM ROCE_MAX_REG_NUM
+ #define ROCE_HIKP_BOND_REG_NUM ROCE_MAX_REG_NUM
++#define ROCE_HIKP_DFX_STA_NUM_EXT ROCE_MAX_U64_REG_NUM
++
++#define ROCE_HIKP_DATA_U64_FLAG 1 << 0
+
+ struct roce_ext_head {
+ uint8_t total_block_num;
+ uint8_t cur_block_num;
+- uint16_t reserved;
++ uint8_t flags;
++ uint8_t reserved;
+ };
+
+ struct roce_ext_res_param {
+@@ -42,9 +47,25 @@ struct roce_ext_res_param {
+ uint32_t reg_data[0];
+ };
+
++struct roce_ext_res_data_u64 {
++ uint32_t offset[ROCE_MAX_U64_REG_NUM];
++ uint64_t data[ROCE_MAX_U64_REG_NUM];
++ uint32_t rsv[4];
++};
++
++struct roce_ext_res_param_u64 {
++ struct roce_ext_head head;
++ uint32_t rsv;
++ struct roce_ext_res_data_u64 reg_data;
++};
++
+ struct reg_data {
+ uint32_t *offset;
+- uint32_t *data;
++ union {
++ void *data;
++ uint32_t *data_u32;
++ uint64_t *data_u64;
++ };
+ };
+
+ struct roce_ext_reg_name {
+@@ -55,6 +76,7 @@ struct roce_ext_reg_name {
+ struct roce_ext_res_output {
+ struct roce_ext_head res_head;
+ struct reg_data reg;
++ uint32_t per_val_size;
+ struct roce_ext_reg_name reg_name;
+ };
+
+--
+2.33.0
+
diff --git a/0102-hikptool-roce-Add-roce_dfx_sta-cmd-for-RoCE-DFX-stat.patch b/0102-hikptool-roce-Add-roce_dfx_sta-cmd-for-RoCE-DFX-stat.patch
new file mode 100644
index 0000000..1c2d9eb
--- /dev/null
+++ b/0102-hikptool-roce-Add-roce_dfx_sta-cmd-for-RoCE-DFX-stat.patch
@@ -0,0 +1,248 @@
+From f704e9fc2d5d878e669b303ec8571e54c734e811 Mon Sep 17 00:00:00 2001
+From: wenglianfa <wenglianfa(a)huawei.com>
+Date: Wed, 2 Jul 2025 11:46:15 +0800
+Subject: [PATCH 102/102] hikptool/roce: Add roce_dfx_sta cmd for RoCE DFX
+ statistics
+
+Add roce_dfx_sta cmd for RoCE DFX statistics.
+
+Example:
+hikptool roce_dfx_sta -i eth1
+
+Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
+---
+ info_collect/hikp_collect_roce.c | 22 ++++
+ net/hikp_net_lib.h | 1 +
+ net/roce/roce_dfx_sta/hikp_roce_dfx_sta.c | 107 ++++++++++++++++++
+ net/roce/roce_dfx_sta/hikp_roce_dfx_sta.h | 33 ++++++
+ .../roce_ext_common/hikp_roce_ext_common.c | 1 +
+ 5 files changed, 164 insertions(+)
+ create mode 100644 net/roce/roce_dfx_sta/hikp_roce_dfx_sta.c
+ create mode 100644 net/roce/roce_dfx_sta/hikp_roce_dfx_sta.h
+
+diff --git a/info_collect/hikp_collect_roce.c b/info_collect/hikp_collect_roce.c
+index baf2899..01d773b 100644
+--- a/info_collect/hikp_collect_roce.c
++++ b/info_collect/hikp_collect_roce.c
+@@ -26,6 +26,7 @@
+ #include "hikp_roce_tsp.h"
+ #include "hikp_roce_scc.h"
+ #include "hikp_roce_gmv.h"
++#include "hikp_roce_dfx_sta.h"
+
+ static void collect_roce_devinfo_log(void)
+ {
+@@ -125,6 +126,26 @@ static int collect_hikp_roce_gmv_log(void *nic_name)
+ return 0;
+ }
+
++static int collect_hikp_roce_dfx_sta_log(void *nic_name)
++{
++ struct major_cmd_ctrl self = {0};
++ struct hikp_cmd_type type = {0};
++ int ret;
++
++ self.cmd_ptr = &type;
++ ret = hikp_roce_set_dfx_sta_bdf((char *)nic_name);
++ if (ret) {
++ HIKP_ERROR_PRINT("failed to set roce_dfx_sta bdf for %s.\n",
++ (char *)nic_name);
++ return ret;
++ }
++
++ printf("hikptool roce_dfx_sta -i %s\n", (char *)nic_name);
++ hikp_roce_dfx_sta_execute(&self);
++
++ return 0;
++}
++
+ static int collect_hikp_roce_scc_log(void *nic_name)
+ {
+ struct major_cmd_ctrl self = {0};
+@@ -466,6 +487,7 @@ static int collect_one_roce_hikp_log(void *net_name)
+ { "roce_tsp", collect_hikp_roce_tsp_log },
+ { "roce_scc", collect_hikp_roce_scc_log },
+ { "roce_gmv", collect_hikp_roce_gmv_log },
++ { "roce_dfx_sta", collect_hikp_roce_dfx_sta_log },
+ };
+ size_t i;
+
+diff --git a/net/hikp_net_lib.h b/net/hikp_net_lib.h
+index 7ebabfa..aa700ab 100644
+--- a/net/hikp_net_lib.h
++++ b/net/hikp_net_lib.h
+@@ -103,6 +103,7 @@ enum roce_cmd_type {
+ GET_ROCEE_RST_CMD,
+ GET_ROCEE_GLOBAL_CFG_CMD,
+ GET_ROCEE_BOND_CMD,
++ GET_ROCEE_DFX_STA_CMD,
+ };
+
+ enum ub_cmd_type {
+diff --git a/net/roce/roce_dfx_sta/hikp_roce_dfx_sta.c b/net/roce/roce_dfx_sta/hikp_roce_dfx_sta.c
+new file mode 100644
+index 0000000..b74507c
+--- /dev/null
++++ b/net/roce/roce_dfx_sta/hikp_roce_dfx_sta.c
+@@ -0,0 +1,107 @@
++/*
++ * Copyright (c) 2025 Hisilicon Technologies Co., Ltd.
++ * Hikptool is licensed under Mulan PSL v2.
++ * You can use this software according to the terms and conditions of the Mulan PSL v2.
++ * You may obtain a copy of Mulan PSL v2 at:
++ * http://license.coscl.org.cn/MulanPSL2
++ * THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND,
++ * EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT,
++ * MERCHANTABILITY OR FIT FOR A PARTICULAR PURPOSE.
++ *
++ * See the Mulan PSL v2 for more details.
++ */
++
++#include "hikp_roce_dfx_sta.h"
++
++static struct cmd_roce_dfx_sta_param_t g_roce_dfx_sta_param_t = { 0 };
++
++int hikp_roce_set_dfx_sta_bdf(char *nic_name)
++{
++ return tool_check_and_get_valid_bdf_id(nic_name,
++ &g_roce_dfx_sta_param_t.target);
++}
++
++static int hikp_roce_dfx_sta_help(struct major_cmd_ctrl *self, const char *argv)
++{
++ HIKP_SET_USED(argv);
++
++ printf("\n Usage: %s %s\n", self->cmd_ptr->name, "-i <interface>\n");
++ printf("\n %s\n", self->cmd_ptr->help_info);
++ printf(" Options:\n\n");
++ printf(" %s, %-25s %s\n", "-h", "--help", "display this help and exit");
++ printf(" %s, %-25s %s\n", "-i", "--interface=<interface>", "device target, e.g. eth0");
++ printf(" %s, %-25s %s\n", "-c", "--clear=<clear>", "clear param count registers");
++ printf("\n");
++
++ return 0;
++}
++
++static int hikp_roce_dfx_sta_target(struct major_cmd_ctrl *self, const char *argv)
++{
++ self->err_no = tool_check_and_get_valid_bdf_id(argv, &(g_roce_dfx_sta_param_t.target));
++ if (self->err_no != 0)
++ snprintf(self->err_str, sizeof(self->err_str), "Unknown device %s.", argv);
++
++ return self->err_no;
++}
++
++static int hikp_roce_dfx_sta_clear_set(struct major_cmd_ctrl *self, const char *argv)
++{
++ HIKP_SET_USED(self);
++ HIKP_SET_USED(argv);
++
++ g_roce_dfx_sta_param_t.reset_flag = 1;
++ return 0;
++}
++
++/* DON'T change the order of this array or add entries between! */
++static const char *g_dfx_sta_reg_name[] = {
++ "PKT_RNR_STA",
++ "PKT_RTY_STA",
++ "MSN_RTY_STA",
++};
++
++static int hikp_roce_dfx_sta_get_data(struct hikp_cmd_ret **cmd_ret,
++ uint32_t block_id,
++ struct roce_ext_reg_name *reg_name)
++{
++ struct hikp_cmd_header req_header = { 0 };
++ struct roce_dfx_sta_req_param req_data;
++ uint32_t req_size;
++ int ret;
++
++ reg_name->reg_name = g_dfx_sta_reg_name;
++ reg_name->arr_len = HIKP_ARRAY_SIZE(g_dfx_sta_reg_name);
++
++ req_data.reset_flag = g_roce_dfx_sta_param_t.reset_flag;
++ req_data.bdf = g_roce_dfx_sta_param_t.target.bdf;
++ req_data.block_id = block_id;
++
++ req_size = sizeof(struct roce_dfx_sta_req_param);
++ hikp_cmd_init(&req_header, ROCE_MOD, GET_ROCEE_DFX_STA_CMD, 0);
++ *cmd_ret = hikp_cmd_alloc(&req_header, &req_data, req_size);
++ ret = hikp_rsp_normal_check(*cmd_ret);
++ if (ret)
++ printf("hikptool roce_dfx_sta get cmd data failed, ret: %d\n", ret);
++
++ return ret;
++}
++
++void hikp_roce_dfx_sta_execute(struct major_cmd_ctrl *self)
++{
++ hikp_roce_ext_execute(self, GET_ROCEE_DFX_STA_CMD, hikp_roce_dfx_sta_get_data);
++}
++
++static void cmd_roce_dfx_sta_init(void)
++{
++ struct major_cmd_ctrl *major_cmd = get_major_cmd();
++
++ major_cmd->option_count = 0;
++ major_cmd->execute = hikp_roce_dfx_sta_execute;
++
++ cmd_option_register("-h", "--help", false, hikp_roce_dfx_sta_help);
++ cmd_option_register("-i", "--interface", true, hikp_roce_dfx_sta_target);
++ cmd_option_register("-c", "--clear", false, hikp_roce_dfx_sta_clear_set);
++}
++
++HIKP_CMD_DECLARE("roce_dfx_sta", "get or clear RoCE dfx statistics", cmd_roce_dfx_sta_init);
+diff --git a/net/roce/roce_dfx_sta/hikp_roce_dfx_sta.h b/net/roce/roce_dfx_sta/hikp_roce_dfx_sta.h
+new file mode 100644
+index 0000000..b515356
+--- /dev/null
++++ b/net/roce/roce_dfx_sta/hikp_roce_dfx_sta.h
+@@ -0,0 +1,33 @@
++/*
++ * Copyright (c) 2025 Hisilicon Technologies Co., Ltd.
++ * Hikptool is licensed under Mulan PSL v2.
++ * You can use this software according to the terms and conditions of the Mulan PSL v2.
++ * You may obtain a copy of Mulan PSL v2 at:
++ * http://license.coscl.org.cn/MulanPSL2
++ * THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND,
++ * EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT,
++ * MERCHANTABILITY OR FIT FOR A PARTICULAR PURPOSE.
++ *
++ * See the Mulan PSL v2 for more details.
++ */
++
++#ifndef HIKP_ROCE_DFX_STA_H
++#define HIKP_ROCE_DFX_STA_H
++
++#include "hikp_roce_ext_common.h"
++
++struct cmd_roce_dfx_sta_param_t {
++ uint8_t reset_flag;
++ struct tool_target target;
++};
++
++struct roce_dfx_sta_req_param {
++ struct bdf_t bdf;
++ uint32_t block_id;
++ uint8_t reset_flag;
++};
++
++int hikp_roce_set_dfx_sta_bdf(char *nic_name);
++void hikp_roce_dfx_sta_execute(struct major_cmd_ctrl *self);
++
++#endif /* HIKP_ROCE_DFX_STA_H */
+diff --git a/net/roce/roce_ext_common/hikp_roce_ext_common.c b/net/roce/roce_ext_common/hikp_roce_ext_common.c
+index c225ec8..ac6c8fb 100644
+--- a/net/roce/roce_ext_common/hikp_roce_ext_common.c
++++ b/net/roce/roce_ext_common/hikp_roce_ext_common.c
+@@ -44,6 +44,7 @@ static const struct cmd_type_info {
+ {GET_ROCEE_RST_CMD, "RST", ROCE_HIKP_RST_REG_NUM},
+ {GET_ROCEE_GLOBAL_CFG_CMD, "GLOBAL_CFG", ROCE_HIKP_GLOBAL_CFG_REG_NUM},
+ {GET_ROCEE_BOND_CMD, "BOND", ROCE_HIKP_BOND_REG_NUM},
++ {GET_ROCEE_DFX_STA_CMD, "DFX_STA", ROCE_HIKP_DFX_STA_NUM_EXT},
+ };
+
+ static int get_cmd_info_table_idx(enum roce_cmd_type cmd_type)
+--
+2.33.0
+
diff --git a/hikptool.spec b/hikptool.spec
index c19df02..c6ea91f 100644
--- a/hikptool.spec
+++ b/hikptool.spec
@@ -3,7 +3,7 @@
Name: hikptool
Summary: A userspace tool for Linux providing problem location on Kunpeng chips
Version: 1.0.0
-Release: 18
+Release: 19
License: MulanPSL2
Source: %{name}-%{version}.tar.gz
ExclusiveOS: linux
@@ -115,6 +115,9 @@ Patch0096: 0096-Hikptool-add-support-dump-SDMA-register-information-.patch
Patch0097: 0097-Add-support-collect-sdma-hikptool-dump-reg-info.patch
Patch0098: 0098-hikptool-Update-the-tool-version-number-to-1.1.4.patch
Patch0099: 0099-hikptool-The-cpu_ring-command-is-added.patch
+Patch0100: 0100-hikptool-roce-Fix-array-out-of-bounds-access.patch
+Patch0101: 0101-hikptool-roce-Support-to-print-u64-reg_data.patch
+Patch0102: 0102-hikptool-roce-Add-roce_dfx_sta-cmd-for-RoCE-DFX-stat.patch
%description
This package contains the hikptool
@@ -167,6 +170,9 @@ fi
/sbin/ldconfig
%changelog
+* Fri Jul 25 2025 Junxian Huang <huangjunxian6(a)hisilicon.com> 1.0.0-19
+- Support roce_dfx_sta query
+
* Fri Jun 6 2025 veega2022 <zhuweijia(a)huawei.com> 1.0.0-18
- The cpu_ring command is added.
--
2.33.0
Hello!
sig-high-performance-network invites you to a Zoom meeting (auto-recorded) to be held at 2025-07-10 11:00.
Meeting subject: High-performance-network SIG regular meeting
Meeting link: https://us06web.zoom.us/j/84375148100?pwd=bJ4jQahZXEwaqIDDNYbdTibpgOiHp0.1
Meeting minutes and agenda: https://etherpad.openeuler.org/p/sig-high-performance-network-meetings
More information: https://www.openeuler.org/en/ (English) or https://www.openeuler.org/zh/ (Chinese)
From: Guofeng Yue <yueguofeng(a)h-partners.com>
1. Fix the double-free of rinl_buf->wqe_list
2. Fix ret not assigned in create_srq
3. Sync the TD (thread domain) lock-free code from the mainline community driver
Signed-off-by: Guofeng Yue <yueguofeng(a)h-partners.com>
---
...Fix-double-free-of-rinl-buf-wqe-list.patch | 48 ++
...s-Fix-ret-not-assigned-in-create-srq.patch | 46 ++
...hns-Add-error-logs-to-help-diagnosis.patch | 240 ++++++++
...lock-free-codes-from-mainline-driver.patch | 519 ++++++++++++++++++
...-Assign-ibv-srq-pd-when-creating-SRQ.patch | 31 ++
0104-libhns-Clean-up-data-type-issues.patch | 113 ++++
...hns-Add-debug-log-for-lock-free-mode.patch | 46 ++
rdma-core.spec | 15 +-
8 files changed, 1057 insertions(+), 1 deletion(-)
create mode 100644 0099-libhns-Fix-double-free-of-rinl-buf-wqe-list.patch
create mode 100644 0100-libhns-Fix-ret-not-assigned-in-create-srq.patch
create mode 100644 0101-libhns-Add-error-logs-to-help-diagnosis.patch
create mode 100644 0102-libhns-Sync-lock-free-codes-from-mainline-driver.patch
create mode 100644 0103-verbs-Assign-ibv-srq-pd-when-creating-SRQ.patch
create mode 100644 0104-libhns-Clean-up-data-type-issues.patch
create mode 100644 0105-libhns-Add-debug-log-for-lock-free-mode.patch
diff --git a/0099-libhns-Fix-double-free-of-rinl-buf-wqe-list.patch b/0099-libhns-Fix-double-free-of-rinl-buf-wqe-list.patch
new file mode 100644
index 0000000..042c03d
--- /dev/null
+++ b/0099-libhns-Fix-double-free-of-rinl-buf-wqe-list.patch
@@ -0,0 +1,48 @@
+From 0a14854f63540a745fcda95872d4ae0298bbc5f0 Mon Sep 17 00:00:00 2001
+From: wenglianfa <wenglianfa(a)huawei.com>
+Date: Mon, 26 May 2025 21:20:29 +0800
+Subject: [PATCH 099/105] libhns: Fix double-free of rinl_buf->wqe_list
+
+rinl_buf->wqe_list will be double-freed in error flow, first in
+alloc_recv_rinl_buf() and then in free_recv_rinl_buf(). Actually
+free_recv_rinl_buf() shouldn't be called when alloc_recv_rinl_buf()
+failed.
+
+Fixes: 83b0baff3ccf ("libhns: Refactor rq inline")
+Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+---
+ providers/hns/hns_roce_u_verbs.c | 8 +++++---
+ 1 file changed, 5 insertions(+), 3 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index 7418d2c..7d83a33 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -1658,18 +1658,20 @@ static int qp_alloc_wqe(struct ibv_qp_init_attr_ex *attr,
+ qp->dca_wqe.shift = qp->pageshift;
+ qp->dca_wqe.bufs = calloc(qp->dca_wqe.max_cnt, sizeof(void *));
+ if (!qp->dca_wqe.bufs)
+- goto err_alloc;
++ goto err_alloc_recv_rinl_buf;
+ verbs_debug(&ctx->ibv_ctx, "alloc DCA buf.\n");
+ } else {
+ if (hns_roce_alloc_buf(&qp->buf, qp->buf_size,
+ 1 << qp->pageshift))
+- goto err_alloc;
++ goto err_alloc_recv_rinl_buf;
+ }
+
+ return 0;
+
+-err_alloc:
++err_alloc_recv_rinl_buf:
+ free_recv_rinl_buf(&qp->rq_rinl_buf);
++
++err_alloc:
+ if (qp->rq.wrid)
+ free(qp->rq.wrid);
+
+--
+2.33.0
+
diff --git a/0100-libhns-Fix-ret-not-assigned-in-create-srq.patch b/0100-libhns-Fix-ret-not-assigned-in-create-srq.patch
new file mode 100644
index 0000000..fa05de9
--- /dev/null
+++ b/0100-libhns-Fix-ret-not-assigned-in-create-srq.patch
@@ -0,0 +1,46 @@
+From 138d2d80aea27adea77fee042ba6107adaee8687 Mon Sep 17 00:00:00 2001
+From: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Date: Wed, 23 Apr 2025 16:55:14 +0800
+Subject: [PATCH 100/105] libhns: Fix ret not assigned in create_srq()
+
+Fix the problem that ret may not be assigned in the error flow
+of create_srq().
+
+Fixes: b38bae4b5b9e ("libhns: Add support for lock-free SRQ")
+Fixes: b914c76318f5 ("libhns: Refactor the process of create_srq")
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+---
+ providers/hns/hns_roce_u_verbs.c | 10 +++++++---
+ 1 file changed, 7 insertions(+), 3 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index 7d83a33..3a1c40e 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -1070,16 +1070,20 @@ static struct ibv_srq *create_srq(struct ibv_context *context,
+ goto err;
+ }
+
+- if (hns_roce_srq_spinlock_init(context, srq, init_attr))
++ ret = hns_roce_srq_spinlock_init(context, srq, init_attr);
++ if (ret)
+ goto err_free_srq;
+
+ set_srq_param(context, srq, init_attr);
+- if (alloc_srq_buf(srq))
++ ret = alloc_srq_buf(srq);
++ if (ret)
+ goto err_destroy_lock;
+
+ srq->rdb = hns_roce_alloc_db(hr_ctx, HNS_ROCE_SRQ_TYPE_DB);
+- if (!srq->rdb)
++ if (!srq->rdb) {
++ ret = ENOMEM;
+ goto err_srq_buf;
++ }
+
+ ret = exec_srq_create_cmd(context, srq, init_attr);
+ if (ret)
+--
+2.33.0
+
diff --git a/0101-libhns-Add-error-logs-to-help-diagnosis.patch b/0101-libhns-Add-error-logs-to-help-diagnosis.patch
new file mode 100644
index 0000000..1edde09
--- /dev/null
+++ b/0101-libhns-Add-error-logs-to-help-diagnosis.patch
@@ -0,0 +1,240 @@
+From b9513a369315c7d5c56b19b468369f1a6025d45f Mon Sep 17 00:00:00 2001
+From: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Date: Fri, 27 Dec 2024 14:02:29 +0800
+Subject: [PATCH 101/105] libhns: Add error logs to help diagnosis
+
+Add error logs to help diagnosis.
+
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+---
+ providers/hns/hns_roce_u.c | 4 +-
+ providers/hns/hns_roce_u_hw_v2.c | 3 ++
+ providers/hns/hns_roce_u_verbs.c | 87 +++++++++++++++++++++++++-------
+ 3 files changed, 74 insertions(+), 20 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u.c b/providers/hns/hns_roce_u.c
+index dfcd798..32a73c7 100644
+--- a/providers/hns/hns_roce_u.c
++++ b/providers/hns/hns_roce_u.c
+@@ -268,8 +268,10 @@ static int hns_roce_mmap(struct hns_roce_device *hr_dev,
+
+ context->uar = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
+ MAP_SHARED, cmd_fd, 0);
+- if (context->uar == MAP_FAILED)
++ if (context->uar == MAP_FAILED) {
++ verbs_err(&context->ibv_ctx, "error: failed to mmap uar page.\n");
+ return -ENOMEM;
++ }
+
+ return 0;
+ }
+diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
+index 70fe2f7..56a42e7 100644
+--- a/providers/hns/hns_roce_u_hw_v2.c
++++ b/providers/hns/hns_roce_u_hw_v2.c
+@@ -3131,6 +3131,9 @@ static int fill_send_wr_ops(const struct ibv_qp_init_attr_ex *attr,
+ fill_send_wr_ops_ud(qp_ex);
+ break;
+ default:
++ verbs_err(verbs_get_ctx(qp_ex->qp_base.context),
++ "QP type %d not supported for qp_ex send ops.\n",
++ attr->qp_type);
+ return -EOPNOTSUPP;
+ }
+
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index 3a1c40e..271525a 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -524,8 +524,11 @@ static int verify_cq_create_attr(struct ibv_cq_init_attr_ex *attr,
+ struct hns_roce_context *context,
+ struct hnsdv_cq_init_attr *hns_cq_attr)
+ {
+- if (!attr->cqe || attr->cqe > context->max_cqe)
+- return -EINVAL;
++ if (!attr->cqe || attr->cqe > context->max_cqe) {
++ verbs_err(&context->ibv_ctx, "unsupported cq depth %u.\n",
++ attr->cqe);
++ return EINVAL;
++ }
+
+ if (!check_comp_mask(attr->comp_mask, CREATE_CQ_SUPPORTED_COMP_MASK)) {
+ verbs_err(&context->ibv_ctx, "unsupported cq comps 0x%x\n",
+@@ -533,8 +536,11 @@ static int verify_cq_create_attr(struct ibv_cq_init_attr_ex *attr,
+ return EOPNOTSUPP;
+ }
+
+- if (!check_comp_mask(attr->wc_flags, CREATE_CQ_SUPPORTED_WC_FLAGS))
+- return -EOPNOTSUPP;
++ if (!check_comp_mask(attr->wc_flags, CREATE_CQ_SUPPORTED_WC_FLAGS)) {
++ verbs_err(&context->ibv_ctx, "unsupported wc flags 0x%llx.\n",
++ attr->wc_flags);
++ return EOPNOTSUPP;
++ }
+
+ if (attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_PD) {
+ if (!to_hr_pad(attr->parent_domain)) {
+@@ -617,8 +623,11 @@ static int exec_cq_create_cmd(struct ibv_context *context,
+ ret = ibv_cmd_create_cq_ex(context, attr, &cq->verbs_cq,
+ &cmd_ex.ibv_cmd, sizeof(cmd_ex),
+ &resp_ex.ibv_resp, sizeof(resp_ex), 0);
+- if (ret)
++ if (ret) {
++ verbs_err(verbs_get_ctx(context),
++ "failed to exec create cq cmd, ret = %d.\n", ret);
+ return ret;
++ }
+
+ cq->cqn = resp_drv->cqn;
+ cq->flags = resp_drv->cap_flags;
+@@ -877,13 +886,20 @@ static int verify_srq_create_attr(struct hns_roce_context *context,
+ struct ibv_srq_init_attr_ex *attr)
+ {
+ if (attr->srq_type != IBV_SRQT_BASIC &&
+- attr->srq_type != IBV_SRQT_XRC)
++ attr->srq_type != IBV_SRQT_XRC) {
++ verbs_err(&context->ibv_ctx,
++ "unsupported srq type, type = %d.\n", attr->srq_type);
+ return -EINVAL;
++ }
+
+ if (!attr->attr.max_sge ||
+ attr->attr.max_wr > context->max_srq_wr ||
+- attr->attr.max_sge > context->max_srq_sge)
++ attr->attr.max_sge > context->max_srq_sge) {
++ verbs_err(&context->ibv_ctx,
++ "invalid srq attr size, max_wr = %u, max_sge = %u.\n",
++ attr->attr.max_wr, attr->attr.max_sge);
+ return -EINVAL;
++ }
+
+ attr->attr.max_wr = max_t(uint32_t, attr->attr.max_wr,
+ HNS_ROCE_MIN_SRQ_WQE_NUM);
+@@ -1015,8 +1031,12 @@ static int exec_srq_create_cmd(struct ibv_context *context,
+ ret = ibv_cmd_create_srq_ex(context, &srq->verbs_srq, init_attr,
+ &cmd_ex.ibv_cmd, sizeof(cmd_ex),
+ &resp_ex.ibv_resp, sizeof(resp_ex));
+- if (ret)
++ if (ret) {
++ verbs_err(verbs_get_ctx(context),
++ "failed to exec create srq cmd, ret = %d.\n",
++ ret);
+ return ret;
++ }
+
+ srq->srqn = resp_ex.srqn;
+ srq->cap_flags = resp_ex.cap_flags;
+@@ -1340,9 +1360,12 @@ static int check_qp_create_mask(struct hns_roce_context *ctx,
+ struct ibv_qp_init_attr_ex *attr)
+ {
+ struct hns_roce_device *hr_dev = to_hr_dev(ctx->ibv_ctx.context.device);
++ int ret = 0;
+
+- if (!check_comp_mask(attr->comp_mask, CREATE_QP_SUP_COMP_MASK))
+- return -EOPNOTSUPP;
++ if (!check_comp_mask(attr->comp_mask, CREATE_QP_SUP_COMP_MASK)) {
++ ret = EOPNOTSUPP;
++ goto out;
++ }
+
+ if (attr->comp_mask & IBV_QP_INIT_ATTR_SEND_OPS_FLAGS &&
+ !check_comp_mask(attr->send_ops_flags, SEND_OPS_FLAG_MASK))
+@@ -1351,22 +1374,26 @@ static int check_qp_create_mask(struct hns_roce_context *ctx,
+ switch (attr->qp_type) {
+ case IBV_QPT_UD:
+ if (hr_dev->hw_version == HNS_ROCE_HW_VER2)
+- return -EINVAL;
++ return EINVAL;
+ SWITCH_FALLTHROUGH;
+ case IBV_QPT_RC:
+ case IBV_QPT_XRC_SEND:
+ if (!(attr->comp_mask & IBV_QP_INIT_ATTR_PD))
+- return -EINVAL;
++ ret = EINVAL;
+ break;
+ case IBV_QPT_XRC_RECV:
+ if (!(attr->comp_mask & IBV_QP_INIT_ATTR_XRCD))
+- return -EINVAL;
++ ret = EINVAL;
+ break;
+ default:
+- return -EINVAL;
++ return EOPNOTSUPP;
+ }
+
+- return 0;
++out:
++ if (ret)
++ verbs_err(&ctx->ibv_ctx, "invalid comp_mask 0x%x.\n",
++ attr->comp_mask);
++ return ret;
+ }
+
+ static int hns_roce_qp_has_rq(struct ibv_qp_init_attr_ex *attr)
+@@ -1391,8 +1418,13 @@ static int verify_qp_create_cap(struct hns_roce_context *ctx,
+ if (cap->max_send_wr > ctx->max_qp_wr ||
+ cap->max_recv_wr > ctx->max_qp_wr ||
+ cap->max_send_sge > ctx->max_sge ||
+- cap->max_recv_sge > ctx->max_sge)
++ cap->max_recv_sge > ctx->max_sge) {
++ verbs_err(&ctx->ibv_ctx,
++ "invalid qp cap size, max_send/recv_wr = {%u, %u}, max_send/recv_sge = {%u, %u}.\n",
++ cap->max_send_wr, cap->max_recv_wr,
++ cap->max_send_sge, cap->max_recv_sge);
+ return -EINVAL;
++ }
+
+ has_rq = hns_roce_qp_has_rq(attr);
+ if (!has_rq) {
+@@ -1401,12 +1433,20 @@ static int verify_qp_create_cap(struct hns_roce_context *ctx,
+ }
+
+ min_wqe_num = HNS_ROCE_V2_MIN_WQE_NUM;
+- if (cap->max_send_wr < min_wqe_num)
++ if (cap->max_send_wr < min_wqe_num) {
++ verbs_debug(&ctx->ibv_ctx,
++ "change sq depth from %u to minimum %u.\n",
++ cap->max_send_wr, min_wqe_num);
+ cap->max_send_wr = min_wqe_num;
++ }
+
+ if (cap->max_recv_wr) {
+- if (cap->max_recv_wr < min_wqe_num)
++ if (cap->max_recv_wr < min_wqe_num) {
++ verbs_debug(&ctx->ibv_ctx,
++ "change rq depth from %u to minimum %u.\n",
++ cap->max_recv_wr, min_wqe_num);
+ cap->max_recv_wr = min_wqe_num;
++ }
+
+ if (!cap->max_recv_sge)
+ return -EINVAL;
+@@ -1916,6 +1956,11 @@ static int qp_exec_create_cmd(struct ibv_qp_init_attr_ex *attr,
+ ret = ibv_cmd_create_qp_ex2(&ctx->ibv_ctx.context, &qp->verbs_qp, attr,
+ &cmd_ex.ibv_cmd, sizeof(cmd_ex),
+ &resp_ex.ibv_resp, sizeof(resp_ex));
++ if (ret) {
++ verbs_err(&ctx->ibv_ctx,
++ "failed to exec create qp cmd, ret = %d.\n", ret);
++ return ret;
++ }
+
+ qp->flags = resp_ex.drv_payload.cap_flags;
+ *dwqe_mmap_key = resp_ex.drv_payload.dwqe_mmap_key;
+@@ -1977,8 +2022,12 @@ static int mmap_dwqe(struct ibv_context *ibv_ctx, struct hns_roce_qp *qp,
+ {
+ qp->dwqe_page = mmap(NULL, HNS_ROCE_DWQE_PAGE_SIZE, PROT_WRITE,
+ MAP_SHARED, ibv_ctx->cmd_fd, dwqe_mmap_key);
+- if (qp->dwqe_page == MAP_FAILED)
++ if (qp->dwqe_page == MAP_FAILED) {
++ verbs_err(verbs_get_ctx(ibv_ctx),
++ "failed to mmap direct wqe page, QPN = %u.\n",
++ qp->verbs_qp.qp.qp_num);
+ return -EINVAL;
++ }
+
+ return 0;
+ }
+--
+2.33.0
+
diff --git a/0102-libhns-Sync-lock-free-codes-from-mainline-driver.patch b/0102-libhns-Sync-lock-free-codes-from-mainline-driver.patch
new file mode 100644
index 0000000..62509e3
--- /dev/null
+++ b/0102-libhns-Sync-lock-free-codes-from-mainline-driver.patch
@@ -0,0 +1,519 @@
+From 8cd132d5f4aa489b9eeaa3f43865c41e4ac28101 Mon Sep 17 00:00:00 2001
+From: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Date: Wed, 19 Mar 2025 18:13:52 +0800
+Subject: [PATCH 102/105] libhns: Sync lock-free codes from mainline driver
+
+Sync lock-free codes from the mainline driver. There is only one
+functional change, which adds a pad refcnt when creating qp/cq/srq;
+the other changes are mostly code cleanup.
+
+The mainline PRs were:
+https://github.com/linux-rdma/rdma-core/pull/1482
+https://github.com/linux-rdma/rdma-core/pull/1599/commits/f877d6e610e438515e1535c9ec7a3a3ef37c58e0
+https://github.com/linux-rdma/rdma-core/pull/1599/commits/234d135276ea8ef83633113e224e0cd735ebeca8
+
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+---
+ providers/hns/hns_roce_u.h | 1 +
+ providers/hns/hns_roce_u_hw_v2.c | 18 +++-
+ providers/hns/hns_roce_u_hw_v2.h | 4 +-
+ providers/hns/hns_roce_u_verbs.c | 163 ++++++++++++++-----------------
+ 4 files changed, 88 insertions(+), 98 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u.h b/providers/hns/hns_roce_u.h
+index 863d4b5..7f5872c 100644
+--- a/providers/hns/hns_roce_u.h
++++ b/providers/hns/hns_roce_u.h
+@@ -318,6 +318,7 @@ struct hns_roce_cq {
+ unsigned long flags;
+ unsigned int cqe_size;
+ struct hns_roce_v2_cqe *cqe;
++ struct ibv_pd *parent_domain;
+ struct list_head list_sq;
+ struct list_head list_rq;
+ struct list_head list_srq;
+diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
+index 56a42e7..acb373c 100644
+--- a/providers/hns/hns_roce_u_hw_v2.c
++++ b/providers/hns/hns_roce_u_hw_v2.c
+@@ -1976,8 +1976,11 @@ static int hns_roce_u_v2_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr,
+ return ret;
+ }
+
+-void hns_roce_lock_cqs(struct hns_roce_cq *send_cq, struct hns_roce_cq *recv_cq)
++void hns_roce_lock_cqs(struct ibv_qp *qp)
+ {
++ struct hns_roce_cq *send_cq = to_hr_cq(qp->send_cq);
++ struct hns_roce_cq *recv_cq = to_hr_cq(qp->recv_cq);
++
+ if (send_cq && recv_cq) {
+ if (send_cq == recv_cq) {
+ hns_roce_spin_lock(&send_cq->hr_lock);
+@@ -1995,8 +1998,11 @@ void hns_roce_lock_cqs(struct hns_roce_cq *send_cq, struct hns_roce_cq *recv_cq)
+ }
+ }
+
+-void hns_roce_unlock_cqs(struct hns_roce_cq *send_cq, struct hns_roce_cq *recv_cq)
++void hns_roce_unlock_cqs(struct ibv_qp *qp)
+ {
++ struct hns_roce_cq *send_cq = to_hr_cq(qp->send_cq);
++ struct hns_roce_cq *recv_cq = to_hr_cq(qp->recv_cq);
++
+ if (send_cq && recv_cq) {
+ if (send_cq == recv_cq) {
+ hns_roce_spin_unlock(&send_cq->hr_lock);
+@@ -2017,6 +2023,7 @@ void hns_roce_unlock_cqs(struct hns_roce_cq *send_cq, struct hns_roce_cq *recv_c
+ static int hns_roce_u_v2_destroy_qp(struct ibv_qp *ibqp)
+ {
+ struct hns_roce_context *ctx = to_hr_ctx(ibqp->context);
++ struct hns_roce_pad *pad = to_hr_pad(ibqp->pd);
+ struct hns_roce_qp *qp = to_hr_qp(ibqp);
+ int ret;
+
+@@ -2029,7 +2036,7 @@ static int hns_roce_u_v2_destroy_qp(struct ibv_qp *ibqp)
+
+ hns_roce_v2_clear_qp(ctx, qp);
+
+- hns_roce_lock_cqs(to_hr_cq(ibqp->send_cq), to_hr_cq(ibqp->recv_cq));
++ hns_roce_lock_cqs(ibqp);
+
+ if (ibqp->recv_cq) {
+ __hns_roce_v2_cq_clean(to_hr_cq(ibqp->recv_cq), ibqp->qp_num,
+@@ -2045,11 +2052,14 @@ static int hns_roce_u_v2_destroy_qp(struct ibv_qp *ibqp)
+ list_del(&qp->scq_node);
+ }
+
+- hns_roce_unlock_cqs(to_hr_cq(ibqp->send_cq), to_hr_cq(ibqp->recv_cq));
++ hns_roce_unlock_cqs(ibqp);
+
+ hns_roce_free_qp_buf(qp, ctx);
+ hns_roce_qp_spinlock_destroy(qp);
+
++ if (pad)
++ atomic_fetch_sub(&pad->pd.refcount, 1);
++
+ free(qp);
+
+ if (ctx->dca_ctx.mem_cnt > 0)
+diff --git a/providers/hns/hns_roce_u_hw_v2.h b/providers/hns/hns_roce_u_hw_v2.h
+index fa83bbe..01d16ac 100644
+--- a/providers/hns/hns_roce_u_hw_v2.h
++++ b/providers/hns/hns_roce_u_hw_v2.h
+@@ -347,7 +347,7 @@ void hns_roce_v2_clear_qp(struct hns_roce_context *ctx, struct hns_roce_qp *qp);
+ void hns_roce_attach_cq_ex_ops(struct ibv_cq_ex *cq_ex, uint64_t wc_flags);
+ int hns_roce_attach_qp_ex_ops(struct ibv_qp_init_attr_ex *attr,
+ struct hns_roce_qp *qp);
+-void hns_roce_lock_cqs(struct hns_roce_cq *send_cq, struct hns_roce_cq *recv_cq);
+-void hns_roce_unlock_cqs(struct hns_roce_cq *send_cq, struct hns_roce_cq *recv_cq);
++void hns_roce_lock_cqs(struct ibv_qp *qp);
++void hns_roce_unlock_cqs(struct ibv_qp *qp);
+
+ #endif /* _HNS_ROCE_U_HW_V2_H */
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index 271525a..0708b95 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -44,16 +44,11 @@
+ #include "hns_roce_u_db.h"
+ #include "hns_roce_u_hw_v2.h"
+
+-static int hns_roce_whether_need_lock(struct ibv_pd *pd)
++static bool hns_roce_whether_need_lock(struct ibv_pd *pd)
+ {
+- struct hns_roce_pad *pad;
+- bool need_lock = true;
+-
+- pad = to_hr_pad(pd);
+- if (pad && pad->td)
+- need_lock = false;
++ struct hns_roce_pad *pad = to_hr_pad(pd);
+
+- return need_lock;
++ return !(pad && pad->td);
+ }
+
+ static int hns_roce_spinlock_init(struct hns_roce_spinlock *hr_lock,
+@@ -165,7 +160,7 @@ struct ibv_td *hns_roce_u_alloc_td(struct ibv_context *context,
+ struct hns_roce_td *td;
+
+ if (attr->comp_mask) {
+- errno = EINVAL;
++ errno = EOPNOTSUPP;
+ return NULL;
+ }
+
+@@ -184,19 +179,14 @@ struct ibv_td *hns_roce_u_alloc_td(struct ibv_context *context,
+ int hns_roce_u_dealloc_td(struct ibv_td *ibv_td)
+ {
+ struct hns_roce_td *td;
+- int ret = 0;
+
+ td = to_hr_td(ibv_td);
+- if (atomic_load(&td->refcount) > 1) {
+- ret = -EBUSY;
+- goto err;
+- }
++ if (atomic_load(&td->refcount) > 1)
++ return EBUSY;
+
+ free(td);
+
+-err:
+- errno = abs(ret);
+- return ret;
++ return 0;
+ }
+
+ struct ibv_pd *hns_roce_u_alloc_pd(struct ibv_context *context)
+@@ -204,7 +194,6 @@ struct ibv_pd *hns_roce_u_alloc_pd(struct ibv_context *context)
+ struct hns_roce_alloc_pd_resp resp = {};
+ struct ibv_alloc_pd cmd;
+ struct hns_roce_pd *pd;
+- int ret;
+
+ pd = calloc(1, sizeof(*pd));
+ if (!pd) {
+@@ -212,10 +201,9 @@ struct ibv_pd *hns_roce_u_alloc_pd(struct ibv_context *context)
+ return NULL;
+ }
+
+- ret = ibv_cmd_alloc_pd(context, &pd->ibv_pd, &cmd, sizeof(cmd),
+- &resp.ibv_resp, sizeof(resp));
+-
+- if (ret)
++ errno = ibv_cmd_alloc_pd(context, &pd->ibv_pd, &cmd, sizeof(cmd),
++ &resp.ibv_resp, sizeof(resp));
++ if (errno)
+ goto err;
+
+ atomic_init(&pd->refcount, 1);
+@@ -225,7 +213,6 @@ struct ibv_pd *hns_roce_u_alloc_pd(struct ibv_context *context)
+
+ err:
+ free(pd);
+- errno = abs(ret);
+ return NULL;
+ }
+
+@@ -256,41 +243,40 @@ struct ibv_pd *hns_roce_u_alloc_pad(struct ibv_context *context,
+ pad->pd.protection_domain = to_hr_pd(attr->pd);
+ atomic_fetch_add(&pad->pd.protection_domain->refcount, 1);
+
++ atomic_init(&pad->pd.refcount, 1);
+ ibv_initialize_parent_domain(&pad->pd.ibv_pd,
+ &pad->pd.protection_domain->ibv_pd);
+
+ return &pad->pd.ibv_pd;
+ }
+
+-static void hns_roce_free_pad(struct hns_roce_pad *pad)
++static int hns_roce_free_pad(struct hns_roce_pad *pad)
+ {
++ if (atomic_load(&pad->pd.refcount) > 1)
++ return EBUSY;
++
+ atomic_fetch_sub(&pad->pd.protection_domain->refcount, 1);
+
+ if (pad->td)
+ atomic_fetch_sub(&pad->td->refcount, 1);
+
+ free(pad);
++ return 0;
+ }
+
+ static int hns_roce_free_pd(struct hns_roce_pd *pd)
+ {
+ int ret;
+
+- if (atomic_load(&pd->refcount) > 1) {
+- ret = -EBUSY;
+- goto err;
+- }
++ if (atomic_load(&pd->refcount) > 1)
++ return EBUSY;
+
+ ret = ibv_cmd_dealloc_pd(&pd->ibv_pd);
+ if (ret)
+- goto err;
++ return ret;
+
+ free(pd);
+-
+-err:
+- errno = abs(ret);
+-
+- return ret;
++ return 0;
+ }
+
+ int hns_roce_u_dealloc_pd(struct ibv_pd *ibv_pd)
+@@ -298,10 +284,8 @@ int hns_roce_u_dealloc_pd(struct ibv_pd *ibv_pd)
+ struct hns_roce_pad *pad = to_hr_pad(ibv_pd);
+ struct hns_roce_pd *pd = to_hr_pd(ibv_pd);
+
+- if (pad) {
+- hns_roce_free_pad(pad);
+- return 0;
+- }
++ if (pad)
++ return hns_roce_free_pad(pad);
+
+ return hns_roce_free_pd(pd);
+ }
+@@ -524,6 +508,8 @@ static int verify_cq_create_attr(struct ibv_cq_init_attr_ex *attr,
+ struct hns_roce_context *context,
+ struct hnsdv_cq_init_attr *hns_cq_attr)
+ {
++ struct hns_roce_pad *pad = to_hr_pad(attr->parent_domain);
++
+ if (!attr->cqe || attr->cqe > context->max_cqe) {
+ verbs_err(&context->ibv_ctx, "unsupported cq depth %u.\n",
+ attr->cqe);
+@@ -542,11 +528,9 @@ static int verify_cq_create_attr(struct ibv_cq_init_attr_ex *attr,
+ return EOPNOTSUPP;
+ }
+
+- if (attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_PD) {
+- if (!to_hr_pad(attr->parent_domain)) {
+- verbs_err(&context->ibv_ctx, "failed to check the pad of cq.\n");
+- return EINVAL;
+- }
++ if (attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_PD && !pad) {
++ verbs_err(&context->ibv_ctx, "failed to check the pad of cq.\n");
++ return EINVAL;
+ }
+
+ attr->cqe = max_t(uint32_t, HNS_ROCE_MIN_CQE_NUM,
+@@ -668,19 +652,10 @@ static void hns_roce_uninit_cq_swc(struct hns_roce_cq *cq)
+ }
+ }
+
+-static int hns_roce_cq_spinlock_init(struct ibv_context *context,
+- struct hns_roce_cq *cq,
++static int hns_roce_cq_spinlock_init(struct hns_roce_cq *cq,
+ struct ibv_cq_init_attr_ex *attr)
+ {
+- struct hns_roce_pad *pad = NULL;
+- int need_lock;
+-
+- if (attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_PD)
+- pad = to_hr_pad(attr->parent_domain);
+-
+- need_lock = hns_roce_whether_need_lock(pad ? &pad->pd.ibv_pd : NULL);
+- if (!need_lock)
+- verbs_info(verbs_get_ctx(context), "configure cq as no lock.\n");
++ bool need_lock = hns_roce_whether_need_lock(attr->parent_domain);
+
+ return hns_roce_spinlock_init(&cq->hr_lock, need_lock);
+ }
+@@ -689,6 +664,7 @@ static struct ibv_cq_ex *create_cq(struct ibv_context *context,
+ struct ibv_cq_init_attr_ex *attr,
+ struct hnsdv_cq_init_attr *hns_cq_attr)
+ {
++ struct hns_roce_pad *pad = to_hr_pad(attr->parent_domain);
+ struct hns_roce_context *hr_ctx = to_hr_ctx(context);
+ struct hns_roce_cq *cq;
+ int ret;
+@@ -703,7 +679,12 @@ static struct ibv_cq_ex *create_cq(struct ibv_context *context,
+ goto err;
+ }
+
+- ret = hns_roce_cq_spinlock_init(context, cq, attr);
++ if (attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_PD) {
++ cq->parent_domain = attr->parent_domain;
++ atomic_fetch_add(&pad->pd.refcount, 1);
++ }
++
++ ret = hns_roce_cq_spinlock_init(cq, attr);
+ if (ret)
+ goto err_lock;
+
+@@ -741,6 +722,8 @@ err_db:
+ err_buf:
+ hns_roce_spinlock_destroy(&cq->hr_lock);
+ err_lock:
++ if (attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_PD)
++ atomic_fetch_sub(&pad->pd.refcount, 1);
+ free(cq);
+ err:
+ errno = abs(ret);
+@@ -813,6 +796,7 @@ int hns_roce_u_modify_cq(struct ibv_cq *cq, struct ibv_modify_cq_attr *attr)
+ int hns_roce_u_destroy_cq(struct ibv_cq *cq)
+ {
+ struct hns_roce_cq *hr_cq = to_hr_cq(cq);
++ struct hns_roce_pad *pad = to_hr_pad(hr_cq->parent_domain);
+ int ret;
+
+ ret = ibv_cmd_destroy_cq(cq);
+@@ -827,6 +811,9 @@ int hns_roce_u_destroy_cq(struct ibv_cq *cq)
+
+ hns_roce_spinlock_destroy(&hr_cq->hr_lock);
+
++ if (pad)
++ atomic_fetch_sub(&pad->pd.refcount, 1);
++
+ free(hr_cq);
+
+ return ret;
+@@ -1060,15 +1047,10 @@ static void init_srq_cq_list(struct hns_roce_srq *srq,
+ hns_roce_spin_unlock(&srq_cq->hr_lock);
+ }
+
+-static int hns_roce_srq_spinlock_init(struct ibv_context *context,
+- struct hns_roce_srq *srq,
++static int hns_roce_srq_spinlock_init(struct hns_roce_srq *srq,
+ struct ibv_srq_init_attr_ex *attr)
+ {
+- int need_lock;
+-
+- need_lock = hns_roce_whether_need_lock(attr->pd);
+- if (!need_lock)
+- verbs_info(verbs_get_ctx(context), "configure srq as no lock.\n");
++ bool need_lock = hns_roce_whether_need_lock(attr->pd);
+
+ return hns_roce_spinlock_init(&srq->hr_lock, need_lock);
+ }
+@@ -1077,6 +1059,7 @@ static struct ibv_srq *create_srq(struct ibv_context *context,
+ struct ibv_srq_init_attr_ex *init_attr)
+ {
+ struct hns_roce_context *hr_ctx = to_hr_ctx(context);
++ struct hns_roce_pad *pad = to_hr_pad(init_attr->pd);
+ struct hns_roce_srq *srq;
+ int ret;
+
+@@ -1089,8 +1072,10 @@ static struct ibv_srq *create_srq(struct ibv_context *context,
+ ret = -ENOMEM;
+ goto err;
+ }
++ if (pad)
++ atomic_fetch_add(&pad->pd.refcount, 1);
+
+- ret = hns_roce_srq_spinlock_init(context, srq, init_attr);
++ ret = hns_roce_srq_spinlock_init(srq, init_attr);
+ if (ret)
+ goto err_free_srq;
+
+@@ -1134,6 +1119,8 @@ err_destroy_lock:
+ hns_roce_spinlock_destroy(&srq->hr_lock);
+
+ err_free_srq:
++ if (pad)
++ atomic_fetch_sub(&pad->pd.refcount, 1);
+ free(srq);
+
+ err:
+@@ -1209,6 +1196,7 @@ static void del_srq_from_cq_list(struct hns_roce_srq *srq)
+ int hns_roce_u_destroy_srq(struct ibv_srq *ibv_srq)
+ {
+ struct hns_roce_context *ctx = to_hr_ctx(ibv_srq->context);
++ struct hns_roce_pad *pad = to_hr_pad(ibv_srq->pd);
+ struct hns_roce_srq *srq = to_hr_srq(ibv_srq);
+ int ret;
+
+@@ -1224,6 +1212,10 @@ int hns_roce_u_destroy_srq(struct ibv_srq *ibv_srq)
+ free_srq_buf(srq);
+
+ hns_roce_spinlock_destroy(&srq->hr_lock);
++
++ if (pad)
++ atomic_fetch_sub(&pad->pd.refcount, 1);
++
+ free(srq);
+
+ return 0;
+@@ -1478,38 +1470,19 @@ static int verify_qp_create_attr(struct hns_roce_context *ctx,
+ return verify_qp_create_cap(ctx, attr);
+ }
+
+-static int hns_roce_qp_spinlock_init(struct hns_roce_context *ctx,
+- struct ibv_qp_init_attr_ex *attr,
++static int hns_roce_qp_spinlock_init(struct ibv_qp_init_attr_ex *attr,
+ struct hns_roce_qp *qp)
+ {
+- int sq_need_lock;
+- int rq_need_lock;
++ bool need_lock = hns_roce_whether_need_lock(attr->pd);
+ int ret;
+
+- sq_need_lock = hns_roce_whether_need_lock(attr->pd);
+- if (!sq_need_lock)
+- verbs_warn(&ctx->ibv_ctx, "configure sq as no lock.\n");
+-
+- rq_need_lock = hns_roce_whether_need_lock(attr->pd);
+- if (!rq_need_lock)
+- verbs_warn(&ctx->ibv_ctx, "configure rq as no lock.\n");
+-
+- ret = hns_roce_spinlock_init(&qp->sq.hr_lock, sq_need_lock);
+- if (ret) {
+- verbs_err(&ctx->ibv_ctx, "failed to init sq spinlock.\n");
++ ret = hns_roce_spinlock_init(&qp->sq.hr_lock, need_lock);
++ if (ret)
+ return ret;
+- }
+-
+- ret = hns_roce_spinlock_init(&qp->rq.hr_lock, rq_need_lock);
+- if (ret) {
+- verbs_err(&ctx->ibv_ctx, "failed to init rq spinlock.\n");
+- goto err_rq_lock;
+- }
+-
+- return 0;
+
+-err_rq_lock:
+- hns_roce_spinlock_destroy(&qp->sq.hr_lock);
++ ret = hns_roce_spinlock_init(&qp->rq.hr_lock, need_lock);
++ if (ret)
++ hns_roce_spinlock_destroy(&qp->sq.hr_lock);
+
+ return ret;
+ }
+@@ -2044,7 +2017,7 @@ static void add_qp_to_cq_list(struct ibv_qp_init_attr_ex *attr,
+ list_node_init(&qp->rcq_node);
+ list_node_init(&qp->srcq_node);
+
+- hns_roce_lock_cqs(send_cq, recv_cq);
++ hns_roce_lock_cqs(&qp->verbs_qp.qp);
+ if (send_cq)
+ list_add_tail(&send_cq->list_sq, &qp->scq_node);
+ if (recv_cq) {
+@@ -2053,7 +2026,7 @@ static void add_qp_to_cq_list(struct ibv_qp_init_attr_ex *attr,
+ else
+ list_add_tail(&recv_cq->list_rq, &qp->rcq_node);
+ }
+- hns_roce_unlock_cqs(send_cq, recv_cq);
++ hns_roce_unlock_cqs(&qp->verbs_qp.qp);
+ }
+
+ static struct ibv_qp *create_qp(struct ibv_context *ibv_ctx,
+@@ -2061,6 +2034,7 @@ static struct ibv_qp *create_qp(struct ibv_context *ibv_ctx,
+ struct hnsdv_qp_init_attr *hns_attr)
+ {
+ struct hns_roce_context *context = to_hr_ctx(ibv_ctx);
++ struct hns_roce_pad *pad = to_hr_pad(attr->pd);
+ struct hns_roce_cmd_flag cmd_flag = {};
+ struct hns_roce_qp *qp;
+ uint64_t dwqe_mmap_key;
+@@ -2078,7 +2052,10 @@ static struct ibv_qp *create_qp(struct ibv_context *ibv_ctx,
+
+ hns_roce_set_qp_params(attr, qp, context);
+
+- ret = hns_roce_qp_spinlock_init(context, attr, qp);
++ if (pad)
++ atomic_fetch_add(&pad->pd.refcount, 1);
++
++ ret = hns_roce_qp_spinlock_init(attr, qp);
+ if (ret)
+ goto err_spinlock;
+
+@@ -2121,6 +2098,8 @@ err_cmd:
+ err_buf:
+ hns_roce_qp_spinlock_destroy(qp);
+ err_spinlock:
++ if (pad)
++ atomic_fetch_sub(&pad->pd.refcount, 1);
+ free(qp);
+ err:
+ if (ret < 0)
+--
+2.33.0
+
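The parent-domain (PAD) reference counting added in this patch follows a common lifecycle pattern: each QP/CQ/SRQ created on a PAD takes a reference, each destruction drops it, and deallocating the PAD is refused with EBUSY while references remain. A minimal standalone sketch of that pattern, assuming C11 atomics (the `pad_*` helper names are illustrative, not the libhns ones):

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Illustrative stand-in for hns_roce_pad: one atomic refcount. */
struct pad {
	atomic_uint refcount;
};

struct pad *pad_alloc(void)
{
	struct pad *pad = calloc(1, sizeof(*pad));

	if (pad)
		atomic_init(&pad->refcount, 1); /* self-reference */
	return pad;
}

/* Called when a QP/CQ/SRQ is created on this parent domain. */
void pad_get(struct pad *pad)
{
	atomic_fetch_add(&pad->refcount, 1);
}

/* Called when the QP/CQ/SRQ is destroyed. */
void pad_put(struct pad *pad)
{
	atomic_fetch_sub(&pad->refcount, 1);
}

/* Mirrors the EBUSY check: refuse while objects still hold refs. */
int pad_free(struct pad *pad)
{
	if (atomic_load(&pad->refcount) > 1)
		return EBUSY;
	free(pad);
	return 0;
}
```

This is why the error paths in `create_cq()`/`create_srq()`/`create_qp()` above must drop the reference they took: otherwise a failed create would leave the PAD permanently busy.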
diff --git a/0103-verbs-Assign-ibv-srq-pd-when-creating-SRQ.patch b/0103-verbs-Assign-ibv-srq-pd-when-creating-SRQ.patch
new file mode 100644
index 0000000..c4d3f96
--- /dev/null
+++ b/0103-verbs-Assign-ibv-srq-pd-when-creating-SRQ.patch
@@ -0,0 +1,31 @@
+From 93ddf71e89c8a8a4c0e2d7bf2d1f1d2c1bc3d903 Mon Sep 17 00:00:00 2001
+From: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Date: Wed, 23 Apr 2025 16:55:17 +0800
+Subject: [PATCH 103/105] verbs: Assign ibv srq->pd when creating SRQ
+
+Some providers need to access ibv_srq->pd during SRQ destruction, but
+currently it may not be assigned when the SRQ is created with
+ibv_create_srq_ex(). This can lead to SRQ-related resource leaks.
+Assign ibv_srq->pd when creating the SRQ to ensure the pd can be
+obtained correctly.
+
+Fixes: 40c1365b2198 ("Add support for XRC SRQs")
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+---
+ libibverbs/cmd_srq.c | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/libibverbs/cmd_srq.c b/libibverbs/cmd_srq.c
+index dfaaa6a..259ea0d 100644
+--- a/libibverbs/cmd_srq.c
++++ b/libibverbs/cmd_srq.c
+@@ -63,6 +63,7 @@ static int ibv_icmd_create_srq(struct ibv_pd *pd, struct verbs_srq *vsrq,
+ struct verbs_xrcd *vxrcd = NULL;
+ enum ibv_srq_type srq_type;
+
++ srq->pd = pd;
+ srq->context = pd->context;
+ pthread_mutex_init(&srq->mutex, NULL);
+ pthread_cond_init(&srq->cond, NULL);
+--
+2.33.0
+
diff --git a/0104-libhns-Clean-up-data-type-issues.patch b/0104-libhns-Clean-up-data-type-issues.patch
new file mode 100644
index 0000000..1f69533
--- /dev/null
+++ b/0104-libhns-Clean-up-data-type-issues.patch
@@ -0,0 +1,113 @@
+From a20bcd29a5c2194f947f1ce24970b4be9d1cf32a Mon Sep 17 00:00:00 2001
+From: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Date: Thu, 13 Mar 2025 17:26:50 +0800
+Subject: [PATCH 104/105] libhns: Clean up data type issues
+
+Clean up mixed signed/unsigned type issues. Fix an incorrect format
+specifier as well.
+
+Fixes: cf6d9149f8f5 ("libhns: Introduce hns direct verbs")
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+---
+ providers/hns/hns_roce_u.h | 2 +-
+ providers/hns/hns_roce_u_hw_v2.c | 13 +++++++------
+ providers/hns/hns_roce_u_verbs.c | 4 ++--
+ 3 files changed, 10 insertions(+), 9 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u.h b/providers/hns/hns_roce_u.h
+index 7f5872c..3e9b487 100644
+--- a/providers/hns/hns_roce_u.h
++++ b/providers/hns/hns_roce_u.h
+@@ -367,7 +367,7 @@ struct hns_roce_wq {
+ unsigned long *wrid;
+ struct hns_roce_spinlock hr_lock;
+ unsigned int wqe_cnt;
+- int max_post;
++ unsigned int max_post;
+ unsigned int head;
+ unsigned int tail;
+ unsigned int max_gs;
+diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
+index acb373c..70e5b1f 100644
+--- a/providers/hns/hns_roce_u_hw_v2.c
++++ b/providers/hns/hns_roce_u_hw_v2.c
+@@ -173,7 +173,7 @@ static enum ibv_wc_status get_wc_status(uint8_t status)
+ { HNS_ROCE_V2_CQE_XRC_VIOLATION_ERR, IBV_WC_REM_INV_RD_REQ_ERR },
+ };
+
+- for (int i = 0; i < ARRAY_SIZE(map); i++) {
++ for (unsigned int i = 0; i < ARRAY_SIZE(map); i++) {
+ if (status == map[i].cqe_status)
+ return map[i].wc_status;
+ }
+@@ -1216,7 +1216,7 @@ static int fill_ext_sge_inl_data(struct hns_roce_qp *qp,
+ unsigned int sge_mask = qp->ex_sge.sge_cnt - 1;
+ void *dst_addr, *src_addr, *tail_bound_addr;
+ uint32_t src_len, tail_len;
+- int i;
++ uint32_t i;
+
+ if (sge_info->total_len > qp->sq.ext_sge_cnt * HNS_ROCE_SGE_SIZE)
+ return EINVAL;
+@@ -1286,7 +1286,7 @@ static void fill_ud_inn_inl_data(const struct ibv_send_wr *wr,
+
+ static bool check_inl_data_len(struct hns_roce_qp *qp, unsigned int len)
+ {
+- int mtu = mtu_enum_to_int(qp->path_mtu);
++ unsigned int mtu = mtu_enum_to_int(qp->path_mtu);
+
+ return (len <= qp->max_inline_data && len <= mtu);
+ }
+@@ -1727,7 +1727,8 @@ static void fill_recv_sge_to_wqe(struct ibv_recv_wr *wr, void *wqe,
+ unsigned int max_sge, bool rsv)
+ {
+ struct hns_roce_v2_wqe_data_seg *dseg = wqe;
+- unsigned int i, cnt;
++ unsigned int cnt;
++ int i;
+
+ for (i = 0, cnt = 0; i < wr->num_sge; i++) {
+ /* Skip zero-length sge */
+@@ -2090,7 +2091,7 @@ static int check_post_srq_valid(struct hns_roce_srq *srq,
+ static int get_wqe_idx(struct hns_roce_srq *srq, unsigned int *wqe_idx)
+ {
+ struct hns_roce_idx_que *idx_que = &srq->idx_que;
+- int bit_num;
++ unsigned int bit_num;
+ int i;
+
+ /* bitmap[i] is set zero if all bits are allocated */
+@@ -2499,7 +2500,7 @@ static void set_sgl_rc(struct hns_roce_v2_wqe_data_seg *dseg,
+ unsigned int mask = qp->ex_sge.sge_cnt - 1;
+ unsigned int msg_len = 0;
+ unsigned int cnt = 0;
+- int i;
++ unsigned int i;
+
+ for (i = 0; i < num_sge; i++) {
+ if (!sge[i].length)
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index 0708b95..1ea7501 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -510,7 +510,7 @@ static int verify_cq_create_attr(struct ibv_cq_init_attr_ex *attr,
+ {
+ struct hns_roce_pad *pad = to_hr_pad(attr->parent_domain);
+
+- if (!attr->cqe || attr->cqe > context->max_cqe) {
++ if (!attr->cqe || attr->cqe > (uint32_t)context->max_cqe) {
+ verbs_err(&context->ibv_ctx, "unsupported cq depth %u.\n",
+ attr->cqe);
+ return EINVAL;
+@@ -1497,7 +1497,7 @@ static int alloc_recv_rinl_buf(uint32_t max_sge,
+ struct hns_roce_rinl_buf *rinl_buf)
+ {
+ unsigned int cnt;
+- int i;
++ unsigned int i;
+
+ cnt = rinl_buf->wqe_cnt;
+ rinl_buf->wqe_list = calloc(cnt,
+--
+2.33.0
+
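The signed/unsigned cleanups above guard against C's usual arithmetic conversions: in a comparison between `int` and `uint32_t`, the `int` operand is converted to unsigned, so a negative value silently compares as a huge one. A hedged standalone illustration of the `max_cqe` check (function names are made up for this sketch; in the real code `max_cqe` is expected to be non-negative, so the patch's explicit cast mainly documents intent):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Mixed comparison: max_cqe is implicitly converted to uint32_t, so a
 * negative limit (-1) becomes 0xffffffff and the bound check passes
 * for any cqe value — the depth is wrongly accepted.
 */
static bool depth_invalid_mixed(uint32_t cqe, int max_cqe)
{
	return !cqe || cqe > max_cqe; /* -Wsign-compare territory */
}

/*
 * Safe form: reject a negative limit explicitly before casting, so the
 * conversion can never change the meaning of the comparison.
 */
static bool depth_invalid_checked(uint32_t cqe, int max_cqe)
{
	return !cqe || max_cqe < 0 || cqe > (uint32_t)max_cqe;
}
```

With `max_cqe = -1`, the mixed version accepts a depth of 5 (5 is not greater than 0xffffffff), while the checked version rejects it.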
diff --git a/0105-libhns-Add-debug-log-for-lock-free-mode.patch b/0105-libhns-Add-debug-log-for-lock-free-mode.patch
new file mode 100644
index 0000000..28b0762
--- /dev/null
+++ b/0105-libhns-Add-debug-log-for-lock-free-mode.patch
@@ -0,0 +1,46 @@
+From 8954a581ff8b82d6cb3cca93f8558c86091ea155 Mon Sep 17 00:00:00 2001
+From: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Date: Thu, 24 Apr 2025 20:32:12 +0800
+Subject: [PATCH 105/105] libhns: Add debug log for lock-free mode
+
+Currently there is no way to observe whether the lock-free mode is
+configured from the driver's perspective. Add a debug log for this.
+
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+---
+ providers/hns/hns_roce_u_verbs.c | 7 ++++++-
+ 1 file changed, 6 insertions(+), 1 deletion(-)
+
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index 1ea7501..8491431 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -219,6 +219,7 @@ err:
+ struct ibv_pd *hns_roce_u_alloc_pad(struct ibv_context *context,
+ struct ibv_parent_domain_init_attr *attr)
+ {
++ struct hns_roce_pd *protection_domain;
+ struct hns_roce_pad *pad;
+
+ if (ibv_check_alloc_parent_domain(attr))
+@@ -235,12 +236,16 @@ struct ibv_pd *hns_roce_u_alloc_pad(struct ibv_context *context,
+ return NULL;
+ }
+
++ protection_domain = to_hr_pd(attr->pd);
+ if (attr->td) {
+ pad->td = to_hr_td(attr->td);
+ atomic_fetch_add(&pad->td->refcount, 1);
++ verbs_debug(verbs_get_ctx(context),
++ "set PAD(0x%x) to lock-free mode.\n",
++ protection_domain->pdn);
+ }
+
+- pad->pd.protection_domain = to_hr_pd(attr->pd);
++ pad->pd.protection_domain = protection_domain;
+ atomic_fetch_add(&pad->pd.protection_domain->refcount, 1);
+
+ atomic_init(&pad->pd.refcount, 1);
+--
+2.33.0
+
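The lock-free mode these patches log and refactor boils down to one conditional: `hns_roce_whether_need_lock()` returns false only when the object was created on a parent domain that carries a thread domain (the application thereby promises single-threaded access), and the spinlock wrappers then skip locking entirely. A self-contained sketch of that pattern, using a C11 `atomic_flag` spinlock instead of the driver's pthread-based one (names are illustrative):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Sketch of libhns' conditional spinlock: lock calls become no-ops
 * when the user promised single-threaded access via a thread domain. */
struct cond_spinlock {
	atomic_flag flag;
	bool need_lock;
};

static void cond_spinlock_init(struct cond_spinlock *l, bool need_lock)
{
	l->need_lock = need_lock;
	atomic_flag_clear(&l->flag);
}

static void cond_spin_lock(struct cond_spinlock *l)
{
	if (!l->need_lock)
		return; /* lock-free mode: caller guarantees one thread */
	while (atomic_flag_test_and_set_explicit(&l->flag,
						 memory_order_acquire))
		; /* spin */
}

static void cond_spin_unlock(struct cond_spinlock *l)
{
	if (l->need_lock)
		atomic_flag_clear_explicit(&l->flag, memory_order_release);
}

/* Mirrors hns_roce_whether_need_lock(): lock-free only with pad + td. */
static bool whether_need_lock(bool has_pad, bool has_td)
{
	return !(has_pad && has_td);
}
```

On the hot post-send/poll-cq path this removes the atomic read-modify-write entirely, which is the point of the feature; the debug log added by patch 0105 is the only way to confirm the no-lock branch was actually taken.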
diff --git a/rdma-core.spec b/rdma-core.spec
index 131cbd6..098ddcc 100644
--- a/rdma-core.spec
+++ b/rdma-core.spec
@@ -1,6 +1,6 @@
Name: rdma-core
Version: 41.0
-Release: 35
+Release: 36
Summary: RDMA core userspace libraries and daemons
License: GPLv2 or BSD
Url: https://github.com/linux-rdma/rdma-core
@@ -104,6 +104,13 @@ patch95: 0095-libhns-Adapt-UD-inline-data-size-for-UCX.patch
patch96: 0096-libhns-Fix-wrong-order-of-spin_unlock-in-modify_qp.patch
patch97: 0097-libxscale-Match-dev-by-vid-and-did.patch
patch98: 0098-libxscale-update-to-version-2412GA.patch
+patch99: 0099-libhns-Fix-double-free-of-rinl-buf-wqe-list.patch
+patch100: 0100-libhns-Fix-ret-not-assigned-in-create-srq.patch
+patch101: 0101-libhns-Add-error-logs-to-help-diagnosis.patch
+patch102: 0102-libhns-Sync-lock-free-codes-from-mainline-driver.patch
+patch103: 0103-verbs-Assign-ibv-srq-pd-when-creating-SRQ.patch
+patch104: 0104-libhns-Clean-up-data-type-issues.patch
+patch105: 0105-libhns-Add-debug-log-for-lock-free-mode.patch
BuildRequires: binutils cmake >= 2.8.11 gcc libudev-devel pkgconfig pkgconfig(libnl-3.0)
BuildRequires: pkgconfig(libnl-route-3.0) valgrind-devel systemd systemd-devel
@@ -354,6 +361,12 @@ fi
%{_mandir}/*
%changelog
* Fri Jul 4 2025 Guofeng Yue <yueguofeng(a)h-partners.com> - 41.0-36
+- Type: bugfix
+- ID: NA
+- SUG: NA
+- DESC: sync some patches for libhns
+
* Wed May 14 2025 Xin Tian <tianx(a)yunsilicon.com> - 41.0-35
- Type: feature
- ID: NA
--
2.33.0
From: wenglianfa <wenglianfa(a)huawei.com>
Currently static compilation with lttng tracing enabled fails with
the following errors:
In file included from /home/rdma-core/providers/rxe/rxe_trace.c:9:
/rdma-core/providers/rxe/rxe_trace.h:12:38: fatal error: rxe_trace.h: No such file or directory
12 | #define LTTNG_UST_TRACEPOINT_INCLUDE "rxe_trace.h"
| ^~~~~~~~~~~~~
compilation terminated.
make[2]: *** [providers/rxe/CMakeFiles/rxe.dir/build.make:76: providers/rxe/CMakeFiles/rxe.dir/rxe_trace.c.o] Error 1
make[2]: *** Waiting for unfinished jobs....
In file included from /home/rdma-core/providers/efa/efa_trace.c:9:
/home/rdma-core/providers/efa/efa_trace.h:12:38: fatal error: efa_trace.h: No such file or directory
12 | #define LTTNG_UST_TRACEPOINT_INCLUDE "efa_trace.h"
| ^~~~~~~~~~~~~
compilation terminated.
make[2]: *** [providers/efa/CMakeFiles/efa-static.dir/build.make:76: providers/efa/CMakeFiles/efa-static.dir/efa_trace.c.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:3085: providers/efa/CMakeFiles/efa-static.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
In file included from /home/rdma-core/providers/mlx5/mlx5_trace.c:9:
/home/rdma-core/providers/mlx5/mlx5_trace.h:12:38: fatal error: mlx5_trace.h: No such file or directory
12 | #define LTTNG_UST_TRACEPOINT_INCLUDE "mlx5_trace.h"
| ^~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [providers/mlx5/CMakeFiles/mlx5-static.dir/build.make:76: providers/mlx5/CMakeFiles/mlx5-static.dir/mlx5_trace.c.o] Error 1
make[2]: *** Waiting for unfinished jobs....
In file included from /home/rdma-core/providers/hns/hns_roce_u_trace.c:9:
/home/rdma-core/providers/hns/hns_roce_u_trace.h:12:38: fatal error: hns_roce_u_trace.h: No such file or directory
12 | #define LTTNG_UST_TRACEPOINT_INCLUDE "hns_roce_u_trace.h"
| ^~~~~~~~~~~~~~~~~~~~
compilation terminated.
Fix it by linking the LTTng library and adding the drivers' directories
to the include paths for the static build targets as well.
Fixes: 382b359d990c ("efa: Add support for LTTng tracing")
Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
---
providers/efa/CMakeLists.txt | 4 ++++
providers/hns/CMakeLists.txt | 4 ++++
providers/mlx5/CMakeLists.txt | 4 ++++
providers/rxe/CMakeLists.txt | 4 ++++
4 files changed, 16 insertions(+)
diff --git a/providers/efa/CMakeLists.txt b/providers/efa/CMakeLists.txt
index e999f3b77..865317446 100644
--- a/providers/efa/CMakeLists.txt
+++ b/providers/efa/CMakeLists.txt
@@ -18,4 +18,8 @@ rdma_pkg_config("efa" "libibverbs" "${CMAKE_THREAD_LIBS_INIT}")
if (ENABLE_LTTNG AND LTTNGUST_FOUND)
target_include_directories(efa PUBLIC ".")
target_link_libraries(efa LINK_PRIVATE LTTng::UST)
+ if (ENABLE_STATIC)
+ target_include_directories(efa-static PUBLIC ".")
+ target_link_libraries(efa-static LINK_PRIVATE LTTng::UST)
+ endif()
endif()
diff --git a/providers/hns/CMakeLists.txt b/providers/hns/CMakeLists.txt
index 36ebfacfb..7277cd65f 100644
--- a/providers/hns/CMakeLists.txt
+++ b/providers/hns/CMakeLists.txt
@@ -21,4 +21,8 @@ rdma_pkg_config("hns" "libibverbs" "${CMAKE_THREAD_LIBS_INIT}")
if (ENABLE_LTTNG AND LTTNGUST_FOUND)
target_include_directories(hns PUBLIC ".")
target_link_libraries(hns LINK_PRIVATE LTTng::UST)
+ if (ENABLE_STATIC)
+ target_include_directories(hns-static PUBLIC ".")
+ target_link_libraries(hns-static LINK_PRIVATE LTTng::UST)
+ endif()
endif()
diff --git a/providers/mlx5/CMakeLists.txt b/providers/mlx5/CMakeLists.txt
index 4a438d911..92f4e1b18 100644
--- a/providers/mlx5/CMakeLists.txt
+++ b/providers/mlx5/CMakeLists.txt
@@ -57,4 +57,8 @@ rdma_pkg_config("mlx5" "libibverbs" "${CMAKE_THREAD_LIBS_INIT}")
if (ENABLE_LTTNG AND LTTNGUST_FOUND)
target_include_directories(mlx5 PUBLIC ".")
target_link_libraries(mlx5 LINK_PRIVATE LTTng::UST)
+ if (ENABLE_STATIC)
+ target_include_directories(mlx5-static PUBLIC ".")
+ target_link_libraries(mlx5-static LINK_PRIVATE LTTng::UST)
+ endif()
endif()
diff --git a/providers/rxe/CMakeLists.txt b/providers/rxe/CMakeLists.txt
index 0fdc1cb3e..8a0a16842 100644
--- a/providers/rxe/CMakeLists.txt
+++ b/providers/rxe/CMakeLists.txt
@@ -10,4 +10,8 @@ rdma_provider(rxe
if (ENABLE_LTTNG AND LTTNGUST_FOUND)
target_include_directories("rxe-rdmav${IBVERBS_PABI_VERSION}" PUBLIC ".")
target_link_libraries("rxe-rdmav${IBVERBS_PABI_VERSION}" LINK_PRIVATE LTTng::UST)
+ if (ENABLE_STATIC)
+ target_include_directories(rxe PUBLIC ".")
+ target_link_libraries(rxe LINK_PRIVATE LTTng::UST)
+ endif()
endif()
--
2.33.0
13 Jun '25
From: Guofeng Yue <yueguofeng(a)h-partners.com>
Sync some patches from mainline
Signed-off-by: Guofeng Yue <yueguofeng(a)h-partners.com>
---
...replace-rand-with-getrandom-during-M.patch | 91 +++++++++++++++
...m-buffer-initialization-optimization.patch | 82 ++++++++++++++
...st-Fix-perform-warm-up-process-stuck.patch | 64 +++++++++++
...lock-free-mode-not-working-for-SRQ-X.patch | 105 ++++++++++++++++++
0020-Perftest-Fix-recv-cq-leak.patch | 54 +++++++++
perftest.spec | 13 ++-
6 files changed, 408 insertions(+), 1 deletion(-)
create mode 100644 0016-Revert-Perftest-replace-rand-with-getrandom-during-M.patch
create mode 100644 0017-Perftest-random-buffer-initialization-optimization.patch
create mode 100644 0018-Perftest-Fix-perform-warm-up-process-stuck.patch
create mode 100644 0019-Perftest-Fix-TD-lock-free-mode-not-working-for-SRQ-X.patch
create mode 100644 0020-Perftest-Fix-recv-cq-leak.patch
diff --git a/0016-Revert-Perftest-replace-rand-with-getrandom-during-M.patch b/0016-Revert-Perftest-replace-rand-with-getrandom-during-M.patch
new file mode 100644
index 0000000..0551e6d
--- /dev/null
+++ b/0016-Revert-Perftest-replace-rand-with-getrandom-during-M.patch
@@ -0,0 +1,91 @@
+From 454a41de4caa020a900eb9511fc49069ef10c53d Mon Sep 17 00:00:00 2001
+From: Guofeng Yue <yueguofeng(a)h-partners.com>
+Date: Mon, 9 Jun 2025 14:51:20 +0800
+Subject: [PATCH 16/20] Revert "Perftest: replace rand() with getrandom()
+ during MR buffer initialization"
+
+This reverts commit 189406b72d9d94c3c95298ba65ad9ce4ae90405b.
+---
+ configure.ac | 1 -
+ src/perftest_resources.c | 31 +++++--------------------------
+ 2 files changed, 5 insertions(+), 27 deletions(-)
+
+diff --git a/configure.ac b/configure.ac
+index d976663..a756488 100755
+--- a/configure.ac
++++ b/configure.ac
+@@ -60,7 +60,6 @@ AC_PROG_LIBTOOL
+ AC_PROG_RANLIB
+ AC_HEADER_STDC
+ AC_CHECK_HEADERS([infiniband/verbs.h],,[AC_MSG_ERROR([ibverbs header files not found])])
+-AC_CHECK_HEADERS([sys/random.h],,)
+ AC_CHECK_LIB([ibverbs], [ibv_get_device_list], [], [AC_MSG_ERROR([libibverbs not found])])
+ AC_CHECK_LIB([rdmacm], [rdma_create_event_channel], [], AC_MSG_ERROR([librdmacm-devel not found]))
+ AC_CHECK_LIB([ibumad], [umad_init], [LIBUMAD=-libumad], AC_MSG_ERROR([libibumad not found]))
+diff --git a/src/perftest_resources.c b/src/perftest_resources.c
+index 843c45f..6609afc 100755
+--- a/src/perftest_resources.c
++++ b/src/perftest_resources.c
+@@ -22,9 +22,6 @@
+ #ifdef HAVE_CONFIG_H
+ #include <config.h>
+ #endif
+-#ifdef HAVE_SYS_RANDOM_H
+-#include <sys/random.h>
+-#endif
+ #ifdef HAVE_SRD
+ #include <infiniband/efadv.h>
+ #endif
+@@ -1604,33 +1601,12 @@ int create_cqs(struct pingpong_context *ctx, struct perftest_parameters *user_pa
+ return ret;
+ }
+
+-static void random_data(char *buf, int buff_size)
+-{
+- int i;
+-#ifdef HAVE_SYS_RANDOM_H
+- char *tmp = buf;
+- int ret;
+-
+- for(i = buff_size; i > 0;) {
+- ret = getrandom(tmp, i, 0);
+- if(ret < 0)
+- goto fall_back;
+- tmp += ret;
+- i -= ret;
+- }
+- return;
+-fall_back:
+-#endif
+- srand(time(NULL));
+- for (i = 0; i < buff_size; i++)
+- buf[i] = (char)rand();
+-}
+-
+ /******************************************************************************
+ *
+ ******************************************************************************/
+ int create_single_mr(struct pingpong_context *ctx, struct perftest_parameters *user_param, int qp_index)
+ {
++ int i;
+ int flags = IBV_ACCESS_LOCAL_WRITE;
+
+
+@@ -1769,10 +1745,13 @@ int create_single_mr(struct pingpong_context *ctx, struct perftest_parameters *u
+ #ifdef HAVE_CUDA
+ if (!user_param->use_cuda) {
+ #endif
++ srand(time(NULL));
+ if (user_param->verb == WRITE && user_param->tst == LAT) {
+ memset(ctx->buf[qp_index], 0, ctx->buff_size);
+ } else {
+- random_data(ctx->buf[qp_index], ctx->buff_size);
++ for (i = 0; i < ctx->buff_size; i++) {
++ ((char*)ctx->buf[qp_index])[i] = (char)rand();
++ }
+ }
+ #ifdef HAVE_CUDA
+ }
+--
+2.33.0
+
diff --git a/0017-Perftest-random-buffer-initialization-optimization.patch b/0017-Perftest-random-buffer-initialization-optimization.patch
new file mode 100644
index 0000000..fd48325
--- /dev/null
+++ b/0017-Perftest-random-buffer-initialization-optimization.patch
@@ -0,0 +1,82 @@
+From eef2e242bf7db2879b7b87fb53312030513754b6 Mon Sep 17 00:00:00 2001
+From: Shmuel Shaul <sshaul(a)nvidia.com>
+Date: Mon, 21 Apr 2025 14:58:47 +0300
+Subject: [PATCH 17/20] Perftest: random buffer initialization optimization
+
+Replace the standard rand() function with the PCG32 algorithm in buffer
+initialization to improve performance. The PCG32 implementation:
+- Generates 32-bit random numbers (0 to 4,294,967,295)
+- Uses /dev/urandom for initial seeding with fallback to time+pid+clock
+- Provides better performance than standard rand()
+- Maintains good randomness properties
+
+Signed-off-by: Shmuel Shaul <sshaul(a)nvidia.com>
+---
+ src/perftest_resources.c | 32 ++++++++++++++++++++++++++++++--
+ 1 file changed, 30 insertions(+), 2 deletions(-)
+
+diff --git a/src/perftest_resources.c b/src/perftest_resources.c
+index 6609afc..7c01da7 100755
+--- a/src/perftest_resources.c
++++ b/src/perftest_resources.c
+@@ -38,6 +38,7 @@ static enum ibv_wr_opcode opcode_atomic_array[] = {IBV_WR_ATOMIC_CMP_AND_SWP,IBV
+ struct perftest_parameters* duration_param;
+ struct check_alive_data check_alive_data;
+
++
+ /******************************************************************************
+ * Beginning
+ ******************************************************************************/
+@@ -320,6 +321,33 @@ static int pp_free_mmap(struct pingpong_context *ctx)
+ return 0;
+ }
+
++static uint32_t perftest_rand(uint32_t *state) {
++ uint32_t x = *state;
++ *state = x * 747796405 + 2891336453;
++ uint32_t word = ((x >> ((x >> 28) + 4)) ^ x) * 277803737;
++ return (word >> 22) ^ word;
++ }
++
++ // Properly initialize the rand algorithm state
++ static uint32_t init_perftest_rand_state() {
++ uint32_t seed;
++
++ FILE* f = fopen("/dev/urandom", "rb");
++ if (f) {
++ if (fread(&seed, sizeof(seed), 1, f) == 1) {
++ fclose(f);
++ return seed;
++ }
++ fclose(f);
++ }
++
++ seed = (uint32_t)time(NULL);
++ seed ^= (uint32_t)getpid();
++ seed ^= (uint32_t)clock();
++
++ return seed;
++ }
++
+ static int next_word_string(char* input, char* output, int from_index)
+ {
+ int i = from_index;
+@@ -1745,12 +1773,12 @@ int create_single_mr(struct pingpong_context *ctx, struct perftest_parameters *u
+ #ifdef HAVE_CUDA
+ if (!user_param->use_cuda) {
+ #endif
+- srand(time(NULL));
++ uint32_t rng_state = init_perftest_rand_state();
+ if (user_param->verb == WRITE && user_param->tst == LAT) {
+ memset(ctx->buf[qp_index], 0, ctx->buff_size);
+ } else {
+ for (i = 0; i < ctx->buff_size; i++) {
+- ((char*)ctx->buf[qp_index])[i] = (char)rand();
++ ((char*)ctx->buf[qp_index])[i] = (char)perftest_rand(&rng_state);
+ }
+ }
+ #ifdef HAVE_CUDA
+--
+2.33.0
+
diff --git a/0018-Perftest-Fix-perform-warm-up-process-stuck.patch b/0018-Perftest-Fix-perform-warm-up-process-stuck.patch
new file mode 100644
index 0000000..8054456
--- /dev/null
+++ b/0018-Perftest-Fix-perform-warm-up-process-stuck.patch
@@ -0,0 +1,64 @@
+From eeb0572c2500ade41860dc9b2bb89619aa13b07a Mon Sep 17 00:00:00 2001
+From: Guofeng Yue <yueguofeng(a)h-partners.com>
+Date: Tue, 15 Apr 2025 17:09:47 +0800
+Subject: [PATCH 18/20] Perftest: Fix perform warm up process stuck
+
+In perform_warm_up mode, if the length of post_list is 1 and the
+message size is less than or equal to 8192, all send_flags in WRs
+are 0 and CQEs will not be generated since IBV_SEND_SIGNALED is
+not set. As a result, the perform_warm_up process will get stuck in
+an infinite poll-CQ loop.
+
+Set IBV_SEND_SIGNALED in this case to request a CQE, and clear the
+flag after post_send_method to avoid affecting subsequent tests.
+
+Fixes: 56d025e4f19a ("Allow overriding CQ moderation on post list mode (#58)")
+Signed-off-by: Guofeng Yue <yueguofeng(a)h-partners.com>
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+---
+ src/perftest_resources.c | 13 ++++++++++++-
+ 1 file changed, 12 insertions(+), 1 deletion(-)
+
+diff --git a/src/perftest_resources.c b/src/perftest_resources.c
+index 7c01da7..d123e79 100755
+--- a/src/perftest_resources.c
++++ b/src/perftest_resources.c
+@@ -3301,6 +3301,7 @@ int perform_warm_up(struct pingpong_context *ctx,struct perftest_parameters *use
+ struct ibv_wc *wc_for_cleaning = NULL;
+ int num_of_qps = user_param->num_of_qps;
+ int return_value = 0;
++ int set_signaled = 0;
+
+ if(user_param->duplex && (user_param->use_xrc || user_param->connection_type == DC))
+ num_of_qps /= 2;
+@@ -3317,9 +3318,13 @@ int perform_warm_up(struct pingpong_context *ctx,struct perftest_parameters *use
+ ne = ibv_poll_cq(ctx->send_cq,user_param->tx_depth,wc_for_cleaning);
+
+ for (index=0 ; index < num_of_qps ; index++) {
++ /* ask for completion on this wr */
++ if (user_param->post_list == 1 && !(ctx->wr[index].send_flags & IBV_SEND_SIGNALED)) {
++ ctx->wr[index].send_flags |= IBV_SEND_SIGNALED;
++ set_signaled = 1;
++ }
+
+ for (warmindex = 0 ;warmindex < warmupsession ;warmindex += user_param->post_list) {
+-
+ err = post_send_method(ctx, index, user_param);
+ if (err) {
+ fprintf(stderr,"Couldn't post send during warm up: qp %d scnt=%d \n",index,warmindex);
+@@ -3328,6 +3333,12 @@ int perform_warm_up(struct pingpong_context *ctx,struct perftest_parameters *use
+ }
+ }
+
++ /* Clear the flag to avoid affecting subsequent tests. */
++ if (set_signaled) {
++ ctx->wr[index].send_flags &= ~IBV_SEND_SIGNALED;
++ set_signaled = 0;
++ }
++
+ do {
+
+ ne = ibv_poll_cq(ctx->send_cq,1,&wc);
+--
+2.33.0
+
diff --git a/0019-Perftest-Fix-TD-lock-free-mode-not-working-for-SRQ-X.patch b/0019-Perftest-Fix-TD-lock-free-mode-not-working-for-SRQ-X.patch
new file mode 100644
index 0000000..5f7957f
--- /dev/null
+++ b/0019-Perftest-Fix-TD-lock-free-mode-not-working-for-SRQ-X.patch
@@ -0,0 +1,105 @@
+From 68fd12d94e24a6cd250e682f8242d9f2be2d4ba5 Mon Sep 17 00:00:00 2001
+From: Guofeng Yue <yueguofeng(a)h-partners.com>
+Date: Tue, 15 Apr 2025 17:09:46 +0800
+Subject: [PATCH 19/20] Perftest: Fix TD lock-free mode not working for SRQ/XRC
+ QP
+
+When creating SRQ/XRC QP in TD lock-free mode, pass in ctx->pad
+instead of ctx->pd, otherwise lock-free mode won't work.
+
+Besides, use ctx->pad directly when creating QP/SRQ, since pad
+is designed to be interchangeable with the usual pd. When
+lock-free mode is disabled, pad is exactly the usual pd.
+
+Fixes: b6f957f6bc6c ("Perftest: Add support for TD lock-free mode")
+Signed-off-by: Guofeng Yue <yueguofeng(a)h-partners.com>
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+---
+ src/perftest_resources.c | 21 ++++++++++++---------
+ src/perftest_resources.h | 2 +-
+ 2 files changed, 13 insertions(+), 10 deletions(-)
+
+diff --git a/src/perftest_resources.c b/src/perftest_resources.c
+index d123e79..b388a45 100755
+--- a/src/perftest_resources.c
++++ b/src/perftest_resources.c
+@@ -913,7 +913,8 @@ static int ctx_xrc_srq_create(struct pingpong_context *ctx,
+ else
+ srq_init_attr.cq = ctx->send_cq;
+
+- srq_init_attr.pd = ctx->pd;
++ srq_init_attr.pd = ctx->pad;
++
+ ctx->srq = ibv_create_srq_ex(ctx->context, &srq_init_attr);
+ if (ctx->srq == NULL) {
+ fprintf(stderr, "Couldn't open XRC SRQ\n");
+@@ -956,7 +957,8 @@ static struct ibv_qp *ctx_xrc_qp_create(struct pingpong_context *ctx,
+ qp_init_attr.cap.max_send_wr = user_param->tx_depth;
+ qp_init_attr.cap.max_send_sge = 1;
+ qp_init_attr.comp_mask = IBV_QP_INIT_ATTR_PD;
+- qp_init_attr.pd = ctx->pd;
++ qp_init_attr.pd = ctx->pad;
++
+ #ifdef HAVE_IBV_WR_API
+ if (!user_param->use_old_post_send)
+ qp_init_attr.comp_mask |= IBV_QP_INIT_ATTR_SEND_OPS_FLAGS;
+@@ -1994,6 +1996,10 @@ int ctx_init(struct pingpong_context *ctx, struct perftest_parameters *user_para
+ fprintf(stderr, "Couldn't allocate PAD\n");
+ return FAILURE;
+ }
++ } else {
++ #endif
++ ctx->pad = ctx->pd;
++ #ifdef HAVE_TD_API
+ }
+ #endif
+
+@@ -2111,7 +2117,7 @@ int ctx_init(struct pingpong_context *ctx, struct perftest_parameters *user_para
+ attr.comp_mask = IBV_SRQ_INIT_ATTR_TYPE | IBV_SRQ_INIT_ATTR_PD;
+ attr.attr.max_wr = user_param->rx_depth;
+ attr.attr.max_sge = 1;
+- attr.pd = ctx->pd;
++ attr.pd = ctx->pad;
+
+ attr.srq_type = IBV_SRQT_BASIC;
+ ctx->srq = ibv_create_srq_ex(ctx->context, &attr);
+@@ -2132,7 +2138,7 @@ int ctx_init(struct pingpong_context *ctx, struct perftest_parameters *user_para
+ .max_sge = 1
+ }
+ };
+- ctx->srq = ibv_create_srq(ctx->pd, &attr);
++ ctx->srq = ibv_create_srq(ctx->pad, &attr);
+ if (!ctx->srq) {
+ fprintf(stderr, "Couldn't create SRQ\n");
+ return FAILURE;
+@@ -2319,11 +2325,8 @@ struct ibv_qp* ctx_qp_create(struct pingpong_context *ctx,
+ else if (opcode == IBV_WR_RDMA_READ)
+ attr_ex.send_ops_flags |= IBV_QP_EX_WITH_RDMA_READ;
+ }
+- #ifdef HAVE_TD_API
+- attr_ex.pd = user_param->no_lock ? ctx->pad : ctx->pd;
+- #else
+- attr_ex.pd = ctx->pd;
+- #endif
++
++ attr_ex.pd = ctx->pad;
+ attr_ex.comp_mask |= IBV_QP_INIT_ATTR_SEND_OPS_FLAGS | IBV_QP_INIT_ATTR_PD;
+ attr_ex.send_cq = attr.send_cq;
+ attr_ex.recv_cq = attr.recv_cq;
+diff --git a/src/perftest_resources.h b/src/perftest_resources.h
+index ba8630b..fb11d44 100755
+--- a/src/perftest_resources.h
++++ b/src/perftest_resources.h
+@@ -172,8 +172,8 @@ struct pingpong_context {
+ struct ibv_pd *pd;
+ #ifdef HAVE_TD_API
+ struct ibv_td *td;
+- struct ibv_pd *pad;
+ #endif
++ struct ibv_pd *pad;
+ struct ibv_mr **mr;
+ struct ibv_cq *send_cq;
+ struct ibv_cq *recv_cq;
+--
+2.33.0
+
diff --git a/0020-Perftest-Fix-recv-cq-leak.patch b/0020-Perftest-Fix-recv-cq-leak.patch
new file mode 100644
index 0000000..9a06f3c
--- /dev/null
+++ b/0020-Perftest-Fix-recv-cq-leak.patch
@@ -0,0 +1,54 @@
+From 7dc37bf199b64d9deb7ae041bc5c66819fdd6c32 Mon Sep 17 00:00:00 2001
+From: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Date: Thu, 21 Jul 2022 16:14:09 +0300
+Subject: [PATCH 20/20] Perftest: Fix recv-cq leak
+
+Perftest creates both send-cq and recv-cq but only destroys the send-cq
+on the SEND client. This further leads to a failure in deallocating the
+parent domain due to the pad refcount design in the driver:
+
+Failed to deallocate PAD - No data available
+Failed to deallocate TD - No data available
+Failed to deallocate PD - No data available
+
+The original mainline PR was:
+https://github.com/linux-rdma/perftest/commit/869f96161be03850c9ace80bbac488ac6010a561
+
+Signed-off-by: Shmuel Shaul <sshaul(a)nvidia.com>
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+---
+ src/perftest_resources.c | 11 +++++------
+ 1 file changed, 5 insertions(+), 6 deletions(-)
+
+diff --git a/src/perftest_resources.c b/src/perftest_resources.c
+index b388a45..b6a0da6 100755
+--- a/src/perftest_resources.c
++++ b/src/perftest_resources.c
+@@ -1253,6 +1253,7 @@ int destroy_ctx(struct pingpong_context *ctx,
+ int i, first, dereg_counter, rc;
+ int test_result = 0;
+ int num_of_qps = user_param->num_of_qps;
++ int dct_only = (user_param->machine == SERVER && !(user_param->duplex || user_param->tst == LAT));
+
+ if (user_param->wait_destroy) {
+ printf(" Waiting %u seconds before releasing resources...\n",
+@@ -1347,12 +1348,10 @@ int destroy_ctx(struct pingpong_context *ctx,
+ test_result = 1;
+ }
+
+- if (user_param->verb == SEND && (user_param->tst == LAT || user_param->machine == SERVER || user_param->duplex || (ctx->channel)) ) {
+- if (!(user_param->connection_type == DC && user_param->machine == SERVER)) {
+- if (ibv_destroy_cq(ctx->recv_cq)) {
+- fprintf(stderr, "Failed to destroy CQ - %s\n", strerror(errno));
+- test_result = 1;
+- }
++ if ((user_param->verb == SEND) || (user_param->connection_type == DC && !dct_only)){
++ if (ibv_destroy_cq(ctx->recv_cq)) {
++ fprintf(stderr, "Failed to destroy CQ - %s\n", strerror(errno));
++ test_result = 1;
+ }
+ }
+
+--
+2.33.0
+
diff --git a/perftest.spec b/perftest.spec
index 9aa4b46..cf405e8 100644
--- a/perftest.spec
+++ b/perftest.spec
@@ -1,6 +1,6 @@
Name: perftest
Version: 4.5
-Release: 13
+Release: 14
License: GPL-2.0-only OR BSD-2-Clause
Summary: RDMA Performance Testing Tools
Url: https://github.com/linux-rdma/perftest
@@ -21,6 +21,11 @@ Patch12: 0012-Perftest-Add-support-for-TD-lock-free-mode.patch
Patch13: 0013-Perftest-Fix-TD-lock-free-mode-not-working-for-QP.patch
Patch14: 0014-Perftest-Fix-failure-in-creating-cq-when-create-cq-e.patch
Patch15: 0015-Perftest-modify-source_ip-to-bind_sounce_ip-to-fix-i.patch
+Patch16: 0016-Revert-Perftest-replace-rand-with-getrandom-during-M.patch
+Patch17: 0017-Perftest-random-buffer-initialization-optimization.patch
+Patch18: 0018-Perftest-Fix-perform-warm-up-process-stuck.patch
+Patch19: 0019-Perftest-Fix-TD-lock-free-mode-not-working-for-SRQ-X.patch
+Patch20: 0020-Perftest-Fix-recv-cq-leak.patch
BuildRequires: automake gcc libibverbs-devel >= 1.2.0 librdmacm-devel >= 1.0.21 libibumad-devel >= 1.3.10.2
BuildRequires: pciutils-devel libibverbs librdmacm libibumad
@@ -47,6 +52,12 @@ done
%_bindir/*
%changelog
+* Tue Jun 10 2025 Guofeng Yue <yueguofeng(a)h-partners.com> - 4.5-14
+- Type: bugfix
+- ID: NA
+- SUG: NA
+- DESC: Sync some patches from mainline
+
* Wed Mar 12 2025 Funda Wang <fundawang(a)yeah.net> - 4.5-13
- Type: bugfix
- ID: NA
--
2.33.0
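The generator added in patch 0017 can be exercised standalone. The sketch below reproduces its permuted-congruential step and the /dev/urandom seeding fallback; the function names follow the patch, but this is an illustrative extraction, not the perftest source.

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Permuted-congruential step as added by the patch: one 32-bit LCG
 * advance plus an output permutation (xorshift and variable shift). */
static uint32_t perftest_rand(uint32_t *state)
{
	uint32_t x = *state;
	uint32_t word;

	*state = x * 747796405u + 2891336453u;
	word = ((x >> ((x >> 28) + 4)) ^ x) * 277803737u;
	return (word >> 22) ^ word;
}

/* Seed from /dev/urandom, falling back to time + pid + clock. */
static uint32_t init_perftest_rand_state(void)
{
	uint32_t seed;
	FILE *f = fopen("/dev/urandom", "rb");

	if (f) {
		if (fread(&seed, sizeof(seed), 1, f) == 1) {
			fclose(f);
			return seed;
		}
		fclose(f);
	}

	return (uint32_t)time(NULL) ^ (uint32_t)getpid() ^ (uint32_t)clock();
}
```

A fixed seed makes the stream reproducible across runs; the per-byte cost is a multiply and a few shifts, instead of a library call per byte with rand().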
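The signaled-flag toggle from patch 0018 reduces to a small set/restore pair. A sketch with a stand-in flag value, since the real `IBV_SEND_SIGNALED` lives in `<infiniband/verbs.h>`; only the pattern, not the value, matters here.

```c
/* Stand-in for IBV_SEND_SIGNALED from <infiniband/verbs.h>;
 * the real enum value differs, only the set/clear pattern matters. */
#define SEND_SIGNALED 0x2u

/* Request a CQE for an unsignaled single-WR post list so the warm-up
 * poll loop can terminate; report whether the flag must be restored. */
static unsigned int warmup_send_flags(unsigned int flags, int post_list,
				      int *restore)
{
	*restore = 0;
	if (post_list == 1 && !(flags & SEND_SIGNALED)) {
		flags |= SEND_SIGNALED;
		*restore = 1;
	}
	return flags;
}

/* Undo the temporary flag so later iterations keep the configured
 * CQ moderation. */
static unsigned int restore_send_flags(unsigned int flags, int restore)
{
	return restore ? flags & ~SEND_SIGNALED : flags;
}
```

Tracking `restore` separately is what keeps the fix from clobbering WRs that were already signaled by the user's configuration.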
From: huangdonghua <huangdonghua3(a)h-partners.com>
Signed-off-by: huangdonghua <huangdonghua3(a)h-partners.com>
---
...-for-input-param-of-hnsdv_query_devi.patch | 54 +++++++++++++
...ns-Adapt-UD-inline-data-size-for-UCX.patch | 75 +++++++++++++++++++
rdma-core.spec | 10 ++-
3 files changed, 138 insertions(+), 1 deletion(-)
create mode 100644 0066-libhns-Add-check-for-input-param-of-hnsdv_query_devi.patch
create mode 100644 0067-libhns-Adapt-UD-inline-data-size-for-UCX.patch
diff --git a/0066-libhns-Add-check-for-input-param-of-hnsdv_query_devi.patch b/0066-libhns-Add-check-for-input-param-of-hnsdv_query_devi.patch
new file mode 100644
index 0000000..6980843
--- /dev/null
+++ b/0066-libhns-Add-check-for-input-param-of-hnsdv_query_devi.patch
@@ -0,0 +1,54 @@
+From 57985b930eab7e5cf4dc53efa6d303ede9b414c6 Mon Sep 17 00:00:00 2001
+From: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Date: Mon, 20 May 2024 14:05:33 +0800
+Subject: [PATCH 66/67] libhns: Add check for input param of
+ hnsdv_query_device()
+
+mainline inclusion
+from mainline-master
+commit 19e66a6b75fd1f441e787d1791fe8a416b2d56cb
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/#ICEES4
+CVE: NA
+
+Reference:
+https://github.com/linux-rdma/rdma-core/pull/1462/commits/5f9e08f62feb67d0841f6fff2bd119a3df63bde9
+
+------------------------------------------------------------------
+
+Add checks for the input params of hnsdv_query_device() to avoid a
+null pointer dereference.
+
+Fixes: cf6d9149f8f5 ("libhns: Introduce hns direct verbs")
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ providers/hns/hns_roce_u_verbs.c | 5 +++--
+ 1 file changed, 3 insertions(+), 2 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index 8bf7bc1..8594666 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -1933,9 +1933,9 @@ struct ibv_qp *hnsdv_create_qp(struct ibv_context *context,
+ int hnsdv_query_device(struct ibv_context *context,
+ struct hnsdv_context *attrs_out)
+ {
+- struct hns_roce_device *hr_dev = to_hr_dev(context->device);
++ struct hns_roce_device *hr_dev;
+
+- if (!hr_dev || !attrs_out)
++ if (!context || !context->device || !attrs_out)
+ return EINVAL;
+
+ if (!is_hns_dev(context->device)) {
+@@ -1944,6 +1944,7 @@ int hnsdv_query_device(struct ibv_context *context,
+ }
+ memset(attrs_out, 0, sizeof(*attrs_out));
+
++ hr_dev = to_hr_dev(context->device);
+ attrs_out->comp_mask |= HNSDV_CONTEXT_MASK_CONGEST_TYPE;
+ attrs_out->congest_type = hr_dev->congest_cap;
+
+--
+2.33.0
+
diff --git a/0067-libhns-Adapt-UD-inline-data-size-for-UCX.patch b/0067-libhns-Adapt-UD-inline-data-size-for-UCX.patch
new file mode 100644
index 0000000..e951fc6
--- /dev/null
+++ b/0067-libhns-Adapt-UD-inline-data-size-for-UCX.patch
@@ -0,0 +1,75 @@
+From 04af69fd5f136852024989d47076898be5982722 Mon Sep 17 00:00:00 2001
+From: wenglianfa <wenglianfa(a)huawei.com>
+Date: Tue, 25 Feb 2025 20:29:53 +0800
+Subject: [PATCH 67/67] libhns: Adapt UD inline data size for UCX
+
+driver inclusion
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/ICEEPO
+
+------------------------------------------------------------------
+
+Adapt UD inline data size for UCX. The value
+must be at least 128 to avoid the UCX bug.
+
+The issue URL:
+https://gitee.com/src-openeuler/rdma-core/issues/ICEEPO?from=project-issue
+
+Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
+Signed-off-by: Donghua Huang <huangdonghua3(a)h-partners.com>
+---
+ providers/hns/hns_roce_u.h | 2 ++
+ providers/hns/hns_roce_u_verbs.c | 16 ++++++++++++----
+ 2 files changed, 14 insertions(+), 4 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u.h b/providers/hns/hns_roce_u.h
+index e7e3f01..3d34495 100644
+--- a/providers/hns/hns_roce_u.h
++++ b/providers/hns/hns_roce_u.h
+@@ -83,6 +83,8 @@ typedef _Atomic(uint64_t) atomic_bitmap_t;
+ #define HNS_ROCE_ADDRESS_MASK 0xFFFFFFFF
+ #define HNS_ROCE_ADDRESS_SHIFT 32
+
++#define HNS_ROCE_MIN_UD_INLINE 128
++
+ #define roce_get_field(origin, mask, shift) \
+ (((le32toh(origin)) & (mask)) >> (shift))
+
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index 8594666..7a0fcab 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -1511,10 +1511,18 @@ static unsigned int get_sge_num_from_max_inl_data(bool is_ud,
+ }
+
+ static uint32_t get_max_inline_data(struct hns_roce_context *ctx,
+- struct ibv_qp_cap *cap)
++ struct ibv_qp_cap *cap,
++ bool is_ud)
+ {
+- if (cap->max_inline_data)
+- return min_t(uint32_t, roundup_pow_of_two(cap->max_inline_data),
++ uint32_t max_inline_data = cap->max_inline_data;
++
++ if (max_inline_data) {
++ max_inline_data = roundup_pow_of_two(max_inline_data);
++
++ if (is_ud && max_inline_data < HNS_ROCE_MIN_UD_INLINE)
++ max_inline_data = HNS_ROCE_MIN_UD_INLINE;
++
++ return min_t(uint32_t, max_inline_data,
+ ctx->max_inline_data);
+
+ return 0;
+@@ -1536,7 +1544,7 @@ static void set_ext_sge_param(struct hns_roce_context *ctx,
+ attr->cap.max_send_sge);
+
+ if (ctx->config & HNS_ROCE_RSP_EXSGE_FLAGS) {
+- attr->cap.max_inline_data = get_max_inline_data(ctx, &attr->cap);
++ attr->cap.max_inline_data = get_max_inline_data(ctx, &attr->cap, is_ud);
+
+ inline_ext_sge = max(ext_wqe_sge_cnt,
+ get_sge_num_from_max_inl_data(is_ud,
+--
+2.33.0
+
diff --git a/rdma-core.spec b/rdma-core.spec
index ed09fe8..046cf98 100644
--- a/rdma-core.spec
+++ b/rdma-core.spec
@@ -1,6 +1,6 @@
Name: rdma-core
Version: 50.0
-Release: 31
+Release: 32
Summary: RDMA core userspace libraries and daemons
License: GPL-2.0-only OR BSD-2-Clause AND BSD-3-Clause
Url: https://github.com/linux-rdma/rdma-core
@@ -71,6 +71,8 @@ patch62: 0062-verbs-Assign-ibv-srq-pd-when-creating-SRQ.patch
patch63: 0063-libxscale-update-to-version-2412GA.patch
patch64: 0064-libxscale-automatically-load-xsc_ib.ko.patch
patch65: 0065-libhns-Fix-double-free-of-rinl_buf-wqe_list.patch
+patch66: 0066-libhns-Add-check-for-input-param-of-hnsdv_query_devi.patch
+patch67: 0067-libhns-Adapt-UD-inline-data-size-for-UCX.patch
BuildRequires: binutils cmake >= 2.8.11 gcc libudev-devel pkgconfig pkgconfig(libnl-3.0)
BuildRequires: pkgconfig(libnl-route-3.0) systemd systemd-devel
@@ -650,6 +652,12 @@ fi
%doc %{_docdir}/%{name}-%{version}/70-persistent-ipoib.rules
%changelog
+* Thu Jun 12 2025 Donghua Huang <huangdonghua3(a)h-partners.com> - 50.0-32
+- Type: bugfix
+- ID: NA
+- SUG: NA
+- DESC: libhns: Increase input parameter checks and adjust inline data size.
+
* Tue May 27 2025 Junxian Huang <huangjunxian6(a)hisilicon.com> - 50.0-31
- Type: bugfix
- ID: NA
--
2.33.0
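The clamping rule from patch 0067 is easy to model outside the driver. A sketch, where `roundup_pow_of_two_u32` is a local stand-in for the kernel-style `roundup_pow_of_two` helper and `dev_cap` stands in for `ctx->max_inline_data`; the function name is illustrative.

```c
#include <stdint.h>

#define HNS_ROCE_MIN_UD_INLINE 128u

/* Local stand-in for the kernel-style roundup_pow_of_two() helper. */
static uint32_t roundup_pow_of_two_u32(uint32_t v)
{
	if (v <= 1)
		return 1;
	v--;
	v |= v >> 1;
	v |= v >> 2;
	v |= v >> 4;
	v |= v >> 8;
	v |= v >> 16;
	return v + 1;
}

/* Clamping rule from the patch: round the request up to a power of
 * two, raise it to the UD minimum when needed, and cap it at the
 * device limit (dev_cap stands in for ctx->max_inline_data). */
static uint32_t max_inline_data(uint32_t requested, uint32_t dev_cap,
				int is_ud)
{
	uint32_t v;

	if (!requested)
		return 0;

	v = roundup_pow_of_two_u32(requested);
	if (is_ud && v < HNS_ROCE_MIN_UD_INLINE)
		v = HNS_ROCE_MIN_UD_INLINE;

	return v < dev_cap ? v : dev_cap;
}
```

Note the ordering: the UD floor is applied before the device cap, so a device limit below 128 still wins, matching the `min_t` in the patch.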
MW is no longer supported in hns. Delete the relevant code.
Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
---
providers/hns/hns_roce_u.c | 3 --
providers/hns/hns_roce_u.h | 5 ---
providers/hns/hns_roce_u_hw_v2.c | 32 ----------------
providers/hns/hns_roce_u_hw_v2.h | 7 ----
providers/hns/hns_roce_u_verbs.c | 63 --------------------------------
5 files changed, 110 deletions(-)
diff --git a/providers/hns/hns_roce_u.c b/providers/hns/hns_roce_u.c
index 63a1ac551..21c5f51e7 100644
--- a/providers/hns/hns_roce_u.c
+++ b/providers/hns/hns_roce_u.c
@@ -58,15 +58,12 @@ static const struct verbs_match_ent hca_table[] = {
};
static const struct verbs_context_ops hns_common_ops = {
- .alloc_mw = hns_roce_u_alloc_mw,
.alloc_pd = hns_roce_u_alloc_pd,
- .bind_mw = hns_roce_u_bind_mw,
.cq_event = hns_roce_u_cq_event,
.create_cq = hns_roce_u_create_cq,
.create_cq_ex = hns_roce_u_create_cq_ex,
.create_qp = hns_roce_u_create_qp,
.create_qp_ex = hns_roce_u_create_qp_ex,
- .dealloc_mw = hns_roce_u_dealloc_mw,
.dealloc_pd = hns_roce_u_dealloc_pd,
.dereg_mr = hns_roce_u_dereg_mr,
.destroy_cq = hns_roce_u_destroy_cq,
diff --git a/providers/hns/hns_roce_u.h b/providers/hns/hns_roce_u.h
index 614fed992..1cf3c7cb5 100644
--- a/providers/hns/hns_roce_u.h
+++ b/providers/hns/hns_roce_u.h
@@ -508,11 +508,6 @@ int hns_roce_u_rereg_mr(struct verbs_mr *vmr, int flags, struct ibv_pd *pd,
void *addr, size_t length, int access);
int hns_roce_u_dereg_mr(struct verbs_mr *vmr);
-struct ibv_mw *hns_roce_u_alloc_mw(struct ibv_pd *pd, enum ibv_mw_type type);
-int hns_roce_u_dealloc_mw(struct ibv_mw *mw);
-int hns_roce_u_bind_mw(struct ibv_qp *qp, struct ibv_mw *mw,
- struct ibv_mw_bind *mw_bind);
-
struct ibv_cq *hns_roce_u_create_cq(struct ibv_context *context, int cqe,
struct ibv_comp_channel *channel,
int comp_vector);
diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
index d24cad5bf..784841f43 100644
--- a/providers/hns/hns_roce_u_hw_v2.c
+++ b/providers/hns/hns_roce_u_hw_v2.c
@@ -51,7 +51,6 @@ static const uint32_t hns_roce_opcode[] = {
HR_IBV_OPC_MAP(RDMA_READ, RDMA_READ),
HR_IBV_OPC_MAP(ATOMIC_CMP_AND_SWP, ATOMIC_COM_AND_SWAP),
HR_IBV_OPC_MAP(ATOMIC_FETCH_AND_ADD, ATOMIC_FETCH_AND_ADD),
- HR_IBV_OPC_MAP(BIND_MW, BIND_MW_TYPE),
HR_IBV_OPC_MAP(SEND_WITH_INV, SEND_WITH_INV),
};
@@ -386,7 +385,6 @@ static const unsigned int wc_send_op_map[] = {
[HNS_ROCE_SQ_OP_RDMA_READ] = IBV_WC_RDMA_READ,
[HNS_ROCE_SQ_OP_ATOMIC_COMP_AND_SWAP] = IBV_WC_COMP_SWAP,
[HNS_ROCE_SQ_OP_ATOMIC_FETCH_AND_ADD] = IBV_WC_FETCH_ADD,
- [HNS_ROCE_SQ_OP_BIND_MW] = IBV_WC_BIND_MW,
};
static const unsigned int wc_rcv_op_map[] = {
@@ -568,7 +566,6 @@ static void parse_cqe_for_req(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc,
case HNS_ROCE_SQ_OP_SEND:
case HNS_ROCE_SQ_OP_SEND_WITH_INV:
case HNS_ROCE_SQ_OP_RDMA_WRITE:
- case HNS_ROCE_SQ_OP_BIND_MW:
wc->wc_flags = 0;
break;
case HNS_ROCE_SQ_OP_SEND_WITH_IMM:
@@ -1251,28 +1248,6 @@ static int set_rc_inl(struct hns_roce_qp *qp, const struct ibv_send_wr *wr,
return 0;
}
-static void set_bind_mw_seg(struct hns_roce_rc_sq_wqe *wqe,
- const struct ibv_send_wr *wr)
-{
- unsigned int access = wr->bind_mw.bind_info.mw_access_flags;
-
- hr_reg_write_bool(wqe, RCWQE_MW_TYPE, wr->bind_mw.mw->type - 1);
- hr_reg_write_bool(wqe, RCWQE_MW_RA_EN,
- !!(access & IBV_ACCESS_REMOTE_ATOMIC));
- hr_reg_write_bool(wqe, RCWQE_MW_RR_EN,
- !!(access & IBV_ACCESS_REMOTE_READ));
- hr_reg_write_bool(wqe, RCWQE_MW_RW_EN,
- !!(access & IBV_ACCESS_REMOTE_WRITE));
-
- wqe->new_rkey = htole32(wr->bind_mw.rkey);
- wqe->byte_16 = htole32(wr->bind_mw.bind_info.length &
- HNS_ROCE_ADDRESS_MASK);
- wqe->byte_20 = htole32(wr->bind_mw.bind_info.length >>
- HNS_ROCE_ADDRESS_SHIFT);
- wqe->rkey = htole32(wr->bind_mw.bind_info.mr->rkey);
- wqe->va = htole64(wr->bind_mw.bind_info.addr);
-}
-
static int check_rc_opcode(struct hns_roce_rc_sq_wqe *wqe,
const struct ibv_send_wr *wr)
{
@@ -1298,9 +1273,6 @@ static int check_rc_opcode(struct hns_roce_rc_sq_wqe *wqe,
case IBV_WR_SEND_WITH_INV:
wqe->inv_key = htole32(wr->invalidate_rkey);
break;
- case IBV_WR_BIND_MW:
- set_bind_mw_seg(wqe, wr);
- break;
default:
ret = EINVAL;
break;
@@ -1334,9 +1306,6 @@ static int set_rc_wqe(void *wqe, struct hns_roce_qp *qp, struct ibv_send_wr *wr,
hr_reg_write(rc_sq_wqe, RCWQE_MSG_START_SGE_IDX,
sge_info->start_idx & (qp->ex_sge.sge_cnt - 1));
- if (wr->opcode == IBV_WR_BIND_MW)
- goto wqe_valid;
-
wqe += sizeof(struct hns_roce_rc_sq_wqe);
dseg = wqe;
@@ -1357,7 +1326,6 @@ static int set_rc_wqe(void *wqe, struct hns_roce_qp *qp, struct ibv_send_wr *wr,
if (ret)
return ret;
-wqe_valid:
enable_wqe(qp, rc_sq_wqe, qp->sq.head + nreq);
return 0;
diff --git a/providers/hns/hns_roce_u_hw_v2.h b/providers/hns/hns_roce_u_hw_v2.h
index abf94673e..af061399c 100644
--- a/providers/hns/hns_roce_u_hw_v2.h
+++ b/providers/hns/hns_roce_u_hw_v2.h
@@ -60,7 +60,6 @@ enum {
HNS_ROCE_WQE_OP_ATOMIC_MASK_COMP_AND_SWAP = 0x8,
HNS_ROCE_WQE_OP_ATOMIC_MASK_FETCH_AND_ADD = 0x9,
HNS_ROCE_WQE_OP_FAST_REG_PMR = 0xa,
- HNS_ROCE_WQE_OP_BIND_MW_TYPE = 0xc,
HNS_ROCE_WQE_OP_MASK = 0x1f
};
@@ -84,7 +83,6 @@ enum {
HNS_ROCE_SQ_OP_ATOMIC_MASK_COMP_AND_SWAP = 0x8,
HNS_ROCE_SQ_OP_ATOMIC_MASK_FETCH_AND_ADD = 0x9,
HNS_ROCE_SQ_OP_FAST_REG_PMR = 0xa,
- HNS_ROCE_SQ_OP_BIND_MW = 0xc,
};
enum {
@@ -232,11 +230,6 @@ struct hns_roce_rc_sq_wqe {
#define RCWQE_VA1_L RCWQE_FIELD_LOC(479, 448)
#define RCWQE_VA1_H RCWQE_FIELD_LOC(511, 480)
-#define RCWQE_MW_TYPE RCWQE_FIELD_LOC(256, 256)
-#define RCWQE_MW_RA_EN RCWQE_FIELD_LOC(258, 258)
-#define RCWQE_MW_RR_EN RCWQE_FIELD_LOC(259, 259)
-#define RCWQE_MW_RW_EN RCWQE_FIELD_LOC(260, 260)
-
struct hns_roce_v2_wqe_data_seg {
__le32 len;
__le32 lkey;
diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
index a906e8d58..10fb474af 100644
--- a/providers/hns/hns_roce_u_verbs.c
+++ b/providers/hns/hns_roce_u_verbs.c
@@ -346,69 +346,6 @@ int hns_roce_u_dereg_mr(struct verbs_mr *vmr)
return ret;
}
-int hns_roce_u_bind_mw(struct ibv_qp *qp, struct ibv_mw *mw,
- struct ibv_mw_bind *mw_bind)
-{
- struct ibv_mw_bind_info *bind_info = &mw_bind->bind_info;
- struct ibv_send_wr *bad_wr = NULL;
- struct ibv_send_wr wr = {};
- int ret;
-
- if (bind_info->mw_access_flags & ~(IBV_ACCESS_REMOTE_WRITE |
- IBV_ACCESS_REMOTE_READ | IBV_ACCESS_REMOTE_ATOMIC))
- return EINVAL;
-
- wr.opcode = IBV_WR_BIND_MW;
- wr.next = NULL;
-
- wr.wr_id = mw_bind->wr_id;
- wr.send_flags = mw_bind->send_flags;
-
- wr.bind_mw.mw = mw;
- wr.bind_mw.rkey = ibv_inc_rkey(mw->rkey);
- wr.bind_mw.bind_info = mw_bind->bind_info;
-
- ret = hns_roce_u_v2_post_send(qp, &wr, &bad_wr);
- if (ret)
- return ret;
-
- mw->rkey = wr.bind_mw.rkey;
-
- return 0;
-}
-
-struct ibv_mw *hns_roce_u_alloc_mw(struct ibv_pd *pd, enum ibv_mw_type type)
-{
- struct ibv_mw *mw;
- struct ibv_alloc_mw cmd = {};
- struct ib_uverbs_alloc_mw_resp resp = {};
-
- mw = malloc(sizeof(*mw));
- if (!mw)
- return NULL;
-
- if (ibv_cmd_alloc_mw(pd, type, mw, &cmd, sizeof(cmd),
- &resp, sizeof(resp))) {
- free(mw);
- return NULL;
- }
-
- return mw;
-}
-
-int hns_roce_u_dealloc_mw(struct ibv_mw *mw)
-{
- int ret;
-
- ret = ibv_cmd_dealloc_mw(mw);
- if (ret)
- return ret;
-
- free(mw);
-
- return 0;
-}
-
enum {
CREATE_CQ_SUPPORTED_COMP_MASK = IBV_CQ_INIT_ATTR_MASK_FLAGS |
IBV_CQ_INIT_ATTR_MASK_PD,
--
2.33.0
rinl_buf->wqe_list will be double-freed in the error flow, first in
alloc_recv_rinl_buf() and then in free_recv_rinl_buf(). Actually,
free_recv_rinl_buf() shouldn't be called when alloc_recv_rinl_buf()
has failed.
Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
---
...Fix-double-free-of-rinl_buf-wqe_list.patch | 53 +++++++++++++++++++
rdma-core.spec | 9 +++-
2 files changed, 61 insertions(+), 1 deletion(-)
create mode 100644 0065-libhns-Fix-double-free-of-rinl_buf-wqe_list.patch
diff --git a/0065-libhns-Fix-double-free-of-rinl_buf-wqe_list.patch b/0065-libhns-Fix-double-free-of-rinl_buf-wqe_list.patch
new file mode 100644
index 0000000..e568c7a
--- /dev/null
+++ b/0065-libhns-Fix-double-free-of-rinl_buf-wqe_list.patch
@@ -0,0 +1,53 @@
+From 583d8210da89563fcef0c6e508f58cc7adf72a3b Mon Sep 17 00:00:00 2001
+From: wenglianfa <wenglianfa(a)huawei.com>
+Date: Mon, 12 May 2025 10:51:32 +0800
+Subject: [PATCH 65/65] libhns: Fix double-free of rinl_buf->wqe_list
+
+driver inclusion
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/ICAQ55
+
+------------------------------------------------------------------
+
+rinl_buf->wqe_list will be double-freed in the error flow, first in
+alloc_recv_rinl_buf() and then in free_recv_rinl_buf(). Actually,
+free_recv_rinl_buf() shouldn't be called when alloc_recv_rinl_buf()
+has failed.
+
+Fixes: 83b0baff3ccf ("libhns: Refactor rq inline")
+Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+---
+ providers/hns/hns_roce_u_verbs.c | 7 ++++---
+ 1 file changed, 4 insertions(+), 3 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index edd8e3d..8bf7bc1 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -1453,18 +1453,19 @@ static int qp_alloc_wqe(struct ibv_qp_init_attr_ex *attr,
+ qp->dca_wqe.shift = qp->pageshift;
+ qp->dca_wqe.bufs = calloc(qp->dca_wqe.max_cnt, sizeof(void *));
+ if (!qp->dca_wqe.bufs)
+- goto err_alloc;
++ goto err_alloc_recv_rinl_buf;
+ verbs_debug(&ctx->ibv_ctx, "alloc DCA buf.\n");
+ } else {
+ if (hns_roce_alloc_buf(&qp->buf, qp->buf_size,
+ 1 << qp->pageshift))
+- goto err_alloc;
++ goto err_alloc_recv_rinl_buf;
+ }
+
+ return 0;
+
+-err_alloc:
++err_alloc_recv_rinl_buf:
+ free_recv_rinl_buf(&qp->rq_rinl_buf);
++err_alloc:
+ if (qp->rq.wrid)
+ free(qp->rq.wrid);
+
+--
+2.33.0
+
diff --git a/rdma-core.spec b/rdma-core.spec
index b252761..ed09fe8 100644
--- a/rdma-core.spec
+++ b/rdma-core.spec
@@ -1,6 +1,6 @@
Name: rdma-core
Version: 50.0
-Release: 30
+Release: 31
Summary: RDMA core userspace libraries and daemons
License: GPL-2.0-only OR BSD-2-Clause AND BSD-3-Clause
Url: https://github.com/linux-rdma/rdma-core
@@ -70,6 +70,7 @@ patch61: 0061-libhns-Fix-freeing-pad-without-checking-refcnt.patch
patch62: 0062-verbs-Assign-ibv-srq-pd-when-creating-SRQ.patch
patch63: 0063-libxscale-update-to-version-2412GA.patch
patch64: 0064-libxscale-automatically-load-xsc_ib.ko.patch
+patch65: 0065-libhns-Fix-double-free-of-rinl_buf-wqe_list.patch
BuildRequires: binutils cmake >= 2.8.11 gcc libudev-devel pkgconfig pkgconfig(libnl-3.0)
BuildRequires: pkgconfig(libnl-route-3.0) systemd systemd-devel
@@ -649,6 +650,12 @@ fi
%doc %{_docdir}/%{name}-%{version}/70-persistent-ipoib.rules
%changelog
+* Tue May 27 2025 Junxian Huang <huangjunxian6(a)hisilicon.com> - 50.0-31
+- Type: bugfix
+- ID: NA
+- SUG: NA
+- DESC: libhns: Fix double-free of rinl_buf->wqe_list
+
* Fri May 16 2025 Xin Tian <tianx(a)yunsilicon.com> - 50.0-30
- Type: feature
- ID: NA
--
2.33.0
26 May '25
From: wenglianfa <wenglianfa(a)huawei.com>
rinl_buf->wqe_list will be double-freed in the error flow: first in
alloc_recv_rinl_buf() and then again in free_recv_rinl_buf().
free_recv_rinl_buf() shouldn't be called when alloc_recv_rinl_buf()
itself failed, since it already cleans up after itself.
Fixes: 83b0baff3ccf ("libhns: Refactor rq inline")
Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
---
providers/hns/hns_roce_u_verbs.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
index 132fcaeba..a906e8d58 100644
--- a/providers/hns/hns_roce_u_verbs.c
+++ b/providers/hns/hns_roce_u_verbs.c
@@ -1256,12 +1256,13 @@ static int qp_alloc_wqe(struct ibv_qp_cap *cap, struct hns_roce_qp *qp,
}
if (hns_roce_alloc_buf(&qp->buf, qp->buf_size, HNS_HW_PAGE_SIZE))
- goto err_alloc;
+ goto err_alloc_recv_rinl_buf;
return 0;
-err_alloc:
+err_alloc_recv_rinl_buf:
free_recv_rinl_buf(&qp->rq_rinl_buf);
+err_alloc:
if (qp->rq.wrid)
free(qp->rq.wrid);
--
2.33.0
From: Xinghai Cen <cenxinghai(a)h-partners.com>
The issue fixed by the last commit was found when I created an
XRC SRQ in lock-free mode but failed to destroy it because of
the refcnt check added in the previous commit.
The failure occurred because the PAD is acquired through
ibv_srq->pd in destroy_srq(), while ibv_srq->pd isn't
assigned when the SRQ is created by ibv_create_srq_ex().
So let's assign ibv_srq->pd in the common ibv_icmd_create_srq(),
so that drivers can get the correct pd no matter
which API the SRQ is created by.
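The idea of the fix can be sketched with simplified, hypothetical types (the real ones live in libibverbs): assign the PD once in the shared command path so every creation entry point produces an SRQ whose PD the destroy path can recover.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the verbs types. */
struct pd  { int pdn; };
struct srq { struct pd *pd; };

/* Shared command path: both ibv_create_srq() and ibv_create_srq_ex()
 * funnel through here, so assigning srq->pd once covers both. */
static void icmd_create_srq_sketch(struct pd *pd, struct srq *srq)
{
	srq->pd = pd;	/* the one-line assignment the patch adds */
}

static struct pd *srq_pd_for_destroy(const struct srq *srq)
{
	return srq->pd;	/* was NULL for the _ex() path before the fix */
}
```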
Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
---
...hns-Add-debug-log-for-lock-free-mode.patch | 59 +++++++++++
...s-Fix-ret-not-assigned-in-create-srq.patch | 58 +++++++++++
...efcnt-leaking-in-error-flow-of-creat.patch | 99 +++++++++++++++++++
...-freeing-pad-without-checking-refcnt.patch | 69 +++++++++++++
...-Assign-ibv-srq-pd-when-creating-SRQ.patch | 43 ++++++++
rdma-core.spec | 13 ++-
6 files changed, 340 insertions(+), 1 deletion(-)
create mode 100644 0058-libhns-Add-debug-log-for-lock-free-mode.patch
create mode 100644 0059-libhns-Fix-ret-not-assigned-in-create-srq.patch
create mode 100644 0060-libhns-Fix-pad-refcnt-leaking-in-error-flow-of-creat.patch
create mode 100644 0061-libhns-Fix-freeing-pad-without-checking-refcnt.patch
create mode 100644 0062-verbs-Assign-ibv-srq-pd-when-creating-SRQ.patch
diff --git a/0058-libhns-Add-debug-log-for-lock-free-mode.patch b/0058-libhns-Add-debug-log-for-lock-free-mode.patch
new file mode 100644
index 0000000..86bc0a5
--- /dev/null
+++ b/0058-libhns-Add-debug-log-for-lock-free-mode.patch
@@ -0,0 +1,59 @@
+From 20dc7f183603b936ba7a865fc8a6d115073b1e29 Mon Sep 17 00:00:00 2001
+From: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Date: Thu, 24 Apr 2025 20:32:12 +0800
+Subject: [PATCH 58/62] libhns: Add debug log for lock-free mode
+
+mainline inclusion
+from mainline-v56.0-65
+commit fb96940fcf6f96185d407d57bcaf775ccf8f1762
+category: cleanup
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IC3X57
+CVE: NA
+
+Reference:
+https://github.com/linux-rdma/rdma-core/pull/1599/commits/fb96940fcf6f96185d407d57bcaf775ccf8f1762
+
+---------------------------------------------------------------------
+
+Currently there is no way to observe whether the lock-free mode is
+configured from the driver's perspective. Add debug log for this.
+
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/hns_roce_u_verbs.c | 7 ++++++-
+ 1 file changed, 6 insertions(+), 1 deletion(-)
+
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index 5fe169e..3efc2f4 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -182,6 +182,7 @@ err:
+ struct ibv_pd *hns_roce_u_alloc_pad(struct ibv_context *context,
+ struct ibv_parent_domain_init_attr *attr)
+ {
++ struct hns_roce_pd *protection_domain;
+ struct hns_roce_pad *pad;
+
+ if (ibv_check_alloc_parent_domain(attr))
+@@ -198,12 +199,16 @@ struct ibv_pd *hns_roce_u_alloc_pad(struct ibv_context *context,
+ return NULL;
+ }
+
++ protection_domain = to_hr_pd(attr->pd);
+ if (attr->td) {
+ pad->td = to_hr_td(attr->td);
+ atomic_fetch_add(&pad->td->refcount, 1);
++ verbs_debug(verbs_get_ctx(context),
++ "set PAD(0x%x) to lock-free mode.\n",
++ protection_domain->pdn);
+ }
+
+- pad->pd.protection_domain = to_hr_pd(attr->pd);
++ pad->pd.protection_domain = protection_domain;
+ atomic_fetch_add(&pad->pd.protection_domain->refcount, 1);
+
+ atomic_init(&pad->pd.refcount, 1);
+--
+2.33.0
+
diff --git a/0059-libhns-Fix-ret-not-assigned-in-create-srq.patch b/0059-libhns-Fix-ret-not-assigned-in-create-srq.patch
new file mode 100644
index 0000000..10443d6
--- /dev/null
+++ b/0059-libhns-Fix-ret-not-assigned-in-create-srq.patch
@@ -0,0 +1,58 @@
+From cf284feddf6bb98c600061a3fd1f0095e46b540e Mon Sep 17 00:00:00 2001
+From: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Date: Wed, 23 Apr 2025 16:55:14 +0800
+Subject: [PATCH 59/62] libhns: Fix ret not assigned in create_srq()
+
+mainline inclusion
+from mainline-v56.0-65
+commit 2034b1860c5a8b0cc3879315259462c04e53a98d
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IC3X57
+CVE: NA
+
+Reference:
+https://github.com/linux-rdma/rdma-core/pull/1599/commits/2034b1860c5a8b0cc3879315259462c04e53a98d
+
+---------------------------------------------------------------------
+
+Fix the problem that ret may not be assigned in the error flow
+of create_srq().
+
+Fixes: aa7bcf7f7e44 ("libhns: Add support for lock-free SRQ")
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/hns_roce_u_verbs.c | 10 +++++++---
+ 1 file changed, 7 insertions(+), 3 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index 3efc2f4..b26ac29 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -933,16 +933,20 @@ static struct ibv_srq *create_srq(struct ibv_context *context,
+ if (pad)
+ atomic_fetch_add(&pad->pd.refcount, 1);
+
+- if (hns_roce_srq_spinlock_init(context, srq, init_attr))
++ ret = hns_roce_srq_spinlock_init(context, srq, init_attr);
++ if (ret)
+ goto err_free_srq;
+
+ set_srq_param(context, srq, init_attr);
+- if (alloc_srq_buf(srq))
++ ret = alloc_srq_buf(srq);
++ if (ret)
+ goto err_destroy_lock;
+
+ srq->rdb = hns_roce_alloc_db(hr_ctx, HNS_ROCE_SRQ_TYPE_DB);
+- if (!srq->rdb)
++ if (!srq->rdb) {
++ ret = ENOMEM;
+ goto err_srq_buf;
++ }
+
+ ret = exec_srq_create_cmd(context, srq, init_attr);
+ if (ret)
+--
+2.33.0
+
diff --git a/0060-libhns-Fix-pad-refcnt-leaking-in-error-flow-of-creat.patch b/0060-libhns-Fix-pad-refcnt-leaking-in-error-flow-of-creat.patch
new file mode 100644
index 0000000..95a1e69
--- /dev/null
+++ b/0060-libhns-Fix-pad-refcnt-leaking-in-error-flow-of-creat.patch
@@ -0,0 +1,99 @@
+From 2bc3cafa227528e2893dadfff7cf54cfee427e1a Mon Sep 17 00:00:00 2001
+From: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Date: Wed, 23 Apr 2025 16:55:15 +0800
+Subject: [PATCH 60/62] libhns: Fix pad refcnt leaking in error flow of create
+ qp/cq/srq
+
+mainline inclusion
+from mainline-v56.0-65
+commit f877d6e610e438515e1535c9ec7a3a3ef37c58e0
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IC3X57
+CVE: NA
+
+Reference:
+https://github.com/linux-rdma/rdma-core/pull/1599/commits/f877d6e610e438515e1535c9ec7a3a3ef37c58e0
+
+---------------------------------------------------------------------
+
+Decrease pad refcnt by 1 in error flow of create qp/cq/srq.
+
+Fixes: f8b4f622b1c5 ("libhns: Add support for lock-free QP")
+Fixes: 95225025e24c ("libhns: Add support for lock-free CQ")
+Fixes: aa7bcf7f7e44 ("libhns: Add support for lock-free SRQ")
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/hns_roce_u_verbs.c | 20 +++++++++++++-------
+ 1 file changed, 13 insertions(+), 7 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index b26ac29..5a62bb2 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -445,12 +445,9 @@ static int verify_cq_create_attr(struct ibv_cq_init_attr_ex *attr,
+ return EOPNOTSUPP;
+ }
+
+- if (attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_PD) {
+- if (!pad) {
+- verbs_err(&context->ibv_ctx, "failed to check the pad of cq.\n");
+- return EINVAL;
+- }
+- atomic_fetch_add(&pad->pd.refcount, 1);
++ if (attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_PD && !pad) {
++ verbs_err(&context->ibv_ctx, "failed to check the pad of cq.\n");
++ return EINVAL;
+ }
+
+ attr->cqe = max_t(uint32_t, HNS_ROCE_MIN_CQE_NUM,
+@@ -556,6 +553,7 @@ static void hns_roce_uninit_cq_swc(struct hns_roce_cq *cq)
+ static struct ibv_cq_ex *create_cq(struct ibv_context *context,
+ struct ibv_cq_init_attr_ex *attr)
+ {
++ struct hns_roce_pad *pad = to_hr_pad(attr->parent_domain);
+ struct hns_roce_context *hr_ctx = to_hr_ctx(context);
+ struct hns_roce_cq *cq;
+ int ret;
+@@ -570,8 +568,10 @@ static struct ibv_cq_ex *create_cq(struct ibv_context *context,
+ goto err;
+ }
+
+- if (attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_PD)
++ if (attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_PD) {
+ cq->parent_domain = attr->parent_domain;
++ atomic_fetch_add(&pad->pd.refcount, 1);
++ }
+
+ ret = hns_roce_cq_spinlock_init(context, cq, attr);
+ if (ret)
+@@ -611,6 +611,8 @@ err_db:
+ err_buf:
+ hns_roce_spinlock_destroy(&cq->hr_lock);
+ err_lock:
++ if (attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_PD)
++ atomic_fetch_sub(&pad->pd.refcount, 1);
+ free(cq);
+ err:
+ if (ret < 0)
+@@ -977,6 +979,8 @@ err_destroy_lock:
+ hns_roce_spinlock_destroy(&srq->hr_lock);
+
+ err_free_srq:
++ if (pad)
++ atomic_fetch_sub(&pad->pd.refcount, 1);
+ free(srq);
+
+ err:
+@@ -1872,6 +1876,8 @@ err_cmd:
+ err_buf:
+ hns_roce_qp_spinlock_destroy(qp);
+ err_spinlock:
++ if (pad)
++ atomic_fetch_sub(&pad->pd.refcount, 1);
+ free(qp);
+ err:
+ if (ret < 0)
+--
+2.33.0
+
diff --git a/0061-libhns-Fix-freeing-pad-without-checking-refcnt.patch b/0061-libhns-Fix-freeing-pad-without-checking-refcnt.patch
new file mode 100644
index 0000000..d37dee6
--- /dev/null
+++ b/0061-libhns-Fix-freeing-pad-without-checking-refcnt.patch
@@ -0,0 +1,69 @@
+From 0db7ff07caa483da0fb2cfd7944d549a38b4c720 Mon Sep 17 00:00:00 2001
+From: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Date: Wed, 23 Apr 2025 16:55:16 +0800
+Subject: [PATCH 61/62] libhns: Fix freeing pad without checking refcnt
+
+mainline inclusion
+from mainline-v56.0-65
+commit 234d135276ea8ef83633113e224e0cd735ebeca8
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IC3X57
+CVE: NA
+
+Reference:
+https://github.com/linux-rdma/rdma-core/pull/1599/commits/234d135276ea8ef83633113e224e0cd735ebeca8
+
+---------------------------------------------------------------------
+
+Currently pad refcnt will be added when creating qp/cq/srq, but it is
+not checked when freeing pad. Add a check to prevent freeing pad when
+it is still used by any qp/cq/srq.
+
+Fixes: 7b6b3dae328f ("libhns: Add support for thread domain and parent
+domain")
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/hns_roce_u_verbs.c | 12 +++++++-----
+ 1 file changed, 7 insertions(+), 5 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index 5a62bb2..8c37496 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -218,14 +218,18 @@ struct ibv_pd *hns_roce_u_alloc_pad(struct ibv_context *context,
+ return &pad->pd.ibv_pd;
+ }
+
+-static void hns_roce_free_pad(struct hns_roce_pad *pad)
++static int hns_roce_free_pad(struct hns_roce_pad *pad)
+ {
++ if (atomic_load(&pad->pd.refcount) > 1)
++ return EBUSY;
++
+ atomic_fetch_sub(&pad->pd.protection_domain->refcount, 1);
+
+ if (pad->td)
+ atomic_fetch_sub(&pad->td->refcount, 1);
+
+ free(pad);
++ return 0;
+ }
+
+ static int hns_roce_free_pd(struct hns_roce_pd *pd)
+@@ -248,10 +252,8 @@ int hns_roce_u_dealloc_pd(struct ibv_pd *ibv_pd)
+ struct hns_roce_pad *pad = to_hr_pad(ibv_pd);
+ struct hns_roce_pd *pd = to_hr_pd(ibv_pd);
+
+- if (pad) {
+- hns_roce_free_pad(pad);
+- return 0;
+- }
++ if (pad)
++ return hns_roce_free_pad(pad);
+
+ return hns_roce_free_pd(pd);
+ }
+--
+2.33.0
+
diff --git a/0062-verbs-Assign-ibv-srq-pd-when-creating-SRQ.patch b/0062-verbs-Assign-ibv-srq-pd-when-creating-SRQ.patch
new file mode 100644
index 0000000..e7d0395
--- /dev/null
+++ b/0062-verbs-Assign-ibv-srq-pd-when-creating-SRQ.patch
@@ -0,0 +1,43 @@
+From f8f9295695921fa796bb93c5ee7066e50221bbc3 Mon Sep 17 00:00:00 2001
+From: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Date: Wed, 23 Apr 2025 16:55:17 +0800
+Subject: [PATCH 62/62] verbs: Assign ibv srq->pd when creating SRQ
+
+mainline inclusion
+from mainline-v56.0-65
+commit bf1e427141fde2651bab4860e77a432bb7e26094
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IC3X57
+CVE: NA
+
+Reference:
+https://github.com/linux-rdma/rdma-core/pull/1599/commits/bf1e427141fde2651bab4860e77a432bb7e26094
+
+---------------------------------------------------------------------
+
+Some providers need to access ibv_srq->pd during SRQ destruction, but
+it may not be assigned currently when using ibv_create_srq_ex(). This
+may lead to some SRQ-related resource leaks. Assign ibv_srq->pd when
+creating SRQ to ensure pd can be obtained correctly.
+
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ libibverbs/cmd_srq.c | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/libibverbs/cmd_srq.c b/libibverbs/cmd_srq.c
+index dfaaa6a..259ea0d 100644
+--- a/libibverbs/cmd_srq.c
++++ b/libibverbs/cmd_srq.c
+@@ -63,6 +63,7 @@ static int ibv_icmd_create_srq(struct ibv_pd *pd, struct verbs_srq *vsrq,
+ struct verbs_xrcd *vxrcd = NULL;
+ enum ibv_srq_type srq_type;
+
++ srq->pd = pd;
+ srq->context = pd->context;
+ pthread_mutex_init(&srq->mutex, NULL);
+ pthread_cond_init(&srq->cond, NULL);
+--
+2.33.0
+
diff --git a/rdma-core.spec b/rdma-core.spec
index e760f40..928fc75 100644
--- a/rdma-core.spec
+++ b/rdma-core.spec
@@ -1,6 +1,6 @@
Name: rdma-core
Version: 50.0
-Release: 27
+Release: 28
Summary: RDMA core userspace libraries and daemons
License: GPL-2.0-only OR BSD-2-Clause AND BSD-3-Clause
Url: https://github.com/linux-rdma/rdma-core
@@ -63,6 +63,11 @@ patch54: 0054-libhns-Fix-wrong-max-inline-data-value.patch
patch55: 0055-libhns-Fix-wrong-order-of-spin-unlock-in-modify-qp.patch
patch56: 0056-libhns-Add-initial-support-for-HNS-LTTng-tracing.patch
patch57: 0057-libhns-Add-tracepoint-for-HNS-RoCE-I-O.patch
+patch58: 0058-libhns-Add-debug-log-for-lock-free-mode.patch
+patch59: 0059-libhns-Fix-ret-not-assigned-in-create-srq.patch
+patch60: 0060-libhns-Fix-pad-refcnt-leaking-in-error-flow-of-creat.patch
+patch61: 0061-libhns-Fix-freeing-pad-without-checking-refcnt.patch
+patch62: 0062-verbs-Assign-ibv-srq-pd-when-creating-SRQ.patch
BuildRequires: binutils cmake >= 2.8.11 gcc libudev-devel pkgconfig pkgconfig(libnl-3.0)
BuildRequires: pkgconfig(libnl-route-3.0) systemd systemd-devel
@@ -642,6 +647,12 @@ fi
%doc %{_docdir}/%{name}-%{version}/70-persistent-ipoib.rules
%changelog
+* Fri Apr 25 2025 Xinghai Cen <cenxinghai(a)h-partners.com> - 50.0-28
+- Type: bugfix
+- ID: NA
+- SUG: NA
+- DESC: Bugfixes and one debug improvement
+
* Wed Apr 23 2025 Xinghai Cen <cenxinghai(a)h-partners.com> - 50.0-27
- Type: feature
- ID: NA
--
2.33.0
From: Chengchang Tang <tangchengchang(a)huawei.com>
HNS RoCE checks whether the length of a recv SGE exceeds the
range of the corresponding MR; the MTU-aligned recv length used
for SRQs can exceed that range and cause hns RoCE to report a
CQE error, so skip the alignment on hns devices.
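The alignment that set_recv_length() now applies only on non-hns devices is a plain round-up to a whole number of MTUs; a minimal sketch of just that arithmetic:

```c
#include <assert.h>
#include <stdint.h>

/* Round the recv length up to a multiple of the path MTU, as the SRQ
 * path did unconditionally before this patch. On hns hardware the
 * padded length can exceed the MR range, which the HW rejects with a
 * CQE error, hence the new device check in set_recv_length(). */
static uint64_t round_up_to_mtu(uint64_t length, uint64_t mtu)
{
	return ((length + mtu - 1) / mtu) * mtu;
}
```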
Signed-off-by: Chengchang Tang <tangchengchang(a)huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
---
src/perftest_resources.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/src/perftest_resources.c b/src/perftest_resources.c
index 9e9f41f..f8c0de2 100755
--- a/src/perftest_resources.c
+++ b/src/perftest_resources.c
@@ -3261,6 +3261,19 @@ void ctx_set_send_reg_wqes(struct pingpong_context *ctx,
}
}
+static uint64_t set_recv_length(struct pingpong_context *ctx,
+ struct perftest_parameters *user_param)
+{
+ enum ctx_device current_dev = ib_dev_name(ctx->context);
+ int mtu = MTU_SIZE(user_param->curr_mtu);
+ uint64_t length = SIZE(user_param->connection_type, user_param->size, 1);
+
+ if (current_dev != HNS && user_param->use_srq == ON)
+ length = ((length + mtu - 1) / mtu) * mtu;
+
+ return length;
+}
+
/******************************************************************************
*
******************************************************************************/
@@ -3270,8 +3283,7 @@ int ctx_set_recv_wqes(struct pingpong_context *ctx,struct perftest_parameters *u
int num_of_qps = user_param->num_of_qps;
struct ibv_recv_wr *bad_wr_recv;
int size_per_qp = user_param->rx_depth / user_param->recv_post_list;
- int mtu = MTU_SIZE(user_param->curr_mtu);
- uint64_t length = user_param->use_srq == ON ? (((SIZE(user_param->connection_type ,user_param->size, 1) + mtu - 1 )/ mtu) * mtu) : SIZE(user_param->connection_type, user_param->size, 1);
+ uint64_t length = set_recv_length(ctx, user_param);
/* Write w/imm completions have zero receive buffer length */
if (user_param->verb == WRITE_IMM)
--
2.33.0
From: Xinghai Cen <cenxinghai(a)h-partners.com>
Add support for LTTng tracing. For now it is used for post_send, post_recv and poll_cq.
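The series builds on a compile-out wrapper (defined in patch 56): when LTTng is not enabled at build time, rdma_tracepoint() expands to a no-op, so the I/O hot paths can call it unconditionally without per-site #ifdefs. A minimal sketch, assuming LTTNG_ENABLED is the build-time switch and with a hypothetical post_send_sketch() standing in for the real post_send path:

```c
#include <assert.h>
#include <stdint.h>

/* Compile-out pattern from the patch: trace calls vanish entirely
 * when LTTng support is not built in (LTTNG_ENABLED undefined here). */
#ifdef LTTNG_ENABLED
#define rdma_tracepoint(...) lttng_ust_tracepoint(__VA_ARGS__)
#else
#define rdma_tracepoint(...) do { } while (0)
#endif

/* Hypothetical hot-path function standing in for post_send. */
static int post_send_sketch(uint64_t wr_id)
{
	(void)wr_id;	/* consumed only by the tracepoint when enabled */

	/* ... real WQE construction would happen here ... */

	rdma_tracepoint(rdma_core_hns, post_send, wr_id);
	return 0;
}
```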
Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
---
...nitial-support-for-HNS-LTTng-tracing.patch | 112 +++++
...bhns-Add-tracepoint-for-HNS-RoCE-I-O.patch | 382 ++++++++++++++++++
rdma-core.spec | 10 +-
3 files changed, 503 insertions(+), 1 deletion(-)
create mode 100644 0056-libhns-Add-initial-support-for-HNS-LTTng-tracing.patch
create mode 100644 0057-libhns-Add-tracepoint-for-HNS-RoCE-I-O.patch
diff --git a/0056-libhns-Add-initial-support-for-HNS-LTTng-tracing.patch b/0056-libhns-Add-initial-support-for-HNS-LTTng-tracing.patch
new file mode 100644
index 0000000..f84fde7
--- /dev/null
+++ b/0056-libhns-Add-initial-support-for-HNS-LTTng-tracing.patch
@@ -0,0 +1,112 @@
+From dfcef98e85b947dd38738436c769926f66438a7d Mon Sep 17 00:00:00 2001
+From: wenglianfa <wenglianfa(a)huawei.com>
+Date: Tue, 22 Apr 2025 16:18:44 +0800
+Subject: [PATCH 56/57] libhns: Add initial support for HNS LTTng tracing
+
+mainline inclusion
+from mainline-v56.0-65
+commit 5d96d96c822323a1c9b0a6b98ce58a17a8f165c1
+category: feature
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IC3E67
+CVE: NA
+
+Reference: https://github.com/linux-rdma/rdma-core/pull/1587/commits/5d96d96c822323a1c…
+
+---------------------------------------------------------------------
+
+Add initial support for HNS LTTng tracing.
+
+Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/CMakeLists.txt | 10 +++++++++
+ providers/hns/hns_roce_u_trace.c | 9 ++++++++
+ providers/hns/hns_roce_u_trace.h | 35 ++++++++++++++++++++++++++++++++
+ 3 files changed, 54 insertions(+)
+ create mode 100644 providers/hns/hns_roce_u_trace.c
+ create mode 100644 providers/hns/hns_roce_u_trace.h
+
+diff --git a/providers/hns/CMakeLists.txt b/providers/hns/CMakeLists.txt
+index 58139ae..36ebfac 100644
+--- a/providers/hns/CMakeLists.txt
++++ b/providers/hns/CMakeLists.txt
+@@ -1,5 +1,10 @@
++if (ENABLE_LTTNG AND LTTNGUST_FOUND)
++ set(TRACE_FILE hns_roce_u_trace.c)
++endif()
++
+ rdma_shared_provider(hns libhns.map
+ 1 1.0.${PACKAGE_VERSION}
++ ${TRACE_FILE}
+ hns_roce_u.c
+ hns_roce_u_buf.c
+ hns_roce_u_db.c
+@@ -12,3 +17,8 @@ publish_headers(infiniband
+ )
+
+ rdma_pkg_config("hns" "libibverbs" "${CMAKE_THREAD_LIBS_INIT}")
++
++if (ENABLE_LTTNG AND LTTNGUST_FOUND)
++ target_include_directories(hns PUBLIC ".")
++ target_link_libraries(hns LINK_PRIVATE LTTng::UST)
++endif()
+diff --git a/providers/hns/hns_roce_u_trace.c b/providers/hns/hns_roce_u_trace.c
+new file mode 100644
+index 0000000..812f54c
+--- /dev/null
++++ b/providers/hns/hns_roce_u_trace.c
+@@ -0,0 +1,9 @@
++// SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause
++/*
++ * Copyright (c) 2025 Hisilicon Limited.
++ */
++
++#define LTTNG_UST_TRACEPOINT_CREATE_PROBES
++#define LTTNG_UST_TRACEPOINT_DEFINE
++
++#include "hns_roce_u_trace.h"
+diff --git a/providers/hns/hns_roce_u_trace.h b/providers/hns/hns_roce_u_trace.h
+new file mode 100644
+index 0000000..9b9485c
+--- /dev/null
++++ b/providers/hns/hns_roce_u_trace.h
+@@ -0,0 +1,35 @@
++/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */
++/*
++ * Copyright (c) 2025 Hisilicon Limited.
++ */
++
++#if defined(LTTNG_ENABLED)
++
++#undef LTTNG_UST_TRACEPOINT_PROVIDER
++#define LTTNG_UST_TRACEPOINT_PROVIDER rdma_core_hns
++
++#undef LTTNG_UST_TRACEPOINT_INCLUDE
++#define LTTNG_UST_TRACEPOINT_INCLUDE "hns_roce_u_trace.h"
++
++#if !defined(__HNS_TRACE_H__) || defined(LTTNG_UST_TRACEPOINT_HEADER_MULTI_READ)
++#define __HNS_TRACE_H__
++
++#include <lttng/tracepoint.h>
++#include <infiniband/verbs.h>
++
++#define rdma_tracepoint(arg...) lttng_ust_tracepoint(arg)
++
++#endif /* __HNS_TRACE_H__*/
++
++#include <lttng/tracepoint-event.h>
++
++#else
++
++#ifndef __HNS_TRACE_H__
++#define __HNS_TRACE_H__
++
++#define rdma_tracepoint(arg...)
++
++#endif /* __HNS_TRACE_H__*/
++
++#endif /* defined(LTTNG_ENABLED) */
+--
+2.33.0
+
diff --git a/0057-libhns-Add-tracepoint-for-HNS-RoCE-I-O.patch b/0057-libhns-Add-tracepoint-for-HNS-RoCE-I-O.patch
new file mode 100644
index 0000000..39c6d6f
--- /dev/null
+++ b/0057-libhns-Add-tracepoint-for-HNS-RoCE-I-O.patch
@@ -0,0 +1,382 @@
+From feec8deebf58cf6faaf9f70eda49b929eb674f72 Mon Sep 17 00:00:00 2001
+From: wenglianfa <wenglianfa(a)huawei.com>
+Date: Tue, 22 Apr 2025 16:18:45 +0800
+Subject: [PATCH 57/57] libhns: Add tracepoint for HNS RoCE I/O
+
+mainline inclusion
+from mainline-v56.0-65
+commit 19cb51c73029b593608f0c5d41a4ace8d1f1e334
+category: feature
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IC3E67
+CVE: NA
+
+Reference: https://github.com/linux-rdma/rdma-core/pull/1587/commits/19cb51c73029b5936…
+
+---------------------------------------------------------------------
+
+Add tracepoint for HNS RoCE I/O, including post_send, post_recv and
+poll_cq.
+
+Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/hns_roce_u_hw_v2.c | 153 +++++++++++++++++++++++++++++++
+ providers/hns/hns_roce_u_trace.h | 98 ++++++++++++++++++++
+ 2 files changed, 251 insertions(+)
+
+diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
+index 3a1249f..b80c574 100644
+--- a/providers/hns/hns_roce_u_hw_v2.c
++++ b/providers/hns/hns_roce_u_hw_v2.c
+@@ -38,6 +38,7 @@
+ #include "hns_roce_u.h"
+ #include "hns_roce_u_db.h"
+ #include "hns_roce_u_hw_v2.h"
++#include "hns_roce_u_trace.h"
+
+ #define HR_IBV_OPC_MAP(ib_key, hr_key) \
+ [IBV_WR_ ## ib_key] = HNS_ROCE_WQE_OP_ ## hr_key
+@@ -764,6 +765,80 @@ static int parse_cqe_for_cq(struct hns_roce_context *ctx, struct hns_roce_cq *cq
+ return 0;
+ }
+
++#ifdef LTTNG_ENABLED
++static uint8_t read_wc_sl(struct hns_roce_qp *hr_qp,
++ struct hns_roce_v2_cqe *cqe,
++ struct ibv_wc *wc)
++{
++ return hr_qp->verbs_qp.qp.qp_type == IBV_QPT_UD &&
++ hr_reg_read(cqe, CQE_S_R) == CQE_FOR_RQ ?
++ wc->sl : UINT8_MAX;
++}
++
++static uint32_t read_wc_rqpn(struct hns_roce_qp *hr_qp,
++ struct hns_roce_v2_cqe *cqe,
++ struct ibv_wc *wc)
++{
++ return hr_qp->verbs_qp.qp.qp_type == IBV_QPT_UD &&
++ hr_reg_read(cqe, CQE_S_R) == CQE_FOR_RQ ?
++ wc->src_qp : UINT32_MAX;
++}
++
++static uint32_t read_wc_byte_len(struct hns_roce_v2_cqe *cqe,
++ struct ibv_wc *wc)
++{
++ if (hr_reg_read(cqe, CQE_S_R) == CQE_FOR_RQ)
++ return wc->byte_len;
++
++ switch (hr_reg_read(cqe, CQE_OPCODE)) {
++ case HNS_ROCE_SQ_OP_RDMA_READ:
++ case HNS_ROCE_SQ_OP_ATOMIC_COMP_AND_SWAP:
++ case HNS_ROCE_SQ_OP_ATOMIC_FETCH_AND_ADD:
++ case HNS_ROCE_SQ_OP_ATOMIC_MASK_COMP_AND_SWAP:
++ case HNS_ROCE_SQ_OP_ATOMIC_MASK_FETCH_AND_ADD:
++ return wc->byte_len;
++ default:
++ return UINT32_MAX;
++ }
++}
++
++static uint8_t trace_wc_read_sl(struct ibv_cq_ex *cq_ex)
++{
++ return cq_ex->read_sl ? cq_ex->read_sl(cq_ex) : UINT8_MAX;
++}
++
++static uint32_t trace_wc_read_qp_num(struct ibv_cq_ex *cq_ex)
++{
++ return cq_ex->read_qp_num ?
++ cq_ex->read_qp_num(cq_ex) : UINT32_MAX;
++}
++
++static uint32_t trace_wc_read_src_qp(struct ibv_cq_ex *cq_ex)
++{
++ return cq_ex->read_src_qp ?
++ cq_ex->read_src_qp(cq_ex) : UINT32_MAX;
++}
++
++static uint32_t trace_wc_read_byte_len(struct ibv_cq_ex *cq_ex)
++{
++ return cq_ex->read_byte_len ?
++ cq_ex->read_byte_len(cq_ex) : UINT32_MAX;
++}
++
++static uint32_t get_send_wr_rqpn(struct ibv_send_wr *wr,
++ uint8_t qp_type)
++{
++ return qp_type == IBV_QPT_UD ? wr->wr.ud.remote_qpn : UINT32_MAX;
++}
++
++static uint8_t get_send_wr_tclass(struct ibv_send_wr *wr,
++ uint8_t qp_type)
++{
++ return qp_type == IBV_QPT_UD ?
++ to_hr_ah(wr->wr.ud.ah)->av.tclass : UINT8_MAX;
++}
++#endif
++
+ static int hns_roce_poll_one(struct hns_roce_context *ctx,
+ struct hns_roce_qp **cur_qp, struct hns_roce_cq *cq,
+ struct ibv_wc *wc)
+@@ -800,8 +875,27 @@ static int hns_roce_poll_one(struct hns_roce_context *ctx,
+ wc->status = wc_status;
+ wc->vendor_err = hr_reg_read(cqe, CQE_SUB_STATUS);
+ wc->qp_num = qpn;
++
++ rdma_tracepoint(rdma_core_hns, poll_cq,
++ cq->verbs_cq.cq.context->device->name,
++ wc->wr_id, wc_status, wc->opcode,
++ wc->wc_flags, wc->vendor_err,
++ read_wc_sl(*cur_qp, cqe, wc),
++ wc->qp_num, read_wc_rqpn(*cur_qp, cqe, wc),
++ read_wc_byte_len(cqe, wc));
+ } else {
+ cq->verbs_cq.cq_ex.status = wc_status;
++
++ rdma_tracepoint(rdma_core_hns, poll_cq,
++ cq->verbs_cq.cq.context->device->name,
++ cq->verbs_cq.cq_ex.wr_id, wc_status,
++ ibv_wc_read_opcode(&cq->verbs_cq.cq_ex),
++ ibv_wc_read_wc_flags(&cq->verbs_cq.cq_ex),
++ ibv_wc_read_vendor_err(&cq->verbs_cq.cq_ex),
++ trace_wc_read_sl(&cq->verbs_cq.cq_ex),
++ trace_wc_read_qp_num(&cq->verbs_cq.cq_ex),
++ trace_wc_read_src_qp(&cq->verbs_cq.cq_ex),
++ trace_wc_read_byte_len(&cq->verbs_cq.cq_ex));
+ }
+
+ if (status == HNS_ROCE_V2_CQE_SUCCESS ||
+@@ -1635,6 +1729,14 @@ int hns_roce_u_v2_post_send(struct ibv_qp *ibvqp, struct ibv_send_wr *wr,
+ *bad_wr = wr;
+ goto out;
+ }
++
++ rdma_tracepoint(rdma_core_hns, post_send,
++ ibvqp->context->device->name, wr->wr_id,
++ sge_info.valid_num, ibvqp->qp_num,
++ get_send_wr_rqpn(wr, ibvqp->qp_type),
++ wr->send_flags, sge_info.total_len,
++ wr->opcode, qp->sl,
++ get_send_wr_tclass(wr, ibvqp->qp_type));
+ }
+
+ out:
+@@ -1785,6 +1887,10 @@ static int hns_roce_u_v2_post_recv(struct ibv_qp *ibvqp, struct ibv_recv_wr *wr,
+ wqe_idx = (qp->rq.head + nreq) & (qp->rq.wqe_cnt - 1);
+ fill_rq_wqe(qp, wr, wqe_idx, max_sge);
+ qp->rq.wrid[wqe_idx] = wr->wr_id;
++
++ rdma_tracepoint(rdma_core_hns, post_recv,
++ ibvqp->context->device->name, wr->wr_id,
++ wr->num_sge, ibvqp->qp_num, 0);
+ }
+
+ out:
+@@ -2153,6 +2259,10 @@ static int hns_roce_u_v2_post_srq_recv(struct ibv_srq *ib_srq,
+ fill_wqe_idx(srq, wqe_idx);
+
+ srq->wrid[wqe_idx] = wr->wr_id;
++
++ rdma_tracepoint(rdma_core_hns, post_recv,
++ ib_srq->context->device->name, wr->wr_id,
++ wr->num_sge, srq->srqn, 1);
+ }
+
+ if (nreq) {
+@@ -2442,6 +2552,12 @@ static void wr_set_sge_rc(struct ibv_qp_ex *ibv_qp, uint32_t lkey,
+ wqe->msg_len = htole32(length);
+ hr_reg_write(wqe, RCWQE_LEN0, length);
+ hr_reg_write(wqe, RCWQE_SGE_NUM, !!length);
++
++ rdma_tracepoint(rdma_core_hns, post_send,
++ ibv_qp->qp_base.context->device->name, ibv_qp->wr_id,
++ !!length, ibv_qp->qp_base.qp_num, UINT32_MAX,
++ ibv_qp->wr_flags, length,
++ hr_reg_read(wqe, RCWQE_OPCODE), qp->sl, UINT8_MAX);
+ }
+
+ static void set_sgl_rc(struct hns_roce_v2_wqe_data_seg *dseg,
+@@ -2506,6 +2622,12 @@ static void wr_set_sge_list_rc(struct ibv_qp_ex *ibv_qp, size_t num_sge,
+
+ wqe->msg_len = htole32(qp->sge_info.total_len);
+ hr_reg_write(wqe, RCWQE_SGE_NUM, qp->sge_info.valid_num);
++
++ rdma_tracepoint(rdma_core_hns, post_send,
++ ibv_qp->qp_base.context->device->name, ibv_qp->wr_id,
++ qp->sge_info.valid_num, ibv_qp->qp_base.qp_num,
++ UINT32_MAX, ibv_qp->wr_flags, qp->sge_info.total_len,
++ opcode, qp->sl, UINT8_MAX);
+ }
+
+ static void wr_send_rc(struct ibv_qp_ex *ibv_qp)
+@@ -2680,6 +2802,14 @@ static void set_inline_data_list_rc(struct hns_roce_qp *qp,
+
+ hr_reg_write(wqe, RCWQE_SGE_NUM, qp->sge_info.valid_num);
+ }
++
++ rdma_tracepoint(rdma_core_hns, post_send,
++ qp->verbs_qp.qp.context->device->name,
++ qp->verbs_qp.qp_ex.wr_id,
++ hr_reg_read(wqe, RCWQE_SGE_NUM),
++ qp->verbs_qp.qp.qp_num, UINT32_MAX,
++ qp->verbs_qp.qp_ex.wr_flags, msg_len,
++ hr_reg_read(wqe, RCWQE_OPCODE), qp->sl, UINT8_MAX);
+ }
+
+ static void wr_set_inline_data_rc(struct ibv_qp_ex *ibv_qp, void *addr,
+@@ -2812,6 +2942,13 @@ static void wr_set_sge_ud(struct ibv_qp_ex *ibv_qp, uint32_t lkey,
+ dseg->len = htole32(length);
+
+ qp->sge_info.start_idx++;
++
++ rdma_tracepoint(rdma_core_hns, post_send,
++ ibv_qp->qp_base.context->device->name, ibv_qp->wr_id,
++ 1, ibv_qp->qp_base.qp_num,
++ hr_reg_read(wqe, UDWQE_DQPN), ibv_qp->wr_flags,
++ length, hr_reg_read(wqe, UDWQE_OPCODE),
++ qp->sl, hr_reg_read(wqe, UDWQE_TCLASS));
+ }
+
+ static void wr_set_sge_list_ud(struct ibv_qp_ex *ibv_qp, size_t num_sge,
+@@ -2850,6 +2987,13 @@ static void wr_set_sge_list_ud(struct ibv_qp_ex *ibv_qp, size_t num_sge,
+ hr_reg_write(wqe, UDWQE_SGE_NUM, cnt);
+
+ qp->sge_info.start_idx += cnt;
++
++ rdma_tracepoint(rdma_core_hns, post_send,
++ ibv_qp->qp_base.context->device->name, ibv_qp->wr_id,
++ cnt, ibv_qp->qp_base.qp_num,
++ hr_reg_read(wqe, UDWQE_DQPN), ibv_qp->wr_flags,
++ msg_len, hr_reg_read(wqe, UDWQE_OPCODE),
++ qp->sl, hr_reg_read(wqe, UDWQE_TCLASS));
+ }
+
+ static void set_inline_data_list_ud(struct hns_roce_qp *qp,
+@@ -2898,6 +3042,15 @@ static void set_inline_data_list_ud(struct hns_roce_qp *qp,
+
+ hr_reg_write(wqe, UDWQE_SGE_NUM, qp->sge_info.valid_num);
+ }
++
++ rdma_tracepoint(rdma_core_hns, post_send,
++ qp->verbs_qp.qp.context->device->name,
++ qp->verbs_qp.qp_ex.wr_id,
++ hr_reg_read(wqe, UDWQE_SGE_NUM),
++ qp->verbs_qp.qp.qp_num, hr_reg_read(wqe, UDWQE_DQPN),
++ qp->verbs_qp.qp_ex.wr_flags, msg_len,
++ hr_reg_read(wqe, UDWQE_OPCODE), qp->sl,
++ hr_reg_read(wqe, UDWQE_TCLASS));
+ }
+
+ static void wr_set_inline_data_ud(struct ibv_qp_ex *ibv_qp, void *addr,
+diff --git a/providers/hns/hns_roce_u_trace.h b/providers/hns/hns_roce_u_trace.h
+index 9b9485c..4654985 100644
+--- a/providers/hns/hns_roce_u_trace.h
++++ b/providers/hns/hns_roce_u_trace.h
+@@ -17,6 +17,104 @@
+ #include <lttng/tracepoint.h>
+ #include <infiniband/verbs.h>
+
++LTTNG_UST_TRACEPOINT_EVENT(
++ /* Tracepoint provider name */
++ rdma_core_hns,
++
++ /* Tracepoint name */
++ post_send,
++
++ /* Input arguments */
++ LTTNG_UST_TP_ARGS(
++ char *, dev_name,
++ uint64_t, wr_id,
++ int32_t, num_sge,
++ uint32_t, lqpn,
++ uint32_t, rqpn,
++ uint32_t, send_flags,
++ uint32_t, msg_len,
++ uint8_t, opcode,
++ uint8_t, sl,
++ uint8_t, t_class
++ ),
++
++ /* Output event fields */
++ LTTNG_UST_TP_FIELDS(
++ lttng_ust_field_string(dev_name, dev_name)
++ lttng_ust_field_integer_hex(uint64_t, wr_id, wr_id)
++ lttng_ust_field_integer_hex(int32_t, num_sge, num_sge)
++ lttng_ust_field_integer_hex(uint32_t, lqpn, lqpn)
++ lttng_ust_field_integer_hex(uint32_t, rqpn, rqpn)
++ lttng_ust_field_integer_hex(uint32_t, send_flags, send_flags)
++ lttng_ust_field_integer_hex(uint32_t, msg_len, msg_len)
++ lttng_ust_field_integer_hex(uint8_t, opcode, opcode)
++ lttng_ust_field_integer_hex(uint8_t, sl, sl)
++ lttng_ust_field_integer_hex(uint8_t, t_class, t_class)
++ )
++)
++
++LTTNG_UST_TRACEPOINT_EVENT(
++ /* Tracepoint provider name */
++ rdma_core_hns,
++
++ /* Tracepoint name */
++ post_recv,
++
++ /* Input arguments */
++ LTTNG_UST_TP_ARGS(
++ char *, dev_name,
++ uint64_t, wr_id,
++ int32_t, num_sge,
++ uint32_t, rqn,
++ uint8_t, is_srq
++ ),
++
++ /* Output event fields */
++ LTTNG_UST_TP_FIELDS(
++ lttng_ust_field_string(dev_name, dev_name)
++ lttng_ust_field_integer_hex(uint64_t, wr_id, wr_id)
++ lttng_ust_field_integer_hex(int32_t, num_sge, num_sge)
++ lttng_ust_field_integer_hex(uint32_t, rqn, rqn)
++ lttng_ust_field_integer_hex(uint8_t, is_srq, is_srq)
++ )
++)
++
++LTTNG_UST_TRACEPOINT_EVENT(
++ /* Tracepoint provider name */
++ rdma_core_hns,
++
++ /* Tracepoint name */
++ poll_cq,
++
++ /* Input arguments */
++ LTTNG_UST_TP_ARGS(
++ char *, dev_name,
++ uint64_t, wr_id,
++ uint8_t, status,
++ uint8_t, opcode,
++ uint8_t, wc_flags,
++ uint8_t, vendor_err,
++ uint8_t, pktype,
++ uint32_t, lqpn,
++ uint32_t, rqpn,
++ uint32_t, byte_len
++ ),
++
++ /* Output event fields */
++ LTTNG_UST_TP_FIELDS(
++ lttng_ust_field_string(dev_name, dev_name)
++ lttng_ust_field_integer_hex(uint64_t, wr_id, wr_id)
++ lttng_ust_field_integer_hex(uint8_t, status, status)
++ lttng_ust_field_integer_hex(uint8_t, opcode, opcode)
++ lttng_ust_field_integer_hex(uint8_t, wc_flags, wc_flags)
++ lttng_ust_field_integer_hex(uint8_t, vendor_err, vendor_err)
++ lttng_ust_field_integer_hex(uint8_t, pktype, pktype)
++ lttng_ust_field_integer_hex(uint32_t, lqpn, lqpn)
++ lttng_ust_field_integer_hex(uint32_t, rqpn, rqpn)
++ lttng_ust_field_integer_hex(uint32_t, byte_len, byte_len)
++ )
++)
++
+ #define rdma_tracepoint(arg...) lttng_ust_tracepoint(arg)
+
+ #endif /* __HNS_TRACE_H__*/
+--
+2.33.0
+
diff --git a/rdma-core.spec b/rdma-core.spec
index b49c6bf..e760f40 100644
--- a/rdma-core.spec
+++ b/rdma-core.spec
@@ -1,6 +1,6 @@
Name: rdma-core
Version: 50.0
-Release: 26
+Release: 27
Summary: RDMA core userspace libraries and daemons
License: GPL-2.0-only OR BSD-2-Clause AND BSD-3-Clause
Url: https://github.com/linux-rdma/rdma-core
@@ -61,6 +61,8 @@ patch52: 0052-libxscale-Match-dev-by-vid-and-did.patch
patch53: 0053-libhns-Clean-up-data-type-issues.patch
patch54: 0054-libhns-Fix-wrong-max-inline-data-value.patch
patch55: 0055-libhns-Fix-wrong-order-of-spin-unlock-in-modify-qp.patch
+patch56: 0056-libhns-Add-initial-support-for-HNS-LTTng-tracing.patch
+patch57: 0057-libhns-Add-tracepoint-for-HNS-RoCE-I-O.patch
BuildRequires: binutils cmake >= 2.8.11 gcc libudev-devel pkgconfig pkgconfig(libnl-3.0)
BuildRequires: pkgconfig(libnl-route-3.0) systemd systemd-devel
@@ -640,6 +642,12 @@ fi
%doc %{_docdir}/%{name}-%{version}/70-persistent-ipoib.rules
%changelog
+* Wed Apr 23 2025 Xinghai Cen <cenxinghai(a)h-partners.com> - 50.0-27
+- Type: feature
+- ID: NA
+- SUG: NA
+- DESC: libhns: Add support for LTTng tracing
+
* Thu Apr 17 2025 Xinghai Cen <cenxinghai(a)h-partners.com> - 50.0-26
- Type: bugfix
- ID: NA
--
2.33.0
24 Apr '25
Currently there is no way to observe from the driver's perspective
whether lock-free mode is configured. Add a debug log for this.
Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
---
providers/hns/hns_roce_u_verbs.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
index 6fc8ece32..ae60955ba 100644
--- a/providers/hns/hns_roce_u_verbs.c
+++ b/providers/hns/hns_roce_u_verbs.c
@@ -177,6 +177,7 @@ err:
struct ibv_pd *hns_roce_u_alloc_pad(struct ibv_context *context,
struct ibv_parent_domain_init_attr *attr)
{
+ struct hns_roce_pd *protection_domain;
struct hns_roce_pad *pad;
if (ibv_check_alloc_parent_domain(attr))
@@ -193,12 +194,16 @@ struct ibv_pd *hns_roce_u_alloc_pad(struct ibv_context *context,
return NULL;
}
+ protection_domain = to_hr_pd(attr->pd);
if (attr->td) {
pad->td = to_hr_td(attr->td);
atomic_fetch_add(&pad->td->refcount, 1);
+ verbs_debug(verbs_get_ctx(context),
+ "set PAD(0x%x) to lock-free mode.\n",
+ protection_domain->pdn);
}
- pad->pd.protection_domain = to_hr_pd(attr->pd);
+ pad->pd.protection_domain = protection_domain;
atomic_fetch_add(&pad->pd.protection_domain->refcount, 1);
atomic_init(&pad->pd.refcount, 1);
--
2.33.0
23 Apr '25
The first commit adds some debug logs for lock-free mode so that
it can be observed directly whether lock-free mode is configured.
The following three commits fix several errors.
The issue fixed by the last commit was found while testing the
previous ones: I created an XRC SRQ in lock-free mode but failed
to destroy it because of the refcnt check added in the second commit.
The failure occurred because the PAD was acquired through ibv_srq->pd in
destroy_srq(), while ibv_srq->pd wasn't assigned when the SRQ was
created by ibv_create_srq_ex().
Junxian Huang (5):
libhns: Add debug logs for lock-free mode
libhns: Fix ret not assigned in create_srq()
libhns: Fix pad refcnt leaking in error flow of create_qp/cq/srq
libhns: Fix freeing pad without checking refcnt
verbs: Assign ibv_srq->pd when creating SRQ
libibverbs/cmd_srq.c | 1 +
providers/hns/hns_roce_u_verbs.c | 51 +++++++++++++++++++++++---------
2 files changed, 38 insertions(+), 14 deletions(-)
--
2.33.0
From: wenglianfa <wenglianfa(a)huawei.com>
Add initial support for HNS LTTng tracing.
Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
---
providers/hns/CMakeLists.txt | 10 +++++++++
providers/hns/hns_roce_u_trace.c | 9 ++++++++
providers/hns/hns_roce_u_trace.h | 35 ++++++++++++++++++++++++++++++++
3 files changed, 54 insertions(+)
create mode 100644 providers/hns/hns_roce_u_trace.c
create mode 100644 providers/hns/hns_roce_u_trace.h
diff --git a/providers/hns/CMakeLists.txt b/providers/hns/CMakeLists.txt
index 58139ae2b..36ebfacfb 100644
--- a/providers/hns/CMakeLists.txt
+++ b/providers/hns/CMakeLists.txt
@@ -1,5 +1,10 @@
+if (ENABLE_LTTNG AND LTTNGUST_FOUND)
+ set(TRACE_FILE hns_roce_u_trace.c)
+endif()
+
rdma_shared_provider(hns libhns.map
1 1.0.${PACKAGE_VERSION}
+ ${TRACE_FILE}
hns_roce_u.c
hns_roce_u_buf.c
hns_roce_u_db.c
@@ -12,3 +17,8 @@ publish_headers(infiniband
)
rdma_pkg_config("hns" "libibverbs" "${CMAKE_THREAD_LIBS_INIT}")
+
+if (ENABLE_LTTNG AND LTTNGUST_FOUND)
+ target_include_directories(hns PUBLIC ".")
+ target_link_libraries(hns LINK_PRIVATE LTTng::UST)
+endif()
diff --git a/providers/hns/hns_roce_u_trace.c b/providers/hns/hns_roce_u_trace.c
new file mode 100644
index 000000000..812f54cfe
--- /dev/null
+++ b/providers/hns/hns_roce_u_trace.c
@@ -0,0 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause
+/*
+ * Copyright (c) 2025 Hisilicon Limited.
+ */
+
+#define LTTNG_UST_TRACEPOINT_CREATE_PROBES
+#define LTTNG_UST_TRACEPOINT_DEFINE
+
+#include "hns_roce_u_trace.h"
diff --git a/providers/hns/hns_roce_u_trace.h b/providers/hns/hns_roce_u_trace.h
new file mode 100644
index 000000000..9b9485c59
--- /dev/null
+++ b/providers/hns/hns_roce_u_trace.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */
+/*
+ * Copyright (c) 2025 Hisilicon Limited.
+ */
+
+#if defined(LTTNG_ENABLED)
+
+#undef LTTNG_UST_TRACEPOINT_PROVIDER
+#define LTTNG_UST_TRACEPOINT_PROVIDER rdma_core_hns
+
+#undef LTTNG_UST_TRACEPOINT_INCLUDE
+#define LTTNG_UST_TRACEPOINT_INCLUDE "hns_roce_u_trace.h"
+
+#if !defined(__HNS_TRACE_H__) || defined(LTTNG_UST_TRACEPOINT_HEADER_MULTI_READ)
+#define __HNS_TRACE_H__
+
+#include <lttng/tracepoint.h>
+#include <infiniband/verbs.h>
+
+#define rdma_tracepoint(arg...) lttng_ust_tracepoint(arg)
+
+#endif /* __HNS_TRACE_H__*/
+
+#include <lttng/tracepoint-event.h>
+
+#else
+
+#ifndef __HNS_TRACE_H__
+#define __HNS_TRACE_H__
+
+#define rdma_tracepoint(arg...)
+
+#endif /* __HNS_TRACE_H__*/
+
+#endif /* defined(LTTNG_ENABLED) */
--
2.33.0
[PATCH v3 rdma-core 1/2] libhns: Add initial support for HNS LTTng tracing
by Junxian Huang 22 Apr '25
From: Xinghai Cen <cenxinghai(a)h-partners.com>
Cleanup and Bugfixes:
0053-libhns-Clean-up-data-type-issues.patch
0054-libhns-Fix-wrong-max-inline-data-value.patch
0055-libhns-Fix-wrong-order-of-spin-unlock-in-modify-qp.patch
Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
---
0053-libhns-Clean-up-data-type-issues.patch | 152 ++++++++++++++++++
...bhns-Fix-wrong-max-inline-data-value.patch | 63 ++++++++
...ng-order-of-spin-unlock-in-modify-qp.patch | 42 +++++
rdma-core.spec | 11 +-
4 files changed, 267 insertions(+), 1 deletion(-)
create mode 100644 0053-libhns-Clean-up-data-type-issues.patch
create mode 100644 0054-libhns-Fix-wrong-max-inline-data-value.patch
create mode 100644 0055-libhns-Fix-wrong-order-of-spin-unlock-in-modify-qp.patch
diff --git a/0053-libhns-Clean-up-data-type-issues.patch b/0053-libhns-Clean-up-data-type-issues.patch
new file mode 100644
index 0000000..95a95cc
--- /dev/null
+++ b/0053-libhns-Clean-up-data-type-issues.patch
@@ -0,0 +1,152 @@
+From 8f95635c359ca3c36f5b1b48889719b6840c07cc Mon Sep 17 00:00:00 2001
+From: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Date: Thu, 13 Mar 2025 17:26:50 +0800
+Subject: [PATCH 53/55] libhns: Clean up data type issues
+
+mainline inclusion
+from mainline-v56.0-65
+commit fbe8827f270d0aff4a28bb645b826fa98fe00c9d
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IC1V44
+CVE: NA
+
+Reference: https://github.com/linux-rdma/rdma-core/pull/1579/commits/fbe8827f270d0aff4…
+
+---------------------------------------------------------------------
+
+Clean up mixed signed/unsigned type issues. Fix a wrong format
+character as well.
+
+Fixes: cf6d9149f8f5 ("libhns: Introduce hns direct verbs")
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/hns_roce_u.h | 4 ++--
+ providers/hns/hns_roce_u_hw_v2.c | 15 ++++++++-------
+ providers/hns/hns_roce_u_verbs.c | 6 +++---
+ 3 files changed, 13 insertions(+), 12 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u.h b/providers/hns/hns_roce_u.h
+index 5eedb81..e7e3f01 100644
+--- a/providers/hns/hns_roce_u.h
++++ b/providers/hns/hns_roce_u.h
+@@ -356,7 +356,7 @@ struct hns_roce_wq {
+ unsigned long *wrid;
+ struct hns_roce_spinlock hr_lock;
+ unsigned int wqe_cnt;
+- int max_post;
++ unsigned int max_post;
+ unsigned int head;
+ unsigned int tail;
+ unsigned int max_gs;
+@@ -392,7 +392,7 @@ struct hns_roce_qp {
+ struct verbs_qp verbs_qp;
+ struct hns_roce_buf buf;
+ struct hns_roce_dca_buf dca_wqe;
+- int max_inline_data;
++ unsigned int max_inline_data;
+ unsigned int buf_size;
+ unsigned int sq_signal_bits;
+ struct hns_roce_wq sq;
+diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
+index 3137111..cea3043 100644
+--- a/providers/hns/hns_roce_u_hw_v2.c
++++ b/providers/hns/hns_roce_u_hw_v2.c
+@@ -173,7 +173,7 @@ static enum ibv_wc_status get_wc_status(uint8_t status)
+ { HNS_ROCE_V2_CQE_XRC_VIOLATION_ERR, IBV_WC_REM_INV_RD_REQ_ERR },
+ };
+
+- for (int i = 0; i < ARRAY_SIZE(map); i++) {
++ for (unsigned int i = 0; i < ARRAY_SIZE(map); i++) {
+ if (status == map[i].cqe_status)
+ return map[i].wc_status;
+ }
+@@ -1189,7 +1189,7 @@ static int fill_ext_sge_inl_data(struct hns_roce_qp *qp,
+ unsigned int sge_mask = qp->ex_sge.sge_cnt - 1;
+ void *dst_addr, *src_addr, *tail_bound_addr;
+ uint32_t src_len, tail_len;
+- int i;
++ uint32_t i;
+
+ if (sge_info->total_len > qp->sq.ext_sge_cnt * HNS_ROCE_SGE_SIZE)
+ return EINVAL;
+@@ -1259,7 +1259,7 @@ static void fill_ud_inn_inl_data(const struct ibv_send_wr *wr,
+
+ static bool check_inl_data_len(struct hns_roce_qp *qp, unsigned int len)
+ {
+- int mtu = mtu_enum_to_int(qp->path_mtu);
++ unsigned int mtu = mtu_enum_to_int(qp->path_mtu);
+
+ return (len <= qp->max_inline_data && len <= mtu);
+ }
+@@ -1698,7 +1698,8 @@ static void fill_recv_sge_to_wqe(struct ibv_recv_wr *wr, void *wqe,
+ unsigned int max_sge, bool rsv)
+ {
+ struct hns_roce_v2_wqe_data_seg *dseg = wqe;
+- unsigned int i, cnt;
++ unsigned int cnt;
++ int i;
+
+ for (i = 0, cnt = 0; i < wr->num_sge; i++) {
+ /* Skip zero-length sge */
+@@ -1726,7 +1727,7 @@ static void fill_recv_inl_buf(struct hns_roce_rinl_buf *rinl_buf,
+ unsigned int wqe_idx, struct ibv_recv_wr *wr)
+ {
+ struct ibv_sge *sge_list;
+- unsigned int i;
++ int i;
+
+ if (!rinl_buf->wqe_cnt)
+ return;
+@@ -2053,7 +2054,7 @@ static int check_post_srq_valid(struct hns_roce_srq *srq,
+ static int get_wqe_idx(struct hns_roce_srq *srq, unsigned int *wqe_idx)
+ {
+ struct hns_roce_idx_que *idx_que = &srq->idx_que;
+- int bit_num;
++ unsigned int bit_num;
+ int i;
+
+ /* bitmap[i] is set zero if all bits are allocated */
+@@ -2451,7 +2452,7 @@ static void set_sgl_rc(struct hns_roce_v2_wqe_data_seg *dseg,
+ unsigned int mask = qp->ex_sge.sge_cnt - 1;
+ unsigned int msg_len = 0;
+ unsigned int cnt = 0;
+- int i;
++ unsigned int i;
+
+ for (i = 0; i < num_sge; i++) {
+ if (!sge[i].length)
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index 848f836..f0098ed 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -422,7 +422,7 @@ static int verify_cq_create_attr(struct ibv_cq_init_attr_ex *attr,
+ {
+ struct hns_roce_pad *pad = to_hr_pad(attr->parent_domain);
+
+- if (!attr->cqe || attr->cqe > context->max_cqe) {
++ if (!attr->cqe || attr->cqe > (uint32_t)context->max_cqe) {
+ verbs_err(&context->ibv_ctx, "unsupported cq depth %u.\n",
+ attr->cqe);
+ return EINVAL;
+@@ -1080,7 +1080,7 @@ static int check_hnsdv_qp_attr(struct hns_roce_context *ctx,
+ return 0;
+
+ if (!check_comp_mask(hns_attr->comp_mask, HNSDV_QP_SUP_COMP_MASK)) {
+- verbs_err(&ctx->ibv_ctx, "invalid hnsdv comp_mask 0x%x.\n",
++ verbs_err(&ctx->ibv_ctx, "invalid hnsdv comp_mask 0x%llx.\n",
+ hns_attr->comp_mask);
+ return EINVAL;
+ }
+@@ -1257,7 +1257,7 @@ static int alloc_recv_rinl_buf(uint32_t max_sge,
+ struct hns_roce_rinl_buf *rinl_buf)
+ {
+ unsigned int cnt;
+- int i;
++ unsigned int i;
+
+ cnt = rinl_buf->wqe_cnt;
+ rinl_buf->wqe_list = calloc(cnt, sizeof(struct hns_roce_rinl_wqe));
+--
+2.33.0
+
diff --git a/0054-libhns-Fix-wrong-max-inline-data-value.patch b/0054-libhns-Fix-wrong-max-inline-data-value.patch
new file mode 100644
index 0000000..4389023
--- /dev/null
+++ b/0054-libhns-Fix-wrong-max-inline-data-value.patch
@@ -0,0 +1,63 @@
+From 10534f0ef2ca73e8e59a38e51969cae864f9fbbf Mon Sep 17 00:00:00 2001
+From: wenglianfa <wenglianfa(a)huawei.com>
+Date: Thu, 13 Mar 2025 17:26:51 +0800
+Subject: [PATCH 54/55] libhns: Fix wrong max inline data value
+
+mainline inclusion
+from mainline-v56.0-65
+commit 8307b7c54ed81c343ec874e2066de79260b666d2
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IC1V44
+CVE: NA
+
+Reference: https://github.com/linux-rdma/rdma-core/pull/1579/commits/8307b7c54ed81c343…
+
+---------------------------------------------------------------------
+
+When cap.max_inline_data is 0, it will be modified to 1 since
+roundup_pow_of_two(0) == 1, which violates users' expectations.
+Here fix it.
+
+Fixes: 2aff0d55098c ("libhns: Fix the problem of sge nums")
+Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/hns_roce_u_verbs.c | 14 +++++++++++---
+ 1 file changed, 11 insertions(+), 3 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index f0098ed..5fe169e 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -1494,6 +1494,16 @@ static unsigned int get_sge_num_from_max_inl_data(bool is_ud,
+ return inline_sge;
+ }
+
++static uint32_t get_max_inline_data(struct hns_roce_context *ctx,
++ struct ibv_qp_cap *cap)
++{
++ if (cap->max_inline_data)
++ return min_t(uint32_t, roundup_pow_of_two(cap->max_inline_data),
++ ctx->max_inline_data);
++
++ return 0;
++}
++
+ static void set_ext_sge_param(struct hns_roce_context *ctx,
+ struct ibv_qp_init_attr_ex *attr,
+ struct hns_roce_qp *qp, unsigned int wr_cnt)
+@@ -1510,9 +1520,7 @@ static void set_ext_sge_param(struct hns_roce_context *ctx,
+ attr->cap.max_send_sge);
+
+ if (ctx->config & HNS_ROCE_RSP_EXSGE_FLAGS) {
+- attr->cap.max_inline_data = min_t(uint32_t, roundup_pow_of_two(
+- attr->cap.max_inline_data),
+- ctx->max_inline_data);
++ attr->cap.max_inline_data = get_max_inline_data(ctx, &attr->cap);
+
+ inline_ext_sge = max(ext_wqe_sge_cnt,
+ get_sge_num_from_max_inl_data(is_ud,
+--
+2.33.0
+
diff --git a/0055-libhns-Fix-wrong-order-of-spin-unlock-in-modify-qp.patch b/0055-libhns-Fix-wrong-order-of-spin-unlock-in-modify-qp.patch
new file mode 100644
index 0000000..b16fe12
--- /dev/null
+++ b/0055-libhns-Fix-wrong-order-of-spin-unlock-in-modify-qp.patch
@@ -0,0 +1,42 @@
+From d1409106e1323c54fbbb0618c071efb024f58130 Mon Sep 17 00:00:00 2001
+From: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Date: Thu, 13 Mar 2025 17:26:52 +0800
+Subject: [PATCH 55/55] libhns: Fix wrong order of spin unlock in modify qp
+
+mainline inclusion
+from mainline-v56.0-65
+commit d2b41c86c49335b3c6ab638abb1c0e31f5ba0e8f
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IC1V44
+CVE: NA
+
+Reference: https://github.com/linux-rdma/rdma-core/pull/1579/commits/d2b41c86c49335b3c…
+
+---------------------------------------------------------------------
+
+The spin_unlock order should be the reverse of spin_lock order.
+
+Fixes: 179f015e090d ("libhns: Add support for lock-free QP")
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/hns_roce_u_hw_v2.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
+index cea3043..3a1249f 100644
+--- a/providers/hns/hns_roce_u_hw_v2.c
++++ b/providers/hns/hns_roce_u_hw_v2.c
+@@ -1910,8 +1910,8 @@ static int hns_roce_u_v2_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr,
+ if (flag) {
+ if (!ret)
+ qp->state = IBV_QPS_ERR;
+- hns_roce_spin_unlock(&hr_qp->sq.hr_lock);
+ hns_roce_spin_unlock(&hr_qp->rq.hr_lock);
++ hns_roce_spin_unlock(&hr_qp->sq.hr_lock);
+ }
+
+ if (ret)
+--
+2.33.0
+
diff --git a/rdma-core.spec b/rdma-core.spec
index 286b8dd..b49c6bf 100644
--- a/rdma-core.spec
+++ b/rdma-core.spec
@@ -1,6 +1,6 @@
Name: rdma-core
Version: 50.0
-Release: 25
+Release: 26
Summary: RDMA core userspace libraries and daemons
License: GPL-2.0-only OR BSD-2-Clause AND BSD-3-Clause
Url: https://github.com/linux-rdma/rdma-core
@@ -58,6 +58,9 @@ patch49: 0049-libzrdma-Add-poll-cqe-error-to-Failed-status.patch
patch50: 0050-libzrdma-Add-sq-rq-flush-cqe-and-log-optimization.patch
patch51: 0051-libzrdma-Fix-capability-related-bugs.patch
patch52: 0052-libxscale-Match-dev-by-vid-and-did.patch
+patch53: 0053-libhns-Clean-up-data-type-issues.patch
+patch54: 0054-libhns-Fix-wrong-max-inline-data-value.patch
+patch55: 0055-libhns-Fix-wrong-order-of-spin-unlock-in-modify-qp.patch
BuildRequires: binutils cmake >= 2.8.11 gcc libudev-devel pkgconfig pkgconfig(libnl-3.0)
BuildRequires: pkgconfig(libnl-route-3.0) systemd systemd-devel
@@ -637,6 +640,12 @@ fi
%doc %{_docdir}/%{name}-%{version}/70-persistent-ipoib.rules
%changelog
+* Thu Apr 17 2025 Xinghai Cen <cenxinghai(a)h-partners.com> - 50.0-26
+- Type: bugfix
+- ID: NA
+- SUG: NA
+- DESC: libhns: Cleanup and Bugfixes
+
* Thu Mar 20 2025 Xin Tian <tianx(a)yunsilicon.com> - 50.0-25
- Type: bugfix
- ID: NA
--
2.33.0
From: Guofeng Yue <yueguofeng(a)h-partners.com>
When creating an SRQ/XRC QP in TD lock-free mode, pass in ctx->pad
instead of ctx->pd, otherwise lock-free mode won't work.
Besides, use ctx->pad directly when creating a QP/SRQ, since pad
is designed to be interchangeable with the usual pd: when
lock-free mode is disabled, pad is exactly the usual pd.
Fixes: b6f957f6bc6c ("Perftest: Add support for TD lock-free mode")
Signed-off-by: Guofeng Yue <yueguofeng(a)h-partners.com>
Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
---
src/perftest_resources.c | 21 ++++++++++++---------
src/perftest_resources.h | 2 +-
2 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/src/perftest_resources.c b/src/perftest_resources.c
index 2d2e9f2..6543bc7 100755
--- a/src/perftest_resources.c
+++ b/src/perftest_resources.c
@@ -737,7 +737,8 @@ static int ctx_xrc_srq_create(struct pingpong_context *ctx,
else
srq_init_attr.cq = ctx->send_cq;
- srq_init_attr.pd = ctx->pd;
+ srq_init_attr.pd = ctx->pad;
+
ctx->srq = ibv_create_srq_ex(ctx->context, &srq_init_attr);
if (ctx->srq == NULL) {
fprintf(stderr, "Couldn't open XRC SRQ\n");
@@ -780,7 +781,8 @@ static struct ibv_qp *ctx_xrc_qp_create(struct pingpong_context *ctx,
qp_init_attr.cap.max_send_wr = user_param->tx_depth;
qp_init_attr.cap.max_send_sge = 1;
qp_init_attr.comp_mask = IBV_QP_INIT_ATTR_PD;
- qp_init_attr.pd = ctx->pd;
+ qp_init_attr.pd = ctx->pad;
+
#ifdef HAVE_IBV_WR_API
if (!user_param->use_old_post_send)
qp_init_attr.comp_mask |= IBV_QP_INIT_ATTR_SEND_OPS_FLAGS;
@@ -2037,9 +2039,14 @@ int ctx_init(struct pingpong_context *ctx, struct perftest_parameters *user_para
fprintf(stderr, "Couldn't allocate PAD\n");
goto td;
}
+ } else {
+ #endif
+ ctx->pad = ctx->pd;
+ #ifdef HAVE_TD_API
}
#endif
+
#ifdef HAVE_AES_XTS
if(user_param->aes_xts){
struct mlx5dv_dek_init_attr dek_attr = {};
@@ -2127,7 +2134,7 @@ int ctx_init(struct pingpong_context *ctx, struct perftest_parameters *user_para
attr.comp_mask = IBV_SRQ_INIT_ATTR_TYPE | IBV_SRQ_INIT_ATTR_PD;
attr.attr.max_wr = user_param->rx_depth;
attr.attr.max_sge = 1;
- attr.pd = ctx->pd;
+ attr.pd = ctx->pad;
attr.srq_type = IBV_SRQT_BASIC;
ctx->srq = ibv_create_srq_ex(ctx->context, &attr);
@@ -2148,7 +2155,7 @@ int ctx_init(struct pingpong_context *ctx, struct perftest_parameters *user_para
.max_sge = 1
}
};
- ctx->srq = ibv_create_srq(ctx->pd, &attr);
+ ctx->srq = ibv_create_srq(ctx->pad, &attr);
if (!ctx->srq) {
fprintf(stderr, "Couldn't create SRQ\n");
goto xrcd;
@@ -2406,11 +2413,7 @@ struct ibv_qp* ctx_qp_create(struct pingpong_context *ctx,
attr_ex.send_ops_flags |= IBV_QP_EX_WITH_RDMA_READ;
}
- #ifdef HAVE_TD_API
- attr_ex.pd = user_param->no_lock ? ctx->pad : ctx->pd;
- #else
- attr_ex.pd = ctx->pd;
- #endif
+ attr_ex.pd = ctx->pad;
attr_ex.comp_mask |= IBV_QP_INIT_ATTR_SEND_OPS_FLAGS | IBV_QP_INIT_ATTR_PD;
attr_ex.send_cq = attr.send_cq;
diff --git a/src/perftest_resources.h b/src/perftest_resources.h
index b28e136..25a7438 100755
--- a/src/perftest_resources.h
+++ b/src/perftest_resources.h
@@ -183,8 +183,8 @@ struct pingpong_context {
struct ibv_pd *pd;
#ifdef HAVE_TD_API
struct ibv_td *td;
- struct ibv_pd *pad;
#endif
+ struct ibv_pd *pad;
struct ibv_mr **mr;
struct ibv_mr *null_mr;
struct ibv_cq *send_cq;
--
2.33.0
From: wenglianfa <wenglianfa(a)huawei.com>
Add initial support for HNS LTTng tracing. Add a new libhns_trace
so that the original libhns does not need to have a dependency on
LTTng.
Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
---
providers/hns/CMakeLists.txt | 12 ++++++
providers/hns/hns_roce_u_hw_v2.c | 3 ++
providers/hns/hns_roce_u_trace.c | 36 ++++++++++++++++++
providers/hns/hns_roce_u_trace.h | 63 ++++++++++++++++++++++++++++++++
providers/hns/libhns_trace.map | 5 +++
5 files changed, 119 insertions(+)
create mode 100644 providers/hns/hns_roce_u_trace.c
create mode 100644 providers/hns/hns_roce_u_trace.h
create mode 100644 providers/hns/libhns_trace.map
diff --git a/providers/hns/CMakeLists.txt b/providers/hns/CMakeLists.txt
index 58139ae2b..84ddd6912 100644
--- a/providers/hns/CMakeLists.txt
+++ b/providers/hns/CMakeLists.txt
@@ -1,3 +1,10 @@
+if (ENABLE_LTTNG AND LTTNGUST_FOUND)
+ rdma_shared_provider(hns_trace libhns_trace.map
+ 1 1.0.${PACKAGE_VERSION}
+ hns_roce_u_trace.c
+)
+endif()
+
rdma_shared_provider(hns libhns.map
1 1.0.${PACKAGE_VERSION}
hns_roce_u.c
@@ -12,3 +19,8 @@ publish_headers(infiniband
)
rdma_pkg_config("hns" "libibverbs" "${CMAKE_THREAD_LIBS_INIT}")
+
+if (ENABLE_LTTNG AND LTTNGUST_FOUND)
+ target_include_directories(hns_trace PUBLIC ".")
+ target_link_libraries(hns_trace LINK_PRIVATE LTTng::UST)
+endif()
diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
index c02d2c1b8..b4b335fba 100644
--- a/providers/hns/hns_roce_u_hw_v2.c
+++ b/providers/hns/hns_roce_u_hw_v2.c
@@ -39,6 +39,9 @@
#include "hns_roce_u_db.h"
#include "hns_roce_u_hw_v2.h"
+#define LTTNG_UST_TRACEPOINT_DEFINE
+#define LTTNG_UST_TRACEPOINT_PROBE_DYNAMIC_LINKAGE
+
#define HR_IBV_OPC_MAP(ib_key, hr_key) \
[IBV_WR_ ## ib_key] = HNS_ROCE_WQE_OP_ ## hr_key
diff --git a/providers/hns/hns_roce_u_trace.c b/providers/hns/hns_roce_u_trace.c
new file mode 100644
index 000000000..65aa45ca3
--- /dev/null
+++ b/providers/hns/hns_roce_u_trace.c
@@ -0,0 +1,36 @@
+// SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause
+/*
+ * Copyright (c) 2025 Hisilicon Limited.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#define LTTNG_UST_TRACEPOINT_CREATE_PROBES
+
+#include "hns_roce_u_trace.h"
diff --git a/providers/hns/hns_roce_u_trace.h b/providers/hns/hns_roce_u_trace.h
new file mode 100644
index 000000000..74d33344d
--- /dev/null
+++ b/providers/hns/hns_roce_u_trace.h
@@ -0,0 +1,63 @@
+/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */
+/*
+ * Copyright (c) 2025 Hisilicon Limited.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#if defined(LTTNG_ENABLED)
+
+#undef LTTNG_UST_TRACEPOINT_PROVIDER
+#define LTTNG_UST_TRACEPOINT_PROVIDER rdma_core_hns
+
+#undef LTTNG_UST_TRACEPOINT_INCLUDE
+#define LTTNG_UST_TRACEPOINT_INCLUDE "hns_roce_u_trace.h"
+
+#if !defined(__HNS_TRACE_H__) || defined(LTTNG_UST_TRACEPOINT_HEADER_MULTI_READ)
+#define __HNS_TRACE_H__
+
+#include <lttng/tracepoint.h>
+#include <infiniband/verbs.h>
+
+#define rdma_tracepoint(arg...) lttng_ust_tracepoint(arg)
+
+#endif /* __HNS_TRACE_H__*/
+
+#include <lttng/tracepoint-event.h>
+
+#else
+
+#ifndef __HNS_TRACE_H__
+#define __HNS_TRACE_H__
+
+#define rdma_tracepoint(arg...)
+
+#endif /* __HNS_TRACE_H__*/
+
+#endif /* defined(LTTNG_ENABLED) */
diff --git a/providers/hns/libhns_trace.map b/providers/hns/libhns_trace.map
new file mode 100644
index 000000000..e74bc7479
--- /dev/null
+++ b/providers/hns/libhns_trace.map
@@ -0,0 +1,5 @@
+/* Export symbols should be added below according to
+ Documentation/versioning.md document. */
+HNS_TRACE_1.0 {
+ global:*;
+};
\ No newline at end of file
--
2.33.0
28 Mar '25
Add support for LTTng tracing. For now it is used for post_send,
post_recv and poll_cq.
wenglianfa (2):
libhns: Add initial support for HNS LTTng tracing
libhns: Add tracepoint for HNS RoCE I/O
providers/hns/CMakeLists.txt | 12 +++
providers/hns/hns_roce_u_hw_v2.c | 154 +++++++++++++++++++++++++++++
providers/hns/hns_roce_u_trace.c | 36 +++++++
providers/hns/hns_roce_u_trace.h | 161 +++++++++++++++++++++++++++++++
providers/hns/libhns_trace.map | 5 +
5 files changed, 368 insertions(+)
create mode 100644 providers/hns/hns_roce_u_trace.c
create mode 100644 providers/hns/hns_roce_u_trace.h
create mode 100644 providers/hns/libhns_trace.map
--
2.33.0
From: Xinghai Cen <cenxinghai(a)h-partners.com>
driver inclusion
category: bugfix
bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IB66RT
------------------------------------------------------------------
Changes to be committed:
modified: 0040-libhns-Fix-memory-leakage-when-DCA-is-enabled.patch
new file: 0045-libhns-fix-incorrectly-using-fixed-pagesize.patch
new file: 0046-libhns-fix-missing-new-IO-support-for-DCA.patch
Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
---
...x-memory-leakage-when-DCA-is-enabled.patch | 12 +-
...fix-incorrectly-using-fixed-pagesize.patch | 110 ++++++++++++++++++
...s-fix-missing-new-IO-support-for-DCA.patch | 55 +++++++++
rdma-core.spec | 10 +-
4 files changed, 181 insertions(+), 6 deletions(-)
create mode 100644 0045-libhns-fix-incorrectly-using-fixed-pagesize.patch
create mode 100644 0046-libhns-fix-missing-new-IO-support-for-DCA.patch
diff --git a/0040-libhns-Fix-memory-leakage-when-DCA-is-enabled.patch b/0040-libhns-Fix-memory-leakage-when-DCA-is-enabled.patch
index f046264..58cd8ec 100644
--- a/0040-libhns-Fix-memory-leakage-when-DCA-is-enabled.patch
+++ b/0040-libhns-Fix-memory-leakage-when-DCA-is-enabled.patch
@@ -1,4 +1,4 @@
-From af20dd32df73ff72d35a430a5fb87ac42d70cdf4 Mon Sep 17 00:00:00 2001
+From f8e29f955dd5399bd227c4de532f6d09872a254a Mon Sep 17 00:00:00 2001
From: wenglianfa <wenglianfa(a)huawei.com>
Date: Thu, 25 Jul 2024 11:06:01 +0800
Subject: [PATCH] libhns: Fix memory leakage when DCA is enabled
@@ -17,18 +17,20 @@ Fixes: 2783884a97e7 ("libhns: Add support for attaching QP's WQE buffer")
Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
---
- providers/hns/hns_roce_u_verbs.c | 3 ++-
- 1 file changed, 2 insertions(+), 1 deletion(-)
+ providers/hns/hns_roce_u_verbs.c | 5 ++++-
+ 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
-index e30880c..c733b21 100644
+index e30880c..154e800 100644
--- a/providers/hns/hns_roce_u_verbs.c
+++ b/providers/hns/hns_roce_u_verbs.c
-@@ -1357,7 +1357,8 @@ static void qp_free_wqe(struct hns_roce_qp *qp)
+@@ -1357,7 +1357,10 @@ static void qp_free_wqe(struct hns_roce_qp *qp)
if (qp->rq.wqe_cnt)
free(qp->rq.wrid);
- hns_roce_free_buf(&qp->buf);
++ if (qp->dca_wqe.bufs)
++ free(qp->dca_wqe.bufs);
+ else
+ hns_roce_free_buf(&qp->buf);
}
diff --git a/0045-libhns-fix-incorrectly-using-fixed-pagesize.patch b/0045-libhns-fix-incorrectly-using-fixed-pagesize.patch
new file mode 100644
index 0000000..717b340
--- /dev/null
+++ b/0045-libhns-fix-incorrectly-using-fixed-pagesize.patch
@@ -0,0 +1,110 @@
+From 7bd22fed52a1828b0d44a990b52266e9e1d92b5d Mon Sep 17 00:00:00 2001
+From: Chengchang Tang <tangchengchang(a)huawei.com>
+Date: Tue, 30 Jan 2024 21:00:46 +0800
+Subject: [PATCH 45/46] libhns: fix incorrectly using fixed pagesize
+
+driver inclusion
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IB66RT
+
+------------------------------------------------------------------
+
+Currently, the actually used page size is fixed, which prevents the
+flexible WQE buffer size feature from taking effect.
+
+Fixes: 9ab7600d832b ("libhns: Add support for attaching QP's WQE buffer")
+Signed-off-by: Chengchang Tang <tangchengchang(a)huawei.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/hns_roce_u_verbs.c | 24 +++++++++++++-----------
+ 1 file changed, 13 insertions(+), 11 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index bce215e..848f836 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -1296,14 +1296,14 @@ static void free_recv_rinl_buf(struct hns_roce_rinl_buf *rinl_buf)
+
+ static void get_best_multi_region_pg_shift(struct hns_roce_device *hr_dev,
+ struct hns_roce_context *ctx,
+- struct hns_roce_qp *qp)
++ struct hns_roce_qp *qp, bool dca_en)
+ {
+ uint32_t ext_sge_size;
+ uint32_t sq_size;
+ uint32_t rq_size;
+ uint8_t pg_shift;
+
+- if (!(ctx->config & HNS_ROCE_UCTX_RSP_DYN_QP_PGSZ)) {
++ if (!(ctx->config & HNS_ROCE_UCTX_RSP_DYN_QP_PGSZ || dca_en)) {
+ qp->pageshift = HNS_HW_PAGE_SHIFT;
+ return;
+ }
+@@ -1334,7 +1334,7 @@ static void get_best_multi_region_pg_shift(struct hns_roce_device *hr_dev,
+
+ static int calc_qp_buff_size(struct hns_roce_device *hr_dev,
+ struct hns_roce_context *ctx,
+- struct hns_roce_qp *qp)
++ struct hns_roce_qp *qp, bool dca_en)
+ {
+ struct hns_roce_wq *sq = &qp->sq;
+ struct hns_roce_wq *rq = &qp->rq;
+@@ -1342,7 +1342,7 @@ static int calc_qp_buff_size(struct hns_roce_device *hr_dev,
+ unsigned int size;
+
+ qp->buf_size = 0;
+- get_best_multi_region_pg_shift(hr_dev, ctx, qp);
++ get_best_multi_region_pg_shift(hr_dev, ctx, qp, dca_en);
+ page_size = 1 << qp->pageshift;
+
+ /* SQ WQE */
+@@ -1384,7 +1384,7 @@ static inline bool check_qp_support_dca(struct hns_roce_dca_ctx *dca_ctx,
+ if (hns_attr &&
+ (hns_attr->comp_mask & HNSDV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS) &&
+ (hns_attr->create_flags & HNSDV_QP_CREATE_ENABLE_DCA_MODE))
+- return true;
++ return dca_ctx->max_size > 0;
+
+ return false;
+ }
+@@ -1408,9 +1408,12 @@ static int qp_alloc_wqe(struct ibv_qp_init_attr_ex *attr,
+ struct hns_roce_qp *qp, struct hns_roce_context *ctx)
+ {
+ struct hns_roce_device *hr_dev = to_hr_dev(ctx->ibv_ctx.context.device);
++ bool dca_en = check_qp_support_dca(&ctx->dca_ctx, attr, hns_attr);
++ int ret;
+
+- if (calc_qp_buff_size(hr_dev, ctx, qp))
+- return -EINVAL;
++ ret = calc_qp_buff_size(hr_dev, ctx, qp, dca_en);
++ if (ret)
++ return ret;
+
+ qp->sq.wrid = malloc(qp->sq.wqe_cnt * sizeof(uint64_t));
+ if (!qp->sq.wrid)
+@@ -1428,19 +1431,18 @@ static int qp_alloc_wqe(struct ibv_qp_init_attr_ex *attr,
+ goto err_alloc;
+ }
+
+- if (check_qp_support_dca(&ctx->dca_ctx, attr, hns_attr) &&
+- ctx->dca_ctx.max_size > 0) {
++ if (check_qp_support_dca(&ctx->dca_ctx, attr, hns_attr)) {
+ /* when DCA is enabled, use a buffer list to store page addr */
+ qp->buf.buf = NULL;
+ qp->dca_wqe.max_cnt = hr_hw_page_count(qp->buf_size);
+- qp->dca_wqe.shift = HNS_HW_PAGE_SHIFT;
++ qp->dca_wqe.shift = qp->pageshift;
+ qp->dca_wqe.bufs = calloc(qp->dca_wqe.max_cnt, sizeof(void *));
+ if (!qp->dca_wqe.bufs)
+ goto err_alloc;
+ verbs_debug(&ctx->ibv_ctx, "alloc DCA buf.\n");
+ } else {
+ if (hns_roce_alloc_buf(&qp->buf, qp->buf_size,
+- HNS_HW_PAGE_SIZE))
++ 1 << qp->pageshift))
+ goto err_alloc;
+ }
+
+--
+2.33.0
+
diff --git a/0046-libhns-fix-missing-new-IO-support-for-DCA.patch b/0046-libhns-fix-missing-new-IO-support-for-DCA.patch
new file mode 100644
index 0000000..1b98200
--- /dev/null
+++ b/0046-libhns-fix-missing-new-IO-support-for-DCA.patch
@@ -0,0 +1,55 @@
+From 199b2f78ff9eeeb25acc78f9da495ae58877807a Mon Sep 17 00:00:00 2001
+From: Chengchang Tang <tangchengchang(a)huawei.com>
+Date: Tue, 30 Jan 2024 21:28:44 +0800
+Subject: [PATCH 46/46] libhns: fix missing new IO support for DCA
+
+driver inclusion
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IBSL67
+
+------------------------------------------------------------------
+
+Support for DCA has been missing from the new IO interfaces.
+
+Fixes: 9ab7600d832b ("libhns: Add support for attaching QP's WQE buffer")
+Signed-off-by: Chengchang Tang <tangchengchang(a)huawei.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/hns_roce_u_hw_v2.c | 7 +++++++
+ 1 file changed, 7 insertions(+)
+
+diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
+index aadea7a..3137111 100644
+--- a/providers/hns/hns_roce_u_hw_v2.c
++++ b/providers/hns/hns_roce_u_hw_v2.c
+@@ -2191,6 +2191,8 @@ static int wc_start_poll_cq(struct ibv_cq_ex *current,
+ }
+
+ err = hns_roce_poll_one(ctx, &qp, cq, NULL);
++ if (qp && check_dca_detach_enable(qp))
++ dca_detach_qp_buf(ctx, qp);
+
+ start_poll_done:
+ if (err != V2_CQ_OK)
+@@ -2210,6 +2212,8 @@ static int wc_next_poll_cq(struct ibv_cq_ex *current)
+ return hns_roce_poll_one_swc(cq, NULL);
+
+ err = hns_roce_poll_one(ctx, &qp, cq, NULL);
++ if (qp && check_dca_detach_enable(qp))
++ dca_detach_qp_buf(ctx, qp);
+ if (err != V2_CQ_OK)
+ return err;
+
+@@ -2408,6 +2412,9 @@ init_rc_wqe(struct hns_roce_qp *qp, uint64_t wr_id, unsigned int opcode)
+ hr_reg_write_bool(wqe, RCWQE_SE, send_flags & IBV_SEND_SOLICITED);
+ hr_reg_clear(wqe, RCWQE_INLINE);
+
++ if (check_qp_dca_enable(qp))
++ fill_rc_dca_fields(qp->verbs_qp.qp.qp_num, wqe);
++
+ qp->sq.wrid[wqe_idx] = wr_id;
+ qp->cur_wqe = wqe;
+
+--
+2.33.0
+
diff --git a/rdma-core.spec b/rdma-core.spec
index b202c8d..0edb02b 100644
--- a/rdma-core.spec
+++ b/rdma-core.spec
@@ -1,6 +1,6 @@
Name: rdma-core
Version: 50.0
-Release: 22
+Release: 23
Summary: RDMA core userspace libraries and daemons
License: GPL-2.0-only OR BSD-2-Clause AND BSD-3-Clause
Url: https://github.com/linux-rdma/rdma-core
@@ -50,6 +50,8 @@ patch41: 0041-libhns-Fix-coredump-during-QP-destruction-when-send_.patch
patch42: 0042-libhns-Add-error-logs-to-help-diagnosis.patch
patch43: 0043-libhns-Fix-missing-fields-for-SRQ-WC.patch
patch44: 0044-libxscale-Add-Yunsilicon-User-Space-RDMA-Driver.patch
+patch45: 0045-libhns-fix-incorrectly-using-fixed-pagesize.patch
+patch46: 0046-libhns-fix-missing-new-IO-support-for-DCA.patch
BuildRequires: binutils cmake >= 2.8.11 gcc libudev-devel pkgconfig pkgconfig(libnl-3.0)
BuildRequires: pkgconfig(libnl-route-3.0) systemd systemd-devel
@@ -629,6 +631,12 @@ fi
%doc %{_docdir}/%{name}-%{version}/70-persistent-ipoib.rules
%changelog
+* Tue Mar 11 2025 Xinghai Cen <cenxinghai(a)h-partners.com> - 50.0-23
+- Type: bugfix
+- ID: NA
+- SUG: NA
+- DESC: Fix some bugs for libhns
+
* Wed Feb 26 2025 Xin Tian <tianx(a)yunsilicon.com> - 50.0-22
- Type: requirement
- ID: NA
--
2.33.0
This series contains some recent cleanup and bugfixes for libhns.
Junxian Huang (2):
libhns: Clean up data type issues
libhns: Fix wrong order of spin_unlock in modify_qp
wenglianfa (1):
libhns: Fix wrong max_inline_data value
providers/hns/hns_roce_u.h | 4 ++--
providers/hns/hns_roce_u_hw_v2.c | 17 +++++++++--------
providers/hns/hns_roce_u_verbs.c | 20 ++++++++++++++------
3 files changed, 25 insertions(+), 16 deletions(-)
--
2.33.0
12 Mar '25
From: Xinghai Cen <cenxinghai(a)h-partners.com>
Changes to be committed:
new file: 0094-libhns-Fix-the-max_inline_data-value.patch
new file: 0095-libhns-Adapt-UD-inline-data-size-for-UCX.patch
new file: 0096-libhns-Fix-wrong-order-of-spin_unlock-in-modify_qp.patch
modified: rdma-core.spec
Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
---
...libhns-Fix-the-max_inline_data-value.patch | 60 +++++++++++++++
...ns-Adapt-UD-inline-data-size-for-UCX.patch | 76 +++++++++++++++++++
...ng-order-of-spin_unlock-in-modify_qp.patch | 37 +++++++++
rdma-core.spec | 11 ++-
4 files changed, 183 insertions(+), 1 deletion(-)
create mode 100644 0094-libhns-Fix-the-max_inline_data-value.patch
create mode 100644 0095-libhns-Adapt-UD-inline-data-size-for-UCX.patch
create mode 100644 0096-libhns-Fix-wrong-order-of-spin_unlock-in-modify_qp.patch
diff --git a/0094-libhns-Fix-the-max_inline_data-value.patch b/0094-libhns-Fix-the-max_inline_data-value.patch
new file mode 100644
index 0000000..6b68cd9
--- /dev/null
+++ b/0094-libhns-Fix-the-max_inline_data-value.patch
@@ -0,0 +1,60 @@
+From 1960a28512d4ddd96e82e141f1135ace8cf6054b Mon Sep 17 00:00:00 2001
+From: wenglianfa <wenglianfa(a)huawei.com>
+Date: Tue, 25 Feb 2025 20:18:01 +0800
+Subject: [PATCH] libhns: Fix the max_inline_data value
+
+driver inclusion
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IBSLL5
+
+----------------------------------------------------------------------
+
+If max_inline_data=0, roundup_pow_of_two(0)=1, so
+cap->max_inline_data will be modified to 1, which
+doesn't meet expectations. Fix it here.
+
+Fixes: e0c8de59b29a ("libhns: Fix the problem of sge nums")
+Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/hns_roce_u_verbs.c | 17 ++++++++++++++---
+ 1 file changed, 14 insertions(+), 3 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index 47b1f8b..98a18c6 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -1714,6 +1714,18 @@ static unsigned int get_sge_num_from_max_inl_data(bool is_ud,
+ return inline_sge;
+ }
+
++static uint32_t get_max_inline_data(struct hns_roce_context *ctx,
++ struct ibv_qp_cap *cap)
++{
++ if (cap->max_inline_data) {
++ return min_t(uint32_t,
++ roundup_pow_of_two(cap->max_inline_data),
++ ctx->max_inline_data);
++ }
++
++ return 0;
++}
++
+ static void set_ext_sge_param(struct hns_roce_context *ctx,
+ struct ibv_qp_init_attr_ex *attr,
+ struct hns_roce_qp *qp, unsigned int wr_cnt)
+@@ -1730,9 +1742,8 @@ static void set_ext_sge_param(struct hns_roce_context *ctx,
+ attr->cap.max_send_sge);
+
+ if (ctx->config & HNS_ROCE_RSP_EXSGE_FLAGS) {
+- attr->cap.max_inline_data = min_t(uint32_t, roundup_pow_of_two(
+- attr->cap.max_inline_data),
+- ctx->max_inline_data);
++ attr->cap.max_inline_data =
++ get_max_inline_data(ctx, &attr->cap);
+
+ inline_ext_sge = max(ext_wqe_sge_cnt,
+ get_sge_num_from_max_inl_data(is_ud,
+--
+2.33.0
+
diff --git a/0095-libhns-Adapt-UD-inline-data-size-for-UCX.patch b/0095-libhns-Adapt-UD-inline-data-size-for-UCX.patch
new file mode 100644
index 0000000..1dbd61c
--- /dev/null
+++ b/0095-libhns-Adapt-UD-inline-data-size-for-UCX.patch
@@ -0,0 +1,76 @@
+From 32cc1a9fe19c00bcf5512afa7b51a24de9dd8424 Mon Sep 17 00:00:00 2001
+From: wenglianfa <wenglianfa(a)huawei.com>
+Date: Tue, 25 Feb 2025 20:29:53 +0800
+Subject: [PATCH] libhns: Adapt UD inline data size for UCX
+
+driver inclusion
+category: feature
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IBSLL5
+
+----------------------------------------------------------------------
+
+Adapt the UD inline data size for UCX. The value
+must be at least 128 to avoid the UCX bug.
+
+The issue url:
+https://github.com/openucx/ucx/issues/10423
+
+Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/hns_roce_u.h | 2 ++
+ providers/hns/hns_roce_u_verbs.c | 16 ++++++++++++----
+ 2 files changed, 14 insertions(+), 4 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u.h b/providers/hns/hns_roce_u.h
+index 323d2f9..863d4b5 100644
+--- a/providers/hns/hns_roce_u.h
++++ b/providers/hns/hns_roce_u.h
+@@ -83,6 +83,8 @@ typedef _Atomic(uint64_t) atomic_bitmap_t;
+ #define HNS_ROCE_ADDRESS_MASK 0xFFFFFFFF
+ #define HNS_ROCE_ADDRESS_SHIFT 32
+
++#define HNS_ROCE_MIN_UD_INLINE 128
++
+ #define roce_get_field(origin, mask, shift) \
+ (((le32toh(origin)) & (mask)) >> (shift))
+
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index 98a18c6..7418d2c 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -1715,11 +1715,19 @@ static unsigned int get_sge_num_from_max_inl_data(bool is_ud,
+ }
+
+ static uint32_t get_max_inline_data(struct hns_roce_context *ctx,
+- struct ibv_qp_cap *cap)
++ struct ibv_qp_cap *cap,
++ bool is_ud)
+ {
+- if (cap->max_inline_data) {
++ uint32_t max_inline_data = cap->max_inline_data;
++
++ if (max_inline_data) {
++ max_inline_data = roundup_pow_of_two(max_inline_data);
++
++ if (is_ud && max_inline_data < HNS_ROCE_MIN_UD_INLINE)
++ max_inline_data = HNS_ROCE_MIN_UD_INLINE;
++
+ return min_t(uint32_t,
+- roundup_pow_of_two(cap->max_inline_data),
++ max_inline_data,
+ ctx->max_inline_data);
+ }
+
+@@ -1743,7 +1751,7 @@ static void set_ext_sge_param(struct hns_roce_context *ctx,
+
+ if (ctx->config & HNS_ROCE_RSP_EXSGE_FLAGS) {
+ attr->cap.max_inline_data =
+- get_max_inline_data(ctx, &attr->cap);
++ get_max_inline_data(ctx, &attr->cap, is_ud);
+
+ inline_ext_sge = max(ext_wqe_sge_cnt,
+ get_sge_num_from_max_inl_data(is_ud,
+--
+2.33.0
+
diff --git a/0096-libhns-Fix-wrong-order-of-spin_unlock-in-modify_qp.patch b/0096-libhns-Fix-wrong-order-of-spin_unlock-in-modify_qp.patch
new file mode 100644
index 0000000..2954997
--- /dev/null
+++ b/0096-libhns-Fix-wrong-order-of-spin_unlock-in-modify_qp.patch
@@ -0,0 +1,37 @@
+From a1e9d50eda32f05fe6af3712d1e70d3b6944528a Mon Sep 17 00:00:00 2001
+From: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Date: Mon, 10 Mar 2025 11:47:23 +0800
+Subject: [PATCH 96/96] libhns: Fix wrong order of spin_unlock in modify_qp
+
+driver inclusion
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IBSLL5
+
+----------------------------------------------------------------------
+
+The spin_unlock order should be the reverse of spin_lock() order.
+
+Fixes: f29e2a7fa40d ("[PATCH] libhns: Add support for the thread domain and the parent domain")
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/hns_roce_u_hw_v2.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
+index f05b839..70fe2f7 100644
+--- a/providers/hns/hns_roce_u_hw_v2.c
++++ b/providers/hns/hns_roce_u_hw_v2.c
+@@ -1936,8 +1936,8 @@ static int hns_roce_u_v2_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr,
+ if (flag) {
+ if (!ret)
+ qp->state = IBV_QPS_ERR;
+- hns_roce_spin_unlock(&hr_qp->sq.hr_lock);
+ hns_roce_spin_unlock(&hr_qp->rq.hr_lock);
++ hns_roce_spin_unlock(&hr_qp->sq.hr_lock);
+ }
+
+ if (ret)
+--
+2.33.0
+
diff --git a/rdma-core.spec b/rdma-core.spec
index 2756cb6..0a82bf4 100644
--- a/rdma-core.spec
+++ b/rdma-core.spec
@@ -1,6 +1,6 @@
Name: rdma-core
Version: 41.0
-Release: 32
+Release: 33
Summary: RDMA core userspace libraries and daemons
License: GPLv2 or BSD
Url: https://github.com/linux-rdma/rdma-core
@@ -99,6 +99,9 @@ patch90: 0090-libhns-Fix-coredump-during-QP-destruction-when-send_.patch
patch91: 0091-libhns-Fix-the-identification-mark-of-RDMA-UD-packet.patch
patch92: 0092-libhns-Fix-missing-fields-for-SRQ-WC.patch
patch93: 0093-libxscale-Add-Yunsilicon-User-Space-RDMA-Driver.patch
+patch94: 0094-libhns-Fix-the-max_inline_data-value.patch
+patch95: 0095-libhns-Adapt-UD-inline-data-size-for-UCX.patch
+patch96: 0096-libhns-Fix-wrong-order-of-spin_unlock-in-modify_qp.patch
BuildRequires: binutils cmake >= 2.8.11 gcc libudev-devel pkgconfig pkgconfig(libnl-3.0)
BuildRequires: pkgconfig(libnl-route-3.0) valgrind-devel systemd systemd-devel
@@ -349,6 +352,12 @@ fi
%{_mandir}/*
%changelog
+* Wed Mar 12 2025 Xinghai Cen <cenxinghai(a)h-partners.com> - 41.0-33
+- Type: bugfix
+- ID: NA
+- SUG: NA
+- DESC: Fixes some bugs for libhns
+
* Tue Mar 4 2025 Xin Tian <tianx(a)yunsilicon.com> - 41.0-32
- Type: requirement
- ID: NA
--
2.33.0
Add myself as an RDMA-Core/Perftest committer. Also add the current
committer Chengchang to the README.
Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
---
sig/sig-high-performance-network/README.md | 10 +++++++---
sig/sig-high-performance-network/sig-info.yaml | 4 ++++
2 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/sig/sig-high-performance-network/README.md b/sig/sig-high-performance-network/README.md
index 2ab740be0..050b329c3 100644
--- a/sig/sig-high-performance-network/README.md
+++ b/sig/sig-high-performance-network/README.md
@@ -11,7 +11,7 @@
4. 通过本sig的持续发展,实现openEuler特色的网络加速技术体系,体现几个特点:易用性、高性能、更好的软件生态。
- Sig-high-performance-network的业务范围
- 1. 维护DPDK/RDMA/XDP相关软件包,包括
- 2. - 基础软件包:RDMA-Core、DPDK、DPDK驱动。
+ 2. - 基础软件包:RDMA-Core、DPDK、DPDK驱动。
- 语言库:dpdk-go、libbpf、goebf等
- 工具包:dpdk-perf、xdp-tools、dpdk-benchmark等
- 基础服务:libnet、libvma、libkefir等
@@ -58,6 +58,10 @@ Please [readme](./Vision-en.md)
- qianguoxin[@qianguoxin](https://gitee.com/qianguoxin) email: qianguoxin(a)huawei.com
- Yan Fangfang[@fangfang-yan](https://gitee.com/fangfang-yan) email: fangfang.yan(a)huawei.com
+#### RDMA-Core项目Committer列表
+- 汤乘畅[@hellotcc](https://gitee.com/hellotcc) email: tangchengchang(a)huawei.com
+- 黄俊贤[@hginjgerx](https://gitee.com/hginjgerx) email: huangjunxian6(a)hisilicon.com
+
### 突出贡献者
| gitee昵称 | giteeID | 邮箱 | 姓名(可选)|
|---|---|---|---|
@@ -66,11 +70,11 @@ Please [readme](./Vision-en.md)
# 联系方式
-- 邮件列表 <high-performance-network(a)openeuler.org> <dev(a)openeuler.org>
+- 邮件列表 <high-performance-network(a)openeuler.org> <dev(a)openeuler.org>
[注册邮件列表](https://mailweb.openeuler.org/postorius/lists/high-performance-netw… [历史邮件](https://mailweb.openeuler.org/hyperkitty/list/high-performance-networ…
- [IRC公开会议]()
- 视频会议
-- 微信
+- 微信

diff --git a/sig/sig-high-performance-network/sig-info.yaml b/sig/sig-high-performance-network/sig-info.yaml
index 0f3660e4e..d25b7f5a0 100644
--- a/sig/sig-high-performance-network/sig-info.yaml
+++ b/sig/sig-high-performance-network/sig-info.yaml
@@ -92,3 +92,7 @@ repositories:
name: Chengchang Tang
organization: huawei
email: tangchengchang(a)huawei.com
+ - gitee_id: hginjgerx
+ name: Junxian Huang
+ organization: huawei
+ email: huangjunxian6(a)hisilicon.com
--
2.33.0
From: Xinghai Cen <cenxinghai(a)h-partners.com>
The sl and src_qpn fields in recv-WC are not filled when the QP is UD
and has an SRQ. Fix it here.
In addition, UD QP does not support RQ INLINE and CQE INLINE features.
Reorder the related if-else statements to reduce the number of
conditional checks in IO path.
Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
---
...libhns-Fix-missing-fields-for-SRQ-WC.patch | 82 +++++++++++++++++++
rdma-core.spec | 11 ++-
2 files changed, 91 insertions(+), 2 deletions(-)
create mode 100644 0092-libhns-Fix-missing-fields-for-SRQ-WC.patch
diff --git a/0092-libhns-Fix-missing-fields-for-SRQ-WC.patch b/0092-libhns-Fix-missing-fields-for-SRQ-WC.patch
new file mode 100644
index 0000000..9bccce2
--- /dev/null
+++ b/0092-libhns-Fix-missing-fields-for-SRQ-WC.patch
@@ -0,0 +1,82 @@
+From 1bebc7700d46ef8e4cafc757e8f989ace4789d69 Mon Sep 17 00:00:00 2001
+From: wenglianfa <wenglianfa(a)huawei.com>
+Date: Wed, 15 Jan 2025 15:55:29 +0800
+Subject: [PATCH] libhns: Fix missing fields for SRQ WC
+
+mainline inclusion
+from mainline-master
+commit c4119911c212aaa552c9cb928fba0a696640c9b5
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IBPB3C
+CVE: NA
+Reference: https://github.com/linux-rdma/rdma-core/pull/1543/commits/65a7ce99cf4bfd674…
+
+----------------------------------------------------------------------
+
+The sl and src_qpn fields in recv-WC are not filled when the QP is UD
+and has an SRQ. Here fix it.
+
+In addition, UD QP does not support RQ INLINE and CQE INLINE features.
+Reorder the related if-else statements to reduce the number of
+conditional checks in IO path.
+
+Fixes: 061f7e1757ca ("libhns: Refactor the poll one interface")
+Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/hns_roce_u_hw_v2.c | 13 ++++++++-----
+ 1 file changed, 8 insertions(+), 5 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
+index 1bbd788..f05b839 100644
+--- a/providers/hns/hns_roce_u_hw_v2.c
++++ b/providers/hns/hns_roce_u_hw_v2.c
+@@ -540,7 +540,8 @@ static void parse_for_ud_qp(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc,
+ }
+
+ static void parse_cqe_for_srq(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc,
+- struct hns_roce_srq *srq)
++ struct hns_roce_srq *srq,
++ struct hns_roce_qp *hr_qp)
+ {
+ uint32_t wqe_idx;
+
+@@ -550,6 +551,9 @@ static void parse_cqe_for_srq(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc,
+
+ if (hr_reg_read(cqe, CQE_CQE_INLINE))
+ handle_recv_cqe_inl_from_srq(cqe, srq);
++ else if (hr_qp->verbs_qp.qp.qp_type == IBV_QPT_UD) {
++ parse_for_ud_qp(cqe, wc, hr_qp->enable_ud_sl);
++ }
+ }
+
+ static int parse_cqe_for_resp(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc,
+@@ -561,13 +565,12 @@ static int parse_cqe_for_resp(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc,
+ wc->wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)];
+ ++wq->tail;
+
+- if (hr_qp->verbs_qp.qp.qp_type == IBV_QPT_UD)
+- parse_for_ud_qp(cqe, wc, hr_qp->enable_ud_sl);
+-
+ if (hr_reg_read(cqe, CQE_CQE_INLINE))
+ handle_recv_cqe_inl_from_rq(cqe, hr_qp);
+ else if (hr_reg_read(cqe, CQE_RQ_INLINE))
+ handle_recv_rq_inl(cqe, hr_qp);
++ else if (hr_qp->verbs_qp.qp.qp_type == IBV_QPT_UD)
++ parse_for_ud_qp(cqe, wc, hr_qp->enable_ud_sl);
+
+ return 0;
+ }
+@@ -777,7 +780,7 @@ static int parse_cqe_for_cq(struct hns_roce_context *ctx, struct hns_roce_cq *cq
+ return V2_CQ_POLL_ERR;
+
+ if (srq)
+- parse_cqe_for_srq(cqe, wc, srq);
++ parse_cqe_for_srq(cqe, wc, srq, cur_qp);
+ else
+ parse_cqe_for_resp(cqe, wc, cur_qp);
+ }
+--
+2.33.0
+
diff --git a/rdma-core.spec b/rdma-core.spec
index 369d9b8..33781ce 100644
--- a/rdma-core.spec
+++ b/rdma-core.spec
@@ -1,6 +1,6 @@
Name: rdma-core
Version: 41.0
-Release: 30
+Release: 31
Summary: RDMA core userspace libraries and daemons
License: GPLv2 or BSD
Url: https://github.com/linux-rdma/rdma-core
@@ -97,6 +97,7 @@ patch88: 0088-libhns-Fix-reference-to-uninitialized-cq-pointer.patch
patch89: 0089-libhns-Fix-bypassed-vendor-check-in-hnsdv_query_devi.patch
patch90: 0090-libhns-Fix-coredump-during-QP-destruction-when-send_.patch
patch91: 0091-libhns-Fix-the-identification-mark-of-RDMA-UD-packet.patch
+patch92: 0092-libhns-Fix-missing-fields-for-SRQ-WC.patch
BuildRequires: binutils cmake >= 2.8.11 gcc libudev-devel pkgconfig pkgconfig(libnl-3.0)
BuildRequires: pkgconfig(libnl-route-3.0) valgrind-devel systemd systemd-devel
@@ -344,7 +345,13 @@ fi
%{_mandir}/*
%changelog
-* Thu Feb 25 2025 Dazhao Lao <laodazhao(a)huawei.com> - 41.0-30
+* Thu Feb 27 2025 Xinghai Cen <cenxinghai(a)h-partners.com> - 41.0-31
+- Type: bugfix
+- ID: NA
+- SUG: NA
+- DESC: Fix missing fields for SRQ WC
+
+* Tue Feb 25 2025 Dazhao Lao <laodazhao(a)huawei.com> - 41.0-30
- Type: bugfix
- ID: NA
- SUG: NA
--
2.33.0
From: Xinghai Cen <cenxinghai(a)h-partners.com>
mainline inclusion:
libhns: Fix missing fields for SRQ WC
Also correct the Fixes information of some patches.
Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
---
...x-memory-leakage-when-DCA-is-enabled.patch | 7 +-
...ump-during-QP-destruction-when-send_.patch | 4 +-
...libhns-Fix-missing-fields-for-SRQ-WC.patch | 82 +++++++++++++++++++
rdma-core.spec | 9 +-
4 files changed, 95 insertions(+), 7 deletions(-)
create mode 100644 0043-libhns-Fix-missing-fields-for-SRQ-WC.patch
diff --git a/0040-libhns-Fix-memory-leakage-when-DCA-is-enabled.patch b/0040-libhns-Fix-memory-leakage-when-DCA-is-enabled.patch
index e1763f7..f046264 100644
--- a/0040-libhns-Fix-memory-leakage-when-DCA-is-enabled.patch
+++ b/0040-libhns-Fix-memory-leakage-when-DCA-is-enabled.patch
@@ -1,4 +1,4 @@
-From edaf09dbfc7203ea68becfcb56eecf4af31ba555 Mon Sep 17 00:00:00 2001
+From af20dd32df73ff72d35a430a5fb87ac42d70cdf4 Mon Sep 17 00:00:00 2001
From: wenglianfa <wenglianfa(a)huawei.com>
Date: Thu, 25 Jul 2024 11:06:01 +0800
Subject: [PATCH] libhns: Fix memory leakage when DCA is enabled
@@ -13,8 +13,7 @@ After DCA is enabled and a QP is created, the memory block
applied for DCA is not free when the QP is destroyed. Here
fix it.
-Fixes: 41e39ab792c8 ("[BigDipperV3R9,NeZha][ROCE] libhns: Add support for at taching QP's WQE buffer")
-
+Fixes: 2783884a97e7 ("libhns: Add support for attaching QP's WQE buffer")
Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
---
@@ -36,5 +35,5 @@ index e30880c..c733b21 100644
static int qp_alloc_wqe(struct ibv_qp_init_attr_ex *attr,
--
-2.25.1
+2.33.0
diff --git a/0041-libhns-Fix-coredump-during-QP-destruction-when-send_.patch b/0041-libhns-Fix-coredump-during-QP-destruction-when-send_.patch
index b3ee84f..2e9f546 100644
--- a/0041-libhns-Fix-coredump-during-QP-destruction-when-send_.patch
+++ b/0041-libhns-Fix-coredump-during-QP-destruction-when-send_.patch
@@ -1,4 +1,4 @@
-From 263479c6fb4712528ccae276960ec94fd77afc51 Mon Sep 17 00:00:00 2001
+From 83784fc2538d24f3f06f023c21cc045d5b7f44ce Mon Sep 17 00:00:00 2001
From: Yuyu Li <liyuyu6(a)huawei.com>
Date: Mon, 25 Nov 2024 16:13:48 +0800
Subject: [PATCH] libhns: Fix coredump during QP destruction when send_cq ==
@@ -24,7 +24,7 @@ coredump info:
0x0000ffff8feae39c in __ibv_create_qp_1_1
0x0000000000401420 in test_ctrl_path
-Fixes: 95e05809d2d2 ("[BigDipperV3R9,NeZha][ROCE] libhns: Support reporting wc as software mode")
+Fixes: 5494e44cf97e ("Support reporting wc as software mode.")
Signed-off-by: Yuyu Li <liyuyu6(a)huawei.com>
Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
---
diff --git a/0043-libhns-Fix-missing-fields-for-SRQ-WC.patch b/0043-libhns-Fix-missing-fields-for-SRQ-WC.patch
new file mode 100644
index 0000000..4058094
--- /dev/null
+++ b/0043-libhns-Fix-missing-fields-for-SRQ-WC.patch
@@ -0,0 +1,82 @@
+From b52618371517527ce8ea4b8f5bd2571c7f69a2ba Mon Sep 17 00:00:00 2001
+From: wenglianfa <wenglianfa(a)huawei.com>
+Date: Wed, 15 Jan 2025 15:55:29 +0800
+Subject: [PATCH] libhns: Fix missing fields for SRQ WC
+
+mainline inclusion
+from mainline-master
+commit 65a7ce99cf4bfd6748346206f546e51c0a82c993
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IBIEA4
+CVE: NA
+Reference: https://github.com/linux-rdma/rdma-core/pull/1543/commits/65a7ce99cf4bfd674…
+
+----------------------------------------------------------------------
+
+The sl and src_qpn fields in recv-WC are not filled when the QP is UD
+and has an SRQ. Here fix it.
+
+In addition, UD QP does not support RQ INLINE and CQE INLINE features.
+Reorder the related if-else statements to reduce the number of
+conditional checks in IO path.
+
+Fixes: 061f7e1757ca ("libhns: Refactor the poll one interface")
+Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/hns_roce_u_hw_v2.c | 13 ++++++++-----
+ 1 file changed, 8 insertions(+), 5 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
+index 0628646..aadea7a 100644
+--- a/providers/hns/hns_roce_u_hw_v2.c
++++ b/providers/hns/hns_roce_u_hw_v2.c
+@@ -519,7 +519,8 @@ static void parse_for_ud_qp(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc)
+ }
+
+ static void parse_cqe_for_srq(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc,
+- struct hns_roce_srq *srq)
++ struct hns_roce_srq *srq,
++ struct hns_roce_qp *hr_qp)
+ {
+ uint32_t wqe_idx;
+
+@@ -529,6 +530,8 @@ static void parse_cqe_for_srq(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc,
+
+ if (hr_reg_read(cqe, CQE_CQE_INLINE))
+ handle_recv_cqe_inl_from_srq(cqe, srq);
++ else if (hr_qp->verbs_qp.qp.qp_type == IBV_QPT_UD)
++ parse_for_ud_qp(cqe, wc);
+ }
+
+ static void parse_cqe_for_resp(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc,
+@@ -540,13 +543,13 @@ static void parse_cqe_for_resp(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc,
+ wc->wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)];
+ ++wq->tail;
+
+- if (hr_qp->verbs_qp.qp.qp_type == IBV_QPT_UD)
+- parse_for_ud_qp(cqe, wc);
+-
+ if (hr_reg_read(cqe, CQE_CQE_INLINE))
+ handle_recv_cqe_inl_from_rq(cqe, hr_qp);
+ else if (hr_reg_read(cqe, CQE_RQ_INLINE))
+ handle_recv_rq_inl(cqe, hr_qp);
++ else if (hr_qp->verbs_qp.qp.qp_type == IBV_QPT_UD)
++ parse_for_ud_qp(cqe, wc);
++
+ }
+
+ static void parse_cqe_for_req(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc,
+@@ -753,7 +756,7 @@ static int parse_cqe_for_cq(struct hns_roce_context *ctx, struct hns_roce_cq *cq
+ return V2_CQ_POLL_ERR;
+
+ if (srq)
+- parse_cqe_for_srq(cqe, wc, srq);
++ parse_cqe_for_srq(cqe, wc, srq, cur_qp);
+ else
+ parse_cqe_for_resp(cqe, wc, cur_qp);
+ }
+--
+2.33.0
+
diff --git a/rdma-core.spec b/rdma-core.spec
index 035faf7..8fd61b8 100644
--- a/rdma-core.spec
+++ b/rdma-core.spec
@@ -1,6 +1,6 @@
Name: rdma-core
Version: 50.0
-Release: 20
+Release: 21
Summary: RDMA core userspace libraries and daemons
License: GPL-2.0-only OR BSD-2-Clause AND BSD-3-Clause
Url: https://github.com/linux-rdma/rdma-core
@@ -48,6 +48,7 @@ patch39: 0039-libhns-Fix-the-exception-branch-of-wr_start-is-not-l.patch
patch40: 0040-libhns-Fix-memory-leakage-when-DCA-is-enabled.patch
patch41: 0041-libhns-Fix-coredump-during-QP-destruction-when-send_.patch
patch42: 0042-libhns-Add-error-logs-to-help-diagnosis.patch
+patch43: 0043-libhns-Fix-missing-fields-for-SRQ-WC.patch
BuildRequires: binutils cmake >= 2.8.11 gcc libudev-devel pkgconfig pkgconfig(libnl-3.0)
BuildRequires: pkgconfig(libnl-route-3.0) systemd systemd-devel
@@ -623,6 +624,12 @@ fi
%doc %{_docdir}/%{name}-%{version}/70-persistent-ipoib.rules
%changelog
+* Fri Jan 17 2025 Xinghai Cen <cenxinghai(a)h-partners.com> - 50.0-21
+- Type: bugfix
+- ID: NA
+- SUG: NA
+- DESC: Fix missing fields for SRQ WC
+
* Wed Jan 08 2025 Funda Wang <fundawang(a)yeah.net> - 50.0-20
- Type: bugfix
- ID: NA
--
2.33.0
From: wenglianfa <wenglianfa(a)huawei.com>
The sl and src_qpn fields in the recv WC are not filled when the QP is UD
and has an SRQ. Fix it here.
In addition, UD QPs do not support the RQ INLINE and CQE INLINE features.
Reorder the related if-else statements to reduce the number of
conditional checks in the IO path.
Fixes: 061f7e1757ca ("libhns: Refactor the poll one interface")
Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
---
providers/hns/hns_roce_u_hw_v2.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
index 1d5c9a6a5..f46337226 100644
--- a/providers/hns/hns_roce_u_hw_v2.c
+++ b/providers/hns/hns_roce_u_hw_v2.c
@@ -510,12 +510,15 @@ static void parse_for_ud_qp(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc)
}
static void parse_cqe_for_srq(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc,
- struct hns_roce_srq *srq)
+ struct hns_roce_srq *srq,
+ struct hns_roce_qp *hr_qp)
{
uint32_t wqe_idx;
if (hr_reg_read(cqe, CQE_CQE_INLINE))
handle_recv_cqe_inl_from_srq(cqe, srq);
+ else if (hr_qp->verbs_qp.qp.qp_type == IBV_QPT_UD)
+ parse_for_ud_qp(cqe, wc);
wqe_idx = hr_reg_read(cqe, CQE_WQE_IDX);
wc->wr_id = srq->wrid[wqe_idx & (srq->wqe_cnt - 1)];
@@ -531,13 +534,13 @@ static void parse_cqe_for_resp(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc,
wc->wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)];
++wq->tail;
- if (hr_qp->verbs_qp.qp.qp_type == IBV_QPT_UD)
- parse_for_ud_qp(cqe, wc);
-
if (hr_reg_read(cqe, CQE_CQE_INLINE))
handle_recv_cqe_inl_from_rq(cqe, hr_qp);
else if (hr_reg_read(cqe, CQE_RQ_INLINE))
handle_recv_rq_inl(cqe, hr_qp);
+ else if (hr_qp->verbs_qp.qp.qp_type == IBV_QPT_UD)
+ parse_for_ud_qp(cqe, wc);
+
}
static void parse_cqe_for_req(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc,
@@ -669,7 +672,7 @@ static int parse_cqe_for_cq(struct hns_roce_context *ctx, struct hns_roce_cq *cq
return V2_CQ_POLL_ERR;
if (srq)
- parse_cqe_for_srq(cqe, wc, srq);
+ parse_cqe_for_srq(cqe, wc, srq, cur_qp);
else
parse_cqe_for_resp(cqe, wc, cur_qp);
}
--
2.33.0
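The reordering argument above can be sketched outside the driver. Since a UD QP never produces an inline CQE, the three completion cases are mutually exclusive, so folding the UD check into the tail of the if/else-if chain never changes which handler runs but evaluates fewer conditions on the inline fast path. A minimal counting model (the flag parameters and counters are illustrative, not the real `hr_reg_read()` interface):

```c
#include <stdbool.h>

enum handler { H_NONE, H_CQE_INL, H_RQ_INL, H_UD };

/* Old layout: a standalone UD check, then the inline if/else-if chain.
 * Every CQE pays for the UD test even when an inline flag is set. */
static enum handler old_parse(bool is_ud, bool cqe_inl, bool rq_inl, int *checks)
{
    enum handler h = H_NONE;

    *checks = 0;
    if ((++*checks, is_ud))
        h = H_UD;
    if ((++*checks, cqe_inl))
        h = H_CQE_INL;          /* never true for UD, so no overwrite in practice */
    else if ((++*checks, rq_inl))
        h = H_RQ_INL;
    return h;
}

/* New layout: UD handling moved into the tail of the chain; inline CQEs
 * now resolve after fewer checks. */
static enum handler new_parse(bool is_ud, bool cqe_inl, bool rq_inl, int *checks)
{
    *checks = 0;
    if ((++*checks, cqe_inl))
        return H_CQE_INL;
    else if ((++*checks, rq_inl))
        return H_RQ_INL;
    else if ((++*checks, is_ud))
        return H_UD;
    return H_NONE;
}
```

For any flag combination that can occur on hardware, both layouts pick the same handler, and the new one never evaluates more conditions.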
[PATCH] Perftest: modify --source_ip to --bind_source_ip to fix init connection establishment with specific interface
by Chengchang Tang 10 Jan '25
From: Guofeng Yue <yueguofeng(a)h-partners.com>
mainline inclusion
commit ba4580a6c4f16ab7791e4d809eda91d586c1f04f
category: bugfix
bugzilla: https://gitee.com/src-openeuler/perftest/issues/IBGXBU
Reference: https://github.com/linux-rdma/perftest/commit/8ff29c1603215c012a886284b7184…
----------------------------
When there are several network interfaces with different subnet addresses,
perftest tools always choose the default route even if I add the
--source_ip option to ask perftest to bind an interface.
I found that there are two options that use the same name "--source_ip".
Therefore, change --source_ip to --bind_source_ip to fix init connection
establishment with a specific interface.
Signed-off-by: Guofeng Yue <yueguofeng(a)h-partners.com>
---
...-to-bind_sounce_ip-to-fix-init-conne.patch | 35 +++++++++++++++++++
perftest.spec | 9 ++++-
2 files changed, 43 insertions(+), 1 deletion(-)
create mode 100644 0015-modify-source_ip-to-bind_sounce_ip-to-fix-init-conne.patch
diff --git a/0015-modify-source_ip-to-bind_sounce_ip-to-fix-init-conne.patch b/0015-modify-source_ip-to-bind_sounce_ip-to-fix-init-conne.patch
new file mode 100644
index 0000000..95e5c41
--- /dev/null
+++ b/0015-modify-source_ip-to-bind_sounce_ip-to-fix-init-conne.patch
@@ -0,0 +1,35 @@
+From ee491fb51e068f1ff3d0277a3a3b33ee0bb38779 Mon Sep 17 00:00:00 2001
+From: "Huai-En, Tseng" <huaien.tseng(a)shopee.com>
+Date: Thu, 18 May 2023 11:31:31 +0800
+Subject: [PATCH] modify --source_ip to --bind_sounce_ip to fix init connection
+ establishment with specific interface
+
+---
+ src/perftest_parameters.c | 4 ++--
+ 1 file changed, 2 insertions(+), 2 deletions(-)
+
+diff --git a/src/perftest_parameters.c b/src/perftest_parameters.c
+index 0a18b2a..73907ca 100755
+--- a/src/perftest_parameters.c
++++ b/src/perftest_parameters.c
+@@ -476,7 +476,7 @@ static void usage(const char *argv0, VerbType verb, TestType tst, int connection
+
+ // please note it is a different source_ip from raw_ethernet case
+ if (connection_type != RawEth) {
+- printf(" --source_ip ");
++ printf(" --bind_source_ip ");
+ printf(" Source IP of the interface used for connection establishment. By default taken from routing table.\n");
+ }
+
+@@ -2386,7 +2386,7 @@ int parser(struct perftest_parameters *user_param,char *argv[], int argc)
+ #ifdef HAVE_HNSDV
+ { .name = "congest_type", .has_arg = 1, .flag = &congest_type_flag, .val = 1},
+ #endif
+- {.name = "source_ip", .has_arg = 1, .flag = &source_ip_flag, .val = 1},
++ {.name = "bind_source_ip", .has_arg = 1, .flag = &source_ip_flag, .val = 1},
+ {0}
+ };
+ c = getopt_long(argc,argv,"w:y:p:d:i:m:s:n:t:u:S:x:c:q:I:o:M:r:Q:A:l:D:f:B:T:L:E:J:j:K:k:X:W:aFegzRvhbNVCHUOZP",long_options,NULL);
+--
+2.33.0
+
diff --git a/perftest.spec b/perftest.spec
index 737c993..6f48fac 100644
--- a/perftest.spec
+++ b/perftest.spec
@@ -1,6 +1,6 @@
Name: perftest
Version: 4.5
-Release: 11
+Release: 12
License: GPLv2 or BSD
Summary: RDMA Performance Testing Tools
Url: https://github.com/linux-rdma/perftest
@@ -20,6 +20,7 @@ Patch11: 0011-Perftest-Fix-rx_depth-check-for-XRC.patch
Patch12: 0012-Perftest-Add-support-for-TD-lock-free-mode.patch
Patch13: 0013-Perftest-Fix-TD-lock-free-mode-not-working-for-QP.patch
Patch14: 0014-Perftest-Fix-failure-in-creating-cq-when-create-cq-e.patch
+Patch15: 0015-modify-source_ip-to-bind_sounce_ip-to-fix-init-conne.patch
BuildRequires: automake gcc libibverbs-devel >= 1.2.0 librdmacm-devel >= 1.0.21 libibumad-devel >= 1.3.10.2
BuildRequires: pciutils-devel libibverbs librdmacm libibumad
@@ -47,6 +48,12 @@ done
%_bindir/*
%changelog
+* Thu Jun 10 2025 Guofeng Yue <yueguofeng(a)h-partners.com> - 4.5-12
+- Type: bugfix
+- ID: NA
+- SUG: NA
+- DESC: modify --source_ip to --bind_source_ip to fix init connection establishment with specific interface
+
* Mon Sep 2 2024 Xinghai Cen <cenxinghai(a)h-partners.com> - 4.5-11
- Type: bugfix
- ID: NA
--
2.33.0
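The root cause is easy to reproduce with getopt_long() alone: glibc scans the long-option table front to back and the first exact name match wins, so the second `source_ip` entry can never be selected. A cut-down sketch (the two-entry table and flag names are illustrative; the real perftest table is far larger):

```c
#include <getopt.h>
#include <stddef.h>

static int raw_eth_flag;  /* set via the raw_ethernet "source_ip" entry */
static int bind_flag;     /* second entry: unreachable while the names collide */

/* Returns 1 if the first table entry matched, 2 if the second, 0 if neither. */
static int which_entry_matched(int argc, char **argv)
{
    struct option long_options[] = {
        { .name = "source_ip", .has_arg = 1, .flag = &raw_eth_flag, .val = 1 },
        { .name = "source_ip", .has_arg = 1, .flag = &bind_flag,    .val = 1 },
        { 0 }
    };

    raw_eth_flag = bind_flag = 0;
    optind = 1;  /* reset getopt_long() state between calls */
    while (getopt_long(argc, argv, "", long_options, NULL) != -1)
        ;
    return raw_eth_flag ? 1 : (bind_flag ? 2 : 0);
}
```

Renaming the second entry to `bind_source_ip`, as the patch does, is what makes it reachable at all.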
From: Xinghai Cen <cenxinghai(a)h-partners.com>
Add error logs to help diagnosis.
Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
---
...hns-Add-error-logs-to-help-diagnosis.patch | 242 ++++++++++++++++++
rdma-core.spec | 9 +-
2 files changed, 250 insertions(+), 1 deletion(-)
create mode 100644 0042-libhns-Add-error-logs-to-help-diagnosis.patch
diff --git a/0042-libhns-Add-error-logs-to-help-diagnosis.patch b/0042-libhns-Add-error-logs-to-help-diagnosis.patch
new file mode 100644
index 0000000..9d880ea
--- /dev/null
+++ b/0042-libhns-Add-error-logs-to-help-diagnosis.patch
@@ -0,0 +1,242 @@
+From 60c45b5f7c2cd0c2e7139d472406f071f327bb91 Mon Sep 17 00:00:00 2001
+From: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Date: Fri, 27 Dec 2024 14:02:29 +0800
+Subject: [PATCH] libhns: Add error logs to help diagnosis
+
+mainline inclusion
+from mainline-master
+commit 7849f1b17f89b8baa0065adaf9cd04204698ea82
+category: feature
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IBFGPH
+CVE: NA
+
+Reference: https://github.com/linux-rdma/rdma-core/pull/1533/commits/7849f1b17f89b8baa…
+
+----------------------------------------------------------------------
+
+Add error logs to help diagnosis.
+
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+---
+ providers/hns/hns_roce_u.c | 4 +-
+ providers/hns/hns_roce_u_hw_v2.c | 3 ++
+ providers/hns/hns_roce_u_verbs.c | 79 ++++++++++++++++++++++++++------
+ 3 files changed, 70 insertions(+), 16 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u.c b/providers/hns/hns_roce_u.c
+index e219b9e..ec995e7 100644
+--- a/providers/hns/hns_roce_u.c
++++ b/providers/hns/hns_roce_u.c
+@@ -424,8 +424,10 @@ static struct verbs_context *hns_roce_alloc_context(struct ibv_device *ibdev,
+
+ context->uar = mmap(NULL, hr_dev->page_size, PROT_READ | PROT_WRITE,
+ MAP_SHARED, cmd_fd, 0);
+- if (context->uar == MAP_FAILED)
++ if (context->uar == MAP_FAILED) {
++ verbs_err(&context->ibv_ctx, "failed to mmap uar page.\n");
+ goto err_set_attr;
++ }
+
+ if (init_dca_context(context, cmd_fd,
+ &resp, ctx_attr, hr_dev->page_size))
+diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
+index c746e03..0628646 100644
+--- a/providers/hns/hns_roce_u_hw_v2.c
++++ b/providers/hns/hns_roce_u_hw_v2.c
+@@ -3057,6 +3057,9 @@ static int fill_send_wr_ops(const struct ibv_qp_init_attr_ex *attr,
+ fill_send_wr_ops_ud(qp_ex);
+ break;
+ default:
++ verbs_err(verbs_get_ctx(qp_ex->qp_base.context),
++ "QP type %d not supported for qp_ex send ops.\n",
++ attr->qp_type);
+ return -EOPNOTSUPP;
+ }
+
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index c733b21..e9acfab 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -422,8 +422,11 @@ static int verify_cq_create_attr(struct ibv_cq_init_attr_ex *attr,
+ {
+ struct hns_roce_pad *pad = to_hr_pad(attr->parent_domain);
+
+- if (!attr->cqe || attr->cqe > context->max_cqe)
++ if (!attr->cqe || attr->cqe > context->max_cqe) {
++ verbs_err(&context->ibv_ctx, "unsupported cq depth %u.\n",
++ attr->cqe);
+ return EINVAL;
++ }
+
+ if (!check_comp_mask(attr->comp_mask, CREATE_CQ_SUPPORTED_COMP_MASK)) {
+ verbs_err(&context->ibv_ctx, "unsupported cq comps 0x%x\n",
+@@ -431,8 +434,11 @@ static int verify_cq_create_attr(struct ibv_cq_init_attr_ex *attr,
+ return EOPNOTSUPP;
+ }
+
+- if (!check_comp_mask(attr->wc_flags, CREATE_CQ_SUPPORTED_WC_FLAGS))
++ if (!check_comp_mask(attr->wc_flags, CREATE_CQ_SUPPORTED_WC_FLAGS)) {
++ verbs_err(&context->ibv_ctx, "unsupported wc flags 0x%llx.\n",
++ attr->wc_flags);
+ return EOPNOTSUPP;
++ }
+
+ if (attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_PD) {
+ if (!pad) {
+@@ -504,8 +510,11 @@ static int exec_cq_create_cmd(struct ibv_context *context,
+ ret = ibv_cmd_create_cq_ex(context, attr, &cq->verbs_cq,
+ &cmd_ex.ibv_cmd, sizeof(cmd_ex),
+ &resp_ex.ibv_resp, sizeof(resp_ex), 0);
+- if (ret)
++ if (ret) {
++ verbs_err(verbs_get_ctx(context),
++ "failed to exec create cq cmd, ret = %d.\n", ret);
+ return ret;
++ }
+
+ cq->cqn = resp_drv->cqn;
+ cq->flags = resp_drv->cap_flags;
+@@ -724,13 +733,20 @@ static int verify_srq_create_attr(struct hns_roce_context *context,
+ struct ibv_srq_init_attr_ex *attr)
+ {
+ if (attr->srq_type != IBV_SRQT_BASIC &&
+- attr->srq_type != IBV_SRQT_XRC)
++ attr->srq_type != IBV_SRQT_XRC) {
++ verbs_err(&context->ibv_ctx,
++ "unsupported srq type, type = %d.\n", attr->srq_type);
+ return -EINVAL;
++ }
+
+ if (!attr->attr.max_sge ||
+ attr->attr.max_wr > context->max_srq_wr ||
+- attr->attr.max_sge > context->max_srq_sge)
++ attr->attr.max_sge > context->max_srq_sge) {
++ verbs_err(&context->ibv_ctx,
++ "invalid srq attr size, max_wr = %u, max_sge = %u.\n",
++ attr->attr.max_wr, attr->attr.max_sge);
+ return -EINVAL;
++ }
+
+ attr->attr.max_wr = max_t(uint32_t, attr->attr.max_wr,
+ HNS_ROCE_MIN_SRQ_WQE_NUM);
+@@ -862,8 +878,12 @@ static int exec_srq_create_cmd(struct ibv_context *context,
+ ret = ibv_cmd_create_srq_ex(context, &srq->verbs_srq, init_attr,
+ &cmd_ex.ibv_cmd, sizeof(cmd_ex),
+ &resp_ex.ibv_resp, sizeof(resp_ex));
+- if (ret)
++ if (ret) {
++ verbs_err(verbs_get_ctx(context),
++ "failed to exec create srq cmd, ret = %d.\n",
++ ret);
+ return ret;
++ }
+
+ srq->srqn = resp_ex.srqn;
+ srq->cap_flags = resp_ex.cap_flags;
+@@ -1086,9 +1106,12 @@ static int check_qp_create_mask(struct hns_roce_context *ctx,
+ struct ibv_qp_init_attr_ex *attr)
+ {
+ struct hns_roce_device *hr_dev = to_hr_dev(ctx->ibv_ctx.context.device);
++ int ret = 0;
+
+- if (!check_comp_mask(attr->comp_mask, CREATE_QP_SUP_COMP_MASK))
+- return EOPNOTSUPP;
++ if (!check_comp_mask(attr->comp_mask, CREATE_QP_SUP_COMP_MASK)) {
++ ret = EOPNOTSUPP;
++ goto out;
++ }
+
+ if (attr->comp_mask & IBV_QP_INIT_ATTR_SEND_OPS_FLAGS &&
+ !check_comp_mask(attr->send_ops_flags, SEND_OPS_FLAG_MASK))
+@@ -1102,17 +1125,21 @@ static int check_qp_create_mask(struct hns_roce_context *ctx,
+ case IBV_QPT_RC:
+ case IBV_QPT_XRC_SEND:
+ if (!(attr->comp_mask & IBV_QP_INIT_ATTR_PD))
+- return EINVAL;
++ ret = EINVAL;
+ break;
+ case IBV_QPT_XRC_RECV:
+ if (!(attr->comp_mask & IBV_QP_INIT_ATTR_XRCD))
+- return EINVAL;
++ ret = EINVAL;
+ break;
+ default:
+ return EOPNOTSUPP;
+ }
+
+- return 0;
++out:
++ if (ret)
++ verbs_err(&ctx->ibv_ctx, "invalid comp_mask 0x%x.\n",
++ attr->comp_mask);
++ return ret;
+ }
+
+ static int hns_roce_qp_has_rq(struct ibv_qp_init_attr_ex *attr)
+@@ -1137,8 +1164,13 @@ static int verify_qp_create_cap(struct hns_roce_context *ctx,
+ if (cap->max_send_wr > ctx->max_qp_wr ||
+ cap->max_recv_wr > ctx->max_qp_wr ||
+ cap->max_send_sge > ctx->max_sge ||
+- cap->max_recv_sge > ctx->max_sge)
++ cap->max_recv_sge > ctx->max_sge) {
++ verbs_err(&ctx->ibv_ctx,
++ "invalid qp cap size, max_send/recv_wr = {%u, %u}, max_send/recv_sge = {%u, %u}.\n",
++ cap->max_send_wr, cap->max_recv_wr,
++ cap->max_send_sge, cap->max_recv_sge);
+ return -EINVAL;
++ }
+
+ has_rq = hns_roce_qp_has_rq(attr);
+ if (!has_rq) {
+@@ -1147,12 +1179,20 @@ static int verify_qp_create_cap(struct hns_roce_context *ctx,
+ }
+
+ min_wqe_num = HNS_ROCE_V2_MIN_WQE_NUM;
+- if (cap->max_send_wr < min_wqe_num)
++ if (cap->max_send_wr < min_wqe_num) {
++ verbs_debug(&ctx->ibv_ctx,
++ "change sq depth from %u to minimum %u.\n",
++ cap->max_send_wr, min_wqe_num);
+ cap->max_send_wr = min_wqe_num;
++ }
+
+ if (cap->max_recv_wr) {
+- if (cap->max_recv_wr < min_wqe_num)
++ if (cap->max_recv_wr < min_wqe_num) {
++ verbs_debug(&ctx->ibv_ctx,
++ "change rq depth from %u to minimum %u.\n",
++ cap->max_recv_wr, min_wqe_num);
+ cap->max_recv_wr = min_wqe_num;
++ }
+
+ if (!cap->max_recv_sge)
+ return -EINVAL;
+@@ -1646,6 +1686,11 @@ static int qp_exec_create_cmd(struct ibv_qp_init_attr_ex *attr,
+ ret = ibv_cmd_create_qp_ex2(&ctx->ibv_ctx.context, &qp->verbs_qp, attr,
+ &cmd_ex.ibv_cmd, sizeof(cmd_ex),
+ &resp_ex.ibv_resp, sizeof(resp_ex));
++ if (ret) {
++ verbs_err(&ctx->ibv_ctx,
++ "failed to exec create qp cmd, ret = %d.\n", ret);
++ return ret;
++ }
+
+ qp->flags = resp_ex.drv_payload.cap_flags;
+ *dwqe_mmap_key = resp_ex.drv_payload.dwqe_mmap_key;
+@@ -1707,8 +1752,12 @@ static int mmap_dwqe(struct ibv_context *ibv_ctx, struct hns_roce_qp *qp,
+ {
+ qp->dwqe_page = mmap(NULL, HNS_ROCE_DWQE_PAGE_SIZE, PROT_WRITE,
+ MAP_SHARED, ibv_ctx->cmd_fd, dwqe_mmap_key);
+- if (qp->dwqe_page == MAP_FAILED)
++ if (qp->dwqe_page == MAP_FAILED) {
++ verbs_err(verbs_get_ctx(ibv_ctx),
++ "failed to mmap direct wqe page, QPN = %u.\n",
++ qp->verbs_qp.qp.qp_num);
+ return -EINVAL;
++ }
+
+ return 0;
+ }
+--
+2.33.0
+
diff --git a/rdma-core.spec b/rdma-core.spec
index 121f2f5..db8e113 100644
--- a/rdma-core.spec
+++ b/rdma-core.spec
@@ -1,6 +1,6 @@
Name: rdma-core
Version: 50.0
-Release: 17
+Release: 18
Summary: RDMA core userspace libraries and daemons
License: GPL-2.0-only OR BSD-2-Clause AND BSD-3-Clause
Url: https://github.com/linux-rdma/rdma-core
@@ -47,6 +47,7 @@ patch38: 0038-libhns-Fix-reference-to-uninitialized-cq-pointer.patch
patch39: 0039-libhns-Fix-the-exception-branch-of-wr_start-is-not-l.patch
patch40: 0040-libhns-Fix-memory-leakage-when-DCA-is-enabled.patch
patch41: 0041-libhns-Fix-coredump-during-QP-destruction-when-send_.patch
+patch42: 0042-libhns-Add-error-logs-to-help-diagnosis.patch
BuildRequires: binutils cmake >= 2.8.11 gcc libudev-devel pkgconfig pkgconfig(libnl-3.0)
BuildRequires: pkgconfig(libnl-route-3.0) systemd systemd-devel
@@ -626,6 +627,12 @@ fi
%doc %{_docdir}/%{name}-%{version}/70-persistent-ipoib.rules
%changelog
+* Fri Jan 3 2025 Xinghai Cen <cenxinghai(a)h-partners.com> - 50.0-18
+- Type: requirement
+- ID: NA
+- SUG: NA
+- DESC: Add error logs to help diagnosis
+
* Thu Nov 28 2024 Xinghai Cen <cenxinghai(a)h-partners.com> - 50.0-17
- Type: bugfix
- ID: NA
--
2.33.0
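The pattern the patch applies is the same at every error exit: don't propagate an error code silently, record the failing step and the reason first. A minimal sketch of the uar-mmap case, where a plain log buffer stands in for rdma-core's `verbs_err()` helper (that substitution is an assumption of this sketch):

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Before: `if (uar == MAP_FAILED) goto err;` left no trace of the failure.
 * After: the error path says what failed and why before unwinding. */
static void *map_uar(int cmd_fd, size_t page_size, char *log, size_t log_sz)
{
    void *uar = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
                     MAP_SHARED, cmd_fd, 0);

    if (uar == MAP_FAILED) {
        /* stand-in for: verbs_err(&context->ibv_ctx, "failed to mmap uar page.\n"); */
        snprintf(log, log_sz, "failed to mmap uar page: %s.", strerror(errno));
        return NULL;
    }
    log[0] = '\0';
    return uar;
}
```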
From: Xinghai Cen <cenxinghai(a)h-partners.com>
Fixed two bugs in libhns:
libhns: Fix bypassed vendor check in hnsdv_query_device()
libhns: Fix coredump during QP destruction when send_cq == recv_cq
Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
---
...sed-vendor-check-in-hnsdv_query_devi.patch | 40 ++++++++++++++
...ump-during-QP-destruction-when-send_.patch | 53 +++++++++++++++++++
rdma-core.spec | 10 +++-
3 files changed, 102 insertions(+), 1 deletion(-)
create mode 100644 0089-libhns-Fix-bypassed-vendor-check-in-hnsdv_query_devi.patch
create mode 100644 0090-libhns-Fix-coredump-during-QP-destruction-when-send_.patch
diff --git a/0089-libhns-Fix-bypassed-vendor-check-in-hnsdv_query_devi.patch b/0089-libhns-Fix-bypassed-vendor-check-in-hnsdv_query_devi.patch
new file mode 100644
index 0000000..14f2edc
--- /dev/null
+++ b/0089-libhns-Fix-bypassed-vendor-check-in-hnsdv_query_devi.patch
@@ -0,0 +1,40 @@
+From 485cddd47c83d6f229450b28d55d8e07f60ddcc0 Mon Sep 17 00:00:00 2001
+From: Yuyu Li <liyuyu6(a)huawei.com>
+Date: Thu, 21 Nov 2024 21:37:15 +0800
+Subject: [PATCH] libhns: Fix bypassed vendor check in hnsdv_query_device()
+
+driver inclusion
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IBF87T
+
+--------------------------------------------------------------------------
+
+The device vendor check is actually bypassed currently due
+to the wrong if-condition. It should be a '||' statement.
+
+Fixes: 19e1eabc154f ("libhns: Add input parameter check for hnsdv_query_device()")
+Signed-off-by: Yuyu Li <liyuyu6(a)huawei.com>
+---
+ providers/hns/hns_roce_u_verbs.c | 4 ++--
+ 1 file changed, 2 insertions(+), 2 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index 090efbf..a6afce2 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -128,10 +128,10 @@ int hnsdv_query_device(struct ibv_context *context,
+ struct hns_roce_context *ctx = context ? to_hr_ctx(context) : NULL;
+ struct hns_roce_device *hr_dev;
+
+- if (!ctx || !attrs_out)
++ if (!ctx || !context->device || !attrs_out)
+ return EINVAL;
+
+- if (!context->device && !is_hns_dev(context->device)) {
++ if (!is_hns_dev(context->device)) {
+ verbs_err(verbs_get_ctx(context), "not a HNS RoCE device!\n");
+ return EOPNOTSUPP;
+ }
+--
+2.33.0
+
diff --git a/0090-libhns-Fix-coredump-during-QP-destruction-when-send_.patch b/0090-libhns-Fix-coredump-during-QP-destruction-when-send_.patch
new file mode 100644
index 0000000..4909bd3
--- /dev/null
+++ b/0090-libhns-Fix-coredump-during-QP-destruction-when-send_.patch
@@ -0,0 +1,53 @@
+From ad5055f9b32ab0915803575385985fb10a29434a Mon Sep 17 00:00:00 2001
+From: Yuyu Li <liyuyu6(a)huawei.com>
+Date: Mon, 25 Nov 2024 15:42:16 +0800
+Subject: [PATCH] libhns: Fix coredump during QP destruction when send_cq
+ == recv_cq
+
+driver inclusion
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IBF87T
+
+--------------------------------------------------------------------------
+
+If the specified send CQ and recv CQ are both
+the same CQ, the QP node in SCQ is not deleted.
+which causes a segfault to occur when recreating
+the QP. Here fix it.
+
+coredump info:
+0x0000ffff8fbc37d4 in list_add_before_
+0x0000ffff8fbc381c in list_add_tail_
+0x0000ffff8fbc9d9c in add_qp_to_cq_list
+0x0000ffff8fbca008 in create_qp
+0x0000ffff8fbca110 in hns_roce_u_create_qp
+0x0000ffff8feae39c in __ibv_create_qp_1_1
+0x0000000000401420 in test_ctrl_path
+
+Fixes: 5bebdb5ba77b ("libhns: Support reporting wc as software mode")
+Signed-off-by: Yuyu Li <liyuyu6(a)huawei.com>
+---
+ providers/hns/hns_roce_u_hw_v2.c | 7 ++++---
+ 1 file changed, 4 insertions(+), 3 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
+index 8f071e1..48a7566 100644
+--- a/providers/hns/hns_roce_u_hw_v2.c
++++ b/providers/hns/hns_roce_u_hw_v2.c
+@@ -2033,9 +2033,10 @@ static int hns_roce_u_v2_destroy_qp(struct ibv_qp *ibqp)
+ list_del(&qp->rcq_node);
+ }
+
+- if (ibqp->send_cq && ibqp->send_cq != ibqp->recv_cq) {
+- __hns_roce_v2_cq_clean(to_hr_cq(ibqp->send_cq), ibqp->qp_num,
+- NULL);
++ if (ibqp->send_cq) {
++ if (ibqp->send_cq != ibqp->recv_cq)
++ __hns_roce_v2_cq_clean(to_hr_cq(ibqp->send_cq), ibqp->qp_num,
++ NULL);
+ list_del(&qp->scq_node);
+ }
+
+--
+2.33.0
+
diff --git a/rdma-core.spec b/rdma-core.spec
index 229810a..5e7be99 100644
--- a/rdma-core.spec
+++ b/rdma-core.spec
@@ -1,6 +1,6 @@
Name: rdma-core
Version: 41.0
-Release: 28
+Release: 29
Summary: RDMA core userspace libraries and daemons
License: GPLv2 or BSD
Url: https://github.com/linux-rdma/rdma-core
@@ -94,6 +94,8 @@ patch85: 0085-libhns-Fix-memory-leakage-when-DCA-is-enabled.patch
patch86: 0086-libhns-Fix-the-exception-branch-of-wr_start-is-not-l.patch
patch87: 0087-libhns-Fix-out-of-order-issue-of-requester-when-sett.patch
patch88: 0088-libhns-Fix-reference-to-uninitialized-cq-pointer.patch
+patch89: 0089-libhns-Fix-bypassed-vendor-check-in-hnsdv_query_devi.patch
+patch90: 0090-libhns-Fix-coredump-during-QP-destruction-when-send_.patch
BuildRequires: binutils cmake >= 2.8.11 gcc libudev-devel pkgconfig pkgconfig(libnl-3.0)
BuildRequires: pkgconfig(libnl-route-3.0) valgrind-devel systemd systemd-devel
@@ -341,6 +343,12 @@ fi
%{_mandir}/*
%changelog
+* Thu Jan 2 2025 Xinghai Cen <cenxinghai(a)h-partners.com> - 41.0-29
+- Type: bugfix
+- ID: NA
+- SUG: NA
+- DESC: Fixed two bugs in libhns
+
* Thu Nov 21 2024 Wentao Hu <huwentao19(a)h-partners.com> - 41.0-28
- Type: bugfix
- ID: NA
--
2.33.0
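The vendor-check bug is pure boolean logic: with `!dev && !is_hns_dev(dev)`, any non-NULL device makes the first operand false and the `&&` short-circuits, so `is_hns_dev()` is never called and every device passes. A sketch with stand-in types (the `fake_device` struct is hypothetical, standing in for `struct ibv_device`):

```c
#include <stdbool.h>
#include <stddef.h>

struct fake_device { bool is_hns; };

static bool is_hns_dev(const struct fake_device *dev)
{
    return dev && dev->is_hns;
}

/* Buggy condition: a non-NULL dev short-circuits the &&, bypassing the
 * vendor check entirely. */
static bool buggy_rejects(const struct fake_device *dev)
{
    return !dev && !is_hns_dev(dev);
}

/* Fixed: NULL-check first (EINVAL in the real code), then an unconditional
 * vendor check (EOPNOTSUPP in the real code). */
static bool fixed_rejects(const struct fake_device *dev)
{
    if (!dev)
        return true;
    return !is_hns_dev(dev);
}
```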
Junxian Huang (3):
MAINTAINERS: Update hns maintainers
README: Update the name of hns kernel module
libhns: Add error logs to help diagnosis
MAINTAINERS | 6 +--
README.md | 2 +-
providers/hns/hns_roce_u.c | 5 +-
providers/hns/hns_roce_u_hw_v2.c | 3 ++
providers/hns/hns_roce_u_verbs.c | 79 ++++++++++++++++++++++++++------
5 files changed, 74 insertions(+), 21 deletions(-)
--
2.33.0
[PATCH] libhns: Fix coredump during QP destruction when send_cq == recv_cq
by Chengchang Tang 28 Nov '24
From: Xinghai Cen <cenxinghai(a)h-partners.com>
driver inclusion
category: feature
bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IB7JZL
------------------------------------------------------------------
If the specified send CQ and recv CQ are the same CQ, the QP node in the
SCQ is not deleted, which causes a segfault when recreating the QP.
Fix it here.
Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
---
...ump-during-QP-destruction-when-send_.patch | 54 +++++++++++++++++++
rdma-core.spec | 9 +++-
2 files changed, 62 insertions(+), 1 deletion(-)
create mode 100644 0041-libhns-Fix-coredump-during-QP-destruction-when-send_.patch
diff --git a/0041-libhns-Fix-coredump-during-QP-destruction-when-send_.patch b/0041-libhns-Fix-coredump-during-QP-destruction-when-send_.patch
new file mode 100644
index 0000000..b3ee84f
--- /dev/null
+++ b/0041-libhns-Fix-coredump-during-QP-destruction-when-send_.patch
@@ -0,0 +1,54 @@
+From 263479c6fb4712528ccae276960ec94fd77afc51 Mon Sep 17 00:00:00 2001
+From: Yuyu Li <liyuyu6(a)huawei.com>
+Date: Mon, 25 Nov 2024 16:13:48 +0800
+Subject: [PATCH] libhns: Fix coredump during QP destruction when send_cq ==
+ recv_cq
+
+driver inclusion
+category: feature
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IB7JZL
+
+------------------------------------------------------------------
+
+If the specified send CQ and recv CQ are both
+the same CQ, the QP node in SCQ is not deleted.
+which causes a segfault to occur when recreating
+the QP. Here fix it.
+
+coredump info:
+0x0000ffff8fbc37d4 in list_add_before_
+0x0000ffff8fbc381c in list_add_tail_
+0x0000ffff8fbc9d9c in add_qp_to_cq_list
+0x0000ffff8fbca008 in create_qp
+0x0000ffff8fbca110 in hns_roce_u_create_qp
+0x0000ffff8feae39c in __ibv_create_qp_1_1
+0x0000000000401420 in test_ctrl_path
+
+Fixes: 95e05809d2d2 ("[BigDipperV3R9,NeZha][ROCE] libhns: Support reporting wc as software mode")
+Signed-off-by: Yuyu Li <liyuyu6(a)huawei.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/hns_roce_u_hw_v2.c | 7 ++++---
+ 1 file changed, 4 insertions(+), 3 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
+index e4232ea..c746e03 100644
+--- a/providers/hns/hns_roce_u_hw_v2.c
++++ b/providers/hns/hns_roce_u_hw_v2.c
+@@ -2006,9 +2006,10 @@ static int hns_roce_u_v2_destroy_qp(struct ibv_qp *ibqp)
+ list_del(&qp->rcq_node);
+ }
+
+- if (ibqp->send_cq && ibqp->send_cq != ibqp->recv_cq) {
+- __hns_roce_v2_cq_clean(to_hr_cq(ibqp->send_cq), ibqp->qp_num,
+- NULL);
++ if (ibqp->send_cq) {
++ if (ibqp->send_cq != ibqp->recv_cq)
++ __hns_roce_v2_cq_clean(to_hr_cq(ibqp->send_cq), ibqp->qp_num,
++ NULL);
+ list_del(&qp->scq_node);
+ }
+
+--
+2.33.0
+
diff --git a/rdma-core.spec b/rdma-core.spec
index fb2c277..121f2f5 100644
--- a/rdma-core.spec
+++ b/rdma-core.spec
@@ -1,6 +1,6 @@
Name: rdma-core
Version: 50.0
-Release: 16
+Release: 17
Summary: RDMA core userspace libraries and daemons
License: GPL-2.0-only OR BSD-2-Clause AND BSD-3-Clause
Url: https://github.com/linux-rdma/rdma-core
@@ -46,6 +46,7 @@ patch37: 0037-libhns-Fix-out-of-order-issue-of-requester-when-sett.patch
patch38: 0038-libhns-Fix-reference-to-uninitialized-cq-pointer.patch
patch39: 0039-libhns-Fix-the-exception-branch-of-wr_start-is-not-l.patch
patch40: 0040-libhns-Fix-memory-leakage-when-DCA-is-enabled.patch
+patch41: 0041-libhns-Fix-coredump-during-QP-destruction-when-send_.patch
BuildRequires: binutils cmake >= 2.8.11 gcc libudev-devel pkgconfig pkgconfig(libnl-3.0)
BuildRequires: pkgconfig(libnl-route-3.0) systemd systemd-devel
@@ -625,6 +626,12 @@ fi
%doc %{_docdir}/%{name}-%{version}/70-persistent-ipoib.rules
%changelog
+* Thu Nov 28 2024 Xinghai Cen <cenxinghai(a)h-partners.com> - 50.0-17
+- Type: bugfix
+- ID: NA
+- SUG: NA
+- DESC: Fix coredump during QP destruction when send_cq == recv_cq
+
* Mon Nov 25 2024 Xinghai Cen <cenxinghai(a)h-partners.com> - 50.0-16
- Type: bugfix
- ID: NA
--
2.33.0
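Why the skipped unlink segfaults later: the QP embeds intrusive list nodes in its CQs' lists, and a node left linked keeps pointing into QP memory that is freed and reused, so the next `add_qp_to_cq_list()` walks a corrupted list. A minimal intrusive-list model of the buggy vs. fixed teardown (all names illustrative, in the spirit of the ccan list rdma-core uses):

```c
#include <stdbool.h>
#include <stddef.h>

struct node { struct node *prev, *next; };

static void list_init(struct node *head) { head->prev = head->next = head; }

static void list_add_tail(struct node *head, struct node *n)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static void list_del(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

static int list_len(const struct node *head)
{
    int len = 0;
    for (const struct node *n = head->next; n != head; n = n->next)
        len++;
    return len;
}

/* A QP embeds one node per CQ list, mirroring scq_node/rcq_node in libhns. */
struct fake_qp { struct node scq_node, rcq_node; };

/* With send_cq == recv_cq, the buggy teardown skipped the whole send-CQ
 * branch, leaving scq_node linked into the CQ after the QP is gone. */
static void destroy_qp(struct fake_qp *qp, bool fixed)
{
    list_del(&qp->rcq_node);
    if (fixed)
        list_del(&qp->scq_node);  /* the fix: always unlink from the SCQ list */
}
```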
[PATCH OLK-6.6] RDMA/hns: Fix bonding failure due to wrong reset_state
by Chengchang Tang 22 Nov '24
From: Junxian Huang <huangjunxian6(a)hisilicon.com>
driver inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/IB3K00
----------------------------------------------------------------------
When the roce driver is removed during reset, the reset flow of roce may
not be fully completed. This may leave the reset_state of the roce
handler stored in the nic driver in a middle state, such as
HNS_ROCE_STATE_RST_DOWN or HNS_ROCE_STATE_RST_UNINIT.
The reset_state won't be cleared even if the roce driver is re-inited.
This causes roce bonding, which currently relies on reset_state, to fail
in this case.
Replace the reset detection for bonding with nic APIs (.ae_dev_resetting()
and .get_hw_reset_stat()), just like the reset detection elsewhere in the
roce driver.
Fixes: 26d71e7c13cf ("RDMA/hns: Fix the concurrency error between bond and reset.")
Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
---
drivers/infiniband/hw/hns/hns_roce_bond.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_bond.c b/drivers/infiniband/hw/hns/hns_roce_bond.c
index 7adae8990acd..f76667335189 100644
--- a/drivers/infiniband/hw/hns/hns_roce_bond.c
+++ b/drivers/infiniband/hw/hns/hns_roce_bond.c
@@ -543,6 +543,7 @@ static void hns_roce_do_bond(struct hns_roce_bond_group *bond_grp)
bool is_bond_slave_in_reset(struct hns_roce_bond_group *bond_grp)
{
+ const struct hnae3_ae_ops *ops;
struct hnae3_handle *handle;
struct net_device *net_dev;
int i;
@@ -550,9 +551,11 @@ bool is_bond_slave_in_reset(struct hns_roce_bond_group *bond_grp)
for (i = 0; i < ROCE_BOND_FUNC_MAX; i++) {
net_dev = bond_grp->bond_func_info[i].net_dev;
handle = bond_grp->bond_func_info[i].handle;
- if (net_dev && handle &&
- handle->rinfo.reset_state != HNS_ROCE_STATE_NON_RST &&
- handle->rinfo.reset_state != HNS_ROCE_STATE_RST_INITED)
+ if (!net_dev || !handle)
+ continue;
+ ops = handle->ae_algo->ops;
+ if (ops->ae_dev_resetting(handle) ||
+ ops->get_hw_reset_stat(handle))
return true;
}
--
2.33.0
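The patch above stops trusting a cached reset_state and instead asks the nic driver directly through its ops callbacks. The pattern can be modeled in plain userspace C; all structures and names below (`ops`, `func_info`, `bond_slave_in_reset`, the stub callbacks) are illustrative stand-ins for the real hnae3 types, not the kernel code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for the hnae3 handle/ops relationship. */
struct ops {
	bool (*ae_dev_resetting)(void *handle);
	bool (*get_hw_reset_stat)(void *handle);
};

struct func_info {
	void *net_dev;
	void *handle;
	const struct ops *ops;
};

#define FUNC_MAX 4

/* Stub callbacks standing in for the real nic driver queries. */
static bool always_false(void *h) { (void)h; return false; }
static bool always_true(void *h)  { (void)h; return true; }

static const struct ops idle_ops  = { always_false, always_false };
static const struct ops reset_ops = { always_true,  always_false };

/* Return true if any bonded function is currently resetting.
 * Mirrors the patched loop: skip empty slots, then query live
 * state instead of reading a possibly stale cached reset_state. */
static bool bond_slave_in_reset(const struct func_info info[FUNC_MAX])
{
	for (int i = 0; i < FUNC_MAX; i++) {
		if (!info[i].net_dev || !info[i].handle)
			continue;
		if (info[i].ops->ae_dev_resetting(info[i].handle) ||
		    info[i].ops->get_hw_reset_stat(info[i].handle))
			return true;
	}
	return false;
}
```

The key design point is the same as in the patch: the answer comes from the callbacks at call time, so a handle that was torn down and re-inited with stale state cannot produce a false positive.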
From: Xinghai Cen <cenxinghai(a)h-partners.com>
Fixes several bugs for libhns:
libhns: Fix memory leakage when DCA is enabled
libhns: Fix the exception branch of wr_start() is not locked
Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
---
...xception-branch-of-wr_start-is-not-l.patch | 42 +++++++++++++++++++
...x-memory-leakage-when-DCA-is-enabled.patch | 41 ++++++++++++++++++
rdma-core.spec | 10 ++++-
3 files changed, 92 insertions(+), 1 deletion(-)
create mode 100644 0039-libhns-Fix-the-exception-branch-of-wr_start-is-not-l.patch
create mode 100644 0040-libhns-Fix-memory-leakage-when-DCA-is-enabled.patch
diff --git a/0039-libhns-Fix-the-exception-branch-of-wr_start-is-not-l.patch b/0039-libhns-Fix-the-exception-branch-of-wr_start-is-not-l.patch
new file mode 100644
index 0000000..a5fc588
--- /dev/null
+++ b/0039-libhns-Fix-the-exception-branch-of-wr_start-is-not-l.patch
@@ -0,0 +1,42 @@
+From 7acbe6f79bbfe32f207800173e5b2b0c13ef23d9 Mon Sep 17 00:00:00 2001
+From: wenglianfa <wenglianfa(a)huawei.com>
+Date: Wed, 12 Jun 2024 17:11:13 +0800
+Subject: [PATCH 39/40] libhns: Fix the exception branch of wr_start() is not
+ locked
+
+driver inclusion
+category: feature
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IB66RT
+
+------------------------------------------------------------------
+
+The provider should provide locking to ensure that ibv_wr_start()
+and ibv_wr_complete()/abort() form a per-QP critical section
+where no other threads can enter.
+
+The exception branch of wr_start() is not locked; fix it here.
+Since check_qp_send() does not require lock protection,
+hns_roce_spin_lock() is placed after check_qp_send().
+
+Fixes: 36446a56eea5 ("libhns: Extended QP supports the new post send mechanism")
+Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/hns_roce_u_hw_v2.c | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
+index 465ef1e..e4232ea 100644
+--- a/providers/hns/hns_roce_u_hw_v2.c
++++ b/providers/hns/hns_roce_u_hw_v2.c
+@@ -2930,6 +2930,7 @@ static void wr_start(struct ibv_qp_ex *ibv_qp)
+
+ ret = check_qp_send(qp, ctx);
+ if (ret) {
++ hns_roce_spin_lock(&qp->sq.hr_lock);
+ qp->err = ret;
+ return;
+ }
+--
+2.33.0
+
diff --git a/0040-libhns-Fix-memory-leakage-when-DCA-is-enabled.patch b/0040-libhns-Fix-memory-leakage-when-DCA-is-enabled.patch
new file mode 100644
index 0000000..6f5a9c6
--- /dev/null
+++ b/0040-libhns-Fix-memory-leakage-when-DCA-is-enabled.patch
@@ -0,0 +1,41 @@
+From 1f5569fa17fc65391b504ca8118c3089e598dbc1 Mon Sep 17 00:00:00 2001
+From: wenglianfa <wenglianfa(a)huawei.com>
+Date: Thu, 25 Jul 2024 11:06:01 +0800
+Subject: [PATCH 40/40] libhns: Fix memory leakage when DCA is enabled
+
+driver inclusion
+category: feature
+bugzilla: https://gitee.com/src-openeuler/rdma-core/issues/IB66RT
+
+------------------------------------------------------------------
+
+After DCA is enabled and a QP is created, the memory block
+allocated for DCA is not freed when the QP is destroyed. Fix
+it here.
+
+Fixes: 41e39ab792c8 ("[BigDipperV3R9,NeZha][ROCE] libhns: Add support for attaching QP's WQE buffer")
+Signed-off-by: wenglianfa <wenglianfa(a)huawei.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/hns_roce_u_verbs.c | 5 ++++-
+ 1 file changed, 4 insertions(+), 1 deletion(-)
+
+diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
+index e30880c..154e800 100644
+--- a/providers/hns/hns_roce_u_verbs.c
++++ b/providers/hns/hns_roce_u_verbs.c
+@@ -1357,7 +1357,10 @@ static void qp_free_wqe(struct hns_roce_qp *qp)
+
+ if (qp->rq.wqe_cnt)
+ free(qp->rq.wrid);
+- hns_roce_free_buf(&qp->buf);
++ if (qp->dca_wqe.bufs)
++ free(qp->dca_wqe.bufs);
++ else
++ hns_roce_free_buf(&qp->buf);
+ }
+
+ static int qp_alloc_wqe(struct ibv_qp_init_attr_ex *attr,
+--
+2.33.0
+
diff --git a/rdma-core.spec b/rdma-core.spec
index fd956d5..2f4ebe5 100644
--- a/rdma-core.spec
+++ b/rdma-core.spec
@@ -1,6 +1,6 @@
Name: rdma-core
Version: 50.0
-Release: 15
+Release: 16
Summary: RDMA core userspace libraries and daemons
License: GPL-2.0-only OR BSD-2-Clause AND BSD-3-Clause
Url: https://github.com/linux-rdma/rdma-core
@@ -44,6 +44,8 @@ patch35: 0035-Fix-the-stride-calculation-for-MSN-PSN-area.patch
patch36: 0036-add-ZTE-Dinghai-rdma-driver.patch
patch37: 0037-libhns-Fix-out-of-order-issue-of-requester-when-sett.patch
patch38: 0038-libhns-Fix-reference-to-uninitialized-cq-pointer.patch
+patch39: 0039-libhns-Fix-the-exception-branch-of-wr_start-is-not-l.patch
+patch40: 0040-libhns-Fix-memory-leakage-when-DCA-is-enabled.patch
BuildRequires: binutils cmake >= 2.8.11 gcc libudev-devel pkgconfig pkgconfig(libnl-3.0)
BuildRequires: pkgconfig(libnl-route-3.0) systemd systemd-devel
@@ -623,6 +625,12 @@ fi
%doc %{_docdir}/%{name}-%{version}/70-persistent-ipoib.rules
%changelog
+* Thu Nov 21 2024 Xinghai Cen <cenxinghai(a)h-partners.com> - 50.0-16
+- Type: bugfix
+- ID: NA
+- SUG: NA
+- DESC: Fixes several bugs for libhns
+
* Fri Nov 15 2024 Xinghai Cen <cenxinghai(a)h-partners.com> - 50.0-15
- Type: bugfix
- ID: NA
--
2.33.0
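Patch 0039 above hinges on the verbs contract that ibv_wr_start() and ibv_wr_complete()/ibv_wr_abort() bracket a per-QP critical section, so even the early-error branch of wr_start() must leave the QP locked, because the matching complete/abort call always unlocks. A minimal model of that pairing, using a plain flag in place of the real spinlock (all names here are illustrative, not the libhns symbols):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-QP lock: held from wr_start() to wr_complete()/wr_abort(). */
struct toy_qp {
	bool locked;
	int err;
};

static int check_send_ok(bool qp_ok)
{
	return qp_ok ? 0 : -1;		/* stand-in for check_qp_send() */
}

static void wr_start(struct toy_qp *qp, bool qp_ok)
{
	int ret = check_send_ok(qp_ok);	/* the check itself needs no lock */

	qp->locked = true;		/* hns_roce_spin_lock() */
	if (ret)
		qp->err = ret;		/* error branch must stay locked too */
}

static int wr_complete(struct toy_qp *qp)
{
	int ret = qp->err;

	qp->locked = false;		/* hns_roce_spin_unlock() */
	return ret;
}
```

Before the fix, the error branch returned without taking the lock, so the unconditional unlock in wr_complete()/wr_abort() was unbalanced.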
From: Xinghai Cen <cenxinghai(a)h-partners.com>
Some patches for RDMA/hns
Chengchang Tang (6):
RDMA/hns: Fix HW UAF when destroy context timeout
RDMA/hns: Fix integer overflow in calc_loading_percent()
RDMA/hns: Fix possible RAS when DCA is not attached
RDMA/hns: Fix a meaningless loop in active_dca_pages_proc()
RDMA/hns: Fix list_*_careful() not being used in pairs
RDMA/hns: Use one CQ bank per context
Junxian Huang (5):
RDMA/hns: Fix wrong output of sysfs scc param when configuration failed
RDMA/hns: Fix concurrency between sysfs store and FW configuration of
scc params
RDMA/hns: Fix mixed use of u32 and __le32 in sysfs
RDMA/hns: Fix dereference of noderef expression
RDMA/hns: Fix "Should it be static?" warnings
wenglianfa (2):
RDMA/hns: Fix the modification of max_send_sge
RDMA/hns: Fix RoCEE hang when multiple QP banks use EXT_SGE
drivers/infiniband/hw/hns/hns_roce_bond.c | 8 ++-
drivers/infiniband/hw/hns/hns_roce_cq.c | 73 +++++++++++++++++--
drivers/infiniband/hw/hns/hns_roce_dca.c | 60 ++++++++++++++--
drivers/infiniband/hw/hns/hns_roce_dca.h | 2 +
drivers/infiniband/hw/hns/hns_roce_debugfs.c | 10 ++-
drivers/infiniband/hw/hns/hns_roce_device.h | 17 ++++-
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 26 +++++--
drivers/infiniband/hw/hns/hns_roce_main.c | 20 ++++++
drivers/infiniband/hw/hns/hns_roce_mr.c | 6 +-
drivers/infiniband/hw/hns/hns_roce_qp.c | 74 ++++++++++++++++----
drivers/infiniband/hw/hns/hns_roce_srq.c | 6 +-
drivers/infiniband/hw/hns/hns_roce_sysfs.c | 41 +++++++----
12 files changed, 288 insertions(+), 55 deletions(-)
--
2.33.0
[PATCH OLK-6.6] RDMA/hns: Fix different dgids mapping to the same dip_idx
by Chengchang Tang 20 Nov '24
From: Feng Fang <fangfeng4(a)huawei.com>
maillist inclusion
commit faa62440a5772b40bb7d78bf9e29556a82ecf153
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IB4OOG
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=fa…
---------------------------------------------------------------------
DIP algorithm requires a one-to-one mapping between dgid and dip_idx.
Currently a queue 'spare_idx' is used to store QPN of QPs that use
DIP algorithm. For a new dgid, use a QPN from spare_idx as dip_idx.
This method lacks a mechanism for deduplicating QPN, which may result
in different dgids sharing the same dip_idx and break the one-to-one
mapping requirement.
This patch replaces spare_idx with an xarray and introduces a refcnt for
each dip_idx to indicate the number of QPs that are using this dip_idx.
The state machine for dip_idx management is implemented as:
* The entry at an index in xarray is empty -- This indicates that the
corresponding dip_idx hasn't been created.
* The entry at an index in xarray is not empty but with 0 refcnt --
This indicates that the corresponding dip_idx has been created but
not used as dip_idx yet.
* The entry at an index in xarray is not empty and with non-0 refcnt --
This indicates that the corresponding dip_idx is being used by refcnt
number of DIP QPs.
Fixes: eb653eda1e91 ("RDMA/hns: Bugfix for incorrect association between dip_idx and dgid")
Fixes: f91696f2f053 ("RDMA/hns: Support congestion control type selection according to the FW")
Signed-off-by: Feng Fang <fangfeng4(a)huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
Link: https://patch.msgid.link/20241112055553.3681129-1-huangjunxian6@hisilicon.c…
Signed-off-by: Leon Romanovsky <leon(a)kernel.org>
Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
---
drivers/infiniband/hw/hns/hns_roce_device.h | 11 +--
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 96 +++++++++++++++------
drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 2 +-
drivers/infiniband/hw/hns/hns_roce_main.c | 2 -
drivers/infiniband/hw/hns/hns_roce_qp.c | 8 +-
5 files changed, 75 insertions(+), 44 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index fdc1fe5e6a81..e3303cc3584a 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -570,12 +570,6 @@ struct hns_roce_bank {
u32 next; /* Next ID to allocate. */
};
-struct hns_roce_idx_table {
- u32 *spare_idx;
- u32 head;
- u32 tail;
-};
-
struct hns_roce_qp_table {
struct hns_roce_hem_table qp_table;
struct hns_roce_hem_table irrl_table;
@@ -584,7 +578,7 @@ struct hns_roce_qp_table {
struct mutex scc_mutex;
struct hns_roce_bank bank[HNS_ROCE_QP_BANK_NUM];
struct mutex bank_mutex;
- struct hns_roce_idx_table idx_table;
+ struct xarray dip_xa;
};
struct hns_roce_cq_table {
@@ -742,6 +736,7 @@ struct hns_roce_qp {
bool delayed_destroy_flag;
struct hns_roce_mtr_node *mtr_node;
spinlock_t flush_lock;
+ struct hns_roce_dip *dip;
};
struct hns_roce_ib_iboe {
@@ -1102,8 +1097,6 @@ struct hns_roce_dev {
enum hns_roce_device_state state;
struct list_head qp_list; /* list of all qps on this dev */
spinlock_t qp_list_lock; /* protect qp_list */
- struct list_head dip_list; /* list of all dest ips on this dev */
- spinlock_t dip_list_lock; /* protect dip_list */
struct list_head pgdir_list;
struct mutex pgdir_mutex;
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 3f254ac48b42..88bd75c5743e 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -2720,20 +2720,19 @@ static void hns_roce_free_link_table(struct hns_roce_dev *hr_dev)
free_link_table_buf(hr_dev, &priv->ext_llm);
}
-static void free_dip_list(struct hns_roce_dev *hr_dev)
+static void free_dip_entry(struct hns_roce_dev *hr_dev)
{
struct hns_roce_dip *hr_dip;
- struct hns_roce_dip *tmp;
- unsigned long flags;
+ unsigned long idx;
- spin_lock_irqsave(&hr_dev->dip_list_lock, flags);
+ xa_lock(&hr_dev->qp_table.dip_xa);
- list_for_each_entry_safe(hr_dip, tmp, &hr_dev->dip_list, node) {
- list_del(&hr_dip->node);
+ xa_for_each(&hr_dev->qp_table.dip_xa, idx, hr_dip) {
+ __xa_erase(&hr_dev->qp_table.dip_xa, hr_dip->dip_idx);
kfree(hr_dip);
}
- spin_unlock_irqrestore(&hr_dev->dip_list_lock, flags);
+ xa_unlock(&hr_dev->qp_table.dip_xa);
}
static int hns_roce_v2_get_reset_page(struct hns_roce_dev *hr_dev)
@@ -3182,7 +3181,7 @@ static void hns_roce_v2_exit(struct hns_roce_dev *hr_dev)
hns_roce_v2_put_reset_page(hr_dev);
if (hr_dev->pci_dev->revision == PCI_REVISION_ID_HIP09)
- free_dip_list(hr_dev);
+ free_dip_entry(hr_dev);
}
static inline void mbox_desc_init(struct hns_roce_post_mbox *mb,
@@ -5042,26 +5041,49 @@ static int modify_qp_rtr_to_rts(struct ib_qp *ibqp, int attr_mask,
return 0;
}
+static int alloc_dip_entry(struct xarray *dip_xa, u32 qpn)
+{
+ struct hns_roce_dip *hr_dip;
+ int ret;
+
+ hr_dip = xa_load(dip_xa, qpn);
+ if (hr_dip)
+ return 0;
+
+ hr_dip = kzalloc(sizeof(*hr_dip), GFP_KERNEL);
+ if (!hr_dip)
+ return -ENOMEM;
+
+ ret = xa_err(xa_store(dip_xa, qpn, hr_dip, GFP_KERNEL));
+ if (ret)
+ kfree(hr_dip);
+
+ return ret;
+}
+
static int get_dip_ctx_idx(struct ib_qp *ibqp, const struct ib_qp_attr *attr,
u32 *dip_idx)
{
const struct ib_global_route *grh = rdma_ah_read_grh(&attr->ah_attr);
struct hns_roce_dev *hr_dev = to_hr_dev(ibqp->device);
- u32 *spare_idx = hr_dev->qp_table.idx_table.spare_idx;
- u32 *head = &hr_dev->qp_table.idx_table.head;
- u32 *tail = &hr_dev->qp_table.idx_table.tail;
+ struct xarray *dip_xa = &hr_dev->qp_table.dip_xa;
+ struct hns_roce_qp *hr_qp = to_hr_qp(ibqp);
struct hns_roce_dip *hr_dip;
- unsigned long flags;
+ unsigned long idx;
int ret = 0;
- spin_lock_irqsave(&hr_dev->dip_list_lock, flags);
+ ret = alloc_dip_entry(dip_xa, ibqp->qp_num);
+ if (ret)
+ return ret;
- spare_idx[*tail] = ibqp->qp_num;
- *tail = (*tail == hr_dev->caps.num_qps - 1) ? 0 : (*tail + 1);
+ xa_lock(dip_xa);
- list_for_each_entry(hr_dip, &hr_dev->dip_list, node) {
- if (!memcmp(grh->dgid.raw, hr_dip->dgid, GID_LEN_V2)) {
+ xa_for_each(dip_xa, idx, hr_dip) {
+ if (hr_dip->qp_cnt &&
+ !memcmp(grh->dgid.raw, hr_dip->dgid, GID_LEN_V2)) {
*dip_idx = hr_dip->dip_idx;
+ hr_dip->qp_cnt++;
+ hr_qp->dip = hr_dip;
goto out;
}
}
@@ -5069,19 +5091,24 @@ static int get_dip_ctx_idx(struct ib_qp *ibqp, const struct ib_qp_attr *attr,
/* If no dgid is found, a new dip and a mapping between dgid and
* dip_idx will be created.
*/
- hr_dip = kzalloc(sizeof(*hr_dip), GFP_ATOMIC);
- if (!hr_dip) {
- ret = -ENOMEM;
- goto out;
+ xa_for_each(dip_xa, idx, hr_dip) {
+ if (hr_dip->qp_cnt)
+ continue;
+
+ *dip_idx = idx;
+ memcpy(hr_dip->dgid, grh->dgid.raw, sizeof(grh->dgid.raw));
+ hr_dip->dip_idx = idx;
+ hr_dip->qp_cnt++;
+ hr_qp->dip = hr_dip;
+ break;
}
- memcpy(hr_dip->dgid, grh->dgid.raw, sizeof(grh->dgid.raw));
- hr_dip->dip_idx = *dip_idx = spare_idx[*head];
- *head = (*head == hr_dev->caps.num_qps - 1) ? 0 : (*head + 1);
- list_add_tail(&hr_dip->node, &hr_dev->dip_list);
+ /* This should never happen. */
+ if (WARN_ON_ONCE(!hr_qp->dip))
+ ret = -ENOSPC;
out:
- spin_unlock_irqrestore(&hr_dev->dip_list_lock, flags);
+ xa_unlock(dip_xa);
return ret;
}
@@ -6005,6 +6032,20 @@ static int hns_roce_v2_destroy_qp_common(struct hns_roce_dev *hr_dev,
return ret;
}
+static void put_dip_ctx_idx(struct hns_roce_dev *hr_dev,
+ struct hns_roce_qp *hr_qp)
+{
+ struct hns_roce_dip *hr_dip = hr_qp->dip;
+
+ xa_lock(&hr_dev->qp_table.dip_xa);
+
+ hr_dip->qp_cnt--;
+ if (!hr_dip->qp_cnt)
+ memset(hr_dip->dgid, 0, GID_LEN_V2);
+
+ xa_unlock(&hr_dev->qp_table.dip_xa);
+}
+
int hns_roce_v2_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
{
struct hns_roce_dev *hr_dev = to_hr_dev(ibqp->device);
@@ -6018,6 +6059,9 @@ int hns_roce_v2_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
spin_unlock_irqrestore(&hr_qp->flush_lock, flags);
flush_work(&hr_qp->flush_work.work);
+ if (hr_qp->cong_type == CONG_TYPE_DIP)
+ put_dip_ctx_idx(hr_dev, hr_qp);
+
ret = hns_roce_v2_destroy_qp_common(hr_dev, hr_qp, udata);
if (ret)
ibdev_err_ratelimited(&hr_dev->ib_dev,
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
index 201902dff611..f43444be0308 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
@@ -1366,7 +1366,7 @@ struct hns_roce_v2_priv {
struct hns_roce_dip {
u8 dgid[GID_LEN_V2];
u32 dip_idx;
- struct list_head node; /* all dips are on a list */
+ u32 qp_cnt;
};
struct fmea_ram_ecc {
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index d488c3d5986f..520112bd43d5 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -1250,8 +1250,6 @@ static int hns_roce_setup_hca(struct hns_roce_dev *hr_dev)
INIT_LIST_HEAD(&hr_dev->qp_list);
spin_lock_init(&hr_dev->qp_list_lock);
- INIT_LIST_HEAD(&hr_dev->dip_list);
- spin_lock_init(&hr_dev->dip_list_lock);
INIT_LIST_HEAD(&hr_dev->uctx_list);
mutex_init(&hr_dev->uctx_list_mutex);
diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c
index 5ed2647567aa..77ec0c8678d3 100644
--- a/drivers/infiniband/hw/hns/hns_roce_qp.c
+++ b/drivers/infiniband/hw/hns/hns_roce_qp.c
@@ -1701,14 +1701,10 @@ int hns_roce_init_qp_table(struct hns_roce_dev *hr_dev)
unsigned int reserved_from_bot;
unsigned int i;
- qp_table->idx_table.spare_idx = kcalloc(hr_dev->caps.num_qps,
- sizeof(u32), GFP_KERNEL);
- if (!qp_table->idx_table.spare_idx)
- return -ENOMEM;
-
mutex_init(&qp_table->scc_mutex);
mutex_init(&qp_table->bank_mutex);
xa_init(&hr_dev->qp_table_xa);
+ xa_init(&qp_table->dip_xa);
reserved_from_bot = hr_dev->caps.reserved_qps;
@@ -1733,7 +1729,7 @@ void hns_roce_cleanup_qp_table(struct hns_roce_dev *hr_dev)
for (i = 0; i < HNS_ROCE_QP_BANK_NUM; i++)
ida_destroy(&hr_dev->qp_table.bank[i].ida);
+ xa_destroy(&hr_dev->qp_table.dip_xa);
mutex_destroy(&hr_dev->qp_table.bank_mutex);
mutex_destroy(&hr_dev->qp_table.scc_mutex);
- kfree(hr_dev->qp_table.idx_table.spare_idx);
}
--
2.33.0
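The dgid-to-dip_idx state machine described in the commit log (entry empty / allocated with refcnt 0 / in use with refcnt > 0) can be sketched in plain C with a fixed table standing in for the kernel xarray. Everything below is an illustrative model, not the hns driver code:

```c
#include <assert.h>
#include <string.h>

#define GID_LEN		16
#define TABLE_SZ	8

/* One slot per possible dip_idx: free (qp_cnt == 0, dgid zeroed)
 * or bound to a dgid with qp_cnt users. */
struct dip_entry {
	unsigned char dgid[GID_LEN];
	unsigned int qp_cnt;
};

/* Find-or-assign a dip_idx for dgid; returns the index, or -1 when full.
 * Mirrors get_dip_ctx_idx(): reuse an in-use entry with a matching dgid,
 * otherwise claim any entry whose refcnt is 0. */
static int dip_get(struct dip_entry tbl[TABLE_SZ], const unsigned char *dgid)
{
	for (int i = 0; i < TABLE_SZ; i++)
		if (tbl[i].qp_cnt && !memcmp(tbl[i].dgid, dgid, GID_LEN)) {
			tbl[i].qp_cnt++;
			return i;
		}
	for (int i = 0; i < TABLE_SZ; i++)
		if (!tbl[i].qp_cnt) {
			memcpy(tbl[i].dgid, dgid, GID_LEN);
			tbl[i].qp_cnt = 1;
			return i;
		}
	return -1;	/* no free dip_idx */
}

/* Drop one user; clearing the dgid at refcnt 0 makes the slot
 * reusable, like put_dip_ctx_idx() in the patch. */
static void dip_put(struct dip_entry tbl[TABLE_SZ], int idx)
{
	if (--tbl[idx].qp_cnt == 0)
		memset(tbl[idx].dgid, 0, GID_LEN);
}
```

Because a slot is claimed only when its refcnt is 0, two different dgids can never end up sharing one dip_idx, which is exactly the dedup guarantee the old spare_idx queue lacked.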
[PATCH OLK-6.6] RDMA/hns: Fix different dgids mapping to the same dip_idx
by Chengchang Tang 19 Nov '24
From: Feng Fang <fangfeng4(a)huawei.com>
mainline inclusion
from mainline-v6.12-rc2
commit faa62440a5772b40bb7d78bf9e29556a82ecf153
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IB4OOG
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/commit/?id=fa…
---------------------------------------------------------------------
DIP algorithm requires a one-to-one mapping between dgid and dip_idx.
Currently a queue 'spare_idx' is used to store QPN of QPs that use
DIP algorithm. For a new dgid, use a QPN from spare_idx as dip_idx.
This method lacks a mechanism for deduplicating QPN, which may result
in different dgids sharing the same dip_idx and break the one-to-one
mapping requirement.
This patch replaces spare_idx with an xarray and introduces a refcnt for
each dip_idx to indicate the number of QPs that are using this dip_idx.
The state machine for dip_idx management is implemented as:
* The entry at an index in xarray is empty -- This indicates that the
corresponding dip_idx hasn't been created.
* The entry at an index in xarray is not empty but with 0 refcnt --
This indicates that the corresponding dip_idx has been created but
not used as dip_idx yet.
* The entry at an index in xarray is not empty and with non-0 refcnt --
This indicates that the corresponding dip_idx is being used by refcnt
number of DIP QPs.
Fixes: eb653eda1e91 ("RDMA/hns: Bugfix for incorrect association between dip_idx and dgid")
Fixes: f91696f2f053 ("RDMA/hns: Support congestion control type selection according to the FW")
Signed-off-by: Feng Fang <fangfeng4(a)huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
Link: https://patch.msgid.link/20241112055553.3681129-1-huangjunxian6@hisilicon.c…
Signed-off-by: Leon Romanovsky <leon(a)kernel.org>
---
drivers/infiniband/hw/hns/hns_roce_device.h | 11 +--
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 96 +++++++++++++++------
drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 2 +-
drivers/infiniband/hw/hns/hns_roce_main.c | 2 -
drivers/infiniband/hw/hns/hns_roce_qp.c | 8 +-
5 files changed, 75 insertions(+), 44 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index fdc1fe5e6a81..e3303cc3584a 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -570,12 +570,6 @@ struct hns_roce_bank {
u32 next; /* Next ID to allocate. */
};
-struct hns_roce_idx_table {
- u32 *spare_idx;
- u32 head;
- u32 tail;
-};
-
struct hns_roce_qp_table {
struct hns_roce_hem_table qp_table;
struct hns_roce_hem_table irrl_table;
@@ -584,7 +578,7 @@ struct hns_roce_qp_table {
struct mutex scc_mutex;
struct hns_roce_bank bank[HNS_ROCE_QP_BANK_NUM];
struct mutex bank_mutex;
- struct hns_roce_idx_table idx_table;
+ struct xarray dip_xa;
};
struct hns_roce_cq_table {
@@ -742,6 +736,7 @@ struct hns_roce_qp {
bool delayed_destroy_flag;
struct hns_roce_mtr_node *mtr_node;
spinlock_t flush_lock;
+ struct hns_roce_dip *dip;
};
struct hns_roce_ib_iboe {
@@ -1102,8 +1097,6 @@ struct hns_roce_dev {
enum hns_roce_device_state state;
struct list_head qp_list; /* list of all qps on this dev */
spinlock_t qp_list_lock; /* protect qp_list */
- struct list_head dip_list; /* list of all dest ips on this dev */
- spinlock_t dip_list_lock; /* protect dip_list */
struct list_head pgdir_list;
struct mutex pgdir_mutex;
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 3f254ac48b42..88bd75c5743e 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -2720,20 +2720,19 @@ static void hns_roce_free_link_table(struct hns_roce_dev *hr_dev)
free_link_table_buf(hr_dev, &priv->ext_llm);
}
-static void free_dip_list(struct hns_roce_dev *hr_dev)
+static void free_dip_entry(struct hns_roce_dev *hr_dev)
{
struct hns_roce_dip *hr_dip;
- struct hns_roce_dip *tmp;
- unsigned long flags;
+ unsigned long idx;
- spin_lock_irqsave(&hr_dev->dip_list_lock, flags);
+ xa_lock(&hr_dev->qp_table.dip_xa);
- list_for_each_entry_safe(hr_dip, tmp, &hr_dev->dip_list, node) {
- list_del(&hr_dip->node);
+ xa_for_each(&hr_dev->qp_table.dip_xa, idx, hr_dip) {
+ __xa_erase(&hr_dev->qp_table.dip_xa, hr_dip->dip_idx);
kfree(hr_dip);
}
- spin_unlock_irqrestore(&hr_dev->dip_list_lock, flags);
+ xa_unlock(&hr_dev->qp_table.dip_xa);
}
static int hns_roce_v2_get_reset_page(struct hns_roce_dev *hr_dev)
@@ -3182,7 +3181,7 @@ static void hns_roce_v2_exit(struct hns_roce_dev *hr_dev)
hns_roce_v2_put_reset_page(hr_dev);
if (hr_dev->pci_dev->revision == PCI_REVISION_ID_HIP09)
- free_dip_list(hr_dev);
+ free_dip_entry(hr_dev);
}
static inline void mbox_desc_init(struct hns_roce_post_mbox *mb,
@@ -5042,26 +5041,49 @@ static int modify_qp_rtr_to_rts(struct ib_qp *ibqp, int attr_mask,
return 0;
}
+static int alloc_dip_entry(struct xarray *dip_xa, u32 qpn)
+{
+ struct hns_roce_dip *hr_dip;
+ int ret;
+
+ hr_dip = xa_load(dip_xa, qpn);
+ if (hr_dip)
+ return 0;
+
+ hr_dip = kzalloc(sizeof(*hr_dip), GFP_KERNEL);
+ if (!hr_dip)
+ return -ENOMEM;
+
+ ret = xa_err(xa_store(dip_xa, qpn, hr_dip, GFP_KERNEL));
+ if (ret)
+ kfree(hr_dip);
+
+ return ret;
+}
+
static int get_dip_ctx_idx(struct ib_qp *ibqp, const struct ib_qp_attr *attr,
u32 *dip_idx)
{
const struct ib_global_route *grh = rdma_ah_read_grh(&attr->ah_attr);
struct hns_roce_dev *hr_dev = to_hr_dev(ibqp->device);
- u32 *spare_idx = hr_dev->qp_table.idx_table.spare_idx;
- u32 *head = &hr_dev->qp_table.idx_table.head;
- u32 *tail = &hr_dev->qp_table.idx_table.tail;
+ struct xarray *dip_xa = &hr_dev->qp_table.dip_xa;
+ struct hns_roce_qp *hr_qp = to_hr_qp(ibqp);
struct hns_roce_dip *hr_dip;
- unsigned long flags;
+ unsigned long idx;
int ret = 0;
- spin_lock_irqsave(&hr_dev->dip_list_lock, flags);
+ ret = alloc_dip_entry(dip_xa, ibqp->qp_num);
+ if (ret)
+ return ret;
- spare_idx[*tail] = ibqp->qp_num;
- *tail = (*tail == hr_dev->caps.num_qps - 1) ? 0 : (*tail + 1);
+ xa_lock(dip_xa);
- list_for_each_entry(hr_dip, &hr_dev->dip_list, node) {
- if (!memcmp(grh->dgid.raw, hr_dip->dgid, GID_LEN_V2)) {
+ xa_for_each(dip_xa, idx, hr_dip) {
+ if (hr_dip->qp_cnt &&
+ !memcmp(grh->dgid.raw, hr_dip->dgid, GID_LEN_V2)) {
*dip_idx = hr_dip->dip_idx;
+ hr_dip->qp_cnt++;
+ hr_qp->dip = hr_dip;
goto out;
}
}
@@ -5069,19 +5091,24 @@ static int get_dip_ctx_idx(struct ib_qp *ibqp, const struct ib_qp_attr *attr,
/* If no dgid is found, a new dip and a mapping between dgid and
* dip_idx will be created.
*/
- hr_dip = kzalloc(sizeof(*hr_dip), GFP_ATOMIC);
- if (!hr_dip) {
- ret = -ENOMEM;
- goto out;
+ xa_for_each(dip_xa, idx, hr_dip) {
+ if (hr_dip->qp_cnt)
+ continue;
+
+ *dip_idx = idx;
+ memcpy(hr_dip->dgid, grh->dgid.raw, sizeof(grh->dgid.raw));
+ hr_dip->dip_idx = idx;
+ hr_dip->qp_cnt++;
+ hr_qp->dip = hr_dip;
+ break;
}
- memcpy(hr_dip->dgid, grh->dgid.raw, sizeof(grh->dgid.raw));
- hr_dip->dip_idx = *dip_idx = spare_idx[*head];
- *head = (*head == hr_dev->caps.num_qps - 1) ? 0 : (*head + 1);
- list_add_tail(&hr_dip->node, &hr_dev->dip_list);
+ /* This should never happen. */
+ if (WARN_ON_ONCE(!hr_qp->dip))
+ ret = -ENOSPC;
out:
- spin_unlock_irqrestore(&hr_dev->dip_list_lock, flags);
+ xa_unlock(dip_xa);
return ret;
}
@@ -6005,6 +6032,20 @@ static int hns_roce_v2_destroy_qp_common(struct hns_roce_dev *hr_dev,
return ret;
}
+static void put_dip_ctx_idx(struct hns_roce_dev *hr_dev,
+ struct hns_roce_qp *hr_qp)
+{
+ struct hns_roce_dip *hr_dip = hr_qp->dip;
+
+ xa_lock(&hr_dev->qp_table.dip_xa);
+
+ hr_dip->qp_cnt--;
+ if (!hr_dip->qp_cnt)
+ memset(hr_dip->dgid, 0, GID_LEN_V2);
+
+ xa_unlock(&hr_dev->qp_table.dip_xa);
+}
+
int hns_roce_v2_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
{
struct hns_roce_dev *hr_dev = to_hr_dev(ibqp->device);
@@ -6018,6 +6059,9 @@ int hns_roce_v2_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
spin_unlock_irqrestore(&hr_qp->flush_lock, flags);
flush_work(&hr_qp->flush_work.work);
+ if (hr_qp->cong_type == CONG_TYPE_DIP)
+ put_dip_ctx_idx(hr_dev, hr_qp);
+
ret = hns_roce_v2_destroy_qp_common(hr_dev, hr_qp, udata);
if (ret)
ibdev_err_ratelimited(&hr_dev->ib_dev,
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
index 201902dff611..f43444be0308 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
@@ -1366,7 +1366,7 @@ struct hns_roce_v2_priv {
struct hns_roce_dip {
u8 dgid[GID_LEN_V2];
u32 dip_idx;
- struct list_head node; /* all dips are on a list */
+ u32 qp_cnt;
};
struct fmea_ram_ecc {
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index d488c3d5986f..520112bd43d5 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -1250,8 +1250,6 @@ static int hns_roce_setup_hca(struct hns_roce_dev *hr_dev)
INIT_LIST_HEAD(&hr_dev->qp_list);
spin_lock_init(&hr_dev->qp_list_lock);
- INIT_LIST_HEAD(&hr_dev->dip_list);
- spin_lock_init(&hr_dev->dip_list_lock);
INIT_LIST_HEAD(&hr_dev->uctx_list);
mutex_init(&hr_dev->uctx_list_mutex);
diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c
index 5ed2647567aa..77ec0c8678d3 100644
--- a/drivers/infiniband/hw/hns/hns_roce_qp.c
+++ b/drivers/infiniband/hw/hns/hns_roce_qp.c
@@ -1701,14 +1701,10 @@ int hns_roce_init_qp_table(struct hns_roce_dev *hr_dev)
unsigned int reserved_from_bot;
unsigned int i;
- qp_table->idx_table.spare_idx = kcalloc(hr_dev->caps.num_qps,
- sizeof(u32), GFP_KERNEL);
- if (!qp_table->idx_table.spare_idx)
- return -ENOMEM;
-
mutex_init(&qp_table->scc_mutex);
mutex_init(&qp_table->bank_mutex);
xa_init(&hr_dev->qp_table_xa);
+ xa_init(&qp_table->dip_xa);
reserved_from_bot = hr_dev->caps.reserved_qps;
@@ -1733,7 +1729,7 @@ void hns_roce_cleanup_qp_table(struct hns_roce_dev *hr_dev)
for (i = 0; i < HNS_ROCE_QP_BANK_NUM; i++)
ida_destroy(&hr_dev->qp_table.bank[i].ida);
+ xa_destroy(&hr_dev->qp_table.dip_xa);
mutex_destroy(&hr_dev->qp_table.bank_mutex);
mutex_destroy(&hr_dev->qp_table.scc_mutex);
- kfree(hr_dev->qp_table.idx_table.spare_idx);
}
--
2.33.0
1
0
Hello!
sig-high-performance-network invites you to attend the WeLink conference (auto-recorded) to be held at 2024-11-18 14:30.
The subject of the conference is the High-Performance Network SIG meeting.
You can join the meeting at https://meeting.huaweicloud.com:36443/#/j/982559975.
Add topics at https://etherpad.openeuler.org/p/sig-high-performance-network-meetings.
More information: https://www.openeuler.org/en/
From: Xinghai Cen <cenxinghai(a)h-partners.com>
Two bugfixes in post_send flow:
libhns: Fix out-of-order issue of requester when setting FENCE
libhns: Fix reference to uninitialized cq pointer
Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
---
...f-order-issue-of-requester-when-sett.patch | 50 ++++++++++++++
...eference-to-uninitialized-cq-pointer.patch | 68 +++++++++++++++++++
rdma-core.spec | 10 ++-
3 files changed, 127 insertions(+), 1 deletion(-)
create mode 100644 0037-libhns-Fix-out-of-order-issue-of-requester-when-sett.patch
create mode 100644 0038-libhns-Fix-reference-to-uninitialized-cq-pointer.patch
diff --git a/0037-libhns-Fix-out-of-order-issue-of-requester-when-sett.patch b/0037-libhns-Fix-out-of-order-issue-of-requester-when-sett.patch
new file mode 100644
index 0000000..0996c7f
--- /dev/null
+++ b/0037-libhns-Fix-out-of-order-issue-of-requester-when-sett.patch
@@ -0,0 +1,50 @@
+From 23c00575e9dc2b61f2ac429f9d039cb7e2e6deb2 Mon Sep 17 00:00:00 2001
+From: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Date: Fri, 8 Nov 2024 17:04:09 +0800
+Subject: [PATCH] libhns: Fix out-of-order issue of requester when setting
+ FENCE
+
+mainline inclusion
+from mainline-master
+commit c4119911c212aaa552c9cb928fba0a696640c9b5
+category: bugfix
+bugzilla: https://gitee.com/openeuler/kernel/issues/IB3ZHQ
+CVE: NA
+Reference: https://github.com/linux-rdma/rdma-core/pull/1513/commits/c4119911c212aaa55…
+
+----------------------------------------------------------------------
+
+The FENCE indicator in hns WQE doesn't ensure that response data from
+a previous Read/Atomic operation has been written to the requester's
+memory before the subsequent Send/Write operation is processed. This
+may result in the subsequent Send/Write operation accessing the original
+data in memory instead of the expected response data.
+
+Unlike FENCE, the SO (Strong Order) indicator blocks the subsequent
+operation until the previous response data is written to memory and a
+bresp is returned. Set the SO indicator instead of FENCE to maintain
+strict order.
+
+Fixes: cbdf5e32a855 ("libhns: Reimplement verbs of post_send and post_recv for hip08 RoCE")
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/hns_roce_u_hw_v2.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
+index 9371150..2debcb3 100644
+--- a/providers/hns/hns_roce_u_hw_v2.c
++++ b/providers/hns/hns_roce_u_hw_v2.c
+@@ -1527,7 +1527,7 @@ static int set_rc_wqe(void *wqe, struct hns_roce_qp *qp, struct ibv_send_wr *wr,
+
+ hr_reg_write_bool(wqe, RCWQE_CQE,
+ !!(wr->send_flags & IBV_SEND_SIGNALED));
+- hr_reg_write_bool(wqe, RCWQE_FENCE,
++ hr_reg_write_bool(wqe, RCWQE_SO,
+ !!(wr->send_flags & IBV_SEND_FENCE));
+ hr_reg_write_bool(wqe, RCWQE_SE,
+ !!(wr->send_flags & IBV_SEND_SOLICITED));
+--
+2.33.0
+
diff --git a/0038-libhns-Fix-reference-to-uninitialized-cq-pointer.patch b/0038-libhns-Fix-reference-to-uninitialized-cq-pointer.patch
new file mode 100644
index 0000000..e45d61a
--- /dev/null
+++ b/0038-libhns-Fix-reference-to-uninitialized-cq-pointer.patch
@@ -0,0 +1,68 @@
+From e941cc407d89595d1affd5211e1daf34786c7641 Mon Sep 17 00:00:00 2001
+From: Chengchang Tang <tangchengchang(a)huawei.com>
+Date: Fri, 8 Nov 2024 17:04:08 +0800
+Subject: [PATCH] libhns: Fix reference to uninitialized cq pointer
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+mainline inclusion
+from mainline-master
+commit 18e3117cdd161a3f40b8a917f24cfb5227a1d75a
+category: bugfix
+bugzilla: https://gitee.com/openeuler/kernel/issues/IB3ZHQ
+CVE: NA
+Reference: https://github.com/linux-rdma/rdma-core/pull/1513/commits/18e3117cdd161a3f4…
+
+----------------------------------------------------------------------
+
+For QPs which do not have an SQ, such as XRC TGT, the send_cq
+pointer will not be initialized. Since the supported max_gs
+will be 0 in this case, check it and return before referencing
+the send_cq pointer.
+
+Fixes: cbdf5e32a855 ("libhns: Reimplement verbs of post_send and post_recv for hip08 RoCE")
+Signed-off-by: Chengchang Tang <tangchengchang(a)huawei.com>
+Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
+Signed-off-by: Xinghai Cen <cenxinghai(a)h-partners.com>
+---
+ providers/hns/hns_roce_u_hw_v2.c | 12 ++++++------
+ 1 file changed, 6 insertions(+), 6 deletions(-)
+
+diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
+index 2debcb3..465ef1e 100644
+--- a/providers/hns/hns_roce_u_hw_v2.c
++++ b/providers/hns/hns_roce_u_hw_v2.c
+@@ -1579,7 +1579,7 @@ int hns_roce_u_v2_post_send(struct ibv_qp *ibvqp, struct ibv_send_wr *wr,
+ struct hns_roce_context *ctx = to_hr_ctx(ibvqp->context);
+ struct hns_roce_qp *qp = to_hr_qp(ibvqp);
+ struct hns_roce_sge_info sge_info = {};
+- struct hns_roce_rc_sq_wqe *wqe;
++ struct hns_roce_rc_sq_wqe *wqe = NULL;
+ struct ibv_qp_attr attr = {};
+ unsigned int wqe_idx, nreq;
+ int ret;
+@@ -1595,15 +1595,15 @@ int hns_roce_u_v2_post_send(struct ibv_qp *ibvqp, struct ibv_send_wr *wr,
+ sge_info.start_idx = qp->next_sge; /* start index of extend sge */
+
+ for (nreq = 0; wr; ++nreq, wr = wr->next) {
+- if (hns_roce_v2_wq_overflow(&qp->sq, nreq,
+- to_hr_cq(qp->verbs_qp.qp.send_cq))) {
+- ret = ENOMEM;
++ if (wr->num_sge > (int)qp->sq.max_gs) {
++ ret = qp->sq.max_gs > 0 ? EINVAL : EOPNOTSUPP;
+ *bad_wr = wr;
+ goto out;
+ }
+
+- if (wr->num_sge > qp->sq.max_gs) {
+- ret = EINVAL;
++ if (hns_roce_v2_wq_overflow(&qp->sq, nreq,
++ to_hr_cq(qp->verbs_qp.qp.send_cq))) {
++ ret = ENOMEM;
+ *bad_wr = wr;
+ goto out;
+ }
+--
+2.33.0
+
diff --git a/rdma-core.spec b/rdma-core.spec
index 21a690e..4135ada 100644
--- a/rdma-core.spec
+++ b/rdma-core.spec
@@ -1,6 +1,6 @@
Name: rdma-core
Version: 50.0
-Release: 13
+Release: 14
Summary: RDMA core userspace libraries and daemons
License: GPLv2 or BSD
Url: https://github.com/linux-rdma/rdma-core
@@ -42,6 +42,8 @@ patch33: 0033-libhns-Fix-missing-flag-when-creating-qp-by-hnsdv_cr.patch
patch34: 0034-librdmacm-Fix-an-overflow-bug-in-qsort-comparison-function.patch
patch35: 0035-Fix-the-stride-calculation-for-MSN-PSN-area.patch
patch36: 0036-add-ZTE-Dinghai-rdma-driver.patch
+patch37: 0037-libhns-Fix-out-of-order-issue-of-requester-when-sett.patch
+patch38: 0038-libhns-Fix-reference-to-uninitialized-cq-pointer.patch
BuildRequires: binutils cmake >= 2.8.11 gcc libudev-devel pkgconfig pkgconfig(libnl-3.0)
BuildRequires: pkgconfig(libnl-route-3.0) systemd systemd-devel
@@ -627,6 +629,12 @@ fi
%doc %{_docdir}/%{name}-%{version}/tag_matching.md
%doc %{_docdir}/%{name}-%{version}/70-persistent-ipoib.rules
+* Fri Nov 15 2024 Xinghai Cen <cenxinghai(a)h-partners.com> - 50.0-14
+- Type: bugfix
+- ID: NA
+- SUG: NA
+- DESC: Two bugfixes in post_send flow
+
%changelog
* Sat Aug 31 2024 Li Fuyan <li.fuyan(a)zte.com.cn> - 50.0-13
- Type: requirement
--
2.33.0
Hello!
sig-high-performance-network invites you to a WeLink conference (auto recording) to be held at 2024-11-15 14:00.
Subject: High-Performance Network SIG meeting
Meeting link: https://meeting.huaweicloud.com:36443/#/j/965651261
Minutes and topics: https://etherpad.openeuler.org/p/sig-high-performance-network-meetings
More information: https://www.openeuler.org/en/
Two hns bugfixes in post_send flow
Chengchang Tang (1):
libhns: Fix reference to uninitialized cq pointer
Junxian Huang (1):
libhns: Fix out-of-order issue of requester when setting FENCE
providers/hns/hns_roce_u_hw_v2.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
--
2.33.0
Hello!
sig-high-performance-network invites you to a WeLink conference (auto recording) to be held at 2024-11-08 11:00.
Subject: High-Performance Network SIG regular meeting
Meeting link: https://meeting.huaweicloud.com:36443/#/j/960933891
Minutes and topics: https://etherpad.openeuler.org/p/sig-high-performance-network-meetings
More information: https://www.openeuler.org/en/
您好!
sig-high-performance-network 邀请您参加 2024-10-25 11:00 召开的WeLink会议
会议主题:高性能网络sig会议
会议链接:https://meeting.huaweicloud.com:36443/#/j/982220702
会议纪要:https://etherpad.openeuler.org/p/sig-high-performance-network-meetings
更多资讯尽在:https://www.openeuler.org/zh/
Hello!
sig-high-performance-network invites you to attend the WeLink conference will be held at 2024-10-25 11:00,
The subject of the conference is 高性能网络sig会议,
You can join the meeting at https://meeting.huaweicloud.com:36443/#/j/982220702.
Add topics at https://etherpad.openeuler.org/p/sig-high-performance-network-meetings.
More information: https://www.openeuler.org/en/
Corresponds to kernel commit: ?? ("RDMA/hns: Support mmapping reset state to userspace").
Signed-off-by: Junxian Huang <huangjunxian6(a)hisilicon.com>
---
kernel-headers/rdma/hns-abi.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/kernel-headers/rdma/hns-abi.h b/kernel-headers/rdma/hns-abi.h
index 94e861870..065eb2e0a 100644
--- a/kernel-headers/rdma/hns-abi.h
+++ b/kernel-headers/rdma/hns-abi.h
@@ -136,6 +136,7 @@ struct hns_roce_ib_alloc_ucontext_resp {
__u32 max_inline_data;
__u8 congest_type;
__u8 reserved0[7];
+ __aligned_u64 reset_mmap_key;
};
struct hns_roce_ib_alloc_ucontext {
@@ -153,4 +154,9 @@ struct hns_roce_ib_create_ah_resp {
__u8 tc_mode;
};
+struct hns_roce_reset_state {
+ __u32 hw_ready;
+ __u32 reserved;
+};
+
#endif /* HNS_ABI_USER_H */
--
2.33.0
Hello,
Please help check whether there are still any issues with the new RPM difference analysis.
https://gitee.com/src-openeuler/rdma-core/pulls/165
Thanks!
Li Fuyan
Software Development Engineer
ICF Dept. VI/Intelligent Computing and Cloud Foundation R&D Center/Wireless and Computing Product R&D Institute
ZTE Corporation
ZTE 1E-401, No. 9, Wuxing Section, Xifeng Road, Chang'an District, Xi'an, 710114
T: +86 029 xxxxxxxx M: +86 15332482266
E: li.fuyan(a)zte.com.cn
www.zte.com.cn
Original
From: Li Fuyan 00122684
To: tangchengchang <tangchengchang(a)huawei.com>;
Cc: high-performance-network(a)openeuler.org <high-performance-network(a)openeuler.org>;dev(a)openeuler.org <dev(a)openeuler.org>;
Date: 2024-09-19 14:49
Subject: rdma-core PR process: RPM difference analysis completed, awaiting next step
Hello,
The RPM build difference analysis for the PR below was completed on September 11. Is the next step now on your side? Please take a look.
https://gitee.com/src-openeuler/rdma-core/pulls/165
Thanks!
Li Fuyan
Software Development Engineer
ICF Dept. VI/Intelligent Computing and Cloud Foundation R&D Center/Wireless and Computing Product R&D Institute
ZTE Corporation
ZTE 1E-401, No. 9, Wuxing Section, Xifeng Road, Chang'an District, Xi'an, 710114
T: +86 029 xxxxxxxx M: +86 15332482266
E: li.fuyan(a)zte.com.cn
www.zte.com.cn
Hello,
The RPM build difference analysis for the PR below was completed on September 11. Is the next step now on your side? Please take a look.
https://gitee.com/src-openeuler/rdma-core/pulls/165
Thanks!
Li Fuyan
Software Development Engineer
ICF Dept. VI/Intelligent Computing and Cloud Foundation R&D Center/Wireless and Computing Product R&D Institute
ZTE Corporation
ZTE 1E-401, No. 9, Wuxing Section, Xifeng Road, Chang'an District, Xi'an, 710114
T: +86 029 xxxxxxxx M: +86 15332482266
E: li.fuyan(a)zte.com.cn
www.zte.com.cn
Hello,
Could you please check the status of the PR I submitted under the high-performance-network SIG for merging the ZTE Dinghai rdma-core driver? I can no longer see the build information on the page.
https://gitee.com/src-openeuler/rdma-core/pulls/165
Li Fuyan
ZTE Corporation
T: +86 029 xxxxxxxx M: +86 15332482266
E: li.fuyan(a)zte.com.cn
www.zte.com.cn