Kernel

[PATCH openEuler-1.0-LTS] xfs: verify buffer contents when we skip log replay
by Yongqiang Liu 18 May '23
From: "Darrick J. Wong" <djwong(a)kernel.org>
mainline inclusion
from mainline-v6.3-rc6
commit 22ed903eee23a5b174e240f1cdfa9acf393a5210
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6X4UN
CVE: CVE-2023-2124
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
syzbot detected a crash during log recovery:
XFS (loop0): Mounting V5 Filesystem bfdc47fc-10d8-4eed-a562-11a831b3f791
XFS (loop0): Torn write (CRC failure) detected at log block 0x180. Truncating head block from 0x200.
XFS (loop0): Starting recovery (logdev: internal)
==================================================================
BUG: KASAN: slab-out-of-bounds in xfs_btree_lookup_get_block+0x15c/0x6d0 fs/xfs/libxfs/xfs_btree.c:1813
Read of size 8 at addr ffff88807e89f258 by task syz-executor132/5074
CPU: 0 PID: 5074 Comm: syz-executor132 Not tainted 6.2.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1b1/0x290 lib/dump_stack.c:106
print_address_description+0x74/0x340 mm/kasan/report.c:306
print_report+0x107/0x1f0 mm/kasan/report.c:417
kasan_report+0xcd/0x100 mm/kasan/report.c:517
xfs_btree_lookup_get_block+0x15c/0x6d0 fs/xfs/libxfs/xfs_btree.c:1813
xfs_btree_lookup+0x346/0x12c0 fs/xfs/libxfs/xfs_btree.c:1913
xfs_btree_simple_query_range+0xde/0x6a0 fs/xfs/libxfs/xfs_btree.c:4713
xfs_btree_query_range+0x2db/0x380 fs/xfs/libxfs/xfs_btree.c:4953
xfs_refcount_recover_cow_leftovers+0x2d1/0xa60 fs/xfs/libxfs/xfs_refcount.c:1946
xfs_reflink_recover_cow+0xab/0x1b0 fs/xfs/xfs_reflink.c:930
xlog_recover_finish+0x824/0x920 fs/xfs/xfs_log_recover.c:3493
xfs_log_mount_finish+0x1ec/0x3d0 fs/xfs/xfs_log.c:829
xfs_mountfs+0x146a/0x1ef0 fs/xfs/xfs_mount.c:933
xfs_fs_fill_super+0xf95/0x11f0 fs/xfs/xfs_super.c:1666
get_tree_bdev+0x400/0x620 fs/super.c:1282
vfs_get_tree+0x88/0x270 fs/super.c:1489
do_new_mount+0x289/0xad0 fs/namespace.c:3145
do_mount fs/namespace.c:3488 [inline]
__do_sys_mount fs/namespace.c:3697 [inline]
__se_sys_mount+0x2d3/0x3c0 fs/namespace.c:3674
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f89fa3f4aca
Code: 83 c4 08 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fffd5fb5ef8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00646975756f6e2c RCX: 00007f89fa3f4aca
RDX: 0000000020000100 RSI: 0000000020009640 RDI: 00007fffd5fb5f10
RBP: 00007fffd5fb5f10 R08: 00007fffd5fb5f50 R09: 000000000000970d
R10: 0000000000200800 R11: 0000000000000206 R12: 0000000000000004
R13: 0000555556c6b2c0 R14: 0000000000200800 R15: 00007fffd5fb5f50
</TASK>
The fuzzed image contains an AGF with an obviously garbage
agf_refcount_level value of 32, and a dirty log with a buffer log item
for that AGF. The ondisk AGF has a higher LSN than the recovered log
item. xlog_recover_buf_commit_pass2 reads the buffer, compares the
LSNs, and decides to skip replay because the ondisk buffer appears to be
newer.
Unfortunately, the ondisk buffer is corrupt, but recovery just read the
buffer with no buffer ops specified:
error = xfs_buf_read(mp->m_ddev_targp, buf_f->blf_blkno,
buf_f->blf_len, buf_flags, &bp, NULL);
Skipping the buffer leaves its contents in memory unverified. This sets
us up for a kernel crash because xfs_refcount_recover_cow_leftovers
reads the buffer (which is still around in XBF_DONE state, so no read
verification) and creates a refcountbt cursor of height 32. This is
impossible so we run off the end of the cursor object and crash.
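In outline, the skip path looks like this (a simplified sketch of the
recovery logic described above, not verbatim source; the LSN helper name
follows mainline xfs_log_recover.c):
/* buffer read with no verifier, as quoted above */
error = xfs_buf_read(mp->m_ddev_targp, buf_f->blf_blkno,
buf_f->blf_len, buf_flags, &bp, NULL);
lsn = xlog_recover_get_buf_lsn(mp, bp);
if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0) {
/* ondisk buffer looks newer: skip replay.  bp stays
 * XBF_DONE with contents that were never verified. */
goto out_release;
}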
Fix this by invoking the verifier on all skipped buffers and aborting
log recovery if the ondisk buffer is corrupt. It might be smarter to
force replay the log item atop the buffer and then see if it'll pass the
write verifier (like ext4 does) but for now let's go with the
conservative option where we stop immediately.
Link: https://syzkaller.appspot.com/bug?extid=7e9494b8b399902e994e
Signed-off-by: Darrick J. Wong <djwong(a)kernel.org>
Reviewed-by: Dave Chinner <dchinner(a)redhat.com>
Signed-off-by: Dave Chinner <david(a)fromorbit.com>
Conflicts:
fs/xfs/xfs_log_recover.c
Signed-off-by: Long Li <leo.lilong(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
fs/xfs/xfs_log_recover.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index ca8075894bea..a6e8fadae007 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2855,6 +2855,16 @@ xlog_recover_buffer_pass2(
if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0) {
trace_xfs_log_recover_buf_skip(log, buf_f);
xlog_recover_validate_buf_type(mp, bp, buf_f, NULLCOMMITLSN);
+
+ /*
+ * We're skipping replay of this buffer log item due to the log
+ * item LSN being behind the ondisk buffer. Verify the buffer
+ * contents since we aren't going to run the write verifier.
+ */
+ if (bp->b_ops) {
+ bp->b_ops->verify_read(bp);
+ error = bp->b_error;
+ }
goto out_release;
}
--
2.25.1
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6X8PA
CVE: NA
Reference: NA
---------------------------------
When the kernel is configured with allyesconfig, link errors may occur
during the build.
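A quick way to reproduce (assuming a standard build environment; the
exact link errors depend on the toolchain):
make allyesconfig
make -j"$(nproc)"   # fails at link time without this patch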
Signed-off-by: zhoujiadong <zhoujiadong5(a)huawei.com>
Reviewed-by: Wulike (Collin) <wulike1(a)huawei.com>
---
drivers/net/ethernet/huawei/Kconfig | 1 +
drivers/net/ethernet/huawei/Makefile | 1 +
drivers/net/ethernet/huawei/hinic3/Kconfig | 13 +
drivers/net/ethernet/huawei/hinic3/Makefile | 45 +
.../ethernet/huawei/hinic3/cfg_mgt_comm_pub.h | 212 ++
.../ethernet/huawei/hinic3/comm_cmdq_intf.h | 239 ++
.../net/ethernet/huawei/hinic3/comm_defs.h | 105 +
.../ethernet/huawei/hinic3/comm_msg_intf.h | 664 +++++
.../ethernet/huawei/hinic3/hinic3_comm_cmd.h | 185 ++
.../ethernet/huawei/hinic3/hinic3_common.h | 119 +
.../net/ethernet/huawei/hinic3/hinic3_crm.h | 1162 +++++++++
.../net/ethernet/huawei/hinic3/hinic3_dbg.c | 983 ++++++++
.../net/ethernet/huawei/hinic3/hinic3_dcb.c | 405 ++++
.../net/ethernet/huawei/hinic3/hinic3_dcb.h | 78 +
.../ethernet/huawei/hinic3/hinic3_ethtool.c | 1331 ++++++++++
.../huawei/hinic3/hinic3_ethtool_stats.c | 1233 ++++++++++
.../ethernet/huawei/hinic3/hinic3_filter.c | 483 ++++
.../net/ethernet/huawei/hinic3/hinic3_hw.h | 828 +++++++
.../net/ethernet/huawei/hinic3/hinic3_irq.c | 189 ++
.../net/ethernet/huawei/hinic3/hinic3_lld.h | 204 ++
.../ethernet/huawei/hinic3/hinic3_mag_cfg.c | 953 ++++++++
.../net/ethernet/huawei/hinic3/hinic3_main.c | 1125 +++++++++
.../huawei/hinic3/hinic3_mgmt_interface.h | 1252 ++++++++++
.../net/ethernet/huawei/hinic3/hinic3_mt.h | 681 ++++++
.../huawei/hinic3/hinic3_netdev_ops.c | 1975 +++++++++++++++
.../net/ethernet/huawei/hinic3/hinic3_nic.h | 183 ++
.../ethernet/huawei/hinic3/hinic3_nic_cfg.c | 1608 +++++++++++++
.../ethernet/huawei/hinic3/hinic3_nic_cfg.h | 620 +++++
.../huawei/hinic3/hinic3_nic_cfg_vf.c | 637 +++++
.../ethernet/huawei/hinic3/hinic3_nic_cmd.h | 159 ++
.../ethernet/huawei/hinic3/hinic3_nic_dbg.c | 146 ++
.../ethernet/huawei/hinic3/hinic3_nic_dbg.h | 21 +
.../ethernet/huawei/hinic3/hinic3_nic_dev.h | 387 +++
.../ethernet/huawei/hinic3/hinic3_nic_event.c | 580 +++++
.../ethernet/huawei/hinic3/hinic3_nic_io.c | 1122 +++++++++
.../ethernet/huawei/hinic3/hinic3_nic_io.h | 325 +++
.../ethernet/huawei/hinic3/hinic3_nic_prof.c | 47 +
.../ethernet/huawei/hinic3/hinic3_nic_prof.h | 59 +
.../ethernet/huawei/hinic3/hinic3_nic_qp.h | 384 +++
.../ethernet/huawei/hinic3/hinic3_ntuple.c | 907 +++++++
.../ethernet/huawei/hinic3/hinic3_profile.h | 146 ++
.../net/ethernet/huawei/hinic3/hinic3_rss.c | 978 ++++++++
.../net/ethernet/huawei/hinic3/hinic3_rss.h | 100 +
.../ethernet/huawei/hinic3/hinic3_rss_cfg.c | 384 +++
.../net/ethernet/huawei/hinic3/hinic3_rx.c | 1344 +++++++++++
.../net/ethernet/huawei/hinic3/hinic3_rx.h | 155 ++
.../ethernet/huawei/hinic3/hinic3_srv_nic.h | 213 ++
.../net/ethernet/huawei/hinic3/hinic3_tx.c | 1016 ++++++++
.../net/ethernet/huawei/hinic3/hinic3_tx.h | 157 ++
.../net/ethernet/huawei/hinic3/hinic3_wq.h | 130 +
.../huawei/hinic3/hw/hinic3_api_cmd.c | 1211 ++++++++++
.../huawei/hinic3/hw/hinic3_api_cmd.h | 286 +++
.../ethernet/huawei/hinic3/hw/hinic3_cmdq.c | 1543 ++++++++++++
.../ethernet/huawei/hinic3/hw/hinic3_cmdq.h | 204 ++
.../ethernet/huawei/hinic3/hw/hinic3_common.c | 93 +
.../ethernet/huawei/hinic3/hw/hinic3_csr.h | 187 ++
.../huawei/hinic3/hw/hinic3_dev_mgmt.c | 803 +++++++
.../huawei/hinic3/hw/hinic3_dev_mgmt.h | 105 +
.../huawei/hinic3/hw/hinic3_devlink.c | 431 ++++
.../huawei/hinic3/hw/hinic3_devlink.h | 149 ++
.../ethernet/huawei/hinic3/hw/hinic3_eqs.c | 1381 +++++++++++
.../ethernet/huawei/hinic3/hw/hinic3_eqs.h | 164 ++
.../ethernet/huawei/hinic3/hw/hinic3_hw_api.c | 453 ++++
.../ethernet/huawei/hinic3/hw/hinic3_hw_api.h | 141 ++
.../ethernet/huawei/hinic3/hw/hinic3_hw_cfg.c | 1480 ++++++++++++
.../ethernet/huawei/hinic3/hw/hinic3_hw_cfg.h | 332 +++
.../huawei/hinic3/hw/hinic3_hw_comm.c | 1540 ++++++++++++
.../huawei/hinic3/hw/hinic3_hw_comm.h | 51 +
.../ethernet/huawei/hinic3/hw/hinic3_hw_mt.c | 599 +++++
.../ethernet/huawei/hinic3/hw/hinic3_hw_mt.h | 49 +
.../ethernet/huawei/hinic3/hw/hinic3_hwdev.c | 2141 +++++++++++++++++
.../ethernet/huawei/hinic3/hw/hinic3_hwdev.h | 175 ++
.../ethernet/huawei/hinic3/hw/hinic3_hwif.c | 994 ++++++++
.../ethernet/huawei/hinic3/hw/hinic3_hwif.h | 113 +
.../ethernet/huawei/hinic3/hw/hinic3_lld.c | 1410 +++++++++++
.../ethernet/huawei/hinic3/hw/hinic3_mbox.c | 1841 ++++++++++++++
.../ethernet/huawei/hinic3/hw/hinic3_mbox.h | 267 ++
.../ethernet/huawei/hinic3/hw/hinic3_mgmt.c | 1515 ++++++++++++
.../ethernet/huawei/hinic3/hw/hinic3_mgmt.h | 179 ++
.../huawei/hinic3/hw/hinic3_nictool.c | 974 ++++++++
.../huawei/hinic3/hw/hinic3_nictool.h | 35 +
.../huawei/hinic3/hw/hinic3_pci_id_tbl.h | 15 +
.../huawei/hinic3/hw/hinic3_prof_adap.c | 44 +
.../huawei/hinic3/hw/hinic3_prof_adap.h | 109 +
.../ethernet/huawei/hinic3/hw/hinic3_sm_lt.h | 160 ++
.../ethernet/huawei/hinic3/hw/hinic3_sml_lt.c | 160 ++
.../ethernet/huawei/hinic3/hw/hinic3_sriov.c | 267 ++
.../ethernet/huawei/hinic3/hw/hinic3_sriov.h | 35 +
.../net/ethernet/huawei/hinic3/hw/hinic3_wq.c | 159 ++
.../huawei/hinic3/hw/ossl_knl_linux.c | 121 +
drivers/net/ethernet/huawei/hinic3/mag_cmd.h | 886 +++++++
.../ethernet/huawei/hinic3/mgmt_msg_base.h | 27 +
.../net/ethernet/huawei/hinic3/nic_cfg_comm.h | 63 +
drivers/net/ethernet/huawei/hinic3/ossl_knl.h | 36 +
.../ethernet/huawei/hinic3/ossl_knl_linux.h | 284 +++
95 files changed, 49486 insertions(+)
create mode 100644 drivers/net/ethernet/huawei/hinic3/Kconfig
create mode 100644 drivers/net/ethernet/huawei/hinic3/Makefile
create mode 100644 drivers/net/ethernet/huawei/hinic3/cfg_mgt_comm_pub.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/comm_cmdq_intf.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/comm_defs.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/comm_msg_intf.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_comm_cmd.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_common.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_crm.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_dbg.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_dcb.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_dcb.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_ethtool.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_ethtool_stats.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_filter.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_hw.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_irq.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_lld.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_mag_cfg.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_main.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_mgmt_interface.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_mt.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_netdev_ops.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_cfg.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_cfg.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_cfg_vf.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_cmd.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_dbg.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_dbg.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_dev.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_event.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_io.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_io.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_prof.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_prof.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_qp.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_ntuple.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_profile.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_rss.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_rss.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_rss_cfg.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_rx.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_rx.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_srv_nic.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_tx.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_tx.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_wq.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_api_cmd.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_api_cmd.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_cmdq.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_cmdq.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_common.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_csr.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_dev_mgmt.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_dev_mgmt.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_devlink.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_devlink.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_eqs.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_eqs.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_api.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_api.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_cfg.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_cfg.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_comm.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_comm.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_mt.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_mt.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hwdev.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hwdev.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hwif.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hwif.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_lld.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_mbox.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_mbox.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_mgmt.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_mgmt.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_nictool.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_nictool.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_pci_id_tbl.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_prof_adap.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_prof_adap.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_sm_lt.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_sml_lt.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_sriov.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_sriov.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_wq.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/ossl_knl_linux.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/mag_cmd.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/mgmt_msg_base.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/nic_cfg_comm.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/ossl_knl.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/ossl_knl_linux.h
diff --git a/drivers/net/ethernet/huawei/Kconfig b/drivers/net/ethernet/huawei/Kconfig
index 0afeb5021f17..0df9544dcf74 100644
--- a/drivers/net/ethernet/huawei/Kconfig
+++ b/drivers/net/ethernet/huawei/Kconfig
@@ -16,6 +16,7 @@ config NET_VENDOR_HUAWEI
if NET_VENDOR_HUAWEI
source "drivers/net/ethernet/huawei/hinic/Kconfig"
+source "drivers/net/ethernet/huawei/hinic3/Kconfig"
source "drivers/net/ethernet/huawei/bma/Kconfig"
endif # NET_VENDOR_HUAWEI
diff --git a/drivers/net/ethernet/huawei/Makefile b/drivers/net/ethernet/huawei/Makefile
index f5bf4ae195a3..d88e8fd772e3 100644
--- a/drivers/net/ethernet/huawei/Makefile
+++ b/drivers/net/ethernet/huawei/Makefile
@@ -4,4 +4,5 @@
#
obj-$(CONFIG_HINIC) += hinic/
+obj-$(CONFIG_HINIC3) += hinic3/
obj-$(CONFIG_BMA) += bma/
diff --git a/drivers/net/ethernet/huawei/hinic3/Kconfig b/drivers/net/ethernet/huawei/hinic3/Kconfig
new file mode 100644
index 000000000000..72088646a9bf
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/Kconfig
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Huawei driver configuration
+#
+
+config HINIC3
+ tristate "Huawei Intelligent Network Interface Card 3rd"
+ depends on PCI_MSI && NUMA && PCI_IOV && DCB && (X86 || ARM64)
+ help
+ This driver supports HiNIC PCIE Ethernet cards.
+ To compile this driver as part of the kernel, choose Y here.
+ If unsure, choose N.
+ The default is N.
diff --git a/drivers/net/ethernet/huawei/hinic3/Makefile b/drivers/net/ethernet/huawei/hinic3/Makefile
new file mode 100644
index 000000000000..b17f80ff19b8
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/Makefile
@@ -0,0 +1,45 @@
+# SPDX-License-Identifier: GPL-2.0-only
+ccflags-y += -I$(srctree)/drivers/net/ethernet/huawei/hinic3/
+
+obj-$(CONFIG_HINIC3) += hinic3.o
+hinic3-objs := hw/hinic3_hwdev.o \
+ hw/hinic3_hw_cfg.o \
+ hw/hinic3_hw_comm.o \
+ hw/hinic3_prof_adap.o \
+ hw/hinic3_sriov.o \
+ hw/hinic3_lld.o \
+ hw/hinic3_dev_mgmt.o \
+ hw/hinic3_common.o \
+ hw/hinic3_hwif.o \
+ hw/hinic3_wq.o \
+ hw/hinic3_cmdq.o \
+ hw/hinic3_eqs.o \
+ hw/hinic3_mbox.o \
+ hw/hinic3_mgmt.o \
+ hw/hinic3_api_cmd.o \
+ hw/hinic3_hw_api.o \
+ hw/hinic3_sml_lt.o \
+ hw/hinic3_hw_mt.o \
+ hw/hinic3_nictool.o \
+ hw/hinic3_devlink.o \
+ hw/ossl_knl_linux.o \
+ hinic3_main.o \
+ hinic3_tx.o \
+ hinic3_rx.o \
+ hinic3_rss.o \
+ hinic3_ntuple.o \
+ hinic3_dcb.o \
+ hinic3_ethtool.o \
+ hinic3_ethtool_stats.o \
+ hinic3_dbg.o \
+ hinic3_irq.o \
+ hinic3_filter.o \
+ hinic3_netdev_ops.o \
+ hinic3_nic_prof.o \
+ hinic3_nic_cfg.o \
+ hinic3_mag_cfg.o \
+ hinic3_nic_cfg_vf.o \
+ hinic3_rss_cfg.o \
+ hinic3_nic_event.o \
+ hinic3_nic_io.o \
+ hinic3_nic_dbg.o
\ No newline at end of file
diff --git a/drivers/net/ethernet/huawei/hinic3/cfg_mgt_comm_pub.h b/drivers/net/ethernet/huawei/hinic3/cfg_mgt_comm_pub.h
new file mode 100644
index 000000000000..6d391d0423a9
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/cfg_mgt_comm_pub.h
@@ -0,0 +1,212 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) Huawei Technologies Co., Ltd. 2016-2022. All rights reserved.
+ * File name: Cfg_mgt_comm_pub.h
+ * Version No.: Draft
+ * Generation date: 2016-05-07
+ * Latest modification:
+ * Function description: Header file for communication between the host and FW
+ * Function list:
+ * Modification history:
+ * 1. Date: 2016-05-07
+ *    Modified content: Created the file.
+ */
+#ifndef CFG_MGT_COMM_PUB_H
+#define CFG_MGT_COMM_PUB_H
+
+#include "mgmt_msg_base.h"
+
+typedef enum {
+ SERVICE_BIT_NIC = 0,
+ SERVICE_BIT_ROCE = 1,
+ SERVICE_BIT_VBS = 2,
+ SERVICE_BIT_TOE = 3,
+ SERVICE_BIT_IPSEC = 4,
+ SERVICE_BIT_FC = 5,
+ SERVICE_BIT_VIRTIO = 6,
+ SERVICE_BIT_OVS = 7,
+ SERVICE_BIT_NVME = 8,
+ SERVICE_BIT_ROCEAA = 9,
+ SERVICE_BIT_CURRENET = 10,
+ SERVICE_BIT_PPA = 11,
+ SERVICE_BIT_MIGRATE = 12,
+ SERVICE_BIT_MAX
+} servic_bit_define_e;
+
+#define CFG_SERVICE_MASK_NIC (0x1 << SERVICE_BIT_NIC)
+#define CFG_SERVICE_MASK_ROCE (0x1 << SERVICE_BIT_ROCE)
+#define CFG_SERVICE_MASK_VBS (0x1 << SERVICE_BIT_VBS)
+#define CFG_SERVICE_MASK_TOE (0x1 << SERVICE_BIT_TOE)
+#define CFG_SERVICE_MASK_IPSEC (0x1 << SERVICE_BIT_IPSEC)
+#define CFG_SERVICE_MASK_FC (0x1 << SERVICE_BIT_FC)
+#define CFG_SERVICE_MASK_VIRTIO (0x1 << SERVICE_BIT_VIRTIO)
+#define CFG_SERVICE_MASK_OVS (0x1 << SERVICE_BIT_OVS)
+#define CFG_SERVICE_MASK_NVME (0x1 << SERVICE_BIT_NVME)
+#define CFG_SERVICE_MASK_ROCEAA (0x1 << SERVICE_BIT_ROCEAA)
+#define CFG_SERVICE_MASK_CURRENET (0x1 << SERVICE_BIT_CURRENET)
+#define CFG_SERVICE_MASK_PPA (0x1 << SERVICE_BIT_PPA)
+#define CFG_SERVICE_MASK_MIGRATE (0x1 << SERVICE_BIT_MIGRATE)
+
+/* Definition of the scenario ID in the cfg_data, which is used for SML memory allocation. */
+typedef enum {
+ SCENES_ID_FPGA_ETH = 0,
+ SCENES_ID_FPGA_TIOE = 1, /* Deprecated */
+ SCENES_ID_STORAGE_ROCEAA_2x100 = 2,
+ SCENES_ID_STORAGE_ROCEAA_4x25 = 3,
+ SCENES_ID_CLOUD = 4,
+ SCENES_ID_FC = 5,
+ SCENES_ID_STORAGE_ROCE = 6,
+ SCENES_ID_COMPUTE_ROCE = 7,
+ SCENES_ID_STORAGE_TOE = 8,
+ SCENES_ID_MAX
+} scenes_id_define_e;
+
+/* struct cfg_cmd_dev_cap.sf_svc_attr */
+enum {
+ SF_SVC_FT_BIT = (1 << 0),
+ SF_SVC_RDMA_BIT = (1 << 1),
+};
+
+enum cfg_cmd {
+ CFG_CMD_GET_DEV_CAP = 0,
+ CFG_CMD_GET_HOST_TIMER = 1,
+};
+
+struct cfg_cmd_host_timer {
+ struct mgmt_msg_head head;
+
+ u8 host_id;
+ u8 rsvd1;
+
+ u8 timer_pf_num;
+ u8 timer_pf_id_start;
+ u16 timer_vf_num;
+ u16 timer_vf_id_start;
+ u32 rsvd2[8];
+};
+
+struct cfg_cmd_dev_cap {
+ struct mgmt_msg_head head;
+
+ u16 func_id;
+ u16 rsvd1;
+
+ /* Public resources */
+ u8 host_id;
+ u8 ep_id;
+ u8 er_id;
+ u8 port_id;
+
+ u16 host_total_func;
+ u8 host_pf_num;
+ u8 pf_id_start;
+ u16 host_vf_num;
+ u16 vf_id_start;
+ u8 host_oq_id_mask_val;
+ u8 timer_en;
+ u8 host_valid_bitmap;
+ u8 rsvd_host;
+
+ u16 svc_cap_en;
+ u16 max_vf;
+ u8 flexq_en;
+ u8 valid_cos_bitmap;
+ /* Reserved for func_valid_cos_bitmap */
+ u8 port_cos_valid_bitmap;
+ u8 rsvd_func1;
+ u32 rsvd_func2;
+
+ u8 sf_svc_attr;
+ u8 func_sf_en;
+ u8 lb_mode;
+ u8 smf_pg;
+
+ u32 max_conn_num;
+ u16 max_stick2cache_num;
+ u16 max_bfilter_start_addr;
+ u16 bfilter_len;
+ u16 hash_bucket_num;
+
+ /* shared resource */
+ u8 host_sf_en;
+ u8 master_host_id;
+ u8 srv_multi_host_mode;
+ u8 virtio_vq_size;
+
+ u32 rsvd_func3[5];
+
+ /* l2nic */
+ u16 nic_max_sq_id;
+ u16 nic_max_rq_id;
+ u16 nic_default_num_queues;
+ u16 rsvd1_nic;
+ u32 rsvd2_nic[2];
+
+ /* RoCE */
+ u32 roce_max_qp;
+ u32 roce_max_cq;
+ u32 roce_max_srq;
+ u32 roce_max_mpt;
+ u32 roce_max_drc_qp;
+
+ u32 roce_cmtt_cl_start;
+ u32 roce_cmtt_cl_end;
+ u32 roce_cmtt_cl_size;
+
+ u32 roce_dmtt_cl_start;
+ u32 roce_dmtt_cl_end;
+ u32 roce_dmtt_cl_size;
+
+ u32 roce_wqe_cl_start;
+ u32 roce_wqe_cl_end;
+ u32 roce_wqe_cl_size;
+ u8 roce_srq_container_mode;
+ u8 rsvd_roce1[3];
+ u32 rsvd_roce2[5];
+
+ /* IPsec */
+ u32 ipsec_max_sactx;
+ u16 ipsec_max_cq;
+ u16 rsvd_ipsec1;
+ u32 rsvd_ipsec[2];
+
+ /* OVS */
+ u32 ovs_max_qpc;
+ u32 rsvd_ovs1[3];
+
+ /* ToE */
+ u32 toe_max_pctx;
+ u32 toe_max_cq;
+ u16 toe_max_srq;
+ u16 toe_srq_id_start;
+ u16 toe_max_mpt;
+ u16 toe_max_cctxt;
+ u32 rsvd_toe[2];
+
+ /* FC */
+ u32 fc_max_pctx;
+ u32 fc_max_scq;
+ u32 fc_max_srq;
+
+ u32 fc_max_cctx;
+ u32 fc_cctx_id_start;
+
+ u8 fc_vp_id_start;
+ u8 fc_vp_id_end;
+ u8 rsvd_fc1[2];
+ u32 rsvd_fc2[5];
+
+ /* VBS */
+ u16 vbs_max_volq;
+ u16 rsvd0_vbs;
+ u32 rsvd1_vbs[3];
+
+ u16 fake_vf_start_id;
+ u16 fake_vf_num;
+ u32 fake_vf_max_pctx;
+ u16 fake_vf_bfilter_start_addr;
+ u16 fake_vf_bfilter_len;
+ u32 rsvd_glb[8];
+};
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/comm_cmdq_intf.h b/drivers/net/ethernet/huawei/hinic3/comm_cmdq_intf.h
new file mode 100644
index 000000000000..6f5f87bc19b7
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/comm_cmdq_intf.h
@@ -0,0 +1,239 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/******************************************************************************
+ * Copyright (c) Huawei Technologies Co., Ltd. 2022. All rights reserved.
+ ******************************************************************************
+ File Name : comm_cmdq_intf.h
+ Version : Initial Draft
+ Description : common command queue interface
+ Function List :
+ History :
+ Modification: Created file
+
+******************************************************************************/
+
+#ifndef COMM_CMDQ_INTF_H
+#define COMM_CMDQ_INTF_H
+
+/* Cmdq ack type */
+enum hinic3_ack_type {
+ HINIC3_ACK_TYPE_CMDQ,
+ HINIC3_ACK_TYPE_SHARE_CQN,
+ HINIC3_ACK_TYPE_APP_CQN,
+
+ HINIC3_MOD_ACK_MAX = 15,
+};
+
+/* Defines the queue type of the set arm bit. */
+enum {
+ SET_ARM_BIT_FOR_CMDQ = 0,
+ SET_ARM_BIT_FOR_L2NIC_SQ,
+ SET_ARM_BIT_FOR_L2NIC_RQ,
+ SET_ARM_BIT_TYPE_NUM
+};
+
+/* Defines the CMDQ type. Each function supports a maximum of eight CMDQ types. */
+enum {
+ CMDQ_0 = 0,
+ CMDQ_1 = 1, /* dedicated and non-blocking queues */
+ CMDQ_NUM
+};
+
+/* *******************cmd common command data structure ************************ */
+// Func->ucode, used to set the arm-bit data.
+// The microcode must perform big-endian conversion.
+struct comm_info_ucode_set_arm_bit {
+ u32 q_type;
+ u32 q_id;
+};
+
+/* *******************WQE data structure ************************ */
+union cmdq_wqe_cs_dw0 {
+ struct {
+ u32 err_status : 29;
+ u32 error_code : 2;
+ u32 rsvd : 1;
+ } bs;
+ u32 val;
+};
+
+union cmdq_wqe_cs_dw1 {
+ // This structure is used when the driver writes the wqe.
+ struct {
+ u32 token : 16; // [15:0]
+ u32 cmd : 8; // [23:16]
+ u32 mod : 5; // [28:24]
+ u32 ack_type : 2; // [30:29]
+ u32 obit : 1; // [31]
+ } drv_wr;
+
+ /* The uCode writes back the structure of the CS_DW1.
+ * The driver reads and uses the structure. */
+ struct {
+ u32 mod : 5; // [4:0]
+ u32 ack_type : 3; // [7:5]
+ u32 cmd : 8; // [15:8]
+ u32 arm : 1; // [16]
+ u32 rsvd : 14; // [30:17]
+ u32 obit : 1; // [31]
+ } wb;
+ u32 val;
+};
+
+/* CmdQ BD information or write back buffer information */
+struct cmdq_sge {
+ u32 pa_h; // Upper 32 bits of the physical address
+ u32 pa_l; // Lower 32 bits of the physical address
+ u32 len; // Invalid bit[31].
+ u32 resv;
+};
+
+/* Ctrls section definition of WQE */
+struct cmdq_wqe_ctrls {
+ union {
+ struct {
+ u32 bdsl : 8; // [7:0]
+ u32 drvsl : 2; // [9:8]
+ u32 rsv : 4; // [13:10]
+ u32 wf : 1; // [14]
+ u32 cf : 1; // [15]
+ u32 tsl : 5; // [20:16]
+ u32 va : 1; // [21]
+ u32 df : 1; // [22]
+ u32 cr : 1; // [23]
+ u32 difsl : 3; // [26:24]
+ u32 csl : 2; // [28:27]
+ u32 ctrlsl : 2; // [30:29]
+ u32 obit : 1; // [31]
+ } bs;
+ u32 val;
+ } header;
+ u32 qsf;
+};
+
+/* Complete section definition of WQE */
+struct cmdq_wqe_cs {
+ union cmdq_wqe_cs_dw0 dw0;
+ union cmdq_wqe_cs_dw1 dw1;
+ union {
+ struct cmdq_sge sge;
+ u32 dw2_5[4];
+ } ack;
+};
+
+/* Inline header in WQE inline, describing the length of inline data */
+union cmdq_wqe_inline_header {
+ struct {
+ u32 buf_len : 11; // [10:0] inline data len
+ u32 rsv : 21; // [31:11]
+ } bs;
+ u32 val;
+};
+
+/* Definition of buffer descriptor section in WQE */
+union cmdq_wqe_bds {
+ struct {
+ struct cmdq_sge bds_sge;
+ u32 rsvd[4]; /* Used to transfer the virtual address of the buffer. */
+ } lcmd; /* Long command, non-inline, and SGE describe the buffer information. */
+};
+
+/* Definition of CMDQ WQE */
+/* (long cmd, 64B)
+ * +----------------------------------------+
+ * | ctrl section(8B) |
+ * +----------------------------------------+
+ * | |
+ * | complete section(24B) |
+ * | |
+ * +----------------------------------------+
+ * | |
+ * | buffer descriptor section(16B) |
+ * | |
+ * +----------------------------------------+
+ * | driver section(16B) |
+ * +----------------------------------------+
+ *
+ *
+ * (middle cmd, 128B)
+ * +----------------------------------------+
+ * | ctrl section(8B) |
+ * +----------------------------------------+
+ * | |
+ * | complete section(24B) |
+ * | |
+ * +----------------------------------------+
+ * | |
+ * | buffer descriptor section(88B) |
+ * | |
+ * +----------------------------------------+
+ * | driver section(8B) |
+ * +----------------------------------------+
+ *
+ *
+ * (short cmd, 64B)
+ * +----------------------------------------+
+ * | ctrl section(8B) |
+ * +----------------------------------------+
+ * | |
+ * | complete section(24B) |
+ * | |
+ * +----------------------------------------+
+ * | |
+ * | buffer descriptor section(24B) |
+ * | |
+ * +----------------------------------------+
+ * | driver section(8B) |
+ * +----------------------------------------+
+ */
+struct cmdq_wqe {
+ struct cmdq_wqe_ctrls ctrls;
+ struct cmdq_wqe_cs cs;
+ union cmdq_wqe_bds bds;
+};
+
+/* Definition of ctrls section in inline WQE */
+struct cmdq_wqe_ctrls_inline {
+ union {
+ struct {
+ u32 bdsl : 8; // [7:0]
+ u32 drvsl : 2; // [9:8]
+ u32 rsv : 4; // [13:10]
+ u32 wf : 1; // [14]
+ u32 cf : 1; // [15]
+ u32 tsl : 5; // [20:16]
+ u32 va : 1; // [21]
+ u32 df : 1; // [22]
+ u32 cr : 1; // [23]
+ u32 difsl : 3; // [26:24]
+ u32 csl : 2; // [28:27]
+ u32 ctrlsl : 2; // [30:29]
+ u32 obit : 1; // [31]
+ } bs;
+ u32 val;
+ } header;
+ u32 qsf;
+ u64 db;
+};
+
+/* Buffer descriptor section definition of WQE */
+union cmdq_wqe_bds_inline {
+ struct {
+ union cmdq_wqe_inline_header header;
+ u32 rsvd;
+ u8 data_inline[80];
+ } mcmd; /* Middle command, inline mode */
+
+ struct {
+ union cmdq_wqe_inline_header header;
+ u32 rsvd;
+ u8 data_inline[16];
+ } scmd; /* Short command, inline mode */
+};
+
+struct cmdq_wqe_inline {
+ struct cmdq_wqe_ctrls_inline ctrls;
+ struct cmdq_wqe_cs cs;
+ union cmdq_wqe_bds_inline bds;
+};
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/comm_defs.h b/drivers/net/ethernet/huawei/hinic3/comm_defs.h
new file mode 100644
index 000000000000..70697a64b44e
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/comm_defs.h
@@ -0,0 +1,105 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) Huawei Technologies Co., Ltd. 2021-2022. All rights reserved.
+ * File Name : comm_defs.h
+ * Version : Initial Draft
+ * Description : common definitions
+ * Function List :
+ * History :
+ * Modification: Created file
+ */
+
+#ifndef COMM_DEFS_H
+#define COMM_DEFS_H
+
+/* CMDQ MODULE_TYPE */
+typedef enum hinic3_mod_type {
+ HINIC3_MOD_COMM = 0, /* HW communication module */
+ HINIC3_MOD_L2NIC = 1, /* L2NIC module */
+ HINIC3_MOD_ROCE = 2,
+ HINIC3_MOD_PLOG = 3,
+ HINIC3_MOD_TOE = 4,
+ HINIC3_MOD_FLR = 5,
+ HINIC3_MOD_RSVD1 = 6,
+ HINIC3_MOD_CFGM = 7, /* Configuration module */
+ HINIC3_MOD_CQM = 8,
+ HINIC3_MOD_RSVD2 = 9,
+ COMM_MOD_FC = 10,
+ HINIC3_MOD_OVS = 11,
+ HINIC3_MOD_DSW = 12,
+ HINIC3_MOD_MIGRATE = 13,
+ HINIC3_MOD_HILINK = 14,
+ HINIC3_MOD_CRYPT = 15, /* secure crypto module */
+ HINIC3_MOD_VIO = 16,
+ HINIC3_MOD_IMU = 17,
+ HINIC3_MOD_DFT = 18, /* DFT */
+ HINIC3_MOD_HW_MAX = 19, /* hardware max module id */
+ /* Software module id, for PF/VF and multi-host */
+ HINIC3_MOD_SW_FUNC = 20,
+ HINIC3_MOD_MAX,
+} hinic3_mod_type_e;
+
+/* func reset flags, indicating which resources to clean up */
+typedef enum {
+ RES_TYPE_FLUSH_BIT = 0,
+ RES_TYPE_MQM,
+ RES_TYPE_SMF,
+ RES_TYPE_PF_BW_CFG,
+
+ RES_TYPE_COMM = 10,
+ RES_TYPE_COMM_MGMT_CH, /* clear mbox and aeq, The RES_TYPE_COMM bit must be set */
+ RES_TYPE_COMM_CMD_CH, /* clear cmdq and ceq, The RES_TYPE_COMM bit must be set */
+ RES_TYPE_NIC,
+ RES_TYPE_OVS,
+ RES_TYPE_VBS,
+ RES_TYPE_ROCE,
+ RES_TYPE_FC,
+ RES_TYPE_TOE,
+ RES_TYPE_IPSEC,
+ RES_TYPE_MAX,
+} func_reset_flag_e;
+
+#define HINIC3_COMM_RES \
+ ((1 << RES_TYPE_COMM) | (1 << RES_TYPE_COMM_CMD_CH) | \
+ (1 << RES_TYPE_FLUSH_BIT) | (1 << RES_TYPE_MQM) | \
+ (1 << RES_TYPE_SMF) | (1 << RES_TYPE_PF_BW_CFG))
+
+#define HINIC3_NIC_RES (1 << RES_TYPE_NIC)
+#define HINIC3_OVS_RES (1 << RES_TYPE_OVS)
+#define HINIC3_VBS_RES (1 << RES_TYPE_VBS)
+#define HINIC3_ROCE_RES (1 << RES_TYPE_ROCE)
+#define HINIC3_FC_RES (1 << RES_TYPE_FC)
+#define HINIC3_TOE_RES (1 << RES_TYPE_TOE)
+#define HINIC3_IPSEC_RES (1 << RES_TYPE_IPSEC)
+
+/* Work modes: OVS, NIC, UNKNOWN */
+#define HINIC3_WORK_MODE_OVS 0
+#define HINIC3_WORK_MODE_UNKNOWN 1
+#define HINIC3_WORK_MODE_NIC 2
+
+#define DEVICE_TYPE_L2NIC 0
+#define DEVICE_TYPE_NVME 1
+#define DEVICE_TYPE_VIRTIO_NET 2
+#define DEVICE_TYPE_VIRTIO_BLK 3
+#define DEVICE_TYPE_VIRTIO_VSOCK 4
+#define DEVICE_TYPE_VIRTIO_NET_TRANSITION 5
+#define DEVICE_TYPE_VIRTIO_BLK_TRANSITION 6
+#define DEVICE_TYPE_VIRTIO_SCSI_TRANSITION 7
+#define DEVICE_TYPE_VIRTIO_HPC 8
+
+#define IS_STORAGE_DEVICE_TYPE(dev_type) \
+ ((dev_type) == DEVICE_TYPE_VIRTIO_BLK || \
+ (dev_type) == DEVICE_TYPE_VIRTIO_BLK_TRANSITION || \
+ (dev_type) == DEVICE_TYPE_VIRTIO_SCSI_TRANSITION)
+
+/* Common header control information of the COMM message
+ * interaction command word between the driver and PF
+ */
+struct comm_info_head {
+ u8 status;
+ u8 version;
+ u8 rep_aeq_num;
+ u8 rsvd[5];
+};
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/comm_msg_intf.h b/drivers/net/ethernet/huawei/hinic3/comm_msg_intf.h
new file mode 100644
index 000000000000..eb11d39ba66c
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/comm_msg_intf.h
@@ -0,0 +1,664 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) Huawei Technologies Co., Ltd. 2021-2022. All rights reserved.
+ * File Name : comm_msg_intf.h
+ * Version : Initial Draft
+ * Created : 2021/6/28
+ * Last Modified :
+ * Description : COMM Command interfaces between Driver and MPU
+ * Function List :
+ */
+
+#ifndef COMM_MSG_INTF_H
+#define COMM_MSG_INTF_H
+
+#include "comm_defs.h"
+#include "mgmt_msg_base.h"
+
+/* Upper bound of func_reset_flag */
+#define FUNC_RESET_FLAG_MAX_VALUE ((1U << (RES_TYPE_MAX + 1)) - 1)
+struct comm_cmd_func_reset {
+ struct mgmt_msg_head head;
+
+ u16 func_id;
+ u16 rsvd1[3];
+ u64 reset_flag;
+};
+
+struct comm_cmd_ppf_flr_type_set {
+ struct mgmt_msg_head head;
+
+ u16 func_id;
+ u8 rsvd1[2];
+ u32 ppf_flr_type;
+};
+
+enum {
+ COMM_F_API_CHAIN = 1U << 0,
+ COMM_F_CLP = 1U << 1,
+ COMM_F_CHANNEL_DETECT = 1U << 2,
+ COMM_F_MBOX_SEGMENT = 1U << 3,
+ COMM_F_CMDQ_NUM = 1U << 4,
+ COMM_F_VIRTIO_VQ_SIZE = 1U << 5,
+};
+
+#define COMM_MAX_FEATURE_QWORD 4
+struct comm_cmd_feature_nego {
+ struct mgmt_msg_head head;
+
+ u16 func_id;
+ u8 opcode; /* 1: set, 0: get */
+ u8 rsvd;
+ u64 s_feature[COMM_MAX_FEATURE_QWORD];
+};
+
+struct comm_cmd_clear_doorbell {
+ struct mgmt_msg_head head;
+
+ u16 func_id;
+ u16 rsvd1[3];
+};
+
+struct comm_cmd_clear_resource {
+ struct mgmt_msg_head head;
+
+ u16 func_id;
+ u16 rsvd1[3];
+};
+
+struct comm_global_attr {
+ u8 max_host_num;
+ u8 max_pf_num;
+ u16 vf_id_start;
+
+ u8 mgmt_host_node_id; /* for api cmd to mgmt cpu */
+ u8 cmdq_num;
+ u8 rsvd1[2];
+
+ u32 rsvd2[8];
+};
+
+typedef struct {
+ struct comm_info_head head;
+
+ u8 op_code; /* 0: get 1: set 2: check */
+ u8 rsvd[3];
+ u32 freq;
+} spu_cmd_freq_operation;
+
+typedef struct {
+ struct comm_info_head head;
+
+ u8 op_code; /* 0: get 1: set 2: init */
+ u8 slave_addr;
+ u8 cmd_id;
+ u8 size;
+ u32 value;
+} spu_cmd_power_operation;
+
+typedef struct {
+ struct comm_info_head head;
+
+ u8 op_code;
+ u8 rsvd[3];
+ s16 fabric_tsensor_temp_avg;
+ s16 fabric_tsensor_temp;
+ s16 sys_tsensor_temp_avg;
+ s16 sys_tsensor_temp;
+} spu_cmd_tsensor_operation;
+
+struct comm_cmd_heart_event {
+ struct mgmt_msg_head head;
+
+ u8 init_sta; /* 0: mpu init ok, 1: mpu init error. */
+ u8 rsvd1[3];
+ u32 heart; /* add one by one */
+ u32 heart_handshake; /* should always be 0x5A5A5A5A */
+};
+
+struct comm_cmd_channel_detect {
+ struct mgmt_msg_head head;
+
+ u16 func_id;
+ u16 rsvd1[3];
+ u32 rsvd2[2];
+};
+
+enum hinic3_svc_type {
+ SVC_T_COMM = 0,
+ SVC_T_NIC,
+ SVC_T_OVS,
+ SVC_T_ROCE,
+ SVC_T_TOE,
+ SVC_T_IOE,
+ SVC_T_FC,
+ SVC_T_VBS,
+ SVC_T_IPSEC,
+ SVC_T_VIRTIO,
+ SVC_T_MIGRATE,
+ SVC_T_PPA,
+ SVC_T_MAX,
+};
+
+struct comm_cmd_func_svc_used_state {
+ struct mgmt_msg_head head;
+ u16 func_id;
+ u16 svc_type;
+ u8 used_state;
+ u8 rsvd[35];
+};
+
+#define TABLE_INDEX_MAX 129
+
+struct sml_table_id_info {
+ u8 node_id;
+ u8 instance_id;
+};
+
+struct comm_cmd_get_sml_tbl_data {
+ struct comm_info_head head; /* 8B */
+ u8 tbl_data[512];
+};
+
+struct comm_cmd_get_glb_attr {
+ struct mgmt_msg_head head;
+
+ struct comm_global_attr attr;
+};
+
+enum hinic3_fw_ver_type {
+ HINIC3_FW_VER_TYPE_BOOT,
+ HINIC3_FW_VER_TYPE_MPU,
+ HINIC3_FW_VER_TYPE_NPU,
+ HINIC3_FW_VER_TYPE_SMU_L0,
+ HINIC3_FW_VER_TYPE_SMU_L1,
+ HINIC3_FW_VER_TYPE_CFG,
+};
+
+#define HINIC3_FW_VERSION_LEN 16
+#define HINIC3_FW_COMPILE_TIME_LEN 20
+struct comm_cmd_get_fw_version {
+ struct mgmt_msg_head head;
+
+ u16 fw_type;
+ u16 rsvd1;
+ u8 ver[HINIC3_FW_VERSION_LEN];
+ u8 time[HINIC3_FW_COMPILE_TIME_LEN];
+};
+
+/* hardware define: cmdq context */
+struct cmdq_ctxt_info {
+ u64 curr_wqe_page_pfn;
+ u64 wq_block_pfn;
+};
+
+struct comm_cmd_cmdq_ctxt {
+ struct mgmt_msg_head head;
+
+ u16 func_id;
+ u8 cmdq_id;
+ u8 rsvd1[5];
+
+ struct cmdq_ctxt_info ctxt;
+};
+
+struct comm_cmd_root_ctxt {
+ struct mgmt_msg_head head;
+
+ u16 func_id;
+ u8 set_cmdq_depth;
+ u8 cmdq_depth;
+ u16 rx_buf_sz;
+ u8 lro_en;
+ u8 rsvd1;
+ u16 sq_depth;
+ u16 rq_depth;
+ u64 rsvd2;
+};
+
+struct comm_cmd_wq_page_size {
+ struct mgmt_msg_head head;
+
+ u16 func_id;
+ u8 opcode;
+ /* real_size=4KB*2^page_size, range(0~20) must be checked by driver */
+ u8 page_size;
+
+ u32 rsvd1;
+};
+
+struct comm_cmd_msix_config {
+ struct mgmt_msg_head head;
+
+ u16 func_id;
+ u8 opcode;
+ u8 rsvd1;
+ u16 msix_index;
+ u8 pending_cnt;
+ u8 coalesce_timer_cnt;
+ u8 resend_timer_cnt;
+ u8 lli_timer_cnt;
+ u8 lli_credit_cnt;
+ u8 rsvd2[5];
+};
+
+enum cfg_msix_operation {
+ CFG_MSIX_OPERATION_FREE = 0,
+ CFG_MSIX_OPERATION_ALLOC = 1,
+};
+
+struct comm_cmd_cfg_msix_num {
+ struct comm_info_head head; /* 8B */
+
+ u16 func_id;
+ u8 op_code; /* 1: alloc 0: free */
+ u8 rsvd0;
+
+ u16 msix_num;
+ u16 rsvd1;
+};
+
+struct comm_cmd_dma_attr_config {
+ struct mgmt_msg_head head;
+
+ u16 func_id;
+ u8 entry_idx;
+ u8 st;
+ u8 at;
+ u8 ph;
+ u8 no_snooping;
+ u8 tph_en;
+ u32 resv1;
+};
+
+struct comm_cmd_ceq_ctrl_reg {
+ struct mgmt_msg_head head;
+
+ u16 func_id;
+ u16 q_id;
+ u32 ctrl0;
+ u32 ctrl1;
+ u32 rsvd1;
+};
+
+struct comm_cmd_func_tmr_bitmap_op {
+ struct mgmt_msg_head head;
+
+ u16 func_id;
+ u8 opcode; /* 1: start, 0: stop */
+ u8 rsvd1[5];
+};
+
+struct comm_cmd_ppf_tmr_op {
+ struct mgmt_msg_head head;
+
+ u8 ppf_id;
+ u8 opcode; /* 1: start, 0: stop */
+ u8 rsvd1[6];
+};
+
+struct comm_cmd_ht_gpa {
+ struct mgmt_msg_head head;
+
+ u8 host_id;
+ u8 rsvd0[3];
+ u32 rsvd1[7];
+ u64 page_pa0;
+ u64 page_pa1;
+};
+
+struct comm_cmd_get_eqm_num {
+ struct mgmt_msg_head head;
+
+ u8 host_id;
+ u8 rsvd1[3];
+ u32 chunk_num;
+ u32 search_gpa_num;
+};
+
+struct comm_cmd_eqm_cfg {
+ struct mgmt_msg_head head;
+
+ u8 host_id;
+ u8 valid;
+ u16 rsvd1;
+ u32 page_size;
+ u32 rsvd2;
+};
+
+struct comm_cmd_eqm_search_gpa {
+ struct mgmt_msg_head head;
+
+ u8 host_id;
+ u8 rsvd1[3];
+ u32 start_idx;
+ u32 num;
+ u32 rsvd2;
+ u64 gpa_hi52[0]; /*lint !e1501*/
+};
+
+struct comm_cmd_ffm_info {
+ struct mgmt_msg_head head;
+
+ u8 node_id;
+ /* error level of the interrupt source */
+ u8 err_level;
+ /* Classification by interrupt source properties */
+ u16 err_type;
+ u32 err_csr_addr;
+ u32 err_csr_value;
+ u32 rsvd1;
+};
+
+#define HARDWARE_ID_1XX3V100_TAG 31 /* 1xx3v100 tag */
+
+struct hinic3_board_info {
+ u8 board_type;
+ u8 port_num;
+ u8 port_speed;
+ u8 pcie_width;
+ u8 host_num;
+ u8 pf_num;
+ u16 vf_total_num;
+ u8 tile_num;
+ u8 qcm_num;
+ u8 core_num;
+ u8 work_mode;
+ u8 service_mode;
+ u8 pcie_mode;
+ u8 boot_sel;
+ u8 board_id;
+ u32 cfg_addr;
+ u32 service_en_bitmap;
+ u8 scenes_id;
+ u8 cfg_template_id;
+ u8 hardware_id;
+ u8 spu_en;
+ u16 pf_vendor_id;
+ u8 tile_bitmap;
+ u8 sm_bitmap;
+};
+
+struct comm_cmd_board_info {
+ struct mgmt_msg_head head;
+
+ struct hinic3_board_info info;
+ u32 rsvd[22];
+};
+
+struct comm_cmd_sync_time {
+ struct mgmt_msg_head head;
+
+ u64 mstime;
+ u64 rsvd1;
+};
+
+struct comm_cmd_sdi_info {
+ struct mgmt_msg_head head;
+ u32 cfg_sdi_mode;
+};
+
+/* func flr set */
+struct comm_cmd_func_flr_set {
+ struct mgmt_msg_head head;
+
+ u16 func_id;
+ u8 type; /* 1: close, set the flush flag */
+ u8 isall; /* whether to operate on all VFs under this PF; 1: all VFs */
+ u32 rsvd;
+};
+
+struct comm_cmd_bdf_info {
+ struct mgmt_msg_head head;
+
+ u16 function_idx;
+ u8 rsvd1[2];
+ u8 bus;
+ u8 device;
+ u8 function;
+ u8 rsvd2[5];
+};
+
+struct hw_pf_info {
+ u16 glb_func_idx;
+ u16 glb_pf_vf_offset;
+ u8 p2p_idx;
+ u8 itf_idx;
+ u16 max_vfs;
+ u16 max_queue_num;
+ u16 vf_max_queue_num;
+ u16 port_id;
+ u16 rsvd0;
+ u32 pf_service_en_bitmap;
+ u32 vf_service_en_bitmap;
+ u16 rsvd1[2];
+
+ u8 device_type;
+ u8 bus_num; /* tl_cfg_bus_num */
+ u16 vf_stride; /* VF_RID_SETTING.vf_stride */
+ u16 vf_offset; /* VF_RID_SETTING.vf_offset */
+ u8 rsvd[2];
+};
+
+#define CMD_MAX_MAX_PF_NUM 32
+struct hinic3_hw_pf_infos {
+ u8 num_pfs;
+ u8 rsvd1[3];
+
+ struct hw_pf_info infos[CMD_MAX_MAX_PF_NUM];
+};
+
+struct comm_cmd_hw_pf_infos {
+ struct mgmt_msg_head head;
+
+ struct hinic3_hw_pf_infos infos;
+};
+
+#define DD_CFG_TEMPLATE_MAX_IDX 12
+#define DD_CFG_TEMPLATE_MAX_TXT_LEN 64
+#define CFG_TEMPLATE_OP_QUERY 0
+#define CFG_TEMPLATE_OP_SET 1
+#define CFG_TEMPLATE_SET_MODE_BY_IDX 0
+#define CFG_TEMPLATE_SET_MODE_BY_NAME 1
+
+struct comm_cmd_cfg_template {
+ struct mgmt_msg_head head;
+ u8 opt_type; /* 0: query 1: set */
+ u8 set_mode; /* 0-index mode. 1-name mode. */
+ u8 tp_err;
+ u8 rsvd0;
+
+ u8 cur_index; /* Current cfg template index. */
+ u8 cur_max_index; /* Max supported cfg template index. */
+ u8 rsvd1[2];
+ u8 cur_name[DD_CFG_TEMPLATE_MAX_TXT_LEN];
+ u8 cur_cfg_temp_info[DD_CFG_TEMPLATE_MAX_IDX][DD_CFG_TEMPLATE_MAX_TXT_LEN];
+
+ u8 next_index; /* Next reset cfg template index. */
+ u8 next_max_index; /* Max supported cfg template index. */
+ u8 rsvd2[2];
+ u8 next_name[DD_CFG_TEMPLATE_MAX_TXT_LEN];
+ u8 next_cfg_temp_info[DD_CFG_TEMPLATE_MAX_IDX][DD_CFG_TEMPLATE_MAX_TXT_LEN];
+};
+
+#define MQM_SUPPORT_COS_NUM 8
+#define MQM_INVALID_WEIGHT 256
+#define MQM_LIMIT_SET_FLAG_READ 0
+#define MQM_LIMIT_SET_FLAG_WRITE 1
+struct comm_cmd_set_mqm_limit {
+ struct mgmt_msg_head head;
+
+ u16 set_flag; /* set this flag to write the configuration */
+ u16 func_id;
+ /* weight of each cos_id, 0-255; 0 means SP scheduling. */
+ u16 cos_weight[MQM_SUPPORT_COS_NUM];
+ u32 host_min_rate; /* minimum rate limit supported by this host */
+ u32 func_min_rate; /* minimum rate limit supported by this function, in Mbps */
+ u32 func_max_rate; /* maximum rate limit supported by this function, in Mbps */
+ u8 rsvd[64]; /* Reserved */
+};
+
+#define DUMP_16B_PER_LINE 16
+#define DUMP_8_VAR_PER_LINE 8
+#define DUMP_4_VAR_PER_LINE 4
+
+#define DATA_LEN_1K 1024
+/* Software watchdog timeout reporting interface */
+struct comm_info_sw_watchdog {
+ struct comm_info_head head;
+
+ /* Global information */
+ u32 curr_time_h; /* time when the dead loop occurred, in cycles (high 32 bits) */
+ u32 curr_time_l; /* time when the dead loop occurred, in cycles (low 32 bits) */
+ u32 task_id; /* task in which the dead loop occurred */
+ u32 rsv; /* reserved field for extension */
+
+ /* Register information, TSK_CONTEXT_S */
+ u64 pc;
+
+ u64 elr;
+ u64 spsr;
+ u64 far;
+ u64 esr;
+ u64 xzr;
+ u64 x30;
+ u64 x29;
+ u64 x28;
+ u64 x27;
+ u64 x26;
+ u64 x25;
+ u64 x24;
+ u64 x23;
+ u64 x22;
+ u64 x21;
+ u64 x20;
+ u64 x19;
+ u64 x18;
+ u64 x17;
+ u64 x16;
+ u64 x15;
+ u64 x14;
+ u64 x13;
+ u64 x12;
+ u64 x11;
+ u64 x10;
+ u64 x09;
+ u64 x08;
+ u64 x07;
+ u64 x06;
+ u64 x05;
+ u64 x04;
+ u64 x03;
+ u64 x02;
+ u64 x01;
+ u64 x00;
+
+ /* Stack control information, STACK_INFO_S */
+ u64 stack_top; /* stack top */
+ u64 stack_bottom; /* stack bottom */
+ u64 sp; /* current stack pointer */
+ u32 curr_used; /* current stack usage */
+ u32 peak_used; /* peak stack usage */
+ u32 is_overflow; /* whether the stack overflowed */
+
+ /* Stack contents */
+ u32 stack_actlen; /* actual stack length (<= 1024) */
+ u8 stack_data[DATA_LEN_1K]; /* data beyond 1024 bytes is truncated */
+};
+
+/* Last-word (dying gasp) information */
+#define XREGS_NUM 31
+typedef struct tag_cpu_tick {
+ u32 cnt_hi; /**< cycle count, high 32 bits */
+ u32 cnt_lo; /**< cycle count, low 32 bits */
+} CPU_TICK;
+
+typedef struct tag_ax_exc_reg_info {
+ u64 ttbr0;
+ u64 ttbr1;
+ u64 tcr;
+ u64 mair;
+ u64 sctlr;
+ u64 vbar;
+ u64 current_el;
+ u64 sp;
+ /* The memory layout of the following fields matches TskContext */
+ u64 elr; /* return address */
+ u64 spsr;
+ u64 far_r;
+ u64 esr;
+ u64 xzr;
+ u64 xregs[XREGS_NUM]; /* 0~30: x30~x0 */
+} EXC_REGS_S;
+
+typedef struct tag_exc_info {
+ char os_ver[48]; /**< OS version */
+ char app_ver[64]; /**< product version */
+ u32 exc_cause; /**< exception cause */
+ u32 thread_type; /**< thread type before the exception */
+ u32 thread_id; /**< thread PID before the exception */
+ u16 byte_order; /**< byte order */
+ u16 cpu_type; /**< CPU type */
+ u32 cpu_id; /**< CPU ID */
+ CPU_TICK cpu_tick; /**< CPU tick */
+ u32 nest_cnt; /**< exception nesting count */
+ u32 fatal_errno; /**< fatal error code, valid when a fatal error occurs */
+ u64 uw_sp; /**< stack pointer before the exception */
+ u64 stack_bottom; /**< stack bottom before the exception */
+ /* In-core register context at the time of the exception; it must be
+ * located at byte offset 152. If this changes, update the
+ * OS_EXC_REGINFO_OFFSET macro in sre_platform.eh.
+ */
+ EXC_REGS_S reg_info;
+} EXC_INFO_S;
+
+/* Interface of the MPU lastword module reported to the driver */
+#define MPU_LASTWORD_SIZE 1024
+typedef struct tag_comm_info_up_lastword {
+ struct comm_info_head head;
+
+ EXC_INFO_S stack_info;
+
+ /* Stack contents */
+ u32 stack_actlen; /* actual stack length (<= 1024) */
+ u8 stack_data[MPU_LASTWORD_SIZE]; /* data beyond 1024 bytes is truncated */
+} comm_info_up_lastword_s;
+
+#define FW_UPDATE_MGMT_TIMEOUT 3000000U
+
+struct hinic3_cmd_update_firmware {
+ struct mgmt_msg_head msg_head;
+
+ struct {
+ u32 sl : 1;
+ u32 sf : 1;
+ u32 flag : 1;
+ u32 bit_signed : 1;
+ u32 reserved : 12;
+ u32 fragment_len : 16;
+ } ctl_info;
+
+ struct {
+ u32 section_crc;
+ u32 section_type;
+ } section_info;
+
+ u32 total_len;
+ u32 section_len;
+ u32 section_version;
+ u32 section_offset;
+ u32 data[384];
+};
+
+struct hinic3_cmd_activate_firmware {
+ struct mgmt_msg_head msg_head;
+ u8 index; /* 0 ~ 7 */
+ u8 data[7];
+};
+
+struct hinic3_cmd_switch_config {
+ struct mgmt_msg_head msg_head;
+ u8 index; /* 0 ~ 7 */
+ u8 data[7];
+};
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_comm_cmd.h b/drivers/net/ethernet/huawei/hinic3/hinic3_comm_cmd.h
new file mode 100644
index 000000000000..ad732c337520
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_comm_cmd.h
@@ -0,0 +1,185 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) Huawei Technologies Co., Ltd. 2019-2022. All rights reserved.
+ * File Name : hinic3_comm_cmd.h
+ * Version : Initial Draft
+ * Created : 2019/4/25
+ * Last Modified :
+ * Description : COMM Commands between Driver and MPU
+ * Function List :
+ */
+
+#ifndef HINIC3_COMMON_CMD_H
+#define HINIC3_COMMON_CMD_H
+
+/* COMM commands between driver and MPU */
+enum hinic3_mgmt_cmd {
+ /* Commands related to FLR and resource cleanup */
+ COMM_MGMT_CMD_FUNC_RESET = 0,
+ COMM_MGMT_CMD_FEATURE_NEGO,
+ COMM_MGMT_CMD_FLUSH_DOORBELL,
+ COMM_MGMT_CMD_START_FLUSH,
+ COMM_MGMT_CMD_SET_FUNC_FLR,
+ COMM_MGMT_CMD_GET_GLOBAL_ATTR,
+ COMM_MGMT_CMD_SET_PPF_FLR_TYPE,
+ COMM_MGMT_CMD_SET_FUNC_SVC_USED_STATE,
+
+ /* Allocate MSI-X interrupt resources */
+ COMM_MGMT_CMD_CFG_MSIX_NUM = 10,
+
+ /* Driver-related configuration commands */
+ COMM_MGMT_CMD_SET_CMDQ_CTXT = 20,
+ COMM_MGMT_CMD_SET_VAT,
+ COMM_MGMT_CMD_CFG_PAGESIZE,
+ COMM_MGMT_CMD_CFG_MSIX_CTRL_REG,
+ COMM_MGMT_CMD_SET_CEQ_CTRL_REG,
+ COMM_MGMT_CMD_SET_DMA_ATTR,
+
+ /* INFRA configuration commands */
+ COMM_MGMT_CMD_GET_MQM_FIX_INFO = 40,
+ COMM_MGMT_CMD_SET_MQM_CFG_INFO,
+ COMM_MGMT_CMD_SET_MQM_SRCH_GPA,
+ COMM_MGMT_CMD_SET_PPF_TMR,
+ COMM_MGMT_CMD_SET_PPF_HT_GPA,
+ COMM_MGMT_CMD_SET_FUNC_TMR_BITMAT,
+ COMM_MGMT_CMD_SET_MBX_CRDT,
+ COMM_MGMT_CMD_CFG_TEMPLATE,
+ COMM_MGMT_CMD_SET_MQM_LIMIT,
+
+ /* Information query commands */
+ COMM_MGMT_CMD_GET_FW_VERSION = 60,
+ COMM_MGMT_CMD_GET_BOARD_INFO,
+ COMM_MGMT_CMD_SYNC_TIME,
+ COMM_MGMT_CMD_GET_HW_PF_INFOS,
+ COMM_MGMT_CMD_SEND_BDF_INFO,
+ COMM_MGMT_CMD_GET_VIRTIO_BDF_INFO,
+ COMM_MGMT_CMD_GET_SML_TABLE_INFO,
+ COMM_MGMT_CMD_GET_SDI_INFO,
+
+ /* Upgrade-related commands */
+ COMM_MGMT_CMD_UPDATE_FW = 80,
+ COMM_MGMT_CMD_ACTIVE_FW,
+ COMM_MGMT_CMD_HOT_ACTIVE_FW,
+ COMM_MGMT_CMD_HOT_ACTIVE_DONE_NOTICE,
+ COMM_MGMT_CMD_SWITCH_CFG,
+ COMM_MGMT_CMD_CHECK_FLASH,
+ COMM_MGMT_CMD_CHECK_FLASH_RW,
+ COMM_MGMT_CMD_RESOURCE_CFG,
+ COMM_MGMT_CMD_UPDATE_BIOS, /* TODO: merge to COMM_MGMT_CMD_UPDATE_FW */
+ COMM_MGMT_CMD_MPU_GIT_CODE,
+
+ /* Chip reset related */
+ COMM_MGMT_CMD_FAULT_REPORT = 100,
+ COMM_MGMT_CMD_WATCHDOG_INFO,
+ COMM_MGMT_CMD_MGMT_RESET,
+ COMM_MGMT_CMD_FFM_SET, /* TODO: check if needed */
+
+ /* Chip info/log related */
+ COMM_MGMT_CMD_GET_LOG = 120,
+ COMM_MGMT_CMD_TEMP_OP,
+ COMM_MGMT_CMD_EN_AUTO_RST_CHIP,
+ COMM_MGMT_CMD_CFG_REG,
+ COMM_MGMT_CMD_GET_CHIP_ID,
+ COMM_MGMT_CMD_SYSINFO_DFX,
+ COMM_MGMT_CMD_PCIE_DFX_NTC,
+ COMM_MGMT_CMD_DICT_LOG_STATUS, /* LOG STATUS 127 */
+ COMM_MGMT_CMD_MSIX_INFO,
+ COMM_MGMT_CMD_CHANNEL_DETECT,
+ COMM_MGMT_CMD_DICT_COUNTER_STATUS,
+
+ /* Switch work mode related */
+ COMM_MGMT_CMD_CHECK_IF_SWITCH_WORKMODE = 140,
+ COMM_MGMT_CMD_SWITCH_WORKMODE,
+
+ /* MPU related */
+ COMM_MGMT_CMD_MIGRATE_DFX_HPA = 150,
+ COMM_MGMT_CMD_BDF_INFO,
+ COMM_MGMT_CMD_NCSI_CFG_INFO_GET_PROC,
+
+ /* rsvd0 section */
+ COMM_MGMT_CMD_SECTION_RSVD_0 = 160,
+
+ /* rsvd1 section */
+ COMM_MGMT_CMD_SECTION_RSVD_1 = 170,
+
+ /* rsvd2 section */
+ COMM_MGMT_CMD_SECTION_RSVD_2 = 180,
+
+ /* rsvd3 section */
+ COMM_MGMT_CMD_SECTION_RSVD_3 = 190,
+
+ /* TODO: move to DFT mode */
+ COMM_MGMT_CMD_GET_DIE_ID = 200,
+ COMM_MGMT_CMD_GET_EFUSE_TEST,
+ COMM_MGMT_CMD_EFUSE_INFO_CFG,
+ COMM_MGMT_CMD_GPIO_CTL,
+ COMM_MGMT_CMD_HI30_SERLOOP_START, /* TODO: DFT or hilink */
+ COMM_MGMT_CMD_HI30_SERLOOP_STOP, /* TODO: DFT or hilink */
+ COMM_MGMT_CMD_HI30_MBIST_SET_FLAG, /* TODO: DFT or hilink */
+ COMM_MGMT_CMD_HI30_MBIST_GET_RESULT, /* TODO: DFT or hilink */
+ COMM_MGMT_CMD_ECC_TEST,
+ COMM_MGMT_CMD_FUNC_BIST_TEST, /* 209 */
+
+ COMM_MGMT_CMD_VPD_SET = 210,
+ COMM_MGMT_CMD_VPD_GET,
+
+ COMM_MGMT_CMD_ERASE_FLASH,
+ COMM_MGMT_CMD_QUERY_FW_INFO,
+ COMM_MGMT_CMD_GET_CFG_INFO,
+ COMM_MGMT_CMD_GET_UART_LOG,
+ COMM_MGMT_CMD_SET_UART_CMD,
+ COMM_MGMT_CMD_SPI_TEST,
+
+ /* TODO: ALL reg read/write merge to COMM_MGMT_CMD_CFG_REG */
+ COMM_MGMT_CMD_UP_REG_GET,
+ COMM_MGMT_CMD_UP_REG_SET, /* 219 */
+
+ COMM_MGMT_CMD_REG_READ = 220,
+ COMM_MGMT_CMD_REG_WRITE,
+ COMM_MGMT_CMD_MAG_REG_WRITE,
+ COMM_MGMT_CMD_ANLT_REG_WRITE,
+
+ COMM_MGMT_CMD_HEART_EVENT, /* TODO: delete */
+ COMM_MGMT_CMD_NCSI_OEM_GET_DRV_INFO, /* TODO: delete */
+ COMM_MGMT_CMD_LASTWORD_GET,
+ COMM_MGMT_CMD_READ_BIN_DATA, /* TODO: delete */
+ /* COMM_MGMT_CMD_WWPN_GET, TODO: move to FC? */
+ /* COMM_MGMT_CMD_WWPN_SET, TODO: move to FC? */ /* 229 */
+
+ /* TODO: check if needed */
+ COMM_MGMT_CMD_SET_VIRTIO_DEV = 230,
+ COMM_MGMT_CMD_SET_MAC,
+ /* MPU patch cmd */
+ COMM_MGMT_CMD_LOAD_PATCH,
+ COMM_MGMT_CMD_REMOVE_PATCH,
+ COMM_MGMT_CMD_PATCH_ACTIVE,
+ COMM_MGMT_CMD_PATCH_DEACTIVE,
+ COMM_MGMT_CMD_PATCH_SRAM_OPTIMIZE,
+ /* container host process */
+ COMM_MGMT_CMD_CONTAINER_HOST_PROC,
+ /* nsci counter */
+ COMM_MGMT_CMD_NCSI_COUNTER_PROC,
+ COMM_MGMT_CMD_CHANNEL_STATUS_CHECK, /* 239 */
+
+ /* hot patch rsvd cmd */
+ COMM_MGMT_CMD_RSVD_0 = 240,
+ COMM_MGMT_CMD_RSVD_1,
+ COMM_MGMT_CMD_RSVD_2,
+ COMM_MGMT_CMD_RSVD_3,
+ COMM_MGMT_CMD_RSVD_4,
+ /* Invalid field; to be removed during version consolidation, kept for compilation */
+ COMM_MGMT_CMD_SEND_API_ACK_BY_UP,
+
+ /* Note: when adding a cmd, do not change the value of any existing
+ * command; add it in one of the rsvd sections above. In principle,
+ * the cmd tables of all branches must be identical.
+ */
+ COMM_MGMT_CMD_MAX = 255,
+};
+
+/* CmdQ Common subtype */
+enum comm_cmdq_cmd {
+ COMM_CMD_UCODE_ARM_BIT_SET = 2,
+ COMM_CMD_SEND_NPU_DFT_CMD,
+};
+
+#endif /* HINIC3_COMMON_CMD_H */
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_common.h b/drivers/net/ethernet/huawei/hinic3/hinic3_common.h
new file mode 100644
index 000000000000..3010083e5200
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_common.h
@@ -0,0 +1,119 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_COMMON_H
+#define HINIC3_COMMON_H
+
+#include <linux/types.h>
+
+struct hinic3_dma_addr_align {
+ u32 real_size;
+
+ void *ori_vaddr;
+ dma_addr_t ori_paddr;
+
+ void *align_vaddr;
+ dma_addr_t align_paddr;
+};
+
+enum hinic3_wait_return {
+ WAIT_PROCESS_CPL = 0,
+ WAIT_PROCESS_WAITING = 1,
+ WAIT_PROCESS_ERR = 2,
+};
+
+struct hinic3_sge {
+ u32 hi_addr;
+ u32 lo_addr;
+ u32 len;
+};
+
+#ifdef static
+#undef static
+#define LLT_STATIC_DEF_SAVED
+#endif
+
+/**
+ * hinic3_cpu_to_be32 - convert data to big endian 32 bit format
+ * @data: the data to convert
+ * @len: length of data to convert, must be a multiple of 4 bytes
+ */
+static inline void hinic3_cpu_to_be32(void *data, int len)
+{
+ int i, chunk_sz = sizeof(u32);
+ int data_len = len;
+ u32 *mem = data;
+
+ if (!data)
+ return;
+
+ data_len = data_len / chunk_sz;
+
+ for (i = 0; i < data_len; i++) {
+ *mem = cpu_to_be32(*mem);
+ mem++;
+ }
+}
+
+/**
+ * hinic3_be32_to_cpu - convert data from big endian 32 bit format
+ * @data: the data to convert
+ * @len: length of data to convert
+ */
+static inline void hinic3_be32_to_cpu(void *data, int len)
+{
+ int i, chunk_sz = sizeof(u32);
+ int data_len = len;
+ u32 *mem = data;
+
+ if (!data)
+ return;
+
+ data_len = data_len / chunk_sz;
+
+ for (i = 0; i < data_len; i++) {
+ *mem = be32_to_cpu(*mem);
+ mem++;
+ }
+}
+
+/**
+ * hinic3_set_sge - set dma area in scatter gather entry
+ * @sge: scatter gather entry
+ * @addr: dma address
+ * @len: length of relevant data in the dma address
+ */
+static inline void hinic3_set_sge(struct hinic3_sge *sge, dma_addr_t addr,
+ int len)
+{
+ sge->hi_addr = upper_32_bits(addr);
+ sge->lo_addr = lower_32_bits(addr);
+ sge->len = len;
+}
+
+#define hinic3_hw_be32(val) (val)
+#define hinic3_hw_cpu32(val) (val)
+#define hinic3_hw_cpu16(val) (val)
+
+static inline void hinic3_hw_be32_len(void *data, int len)
+{
+}
+
+static inline void hinic3_hw_cpu32_len(void *data, int len)
+{
+}
+
+int hinic3_dma_zalloc_coherent_align(void *dev_hdl, u64 size, u64 align,
+ unsigned int flag,
+ struct hinic3_dma_addr_align *mem_align);
+
+void hinic3_dma_free_coherent_align(void *dev_hdl,
+ struct hinic3_dma_addr_align *mem_align);
+
+
+typedef enum hinic3_wait_return (*wait_cpl_handler)(void *priv_data);
+
+int hinic3_wait_for_timeout(void *priv_data, wait_cpl_handler handler,
+ u32 wait_total_ms, u32 wait_once_us);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_crm.h b/drivers/net/ethernet/huawei/hinic3/hinic3_crm.h
new file mode 100644
index 000000000000..98adaf057b47
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_crm.h
@@ -0,0 +1,1162 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_CRM_H
+#define HINIC3_CRM_H
+
+#define HINIC3_DBG
+
+#define HINIC3_DRV_VERSION ""
+#define HINIC3_DRV_DESC "Intelligent Network Interface Card Driver"
+#define HIUDK_DRV_DESC "Intelligent Network Unified Driver"
+
+#define ARRAY_LEN(arr) ((int)((int)sizeof(arr) / (int)sizeof((arr)[0])))
+
+#define HINIC3_MGMT_VERSION_MAX_LEN 32
+
+#define HINIC3_FW_VERSION_NAME 16
+#define HINIC3_FW_VERSION_SECTION_CNT 4
+#define HINIC3_FW_VERSION_SECTION_BORDER 0xFF
+struct hinic3_fw_version {
+ u8 mgmt_ver[HINIC3_FW_VERSION_NAME];
+ u8 microcode_ver[HINIC3_FW_VERSION_NAME];
+ u8 boot_ver[HINIC3_FW_VERSION_NAME];
+};
+
+#define HINIC3_MGMT_CMD_UNSUPPORTED 0xFF
+
+/* Each driver sees only its own capability structure, such as
+ * nic_service_cap or toe_service_cap, not the full service_cap.
+ */
+enum hinic3_service_type {
+ SERVICE_T_NIC = 0,
+ SERVICE_T_OVS,
+ SERVICE_T_ROCE,
+ SERVICE_T_TOE,
+ SERVICE_T_IOE,
+ SERVICE_T_FC,
+ SERVICE_T_VBS,
+ SERVICE_T_IPSEC,
+ SERVICE_T_VIRTIO,
+ SERVICE_T_MIGRATE,
+ SERVICE_T_PPA,
+ SERVICE_T_CUSTOM,
+ SERVICE_T_VROCE,
+ SERVICE_T_MAX,
+
+	/* Only used for interrupt resource management,
+	 * to mark the requesting module
+	 */
+ SERVICE_T_INTF = (1 << 15),
+ SERVICE_T_CQM = (1 << 16),
+};
+
+enum hinic3_ppf_flr_type {
+ STATELESS_FLR_TYPE,
+ STATEFUL_FLR_TYPE,
+};
+
+struct nic_service_cap {
+ u16 max_sqs;
+ u16 max_rqs;
+ u16 default_num_queues;
+};
+
+struct ppa_service_cap {
+ u16 qpc_fake_vf_start;
+ u16 qpc_fake_vf_num;
+ u32 qpc_fake_vf_ctx_num;
+ u32 pctx_sz; /* 512B */
+ u32 bloomfilter_length;
+ u8 bloomfilter_en;
+ u8 rsvd;
+ u16 rsvd1;
+};
+
+struct vbs_service_cap {
+ u16 vbs_max_volq;
+ u16 rsvd1;
+};
+
+struct migr_service_cap {
+ u8 master_host_id;
+ u8 rsvd[3];
+};
+
+/* PF/VF ToE service resource structure */
+struct dev_toe_svc_cap {
+ /* PF resources */
+	u32 max_pctxs; /* Parent Context: max specification is 1M */
+ u32 max_cctxt;
+ u32 max_cqs;
+ u16 max_srqs;
+ u32 srq_id_start;
+ u32 max_mpts;
+};
+
+/* ToE services */
+struct toe_service_cap {
+ struct dev_toe_svc_cap dev_toe_cap;
+
+ bool alloc_flag;
+ u32 pctx_sz; /* 1KB */
+ u32 scqc_sz; /* 64B */
+};
+
+/* PF FC service resource structure defined */
+struct dev_fc_svc_cap {
+ /* PF Parent QPC */
+ u32 max_parent_qpc_num; /* max number is 2048 */
+
+ /* PF Child QPC */
+ u32 max_child_qpc_num; /* max number is 2048 */
+ u32 child_qpc_id_start;
+
+ /* PF SCQ */
+ u32 scq_num; /* 16 */
+
+ /* PF supports SRQ */
+ u32 srq_num; /* Number of SRQ is 2 */
+
+ u8 vp_id_start;
+ u8 vp_id_end;
+};
+
+/* FC services */
+struct fc_service_cap {
+ struct dev_fc_svc_cap dev_fc_cap;
+
+ /* Parent QPC */
+ u32 parent_qpc_size; /* 256B */
+
+ /* Child QPC */
+ u32 child_qpc_size; /* 256B */
+
+ /* SQ */
+ u32 sqe_size; /* 128B(in linked list mode) */
+
+ /* SCQ */
+ u32 scqc_size; /* Size of the Context 32B */
+ u32 scqe_size; /* 64B */
+
+ /* SRQ */
+ u32 srqc_size; /* Size of SRQ Context (64B) */
+ u32 srqe_size; /* 32B */
+};
+
+struct dev_roce_svc_own_cap {
+ u32 max_qps;
+ u32 max_cqs;
+ u32 max_srqs;
+ u32 max_mpts;
+ u32 max_drc_qps;
+
+ u32 cmtt_cl_start;
+ u32 cmtt_cl_end;
+ u32 cmtt_cl_sz;
+
+ u32 dmtt_cl_start;
+ u32 dmtt_cl_end;
+ u32 dmtt_cl_sz;
+
+ u32 wqe_cl_start;
+ u32 wqe_cl_end;
+ u32 wqe_cl_sz;
+
+ u32 qpc_entry_sz;
+ u32 max_wqes;
+ u32 max_rq_sg;
+ u32 max_sq_inline_data_sz;
+ u32 max_rq_desc_sz;
+
+ u32 rdmarc_entry_sz;
+ u32 max_qp_init_rdma;
+ u32 max_qp_dest_rdma;
+
+ u32 max_srq_wqes;
+ u32 reserved_srqs;
+ u32 max_srq_sge;
+ u32 srqc_entry_sz;
+
+ u32 max_msg_sz; /* Message size 2GB */
+};
+
+/* RDMA service capability structure */
+struct dev_rdma_svc_cap {
+ /* ROCE service unique parameter structure */
+ struct dev_roce_svc_own_cap roce_own_cap;
+};
+
+/* Defines the RDMA service capability flag */
+enum {
+ RDMA_BMME_FLAG_LOCAL_INV = (1 << 0),
+ RDMA_BMME_FLAG_REMOTE_INV = (1 << 1),
+ RDMA_BMME_FLAG_FAST_REG_WR = (1 << 2),
+ RDMA_BMME_FLAG_RESERVED_LKEY = (1 << 3),
+ RDMA_BMME_FLAG_TYPE_2_WIN = (1 << 4),
+ RDMA_BMME_FLAG_WIN_TYPE_2B = (1 << 5),
+
+ RDMA_DEV_CAP_FLAG_XRC = (1 << 6),
+ RDMA_DEV_CAP_FLAG_MEM_WINDOW = (1 << 7),
+ RDMA_DEV_CAP_FLAG_ATOMIC = (1 << 8),
+ RDMA_DEV_CAP_FLAG_APM = (1 << 9),
+};
+
+/* RDMA services */
+struct rdma_service_cap {
+ struct dev_rdma_svc_cap dev_rdma_cap;
+
+	u8 log_mtt; /* 1. the number of MTT PAs must be an integer power
+		     * of 2
+		     * 2. represented as a logarithm; each MTT table can
+		     * contain 1, 2, 4, 8 or 16 PAs
+		     */
+ /* todo: need to check whether related to max_mtt_seg */
+ u32 num_mtts; /* Number of MTT table (4M),
+ * is actually MTT seg number
+ */
+ u32 log_mtt_seg;
+ u32 mtt_entry_sz; /* MTT table size 8B, including 1 PA(64bits) */
+ u32 mpt_entry_sz; /* MPT table size (64B) */
+
+ u32 dmtt_cl_start;
+ u32 dmtt_cl_end;
+ u32 dmtt_cl_sz;
+
+	u8 log_rdmarc; /* 1. the number of RDMArc PAs must be an integer
+			* power of 2
+			* 2. represented as a logarithm; each RDMArc table
+			* can contain 1, 2, 4, 8 or 16 PAs
+			*/
+
+ u32 reserved_qps; /* Number of reserved QP */
+ u32 max_sq_sg; /* Maximum SGE number of SQ (8) */
+	u32 max_sq_desc_sz; /* WQE maximum size of SQ (1024B); inline maximum
+			     * size is 960B (944B aligned up to 960B),
+			     * 960B => wqebb alignment => 1024B
+			     */
+	u32 wqebb_size; /* Currently the hardware supports 64B and 128B;
+			 * defined as 64 bytes
+			 */
+
+ u32 max_cqes; /* Size of the depth of the CQ (64K-1) */
+ u32 reserved_cqs; /* Number of reserved CQ */
+ u32 cqc_entry_sz; /* Size of the CQC (64B/128B) */
+ u32 cqe_size; /* Size of CQE (32B) */
+
+ u32 reserved_mrws; /* Number of reserved MR/MR Window */
+
+ u32 max_fmr_maps; /* max MAP of FMR,
+ * (1 << (32-ilog2(num_mpt)))-1;
+ */
+
+ /* todo: max value needs to be confirmed */
+ /* MTT table number of Each MTT seg(3) */
+
+ u32 log_rdmarc_seg; /* table number of each RDMArc seg(3) */
+
+	/* Timeout time. Formula: Tr = 4.096us * 2^(local_ca_ack_delay), range [Tr, 4Tr] */
+ u32 local_ca_ack_delay;
+ u32 num_ports; /* Physical port number */
+
+ u32 db_page_size; /* Size of the DB (4KB) */
+ u32 direct_wqe_size; /* Size of the DWQE (256B) */
+
+ u32 num_pds; /* Maximum number of PD (128K) */
+ u32 reserved_pds; /* Number of reserved PD */
+ u32 max_xrcds; /* Maximum number of xrcd (64K) */
+ u32 reserved_xrcds; /* Number of reserved xrcd */
+
+ u32 max_gid_per_port; /* gid number (16) of each port */
+	u32 gid_entry_sz; /* RoCE v2 GID table entry is 32B,
+			   * compatible with RoCE v1 expansion
+			   */
+
+ u32 reserved_lkey; /* local_dma_lkey */
+ u32 num_comp_vectors; /* Number of complete vector (32) */
+ u32 page_size_cap; /* Supports 4K,8K,64K,256K,1M and 4M page_size */
+
+ u32 flags; /* RDMA some identity */
+ u32 max_frpl_len; /* Maximum number of pages frmr registration */
+ u32 max_pkeys; /* Number of supported pkey group */
+};
+
+/* PF OVS service resource structure defined */
+struct dev_ovs_svc_cap {
+	u32 max_pctxs; /* Parent Context: max specification is 1M */
+ u32 fake_vf_max_pctx;
+ u16 fake_vf_num;
+ u16 fake_vf_start_id;
+ u8 dynamic_qp_en;
+};
+
+/* OVS services */
+struct ovs_service_cap {
+ struct dev_ovs_svc_cap dev_ovs_cap;
+
+ u32 pctx_sz; /* 512B */
+};
+
+/* PF IPsec service resource structure defined */
+struct dev_ipsec_svc_cap {
+ u32 max_sactxs; /* max IPsec SA context num */
+ u16 max_cqs; /* max IPsec SCQC num */
+ u16 rsvd0;
+};
+
+/* IPsec services */
+struct ipsec_service_cap {
+ struct dev_ipsec_svc_cap dev_ipsec_cap;
+ u32 sactx_sz; /* 512B */
+};
+
+/* Defines the IRQ information structure */
+struct irq_info {
+ u16 msix_entry_idx; /* IRQ corresponding index number */
+ u32 irq_id; /* the IRQ number from OS */
+};
+
+struct interrupt_info {
+ u32 lli_set;
+ u32 interrupt_coalesc_set;
+ u16 msix_index;
+ u8 lli_credit_limit;
+ u8 lli_timer_cfg;
+ u8 pending_limt;
+ u8 coalesc_timer_cfg;
+ u8 resend_timer_cfg;
+};
+
+enum hinic3_msix_state {
+ HINIC3_MSIX_ENABLE,
+ HINIC3_MSIX_DISABLE,
+};
+
+enum hinic3_msix_auto_mask {
+ HINIC3_CLR_MSIX_AUTO_MASK,
+ HINIC3_SET_MSIX_AUTO_MASK,
+};
+
+enum func_type {
+ TYPE_PF,
+ TYPE_VF,
+ TYPE_PPF,
+ TYPE_UNKNOWN,
+};
+
+struct hinic3_init_para {
+ /* Record hinic_pcidev or NDIS_Adapter pointer address */
+ void *adapter_hdl;
+	/* Record pcidev or handler pointer address,
+	 * e.g. the ioremap interface input parameter
+	 */
+	void *pcidev_hdl;
+	/* Record pcidev->dev or handler pointer address, used for
+	 * dma address allocation or as the dev_err print parameter
+	 */
+ void *dev_hdl;
+
+ /* Configure virtual address, PF is bar1, VF is bar0/1 */
+ void *cfg_reg_base;
+ /* interrupt configuration register address, PF is bar2, VF is bar2/3
+ */
+ void *intr_reg_base;
+ /* for PF bar3 virtual address, if function is VF should set to NULL */
+ void *mgmt_reg_base;
+
+ u64 db_dwqe_len;
+ u64 db_base_phy;
+ /* the doorbell address, bar4/5 higher 4M space */
+ void *db_base;
+ /* direct wqe 4M, follow the doorbell address space */
+ void *dwqe_mapping;
+ void **hwdev;
+ void *chip_node;
+	/* set to true when using polling mode */
+ bool poll;
+
+ u16 probe_fault_level;
+};
+
+/* B200 config BAR45 4MB, DB & DWQE both 2MB */
+#define HINIC3_DB_DWQE_SIZE 0x00400000
+
+/* db/dwqe page size: 4K */
+#define HINIC3_DB_PAGE_SIZE 0x00001000ULL
+#define HINIC3_DWQE_OFFSET 0x00000800ULL
+
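+/* With a 4MB DB/DWQE area and 4KB pages this yields 1024 DB areas: */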
+#define HINIC3_DB_MAX_AREAS (HINIC3_DB_DWQE_SIZE / HINIC3_DB_PAGE_SIZE)
+
+#ifndef IFNAMSIZ
+#define IFNAMSIZ 16
+#endif
+#define MAX_FUNCTION_NUM 4096
+
+struct card_node {
+ struct list_head node;
+ struct list_head func_list;
+ char chip_name[IFNAMSIZ];
+ void *log_info;
+ void *dbgtool_info;
+ void *func_handle_array[MAX_FUNCTION_NUM];
+ unsigned char bus_num;
+ u16 func_num;
+ u32 rsvd1;
+ atomic_t channel_busy_cnt;
+ void *priv_data;
+ u64 rsvd2;
+};
+
+#define HINIC3_SYNFW_TIME_PERIOD (60 * 60 * 1000)
+#define HINIC3_SYNC_YEAR_OFFSET 1900
+#define HINIC3_SYNC_MONTH_OFFSET 1
+
+#define FAULT_SHOW_STR_LEN 16
+
+enum hinic3_fault_source_type {
+ /* same as FAULT_TYPE_CHIP */
+ HINIC3_FAULT_SRC_HW_MGMT_CHIP = 0,
+ /* same as FAULT_TYPE_UCODE */
+ HINIC3_FAULT_SRC_HW_MGMT_UCODE,
+ /* same as FAULT_TYPE_MEM_RD_TIMEOUT */
+ HINIC3_FAULT_SRC_HW_MGMT_MEM_RD_TIMEOUT,
+ /* same as FAULT_TYPE_MEM_WR_TIMEOUT */
+ HINIC3_FAULT_SRC_HW_MGMT_MEM_WR_TIMEOUT,
+ /* same as FAULT_TYPE_REG_RD_TIMEOUT */
+ HINIC3_FAULT_SRC_HW_MGMT_REG_RD_TIMEOUT,
+ /* same as FAULT_TYPE_REG_WR_TIMEOUT */
+ HINIC3_FAULT_SRC_HW_MGMT_REG_WR_TIMEOUT,
+ HINIC3_FAULT_SRC_SW_MGMT_UCODE,
+ HINIC3_FAULT_SRC_MGMT_WATCHDOG,
+ HINIC3_FAULT_SRC_MGMT_RESET = 8,
+ HINIC3_FAULT_SRC_HW_PHY_FAULT,
+ HINIC3_FAULT_SRC_TX_PAUSE_EXCP,
+ HINIC3_FAULT_SRC_PCIE_LINK_DOWN = 20,
+ HINIC3_FAULT_SRC_HOST_HEARTBEAT_LOST = 21,
+ HINIC3_FAULT_SRC_TX_TIMEOUT,
+ HINIC3_FAULT_SRC_TYPE_MAX,
+};
+
+union hinic3_fault_hw_mgmt {
+ u32 val[4];
+	/* valid only if type == FAULT_TYPE_CHIP */
+ struct {
+ u8 node_id;
+ /* enum hinic_fault_err_level */
+ u8 err_level;
+ u16 err_type;
+ u32 err_csr_addr;
+ u32 err_csr_value;
+ /* func_id valid only if err_level == FAULT_LEVEL_SERIOUS_FLR */
+ u8 rsvd1;
+ u8 host_id;
+ u16 func_id;
+ } chip;
+
+ /* valid only if type == FAULT_TYPE_UCODE */
+ struct {
+ u8 cause_id;
+ u8 core_id;
+ u8 c_id;
+ u8 rsvd3;
+ u32 epc;
+ u32 rsvd4;
+ u32 rsvd5;
+ } ucode;
+
+ /* valid only if type == FAULT_TYPE_MEM_RD_TIMEOUT ||
+ * FAULT_TYPE_MEM_WR_TIMEOUT
+ */
+ struct {
+ u32 err_csr_ctrl;
+ u32 err_csr_data;
+ u32 ctrl_tab;
+ u32 mem_index;
+ } mem_timeout;
+
+ /* valid only if type == FAULT_TYPE_REG_RD_TIMEOUT ||
+ * FAULT_TYPE_REG_WR_TIMEOUT
+ */
+ struct {
+ u32 err_csr;
+ u32 rsvd6;
+ u32 rsvd7;
+ u32 rsvd8;
+ } reg_timeout;
+
+ struct {
+ /* 0: read; 1: write */
+ u8 op_type;
+ u8 port_id;
+ u8 dev_ad;
+ u8 rsvd9;
+ u32 csr_addr;
+ u32 op_data;
+ u32 rsvd10;
+ } phy_fault;
+};
+
+/* defined by chip */
+struct hinic3_fault_event {
+ /* enum hinic_fault_type */
+ u8 type;
+	u8 fault_level; /* sdk writes the fault level for the uld event */
+ u8 rsvd0[2];
+ union hinic3_fault_hw_mgmt event;
+};
+
+struct hinic3_cmd_fault_event {
+ u8 status;
+ u8 version;
+ u8 rsvd0[6];
+ struct hinic3_fault_event event;
+};
+
+struct hinic3_sriov_state_info {
+ u8 enable;
+ u16 num_vfs;
+};
+
+enum hinic3_comm_event_type {
+ EVENT_COMM_PCIE_LINK_DOWN,
+ EVENT_COMM_HEART_LOST,
+ EVENT_COMM_FAULT,
+ EVENT_COMM_SRIOV_STATE_CHANGE,
+ EVENT_COMM_CARD_REMOVE,
+ EVENT_COMM_MGMT_WATCHDOG,
+};
+
+enum hinic3_event_service_type {
+ EVENT_SRV_COMM = 0,
+#define SERVICE_EVENT_BASE (EVENT_SRV_COMM + 1)
+ EVENT_SRV_NIC = SERVICE_EVENT_BASE + SERVICE_T_NIC,
+ EVENT_SRV_MIGRATE = SERVICE_EVENT_BASE + SERVICE_T_MIGRATE,
+};
+
+#define HINIC3_SRV_EVENT_TYPE(svc, type) ((((u32)(svc)) << 16) | (type))
+struct hinic3_event_info {
+ u16 service; /* enum hinic3_event_service_type */
+ u16 type;
+ u8 event_data[104];
+};
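+/* Example (illustrative): a comm fault event is keyed as
+ * HINIC3_SRV_EVENT_TYPE(EVENT_SRV_COMM, EVENT_COMM_FAULT), i.e.
+ * (((u32)service << 16) | type).
+ */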
+
+typedef void (*hinic3_event_handler)(void *handle, struct hinic3_event_info *event);
+
+/* *
+ * @brief hinic3_event_register - register hardware event
+ * @param dev: device pointer to hwdev
+ * @param pri_handle: private data that will be passed to the callback
+ * @param callback: callback function
+ */
+void hinic3_event_register(void *dev, void *pri_handle,
+ hinic3_event_handler callback);
+
+/* *
+ * @brief hinic3_event_unregister - unregister hardware event
+ * @param dev: device pointer to hwdev
+ */
+void hinic3_event_unregister(void *dev);
+
+/* *
+ * @brief hinic3_set_msix_auto_mask_state - set msix auto mask state
+ * @param hwdev: device pointer to hwdev
+ * @param msix_idx: msix id
+ * @param flag: msix auto_mask flag, 0-clear, 1-set
+ */
+void hinic3_set_msix_auto_mask_state(void *hwdev, u16 msix_idx,
+ enum hinic3_msix_auto_mask flag);
+
+/* *
+ * @brief hinic3_set_msix_state - set msix state
+ * @param hwdev: device pointer to hwdev
+ * @param msix_idx: msix id
+ * @param flag: msix state flag, 0-enable, 1-disable
+ */
+void hinic3_set_msix_state(void *hwdev, u16 msix_idx,
+ enum hinic3_msix_state flag);
+
+/* *
+ * @brief hinic3_misx_intr_clear_resend_bit - clear msix resend bit
+ * @param hwdev: device pointer to hwdev
+ * @param msix_idx: msix id
+ * @param clear_resend_en: 1-clear
+ */
+void hinic3_misx_intr_clear_resend_bit(void *hwdev, u16 msix_idx,
+ u8 clear_resend_en);
+
+/* *
+ * @brief hinic3_set_interrupt_cfg_direct - set interrupt cfg
+ * @param hwdev: device pointer to hwdev
+ * @param info: interrupt info
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_set_interrupt_cfg_direct(void *hwdev,
+ struct interrupt_info *info,
+ u16 channel);
+
+int hinic3_set_interrupt_cfg(void *dev, struct interrupt_info info,
+ u16 channel);
+
+/* *
+ * @brief hinic3_get_interrupt_cfg - get interrupt cfg
+ * @param dev: device pointer to hwdev
+ * @param info: interrupt info
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_get_interrupt_cfg(void *dev, struct interrupt_info *info,
+ u16 channel);
+
+/* *
+ * @brief hinic3_alloc_irqs - alloc irq
+ * @param hwdev: device pointer to hwdev
+ * @param type: service type
+ * @param num: alloc number
+ * @param irq_info_array: alloc irq info
+ * @param act_num: alloc actual number
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_alloc_irqs(void *hwdev, enum hinic3_service_type type, u16 num,
+ struct irq_info *irq_info_array, u16 *act_num);
+
+/* *
+ * @brief hinic3_free_irq - free irq
+ * @param hwdev: device pointer to hwdev
+ * @param type: service type
+ * @param irq_id: irq id
+ */
+void hinic3_free_irq(void *hwdev, enum hinic3_service_type type, u32 irq_id);
+
+/* *
+ * @brief hinic3_alloc_ceqs - alloc ceqs
+ * @param hwdev: device pointer to hwdev
+ * @param type: service type
+ * @param num: alloc ceq number
+ * @param ceq_id_array: alloc ceq_id_array
+ * @param act_num: alloc actual number
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_alloc_ceqs(void *hwdev, enum hinic3_service_type type, int num,
+ int *ceq_id_array, int *act_num);
+
+/* *
+ * @brief hinic3_free_ceq - free ceq
+ * @param hwdev: device pointer to hwdev
+ * @param type: service type
+ * @param ceq_id: ceq id
+ */
+void hinic3_free_ceq(void *hwdev, enum hinic3_service_type type, int ceq_id);
+
+/* *
+ * @brief hinic3_get_pcidev_hdl - get pcidev_hdl
+ * @param hwdev: device pointer to hwdev
+ * @retval non-null: success
+ * @retval null: failure
+ */
+void *hinic3_get_pcidev_hdl(void *hwdev);
+
+/* *
+ * @brief hinic3_ppf_idx - get ppf id
+ * @param hwdev: device pointer to hwdev
+ * @retval ppf id
+ */
+u8 hinic3_ppf_idx(void *hwdev);
+
+/* *
+ * @brief hinic3_get_chip_present_flag - get chip present flag
+ * @param hwdev: device pointer to hwdev
+ * @retval 1: chip is present
+ * @retval 0: chip is absent
+ */
+int hinic3_get_chip_present_flag(const void *hwdev);
+
+/* *
+ * @brief hinic3_get_heartbeat_status - get heartbeat status
+ * @param hwdev: device pointer to hwdev
+ * @retval heartbeat status
+ */
+u32 hinic3_get_heartbeat_status(void *hwdev);
+
+/* *
+ * @brief hinic3_support_nic - function support nic
+ * @param hwdev: device pointer to hwdev
+ * @param cap: nic service capability
+ * @retval true: function support nic
+ * @retval false: function not support nic
+ */
+bool hinic3_support_nic(void *hwdev, struct nic_service_cap *cap);
+
+/* *
+ * @brief hinic3_support_ipsec - function support ipsec
+ * @param hwdev: device pointer to hwdev
+ * @param cap: ipsec service capability
+ * @retval true: function support ipsec
+ * @retval false: function not support ipsec
+ */
+bool hinic3_support_ipsec(void *hwdev, struct ipsec_service_cap *cap);
+
+/* *
+ * @brief hinic3_support_roce - function support roce
+ * @param hwdev: device pointer to hwdev
+ * @param cap: roce service capability
+ * @retval true: function support roce
+ * @retval false: function not support roce
+ */
+bool hinic3_support_roce(void *hwdev, struct rdma_service_cap *cap);
+
+/* *
+ * @brief hinic3_support_fc - function support fc
+ * @param hwdev: device pointer to hwdev
+ * @param cap: fc service capability
+ * @retval true: function support fc
+ * @retval false: function not support fc
+ */
+bool hinic3_support_fc(void *hwdev, struct fc_service_cap *cap);
+
+/* *
+ * @brief hinic3_support_rdma - function support rdma
+ * @param hwdev: device pointer to hwdev
+ * @param cap: rdma service capability
+ * @retval true: function support rdma
+ * @retval false: function not support rdma
+ */
+bool hinic3_support_rdma(void *hwdev, struct rdma_service_cap *cap);
+
+/* *
+ * @brief hinic3_support_ovs - function support ovs
+ * @param hwdev: device pointer to hwdev
+ * @param cap: ovs service capability
+ * @retval true: function support ovs
+ * @retval false: function not support ovs
+ */
+bool hinic3_support_ovs(void *hwdev, struct ovs_service_cap *cap);
+
+/* *
+ * @brief hinic3_support_vbs - function support vbs
+ * @param hwdev: device pointer to hwdev
+ * @param cap: vbs service capability
+ * @retval true: function support vbs
+ * @retval false: function not support vbs
+ */
+bool hinic3_support_vbs(void *hwdev, struct vbs_service_cap *cap);
+
+/* *
+ * @brief hinic3_support_toe - function support toe
+ * @param hwdev: device pointer to hwdev
+ * @param cap: toe service capability
+ * @retval true: function support toe
+ * @retval false: function not support toe
+ */
+bool hinic3_support_toe(void *hwdev, struct toe_service_cap *cap);
+
+/* *
+ * @brief hinic3_support_ppa - function support ppa
+ * @param hwdev: device pointer to hwdev
+ * @param cap: ppa service capability
+ * @retval true: function support ppa
+ * @retval false: function not support ppa
+ */
+bool hinic3_support_ppa(void *hwdev, struct ppa_service_cap *cap);
+
+/* *
+ * @brief hinic3_support_migr - function support migrate
+ * @param hwdev: device pointer to hwdev
+ * @param cap: migrate service capability
+ * @retval true: function support migrate
+ * @retval false: function not support migrate
+ */
+bool hinic3_support_migr(void *hwdev, struct migr_service_cap *cap);
+
+/* *
+ * @brief hinic3_sync_time - sync time to hardware
+ * @param hwdev: device pointer to hwdev
+ * @param time: time to sync
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_sync_time(void *hwdev, u64 time);
+
+/* *
+ * @brief hinic3_disable_mgmt_msg_report - disable mgmt report msg
+ * @param hwdev: device pointer to hwdev
+ */
+void hinic3_disable_mgmt_msg_report(void *hwdev);
+
+/* *
+ * @brief hinic3_func_for_mgmt - get function service type
+ * @param hwdev: device pointer to hwdev
+ * @retval true: function for mgmt
+ * @retval false: function is not for mgmt
+ */
+bool hinic3_func_for_mgmt(void *hwdev);
+
+/* *
+ * @brief hinic3_set_pcie_order_cfg - set pcie order cfg
+ * @param handle: device pointer to hwdev
+ */
+void hinic3_set_pcie_order_cfg(void *handle);
+
+/* *
+ * @brief hinic3_init_hwdev - call to init hwdev
+ * @param para: device pointer to para
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_init_hwdev(struct hinic3_init_para *para);
+
+/* *
+ * @brief hinic3_free_hwdev - free hwdev
+ * @param hwdev: device pointer to hwdev
+ */
+void hinic3_free_hwdev(void *hwdev);
+
+/* *
+ * @brief hinic3_detect_hw_present - detect hardware present
+ * @param hwdev: device pointer to hwdev
+ */
+void hinic3_detect_hw_present(void *hwdev);
+
+/* *
+ * @brief hinic3_record_pcie_error - record pcie error
+ * @param hwdev: device pointer to hwdev
+ */
+void hinic3_record_pcie_error(void *hwdev);
+
+/* *
+ * @brief hinic3_shutdown_hwdev - shutdown hwdev
+ * @param hwdev: device pointer to hwdev
+ */
+void hinic3_shutdown_hwdev(void *hwdev);
+
+/* *
+ * @brief hinic3_set_ppf_flr_type - set ppf flr type
+ * @param hwdev: device pointer to hwdev
+ * @param flr_type: ppf flr type
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_set_ppf_flr_type(void *hwdev, enum hinic3_ppf_flr_type flr_type);
+
+/* *
+ * @brief hinic3_get_mgmt_version - get management cpu version
+ * @param hwdev: device pointer to hwdev
+ * @param mgmt_ver: output management version
+ * @param version_size: size of the mgmt_ver buffer
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_get_mgmt_version(void *hwdev, u8 *mgmt_ver, u8 version_size,
+ u16 channel);
+
+/* *
+ * @brief hinic3_get_fw_version - get firmware version
+ * @param hwdev: device pointer to hwdev
+ * @param fw_ver: firmware version
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_get_fw_version(void *hwdev, struct hinic3_fw_version *fw_ver,
+ u16 channel);
+
+/* *
+ * @brief hinic3_global_func_id - get global function id
+ * @param hwdev: device pointer to hwdev
+ * @retval global function id
+ */
+u16 hinic3_global_func_id(void *hwdev);
+
+/* *
+ * @brief hinic3_vector_to_eqn - vector to eq id
+ * @param hwdev: device pointer to hwdev
+ * @param type: service type
+ * @param vector: interrupt vector
+ * @retval eq id
+ */
+int hinic3_vector_to_eqn(void *hwdev, enum hinic3_service_type type,
+ int vector);
+
+/* *
+ * @brief hinic3_glb_pf_vf_offset - get vf offset id of pf
+ * @param hwdev: device pointer to hwdev
+ * @retval vf offset id
+ */
+u16 hinic3_glb_pf_vf_offset(void *hwdev);
+
+/* *
+ * @brief hinic3_pf_id_of_vf - get pf id of vf
+ * @param hwdev: device pointer to hwdev
+ * @retval pf id
+ */
+u8 hinic3_pf_id_of_vf(void *hwdev);
+
+/* *
+ * @brief hinic3_func_type - get function type
+ * @param hwdev: device pointer to hwdev
+ * @retval function type
+ */
+enum func_type hinic3_func_type(void *hwdev);
+
+/* *
+ * @brief hinic3_get_stateful_enable - get stateful status
+ * @param hwdev: device pointer to hwdev
+ * @retval stateful enable status
+ */
+bool hinic3_get_stateful_enable(void *hwdev);
+
+/* *
+ * @brief hinic3_host_oq_id_mask - get host oq id mask
+ * @param hwdev: device pointer to hwdev
+ * @retval oq id mask
+ */
+u8 hinic3_host_oq_id_mask(void *hwdev);
+
+/* *
+ * @brief hinic3_host_id - get host id
+ * @param hwdev: device pointer to hwdev
+ * @retval host id
+ */
+u8 hinic3_host_id(void *hwdev);
+
+/* *
+ * @brief hinic3_host_total_func - get host total function number
+ * @param hwdev: device pointer to hwdev
+ * @retval non-zero: host total function number
+ * @retval zero: failure
+ */
+u16 hinic3_host_total_func(void *hwdev);
+
+/* *
+ * @brief hinic3_func_max_nic_qnum - get max nic queue number
+ * @param hwdev: device pointer to hwdev
+ * @retval non-zero: max nic queue number
+ * @retval zero: failure
+ */
+u16 hinic3_func_max_nic_qnum(void *hwdev);
+
+/* *
+ * @brief hinic3_func_max_qnum - get max queue number
+ * @param hwdev: device pointer to hwdev
+ * @retval non-zero: max queue number
+ * @retval zero: failure
+ */
+u16 hinic3_func_max_qnum(void *hwdev);
+
+/* *
+ * @brief hinic3_ep_id - get ep id
+ * @param hwdev: device pointer to hwdev
+ * @retval ep id
+ */
+u8 hinic3_ep_id(void *hwdev); /* Obtain service_cap.ep_id */
+
+/* *
+ * @brief hinic3_er_id - get er id
+ * @param hwdev: device pointer to hwdev
+ * @retval er id
+ */
+u8 hinic3_er_id(void *hwdev); /* Obtain service_cap.er_id */
+
+/* *
+ * @brief hinic3_physical_port_id - get physical port id
+ * @param hwdev: device pointer to hwdev
+ * @retval physical port id
+ */
+u8 hinic3_physical_port_id(void *hwdev); /* Obtain service_cap.port_id */
+
+/* *
+ * @brief hinic3_func_max_vf - get vf number
+ * @param hwdev: device pointer to hwdev
+ * @retval non-zero: vf number
+ * @retval zero: failure
+ */
+u16 hinic3_func_max_vf(void *hwdev); /* Obtain service_cap.max_vf */
+
+/* *
+ * @brief hinic3_max_pf_num - get global max pf number
+ * @param hwdev: device pointer to hwdev
+ * @retval global max pf number
+ */
+u8 hinic3_max_pf_num(void *hwdev);
+
+/* *
+ * @brief hinic3_host_pf_num - get current host pf number
+ * @param hwdev: device pointer to hwdev
+ * @retval non-zero: pf number
+ * @retval zero: failure
+ */
+u32 hinic3_host_pf_num(void *hwdev); /* Obtain service_cap.pf_num */
+
+/* *
+ * @brief hinic3_host_pf_id_start - get current host pf id start
+ * @param hwdev: device pointer to hwdev
+ * @retval non-zero: pf id start
+ * @retval zero: failure
+ */
+u32 hinic3_host_pf_id_start(void *hwdev); /* Obtain service_cap.pf_id_start */
+
+/* *
+ * @brief hinic3_pcie_itf_id - get pcie port id
+ * @param hwdev: device pointer to hwdev
+ * @retval pcie port id
+ */
+u8 hinic3_pcie_itf_id(void *hwdev);
+
+/* *
+ * @brief hinic3_vf_in_pf - get vf offset in pf
+ * @param hwdev: device pointer to hwdev
+ * @retval vf offset in pf
+ */
+u8 hinic3_vf_in_pf(void *hwdev);
+
+/* *
+ * @brief hinic3_cos_valid_bitmap - get cos valid bitmap
+ * @param hwdev: device pointer to hwdev
+ * @retval non-zero: valid cos bit map
+ * @retval zero: failure
+ */
+int hinic3_cos_valid_bitmap(void *hwdev, u8 *func_dft_cos, u8 *port_cos_bitmap);
+
+/* *
+ * @brief hinic3_stateful_init - init stateful resource
+ * @param hwdev: device pointer to hwdev
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_stateful_init(void *hwdev);
+
+/* *
+ * @brief hinic3_stateful_deinit - deinit stateful resource
+ * @param hwdev: device pointer to hwdev
+ */
+void hinic3_stateful_deinit(void *hwdev);
+
+/* *
+ * @brief hinic3_free_stateful - sdk remove free stateful resource
+ * @param hwdev: device pointer to hwdev
+ */
+void hinic3_free_stateful(void *hwdev);
+
+/* *
+ * @brief hinic3_need_init_stateful_default - check whether stateful default init is needed
+ * @param hwdev: device pointer to hwdev
+ */
+bool hinic3_need_init_stateful_default(void *hwdev);
+
+/* *
+ * @brief hinic3_get_card_present_state - get card present state
+ * @param hwdev: device pointer to hwdev
+ * @param card_present_state: return card present state
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_get_card_present_state(void *hwdev, bool *card_present_state);
+
+/* *
+ * @brief hinic3_func_rx_tx_flush - function flush
+ * @param hwdev: device pointer to hwdev
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_func_rx_tx_flush(void *hwdev, u16 channel);
+
+/* *
+ * @brief hinic3_flush_mgmt_workq - flush the mgmt work queue when removing the function
+ * @param hwdev: device pointer to hwdev
+ */
+void hinic3_flush_mgmt_workq(void *hwdev);
+
+/* *
+ * @brief hinic3_ceq_num - get toe ceq num
+ */
+u8 hinic3_ceq_num(void *hwdev);
+
+/* *
+ * @brief hinic3_intr_num - get intr num
+ */
+u16 hinic3_intr_num(void *hwdev);
+
+/* *
+ * @brief hinic3_flexq_en - get flexq en
+ */
+u8 hinic3_flexq_en(void *hwdev);
+
+/* *
+ * @brief hinic3_fault_event_report - report fault event
+ * @param hwdev: device pointer to hwdev
+ * @param src: fault event source, reference to enum hinic3_fault_source_type
+ * @param level: fault level, reference to enum hinic3_fault_err_level
+ */
+void hinic3_fault_event_report(void *hwdev, u16 src, u16 level);
+
+/* *
+ * @brief hinic3_probe_success - notify device probe successful
+ * @param hwdev: device pointer to hwdev
+ */
+void hinic3_probe_success(void *hwdev);
+
+/* *
+ * @brief hinic3_set_func_svc_used_state - set function service used state
+ * @param hwdev: device pointer to hwdev
+ * @param svc_type: service type
+ * @param state: function used state
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_set_func_svc_used_state(void *hwdev, u16 svc_type, u8 state,
+ u16 channel);
+
+/* *
+ * @brief hinic3_get_self_test_result - get self test result
+ * @param hwdev: device pointer to hwdev
+ * @retval self test result
+ */
+u32 hinic3_get_self_test_result(void *hwdev);
+
+/* *
+ * @brief set_slave_host_enable - set slave host enable
+ * @param hwdev: device pointer to hwdev
+ * @param host_id: host id to set
+ * @param enable: slave enable state to set
+ */
+void set_slave_host_enable(void *hwdev, u8 host_id, bool enable);
+
+/* *
+ * @brief hinic3_get_slave_bitmap - get slave host bitmap
+ * @param hwdev: device pointer to hwdev
+ * @param slave_host_bitmap: output slave host bitmap
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_get_slave_bitmap(void *hwdev, u8 *slave_host_bitmap);
+
+/* *
+ * @brief hinic3_get_slave_host_enable - get slave host enable
+ * @param hwdev: device pointer to hwdev
+ * @param host_id: host id to query
+ * @param slave_en: output slave enable state
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_get_slave_host_enable(void *hwdev, u8 host_id, u8 *slave_en);
+
+/* *
+ * @brief hinic3_set_host_migrate_enable - set migrate host enable
+ * @param hwdev: device pointer to hwdev
+ * @param host_id: host id to set
+ * @param enable: migrate enable state to set
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_set_host_migrate_enable(void *hwdev, u8 host_id, bool enable);
+
+/* *
+ * @brief hinic3_get_host_migrate_enable - get migrate host enable
+ * @param hwdev: device pointer to hwdev
+ * @param host_id: host id to query
+ * @param migrate_en: output migrate enable state
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_get_host_migrate_enable(void *hwdev, u8 host_id, u8 *migrate_en);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_dbg.c b/drivers/net/ethernet/huawei/hinic3/hinic3_dbg.c
new file mode 100644
index 000000000000..4a688f190864
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_dbg.c
@@ -0,0 +1,983 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC]" fmt
+
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/types.h>
+#include <linux/semaphore.h>
+
+#include "hinic3_crm.h"
+#include "hinic3_hw.h"
+#include "hinic3_mt.h"
+#include "hinic3_nic_dev.h"
+#include "hinic3_nic_dbg.h"
+#include "hinic3_nic_qp.h"
+#include "hinic3_rx.h"
+#include "hinic3_tx.h"
+#include "hinic3_dcb.h"
+#include "hinic3_nic.h"
+#include "hinic3_mgmt_interface.h"
+
+typedef int (*nic_driv_module)(struct hinic3_nic_dev *nic_dev,
+ const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size);
+
+struct nic_drv_module_handle {
+ enum driver_cmd_type driv_cmd_name;
+ nic_driv_module driv_func;
+};
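+
+/* Debug ioctl commands are dispatched by a linear scan of the
+ * {command, handler} table nic_driv_module_cmd_handle defined below.
+ */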
+
+static int get_nic_drv_version(void *buf_out, const u32 *out_size)
+{
+ struct drv_version_info *ver_info = buf_out;
+ int err;
+
+ if (!buf_out) {
+ pr_err("Buf_out is NULL.\n");
+ return -EINVAL;
+ }
+
+ if (*out_size != sizeof(*ver_info)) {
+		pr_err("Unexpected out buf size from user: %u, expect: %lu\n",
+ *out_size, sizeof(*ver_info));
+ return -EINVAL;
+ }
+
+ err = snprintf(ver_info->ver, sizeof(ver_info->ver), "%s %s",
+ HINIC3_NIC_DRV_VERSION, "2023-05-17_19:56:38");
+ if (err < 0)
+ return -EINVAL;
+
+ return 0;
+}
+
+static int get_tx_info(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ u16 q_id;
+
+ if (!HINIC3_CHANNEL_RES_VALID(nic_dev)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Netdev is down, can't get tx info\n");
+ return -EFAULT;
+ }
+
+ if (!buf_in || !buf_out) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Buf_in or buf_out is NULL.\n");
+ return -EINVAL;
+ }
+
+ if (!out_size || in_size != sizeof(u32)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Unexpected in buf size from user: %u, expect: %lu\n",
+ in_size, sizeof(u32));
+ return -EINVAL;
+ }
+
+ q_id = (u16)(*((u32 *)buf_in));
+
+ return hinic3_dbg_get_sq_info(nic_dev->hwdev, q_id, buf_out, *out_size);
+}
+
+static int get_q_num(struct hinic3_nic_dev *nic_dev,
+ const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size)
+{
+ if (!HINIC3_CHANNEL_RES_VALID(nic_dev)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Netdev is down, can't get queue number\n");
+ return -EFAULT;
+ }
+
+ if (!buf_out) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Get queue number para buf_out is NULL.\n");
+ return -EINVAL;
+ }
+
+ if (!out_size || *out_size != sizeof(u16)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Unexpected out buf size from user: %u, expect: %lu\n",
+ *out_size, sizeof(u16));
+ return -EINVAL;
+ }
+
+ *((u16 *)buf_out) = nic_dev->q_params.num_qps;
+
+ return 0;
+}
+
+static int get_tx_wqe_info(struct hinic3_nic_dev *nic_dev,
+ const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size)
+{
+ const struct wqe_info *info = buf_in;
+ u16 wqebb_cnt = 1;
+
+ if (!HINIC3_CHANNEL_RES_VALID(nic_dev)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Netdev is down, can't get tx wqe info\n");
+ return -EFAULT;
+ }
+
+ if (!buf_in || !buf_out) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Buf_in or buf_out is NULL.\n");
+ return -EINVAL;
+ }
+
+ if (!out_size || in_size != sizeof(struct wqe_info)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Unexpected buf size from user, in_size: %u, expect: %lu\n",
+ in_size, sizeof(struct wqe_info));
+ return -EINVAL;
+ }
+
+ return hinic3_dbg_get_wqe_info(nic_dev->hwdev, (u16)info->q_id,
+ (u16)info->wqe_id, wqebb_cnt,
+ buf_out, (u16 *)out_size, HINIC3_SQ);
+}
+
+static int get_rx_info(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ struct nic_rq_info *rq_info = buf_out;
+ u16 q_id;
+ int err;
+
+ if (!HINIC3_CHANNEL_RES_VALID(nic_dev)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Netdev is down, can't get rx info\n");
+ return -EFAULT;
+ }
+
+ if (!buf_in || !buf_out) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Buf_in or buf_out is NULL.\n");
+ return -EINVAL;
+ }
+
+ if (!out_size || in_size != sizeof(u32)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Unexpected buf size from user, in_size: %u, expect: %lu\n",
+ in_size, sizeof(u32));
+ return -EINVAL;
+ }
+
+ q_id = (u16)(*((u32 *)buf_in));
+
+ err = hinic3_dbg_get_rq_info(nic_dev->hwdev, q_id, buf_out, *out_size);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Get rq info failed, ret is %d.\n", err);
+ return err;
+ }
+
+ rq_info->delta = (u16)nic_dev->rxqs[q_id].delta;
+ rq_info->ci = (u16)(nic_dev->rxqs[q_id].cons_idx & nic_dev->rxqs[q_id].q_mask);
+ rq_info->sw_pi = nic_dev->rxqs[q_id].next_to_update;
+ rq_info->msix_vector = nic_dev->rxqs[q_id].irq_id;
+
+ rq_info->coalesc_timer_cfg = nic_dev->rxqs[q_id].last_coalesc_timer_cfg;
+ rq_info->pending_limt = nic_dev->rxqs[q_id].last_pending_limt;
+
+ return 0;
+}
+
+static int get_rx_wqe_info(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ const struct wqe_info *info = buf_in;
+ u16 wqebb_cnt = 1;
+
+ if (!HINIC3_CHANNEL_RES_VALID(nic_dev)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Netdev is down, can't get rx wqe info\n");
+ return -EFAULT;
+ }
+
+ if (!buf_in || !buf_out) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Buf_in or buf_out is NULL.\n");
+ return -EINVAL;
+ }
+
+ if (!out_size || in_size != sizeof(struct wqe_info)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Unexpected buf size from user, in_size: %u, expect: %lu\n",
+ in_size, sizeof(struct wqe_info));
+ return -EINVAL;
+ }
+
+ return hinic3_dbg_get_wqe_info(nic_dev->hwdev, (u16)info->q_id,
+ (u16)info->wqe_id, wqebb_cnt,
+ buf_out, (u16 *)out_size, HINIC3_RQ);
+}
+
+static int get_rx_cqe_info(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ const struct wqe_info *info = buf_in;
+ u16 q_id = 0;
+ u16 idx = 0;
+
+ if (!HINIC3_CHANNEL_RES_VALID(nic_dev)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Netdev is down, can't get rx cqe info\n");
+ return -EFAULT;
+ }
+
+ if (!buf_in || !buf_out) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Buf_in or buf_out is NULL.\n");
+ return -EINVAL;
+ }
+
+ if (in_size != sizeof(struct wqe_info)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Unexpected buf size from user, in_size: %u, expect: %lu\n",
+ in_size, sizeof(struct wqe_info));
+ return -EINVAL;
+ }
+
+ if (!out_size || *out_size != sizeof(struct hinic3_rq_cqe)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Unexpected out buf size from user: %u, expect: %lu\n",
+ *out_size, sizeof(struct hinic3_rq_cqe));
+ return -EINVAL;
+ }
+ q_id = (u16)info->q_id;
+ idx = (u16)info->wqe_id;
+
+ if (q_id >= nic_dev->q_params.num_qps || idx >= nic_dev->rxqs[q_id].q_depth) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Invalid q_id[%u] >= %u, or wqe idx[%u] >= %u.\n",
+ q_id, nic_dev->q_params.num_qps, idx, nic_dev->rxqs[q_id].q_depth);
+ return -EFAULT;
+ }
+
+ memcpy(buf_out, nic_dev->rxqs[q_id].rx_info[idx].cqe,
+ sizeof(struct hinic3_rq_cqe));
+
+ return 0;
+}
+
+static void clean_nicdev_stats(struct hinic3_nic_dev *nic_dev)
+{
+ u64_stats_update_begin(&nic_dev->stats.syncp);
+ nic_dev->stats.netdev_tx_timeout = 0;
+ nic_dev->stats.tx_carrier_off_drop = 0;
+ nic_dev->stats.tx_invalid_qid = 0;
+ nic_dev->stats.rsvd1 = 0;
+ nic_dev->stats.rsvd2 = 0;
+ u64_stats_update_end(&nic_dev->stats.syncp);
+}
+
+static int clear_func_static(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ int i;
+
+ *out_size = 0;
+#ifndef HAVE_NETDEV_STATS_IN_NETDEV
+ memset(&nic_dev->net_stats, 0, sizeof(nic_dev->net_stats));
+#endif
+ clean_nicdev_stats(nic_dev);
+ for (i = 0; i < nic_dev->max_qps; i++) {
+ hinic3_rxq_clean_stats(&nic_dev->rxqs[i].rxq_stats);
+ hinic3_txq_clean_stats(&nic_dev->txqs[i].txq_stats);
+ }
+
+ return 0;
+}
+
+static int get_loopback_mode(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ struct hinic3_nic_loop_mode *mode = buf_out;
+
+ if (!out_size || !mode)
+ return -EINVAL;
+
+ if (*out_size != sizeof(*mode)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Unexpected out buf size from user: %u, expect: %lu\n",
+ *out_size, sizeof(*mode));
+ return -EINVAL;
+ }
+
+ return hinic3_get_loopback_mode(nic_dev->hwdev, (u8 *)&mode->loop_mode,
+ (u8 *)&mode->loop_ctrl);
+}
+
+static int set_loopback_mode(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ const struct hinic3_nic_loop_mode *mode = buf_in;
+ int err;
+
+ if (!test_bit(HINIC3_INTF_UP, &nic_dev->flags)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Netdev is down, can't set loopback mode\n");
+ return -EFAULT;
+ }
+
+ if (!mode || !out_size || in_size != sizeof(*mode))
+ return -EINVAL;
+
+ if (*out_size != sizeof(*mode)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Unexpected out buf size from user: %u, expect: %lu\n",
+ *out_size, sizeof(*mode));
+ return -EINVAL;
+ }
+
+ err = hinic3_set_loopback_mode(nic_dev->hwdev, (u8)mode->loop_mode,
+ (u8)mode->loop_ctrl);
+ if (err == 0)
+ nicif_info(nic_dev, drv, nic_dev->netdev, "Set loopback mode %u en %u succeed\n",
+ mode->loop_mode, mode->loop_ctrl);
+
+ return err;
+}
+
+enum hinic3_nic_link_mode {
+ HINIC3_LINK_MODE_AUTO = 0,
+ HINIC3_LINK_MODE_UP,
+ HINIC3_LINK_MODE_DOWN,
+ HINIC3_LINK_MODE_MAX,
+};
+
+static int set_link_mode_param_valid(struct hinic3_nic_dev *nic_dev,
+ const void *buf_in, u32 in_size,
+ const u32 *out_size)
+{
+ if (!test_bit(HINIC3_INTF_UP, &nic_dev->flags)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Netdev is down, can't set link mode\n");
+ return -EFAULT;
+ }
+
+ if (!buf_in || !out_size ||
+ in_size != sizeof(enum hinic3_nic_link_mode))
+ return -EINVAL;
+
+ if (*out_size != sizeof(enum hinic3_nic_link_mode)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Unexpected out buf size from user: %u, expect: %lu\n",
+ *out_size, sizeof(enum hinic3_nic_link_mode));
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int set_link_mode(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ const enum hinic3_nic_link_mode *link = buf_in;
+ u8 link_status;
+
+ if (set_link_mode_param_valid(nic_dev, buf_in, in_size, out_size))
+ return -EFAULT;
+
+ switch (*link) {
+ case HINIC3_LINK_MODE_AUTO:
+ if (hinic3_get_link_state(nic_dev->hwdev, &link_status))
+ link_status = false;
+ hinic3_link_status_change(nic_dev, (bool)link_status);
+ nicif_info(nic_dev, drv, nic_dev->netdev,
+ "Set link mode: auto succeed, now is link %s\n",
+ (link_status ? "up" : "down"));
+ break;
+ case HINIC3_LINK_MODE_UP:
+ hinic3_link_status_change(nic_dev, true);
+ nicif_info(nic_dev, drv, nic_dev->netdev,
+ "Set link mode: up succeed\n");
+ break;
+ case HINIC3_LINK_MODE_DOWN:
+ hinic3_link_status_change(nic_dev, false);
+ nicif_info(nic_dev, drv, nic_dev->netdev,
+ "Set link mode: down succeed\n");
+ break;
+ default:
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Invalid link mode %d to set\n", *link);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int set_pf_bw_limit(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ u32 pf_bw_limit;
+ int err;
+
+ if (HINIC3_FUNC_IS_VF(nic_dev->hwdev)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "To set VF bandwidth rate, please use ip link cmd\n");
+ return -EINVAL;
+ }
+
+ if (!buf_in || !buf_out || in_size != sizeof(u32) || !out_size || *out_size != sizeof(u8))
+ return -EINVAL;
+
+ pf_bw_limit = *((u32 *)buf_in);
+
+ err = hinic3_set_pf_bw_limit(nic_dev->hwdev, pf_bw_limit);
+ if (err) {
+		nicif_err(nic_dev, drv, nic_dev->netdev, "Failed to set pf bandwidth limit to %u%%\n",
+ pf_bw_limit);
+ if (err < 0)
+ return err;
+ }
+
+ *((u8 *)buf_out) = (u8)err;
+
+ return 0;
+}
+
+static int get_pf_bw_limit(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+ if (HINIC3_FUNC_IS_VF(nic_dev->hwdev)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "To get VF bandwidth rate, please use ip link cmd\n");
+ return -EINVAL;
+ }
+
+ if (!buf_out || !out_size)
+ return -EINVAL;
+
+ if (*out_size != sizeof(u32)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Unexpected out buf size from user: %u, expect: %lu\n",
+ *out_size, sizeof(u32));
+ return -EFAULT;
+ }
+
+ nic_io = hinic3_get_service_adapter(nic_dev->hwdev, SERVICE_T_NIC);
+ if (!nic_io)
+ return -EINVAL;
+
+ *((u32 *)buf_out) = nic_io->nic_cfg.pf_bw_limit;
+
+ return 0;
+}
+
+static int get_sset_count(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ u32 count;
+
+ if (!buf_in || in_size != sizeof(u32) || !out_size ||
+ *out_size != sizeof(u32) || !buf_out) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Invalid parameters, in_size: %u\n",
+ in_size);
+ return -EINVAL;
+ }
+
+ switch (*((u32 *)buf_in)) {
+ case HINIC3_SHOW_SSET_IO_STATS:
+ count = hinic3_get_io_stats_size(nic_dev);
+ break;
+ default:
+ count = 0;
+ break;
+ }
+
+ *((u32 *)buf_out) = count;
+
+ return 0;
+}
+
+static int get_sset_stats(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ struct hinic3_show_item *items = buf_out;
+ u32 sset, count, size;
+ int err;
+
+ if (!buf_in || in_size != sizeof(u32) || !out_size || !buf_out) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Invalid parameters, in_size: %u\n",
+ in_size);
+ return -EINVAL;
+ }
+
+ size = sizeof(u32);
+ err = get_sset_count(nic_dev, buf_in, in_size, &count, &size);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Get sset count failed, ret=%d\n",
+ err);
+ return -EINVAL;
+ }
+ if (count * sizeof(*items) != *out_size) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Unexpected out buf size from user: %u, expect: %lu\n",
+ *out_size, count * sizeof(*items));
+ return -EINVAL;
+ }
+
+ sset = *((u32 *)buf_in);
+
+ switch (sset) {
+ case HINIC3_SHOW_SSET_IO_STATS:
+ hinic3_get_io_stats(nic_dev, items);
+ break;
+
+ default:
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Unknown %u to get stats\n",
+ sset);
+ err = -EINVAL;
+ break;
+ }
+
+ return err;
+}
+
+static int update_pcp_dscp_cfg(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_dcb_config *wanted_dcb_cfg,
+ const struct hinic3_mt_qos_dev_cfg *qos_in)
+{
+ int i;
+ u8 cos_num = 0, valid_cos_bitmap = 0;
+
+ if (qos_in->cfg_bitmap & CMD_QOS_DEV_PCP2COS) {
+ for (i = 0; i < NIC_DCB_UP_MAX; i++) {
+ if (!(nic_dev->func_dft_cos_bitmap & BIT(qos_in->pcp2cos[i]))) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Invalid cos=%u, func cos valid map is %u",
+ qos_in->pcp2cos[i], nic_dev->func_dft_cos_bitmap);
+ return -EINVAL;
+ }
+
+ if ((BIT(qos_in->pcp2cos[i]) & valid_cos_bitmap) == 0) {
+ valid_cos_bitmap |= (u8)BIT(qos_in->pcp2cos[i]);
+ cos_num++;
+ }
+ }
+
+ memcpy(wanted_dcb_cfg->pcp2cos, qos_in->pcp2cos, sizeof(qos_in->pcp2cos));
+ wanted_dcb_cfg->pcp_user_cos_num = cos_num;
+ wanted_dcb_cfg->pcp_valid_cos_map = valid_cos_bitmap;
+ }
+
+ if (qos_in->cfg_bitmap & CMD_QOS_DEV_DSCP2COS) {
+ cos_num = 0;
+ valid_cos_bitmap = 0;
+ for (i = 0; i < NIC_DCB_IP_PRI_MAX; i++) {
+ u8 cos = qos_in->dscp2cos[i] == DBG_DFLT_DSCP_VAL ?
+ nic_dev->wanted_dcb_cfg.dscp2cos[i] : qos_in->dscp2cos[i];
+
+ if (cos >= NIC_DCB_UP_MAX || !(nic_dev->func_dft_cos_bitmap & BIT(cos))) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Invalid cos=%u, func cos valid map is %u",
+ cos, nic_dev->func_dft_cos_bitmap);
+ return -EINVAL;
+ }
+
+ if ((BIT(cos) & valid_cos_bitmap) == 0) {
+ valid_cos_bitmap |= (u8)BIT(cos);
+ cos_num++;
+ }
+ }
+
+ for (i = 0; i < NIC_DCB_IP_PRI_MAX; i++)
+ wanted_dcb_cfg->dscp2cos[i] = qos_in->dscp2cos[i] == DBG_DFLT_DSCP_VAL ?
+ nic_dev->hw_dcb_cfg.dscp2cos[i] : qos_in->dscp2cos[i];
+ wanted_dcb_cfg->dscp_user_cos_num = cos_num;
+ wanted_dcb_cfg->dscp_valid_cos_map = valid_cos_bitmap;
+ }
+
+ return 0;
+}
+
+static int update_wanted_qos_cfg(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_dcb_config *wanted_dcb_cfg,
+ const struct hinic3_mt_qos_dev_cfg *qos_in)
+{
+ int ret;
+ u8 cos_num, valid_cos_bitmap;
+
+ if (qos_in->cfg_bitmap & CMD_QOS_DEV_TRUST) {
+ if (qos_in->trust > DCB_DSCP) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Invalid trust=%u\n", qos_in->trust);
+ return -EINVAL;
+ }
+
+ wanted_dcb_cfg->trust = qos_in->trust;
+ }
+
+ if (qos_in->cfg_bitmap & CMD_QOS_DEV_DFT_COS) {
+ if (!(BIT(qos_in->dft_cos) & nic_dev->func_dft_cos_bitmap)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Invalid dft_cos=%u\n", qos_in->dft_cos);
+ return -EINVAL;
+ }
+
+ wanted_dcb_cfg->default_cos = qos_in->dft_cos;
+ }
+
+ ret = update_pcp_dscp_cfg(nic_dev, wanted_dcb_cfg, qos_in);
+ if (ret)
+ return ret;
+
+ if (wanted_dcb_cfg->trust == DCB_PCP) {
+ cos_num = wanted_dcb_cfg->pcp_user_cos_num;
+ valid_cos_bitmap = wanted_dcb_cfg->pcp_valid_cos_map;
+ } else {
+ cos_num = wanted_dcb_cfg->dscp_user_cos_num;
+ valid_cos_bitmap = wanted_dcb_cfg->dscp_valid_cos_map;
+ }
+
+ if (test_bit(HINIC3_DCB_ENABLE, &nic_dev->flags)) {
+ if (cos_num > nic_dev->q_params.num_qps) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+				  "DCB is on, cos num should not be more than channel num: %u\n",
+ nic_dev->q_params.num_qps);
+ return -EOPNOTSUPP;
+ }
+ }
+
+ if (!(BIT(wanted_dcb_cfg->default_cos) & valid_cos_bitmap)) {
+ nicif_info(nic_dev, drv, nic_dev->netdev, "Current default_cos=%u, change to %u\n",
+ wanted_dcb_cfg->default_cos, (u8)fls(valid_cos_bitmap) - 1);
+ wanted_dcb_cfg->default_cos = (u8)fls(valid_cos_bitmap) - 1;
+ }
+
+ return 0;
+}
+
+static int dcb_mt_qos_map(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ const struct hinic3_mt_qos_dev_cfg *qos_in = buf_in;
+ struct hinic3_mt_qos_dev_cfg *qos_out = buf_out;
+ u8 i;
+ int err;
+
+ if (!buf_out || !out_size || !buf_in)
+ return -EINVAL;
+
+ if (*out_size != sizeof(*qos_out) || in_size != sizeof(*qos_in)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Unexpected buf size from user, in_size: %u, out_size: %u, expect: %lu\n",
+ in_size, *out_size, sizeof(*qos_in));
+ return -EINVAL;
+ }
+
+ memcpy(qos_out, qos_in, sizeof(*qos_in));
+ qos_out->head.status = 0;
+ if (qos_in->op_code & MT_DCB_OPCODE_WR) {
+ memcpy(&nic_dev->wanted_dcb_cfg, &nic_dev->hw_dcb_cfg,
+ sizeof(struct hinic3_dcb_config));
+ err = update_wanted_qos_cfg(nic_dev, &nic_dev->wanted_dcb_cfg, qos_in);
+ if (err) {
+ qos_out->head.status = MT_EINVAL;
+ return 0;
+ }
+
+ err = hinic3_dcbcfg_set_up_bitmap(nic_dev);
+ if (err)
+ qos_out->head.status = MT_EIO;
+ } else {
+ qos_out->dft_cos = nic_dev->hw_dcb_cfg.default_cos;
+ qos_out->trust = nic_dev->hw_dcb_cfg.trust;
+ for (i = 0; i < NIC_DCB_UP_MAX; i++)
+ qos_out->pcp2cos[i] = nic_dev->hw_dcb_cfg.pcp2cos[i];
+ for (i = 0; i < NIC_DCB_IP_PRI_MAX; i++)
+ qos_out->dscp2cos[i] = nic_dev->hw_dcb_cfg.dscp2cos[i];
+ }
+
+ return 0;
+}
+
+static int dcb_mt_dcb_state(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ const struct hinic3_mt_dcb_state *dcb_in = buf_in;
+ struct hinic3_mt_dcb_state *dcb_out = buf_out;
+ int err;
+ u8 user_cos_num;
+ u8 netif_run = 0;
+
+ if (!buf_in || !buf_out || !out_size)
+ return -EINVAL;
+
+ if (*out_size != sizeof(*dcb_out) || in_size != sizeof(*dcb_in)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Unexpected buf size from user, in_size: %u, out_size: %u, expect: %lu\n",
+ in_size, *out_size, sizeof(*dcb_in));
+ return -EINVAL;
+ }
+
+ user_cos_num = hinic3_get_dev_user_cos_num(nic_dev);
+ memcpy(dcb_out, dcb_in, sizeof(*dcb_in));
+ dcb_out->head.status = 0;
+ if (dcb_in->op_code & MT_DCB_OPCODE_WR) {
+ if (test_bit(HINIC3_DCB_ENABLE, &nic_dev->flags) == dcb_in->state)
+ return 0;
+
+ if (dcb_in->state) {
+ if (user_cos_num > nic_dev->q_params.num_qps) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+					  "cos num %u should not be more than channel num %u\n",
+ user_cos_num,
+ nic_dev->q_params.num_qps);
+
+ return -EOPNOTSUPP;
+ }
+ }
+
+ rtnl_lock();
+ if (netif_running(nic_dev->netdev)) {
+ netif_run = 1;
+ hinic3_vport_down(nic_dev);
+ }
+
+ err = hinic3_setup_cos(nic_dev->netdev, dcb_in->state ? user_cos_num : 0,
+ netif_run);
+ if (err)
+ goto setup_cos_fail;
+
+ if (netif_run) {
+ err = hinic3_vport_up(nic_dev);
+ if (err)
+ goto vport_up_fail;
+ }
+ rtnl_unlock();
+ } else {
+ dcb_out->state = !!test_bit(HINIC3_DCB_ENABLE, &nic_dev->flags);
+ }
+
+ return 0;
+
+vport_up_fail:
+ hinic3_setup_cos(nic_dev->netdev, dcb_in->state ? 0 : user_cos_num, netif_run);
+
+setup_cos_fail:
+ if (netif_run)
+ hinic3_vport_up(nic_dev);
+ rtnl_unlock();
+
+ return err;
+}
+
+static int dcb_mt_hw_qos_get(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ const struct hinic3_mt_qos_cos_cfg *cos_cfg_in = buf_in;
+ struct hinic3_mt_qos_cos_cfg *cos_cfg_out = buf_out;
+
+ if (!buf_in || !buf_out || !out_size)
+ return -EINVAL;
+
+ if (*out_size != sizeof(*cos_cfg_out) || in_size != sizeof(*cos_cfg_in)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Unexpected buf size from user, in_size: %u, out_size: %u, expect: %lu\n",
+ in_size, *out_size, sizeof(*cos_cfg_in));
+ return -EINVAL;
+ }
+
+ memcpy(cos_cfg_out, cos_cfg_in, sizeof(*cos_cfg_in));
+ cos_cfg_out->head.status = 0;
+
+ cos_cfg_out->port_id = hinic3_physical_port_id(nic_dev->hwdev);
+ cos_cfg_out->func_cos_bitmap = (u8)nic_dev->func_dft_cos_bitmap;
+ cos_cfg_out->port_cos_bitmap = (u8)nic_dev->port_dft_cos_bitmap;
+ cos_cfg_out->func_max_cos_num = nic_dev->cos_config_num_max;
+
+ return 0;
+}
+
+static int get_inter_num(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ u16 intr_num;
+
+ intr_num = hinic3_intr_num(nic_dev->hwdev);
+
+ if (!buf_out || !out_size || *out_size != sizeof(u16)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Unexpected out buf size from user: %u, expect: %lu\n",
+ *out_size, sizeof(u16));
+ return -EFAULT;
+ }
+ *(u16 *)buf_out = intr_num;
+
+ return 0;
+}
+
+static int get_netdev_name(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ if (!buf_out || !out_size || *out_size != IFNAMSIZ) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Unexpected out buf size from user: %u, expect: %u\n",
+ *out_size, IFNAMSIZ);
+ return -EFAULT;
+ }
+
+ strlcpy(buf_out, nic_dev->netdev->name, IFNAMSIZ);
+
+ return 0;
+}
+
+static int get_netdev_tx_timeout(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ struct net_device *net_dev = nic_dev->netdev;
+ int *tx_timeout = buf_out;
+
+ if (!buf_out || !out_size)
+ return -EINVAL;
+
+ if (*out_size != sizeof(int)) {
+		nicif_err(nic_dev, drv, net_dev, "Unexpected buf size from user, out_size: %u, expect: %lu\n",
+ *out_size, sizeof(int));
+ return -EINVAL;
+ }
+
+ *tx_timeout = net_dev->watchdog_timeo;
+
+ return 0;
+}
+
+static int set_netdev_tx_timeout(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ struct net_device *net_dev = nic_dev->netdev;
+ const int *tx_timeout = buf_in;
+
+ if (!buf_in)
+ return -EINVAL;
+
+ if (in_size != sizeof(int)) {
+		nicif_err(nic_dev, drv, net_dev, "Unexpected buf size from user, in_size: %u, expect: %lu\n",
+ in_size, sizeof(int));
+ return -EINVAL;
+ }
+
+ net_dev->watchdog_timeo = *tx_timeout * HZ;
+ nicif_info(nic_dev, drv, net_dev, "Set tx timeout check period to %ds\n", *tx_timeout);
+
+ return 0;
+}
+
+static int get_xsfp_present(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ struct mag_cmd_get_xsfp_present *sfp_abs = buf_out;
+
+ if (!buf_in || !buf_out || !out_size)
+ return -EINVAL;
+
+ if (*out_size != sizeof(*sfp_abs) || in_size != sizeof(*sfp_abs)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Unexpected buf size from user, in_size: %u, out_size: %u, expect: %lu\n",
+ in_size, *out_size, sizeof(*sfp_abs));
+ return -EINVAL;
+ }
+
+ sfp_abs->head.status = 0;
+ sfp_abs->abs_status = hinic3_if_sfp_absent(nic_dev->hwdev);
+
+ return 0;
+}
+
+static int get_xsfp_info(struct hinic3_nic_dev *nic_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ struct mag_cmd_get_xsfp_info *sfp_info = buf_out;
+ int err;
+
+ if (!buf_in || !buf_out || !out_size)
+ return -EINVAL;
+
+ if (*out_size != sizeof(*sfp_info) || in_size != sizeof(*sfp_info)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Unexpected buf size from user, in_size: %u, out_size: %u, expect: %lu\n",
+ in_size, *out_size, sizeof(*sfp_info));
+ return -EINVAL;
+ }
+
+ err = hinic3_get_sfp_info(nic_dev->hwdev, sfp_info);
+ if (err) {
+ sfp_info->head.status = MT_EIO;
+ return 0;
+ }
+
+ return 0;
+}
+
+static const struct nic_drv_module_handle nic_driv_module_cmd_handle[] = {
+ {TX_INFO, get_tx_info},
+ {Q_NUM, get_q_num},
+ {TX_WQE_INFO, get_tx_wqe_info},
+ {RX_INFO, get_rx_info},
+ {RX_WQE_INFO, get_rx_wqe_info},
+ {RX_CQE_INFO, get_rx_cqe_info},
+ {GET_INTER_NUM, get_inter_num},
+ {CLEAR_FUNC_STASTIC, clear_func_static},
+ {GET_LOOPBACK_MODE, get_loopback_mode},
+ {SET_LOOPBACK_MODE, set_loopback_mode},
+ {SET_LINK_MODE, set_link_mode},
+ {SET_PF_BW_LIMIT, set_pf_bw_limit},
+ {GET_PF_BW_LIMIT, get_pf_bw_limit},
+ {GET_SSET_COUNT, get_sset_count},
+ {GET_SSET_ITEMS, get_sset_stats},
+ {DCB_STATE, dcb_mt_dcb_state},
+ {QOS_DEV, dcb_mt_qos_map},
+ {GET_QOS_COS, dcb_mt_hw_qos_get},
+ {GET_ULD_DEV_NAME, get_netdev_name},
+ {GET_TX_TIMEOUT, get_netdev_tx_timeout},
+ {SET_TX_TIMEOUT, set_netdev_tx_timeout},
+ {GET_XSFP_PRESENT, get_xsfp_present},
+ {GET_XSFP_INFO, get_xsfp_info},
+};
+
+static int send_to_nic_driver(struct hinic3_nic_dev *nic_dev,
+ u32 cmd, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+	int index, num_cmds = ARRAY_SIZE(nic_driv_module_cmd_handle);
+ enum driver_cmd_type cmd_type = (enum driver_cmd_type)cmd;
+ int err = 0;
+
+ mutex_lock(&nic_dev->nic_mutex);
+ for (index = 0; index < num_cmds; index++) {
+ if (cmd_type ==
+ nic_driv_module_cmd_handle[index].driv_cmd_name) {
+ err = nic_driv_module_cmd_handle[index].driv_func
+ (nic_dev, buf_in,
+ in_size, buf_out, out_size);
+ break;
+ }
+ }
+ mutex_unlock(&nic_dev->nic_mutex);
+
+ if (index == num_cmds) {
+ pr_err("Can't find callback for %d\n", cmd_type);
+ return -EINVAL;
+ }
+
+ return err;
+}
+
+int nic_ioctl(void *uld_dev, u32 cmd, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
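+	/* GET_DRV_VERSION is answered directly and is the only command that
+	 * is valid without a bound uld_dev
+	 */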
+ if (cmd == GET_DRV_VERSION)
+ return get_nic_drv_version(buf_out, out_size);
+ else if (!uld_dev)
+ return -EINVAL;
+
+ return send_to_nic_driver(uld_dev, cmd, buf_in,
+ in_size, buf_out, out_size);
+}
+
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_dcb.c b/drivers/net/ethernet/huawei/hinic3/hinic3_dcb.c
new file mode 100644
index 000000000000..a1fb4afb323e
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_dcb.c
@@ -0,0 +1,405 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC]" fmt
+
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/interrupt.h>
+#include <linux/etherdevice.h>
+#include <linux/netdevice.h>
+
+#include "hinic3_crm.h"
+#include "hinic3_lld.h"
+#include "hinic3_nic_cfg.h"
+#include "hinic3_srv_nic.h"
+#include "hinic3_nic_dev.h"
+#include "hinic3_dcb.h"
+
+#define MAX_BW_PERCENT 100
+
+u8 hinic3_get_dev_user_cos_num(struct hinic3_nic_dev *nic_dev)
+{
+	if (nic_dev->hw_dcb_cfg.trust == DCB_PCP)
+		return nic_dev->hw_dcb_cfg.pcp_user_cos_num;
+	if (nic_dev->hw_dcb_cfg.trust == DCB_DSCP)
+		return nic_dev->hw_dcb_cfg.dscp_user_cos_num;
+ return 0;
+}
+
+u8 hinic3_get_dev_valid_cos_map(struct hinic3_nic_dev *nic_dev)
+{
+	if (nic_dev->hw_dcb_cfg.trust == DCB_PCP)
+		return nic_dev->hw_dcb_cfg.pcp_valid_cos_map;
+	if (nic_dev->hw_dcb_cfg.trust == DCB_DSCP)
+		return nic_dev->hw_dcb_cfg.dscp_valid_cos_map;
+ return 0;
+}
+
+void hinic3_update_qp_cos_cfg(struct hinic3_nic_dev *nic_dev, u8 num_cos)
+{
+ struct hinic3_dcb_config *dcb_cfg = &nic_dev->hw_dcb_cfg;
+ u8 i, remainder, num_sq_per_cos, cur_cos_num = 0;
+ u8 valid_cos_map = hinic3_get_dev_valid_cos_map(nic_dev);
+
+ if (num_cos == 0)
+ return;
+
+ num_sq_per_cos = (u8)(nic_dev->q_params.num_qps / num_cos);
+ if (num_sq_per_cos == 0)
+ return;
+
+	remainder = nic_dev->q_params.num_qps % num_cos;
+
+ memset(dcb_cfg->cos_qp_offset, 0, sizeof(dcb_cfg->cos_qp_offset));
+ memset(dcb_cfg->cos_qp_num, 0, sizeof(dcb_cfg->cos_qp_num));
+
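+	/* distribute num_qps across the CoS values set in valid_cos_map:
+	 * every valid CoS gets num_sq_per_cos queues, and the first
+	 * 'remainder' of them absorb one extra queue each
+	 */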
+ for (i = 0; i < PCP_MAX_UP; i++) {
+ if (BIT(i) & valid_cos_map) {
+ u8 cos_qp_num = num_sq_per_cos;
+ u8 cos_qp_offset = (u8)(cur_cos_num * num_sq_per_cos);
+
+ if (cur_cos_num < remainder) {
+ cos_qp_num++;
+ cos_qp_offset += cur_cos_num;
+ } else {
+ cos_qp_offset += remainder;
+ }
+
+ cur_cos_num++;
+ valid_cos_map -= (u8)BIT(i);
+
+ dcb_cfg->cos_qp_offset[i] = cos_qp_offset;
+ dcb_cfg->cos_qp_num[i] = cos_qp_num;
+ hinic3_info(nic_dev, drv, "cos %u, cos_qp_offset=%u cos_qp_num=%u\n",
+ i, cos_qp_offset, cos_qp_num);
+ }
+ }
+
+ memcpy(nic_dev->wanted_dcb_cfg.cos_qp_offset, dcb_cfg->cos_qp_offset,
+ sizeof(dcb_cfg->cos_qp_offset));
+ memcpy(nic_dev->wanted_dcb_cfg.cos_qp_num, dcb_cfg->cos_qp_num,
+ sizeof(dcb_cfg->cos_qp_num));
+}
+
+void hinic3_update_tx_db_cos(struct hinic3_nic_dev *nic_dev, u8 dcb_en)
+{
+ u8 i;
+ u16 start_qid, q_num;
+
+ hinic3_set_txq_cos(nic_dev, 0, nic_dev->q_params.num_qps,
+ nic_dev->hw_dcb_cfg.default_cos);
+ if (!dcb_en)
+ return;
+
+ for (i = 0; i < NIC_DCB_COS_MAX; i++) {
+ q_num = (u16)nic_dev->hw_dcb_cfg.cos_qp_num[i];
+ if (q_num) {
+ start_qid = (u16)nic_dev->hw_dcb_cfg.cos_qp_offset[i];
+
+ hinic3_set_txq_cos(nic_dev, start_qid, q_num, i);
+ hinic3_info(nic_dev, drv, "update tx db cos, start_qid %u, q_num=%u cos=%u\n",
+ start_qid, q_num, i);
+ }
+ }
+}
+
+static int hinic3_set_tx_cos_state(struct hinic3_nic_dev *nic_dev, u8 dcb_en)
+{
+ struct hinic3_dcb_config *dcb_cfg = &nic_dev->hw_dcb_cfg;
+ struct hinic3_dcb_state dcb_state = {0};
+ u8 i;
+ int err;
+
+ if (HINIC3_FUNC_IS_VF(nic_dev->hwdev)) {
+ /* VF does not support DCB, use the default cos */
+ dcb_cfg->default_cos = (u8)fls(nic_dev->func_dft_cos_bitmap) - 1;
+
+ return 0;
+ }
+
+ dcb_state.dcb_on = dcb_en;
+ dcb_state.default_cos = dcb_cfg->default_cos;
+ dcb_state.trust = dcb_cfg->trust;
+
+ if (dcb_en) {
+ for (i = 0; i < NIC_DCB_COS_MAX; i++)
+ dcb_state.pcp2cos[i] = dcb_cfg->pcp2cos[i];
+ for (i = 0; i < NIC_DCB_IP_PRI_MAX; i++)
+ dcb_state.dscp2cos[i] = dcb_cfg->dscp2cos[i];
+ } else {
+ memset(dcb_state.pcp2cos, dcb_cfg->default_cos, sizeof(dcb_state.pcp2cos));
+ memset(dcb_state.dscp2cos, dcb_cfg->default_cos, sizeof(dcb_state.dscp2cos));
+ }
+
+ err = hinic3_set_dcb_state(nic_dev->hwdev, &dcb_state);
+ if (err)
+ hinic3_err(nic_dev, drv, "Failed to set dcb state\n");
+
+ return err;
+}
+
+static int hinic3_configure_dcb_hw(struct hinic3_nic_dev *nic_dev, u8 dcb_en)
+{
+ int err;
+ u8 user_cos_num = hinic3_get_dev_user_cos_num(nic_dev);
+
+ err = hinic3_sync_dcb_state(nic_dev->hwdev, 1, dcb_en);
+ if (err) {
+ hinic3_err(nic_dev, drv, "Set dcb state failed\n");
+ return err;
+ }
+
+ hinic3_update_qp_cos_cfg(nic_dev, user_cos_num);
+ hinic3_update_tx_db_cos(nic_dev, dcb_en);
+
+ err = hinic3_set_tx_cos_state(nic_dev, dcb_en);
+ if (err) {
+ hinic3_err(nic_dev, drv, "Set tx cos state failed\n");
+ goto set_tx_cos_fail;
+ }
+
+ err = hinic3_rx_configure(nic_dev->netdev, dcb_en);
+ if (err) {
+ hinic3_err(nic_dev, drv, "rx configure failed\n");
+ goto rx_configure_fail;
+ }
+
+ if (dcb_en)
+ set_bit(HINIC3_DCB_ENABLE, &nic_dev->flags);
+ else
+ clear_bit(HINIC3_DCB_ENABLE, &nic_dev->flags);
+
+ return 0;
+rx_configure_fail:
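+	/* roll back in reverse order, restoring the opposite DCB state */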
+ hinic3_set_tx_cos_state(nic_dev, dcb_en ? 0 : 1);
+
+set_tx_cos_fail:
+ hinic3_update_tx_db_cos(nic_dev, dcb_en ? 0 : 1);
+ hinic3_sync_dcb_state(nic_dev->hwdev, 1, dcb_en ? 0 : 1);
+
+ return err;
+}
+
+int hinic3_setup_cos(struct net_device *netdev, u8 cos, u8 netif_run)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ int err;
+
+ if (cos && test_bit(HINIC3_SAME_RXTX, &nic_dev->flags)) {
+ nicif_err(nic_dev, drv, netdev, "Failed to enable DCB while Symmetric RSS is enabled\n");
+ return -EOPNOTSUPP;
+ }
+
+ if (cos > nic_dev->cos_config_num_max) {
+ nicif_err(nic_dev, drv, netdev, "Invalid num_tc: %u, max cos: %u\n",
+ cos, nic_dev->cos_config_num_max);
+ return -EINVAL;
+ }
+
+	return hinic3_configure_dcb_hw(nic_dev, cos ? 1 : 0);
+}
+
+static u8 get_cos_num(u8 hw_valid_cos_bitmap)
+{
+ u8 support_cos = 0;
+ u8 i;
+
+ for (i = 0; i < NIC_DCB_COS_MAX; i++)
+ if (hw_valid_cos_bitmap & BIT(i))
+ support_cos++;
+
+ return support_cos;
+}
+
+static void hinic3_sync_dcb_cfg(struct hinic3_nic_dev *nic_dev,
+ const struct hinic3_dcb_config *dcb_cfg)
+{
+ struct hinic3_dcb_config *hw_cfg = &nic_dev->hw_dcb_cfg;
+
+ memcpy(hw_cfg, dcb_cfg, sizeof(struct hinic3_dcb_config));
+}
+
+static int init_default_dcb_cfg(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_dcb_config *dcb_cfg)
+{
+ u8 i, hw_dft_cos_map, port_cos_bitmap, dscp_ind;
+ int err;
+
+ err = hinic3_cos_valid_bitmap(nic_dev->hwdev, &hw_dft_cos_map, &port_cos_bitmap);
+ if (err) {
+		hinic3_err(nic_dev, drv, "No cos supported\n");
+ return -EFAULT;
+ }
+ nic_dev->func_dft_cos_bitmap = hw_dft_cos_map;
+ nic_dev->port_dft_cos_bitmap = port_cos_bitmap;
+
+ nic_dev->cos_config_num_max = get_cos_num(hw_dft_cos_map);
+
+ dcb_cfg->trust = DCB_PCP;
+ dcb_cfg->pcp_user_cos_num = nic_dev->cos_config_num_max;
+ dcb_cfg->dscp_user_cos_num = nic_dev->cos_config_num_max;
+ dcb_cfg->default_cos = (u8)fls(nic_dev->func_dft_cos_bitmap) - 1;
+ dcb_cfg->pcp_valid_cos_map = hw_dft_cos_map;
+ dcb_cfg->dscp_valid_cos_map = hw_dft_cos_map;
+
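+	/* by default each valid CoS maps to itself for PCP (invalid ones fall
+	 * back to the default cos), and every block of NIC_DCB_DSCP_NUM DSCP
+	 * values inherits the mapping of its PCP index
+	 */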
+ for (i = 0; i < NIC_DCB_COS_MAX; i++) {
+ dcb_cfg->pcp2cos[i] = hw_dft_cos_map & BIT(i) ? i : dcb_cfg->default_cos;
+ for (dscp_ind = 0; dscp_ind < NIC_DCB_COS_MAX; dscp_ind++)
+ dcb_cfg->dscp2cos[i * NIC_DCB_DSCP_NUM + dscp_ind] = dcb_cfg->pcp2cos[i];
+ }
+
+ return 0;
+}
+
+void hinic3_dcb_reset_hw_config(struct hinic3_nic_dev *nic_dev)
+{
+ struct hinic3_dcb_config dft_cfg = {0};
+
+ init_default_dcb_cfg(nic_dev, &dft_cfg);
+ hinic3_sync_dcb_cfg(nic_dev, &dft_cfg);
+
+ hinic3_info(nic_dev, drv, "Reset DCB configuration done\n");
+}
+
+int hinic3_configure_dcb(struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ int err;
+
+ err = hinic3_sync_dcb_state(nic_dev->hwdev, 1,
+ test_bit(HINIC3_DCB_ENABLE, &nic_dev->flags) ? 1 : 0);
+ if (err) {
+ hinic3_err(nic_dev, drv, "Set dcb state failed\n");
+ return err;
+ }
+
+ if (test_bit(HINIC3_DCB_ENABLE, &nic_dev->flags))
+ hinic3_sync_dcb_cfg(nic_dev, &nic_dev->wanted_dcb_cfg);
+ else
+ hinic3_dcb_reset_hw_config(nic_dev);
+
+ return 0;
+}
+
+int hinic3_dcb_init(struct hinic3_nic_dev *nic_dev)
+{
+ struct hinic3_dcb_config *dcb_cfg = &nic_dev->hw_dcb_cfg;
+ int err;
+ u8 dcb_en = test_bit(HINIC3_DCB_ENABLE, &nic_dev->flags) ? 1 : 0;
+
+ if (HINIC3_FUNC_IS_VF(nic_dev->hwdev))
+ return hinic3_set_tx_cos_state(nic_dev, dcb_en);
+
+ err = init_default_dcb_cfg(nic_dev, dcb_cfg);
+ if (err) {
+ hinic3_err(nic_dev, drv, "Initialize dcb configuration failed\n");
+ return err;
+ }
+
+ memcpy(&nic_dev->wanted_dcb_cfg, &nic_dev->hw_dcb_cfg, sizeof(struct hinic3_dcb_config));
+
+ hinic3_info(nic_dev, drv, "Support num cos %u, default cos %u\n",
+ nic_dev->cos_config_num_max, dcb_cfg->default_cos);
+
+ err = hinic3_set_tx_cos_state(nic_dev, dcb_en);
+ if (err) {
+ hinic3_err(nic_dev, drv, "Set tx cos state failed\n");
+ return err;
+ }
+
+ sema_init(&nic_dev->dcb_sem, 1);
+
+ return 0;
+}
+
+static int change_qos_cfg(struct hinic3_nic_dev *nic_dev, const struct hinic3_dcb_config *dcb_cfg)
+{
+ struct net_device *netdev = nic_dev->netdev;
+ int err = 0;
+ u8 user_cos_num = hinic3_get_dev_user_cos_num(nic_dev);
+
+ if (test_and_set_bit(HINIC3_DCB_UP_COS_SETTING, &nic_dev->dcb_flags)) {
+ nicif_warn(nic_dev, drv, netdev,
+			   "Cos_up map setting is in process, please try again later\n");
+ return -EFAULT;
+ }
+
+ hinic3_sync_dcb_cfg(nic_dev, dcb_cfg);
+
+ hinic3_update_qp_cos_cfg(nic_dev, user_cos_num);
+
+ clear_bit(HINIC3_DCB_UP_COS_SETTING, &nic_dev->dcb_flags);
+
+ return err;
+}
+
+int hinic3_dcbcfg_set_up_bitmap(struct hinic3_nic_dev *nic_dev)
+{
+ int err, rollback_err;
+ u8 netif_run = 0;
+ struct hinic3_dcb_config old_dcb_cfg;
+ u8 user_cos_num = hinic3_get_dev_user_cos_num(nic_dev);
+
+ memcpy(&old_dcb_cfg, &nic_dev->hw_dcb_cfg, sizeof(struct hinic3_dcb_config));
+
+ if (!memcmp(&nic_dev->wanted_dcb_cfg, &old_dcb_cfg, sizeof(struct hinic3_dcb_config))) {
+ nicif_info(nic_dev, drv, nic_dev->netdev,
+			   "Same valid up bitmap, nothing to change\n");
+ return 0;
+ }
+
+ rtnl_lock();
+ if (netif_running(nic_dev->netdev)) {
+ netif_run = 1;
+ hinic3_vport_down(nic_dev);
+ }
+
+ err = change_qos_cfg(nic_dev, &nic_dev->wanted_dcb_cfg);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Set cos_up map to hw failed\n");
+ goto change_qos_cfg_fail;
+ }
+
+ if (test_bit(HINIC3_DCB_ENABLE, &nic_dev->flags)) {
+ err = hinic3_setup_cos(nic_dev->netdev, user_cos_num, netif_run);
+ if (err)
+ goto set_err;
+ }
+
+ if (netif_run) {
+ err = hinic3_vport_up(nic_dev);
+ if (err)
+ goto vport_up_fail;
+ }
+
+ rtnl_unlock();
+
+ return 0;
+
+vport_up_fail:
+ if (test_bit(HINIC3_DCB_ENABLE, &nic_dev->flags))
+		/* roll back by disabling DCB (0 user cos) */
+		hinic3_setup_cos(nic_dev->netdev, 0, netif_run);
+
+set_err:
+ rollback_err = change_qos_cfg(nic_dev, &old_dcb_cfg);
+ if (rollback_err)
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to rollback qos configure\n");
+
+change_qos_cfg_fail:
+ if (netif_run)
+ hinic3_vport_up(nic_dev);
+
+ rtnl_unlock();
+
+ return err;
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_dcb.h b/drivers/net/ethernet/huawei/hinic3/hinic3_dcb.h
new file mode 100644
index 000000000000..7987f563cfff
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_dcb.h
@@ -0,0 +1,78 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_DCB_H
+#define HINIC3_DCB_H
+
+#include "ossl_knl.h"
+
+enum HINIC3_DCB_FLAGS {
+ HINIC3_DCB_UP_COS_SETTING,
+ HINIC3_DCB_TRAFFIC_STOPPED,
+};
+
+struct hinic3_cos_cfg {
+ u8 up;
+ u8 bw_pct;
+ u8 tc_id;
+ u8 prio_sp; /* 0 - DWRR, 1 - SP */
+};
+
+struct hinic3_tc_cfg {
+ u8 bw_pct;
+ u8 prio_sp; /* 0 - DWRR, 1 - SP */
+ u16 rsvd;
+};
+
+enum HINIC3_DCB_TRUST {
+ DCB_PCP,
+ DCB_DSCP,
+};
+
+#define PCP_MAX_UP 8
+#define DSCP_MAC_UP 64
+#define DBG_DFLT_DSCP_VAL 0xFF
+
+struct hinic3_dcb_config {
+ u8 trust; /* pcp, dscp */
+ u8 default_cos;
+ u8 pcp_user_cos_num;
+ u8 pcp_valid_cos_map;
+ u8 dscp_user_cos_num;
+ u8 dscp_valid_cos_map;
+ u8 pcp2cos[PCP_MAX_UP];
+ u8 dscp2cos[DSCP_MAC_UP];
+
+ u8 cos_qp_offset[NIC_DCB_COS_MAX];
+ u8 cos_qp_num[NIC_DCB_COS_MAX];
+};
+
+u8 hinic3_get_dev_user_cos_num(struct hinic3_nic_dev *nic_dev);
+u8 hinic3_get_dev_valid_cos_map(struct hinic3_nic_dev *nic_dev);
+int hinic3_dcb_init(struct hinic3_nic_dev *nic_dev);
+void hinic3_dcb_reset_hw_config(struct hinic3_nic_dev *nic_dev);
+int hinic3_configure_dcb(struct net_device *netdev);
+int hinic3_setup_cos(struct net_device *netdev, u8 cos, u8 netif_run);
+void hinic3_dcbcfg_set_pfc_state(struct hinic3_nic_dev *nic_dev, u8 pfc_state);
+u8 hinic3_dcbcfg_get_pfc_state(struct hinic3_nic_dev *nic_dev);
+void hinic3_dcbcfg_set_pfc_pri_en(struct hinic3_nic_dev *nic_dev,
+ u8 pfc_en_bitmap);
+u8 hinic3_dcbcfg_get_pfc_pri_en(struct hinic3_nic_dev *nic_dev);
+int hinic3_dcbcfg_set_ets_up_tc_map(struct hinic3_nic_dev *nic_dev,
+ const u8 *up_tc_map);
+void hinic3_dcbcfg_get_ets_up_tc_map(struct hinic3_nic_dev *nic_dev,
+ u8 *up_tc_map);
+int hinic3_dcbcfg_set_ets_tc_bw(struct hinic3_nic_dev *nic_dev,
+ const u8 *tc_bw);
+void hinic3_dcbcfg_get_ets_tc_bw(struct hinic3_nic_dev *nic_dev, u8 *tc_bw);
+void hinic3_dcbcfg_set_ets_tc_prio_type(struct hinic3_nic_dev *nic_dev,
+ u8 tc_prio_bitmap);
+void hinic3_dcbcfg_get_ets_tc_prio_type(struct hinic3_nic_dev *nic_dev,
+ u8 *tc_prio_bitmap);
+int hinic3_dcbcfg_set_up_bitmap(struct hinic3_nic_dev *nic_dev);
+void hinic3_update_tx_db_cos(struct hinic3_nic_dev *nic_dev, u8 dcb_en);
+
+void hinic3_update_qp_cos_cfg(struct hinic3_nic_dev *nic_dev, u8 num_cos);
+void hinic3_vport_down(struct hinic3_nic_dev *nic_dev);
+int hinic3_vport_up(struct hinic3_nic_dev *nic_dev);
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_ethtool.c b/drivers/net/ethernet/huawei/hinic3/hinic3_ethtool.c
new file mode 100644
index 000000000000..2b3561e5bca1
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_ethtool.c
@@ -0,0 +1,1331 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC]" fmt
+
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/interrupt.h>
+#include <linux/etherdevice.h>
+#include <linux/netdevice.h>
+#include <linux/if_vlan.h>
+#include <linux/ethtool.h>
+
+#include "ossl_knl.h"
+#include "hinic3_hw.h"
+#include "hinic3_crm.h"
+#include "hinic3_nic_dev.h"
+#include "hinic3_tx.h"
+#include "hinic3_rx.h"
+#include "hinic3_rss.h"
+
+#define COALESCE_ALL_QUEUE 0xFFFF
+#define COALESCE_PENDING_LIMIT_UNIT 8
+#define COALESCE_TIMER_CFG_UNIT 5
+#define COALESCE_MAX_PENDING_LIMIT (255 * COALESCE_PENDING_LIMIT_UNIT)
+#define COALESCE_MAX_TIMER_CFG (255 * COALESCE_TIMER_CFG_UNIT)
+#define HINIC3_WAIT_PKTS_TO_RX_BUFFER 200
+#define HINIC3_WAIT_CLEAR_LP_TEST 100
+
+#ifndef SET_ETHTOOL_OPS
+#define SET_ETHTOOL_OPS(netdev, ops) \
+ ((netdev)->ethtool_ops = (ops))
+#endif
+
+static void hinic3_get_drvinfo(struct net_device *netdev,
+ struct ethtool_drvinfo *info)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct pci_dev *pdev = nic_dev->pdev;
+ u8 mgmt_ver[HINIC3_MGMT_VERSION_MAX_LEN] = {0};
+ int err;
+
+ strlcpy(info->driver, HINIC3_NIC_DRV_NAME, sizeof(info->driver));
+ strlcpy(info->version, HINIC3_NIC_DRV_VERSION, sizeof(info->version));
+ strlcpy(info->bus_info, pci_name(pdev), sizeof(info->bus_info));
+
+ err = hinic3_get_mgmt_version(nic_dev->hwdev, mgmt_ver,
+ HINIC3_MGMT_VERSION_MAX_LEN,
+ HINIC3_CHANNEL_NIC);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev, "Failed to get fw version\n");
+ return;
+ }
+
+ err = snprintf(info->fw_version, sizeof(info->fw_version), "%s", mgmt_ver);
+ if (err < 0)
+ nicif_err(nic_dev, drv, netdev, "Failed to snprintf fw version\n");
+}
+
+static u32 hinic3_get_msglevel(struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+ return nic_dev->msg_enable;
+}
+
+static void hinic3_set_msglevel(struct net_device *netdev, u32 data)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+ nic_dev->msg_enable = data;
+
+ nicif_info(nic_dev, drv, netdev, "Set message level: 0x%x\n", data);
+}
+
+static int hinic3_nway_reset(struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct nic_port_info port_info = {0};
+ int err;
+
+ while (test_and_set_bit(HINIC3_AUTONEG_RESET, &nic_dev->flags))
+		msleep(100); /* sleep 100 ms, waiting for another in-progress autoneg restart to finish */
+
+ err = hinic3_get_port_info(nic_dev->hwdev, &port_info, HINIC3_CHANNEL_NIC);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev, "Get port info failed\n");
+ err = -EFAULT;
+ goto reset_err;
+ }
+
+ if (port_info.autoneg_state != PORT_CFG_AN_ON) {
+		nicif_err(nic_dev, drv, netdev, "Autonegotiation is not on, restarting it is not supported\n");
+ err = -EOPNOTSUPP;
+ goto reset_err;
+ }
+
+ err = hinic3_set_autoneg(nic_dev->hwdev, false);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev, "Set autonegotiation off failed\n");
+ err = -EFAULT;
+ goto reset_err;
+ }
+
+	msleep(200); /* sleep 200 ms, waiting for status polling to finish */
+
+ err = hinic3_set_autoneg(nic_dev->hwdev, true);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev, "Set autonegotiation on failed\n");
+ err = -EFAULT;
+ goto reset_err;
+ }
+
+	msleep(200); /* sleep 200 ms, waiting for status polling to finish */
+ nicif_info(nic_dev, drv, netdev, "Restart autonegotiation successfully\n");
+
+reset_err:
+ clear_bit(HINIC3_AUTONEG_RESET, &nic_dev->flags);
+ return err;
+}
+
+static void hinic3_get_ringparam(struct net_device *netdev,
+ struct ethtool_ringparam *ring,
+ struct kernel_ethtool_ringparam *kernel_ring,
+ struct netlink_ext_ack *extack)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+ ring->rx_max_pending = HINIC3_MAX_RX_QUEUE_DEPTH;
+ ring->tx_max_pending = HINIC3_MAX_TX_QUEUE_DEPTH;
+ ring->rx_pending = nic_dev->rxqs[0].q_depth;
+ ring->tx_pending = nic_dev->txqs[0].q_depth;
+}
+
+static void hinic3_update_qp_depth(struct hinic3_nic_dev *nic_dev,
+ u32 sq_depth, u32 rq_depth)
+{
+ u16 i;
+
+ nic_dev->q_params.sq_depth = sq_depth;
+ nic_dev->q_params.rq_depth = rq_depth;
+ for (i = 0; i < nic_dev->max_qps; i++) {
+ nic_dev->txqs[i].q_depth = sq_depth;
+ nic_dev->txqs[i].q_mask = sq_depth - 1;
+ nic_dev->rxqs[i].q_depth = rq_depth;
+ nic_dev->rxqs[i].q_mask = rq_depth - 1;
+ }
+}
+
+static int check_ringparam_valid(struct net_device *netdev,
+ const struct ethtool_ringparam *ring)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+ if (ring->rx_jumbo_pending || ring->rx_mini_pending) {
+ nicif_err(nic_dev, drv, netdev,
+ "Unsupported rx_jumbo_pending/rx_mini_pending\n");
+ return -EINVAL;
+ }
+
+ if (ring->tx_pending > HINIC3_MAX_TX_QUEUE_DEPTH ||
+ ring->tx_pending < HINIC3_MIN_QUEUE_DEPTH ||
+ ring->rx_pending > HINIC3_MAX_RX_QUEUE_DEPTH ||
+ ring->rx_pending < HINIC3_MIN_QUEUE_DEPTH) {
+ nicif_err(nic_dev, drv, netdev,
+			  "Queue depth out of range tx[%d-%d] rx[%d-%d]\n",
+ HINIC3_MIN_QUEUE_DEPTH, HINIC3_MAX_TX_QUEUE_DEPTH,
+ HINIC3_MIN_QUEUE_DEPTH, HINIC3_MAX_RX_QUEUE_DEPTH);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int hinic3_set_ringparam(struct net_device *netdev,
+ struct ethtool_ringparam *ring,
+ struct kernel_ethtool_ringparam *kernel_ring,
+ struct netlink_ext_ack *extack)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct hinic3_dyna_txrxq_params q_params = {0};
+ u32 new_sq_depth, new_rq_depth;
+ int err;
+
+ err = check_ringparam_valid(netdev, ring);
+ if (err)
+ return err;
+
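+	/* round the requested depths down to a power of two */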
+ new_sq_depth = (u32)(1U << (u16)ilog2(ring->tx_pending));
+ new_rq_depth = (u32)(1U << (u16)ilog2(ring->rx_pending));
+ if (new_sq_depth == nic_dev->q_params.sq_depth &&
+ new_rq_depth == nic_dev->q_params.rq_depth)
+ return 0; /* nothing to do */
+
+ nicif_info(nic_dev, drv, netdev,
+ "Change Tx/Rx ring depth from %u/%u to %u/%u\n",
+ nic_dev->q_params.sq_depth, nic_dev->q_params.rq_depth,
+ new_sq_depth, new_rq_depth);
+
+ if (!netif_running(netdev)) {
+ hinic3_update_qp_depth(nic_dev, new_sq_depth, new_rq_depth);
+ } else {
+ q_params = nic_dev->q_params;
+ q_params.sq_depth = new_sq_depth;
+ q_params.rq_depth = new_rq_depth;
+ q_params.txqs_res = NULL;
+ q_params.rxqs_res = NULL;
+ q_params.irq_cfg = NULL;
+
+ nicif_info(nic_dev, drv, netdev, "Restarting channel\n");
+ err = hinic3_change_channel_settings(nic_dev, &q_params,
+ NULL, NULL);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev, "Failed to change channel settings\n");
+ return -EFAULT;
+ }
+ }
+
+ return 0;
+}
+
+static int get_coalesce(struct net_device *netdev,
+ struct ethtool_coalesce *coal, u16 queue)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct hinic3_intr_coal_info *interrupt_info = NULL;
+
+ if (queue == COALESCE_ALL_QUEUE) {
+ /* get tx/rx irq0 as default parameters */
+ interrupt_info = &nic_dev->intr_coalesce[0];
+ } else {
+ if (queue >= nic_dev->q_params.num_qps) {
+ nicif_err(nic_dev, drv, netdev,
+ "Invalid queue_id: %u\n", queue);
+ return -EINVAL;
+ }
+ interrupt_info = &nic_dev->intr_coalesce[queue];
+ }
+
+	/* coalesce_timer is in units of 5 us */
+ coal->rx_coalesce_usecs = interrupt_info->coalesce_timer_cfg *
+ COALESCE_TIMER_CFG_UNIT;
+	/* coalesced frames are in units of 8 */
+ coal->rx_max_coalesced_frames = interrupt_info->pending_limt *
+ COALESCE_PENDING_LIMIT_UNIT;
+
+ /* tx/rx use the same interrupt */
+ coal->tx_coalesce_usecs = coal->rx_coalesce_usecs;
+ coal->tx_max_coalesced_frames = coal->rx_max_coalesced_frames;
+ coal->use_adaptive_rx_coalesce = nic_dev->adaptive_rx_coal;
+
+ coal->pkt_rate_high = (u32)interrupt_info->pkt_rate_high;
+ coal->rx_coalesce_usecs_high = interrupt_info->rx_usecs_high *
+ COALESCE_TIMER_CFG_UNIT;
+ coal->rx_max_coalesced_frames_high =
+ interrupt_info->rx_pending_limt_high *
+ COALESCE_PENDING_LIMIT_UNIT;
+
+ coal->pkt_rate_low = (u32)interrupt_info->pkt_rate_low;
+ coal->rx_coalesce_usecs_low = interrupt_info->rx_usecs_low *
+ COALESCE_TIMER_CFG_UNIT;
+ coal->rx_max_coalesced_frames_low =
+ interrupt_info->rx_pending_limt_low *
+ COALESCE_PENDING_LIMIT_UNIT;
+
+ return 0;
+}
+
+static int set_queue_coalesce(struct hinic3_nic_dev *nic_dev, u16 q_id,
+ struct hinic3_intr_coal_info *coal)
+{
+ struct hinic3_intr_coal_info *intr_coal;
+ struct interrupt_info info = {0};
+ struct net_device *netdev = nic_dev->netdev;
+ int err;
+
+ intr_coal = &nic_dev->intr_coalesce[q_id];
+ if (intr_coal->coalesce_timer_cfg != coal->coalesce_timer_cfg ||
+ intr_coal->pending_limt != coal->pending_limt)
+ intr_coal->user_set_intr_coal_flag = 1;
+
+ intr_coal->coalesce_timer_cfg = coal->coalesce_timer_cfg;
+ intr_coal->pending_limt = coal->pending_limt;
+ intr_coal->pkt_rate_low = coal->pkt_rate_low;
+ intr_coal->rx_usecs_low = coal->rx_usecs_low;
+ intr_coal->rx_pending_limt_low = coal->rx_pending_limt_low;
+ intr_coal->pkt_rate_high = coal->pkt_rate_high;
+ intr_coal->rx_usecs_high = coal->rx_usecs_high;
+ intr_coal->rx_pending_limt_high = coal->rx_pending_limt_high;
+
+	/* netdev not running or qp not in use,
+	 * no need to set coalesce to hw
+	 */
+ if (!test_bit(HINIC3_INTF_UP, &nic_dev->flags) ||
+ q_id >= nic_dev->q_params.num_qps || nic_dev->adaptive_rx_coal)
+ return 0;
+
+ info.msix_index = nic_dev->q_params.irq_cfg[q_id].msix_entry_idx;
+ info.lli_set = 0;
+ info.interrupt_coalesc_set = 1;
+ info.coalesc_timer_cfg = intr_coal->coalesce_timer_cfg;
+ info.pending_limt = intr_coal->pending_limt;
+ info.resend_timer_cfg = intr_coal->resend_timer_cfg;
+ nic_dev->rxqs[q_id].last_coalesc_timer_cfg =
+ intr_coal->coalesce_timer_cfg;
+ nic_dev->rxqs[q_id].last_pending_limt = intr_coal->pending_limt;
+ err = hinic3_set_interrupt_cfg(nic_dev->hwdev, info,
+ HINIC3_CHANNEL_NIC);
+ if (err)
+ nicif_warn(nic_dev, drv, netdev,
+			   "Failed to set queue %u coalesce\n", q_id);
+
+ return err;
+}
+
+static int is_coalesce_exceed_limit(struct net_device *netdev,
+ const struct ethtool_coalesce *coal)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+ if (coal->rx_coalesce_usecs > COALESCE_MAX_TIMER_CFG) {
+ nicif_err(nic_dev, drv, netdev,
+ "rx_coalesce_usecs out of range[%d-%d]\n", 0,
+ COALESCE_MAX_TIMER_CFG);
+ return -EOPNOTSUPP;
+ }
+
+ if (coal->rx_max_coalesced_frames > COALESCE_MAX_PENDING_LIMIT) {
+ nicif_err(nic_dev, drv, netdev,
+ "rx_max_coalesced_frames out of range[%d-%d]\n", 0,
+ COALESCE_MAX_PENDING_LIMIT);
+ return -EOPNOTSUPP;
+ }
+
+ if (coal->rx_coalesce_usecs_low > COALESCE_MAX_TIMER_CFG) {
+ nicif_err(nic_dev, drv, netdev,
+ "rx_coalesce_usecs_low out of range[%d-%d]\n", 0,
+ COALESCE_MAX_TIMER_CFG);
+ return -EOPNOTSUPP;
+ }
+
+ if (coal->rx_max_coalesced_frames_low > COALESCE_MAX_PENDING_LIMIT) {
+ nicif_err(nic_dev, drv, netdev,
+ "rx_max_coalesced_frames_low out of range[%d-%d]\n",
+ 0, COALESCE_MAX_PENDING_LIMIT);
+ return -EOPNOTSUPP;
+ }
+
+ if (coal->rx_coalesce_usecs_high > COALESCE_MAX_TIMER_CFG) {
+ nicif_err(nic_dev, drv, netdev,
+ "rx_coalesce_usecs_high out of range[%d-%d]\n", 0,
+ COALESCE_MAX_TIMER_CFG);
+ return -EOPNOTSUPP;
+ }
+
+ if (coal->rx_max_coalesced_frames_high > COALESCE_MAX_PENDING_LIMIT) {
+ nicif_err(nic_dev, drv, netdev,
+ "rx_max_coalesced_frames_high out of range[%d-%d]\n",
+ 0, COALESCE_MAX_PENDING_LIMIT);
+ return -EOPNOTSUPP;
+ }
+
+ return 0;
+}
+
+static int is_coalesce_legal(struct net_device *netdev,
+ const struct ethtool_coalesce *coal)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct ethtool_coalesce tmp_coal = {0};
+ int err;
+
+ if (coal->rx_coalesce_usecs != coal->tx_coalesce_usecs) {
+ nicif_err(nic_dev, drv, netdev,
+ "tx-usecs must be equal to rx-usecs\n");
+ return -EINVAL;
+ }
+
+ if (coal->rx_max_coalesced_frames != coal->tx_max_coalesced_frames) {
+ nicif_err(nic_dev, drv, netdev,
+ "tx-frames must be equal to rx-frames\n");
+ return -EINVAL;
+ }
+
+ tmp_coal.cmd = coal->cmd;
+ tmp_coal.rx_coalesce_usecs = coal->rx_coalesce_usecs;
+ tmp_coal.rx_max_coalesced_frames = coal->rx_max_coalesced_frames;
+ tmp_coal.tx_coalesce_usecs = coal->tx_coalesce_usecs;
+ tmp_coal.tx_max_coalesced_frames = coal->tx_max_coalesced_frames;
+ tmp_coal.use_adaptive_rx_coalesce = coal->use_adaptive_rx_coalesce;
+
+ tmp_coal.pkt_rate_low = coal->pkt_rate_low;
+ tmp_coal.rx_coalesce_usecs_low = coal->rx_coalesce_usecs_low;
+ tmp_coal.rx_max_coalesced_frames_low =
+ coal->rx_max_coalesced_frames_low;
+
+ tmp_coal.pkt_rate_high = coal->pkt_rate_high;
+ tmp_coal.rx_coalesce_usecs_high = coal->rx_coalesce_usecs_high;
+ tmp_coal.rx_max_coalesced_frames_high =
+ coal->rx_max_coalesced_frames_high;
+
+ if (memcmp(coal, &tmp_coal, sizeof(struct ethtool_coalesce))) {
+ nicif_err(nic_dev, drv, netdev,
+			  "Only rx/tx-usecs and rx/tx-frames can be changed\n");
+ return -EOPNOTSUPP;
+ }
+
+ err = is_coalesce_exceed_limit(netdev, coal);
+ if (err)
+ return err;
+
+ if (coal->rx_coalesce_usecs_low / COALESCE_TIMER_CFG_UNIT >=
+ coal->rx_coalesce_usecs_high / COALESCE_TIMER_CFG_UNIT) {
+ nicif_err(nic_dev, drv, netdev,
+			  "coalesce_usecs_high(%u) must be larger than coalesce_usecs_low(%u) after dividing by the %d usecs unit\n",
+ coal->rx_coalesce_usecs_high,
+ coal->rx_coalesce_usecs_low,
+ COALESCE_TIMER_CFG_UNIT);
+ return -EOPNOTSUPP;
+ }
+
+ if (coal->rx_max_coalesced_frames_low / COALESCE_PENDING_LIMIT_UNIT >=
+ coal->rx_max_coalesced_frames_high / COALESCE_PENDING_LIMIT_UNIT) {
+ nicif_err(nic_dev, drv, netdev,
+			  "coalesced_frames_high(%u) must be larger than coalesced_frames_low(%u) after dividing by the %d frames unit\n",
+ coal->rx_max_coalesced_frames_high,
+ coal->rx_max_coalesced_frames_low,
+ COALESCE_PENDING_LIMIT_UNIT);
+ return -EOPNOTSUPP;
+ }
+
+ if (coal->pkt_rate_low >= coal->pkt_rate_high) {
+ nicif_err(nic_dev, drv, netdev,
+			  "pkt_rate_high(%u) must be larger than pkt_rate_low(%u)\n",
+ coal->pkt_rate_high,
+ coal->pkt_rate_low);
+ return -EOPNOTSUPP;
+ }
+
+ return 0;
+}
+
+#define CHECK_COALESCE_ALIGN(coal, item, unit) \
+do { \
+ if ((coal)->item % (unit)) \
+ nicif_warn(nic_dev, drv, netdev, \
+			   "%s should be a multiple of %d, rounding down to %u\n", \
+ #item, (unit), ((coal)->item - \
+ (coal)->item % (unit))); \
+} while (0)
+
+#define CHECK_COALESCE_CHANGED(coal, item, unit, ori_val, obj_str) \
+do { \
+ if (((coal)->item / (unit)) != (ori_val)) \
+ nicif_info(nic_dev, drv, netdev, \
+ "Change %s from %d to %u %s\n", \
+ #item, (ori_val) * (unit), \
+ ((coal)->item - (coal)->item % (unit)), \
+ (obj_str)); \
+} while (0)
+
+#define CHECK_PKT_RATE_CHANGED(coal, item, ori_val, obj_str) \
+do { \
+ if ((coal)->item != (ori_val)) \
+ nicif_info(nic_dev, drv, netdev, \
+ "Change %s from %llu to %u %s\n", \
+ #item, (ori_val), (coal)->item, (obj_str)); \
+} while (0)
+
+static int set_hw_coal_param(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_intr_coal_info *intr_coal, u16 queue)
+{
+ u16 i;
+
+ if (queue == COALESCE_ALL_QUEUE) {
+ for (i = 0; i < nic_dev->max_qps; i++)
+ set_queue_coalesce(nic_dev, i, intr_coal);
+ } else {
+ if (queue >= nic_dev->q_params.num_qps) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Invalid queue_id: %u\n", queue);
+ return -EINVAL;
+ }
+ set_queue_coalesce(nic_dev, queue, intr_coal);
+ }
+
+ return 0;
+}
+
+static int set_coalesce(struct net_device *netdev,
+ struct ethtool_coalesce *coal, u16 queue)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct hinic3_intr_coal_info intr_coal = {0};
+ struct hinic3_intr_coal_info *ori_intr_coal = NULL;
+ u32 last_adaptive_rx;
+ char obj_str[32] = {0};
+ int err = 0;
+
+ err = is_coalesce_legal(netdev, coal);
+ if (err)
+ return err;
+
+ CHECK_COALESCE_ALIGN(coal, rx_coalesce_usecs, COALESCE_TIMER_CFG_UNIT);
+ CHECK_COALESCE_ALIGN(coal, rx_max_coalesced_frames,
+ COALESCE_PENDING_LIMIT_UNIT);
+ CHECK_COALESCE_ALIGN(coal, rx_coalesce_usecs_high,
+ COALESCE_TIMER_CFG_UNIT);
+ CHECK_COALESCE_ALIGN(coal, rx_max_coalesced_frames_high,
+ COALESCE_PENDING_LIMIT_UNIT);
+ CHECK_COALESCE_ALIGN(coal, rx_coalesce_usecs_low,
+ COALESCE_TIMER_CFG_UNIT);
+ CHECK_COALESCE_ALIGN(coal, rx_max_coalesced_frames_low,
+ COALESCE_PENDING_LIMIT_UNIT);
+
+ if (queue == COALESCE_ALL_QUEUE) {
+ ori_intr_coal = &nic_dev->intr_coalesce[0];
+ snprintf(obj_str, sizeof(obj_str), "for netdev");
+ } else {
+ ori_intr_coal = &nic_dev->intr_coalesce[queue];
+ snprintf(obj_str, sizeof(obj_str), "for queue %u", queue);
+ }
+ CHECK_COALESCE_CHANGED(coal, rx_coalesce_usecs, COALESCE_TIMER_CFG_UNIT,
+ ori_intr_coal->coalesce_timer_cfg, obj_str);
+ CHECK_COALESCE_CHANGED(coal, rx_max_coalesced_frames,
+ COALESCE_PENDING_LIMIT_UNIT,
+ ori_intr_coal->pending_limt, obj_str);
+ CHECK_PKT_RATE_CHANGED(coal, pkt_rate_high,
+ ori_intr_coal->pkt_rate_high, obj_str);
+ CHECK_COALESCE_CHANGED(coal, rx_coalesce_usecs_high,
+ COALESCE_TIMER_CFG_UNIT,
+ ori_intr_coal->rx_usecs_high, obj_str);
+ CHECK_COALESCE_CHANGED(coal, rx_max_coalesced_frames_high,
+ COALESCE_PENDING_LIMIT_UNIT,
+ ori_intr_coal->rx_pending_limt_high, obj_str);
+ CHECK_PKT_RATE_CHANGED(coal, pkt_rate_low,
+ ori_intr_coal->pkt_rate_low, obj_str);
+ CHECK_COALESCE_CHANGED(coal, rx_coalesce_usecs_low,
+ COALESCE_TIMER_CFG_UNIT,
+ ori_intr_coal->rx_usecs_low, obj_str);
+ CHECK_COALESCE_CHANGED(coal, rx_max_coalesced_frames_low,
+ COALESCE_PENDING_LIMIT_UNIT,
+ ori_intr_coal->rx_pending_limt_low, obj_str);
+
+ intr_coal.coalesce_timer_cfg =
+ (u8)(coal->rx_coalesce_usecs / COALESCE_TIMER_CFG_UNIT);
+ intr_coal.pending_limt = (u8)(coal->rx_max_coalesced_frames /
+ COALESCE_PENDING_LIMIT_UNIT);
+
+ last_adaptive_rx = nic_dev->adaptive_rx_coal;
+ nic_dev->adaptive_rx_coal = coal->use_adaptive_rx_coalesce;
+
+ intr_coal.pkt_rate_high = coal->pkt_rate_high;
+ intr_coal.rx_usecs_high =
+ (u8)(coal->rx_coalesce_usecs_high / COALESCE_TIMER_CFG_UNIT);
+ intr_coal.rx_pending_limt_high =
+ (u8)(coal->rx_max_coalesced_frames_high /
+ COALESCE_PENDING_LIMIT_UNIT);
+
+ intr_coal.pkt_rate_low = coal->pkt_rate_low;
+ intr_coal.rx_usecs_low =
+ (u8)(coal->rx_coalesce_usecs_low / COALESCE_TIMER_CFG_UNIT);
+ intr_coal.rx_pending_limt_low =
+ (u8)(coal->rx_max_coalesced_frames_low /
+ COALESCE_PENDING_LIMIT_UNIT);
+
+	/* setting coalesce timer or pending limit to zero disables coalescing */
+ if (!nic_dev->adaptive_rx_coal &&
+ (!intr_coal.coalesce_timer_cfg || !intr_coal.pending_limt))
+ nicif_warn(nic_dev, drv, netdev, "Coalesce will be disabled\n");
+
+	/* ensure coalesce parameters will not be changed by the auto
+	 * moderation work
+ */
+ if (HINIC3_CHANNEL_RES_VALID(nic_dev)) {
+ if (!nic_dev->adaptive_rx_coal)
+ cancel_delayed_work_sync(&nic_dev->moderation_task);
+ else if (!last_adaptive_rx)
+ queue_delayed_work(nic_dev->workq,
+ &nic_dev->moderation_task,
+ HINIC3_MODERATONE_DELAY);
+ }
+
+ return set_hw_coal_param(nic_dev, &intr_coal, queue);
+}
+
+static int hinic3_get_coalesce(struct net_device *netdev,
+ struct ethtool_coalesce *coal,
+ struct kernel_ethtool_coalesce *kernel_coal,
+ struct netlink_ext_ack *extack)
+{
+ return get_coalesce(netdev, coal, COALESCE_ALL_QUEUE);
+}
+
+static int hinic3_set_coalesce(struct net_device *netdev,
+ struct ethtool_coalesce *coal,
+ struct kernel_ethtool_coalesce *kernel_coal,
+ struct netlink_ext_ack *extack)
+{
+ return set_coalesce(netdev, coal, COALESCE_ALL_QUEUE);
+}
+
+#if defined(ETHTOOL_PERQUEUE) && defined(ETHTOOL_GCOALESCE)
+static int hinic3_get_per_queue_coalesce(struct net_device *netdev, u32 queue,
+ struct ethtool_coalesce *coal)
+{
+ return get_coalesce(netdev, coal, (u16)queue);
+}
+
+static int hinic3_set_per_queue_coalesce(struct net_device *netdev, u32 queue,
+ struct ethtool_coalesce *coal)
+{
+ return set_coalesce(netdev, coal, (u16)queue);
+}
+#endif
+
+#ifdef HAVE_ETHTOOL_SET_PHYS_ID
+static int hinic3_set_phys_id(struct net_device *netdev,
+ enum ethtool_phys_id_state state)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ int err;
+
+ switch (state) {
+ case ETHTOOL_ID_ACTIVE:
+ err = hinic3_set_led_status(nic_dev->hwdev,
+ MAG_CMD_LED_TYPE_ALARM,
+ MAG_CMD_LED_MODE_FORCE_BLINK_2HZ);
+ if (err)
+ nicif_err(nic_dev, drv, netdev,
+				  "Set LED blinking at 2 Hz failed\n");
+ else
+ nicif_info(nic_dev, drv, netdev,
+				   "Set LED blinking at 2 Hz succeeded\n");
+ break;
+
+ case ETHTOOL_ID_INACTIVE:
+ err = hinic3_set_led_status(nic_dev->hwdev,
+ MAG_CMD_LED_TYPE_ALARM,
+ MAG_CMD_LED_MODE_DEFAULT);
+ if (err)
+ nicif_err(nic_dev, drv, netdev,
+ "Reset LED to original status failed\n");
+ else
+ nicif_info(nic_dev, drv, netdev,
+				   "Reset LED to original status succeeded\n");
+ break;
+
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ return err;
+}
+#else
+static int hinic3_phys_id(struct net_device *netdev, u32 data)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+	nicif_err(nic_dev, drv, netdev, "Setting phys id is not supported\n");
+
+ return -EOPNOTSUPP;
+}
+#endif
+
+static void hinic3_get_pauseparam(struct net_device *netdev,
+ struct ethtool_pauseparam *pause)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct nic_pause_config nic_pause = {0};
+ int err;
+
+ err = hinic3_get_pause_info(nic_dev->hwdev, &nic_pause);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev,
+ "Failed to get pauseparam from hw\n");
+ } else {
+ pause->autoneg = nic_pause.auto_neg == PORT_CFG_AN_ON ?
+ AUTONEG_ENABLE : AUTONEG_DISABLE;
+ pause->rx_pause = nic_pause.rx_pause;
+ pause->tx_pause = nic_pause.tx_pause;
+ }
+}
+
+static int hinic3_set_pauseparam(struct net_device *netdev,
+ struct ethtool_pauseparam *pause)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct nic_pause_config nic_pause = {0};
+ struct nic_port_info port_info = {0};
+ u32 auto_neg;
+ int err;
+
+ err = hinic3_get_port_info(nic_dev->hwdev, &port_info,
+ HINIC3_CHANNEL_NIC);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev,
+ "Failed to get auto-negotiation state\n");
+ return -EFAULT;
+ }
+
+ auto_neg = port_info.autoneg_state == PORT_CFG_AN_ON ? AUTONEG_ENABLE : AUTONEG_DISABLE;
+ if (pause->autoneg != auto_neg) {
+ nicif_err(nic_dev, drv, netdev,
+ "To change autoneg please use: ethtool -s <dev> autoneg <on|off>\n");
+ return -EOPNOTSUPP;
+ }
+
+ nic_pause.auto_neg = pause->autoneg == AUTONEG_ENABLE ? PORT_CFG_AN_ON : PORT_CFG_AN_OFF;
+ nic_pause.rx_pause = (u8)pause->rx_pause;
+ nic_pause.tx_pause = (u8)pause->tx_pause;
+
+ err = hinic3_set_pause_info(nic_dev->hwdev, nic_pause);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev, "Failed to set pauseparam\n");
+ return err;
+ }
+
+ nicif_info(nic_dev, drv, netdev, "Set pause options, tx: %s, rx: %s\n",
+ pause->tx_pause ? "on" : "off",
+ pause->rx_pause ? "on" : "off");
+
+ return 0;
+}
+
+#ifdef ETHTOOL_GMODULEEEPROM
+static int hinic3_get_module_info(struct net_device *netdev,
+ struct ethtool_modinfo *modinfo)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ u8 sfp_type = 0;
+ u8 sfp_type_ext = 0;
+ int err;
+
+ err = hinic3_get_sfp_type(nic_dev->hwdev, &sfp_type, &sfp_type_ext);
+ if (err)
+ return err;
+
+ switch (sfp_type) {
+ case MODULE_TYPE_SFP:
+ modinfo->type = ETH_MODULE_SFF_8472;
+ modinfo->eeprom_len = ETH_MODULE_SFF_8472_LEN;
+ break;
+ case MODULE_TYPE_QSFP:
+ modinfo->type = ETH_MODULE_SFF_8436;
+ modinfo->eeprom_len = ETH_MODULE_SFF_8436_MAX_LEN;
+ break;
+ case MODULE_TYPE_QSFP_PLUS:
+ if (sfp_type_ext >= 0x3) {
+ modinfo->type = ETH_MODULE_SFF_8636;
+ modinfo->eeprom_len = ETH_MODULE_SFF_8636_MAX_LEN;
+ } else {
+ modinfo->type = ETH_MODULE_SFF_8436;
+ modinfo->eeprom_len = ETH_MODULE_SFF_8436_MAX_LEN;
+ }
+ break;
+ case MODULE_TYPE_QSFP28:
+ modinfo->type = ETH_MODULE_SFF_8636;
+ modinfo->eeprom_len = ETH_MODULE_SFF_8636_MAX_LEN;
+ break;
+ default:
+ nicif_warn(nic_dev, drv, netdev,
+			   "Unknown optical module type: 0x%x\n", sfp_type);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int hinic3_get_module_eeprom(struct net_device *netdev,
+ struct ethtool_eeprom *ee, u8 *data)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ u8 sfp_data[STD_SFP_INFO_MAX_SIZE];
+ int err;
+
+ if (!ee->len || ((ee->len + ee->offset) > STD_SFP_INFO_MAX_SIZE))
+ return -EINVAL;
+
+ memset(data, 0, ee->len);
+
+	/* read from offset 0 up to the end of the requested window so the
+	 * bytes copied out below are fully initialized
+	 */
+	err = hinic3_get_sfp_eeprom(nic_dev->hwdev, (u8 *)sfp_data,
+				    ee->offset + ee->len);
+ if (err)
+ return err;
+
+ memcpy(data, sfp_data + ee->offset, ee->len);
+
+ return 0;
+}
+#endif /* ETHTOOL_GMODULEEEPROM */
+
+#define HINIC3_PRIV_FLAGS_SYMM_RSS BIT(0)
+#define HINIC3_PRIV_FLAGS_LINK_UP BIT(1)
+#define HINIC3_PRIV_FLAGS_RXQ_RECOVERY BIT(2)
+
+static u32 hinic3_get_priv_flags(struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ u32 priv_flags = 0;
+
+ if (test_bit(HINIC3_SAME_RXTX, &nic_dev->flags))
+ priv_flags |= HINIC3_PRIV_FLAGS_SYMM_RSS;
+
+ if (test_bit(HINIC3_FORCE_LINK_UP, &nic_dev->flags))
+ priv_flags |= HINIC3_PRIV_FLAGS_LINK_UP;
+
+ if (test_bit(HINIC3_RXQ_RECOVERY, &nic_dev->flags))
+ priv_flags |= HINIC3_PRIV_FLAGS_RXQ_RECOVERY;
+
+ return priv_flags;
+}
+
+int hinic3_set_rxq_recovery_flag(struct net_device *netdev, u32 priv_flags)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+ if (priv_flags & HINIC3_PRIV_FLAGS_RXQ_RECOVERY) {
+ if (!HINIC3_SUPPORT_RXQ_RECOVERY(nic_dev->hwdev)) {
+			nicif_info(nic_dev, drv, netdev, "Rxq recovery is not supported\n");
+ return -EOPNOTSUPP;
+ }
+
+ if (test_and_set_bit(HINIC3_RXQ_RECOVERY, &nic_dev->flags))
+ return 0;
+ queue_delayed_work(nic_dev->workq, &nic_dev->rxq_check_work, HZ);
+ nicif_info(nic_dev, drv, netdev, "open rxq recovery\n");
+ } else {
+ if (!test_and_clear_bit(HINIC3_RXQ_RECOVERY, &nic_dev->flags))
+ return 0;
+ cancel_delayed_work_sync(&nic_dev->rxq_check_work);
+ nicif_info(nic_dev, drv, netdev, "close rxq recovery\n");
+ }
+
+ return 0;
+}
+
+static int hinic3_set_symm_rss_flag(struct net_device *netdev, u32 priv_flags)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+ if (priv_flags & HINIC3_PRIV_FLAGS_SYMM_RSS) {
+ if (test_bit(HINIC3_DCB_ENABLE, &nic_dev->flags)) {
+			nicif_err(nic_dev, drv, netdev, "Failed to enable Symmetric RSS while DCB is enabled\n");
+ return -EOPNOTSUPP;
+ }
+
+ if (!test_bit(HINIC3_RSS_ENABLE, &nic_dev->flags)) {
+			nicif_err(nic_dev, drv, netdev, "Failed to enable Symmetric RSS while RSS is disabled\n");
+ return -EOPNOTSUPP;
+ }
+
+ set_bit(HINIC3_SAME_RXTX, &nic_dev->flags);
+ } else {
+ clear_bit(HINIC3_SAME_RXTX, &nic_dev->flags);
+ }
+
+ return 0;
+}
+
+static int hinic3_set_force_link_flag(struct net_device *netdev, u32 priv_flags)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ u8 link_status = 0;
+ int err;
+
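+	/* forcing link up only changes the software carrier state; clearing
+	 * the flag re-reads the real link state from hw and restores the
+	 * carrier accordingly
+	 */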
+ if (priv_flags & HINIC3_PRIV_FLAGS_LINK_UP) {
+ if (test_and_set_bit(HINIC3_FORCE_LINK_UP, &nic_dev->flags))
+ return 0;
+
+ if (!HINIC3_CHANNEL_RES_VALID(nic_dev))
+ return 0;
+
+ if (netif_carrier_ok(netdev))
+ return 0;
+
+ nic_dev->link_status = true;
+ netif_carrier_on(netdev);
+ nicif_info(nic_dev, link, netdev, "Set link up\n");
+
+ if (!HINIC3_FUNC_IS_VF(nic_dev->hwdev))
+ hinic3_notify_all_vfs_link_changed(nic_dev->hwdev, nic_dev->link_status);
+ } else {
+ if (!test_and_clear_bit(HINIC3_FORCE_LINK_UP, &nic_dev->flags))
+ return 0;
+
+ if (!HINIC3_CHANNEL_RES_VALID(nic_dev))
+ return 0;
+
+ err = hinic3_get_link_state(nic_dev->hwdev, &link_status);
+ if (err) {
+ nicif_err(nic_dev, link, netdev, "Get link state err: %d\n", err);
+ return err;
+ }
+
+ nic_dev->link_status = link_status;
+
+ if (link_status) {
+ if (netif_carrier_ok(netdev))
+ return 0;
+
+ netif_carrier_on(netdev);
+ nicif_info(nic_dev, link, netdev, "Link state is up\n");
+ } else {
+ if (!netif_carrier_ok(netdev))
+ return 0;
+
+ netif_carrier_off(netdev);
+ nicif_info(nic_dev, link, netdev, "Link state is down\n");
+ }
+
+ if (!HINIC3_FUNC_IS_VF(nic_dev->hwdev))
+ hinic3_notify_all_vfs_link_changed(nic_dev->hwdev, nic_dev->link_status);
+ }
+
+ return 0;
+}
+
+static int hinic3_set_priv_flags(struct net_device *netdev, u32 priv_flags)
+{
+ int err;
+
+ err = hinic3_set_symm_rss_flag(netdev, priv_flags);
+ if (err)
+ return err;
+
+ err = hinic3_set_rxq_recovery_flag(netdev, priv_flags);
+ if (err)
+ return err;
+
+ return hinic3_set_force_link_flag(netdev, priv_flags);
+}
+
+#define PORT_DOWN_ERR_IDX 0
+#define LP_DEFAULT_TIME 5 /* seconds */
+#define LP_PKT_LEN 60
+
+#define TEST_TIME_MULTIPLE 5
+static int hinic3_run_lp_test(struct hinic3_nic_dev *nic_dev, u32 test_time)
+{
+ u8 *lb_test_rx_buf = nic_dev->lb_test_rx_buf;
+ struct net_device *netdev = nic_dev->netdev;
+ u32 cnt = test_time * TEST_TIME_MULTIPLE;
+ struct sk_buff *skb_tmp = NULL;
+ struct ethhdr *eth_hdr = NULL;
+ struct sk_buff *skb = NULL;
+ u8 *test_data = NULL;
+ u32 i;
+ u8 j;
+
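+	/* build one template frame; each round transmits LP_PKT_CNT copies
+	 * and compares the looped-back RX buffer byte-for-byte, using the
+	 * last byte of every packet as its index
+	 */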
+ skb_tmp = alloc_skb(LP_PKT_LEN, GFP_ATOMIC);
+ if (!skb_tmp) {
+ nicif_err(nic_dev, drv, netdev,
+ "Alloc xmit skb template failed for loopback test\n");
+ return -ENOMEM;
+ }
+
+ eth_hdr = __skb_put(skb_tmp, ETH_HLEN);
+ eth_hdr->h_proto = htons(ETH_P_ARP);
+ ether_addr_copy(eth_hdr->h_dest, nic_dev->netdev->dev_addr);
+ eth_zero_addr(eth_hdr->h_source);
+ skb_reset_mac_header(skb_tmp);
+
+ test_data = __skb_put(skb_tmp, LP_PKT_LEN - ETH_HLEN);
+	/* fill the payload with a byte pattern, indexing from the payload
+	 * start so the writes stay inside the skb data area
+	 */
+	for (i = ETH_HLEN; i < LP_PKT_LEN; i++)
+		test_data[i - ETH_HLEN] = i & 0xFF;
+
+ skb_tmp->queue_mapping = 0;
+ skb_tmp->dev = netdev;
+ skb_tmp->protocol = htons(ETH_P_ARP);
+
+ for (i = 0; i < cnt; i++) {
+ nic_dev->lb_test_rx_idx = 0;
+ memset(lb_test_rx_buf, 0, LP_PKT_CNT * LP_PKT_LEN);
+
+ for (j = 0; j < LP_PKT_CNT; j++) {
+ skb = pskb_copy(skb_tmp, GFP_ATOMIC);
+ if (!skb) {
+ dev_kfree_skb_any(skb_tmp);
+ nicif_err(nic_dev, drv, netdev,
+ "Copy skb failed for loopback test\n");
+ return -ENOMEM;
+ }
+
+ /* mark index for every pkt */
+ skb->data[LP_PKT_LEN - 1] = j;
+
+ if (hinic3_lb_xmit_frame(skb, netdev)) {
+ dev_kfree_skb_any(skb);
+ dev_kfree_skb_any(skb_tmp);
+ nicif_err(nic_dev, drv, netdev,
+ "Xmit pkt failed for loopback test\n");
+ return -EBUSY;
+ }
+ }
+
+		/* wait until all pkts have been received into the RX buffer */
+ msleep(HINIC3_WAIT_PKTS_TO_RX_BUFFER);
+
+ for (j = 0; j < LP_PKT_CNT; j++) {
+ if (memcmp((lb_test_rx_buf + (j * LP_PKT_LEN)),
+ skb_tmp->data, (LP_PKT_LEN - 1)) ||
+ (*(lb_test_rx_buf + ((j * LP_PKT_LEN) +
+ (LP_PKT_LEN - 1))) != j)) {
+ dev_kfree_skb_any(skb_tmp);
+ nicif_err(nic_dev, drv, netdev,
+ "Compare pkt failed in loopback test(index=0x%02x, data[%d]=0x%02x)\n",
+ (j + (i * LP_PKT_CNT)),
+ (LP_PKT_LEN - 1),
+ *(lb_test_rx_buf +
+ (((j * LP_PKT_LEN) +
+ (LP_PKT_LEN - 1)))));
+ return -EIO;
+ }
+ }
+ }
+
+ dev_kfree_skb_any(skb_tmp);
+	nicif_info(nic_dev, drv, netdev, "Loopback test succeeded\n");
+ return 0;
+}
+
+enum diag_test_index {
+ INTERNAL_LP_TEST = 0,
+ EXTERNAL_LP_TEST = 1,
+ DIAG_TEST_MAX = 2,
+};
+
+#define HINIC3_INTERNAL_LP_MODE 5
+static int do_lp_test(struct hinic3_nic_dev *nic_dev, u32 *flags, u32 test_time,
+ enum diag_test_index *test_index)
+{
+ struct net_device *netdev = nic_dev->netdev;
+ u8 *lb_test_rx_buf = NULL;
+ int err = 0;
+
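+	/* internal loopback is configured in hw; the external test expects
+	 * the port to be physically looped back
+	 */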
+ if (!(*flags & ETH_TEST_FL_EXTERNAL_LB)) {
+ *test_index = INTERNAL_LP_TEST;
+ if (hinic3_set_loopback_mode(nic_dev->hwdev,
+ HINIC3_INTERNAL_LP_MODE, true)) {
+ nicif_err(nic_dev, drv, netdev,
+ "Failed to set port loopback mode before loopback test\n");
+ return -EFAULT;
+ }
+
+ /* suspend 5000 ms, waiting for port to stop receiving frames */
+ msleep(5000);
+ } else {
+ *test_index = EXTERNAL_LP_TEST;
+ }
+
+ lb_test_rx_buf = vmalloc(LP_PKT_CNT * LP_PKT_LEN);
+ if (!lb_test_rx_buf) {
+ nicif_err(nic_dev, drv, netdev,
+ "Failed to alloc RX buffer for loopback test\n");
+ err = -ENOMEM;
+ } else {
+ nic_dev->lb_test_rx_buf = lb_test_rx_buf;
+ nic_dev->lb_pkt_len = LP_PKT_LEN;
+ set_bit(HINIC3_LP_TEST, &nic_dev->flags);
+
+ if (hinic3_run_lp_test(nic_dev, test_time))
+ err = -EFAULT;
+
+ clear_bit(HINIC3_LP_TEST, &nic_dev->flags);
+ msleep(HINIC3_WAIT_CLEAR_LP_TEST);
+ vfree(lb_test_rx_buf);
+ nic_dev->lb_test_rx_buf = NULL;
+ }
+
+ if (!(*flags & ETH_TEST_FL_EXTERNAL_LB)) {
+ if (hinic3_set_loopback_mode(nic_dev->hwdev,
+ HINIC3_INTERNAL_LP_MODE, false)) {
+ nicif_err(nic_dev, drv, netdev,
+ "Failed to cancel port loopback mode after loopback test\n");
+ err = -EFAULT;
+ }
+ } else {
+ *flags |= ETH_TEST_FL_EXTERNAL_LB_DONE;
+ }
+
+ return err;
+}
+
+static void hinic3_lp_test(struct net_device *netdev, struct ethtool_test *eth_test,
+ u64 *data, u32 test_time)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ enum diag_test_index test_index = 0;
+ u8 link_status = 0;
+ int err;
+ u32 test_time_real = test_time;
+
+	/* loopback test is not supported when netdev is closed */
+ if (!test_bit(HINIC3_INTF_UP, &nic_dev->flags)) {
+ nicif_err(nic_dev, drv, netdev,
+			  "Loopback test is not supported when netdev is closed\n");
+ eth_test->flags |= ETH_TEST_FL_FAILED;
+ data[PORT_DOWN_ERR_IDX] = 1;
+ return;
+ }
+ if (test_time_real == 0)
+ test_time_real = LP_DEFAULT_TIME;
+
+ netif_carrier_off(netdev);
+ netif_tx_disable(netdev);
+
+	err = do_lp_test(nic_dev, &eth_test->flags, test_time_real, &test_index);
+ if (err) {
+ eth_test->flags |= ETH_TEST_FL_FAILED;
+ data[test_index] = 1;
+ }
+
+ netif_tx_wake_all_queues(netdev);
+
+ err = hinic3_get_link_state(nic_dev->hwdev, &link_status);
+ if (!err && link_status)
+ netif_carrier_on(netdev);
+}
+
+static void hinic3_diag_test(struct net_device *netdev,
+ struct ethtool_test *eth_test, u64 *data)
+{
+ memset(data, 0, DIAG_TEST_MAX * sizeof(u64));
+
+ hinic3_lp_test(netdev, eth_test, data, 0);
+}
+
+static const struct ethtool_ops hinic3_ethtool_ops = {
+#ifdef SUPPORTED_COALESCE_PARAMS
+ .supported_coalesce_params = ETHTOOL_COALESCE_USECS |
+ ETHTOOL_COALESCE_PKT_RATE_RX_USECS,
+#endif
+#ifdef ETHTOOL_GLINKSETTINGS
+#ifndef XENSERVER_HAVE_NEW_ETHTOOL_OPS
+ .get_link_ksettings = hinic3_get_link_ksettings,
+ .set_link_ksettings = hinic3_set_link_ksettings,
+#endif
+#endif
+#ifndef HAVE_NEW_ETHTOOL_LINK_SETTINGS_ONLY
+ .get_settings = hinic3_get_settings,
+ .set_settings = hinic3_set_settings,
+#endif
+
+ .get_drvinfo = hinic3_get_drvinfo,
+ .get_msglevel = hinic3_get_msglevel,
+ .set_msglevel = hinic3_set_msglevel,
+ .nway_reset = hinic3_nway_reset,
+#ifdef CONFIG_MODULE_PROF
+ .get_link = hinic3_get_link,
+#else
+ .get_link = ethtool_op_get_link,
+#endif
+ .get_ringparam = hinic3_get_ringparam,
+ .set_ringparam = hinic3_set_ringparam,
+ .get_pauseparam = hinic3_get_pauseparam,
+ .set_pauseparam = hinic3_set_pauseparam,
+ .get_sset_count = hinic3_get_sset_count,
+ .get_ethtool_stats = hinic3_get_ethtool_stats,
+ .get_strings = hinic3_get_strings,
+
+ .self_test = hinic3_diag_test,
+
+#ifndef HAVE_RHEL6_ETHTOOL_OPS_EXT_STRUCT
+#ifdef HAVE_ETHTOOL_SET_PHYS_ID
+ .set_phys_id = hinic3_set_phys_id,
+#else
+ .phys_id = hinic3_phys_id,
+#endif
+#endif
+
+ .get_coalesce = hinic3_get_coalesce,
+ .set_coalesce = hinic3_set_coalesce,
+#if defined(ETHTOOL_PERQUEUE) && defined(ETHTOOL_GCOALESCE)
+ .get_per_queue_coalesce = hinic3_get_per_queue_coalesce,
+ .set_per_queue_coalesce = hinic3_set_per_queue_coalesce,
+#endif
+
+ .get_rxnfc = hinic3_get_rxnfc,
+ .set_rxnfc = hinic3_set_rxnfc,
+ .get_priv_flags = hinic3_get_priv_flags,
+ .set_priv_flags = hinic3_set_priv_flags,
+
+#ifndef HAVE_RHEL6_ETHTOOL_OPS_EXT_STRUCT
+ .get_channels = hinic3_get_channels,
+ .set_channels = hinic3_set_channels,
+
+#ifdef ETHTOOL_GMODULEEEPROM
+ .get_module_info = hinic3_get_module_info,
+ .get_module_eeprom = hinic3_get_module_eeprom,
+#endif
+
+#ifndef NOT_HAVE_GET_RXFH_INDIR_SIZE
+ .get_rxfh_indir_size = hinic3_get_rxfh_indir_size,
+#endif
+
+#if defined(ETHTOOL_GRSSH) && defined(ETHTOOL_SRSSH)
+ .get_rxfh_key_size = hinic3_get_rxfh_key_size,
+ .get_rxfh = hinic3_get_rxfh,
+ .set_rxfh = hinic3_set_rxfh,
+#else
+ .get_rxfh_indir = hinic3_get_rxfh_indir,
+ .set_rxfh_indir = hinic3_set_rxfh_indir,
+#endif
+
+#endif /* HAVE_RHEL6_ETHTOOL_OPS_EXT_STRUCT */
+};
+
+#ifdef HAVE_RHEL6_ETHTOOL_OPS_EXT_STRUCT
+static const struct ethtool_ops_ext hinic3_ethtool_ops_ext = {
+ .size = sizeof(struct ethtool_ops_ext),
+ .set_phys_id = hinic3_set_phys_id,
+ .get_channels = hinic3_get_channels,
+ .set_channels = hinic3_set_channels,
+#ifdef ETHTOOL_GMODULEEEPROM
+ .get_module_info = hinic3_get_module_info,
+ .get_module_eeprom = hinic3_get_module_eeprom,
+#endif
+
+#ifndef NOT_HAVE_GET_RXFH_INDIR_SIZE
+ .get_rxfh_indir_size = hinic3_get_rxfh_indir_size,
+#endif
+
+#if defined(ETHTOOL_GRSSH) && defined(ETHTOOL_SRSSH)
+ .get_rxfh_key_size = hinic3_get_rxfh_key_size,
+ .get_rxfh = hinic3_get_rxfh,
+ .set_rxfh = hinic3_set_rxfh,
+#else
+ .get_rxfh_indir = hinic3_get_rxfh_indir,
+ .set_rxfh_indir = hinic3_set_rxfh_indir,
+#endif
+
+};
+#endif /* HAVE_RHEL6_ETHTOOL_OPS_EXT_STRUCT */
+
+static const struct ethtool_ops hinic3vf_ethtool_ops = {
+#ifdef SUPPORTED_COALESCE_PARAMS
+ .supported_coalesce_params = ETHTOOL_COALESCE_USECS |
+ ETHTOOL_COALESCE_PKT_RATE_RX_USECS,
+#endif
+#ifdef ETHTOOL_GLINKSETTINGS
+#ifndef XENSERVER_HAVE_NEW_ETHTOOL_OPS
+ .get_link_ksettings = hinic3_get_link_ksettings,
+#endif
+#else
+ .get_settings = hinic3_get_settings,
+#endif
+ .get_drvinfo = hinic3_get_drvinfo,
+ .get_msglevel = hinic3_get_msglevel,
+ .set_msglevel = hinic3_set_msglevel,
+ .get_link = ethtool_op_get_link,
+ .get_ringparam = hinic3_get_ringparam,
+
+ .set_ringparam = hinic3_set_ringparam,
+ .get_sset_count = hinic3_get_sset_count,
+ .get_ethtool_stats = hinic3_get_ethtool_stats,
+ .get_strings = hinic3_get_strings,
+
+ .get_coalesce = hinic3_get_coalesce,
+ .set_coalesce = hinic3_set_coalesce,
+#if defined(ETHTOOL_PERQUEUE) && defined(ETHTOOL_GCOALESCE)
+ .get_per_queue_coalesce = hinic3_get_per_queue_coalesce,
+ .set_per_queue_coalesce = hinic3_set_per_queue_coalesce,
+#endif
+
+ .get_rxnfc = hinic3_get_rxnfc,
+ .set_rxnfc = hinic3_set_rxnfc,
+ .get_priv_flags = hinic3_get_priv_flags,
+ .set_priv_flags = hinic3_set_priv_flags,
+
+#ifndef HAVE_RHEL6_ETHTOOL_OPS_EXT_STRUCT
+ .get_channels = hinic3_get_channels,
+ .set_channels = hinic3_set_channels,
+
+#ifndef NOT_HAVE_GET_RXFH_INDIR_SIZE
+ .get_rxfh_indir_size = hinic3_get_rxfh_indir_size,
+#endif
+
+#if defined(ETHTOOL_GRSSH) && defined(ETHTOOL_SRSSH)
+ .get_rxfh_key_size = hinic3_get_rxfh_key_size,
+ .get_rxfh = hinic3_get_rxfh,
+ .set_rxfh = hinic3_set_rxfh,
+#else
+ .get_rxfh_indir = hinic3_get_rxfh_indir,
+ .set_rxfh_indir = hinic3_set_rxfh_indir,
+#endif
+
+#endif /* HAVE_RHEL6_ETHTOOL_OPS_EXT_STRUCT */
+};
+
+#ifdef HAVE_RHEL6_ETHTOOL_OPS_EXT_STRUCT
+static const struct ethtool_ops_ext hinic3vf_ethtool_ops_ext = {
+ .size = sizeof(struct ethtool_ops_ext),
+ .get_channels = hinic3_get_channels,
+ .set_channels = hinic3_set_channels,
+
+#ifndef NOT_HAVE_GET_RXFH_INDIR_SIZE
+ .get_rxfh_indir_size = hinic3_get_rxfh_indir_size,
+#endif
+
+#if defined(ETHTOOL_GRSSH) && defined(ETHTOOL_SRSSH)
+ .get_rxfh_key_size = hinic3_get_rxfh_key_size,
+ .get_rxfh = hinic3_get_rxfh,
+ .set_rxfh = hinic3_set_rxfh,
+#else
+ .get_rxfh_indir = hinic3_get_rxfh_indir,
+ .set_rxfh_indir = hinic3_set_rxfh_indir,
+#endif
+
+};
+#endif /* HAVE_RHEL6_ETHTOOL_OPS_EXT_STRUCT */
+
+void hinic3_set_ethtool_ops(struct net_device *netdev)
+{
+ SET_ETHTOOL_OPS(netdev, &hinic3_ethtool_ops);
+#ifdef HAVE_RHEL6_ETHTOOL_OPS_EXT_STRUCT
+ set_ethtool_ops_ext(netdev, &hinic3_ethtool_ops_ext);
+#endif /* HAVE_RHEL6_ETHTOOL_OPS_EXT_STRUCT */
+}
+
+void hinic3vf_set_ethtool_ops(struct net_device *netdev)
+{
+ SET_ETHTOOL_OPS(netdev, &hinic3vf_ethtool_ops);
+#ifdef HAVE_RHEL6_ETHTOOL_OPS_EXT_STRUCT
+ set_ethtool_ops_ext(netdev, &hinic3vf_ethtool_ops_ext);
+#endif /* HAVE_RHEL6_ETHTOOL_OPS_EXT_STRUCT */
+}
+
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_ethtool_stats.c b/drivers/net/ethernet/huawei/hinic3/hinic3_ethtool_stats.c
new file mode 100644
index 000000000000..de59b7668254
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_ethtool_stats.c
@@ -0,0 +1,1233 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC]" fmt
+
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/interrupt.h>
+#include <linux/etherdevice.h>
+#include <linux/netdevice.h>
+#include <linux/if_vlan.h>
+#include <linux/ethtool.h>
+
+#include "ossl_knl.h"
+#include "hinic3_hw.h"
+#include "hinic3_crm.h"
+#include "hinic3_mt.h"
+#include "hinic3_nic_cfg.h"
+#include "hinic3_nic_dev.h"
+#include "hinic3_tx.h"
+#include "hinic3_rx.h"
+
+#define FPGA_PORT_COUNTER 0
+#define EVB_PORT_COUNTER 1
+u16 mag_support_mode = EVB_PORT_COUNTER;
+module_param(mag_support_mode, ushort, 0444);
+MODULE_PARM_DESC(mag_support_mode, "Set mag port counter support mode, 0:FPGA 1:EVB, default is 1");
+
+struct hinic3_stats {
+ char name[ETH_GSTRING_LEN];
+ u32 size;
+ int offset;
+};
+
+#define HINIC3_NETDEV_STAT(_stat_item) { \
+ .name = #_stat_item, \
+ .size = FIELD_SIZEOF(struct rtnl_link_stats64, _stat_item), \
+ .offset = offsetof(struct rtnl_link_stats64, _stat_item) \
+}
+
+static struct hinic3_stats hinic3_netdev_stats[] = {
+ HINIC3_NETDEV_STAT(rx_packets),
+ HINIC3_NETDEV_STAT(tx_packets),
+ HINIC3_NETDEV_STAT(rx_bytes),
+ HINIC3_NETDEV_STAT(tx_bytes),
+ HINIC3_NETDEV_STAT(rx_errors),
+ HINIC3_NETDEV_STAT(tx_errors),
+ HINIC3_NETDEV_STAT(rx_dropped),
+ HINIC3_NETDEV_STAT(tx_dropped),
+ HINIC3_NETDEV_STAT(multicast),
+ HINIC3_NETDEV_STAT(collisions),
+ HINIC3_NETDEV_STAT(rx_length_errors),
+ HINIC3_NETDEV_STAT(rx_over_errors),
+ HINIC3_NETDEV_STAT(rx_crc_errors),
+ HINIC3_NETDEV_STAT(rx_frame_errors),
+ HINIC3_NETDEV_STAT(rx_fifo_errors),
+ HINIC3_NETDEV_STAT(rx_missed_errors),
+ HINIC3_NETDEV_STAT(tx_aborted_errors),
+ HINIC3_NETDEV_STAT(tx_carrier_errors),
+ HINIC3_NETDEV_STAT(tx_fifo_errors),
+ HINIC3_NETDEV_STAT(tx_heartbeat_errors),
+};
+
+#define HINIC3_NIC_STAT(_stat_item) { \
+ .name = #_stat_item, \
+ .size = FIELD_SIZEOF(struct hinic3_nic_stats, _stat_item), \
+ .offset = offsetof(struct hinic3_nic_stats, _stat_item) \
+}
+
+static struct hinic3_stats hinic3_nic_dev_stats[] = {
+ HINIC3_NIC_STAT(netdev_tx_timeout),
+};
+
+static struct hinic3_stats hinic3_nic_dev_stats_extern[] = {
+ HINIC3_NIC_STAT(tx_carrier_off_drop),
+ HINIC3_NIC_STAT(tx_invalid_qid),
+ HINIC3_NIC_STAT(rsvd1),
+ HINIC3_NIC_STAT(rsvd2),
+};
+
+#define HINIC3_RXQ_STAT(_stat_item) { \
+ .name = "rxq%d_"#_stat_item, \
+ .size = FIELD_SIZEOF(struct hinic3_rxq_stats, _stat_item), \
+ .offset = offsetof(struct hinic3_rxq_stats, _stat_item) \
+}
+
+#define HINIC3_TXQ_STAT(_stat_item) { \
+ .name = "txq%d_"#_stat_item, \
+ .size = FIELD_SIZEOF(struct hinic3_txq_stats, _stat_item), \
+ .offset = offsetof(struct hinic3_txq_stats, _stat_item) \
+}
+
+/*lint -save -e786*/
+static struct hinic3_stats hinic3_rx_queue_stats[] = {
+ HINIC3_RXQ_STAT(packets),
+ HINIC3_RXQ_STAT(bytes),
+ HINIC3_RXQ_STAT(errors),
+ HINIC3_RXQ_STAT(csum_errors),
+ HINIC3_RXQ_STAT(other_errors),
+ HINIC3_RXQ_STAT(dropped),
+#ifdef HAVE_XDP_SUPPORT
+ HINIC3_RXQ_STAT(xdp_dropped),
+#endif
+ HINIC3_RXQ_STAT(rx_buf_empty),
+};
+
+static struct hinic3_stats hinic3_rx_queue_stats_extern[] = {
+ HINIC3_RXQ_STAT(alloc_skb_err),
+ HINIC3_RXQ_STAT(alloc_rx_buf_err),
+ HINIC3_RXQ_STAT(xdp_large_pkt),
+ HINIC3_RXQ_STAT(restore_drop_sge),
+ HINIC3_RXQ_STAT(rsvd2),
+};
+
+static struct hinic3_stats hinic3_tx_queue_stats[] = {
+ HINIC3_TXQ_STAT(packets),
+ HINIC3_TXQ_STAT(bytes),
+ HINIC3_TXQ_STAT(busy),
+ HINIC3_TXQ_STAT(wake),
+ HINIC3_TXQ_STAT(dropped),
+};
+
+static struct hinic3_stats hinic3_tx_queue_stats_extern[] = {
+ HINIC3_TXQ_STAT(skb_pad_err),
+ HINIC3_TXQ_STAT(frag_len_overflow),
+ HINIC3_TXQ_STAT(offload_cow_skb_err),
+ HINIC3_TXQ_STAT(map_frag_err),
+ HINIC3_TXQ_STAT(unknown_tunnel_pkt),
+ HINIC3_TXQ_STAT(frag_size_err),
+ HINIC3_TXQ_STAT(rsvd1),
+ HINIC3_TXQ_STAT(rsvd2),
+};
+
+/*lint -restore*/
+
+#define HINIC3_FUNC_STAT(_stat_item) { \
+ .name = #_stat_item, \
+ .size = FIELD_SIZEOF(struct hinic3_vport_stats, _stat_item), \
+ .offset = offsetof(struct hinic3_vport_stats, _stat_item) \
+}
+
+static struct hinic3_stats hinic3_function_stats[] = {
+ HINIC3_FUNC_STAT(tx_unicast_pkts_vport),
+ HINIC3_FUNC_STAT(tx_unicast_bytes_vport),
+ HINIC3_FUNC_STAT(tx_multicast_pkts_vport),
+ HINIC3_FUNC_STAT(tx_multicast_bytes_vport),
+ HINIC3_FUNC_STAT(tx_broadcast_pkts_vport),
+ HINIC3_FUNC_STAT(tx_broadcast_bytes_vport),
+
+ HINIC3_FUNC_STAT(rx_unicast_pkts_vport),
+ HINIC3_FUNC_STAT(rx_unicast_bytes_vport),
+ HINIC3_FUNC_STAT(rx_multicast_pkts_vport),
+ HINIC3_FUNC_STAT(rx_multicast_bytes_vport),
+ HINIC3_FUNC_STAT(rx_broadcast_pkts_vport),
+ HINIC3_FUNC_STAT(rx_broadcast_bytes_vport),
+
+ HINIC3_FUNC_STAT(tx_discard_vport),
+ HINIC3_FUNC_STAT(rx_discard_vport),
+ HINIC3_FUNC_STAT(tx_err_vport),
+ HINIC3_FUNC_STAT(rx_err_vport),
+};
+
+#define HINIC3_PORT_STAT(_stat_item) { \
+ .name = #_stat_item, \
+ .size = FIELD_SIZEOF(struct mag_cmd_port_stats, _stat_item), \
+ .offset = offsetof(struct mag_cmd_port_stats, _stat_item) \
+}
+
+static struct hinic3_stats hinic3_port_stats[] = {
+ HINIC3_PORT_STAT(mac_tx_fragment_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_undersize_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_undermin_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_64_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_65_127_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_128_255_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_256_511_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_512_1023_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_1024_1518_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_1519_2047_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_2048_4095_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_4096_8191_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_8192_9216_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_9217_12287_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_12288_16383_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_1519_max_bad_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_1519_max_good_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_oversize_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_jabber_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_bad_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_bad_oct_num),
+ HINIC3_PORT_STAT(mac_tx_good_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_good_oct_num),
+ HINIC3_PORT_STAT(mac_tx_total_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_total_oct_num),
+ HINIC3_PORT_STAT(mac_tx_uni_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_multi_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_broad_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_pause_num),
+ HINIC3_PORT_STAT(mac_tx_pfc_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_pfc_pri0_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_pfc_pri1_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_pfc_pri2_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_pfc_pri3_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_pfc_pri4_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_pfc_pri5_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_pfc_pri6_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_pfc_pri7_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_control_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_err_all_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_from_app_good_pkt_num),
+ HINIC3_PORT_STAT(mac_tx_from_app_bad_pkt_num),
+
+ HINIC3_PORT_STAT(mac_rx_fragment_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_undersize_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_undermin_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_64_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_65_127_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_128_255_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_256_511_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_512_1023_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_1024_1518_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_1519_2047_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_2048_4095_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_4096_8191_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_8192_9216_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_9217_12287_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_12288_16383_oct_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_1519_max_bad_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_1519_max_good_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_oversize_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_jabber_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_bad_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_bad_oct_num),
+ HINIC3_PORT_STAT(mac_rx_good_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_good_oct_num),
+ HINIC3_PORT_STAT(mac_rx_total_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_total_oct_num),
+ HINIC3_PORT_STAT(mac_rx_uni_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_multi_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_broad_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_pause_num),
+ HINIC3_PORT_STAT(mac_rx_pfc_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_pfc_pri0_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_pfc_pri1_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_pfc_pri2_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_pfc_pri3_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_pfc_pri4_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_pfc_pri5_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_pfc_pri6_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_pfc_pri7_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_control_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_sym_err_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_fcs_err_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_send_app_good_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_send_app_bad_pkt_num),
+ HINIC3_PORT_STAT(mac_rx_unfilter_pkt_num),
+};
+
+#define HINIC3_FGPA_PORT_STAT(_stat_item) { \
+ .name = #_stat_item, \
+ .size = FIELD_SIZEOF(struct hinic3_phy_fpga_port_stats, _stat_item), \
+ .offset = offsetof(struct hinic3_phy_fpga_port_stats, _stat_item) \
+}
+
+static struct hinic3_stats g_hinic3_fpga_port_stats[] = {
+ HINIC3_FGPA_PORT_STAT(mac_rx_total_octs_port),
+ HINIC3_FGPA_PORT_STAT(mac_tx_total_octs_port),
+ HINIC3_FGPA_PORT_STAT(mac_rx_under_frame_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_rx_frag_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_rx_64_oct_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_rx_127_oct_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_rx_255_oct_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_rx_511_oct_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_rx_1023_oct_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_rx_max_oct_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_rx_over_oct_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_tx_64_oct_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_tx_127_oct_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_tx_255_oct_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_tx_511_oct_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_tx_1023_oct_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_tx_max_oct_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_tx_over_oct_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_rx_good_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_rx_crc_error_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_rx_broadcast_ok_port),
+ HINIC3_FGPA_PORT_STAT(mac_rx_multicast_ok_port),
+ HINIC3_FGPA_PORT_STAT(mac_rx_mac_frame_ok_port),
+ HINIC3_FGPA_PORT_STAT(mac_rx_length_err_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_rx_vlan_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_rx_pause_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_rx_unknown_mac_frame_port),
+ HINIC3_FGPA_PORT_STAT(mac_tx_good_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_tx_broadcast_ok_port),
+ HINIC3_FGPA_PORT_STAT(mac_tx_multicast_ok_port),
+ HINIC3_FGPA_PORT_STAT(mac_tx_underrun_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_tx_mac_frame_ok_port),
+ HINIC3_FGPA_PORT_STAT(mac_tx_vlan_pkts_port),
+ HINIC3_FGPA_PORT_STAT(mac_tx_pause_pkts_port),
+};
+
+static char g_hinic_priv_flags_strings[][ETH_GSTRING_LEN] = {
+ "Symmetric-RSS",
+ "Force-Link-up",
+ "Rxq_Recovery",
+};
+
+u32 hinic3_get_io_stats_size(const struct hinic3_nic_dev *nic_dev)
+{
+ u32 count;
+
+ count = ARRAY_LEN(hinic3_nic_dev_stats) +
+ ARRAY_LEN(hinic3_nic_dev_stats_extern) +
+ (ARRAY_LEN(hinic3_tx_queue_stats) +
+ ARRAY_LEN(hinic3_tx_queue_stats_extern) +
+ ARRAY_LEN(hinic3_rx_queue_stats) +
+ ARRAY_LEN(hinic3_rx_queue_stats_extern)) * nic_dev->max_qps;
+
+ return count;
+}
+
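+/*
+ * Read one stats field through a generic pointer, dispatching on the field
+ * width recorded in the stats tables (u64/u32/u16, otherwise u8).
+ */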
+#define GET_VALUE_OF_PTR(size, ptr) ( \
+ (size) == sizeof(u64) ? *(u64 *)(ptr) : \
+ (size) == sizeof(u32) ? *(u32 *)(ptr) : \
+ (size) == sizeof(u16) ? *(u16 *)(ptr) : *(u8 *)(ptr) \
+)
+
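+/* Pack one device-level stats table into the show-item array at item_idx. */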
+#define DEV_STATS_PACK(items, item_idx, array, stats_ptr) do { \
+ int j; \
+ for (j = 0; j < ARRAY_LEN(array); j++) { \
+ memcpy((items)[item_idx].name, (array)[j].name, \
+ HINIC3_SHOW_ITEM_LEN); \
+ (items)[item_idx].hexadecimal = 0; \
+ (items)[item_idx].value = \
+ GET_VALUE_OF_PTR((array)[j].size, \
+ (char *)(stats_ptr) + (array)[j].offset); \
+ (item_idx)++; \
+ } \
+} while (0)
+
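+/*
+ * Per-queue variant of DEV_STATS_PACK: each table name is a format string
+ * ("txq%d_..." / "rxq%d_..."), so the queue id is formatted into the item
+ * name as well.
+ */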
+#define QUEUE_STATS_PACK(items, item_idx, array, stats_ptr, qid) do { \
+ int j; \
+ for (j = 0; j < ARRAY_LEN(array); j++) { \
+ memcpy((items)[item_idx].name, (array)[j].name, \
+ HINIC3_SHOW_ITEM_LEN); \
+ snprintf((items)[item_idx].name, HINIC3_SHOW_ITEM_LEN, \
+ (array)[j].name, (qid)); \
+ (items)[item_idx].hexadecimal = 0; \
+ (items)[item_idx].value = \
+ GET_VALUE_OF_PTR((array)[j].size, \
+ (char *)(stats_ptr) + (array)[j].offset); \
+ (item_idx)++; \
+ } \
+} while (0)
+
+void hinic3_get_io_stats(const struct hinic3_nic_dev *nic_dev, void *stats)
+{
+ struct hinic3_show_item *items = stats;
+ int item_idx = 0;
+ u16 qid;
+
+ DEV_STATS_PACK(items, item_idx, hinic3_nic_dev_stats, &nic_dev->stats);
+ DEV_STATS_PACK(items, item_idx, hinic3_nic_dev_stats_extern,
+ &nic_dev->stats);
+
+ for (qid = 0; qid < nic_dev->max_qps; qid++) {
+ QUEUE_STATS_PACK(items, item_idx, hinic3_tx_queue_stats,
+ &nic_dev->txqs[qid].txq_stats, qid);
+ QUEUE_STATS_PACK(items, item_idx, hinic3_tx_queue_stats_extern,
+ &nic_dev->txqs[qid].txq_stats, qid);
+ }
+
+ for (qid = 0; qid < nic_dev->max_qps; qid++) {
+ QUEUE_STATS_PACK(items, item_idx, hinic3_rx_queue_stats,
+ &nic_dev->rxqs[qid].rxq_stats, qid);
+ QUEUE_STATS_PACK(items, item_idx, hinic3_rx_queue_stats_extern,
+ &nic_dev->rxqs[qid].rxq_stats, qid);
+ }
+}
+
+static char g_hinic3_test_strings[][ETH_GSTRING_LEN] = {
+ "Internal lb test (on/offline)",
+ "External lb test (external_lb)",
+};
+
+int hinic3_get_sset_count(struct net_device *netdev, int sset)
+{
+ int count = 0, q_num = 0;
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+ switch (sset) {
+ case ETH_SS_TEST:
+ return ARRAY_LEN(g_hinic3_test_strings);
+ case ETH_SS_STATS:
+ q_num = nic_dev->q_params.num_qps;
+ count = ARRAY_LEN(hinic3_netdev_stats) +
+ ARRAY_LEN(hinic3_nic_dev_stats) +
+ ARRAY_LEN(hinic3_function_stats) +
+ (ARRAY_LEN(hinic3_tx_queue_stats) +
+ ARRAY_LEN(hinic3_rx_queue_stats)) * q_num;
+
+ if (!HINIC3_FUNC_IS_VF(nic_dev->hwdev)) {
+ if (mag_support_mode == FPGA_PORT_COUNTER)
+ count += ARRAY_LEN(g_hinic3_fpga_port_stats);
+ else
+ count += ARRAY_LEN(hinic3_port_stats);
+ }
+
+ return count;
+ case ETH_SS_PRIV_FLAGS:
+ return ARRAY_LEN(g_hinic_priv_flags_strings);
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
+static void get_drv_queue_stats(struct hinic3_nic_dev *nic_dev, u64 *data)
+{
+ struct hinic3_txq_stats txq_stats;
+ struct hinic3_rxq_stats rxq_stats;
+ u16 i = 0, j = 0, qid = 0;
+ char *p = NULL;
+
+ for (qid = 0; qid < nic_dev->q_params.num_qps; qid++) {
+ if (!nic_dev->txqs)
+ break;
+
+ hinic3_txq_get_stats(&nic_dev->txqs[qid], &txq_stats);
+ for (j = 0; j < ARRAY_LEN(hinic3_tx_queue_stats); j++, i++) {
+ p = (char *)(&txq_stats) +
+ hinic3_tx_queue_stats[j].offset;
+ data[i] = (hinic3_tx_queue_stats[j].size ==
+ sizeof(u64)) ? *(u64 *)p : *(u32 *)p;
+ }
+ }
+
+ for (qid = 0; qid < nic_dev->q_params.num_qps; qid++) {
+ if (!nic_dev->rxqs)
+ break;
+
+ hinic3_rxq_get_stats(&nic_dev->rxqs[qid], &rxq_stats);
+ for (j = 0; j < ARRAY_LEN(hinic3_rx_queue_stats); j++, i++) {
+ p = (char *)(&rxq_stats) +
+ hinic3_rx_queue_stats[j].offset;
+ data[i] = (hinic3_rx_queue_stats[j].size ==
+ sizeof(u64)) ? *(u64 *)p : *(u32 *)p;
+ }
+ }
+}
+
+static u16 get_fpga_port_stats(struct hinic3_nic_dev *nic_dev, u64 *data)
+{
+ struct hinic3_phy_fpga_port_stats *port_stats = NULL;
+ char *p = NULL;
+ u16 i = 0, j = 0;
+ int err;
+
+ port_stats = kzalloc(sizeof(*port_stats), GFP_KERNEL);
+ if (!port_stats) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to malloc port stats\n");
+ memset(&data[i], 0,
+ ARRAY_LEN(g_hinic3_fpga_port_stats) * sizeof(*data));
+ i += ARRAY_LEN(g_hinic3_fpga_port_stats);
+ return i;
+ }
+
+ err = hinic3_get_fpga_phy_port_stats(nic_dev->hwdev, port_stats);
+ if (err)
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to get port stats from fw\n");
+
+ for (j = 0; j < ARRAY_LEN(g_hinic3_fpga_port_stats); j++, i++) {
+ p = (char *)(port_stats) + g_hinic3_fpga_port_stats[j].offset;
+ data[i] = (g_hinic3_fpga_port_stats[j].size ==
+ sizeof(u64)) ? *(u64 *)p : *(u32 *)p;
+ }
+
+ kfree(port_stats);
+
+ return i;
+}
+
+static u16 get_ethtool_port_stats(struct hinic3_nic_dev *nic_dev, u64 *data)
+{
+ struct mag_cmd_port_stats *port_stats = NULL;
+ char *p = NULL;
+ u16 i = 0, j = 0;
+ int err;
+
+ if (mag_support_mode == FPGA_PORT_COUNTER)
+ return get_fpga_port_stats(nic_dev, data);
+
+ port_stats = kzalloc(sizeof(*port_stats), GFP_KERNEL);
+ if (!port_stats) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to malloc port stats\n");
+ memset(&data[i], 0,
+ ARRAY_LEN(hinic3_port_stats) * sizeof(*data));
+ i += ARRAY_LEN(hinic3_port_stats);
+ return i;
+ }
+
+ err = hinic3_get_phy_port_stats(nic_dev->hwdev, port_stats);
+ if (err)
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to get port stats from fw\n");
+
+ for (j = 0; j < ARRAY_LEN(hinic3_port_stats); j++, i++) {
+ p = (char *)(port_stats) + hinic3_port_stats[j].offset;
+ data[i] = (hinic3_port_stats[j].size ==
+ sizeof(u64)) ? *(u64 *)p : *(u32 *)p;
+ }
+
+ kfree(port_stats);
+
+ return i;
+}
+
+void hinic3_get_ethtool_stats(struct net_device *netdev,
+ struct ethtool_stats *stats, u64 *data)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+#ifdef HAVE_NDO_GET_STATS64
+ struct rtnl_link_stats64 temp;
+ const struct rtnl_link_stats64 *net_stats = NULL;
+#else
+ const struct net_device_stats *net_stats = NULL;
+#endif
+ struct hinic3_nic_stats *nic_stats = NULL;
+
+ struct hinic3_vport_stats vport_stats = {0};
+ u16 i = 0, j = 0;
+ char *p = NULL;
+ int err;
+
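+	/* The fill order must match hinic3_get_strings(): netdev stats,
+	 * nic dev stats, function stats, port stats (PF only), then
+	 * per-queue stats.
+	 */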
+#ifdef HAVE_NDO_GET_STATS64
+ net_stats = dev_get_stats(netdev, &temp);
+#else
+ net_stats = dev_get_stats(netdev);
+#endif
+ for (j = 0; j < ARRAY_LEN(hinic3_netdev_stats); j++, i++) {
+ p = (char *)(net_stats) + hinic3_netdev_stats[j].offset;
+ data[i] = GET_VALUE_OF_PTR(hinic3_netdev_stats[j].size, p);
+ }
+
+ nic_stats = &nic_dev->stats;
+ for (j = 0; j < ARRAY_LEN(hinic3_nic_dev_stats); j++, i++) {
+ p = (char *)(nic_stats) + hinic3_nic_dev_stats[j].offset;
+ data[i] = GET_VALUE_OF_PTR(hinic3_nic_dev_stats[j].size, p);
+ }
+
+ err = hinic3_get_vport_stats(nic_dev->hwdev, hinic3_global_func_id(nic_dev->hwdev),
+ &vport_stats);
+ if (err)
+ nicif_err(nic_dev, drv, netdev,
+ "Failed to get function stats from fw\n");
+
+ for (j = 0; j < ARRAY_LEN(hinic3_function_stats); j++, i++) {
+ p = (char *)(&vport_stats) + hinic3_function_stats[j].offset;
+ data[i] = GET_VALUE_OF_PTR(hinic3_function_stats[j].size, p);
+ }
+
+ if (!HINIC3_FUNC_IS_VF(nic_dev->hwdev))
+ i += get_ethtool_port_stats(nic_dev, data + i);
+
+ get_drv_queue_stats(nic_dev, data + i);
+}
+
+static u16 get_drv_dev_strings(struct hinic3_nic_dev *nic_dev, char *p)
+{
+ u16 i, cnt = 0;
+
+ for (i = 0; i < ARRAY_LEN(hinic3_netdev_stats); i++) {
+ memcpy(p, hinic3_netdev_stats[i].name,
+ ETH_GSTRING_LEN);
+ p += ETH_GSTRING_LEN;
+ cnt++;
+ }
+
+ for (i = 0; i < ARRAY_LEN(hinic3_nic_dev_stats); i++) {
+ memcpy(p, hinic3_nic_dev_stats[i].name, ETH_GSTRING_LEN);
+ p += ETH_GSTRING_LEN;
+ cnt++;
+ }
+
+ return cnt;
+}
+
+static u16 get_hw_stats_strings(struct hinic3_nic_dev *nic_dev, char *p)
+{
+ u16 i, cnt = 0;
+
+ for (i = 0; i < ARRAY_LEN(hinic3_function_stats); i++) {
+ memcpy(p, hinic3_function_stats[i].name,
+ ETH_GSTRING_LEN);
+ p += ETH_GSTRING_LEN;
+ cnt++;
+ }
+
+ if (!HINIC3_FUNC_IS_VF(nic_dev->hwdev)) {
+ if (mag_support_mode == FPGA_PORT_COUNTER) {
+ for (i = 0; i < ARRAY_LEN(g_hinic3_fpga_port_stats); i++) {
+ memcpy(p, g_hinic3_fpga_port_stats[i].name, ETH_GSTRING_LEN);
+ p += ETH_GSTRING_LEN;
+ cnt++;
+ }
+ } else {
+ for (i = 0; i < ARRAY_LEN(hinic3_port_stats); i++) {
+ memcpy(p, hinic3_port_stats[i].name, ETH_GSTRING_LEN);
+ p += ETH_GSTRING_LEN;
+ cnt++;
+ }
+ }
+ }
+
+ return cnt;
+}
+
+static u16 get_qp_stats_strings(const struct hinic3_nic_dev *nic_dev, char *p)
+{
+ u16 i = 0, j = 0, cnt = 0;
+ int err;
+
+ for (i = 0; i < nic_dev->q_params.num_qps; i++) {
+ for (j = 0; j < ARRAY_LEN(hinic3_tx_queue_stats); j++) {
+ err = sprintf(p, hinic3_tx_queue_stats[j].name, i);
+ if (err < 0)
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to sprintf tx queue stats name, idx_qps: %u, idx_stats: %u\n",
+ i, j);
+ p += ETH_GSTRING_LEN;
+ cnt++;
+ }
+ }
+
+ for (i = 0; i < nic_dev->q_params.num_qps; i++) {
+ for (j = 0; j < ARRAY_LEN(hinic3_rx_queue_stats); j++) {
+ err = sprintf(p, hinic3_rx_queue_stats[j].name, i);
+ if (err < 0)
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to sprintf rx queue stats name, idx_qps: %u, idx_stats: %u\n",
+ i, j);
+ p += ETH_GSTRING_LEN;
+ cnt++;
+ }
+ }
+
+ return cnt;
+}
+
+void hinic3_get_strings(struct net_device *netdev, u32 stringset, u8 *data)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ char *p = (char *)data;
+ u16 offset = 0;
+
+ switch (stringset) {
+ case ETH_SS_TEST:
+ memcpy(data, *g_hinic3_test_strings, sizeof(g_hinic3_test_strings));
+ return;
+ case ETH_SS_STATS:
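+		/* Each helper returns how many ETH_GSTRING_LEN slots it wrote. */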
+ offset = get_drv_dev_strings(nic_dev, p);
+ offset += get_hw_stats_strings(nic_dev,
+ p + offset * ETH_GSTRING_LEN);
+ get_qp_stats_strings(nic_dev, p + offset * ETH_GSTRING_LEN);
+
+ return;
+ case ETH_SS_PRIV_FLAGS:
+ memcpy(data, g_hinic_priv_flags_strings,
+ sizeof(g_hinic_priv_flags_strings));
+ return;
+ default:
+ nicif_err(nic_dev, drv, netdev,
+			  "Invalid string set %u\n", stringset);
+ return;
+ }
+}
+
+static const u32 hinic3_mag_link_mode_ge[] = {
+ ETHTOOL_LINK_MODE_1000baseT_Full_BIT,
+ ETHTOOL_LINK_MODE_1000baseKX_Full_BIT,
+ ETHTOOL_LINK_MODE_1000baseX_Full_BIT,
+};
+
+static const u32 hinic3_mag_link_mode_10ge_base_r[] = {
+ ETHTOOL_LINK_MODE_10000baseKR_Full_BIT,
+ ETHTOOL_LINK_MODE_10000baseR_FEC_BIT,
+ ETHTOOL_LINK_MODE_10000baseCR_Full_BIT,
+ ETHTOOL_LINK_MODE_10000baseSR_Full_BIT,
+ ETHTOOL_LINK_MODE_10000baseLR_Full_BIT,
+ ETHTOOL_LINK_MODE_10000baseLRM_Full_BIT,
+};
+
+static const u32 hinic3_mag_link_mode_25ge_base_r[] = {
+ ETHTOOL_LINK_MODE_25000baseCR_Full_BIT,
+ ETHTOOL_LINK_MODE_25000baseKR_Full_BIT,
+ ETHTOOL_LINK_MODE_25000baseSR_Full_BIT,
+};
+
+static const u32 hinic3_mag_link_mode_40ge_base_r4[] = {
+ ETHTOOL_LINK_MODE_40000baseKR4_Full_BIT,
+ ETHTOOL_LINK_MODE_40000baseCR4_Full_BIT,
+ ETHTOOL_LINK_MODE_40000baseSR4_Full_BIT,
+ ETHTOOL_LINK_MODE_40000baseLR4_Full_BIT,
+};
+
+static const u32 hinic3_mag_link_mode_50ge_base_r[] = {
+ ETHTOOL_LINK_MODE_50000baseKR_Full_BIT,
+ ETHTOOL_LINK_MODE_50000baseSR_Full_BIT,
+ ETHTOOL_LINK_MODE_50000baseCR_Full_BIT,
+};
+
+static const u32 hinic3_mag_link_mode_50ge_base_r2[] = {
+ ETHTOOL_LINK_MODE_50000baseCR2_Full_BIT,
+ ETHTOOL_LINK_MODE_50000baseKR2_Full_BIT,
+ ETHTOOL_LINK_MODE_50000baseSR2_Full_BIT,
+};
+
+static const u32 hinic3_mag_link_mode_100ge_base_r[] = {
+ ETHTOOL_LINK_MODE_100000baseKR_Full_BIT,
+ ETHTOOL_LINK_MODE_100000baseSR_Full_BIT,
+ ETHTOOL_LINK_MODE_100000baseCR_Full_BIT,
+};
+
+static const u32 hinic3_mag_link_mode_100ge_base_r2[] = {
+ ETHTOOL_LINK_MODE_100000baseKR2_Full_BIT,
+ ETHTOOL_LINK_MODE_100000baseSR2_Full_BIT,
+ ETHTOOL_LINK_MODE_100000baseCR2_Full_BIT,
+};
+
+static const u32 hinic3_mag_link_mode_100ge_base_r4[] = {
+ ETHTOOL_LINK_MODE_100000baseKR4_Full_BIT,
+ ETHTOOL_LINK_MODE_100000baseSR4_Full_BIT,
+ ETHTOOL_LINK_MODE_100000baseCR4_Full_BIT,
+ ETHTOOL_LINK_MODE_100000baseLR4_ER4_Full_BIT,
+};
+
+static const u32 hinic3_mag_link_mode_200ge_base_r2[] = {
+ ETHTOOL_LINK_MODE_200000baseKR2_Full_BIT,
+ ETHTOOL_LINK_MODE_200000baseSR2_Full_BIT,
+ ETHTOOL_LINK_MODE_200000baseCR2_Full_BIT,
+};
+
+static const u32 hinic3_mag_link_mode_200ge_base_r4[] = {
+ ETHTOOL_LINK_MODE_200000baseKR4_Full_BIT,
+ ETHTOOL_LINK_MODE_200000baseSR4_Full_BIT,
+ ETHTOOL_LINK_MODE_200000baseCR4_Full_BIT,
+};
+
+struct hw2ethtool_link_mode {
+ const u32 *link_mode_bit_arr;
+ u32 arr_size;
+ u32 speed;
+};
+
+/*lint -save -e26 */
+static const struct hw2ethtool_link_mode
+ hw2ethtool_link_mode_table[LINK_MODE_MAX_NUMBERS] = {
+ [LINK_MODE_GE] = {
+ .link_mode_bit_arr = hinic3_mag_link_mode_ge,
+ .arr_size = ARRAY_LEN(hinic3_mag_link_mode_ge),
+ .speed = SPEED_1000,
+ },
+ [LINK_MODE_10GE_BASE_R] = {
+ .link_mode_bit_arr = hinic3_mag_link_mode_10ge_base_r,
+ .arr_size = ARRAY_LEN(hinic3_mag_link_mode_10ge_base_r),
+ .speed = SPEED_10000,
+ },
+ [LINK_MODE_25GE_BASE_R] = {
+ .link_mode_bit_arr = hinic3_mag_link_mode_25ge_base_r,
+ .arr_size = ARRAY_LEN(hinic3_mag_link_mode_25ge_base_r),
+ .speed = SPEED_25000,
+ },
+ [LINK_MODE_40GE_BASE_R4] = {
+ .link_mode_bit_arr = hinic3_mag_link_mode_40ge_base_r4,
+ .arr_size = ARRAY_LEN(hinic3_mag_link_mode_40ge_base_r4),
+ .speed = SPEED_40000,
+ },
+ [LINK_MODE_50GE_BASE_R] = {
+ .link_mode_bit_arr = hinic3_mag_link_mode_50ge_base_r,
+ .arr_size = ARRAY_LEN(hinic3_mag_link_mode_50ge_base_r),
+ .speed = SPEED_50000,
+ },
+ [LINK_MODE_50GE_BASE_R2] = {
+ .link_mode_bit_arr = hinic3_mag_link_mode_50ge_base_r2,
+ .arr_size = ARRAY_LEN(hinic3_mag_link_mode_50ge_base_r2),
+ .speed = SPEED_50000,
+ },
+ [LINK_MODE_100GE_BASE_R] = {
+ .link_mode_bit_arr = hinic3_mag_link_mode_100ge_base_r,
+ .arr_size = ARRAY_LEN(hinic3_mag_link_mode_100ge_base_r),
+ .speed = SPEED_100000,
+ },
+ [LINK_MODE_100GE_BASE_R2] = {
+ .link_mode_bit_arr = hinic3_mag_link_mode_100ge_base_r2,
+ .arr_size = ARRAY_LEN(hinic3_mag_link_mode_100ge_base_r2),
+ .speed = SPEED_100000,
+ },
+ [LINK_MODE_100GE_BASE_R4] = {
+ .link_mode_bit_arr = hinic3_mag_link_mode_100ge_base_r4,
+ .arr_size = ARRAY_LEN(hinic3_mag_link_mode_100ge_base_r4),
+ .speed = SPEED_100000,
+ },
+ [LINK_MODE_200GE_BASE_R2] = {
+ .link_mode_bit_arr = hinic3_mag_link_mode_200ge_base_r2,
+ .arr_size = ARRAY_LEN(hinic3_mag_link_mode_200ge_base_r2),
+ .speed = SPEED_200000,
+ },
+ [LINK_MODE_200GE_BASE_R4] = {
+ .link_mode_bit_arr = hinic3_mag_link_mode_200ge_base_r4,
+ .arr_size = ARRAY_LEN(hinic3_mag_link_mode_200ge_base_r4),
+ .speed = SPEED_200000,
+ },
+};
+
+/*lint -restore */
+
+#define GET_SUPPORTED_MODE 0
+#define GET_ADVERTISED_MODE 1
+
+struct cmd_link_settings {
+ __ETHTOOL_DECLARE_LINK_MODE_MASK(supported);
+ __ETHTOOL_DECLARE_LINK_MODE_MASK(advertising);
+
+ u32 speed;
+ u8 duplex;
+ u8 port;
+ u8 autoneg;
+};
+
+#define ETHTOOL_ADD_SUPPORTED_LINK_MODE(ecmd, mode) \
+ set_bit(ETHTOOL_LINK_MODE_##mode##_BIT, (ecmd)->supported)
+#define ETHTOOL_ADD_ADVERTISED_LINK_MODE(ecmd, mode) \
+ set_bit(ETHTOOL_LINK_MODE_##mode##_BIT, (ecmd)->advertising)
+
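+/*
+ * Expand every bit of a hw link-mode entry into the ethtool bitmap, skipping
+ * bits that the running kernel's link-mode mask is too small to hold.
+ */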
+#define ETHTOOL_ADD_SUPPORTED_SPEED_LINK_MODE(ecmd, mode) \
+do { \
+ u32 i; \
+ for (i = 0; i < hw2ethtool_link_mode_table[mode].arr_size; i++) { \
+ if (hw2ethtool_link_mode_table[mode].link_mode_bit_arr[i] >= \
+ __ETHTOOL_LINK_MODE_MASK_NBITS) \
+ continue; \
+ set_bit(hw2ethtool_link_mode_table[mode].link_mode_bit_arr[i], \
+ (ecmd)->supported); \
+ } \
+} while (0)
+
+#define ETHTOOL_ADD_ADVERTISED_SPEED_LINK_MODE(ecmd, mode) \
+do { \
+ u32 i; \
+ for (i = 0; i < hw2ethtool_link_mode_table[mode].arr_size; i++) { \
+ if (hw2ethtool_link_mode_table[mode].link_mode_bit_arr[i] >= \
+ __ETHTOOL_LINK_MODE_MASK_NBITS) \
+ continue; \
+ set_bit(hw2ethtool_link_mode_table[mode].link_mode_bit_arr[i], \
+ (ecmd)->advertising); \
+ } \
+} while (0)
+
+/* Related to enum mag_cmd_port_speed */
+static u32 hw_to_ethtool_speed[] = {
+ (u32)SPEED_UNKNOWN, SPEED_10, SPEED_100, SPEED_1000, SPEED_10000,
+ SPEED_25000, SPEED_40000, SPEED_50000, SPEED_100000, SPEED_200000
+};
+
+static int hinic3_ethtool_to_hw_speed_level(u32 speed)
+{
+ int i;
+
+ for (i = 0; i < ARRAY_LEN(hw_to_ethtool_speed); i++) {
+ if (hw_to_ethtool_speed[i] == speed)
+ break;
+ }
+
+ return i;
+}
+
+static void hinic3_add_ethtool_link_mode(struct cmd_link_settings *link_settings,
+ u32 hw_link_mode, u32 name)
+{
+ u32 link_mode;
+
+ for (link_mode = 0; link_mode < LINK_MODE_MAX_NUMBERS; link_mode++) {
+ if (hw_link_mode & BIT(link_mode)) {
+ if (name == GET_SUPPORTED_MODE)
+ ETHTOOL_ADD_SUPPORTED_SPEED_LINK_MODE
+ (link_settings, link_mode);
+ else
+ ETHTOOL_ADD_ADVERTISED_SPEED_LINK_MODE
+ (link_settings, link_mode);
+ }
+ }
+}
+
+static int hinic3_link_speed_set(struct hinic3_nic_dev *nic_dev,
+ struct cmd_link_settings *link_settings,
+ struct nic_port_info *port_info)
+{
+ u8 link_state = 0;
+ int err;
+
+ if (port_info->supported_mode != LINK_MODE_UNKNOWN)
+ hinic3_add_ethtool_link_mode(link_settings,
+ port_info->supported_mode,
+ GET_SUPPORTED_MODE);
+ if (port_info->advertised_mode != LINK_MODE_UNKNOWN)
+ hinic3_add_ethtool_link_mode(link_settings,
+ port_info->advertised_mode,
+ GET_ADVERTISED_MODE);
+
+ err = hinic3_get_link_state(nic_dev->hwdev, &link_state);
+ if (!err && link_state) {
+ link_settings->speed =
+ port_info->speed < ARRAY_LEN(hw_to_ethtool_speed) ?
+ hw_to_ethtool_speed[port_info->speed] :
+ (u32)SPEED_UNKNOWN;
+
+ link_settings->duplex = port_info->duplex;
+ } else {
+ link_settings->speed = (u32)SPEED_UNKNOWN;
+ link_settings->duplex = DUPLEX_UNKNOWN;
+ }
+
+ return 0;
+}
+
+static void hinic3_link_port_type(struct cmd_link_settings *link_settings,
+ u8 port_type)
+{
+ switch (port_type) {
+ case MAG_CMD_WIRE_TYPE_ELECTRIC:
+ ETHTOOL_ADD_SUPPORTED_LINK_MODE(link_settings, TP);
+ ETHTOOL_ADD_ADVERTISED_LINK_MODE(link_settings, TP);
+ link_settings->port = PORT_TP;
+ break;
+
+ case MAG_CMD_WIRE_TYPE_AOC:
+ case MAG_CMD_WIRE_TYPE_MM:
+ case MAG_CMD_WIRE_TYPE_SM:
+ ETHTOOL_ADD_SUPPORTED_LINK_MODE(link_settings, FIBRE);
+ ETHTOOL_ADD_ADVERTISED_LINK_MODE(link_settings, FIBRE);
+ link_settings->port = PORT_FIBRE;
+ break;
+
+ case MAG_CMD_WIRE_TYPE_COPPER:
+ ETHTOOL_ADD_SUPPORTED_LINK_MODE(link_settings, FIBRE);
+ ETHTOOL_ADD_ADVERTISED_LINK_MODE(link_settings, FIBRE);
+ link_settings->port = PORT_DA;
+ break;
+
+ case MAG_CMD_WIRE_TYPE_BACKPLANE:
+ ETHTOOL_ADD_SUPPORTED_LINK_MODE(link_settings, Backplane);
+ ETHTOOL_ADD_ADVERTISED_LINK_MODE(link_settings, Backplane);
+ link_settings->port = PORT_NONE;
+ break;
+
+ default:
+ link_settings->port = PORT_OTHER;
+ break;
+ }
+}
+
+static int get_link_pause_settings(struct hinic3_nic_dev *nic_dev,
+ struct cmd_link_settings *link_settings)
+{
+ struct nic_pause_config nic_pause = {0};
+ int err;
+
+ err = hinic3_get_pause_info(nic_dev->hwdev, &nic_pause);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to get pauseparam from hw\n");
+ return err;
+ }
+
+ ETHTOOL_ADD_SUPPORTED_LINK_MODE(link_settings, Pause);
+ if (nic_pause.rx_pause && nic_pause.tx_pause) {
+ ETHTOOL_ADD_ADVERTISED_LINK_MODE(link_settings, Pause);
+ } else if (nic_pause.tx_pause) {
+ ETHTOOL_ADD_ADVERTISED_LINK_MODE(link_settings,
+ Asym_Pause);
+ } else if (nic_pause.rx_pause) {
+ ETHTOOL_ADD_ADVERTISED_LINK_MODE(link_settings, Pause);
+ ETHTOOL_ADD_ADVERTISED_LINK_MODE(link_settings,
+ Asym_Pause);
+ }
+
+ return 0;
+}
+
+static int get_link_settings(struct net_device *netdev,
+ struct cmd_link_settings *link_settings)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct nic_port_info port_info = {0};
+ int err;
+
+ err = hinic3_get_port_info(nic_dev->hwdev, &port_info,
+ HINIC3_CHANNEL_NIC);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev, "Failed to get port info\n");
+ return err;
+ }
+
+ err = hinic3_link_speed_set(nic_dev, link_settings, &port_info);
+ if (err)
+ return err;
+
+ hinic3_link_port_type(link_settings, port_info.port_type);
+
+ link_settings->autoneg = port_info.autoneg_state == PORT_CFG_AN_ON ?
+ AUTONEG_ENABLE : AUTONEG_DISABLE;
+ if (port_info.autoneg_cap)
+ ETHTOOL_ADD_SUPPORTED_LINK_MODE(link_settings, Autoneg);
+ if (port_info.autoneg_state == PORT_CFG_AN_ON)
+ ETHTOOL_ADD_ADVERTISED_LINK_MODE(link_settings, Autoneg);
+
+ if (!HINIC3_FUNC_IS_VF(nic_dev->hwdev))
+ err = get_link_pause_settings(nic_dev, link_settings);
+
+ return err;
+}
+
+#ifdef ETHTOOL_GLINKSETTINGS
+#ifndef XENSERVER_HAVE_NEW_ETHTOOL_OPS
+int hinic3_get_link_ksettings(struct net_device *netdev,
+ struct ethtool_link_ksettings *link_settings)
+{
+ struct cmd_link_settings settings = { { 0 } };
+ struct ethtool_link_settings *base = &link_settings->base;
+ int err;
+
+ ethtool_link_ksettings_zero_link_mode(link_settings, supported);
+ ethtool_link_ksettings_zero_link_mode(link_settings, advertising);
+
+ err = get_link_settings(netdev, &settings);
+ if (err)
+ return err;
+
+ bitmap_copy(link_settings->link_modes.supported, settings.supported,
+ __ETHTOOL_LINK_MODE_MASK_NBITS);
+ bitmap_copy(link_settings->link_modes.advertising, settings.advertising,
+ __ETHTOOL_LINK_MODE_MASK_NBITS);
+
+ base->autoneg = settings.autoneg;
+ base->speed = settings.speed;
+ base->duplex = settings.duplex;
+ base->port = settings.port;
+
+ return 0;
+}
+#endif
+#endif
+
+static bool hinic3_is_support_speed(u32 supported_link, u32 speed)
+{
+ u32 link_mode;
+
+ for (link_mode = 0; link_mode < LINK_MODE_MAX_NUMBERS; link_mode++) {
+ if (!(supported_link & BIT(link_mode)))
+ continue;
+
+ if (hw2ethtool_link_mode_table[link_mode].speed == speed)
+ return true;
+ }
+
+ return false;
+}
+
+static int hinic3_is_speed_legal(struct hinic3_nic_dev *nic_dev,
+ struct nic_port_info *port_info, u32 speed)
+{
+ struct net_device *netdev = nic_dev->netdev;
+ int speed_level = 0;
+
+ if (port_info->supported_mode == LINK_MODE_UNKNOWN ||
+ port_info->advertised_mode == LINK_MODE_UNKNOWN) {
+ nicif_err(nic_dev, drv, netdev, "Unknown supported link modes\n");
+ return -EAGAIN;
+ }
+
+ speed_level = hinic3_ethtool_to_hw_speed_level(speed);
+ if (speed_level >= PORT_SPEED_UNKNOWN ||
+ !hinic3_is_support_speed(port_info->supported_mode, speed)) {
+ nicif_err(nic_dev, drv, netdev,
+ "Not supported speed: %u\n", speed);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int get_link_settings_type(struct hinic3_nic_dev *nic_dev,
+ u8 autoneg, u32 speed, u32 *set_settings)
+{
+ struct nic_port_info port_info = {0};
+ int err;
+
+ err = hinic3_get_port_info(nic_dev->hwdev, &port_info,
+ HINIC3_CHANNEL_NIC);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Failed to get current settings\n");
+ return -EAGAIN;
+ }
+
+	/* Always set autonegotiation */
+ if (port_info.autoneg_cap)
+ *set_settings |= HILINK_LINK_SET_AUTONEG;
+
+ if (autoneg == AUTONEG_ENABLE) {
+ if (!port_info.autoneg_cap) {
+			nicif_err(nic_dev, drv, nic_dev->netdev, "Autoneg is not supported\n");
+ return -EOPNOTSUPP;
+ }
+ } else if (speed != (u32)SPEED_UNKNOWN) {
+		/* Set speed only when autoneg is disabled */
+ err = hinic3_is_speed_legal(nic_dev, &port_info, speed);
+ if (err)
+ return err;
+
+ *set_settings |= HILINK_LINK_SET_SPEED;
+ } else {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Need to set speed when autoneg is off\n");
+ return -EOPNOTSUPP;
+ }
+
+ return 0;
+}
+
+static int hinic3_set_settings_to_hw(struct hinic3_nic_dev *nic_dev,
+ u32 set_settings, u8 autoneg, u32 speed)
+{
+ struct net_device *netdev = nic_dev->netdev;
+ struct hinic3_link_ksettings settings = {0};
+ int speed_level = 0;
+ char set_link_str[128] = {0};
+ int err = 0;
+
+ err = snprintf(set_link_str, sizeof(set_link_str) - 1, "%s",
+ (bool)(set_settings & HILINK_LINK_SET_AUTONEG) ?
+		       ((bool)autoneg ? "autoneg enable " : "autoneg disable ") : "");
+ if (err < 0)
+ return -EINVAL;
+
+ if (set_settings & HILINK_LINK_SET_SPEED) {
+ speed_level = hinic3_ethtool_to_hw_speed_level(speed);
+ err = snprintf(set_link_str, sizeof(set_link_str) - 1,
+ "%sspeed %u ", set_link_str, speed);
+ if (err < 0)
+ return -EINVAL;
+ }
+
+ settings.valid_bitmap = set_settings;
+ settings.autoneg = (bool)autoneg ? PORT_CFG_AN_ON : PORT_CFG_AN_OFF;
+ settings.speed = (u8)speed_level;
+
+ err = hinic3_set_link_settings(nic_dev->hwdev, &settings);
+ if (err)
+ nicif_err(nic_dev, drv, netdev, "Set %sfailed\n",
+ set_link_str);
+ else
+ nicif_info(nic_dev, drv, netdev, "Set %ssuccess\n",
+ set_link_str);
+
+ return err;
+}
+
+static int set_link_settings(struct net_device *netdev, u8 autoneg, u32 speed)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ u32 set_settings = 0;
+ int err = 0;
+
+ err = get_link_settings_type(nic_dev, autoneg, speed, &set_settings);
+ if (err)
+ return err;
+
+ if (set_settings)
+ err = hinic3_set_settings_to_hw(nic_dev, set_settings,
+ autoneg, speed);
+ else
+ nicif_info(nic_dev, drv, netdev, "Nothing changed, exiting without setting anything\n");
+
+ return err;
+}
+
+#ifdef ETHTOOL_GLINKSETTINGS
+#ifndef XENSERVER_HAVE_NEW_ETHTOOL_OPS
+int hinic3_set_link_ksettings(struct net_device *netdev,
+ const struct ethtool_link_ksettings *link_settings)
+{
+	/* Only autoneg and speed can be set */
+ return set_link_settings(netdev, link_settings->base.autoneg,
+ link_settings->base.speed);
+}
+#endif
+#endif
+
+#ifndef HAVE_NEW_ETHTOOL_LINK_SETTINGS_ONLY
+int hinic3_get_settings(struct net_device *netdev, struct ethtool_cmd *ep)
+{
+ struct cmd_link_settings settings = { { 0 } };
+ int err;
+
+ err = get_link_settings(netdev, &settings);
+ if (err)
+ return err;
+
+ ep->supported = settings.supported[0] & ((u32)~0);
+ ep->advertising = settings.advertising[0] & ((u32)~0);
+
+ ep->autoneg = settings.autoneg;
+ ethtool_cmd_speed_set(ep, settings.speed);
+ ep->duplex = settings.duplex;
+ ep->port = settings.port;
+ ep->transceiver = XCVR_INTERNAL;
+
+ return 0;
+}
+
+int hinic3_set_settings(struct net_device *netdev,
+ struct ethtool_cmd *link_settings)
+{
+	/* Only autoneg and speed can be set */
+ return set_link_settings(netdev, link_settings->autoneg,
+ ethtool_cmd_speed(link_settings));
+}
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_filter.c b/drivers/net/ethernet/huawei/hinic3/hinic3_filter.c
new file mode 100644
index 000000000000..70346d6393de
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_filter.c
@@ -0,0 +1,483 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC]" fmt
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/device.h>
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/etherdevice.h>
+#include <linux/netdevice.h>
+#include <linux/debugfs.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+
+#include "ossl_knl.h"
+#include "hinic3_hw.h"
+#include "hinic3_crm.h"
+#include "hinic3_nic_dev.h"
+#include "hinic3_srv_nic.h"
+
+static unsigned char set_filter_state = 1;
+module_param(set_filter_state, byte, 0444);
+MODULE_PARM_DESC(set_filter_state, "Set mac filter config state: 0 - disable, 1 - enable (default=1)");
+
+static int hinic3_uc_sync(struct net_device *netdev, u8 *addr)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+ return hinic3_set_mac(nic_dev->hwdev, addr, 0,
+ hinic3_global_func_id(nic_dev->hwdev),
+ HINIC3_CHANNEL_NIC);
+}
+
+static int hinic3_uc_unsync(struct net_device *netdev, u8 *addr)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+ /* The addr is in use */
+ if (ether_addr_equal(addr, netdev->dev_addr))
+ return 0;
+
+ return hinic3_del_mac(nic_dev->hwdev, addr, 0,
+ hinic3_global_func_id(nic_dev->hwdev),
+ HINIC3_CHANNEL_NIC);
+}
+
+void hinic3_clean_mac_list_filter(struct hinic3_nic_dev *nic_dev)
+{
+ struct net_device *netdev = nic_dev->netdev;
+ struct hinic3_mac_filter *ftmp = NULL;
+ struct hinic3_mac_filter *f = NULL;
+
+ list_for_each_entry_safe(f, ftmp, &nic_dev->uc_filter_list, list) {
+ if (f->state == HINIC3_MAC_HW_SYNCED)
+ hinic3_uc_unsync(netdev, f->addr);
+ list_del(&f->list);
+ kfree(f);
+ }
+
+ list_for_each_entry_safe(f, ftmp, &nic_dev->mc_filter_list, list) {
+ if (f->state == HINIC3_MAC_HW_SYNCED)
+ hinic3_uc_unsync(netdev, f->addr);
+ list_del(&f->list);
+ kfree(f);
+ }
+}
+
+static struct hinic3_mac_filter *hinic3_find_mac(const struct list_head *filter_list,
+ u8 *addr)
+{
+ struct hinic3_mac_filter *f = NULL;
+
+ list_for_each_entry(f, filter_list, list) {
+ if (ether_addr_equal(addr, f->addr))
+ return f;
+ }
+ return NULL;
+}
+
+static struct hinic3_mac_filter *hinic3_add_filter(struct hinic3_nic_dev *nic_dev,
+ struct list_head *mac_filter_list,
+ u8 *addr)
+{
+ struct hinic3_mac_filter *f;
+
+ f = kzalloc(sizeof(*f), GFP_ATOMIC);
+ if (!f)
+ goto out;
+
+ ether_addr_copy(f->addr, addr);
+
+ INIT_LIST_HEAD(&f->list);
+ list_add_tail(&f->list, mac_filter_list);
+
+ f->state = HINIC3_MAC_WAIT_HW_SYNC;
+ set_bit(HINIC3_MAC_FILTER_CHANGED, &nic_dev->flags);
+
+out:
+ return f;
+}
+
+static void hinic3_del_filter(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_mac_filter *f)
+{
+ set_bit(HINIC3_MAC_FILTER_CHANGED, &nic_dev->flags);
+
+ if (f->state == HINIC3_MAC_WAIT_HW_SYNC) {
+ /* have not added to hw, delete it directly */
+ list_del(&f->list);
+ kfree(f);
+ return;
+ }
+
+ f->state = HINIC3_MAC_WAIT_HW_UNSYNC;
+}
+
+static struct hinic3_mac_filter *hinic3_mac_filter_entry_clone(const struct hinic3_mac_filter *src)
+{
+ struct hinic3_mac_filter *f;
+
+ f = kzalloc(sizeof(*f), GFP_ATOMIC);
+ if (!f)
+ return NULL;
+
+ *f = *src;
+ INIT_LIST_HEAD(&f->list);
+
+ return f;
+}
+
+static void hinic3_undo_del_filter_entries(struct list_head *filter_list,
+ const struct list_head *from)
+{
+ struct hinic3_mac_filter *ftmp = NULL;
+ struct hinic3_mac_filter *f = NULL;
+
+ list_for_each_entry_safe(f, ftmp, from, list) {
+ if (hinic3_find_mac(filter_list, f->addr))
+ continue;
+
+ if (f->state == HINIC3_MAC_HW_SYNCED)
+ f->state = HINIC3_MAC_WAIT_HW_UNSYNC;
+
+ list_move_tail(&f->list, filter_list);
+ }
+}
+
+static void hinic3_undo_add_filter_entries(struct list_head *filter_list,
+ const struct list_head *from)
+{
+ struct hinic3_mac_filter *ftmp = NULL;
+ struct hinic3_mac_filter *tmp = NULL;
+ struct hinic3_mac_filter *f = NULL;
+
+ list_for_each_entry_safe(f, ftmp, from, list) {
+ tmp = hinic3_find_mac(filter_list, f->addr);
+ if (tmp && tmp->state == HINIC3_MAC_HW_SYNCED)
+ tmp->state = HINIC3_MAC_WAIT_HW_SYNC;
+ }
+}
+
+static void hinic3_cleanup_filter_list(const struct list_head *head)
+{
+ struct hinic3_mac_filter *ftmp = NULL;
+ struct hinic3_mac_filter *f = NULL;
+
+ list_for_each_entry_safe(f, ftmp, head, list) {
+ list_del(&f->list);
+ kfree(f);
+ }
+}
+
+static int hinic3_mac_filter_sync_hw(struct hinic3_nic_dev *nic_dev,
+ struct list_head *del_list,
+ struct list_head *add_list)
+{
+ struct net_device *netdev = nic_dev->netdev;
+ struct hinic3_mac_filter *ftmp = NULL;
+ struct hinic3_mac_filter *f = NULL;
+ int err = 0, add_count = 0;
+
+ if (!list_empty(del_list)) {
+ list_for_each_entry_safe(f, ftmp, del_list, list) {
+ err = hinic3_uc_unsync(netdev, f->addr);
+			if (err) { /* ignore errors when deleting mac */
+ nic_err(&nic_dev->pdev->dev, "Failed to delete mac\n");
+ }
+
+ list_del(&f->list);
+ kfree(f);
+ }
+ }
+
+ if (!list_empty(add_list)) {
+ list_for_each_entry_safe(f, ftmp, add_list, list) {
+ err = hinic3_uc_sync(netdev, f->addr);
+ if (err) {
+ nic_err(&nic_dev->pdev->dev, "Failed to add mac\n");
+ return err;
+ }
+
+ add_count++;
+ list_del(&f->list);
+ kfree(f);
+ }
+ }
+
+ return add_count;
+}
+
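+/*
+ * Filter entries follow a simple state machine (HINIC3_MAC_* prefixes
+ * omitted):
+ *   WAIT_HW_SYNC -> HW_SYNCED when the address is programmed into hw;
+ *   HW_SYNCED -> WAIT_HW_UNSYNC -> HW_UNSYNCED when it is removed.
+ * Entries are moved or cloned onto temporary lists so that a failed hw sync
+ * can be rolled back before promisc/allmulti mode is forced on.
+ */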
+static int hinic3_mac_filter_sync(struct hinic3_nic_dev *nic_dev,
+ struct list_head *mac_filter_list, bool uc)
+{
+ struct net_device *netdev = nic_dev->netdev;
+ struct list_head tmp_del_list, tmp_add_list;
+ struct hinic3_mac_filter *fclone = NULL;
+ struct hinic3_mac_filter *ftmp = NULL;
+ struct hinic3_mac_filter *f = NULL;
+ int err = 0, add_count = 0;
+
+ INIT_LIST_HEAD(&tmp_del_list);
+ INIT_LIST_HEAD(&tmp_add_list);
+
+ list_for_each_entry_safe(f, ftmp, mac_filter_list, list) {
+ if (f->state != HINIC3_MAC_WAIT_HW_UNSYNC)
+ continue;
+
+ f->state = HINIC3_MAC_HW_UNSYNCED;
+ list_move_tail(&f->list, &tmp_del_list);
+ }
+
+ list_for_each_entry_safe(f, ftmp, mac_filter_list, list) {
+ if (f->state != HINIC3_MAC_WAIT_HW_SYNC)
+ continue;
+
+ fclone = hinic3_mac_filter_entry_clone(f);
+ if (!fclone) {
+ err = -ENOMEM;
+ break;
+ }
+
+ f->state = HINIC3_MAC_HW_SYNCED;
+ list_add_tail(&fclone->list, &tmp_add_list);
+ }
+
+ if (err) {
+ hinic3_undo_del_filter_entries(mac_filter_list, &tmp_del_list);
+ hinic3_undo_add_filter_entries(mac_filter_list, &tmp_add_list);
+ nicif_err(nic_dev, drv, netdev, "Failed to clone mac_filter_entry\n");
+
+ hinic3_cleanup_filter_list(&tmp_del_list);
+ hinic3_cleanup_filter_list(&tmp_add_list);
+ return -ENOMEM;
+ }
+
+ add_count = hinic3_mac_filter_sync_hw(nic_dev, &tmp_del_list,
+ &tmp_add_list);
+ if (list_empty(&tmp_add_list))
+ return add_count;
+
+	/* errors occurred when adding macs to hw, delete all macs added to hw */
+ hinic3_undo_add_filter_entries(mac_filter_list, &tmp_add_list);
+	/* VFs can't enter promisc mode,
+	 * so we can't delete any other uc mac
+	 */
+ if (!HINIC3_FUNC_IS_VF(nic_dev->hwdev) || !uc) {
+ list_for_each_entry_safe(f, ftmp, mac_filter_list, list) {
+ if (f->state != HINIC3_MAC_HW_SYNCED)
+ continue;
+
+ fclone = hinic3_mac_filter_entry_clone(f);
+ if (!fclone)
+ break;
+
+ f->state = HINIC3_MAC_WAIT_HW_SYNC;
+ list_add_tail(&fclone->list, &tmp_del_list);
+ }
+ }
+
+ hinic3_cleanup_filter_list(&tmp_add_list);
+ hinic3_mac_filter_sync_hw(nic_dev, &tmp_del_list, &tmp_add_list);
+
+ /* need to enter promisc/allmulti mode */
+ return -ENOMEM;
+}
+
+static void hinic3_mac_filter_sync_all(struct hinic3_nic_dev *nic_dev)
+{
+ struct net_device *netdev = nic_dev->netdev;
+ int add_count;
+
+ if (test_bit(HINIC3_MAC_FILTER_CHANGED, &nic_dev->flags)) {
+ clear_bit(HINIC3_MAC_FILTER_CHANGED, &nic_dev->flags);
+ add_count = hinic3_mac_filter_sync(nic_dev,
+ &nic_dev->uc_filter_list,
+ true);
+ if (add_count < 0 && HINIC3_SUPPORT_PROMISC(nic_dev->hwdev)) {
+ set_bit(HINIC3_PROMISC_FORCE_ON,
+ &nic_dev->rx_mod_state);
+ nicif_info(nic_dev, drv, netdev, "Promisc mode forced on\n");
+ } else if (add_count) {
+ clear_bit(HINIC3_PROMISC_FORCE_ON,
+ &nic_dev->rx_mod_state);
+ }
+
+ add_count = hinic3_mac_filter_sync(nic_dev,
+ &nic_dev->mc_filter_list,
+ false);
+ if (add_count < 0 && HINIC3_SUPPORT_ALLMULTI(nic_dev->hwdev)) {
+ set_bit(HINIC3_ALLMULTI_FORCE_ON,
+ &nic_dev->rx_mod_state);
+ nicif_info(nic_dev, drv, netdev, "All multicast mode forced on\n");
+ } else if (add_count) {
+ clear_bit(HINIC3_ALLMULTI_FORCE_ON,
+ &nic_dev->rx_mod_state);
+ }
+ }
+}
+
+#define HINIC3_DEFAULT_RX_MODE (NIC_RX_MODE_UC | NIC_RX_MODE_MC | \
+ NIC_RX_MODE_BC)
+
+static void hinic3_update_mac_filter(struct hinic3_nic_dev *nic_dev,
+ const struct netdev_hw_addr_list *src_list,
+ struct list_head *filter_list)
+{
+ struct hinic3_mac_filter *filter = NULL;
+ struct hinic3_mac_filter *ftmp = NULL;
+ struct hinic3_mac_filter *f = NULL;
+ struct netdev_hw_addr *ha = NULL;
+
+ /* add addr if not already in the filter list */
+ netif_addr_lock_bh(nic_dev->netdev);
+ netdev_hw_addr_list_for_each(ha, src_list) {
+ filter = hinic3_find_mac(filter_list, ha->addr);
+ if (!filter)
+ hinic3_add_filter(nic_dev, filter_list, ha->addr);
+ else if (filter->state == HINIC3_MAC_WAIT_HW_UNSYNC)
+ filter->state = HINIC3_MAC_HW_SYNCED;
+ }
+ netif_addr_unlock_bh(nic_dev->netdev);
+
+ /* delete addr if not in netdev list */
+ list_for_each_entry_safe(f, ftmp, filter_list, list) {
+ bool found = false;
+
+ netif_addr_lock_bh(nic_dev->netdev);
+ netdev_hw_addr_list_for_each(ha, src_list)
+ if (ether_addr_equal(ha->addr, f->addr)) {
+ found = true;
+ break;
+ }
+ netif_addr_unlock_bh(nic_dev->netdev);
+
+ if (found)
+ continue;
+
+ hinic3_del_filter(nic_dev, f);
+ }
+}
+
+#ifndef NETDEV_HW_ADDR_T_MULTICAST
+static void hinic3_update_mc_filter(struct hinic3_nic_dev *nic_dev,
+ struct list_head *filter_list)
+{
+ struct hinic3_mac_filter *filter = NULL;
+ struct hinic3_mac_filter *ftmp = NULL;
+ struct hinic3_mac_filter *f = NULL;
+ struct dev_mc_list *ha = NULL;
+
+ /* add addr if not already in the filter list */
+ netif_addr_lock_bh(nic_dev->netdev);
+ netdev_for_each_mc_addr(ha, nic_dev->netdev) {
+ filter = hinic3_find_mac(filter_list, ha->da_addr);
+ if (!filter)
+ hinic3_add_filter(nic_dev, filter_list, ha->da_addr);
+ else if (filter->state == HINIC3_MAC_WAIT_HW_UNSYNC)
+ filter->state = HINIC3_MAC_HW_SYNCED;
+ }
+ netif_addr_unlock_bh(nic_dev->netdev);
+ /* delete addr if not in netdev list */
+ list_for_each_entry_safe(f, ftmp, filter_list, list) {
+ bool found = false;
+
+ netif_addr_lock_bh(nic_dev->netdev);
+ netdev_for_each_mc_addr(ha, nic_dev->netdev)
+ if (ether_addr_equal(ha->da_addr, f->addr)) {
+ found = true;
+ break;
+ }
+ netif_addr_unlock_bh(nic_dev->netdev);
+
+ if (found)
+ continue;
+
+ hinic3_del_filter(nic_dev, f);
+ }
+}
+#endif
+
+static void update_mac_filter(struct hinic3_nic_dev *nic_dev)
+{
+ struct net_device *netdev = nic_dev->netdev;
+
+ if (test_and_clear_bit(HINIC3_UPDATE_MAC_FILTER, &nic_dev->flags)) {
+ hinic3_update_mac_filter(nic_dev, &netdev->uc,
+ &nic_dev->uc_filter_list);
+		/* FPGA mc filter has only 12 entries; mc is disabled by default */
+ if (set_filter_state) {
+#ifdef NETDEV_HW_ADDR_T_MULTICAST
+ hinic3_update_mac_filter(nic_dev, &netdev->mc,
+ &nic_dev->mc_filter_list);
+#else
+ hinic3_update_mc_filter(nic_dev,
+ &nic_dev->mc_filter_list);
+#endif
+ }
+ }
+}
+
+static void sync_rx_mode_to_hw(struct hinic3_nic_dev *nic_dev, int promisc_en,
+ int allmulti_en)
+{
+ struct net_device *netdev = nic_dev->netdev;
+ u32 rx_mod = HINIC3_DEFAULT_RX_MODE;
+ int err;
+
+ rx_mod |= (promisc_en ? NIC_RX_MODE_PROMISC : 0);
+ rx_mod |= (allmulti_en ? NIC_RX_MODE_MC_ALL : 0);
+
+ if (promisc_en != test_bit(HINIC3_HW_PROMISC_ON,
+ &nic_dev->rx_mod_state))
+ nicif_info(nic_dev, drv, netdev,
+ "%s promisc mode\n",
+			   promisc_en ? "Enter" : "Leave");
+ if (allmulti_en !=
+ test_bit(HINIC3_HW_ALLMULTI_ON, &nic_dev->rx_mod_state))
+ nicif_info(nic_dev, drv, netdev,
+ "%s all_multi mode\n",
+			   allmulti_en ? "Enter" : "Leave");
+
+ err = hinic3_set_rx_mode(nic_dev->hwdev, rx_mod);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev, "Failed to set rx_mode\n");
+ return;
+ }
+
+ promisc_en ? set_bit(HINIC3_HW_PROMISC_ON, &nic_dev->rx_mod_state) :
+ clear_bit(HINIC3_HW_PROMISC_ON, &nic_dev->rx_mod_state);
+
+ allmulti_en ? set_bit(HINIC3_HW_ALLMULTI_ON, &nic_dev->rx_mod_state) :
+ clear_bit(HINIC3_HW_ALLMULTI_ON, &nic_dev->rx_mod_state);
+}
+
+void hinic3_set_rx_mode_work(struct work_struct *work)
+{
+ struct hinic3_nic_dev *nic_dev =
+ container_of(work, struct hinic3_nic_dev, rx_mode_work);
+ struct net_device *netdev = nic_dev->netdev;
+ int promisc_en = 0, allmulti_en = 0;
+
+ update_mac_filter(nic_dev);
+
+ hinic3_mac_filter_sync_all(nic_dev);
+
+ if (HINIC3_SUPPORT_PROMISC(nic_dev->hwdev))
+ promisc_en = !!(netdev->flags & IFF_PROMISC) ||
+ test_bit(HINIC3_PROMISC_FORCE_ON,
+ &nic_dev->rx_mod_state);
+
+ if (HINIC3_SUPPORT_ALLMULTI(nic_dev->hwdev))
+ allmulti_en = !!(netdev->flags & IFF_ALLMULTI) ||
+ test_bit(HINIC3_ALLMULTI_FORCE_ON,
+ &nic_dev->rx_mod_state);
+
+ if (promisc_en !=
+ test_bit(HINIC3_HW_PROMISC_ON, &nic_dev->rx_mod_state) ||
+ allmulti_en !=
+ test_bit(HINIC3_HW_ALLMULTI_ON, &nic_dev->rx_mod_state))
+ sync_rx_mode_to_hw(nic_dev, promisc_en, allmulti_en);
+}
+
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_hw.h b/drivers/net/ethernet/huawei/hinic3/hinic3_hw.h
new file mode 100644
index 000000000000..34888e3d3535
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_hw.h
@@ -0,0 +1,828 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_HW_H
+#define HINIC3_HW_H
+
+#include "hinic3_comm_cmd.h"
+#include "comm_msg_intf.h"
+#include "comm_cmdq_intf.h"
+
+#include "hinic3_crm.h"
+
+#ifndef BIG_ENDIAN
+#define BIG_ENDIAN 0x4321
+#endif
+
+#ifndef LITTLE_ENDIAN
+#define LITTLE_ENDIAN 0x1234
+#endif
+
+#ifdef BYTE_ORDER
+#undef BYTE_ORDER
+#endif
+/* X86 */
+#define BYTE_ORDER LITTLE_ENDIAN
+
+/* to use 0-level CLA, page size must be: SQ 16B(wqe) * 64k(max_q_depth) */
+#define HINIC3_DEFAULT_WQ_PAGE_SIZE 0x100000
+#define HINIC3_HW_WQ_PAGE_SIZE 0x1000
+#define HINIC3_MAX_WQ_PAGE_SIZE_ORDER 8
+#define SPU_HOST_ID 4
+
+enum hinic3_channel_id {
+ HINIC3_CHANNEL_DEFAULT,
+ HINIC3_CHANNEL_COMM,
+ HINIC3_CHANNEL_NIC,
+ HINIC3_CHANNEL_ROCE,
+ HINIC3_CHANNEL_TOE,
+ HINIC3_CHANNEL_FC,
+ HINIC3_CHANNEL_OVS,
+ HINIC3_CHANNEL_DSW,
+ HINIC3_CHANNEL_MIG,
+ HINIC3_CHANNEL_CRYPT,
+
+ HINIC3_CHANNEL_MAX = 32,
+};
+
+struct hinic3_cmd_buf {
+ void *buf;
+ dma_addr_t dma_addr;
+ u16 size;
+ /* Usage count, USERS DO NOT USE */
+ atomic_t ref_cnt;
+};
+
+enum hinic3_aeq_type {
+ HINIC3_HW_INTER_INT = 0,
+ HINIC3_MBX_FROM_FUNC = 1,
+ HINIC3_MSG_FROM_MGMT_CPU = 2,
+ HINIC3_API_RSP = 3,
+ HINIC3_API_CHAIN_STS = 4,
+ HINIC3_MBX_SEND_RSLT = 5,
+ HINIC3_MAX_AEQ_EVENTS
+};
+
+enum hinic3_aeq_sw_type {
+ HINIC3_STATELESS_EVENT = 0,
+ HINIC3_STATEFUL_EVENT = 1,
+ HINIC3_MAX_AEQ_SW_EVENTS
+};
+
+enum hinic3_hwdev_init_state {
+ HINIC3_HWDEV_NONE_INITED = 0,
+ HINIC3_HWDEV_MGMT_INITED,
+ HINIC3_HWDEV_MBOX_INITED,
+ HINIC3_HWDEV_CMDQ_INITED,
+};
+
+enum hinic3_ceq_event {
+ HINIC3_NON_L2NIC_SCQ,
+ HINIC3_NON_L2NIC_ECQ,
+ HINIC3_NON_L2NIC_NO_CQ_EQ,
+ HINIC3_CMDQ,
+ HINIC3_L2NIC_SQ,
+ HINIC3_L2NIC_RQ,
+ HINIC3_MAX_CEQ_EVENTS,
+};
+
+enum hinic3_mbox_seg_errcode {
+ MBOX_ERRCODE_NO_ERRORS = 0,
+ /* VF send the mailbox data to the wrong destination functions */
+ MBOX_ERRCODE_VF_TO_WRONG_FUNC = 0x100,
+ /* PPF send the mailbox data to the wrong destination functions */
+ MBOX_ERRCODE_PPF_TO_WRONG_FUNC = 0x200,
+ /* PF send the mailbox data to the wrong destination functions */
+ MBOX_ERRCODE_PF_TO_WRONG_FUNC = 0x300,
+ /* The mailbox data size is set to all zero */
+ MBOX_ERRCODE_ZERO_DATA_SIZE = 0x400,
+ /* The sender function attribute has not been learned by hardware */
+ MBOX_ERRCODE_UNKNOWN_SRC_FUNC = 0x500,
+ /* The receiver function attr has not been learned by hardware */
+ MBOX_ERRCODE_UNKNOWN_DES_FUNC = 0x600,
+};
+
+struct hinic3_ceq_info {
+ u32 q_len;
+ u32 page_size;
+ u16 elem_size;
+ u16 num_pages;
+ u32 num_elem_in_pg;
+};
+
+typedef void (*hinic3_aeq_hwe_cb)(void *pri_handle, u8 *data, u8 size);
+typedef u8 (*hinic3_aeq_swe_cb)(void *pri_handle, u8 event, u8 *data);
+typedef void (*hinic3_ceq_event_cb)(void *pri_handle, u32 ceqe_data);
+
+typedef int (*hinic3_vf_mbox_cb)(void *pri_handle,
+ u16 cmd, void *buf_in, u16 in_size, void *buf_out, u16 *out_size);
+
+typedef int (*hinic3_pf_mbox_cb)(void *pri_handle,
+ u16 vf_id, u16 cmd, void *buf_in, u16 in_size, void *buf_out, u16 *out_size);
+
+typedef int (*hinic3_ppf_mbox_cb)(void *pri_handle, u16 pf_idx,
+ u16 vf_id, u16 cmd, void *buf_in, u16 in_size, void *buf_out, u16 *out_size);
+
+typedef int (*hinic3_pf_recv_from_ppf_mbox_cb)(void *pri_handle,
+ u16 cmd, void *buf_in, u16 in_size, void *buf_out, u16 *out_size);
+
+/**
+ * @brief hinic3_aeq_register_hw_cb - register aeq hardware callback
+ * @param hwdev: device pointer to hwdev
+ * @param pri_handle: private data will be used by the callback
+ * @param event: event type
+ * @param hwe_cb: callback function
+ * @retval zero: success
+ * @retval non-zero: failure
+ **/
+int hinic3_aeq_register_hw_cb(void *hwdev, void *pri_handle,
+ enum hinic3_aeq_type event, hinic3_aeq_hwe_cb hwe_cb);
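+
+/*
+ * Example usage (illustrative only; the handler name below is hypothetical):
+ *
+ *	static void my_aeqe_handler(void *pri_handle, u8 *data, u8 size);
+ *
+ *	hinic3_aeq_register_hw_cb(hwdev, nic_dev, HINIC3_MSG_FROM_MGMT_CPU,
+ *				  my_aeqe_handler);
+ */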
+
+/**
+ * @brief hinic3_aeq_unregister_hw_cb - unregister aeq hardware callback
+ * @param hwdev: device pointer to hwdev
+ * @param event: event type
+ **/
+void hinic3_aeq_unregister_hw_cb(void *hwdev, enum hinic3_aeq_type event);
+
+/**
+ * @brief hinic3_aeq_register_swe_cb - register aeq soft event callback
+ * @param hwdev: device pointer to hwdev
+ * @param pri_handle: the pointer to the private invoker device
+ * @param event: event type
+ * @param aeq_swe_cb: callback function
+ * @retval zero: success
+ * @retval non-zero: failure
+ **/
+int hinic3_aeq_register_swe_cb(void *hwdev, void *pri_handle, enum hinic3_aeq_sw_type event,
+ hinic3_aeq_swe_cb aeq_swe_cb);
+
+/**
+ * @brief hinic3_aeq_unregister_swe_cb - unregister aeq soft event callback
+ * @param hwdev: device pointer to hwdev
+ * @param event: event type
+ **/
+void hinic3_aeq_unregister_swe_cb(void *hwdev, enum hinic3_aeq_sw_type event);
+
+/**
+ * @brief hinic3_ceq_register_cb - register ceq callback
+ * @param hwdev: device pointer to hwdev
+ * @param pri_handle: private data will be used by the callback
+ * @param event: event type
+ * @param callback: callback function
+ * @retval zero: success
+ * @retval non-zero: failure
+ **/
+int hinic3_ceq_register_cb(void *hwdev, void *pri_handle, enum hinic3_ceq_event event,
+ hinic3_ceq_event_cb callback);
+/**
+ * @brief hinic3_ceq_unregister_cb - unregister ceq callback
+ * @param hwdev: device pointer to hwdev
+ * @param event: event type
+ **/
+void hinic3_ceq_unregister_cb(void *hwdev, enum hinic3_ceq_event event);
+
+/**
+ * @brief hinic3_register_ppf_mbox_cb - ppf register mbox msg callback
+ * @param hwdev: device pointer to hwdev
+ * @param mod: mod type
+ * @param pri_handle: private data will be used by the callback
+ * @param callback: callback function
+ * @retval zero: success
+ * @retval non-zero: failure
+ **/
+int hinic3_register_ppf_mbox_cb(void *hwdev, u8 mod, void *pri_handle,
+ hinic3_ppf_mbox_cb callback);
+
+/**
+ * @brief hinic3_register_pf_mbox_cb - pf register mbox msg callback
+ * @param hwdev: device pointer to hwdev
+ * @param mod: mod type
+ * @param pri_handle: private data will be used by the callback
+ * @param callback: callback function
+ * @retval zero: success
+ * @retval non-zero: failure
+ **/
+int hinic3_register_pf_mbox_cb(void *hwdev, u8 mod, void *pri_handle,
+ hinic3_pf_mbox_cb callback);
+/**
+ * @brief hinic3_register_vf_mbox_cb - vf register mbox msg callback
+ * @param hwdev: device pointer to hwdev
+ * @param mod: mod type
+ * @param pri_handle: private data will be used by the callback
+ * @param callback: callback function
+ * @retval zero: success
+ * @retval non-zero: failure
+ **/
+int hinic3_register_vf_mbox_cb(void *hwdev, u8 mod, void *pri_handle,
+ hinic3_vf_mbox_cb callback);
+
+/**
+ * @brief hinic3_unregister_ppf_mbox_cb - ppf unregister mbox msg callback
+ * @param hwdev: device pointer to hwdev
+ * @param mod: mod type
+ **/
+void hinic3_unregister_ppf_mbox_cb(void *hwdev, u8 mod);
+
+/**
+ * @brief hinic3_unregister_pf_mbox_cb - pf unregister mbox msg callback
+ * @param hwdev: device pointer to hwdev
+ * @param mod: mod type
+ **/
+void hinic3_unregister_pf_mbox_cb(void *hwdev, u8 mod);
+
+/**
+ * @brief hinic3_unregister_vf_mbox_cb - vf unregister mbox msg callback
+ * @param hwdev: device pointer to hwdev
+ * @param mod: mod type
+ **/
+void hinic3_unregister_vf_mbox_cb(void *hwdev, u8 mod);
+
+/**
+ * @brief hinic3_unregister_ppf_to_pf_mbox_cb - unregister mbox msg callback
+ * @param hwdev: device pointer to hwdev
+ * @param mod: mod type
+ **/
+void hinic3_unregister_ppf_to_pf_mbox_cb(void *hwdev, u8 mod);
+
+typedef void (*hinic3_mgmt_msg_cb)(void *pri_handle,
+ u16 cmd, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size);
+
+/**
+ * @brief hinic3_register_mgmt_msg_cb - register mgmt msg callback
+ * @param hwdev: device pointer to hwdev
+ * @param mod: mod type
+ * @param pri_handle: private data will be used by the callback
+ * @param callback: callback function
+ * @retval zero: success
+ * @retval non-zero: failure
+ **/
+int hinic3_register_mgmt_msg_cb(void *hwdev, u8 mod, void *pri_handle,
+ hinic3_mgmt_msg_cb callback);
+
+/**
+ * @brief hinic3_unregister_mgmt_msg_cb - unregister mgmt msg callback
+ * @param hwdev: device pointer to hwdev
+ * @param mod: mod type
+ **/
+void hinic3_unregister_mgmt_msg_cb(void *hwdev, u8 mod);
+
+/**
+ * @brief hinic3_register_service_adapter - register service adapter
+ * @param hwdev: device pointer to hwdev
+ * @param service_adapter: service adapter
+ * @param type: service type
+ * @retval zero: success
+ * @retval non-zero: failure
+ **/
+int hinic3_register_service_adapter(void *hwdev, void *service_adapter,
+ enum hinic3_service_type type);
+
+/**
+ * @brief hinic3_unregister_service_adapter - unregister service adapter
+ * @param hwdev: device pointer to hwdev
+ * @param type: service type
+ **/
+void hinic3_unregister_service_adapter(void *hwdev,
+ enum hinic3_service_type type);
+
+/**
+ * @brief hinic3_get_service_adapter - get service adapter
+ * @param hwdev: device pointer to hwdev
+ * @param type: service type
+ * @retval non-zero: success
+ * @retval null: failure
+ **/
+void *hinic3_get_service_adapter(void *hwdev, enum hinic3_service_type type);
+
+/**
+ * @brief hinic3_alloc_db_phy_addr - alloc doorbell & direct wqe physical addr
+ * @param hwdev: device pointer to hwdev
+ * @param db_base: pointer to alloc doorbell base address
+ * @param dwqe_base: pointer to alloc direct wqe base address
+ * @retval zero: success
+ * @retval non-zero: failure
+ **/
+int hinic3_alloc_db_phy_addr(void *hwdev, u64 *db_base, u64 *dwqe_base);
+
+/**
+ * @brief hinic3_free_db_phy_addr - free doorbell & direct wqe physical address
+ * @param hwdev: device pointer to hwdev
+ * @param db_base: doorbell base address to free
+ * @param dwqe_base: direct wqe base address to free
+ **/
+void hinic3_free_db_phy_addr(void *hwdev, u64 db_base, u64 dwqe_base);
+
+/**
+ * @brief hinic3_alloc_db_addr - alloc doorbell & direct wqe
+ * @param hwdev: device pointer to hwdev
+ * @param db_base: pointer to alloc doorbell base address
+ * @param dwqe_base: pointer to alloc direct wqe base address
+ * @retval zero: success
+ * @retval non-zero: failure
+ **/
+int hinic3_alloc_db_addr(void *hwdev, void __iomem **db_base,
+ void __iomem **dwqe_base);
+
+/**
+ * @brief hinic3_free_db_addr - free doorbell & direct wqe
+ * @param hwdev: device pointer to hwdev
+ * @param db_base: doorbell base address to free
+ * @param dwqe_base: direct wqe base address to free
+ **/
+void hinic3_free_db_addr(void *hwdev, const void __iomem *db_base,
+ void __iomem *dwqe_base);
+
+/**
+ * @brief hinic3_set_root_ctxt - set root context
+ * @param hwdev: device pointer to hwdev
+ * @param rq_depth: rq depth
+ * @param sq_depth: sq depth
+ * @param rx_buf_sz: rx buffer size
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ **/
+int hinic3_set_root_ctxt(void *hwdev, u32 rq_depth, u32 sq_depth,
+ int rx_buf_sz, u16 channel);
+
+/**
+ * @brief hinic3_clean_root_ctxt - clean root context
+ * @param hwdev: device pointer to hwdev
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ **/
+int hinic3_clean_root_ctxt(void *hwdev, u16 channel);
+
+/**
+ * @brief hinic3_alloc_cmd_buf - alloc cmd buffer
+ * @param hwdev: device pointer to hwdev
+ * @retval non-zero: success
+ * @retval null: failure
+ **/
+struct hinic3_cmd_buf *hinic3_alloc_cmd_buf(void *hwdev);
+
+/**
+ * @brief hinic3_free_cmd_buf - free cmd buffer
+ * @param hwdev: device pointer to hwdev
+ * @param cmd_buf: cmd buffer to free
+ **/
+void hinic3_free_cmd_buf(void *hwdev, struct hinic3_cmd_buf *cmd_buf);
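+
+/*
+ * Typical command buffer life cycle (sketch; how the buffer is filled
+ * depends on the command being issued):
+ *
+ *	struct hinic3_cmd_buf *buf = hinic3_alloc_cmd_buf(hwdev);
+ *
+ *	if (!buf)
+ *		return -ENOMEM;
+ *	... fill the buffer and issue the command ...
+ *	hinic3_free_cmd_buf(hwdev, buf);
+ */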
+
+/**
+ * hinic3_sm_ctr_rd16 - small single 16-bit counter read
+ * @hwdev: the hardware device
+ * @node: the node id
+ * @instance: instance id
+ * @ctr_id: counter id
+ * @value: read counter value ptr
+ * Return: 0 - success, negative - failure
+ **/
+int hinic3_sm_ctr_rd16(void *hwdev, u8 node, u8 instance, u32 ctr_id, u16 *value);
+
+/**
+ * @brief hinic3_sm_ctr_rd32 - small single 32-bit counter read
+ * @param hwdev: device pointer to hwdev
+ * @param node: the node id
+ * @param instance: instance id
+ * @param ctr_id: counter id
+ * @param value: read counter value ptr
+ * @retval zero: success
+ * @retval non-zero: failure
+ **/
+int hinic3_sm_ctr_rd32(void *hwdev, u8 node, u8 instance, u32 ctr_id,
+ u32 *value);
+/**
+ * @brief hinic3_sm_ctr_rd32_clear - small single 32-bit counter read and clear
+ * @param hwdev: device pointer to hwdev
+ * @param node: the node id
+ * @param instance: instance id
+ * @param ctr_id: counter id
+ * @param value: read counter value ptr
+ * @retval zero: success
+ * @retval non-zero: failure
+ **/
+int hinic3_sm_ctr_rd32_clear(void *hwdev, u8 node, u8 instance,
+ u32 ctr_id, u32 *value);
+
+/**
+ * @brief hinic3_sm_ctr_rd64_pair - big pair 128-bit counter read
+ * @param hwdev: device pointer to hwdev
+ * @param node: the node id
+ * @param instance: instance id
+ * @param ctr_id: counter id
+ * @param value1: read counter value ptr
+ * @param value2: read counter value ptr
+ * @retval zero: success
+ * @retval non-zero: failure
+ **/
+int hinic3_sm_ctr_rd64_pair(void *hwdev, u8 node, u8 instance,
+ u32 ctr_id, u64 *value1, u64 *value2);
+
+/**
+ * hinic3_sm_ctr_rd64_pair_clear - big pair 128-bit counter read and clear
+ * @hwdev: the hardware device
+ * @node: the node id
+ * @instance: instance id
+ * @ctr_id: counter id
+ * @value1: read counter value ptr
+ * @value2: read counter value ptr
+ * Return: 0 - success, negative - failure
+ **/
+int hinic3_sm_ctr_rd64_pair_clear(void *hwdev, u8 node, u8 instance,
+ u32 ctr_id, u64 *value1, u64 *value2);
+
+/**
+ * @brief hinic3_sm_ctr_rd64 - big single 64-bit counter read
+ * @param hwdev: device pointer to hwdev
+ * @param node: the node id
+ * @param instance: instance id
+ * @param ctr_id: counter id
+ * @param value: read counter value ptr
+ * @retval zero: success
+ * @retval non-zero: failure
+ **/
+int hinic3_sm_ctr_rd64(void *hwdev, u8 node, u8 instance, u32 ctr_id,
+ u64 *value);
+
+/**
+ * hinic3_sm_ctr_rd64_clear - big single 64-bit counter read and clear
+ * @hwdev: the hardware device
+ * @node: the node id
+ * @instance: instance id
+ * @ctr_id: counter id
+ * @value: read counter value ptr
+ * Return: 0 - success, negative - failure
+ **/
+int hinic3_sm_ctr_rd64_clear(void *hwdev, u8 node, u8 instance,
+ u32 ctr_id, u64 *value);
+
+/**
+ * @brief hinic3_api_csr_rd32 - read 32-bit csr
+ * @param hwdev: device pointer to hwdev
+ * @param dest: hardware node id
+ * @param addr: reg address
+ * @param val: reg value
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_api_csr_rd32(void *hwdev, u8 dest, u32 addr, u32 *val);
+
+/**
+ * @brief hinic3_api_csr_wr32 - write 32-bit csr
+ * @param hwdev: device pointer to hwdev
+ * @param dest: hardware node id
+ * @param addr: reg address
+ * @param val: reg value
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_api_csr_wr32(void *hwdev, u8 dest, u32 addr, u32 val);
+
+/**
+ * @brief hinic3_api_csr_rd64 - read 64-bit csr
+ * @param hwdev: device pointer to hwdev
+ * @param dest: hardware node id
+ * @param addr: reg address
+ * @param val: reg value
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_api_csr_rd64(void *hwdev, u8 dest, u32 addr, u64 *val);
+
+/**
+ * @brief hinic3_dbg_get_hw_stats - get hardware stats
+ * @param hwdev: device pointer to hwdev
+ * @param hw_stats: pointer to memory allocated by the caller
+ * @param out_size: out size
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_dbg_get_hw_stats(const void *hwdev, u8 *hw_stats, const u16 *out_size);
+
+/**
+ * @brief hinic3_dbg_clear_hw_stats - clear hardware stats
+ * @param hwdev: device pointer to hwdev
+ * @retval size of the cleared hardware stats
+ */
+u16 hinic3_dbg_clear_hw_stats(void *hwdev);
+
+/**
+ * @brief hinic3_get_chip_fault_stats - get chip fault stats
+ * @param hwdev: device pointer to hwdev
+ * @param chip_fault_stats: pointer to memory allocated by the caller
+ * @param offset: offset
+ */
+void hinic3_get_chip_fault_stats(const void *hwdev, u8 *chip_fault_stats,
+ u32 offset);
+
+/**
+ * @brief hinic3_msg_to_mgmt_sync - msg to management cpu
+ * @param hwdev: device pointer to hwdev
+ * @param mod: mod type
+ * @param cmd: cmd
+ * @param buf_in: message buffer in
+ * @param in_size: in buffer size
+ * @param buf_out: message buffer out
+ * @param out_size: out buffer size
+ * @param timeout: timeout
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_msg_to_mgmt_sync(void *hwdev, u8 mod, u16 cmd, void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size,
+ u32 timeout, u16 channel);
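+
+/*
+ * Usage sketch, mirroring the pattern used throughout this driver ("req" and
+ * "resp" are placeholder request/response structures whose first member is a
+ * struct mgmt_msg_head):
+ *
+ *	u16 out_size = sizeof(resp);
+ *
+ *	err = hinic3_msg_to_mgmt_sync(hwdev, mod, cmd, &req, sizeof(req),
+ *				      &resp, &out_size, 0, HINIC3_CHANNEL_NIC);
+ *	if (err || !out_size || resp.head.status)
+ *		return -EIO;
+ */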
+
+/**
+ * @brief hinic3_msg_to_mgmt_async - msg to management cpu async
+ * @param hwdev: device pointer to hwdev
+ * @param mod: mod type
+ * @param cmd: cmd
+ * @param buf_in: message buffer in
+ * @param in_size: in buffer size
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ *
+ * This function does not sleep, so it can be used in irq context.
+ */
+int hinic3_msg_to_mgmt_async(void *hwdev, u8 mod, u16 cmd, void *buf_in,
+ u16 in_size, u16 channel);
+
+/**
+ * @brief hinic3_msg_to_mgmt_no_ack - msg to management cpu without ack
+ * @param hwdev: device pointer to hwdev
+ * @param mod: mod type
+ * @param cmd: cmd
+ * @param buf_in: message buffer in
+ * @param in_size: in buffer size
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ *
+ * This function sleeps internally and must not be used in interrupt
+ * context.
+ */
+int hinic3_msg_to_mgmt_no_ack(void *hwdev, u8 mod, u16 cmd, void *buf_in,
+ u16 in_size, u16 channel);
+
+int hinic3_msg_to_mgmt_api_chain_async(void *hwdev, u8 mod, u16 cmd,
+ const void *buf_in, u16 in_size);
+
+int hinic3_msg_to_mgmt_api_chain_sync(void *hwdev, u8 mod, u16 cmd,
+ void *buf_in, u16 in_size, void *buf_out,
+ u16 *out_size, u32 timeout);
+
+/**
+ * @brief hinic3_mbox_to_pf - vf mbox message to pf
+ * @param hwdev: device pointer to hwdev
+ * @param mod: mod type
+ * @param cmd: cmd
+ * @param buf_in: message buffer in
+ * @param in_size: in buffer size
+ * @param buf_out: message buffer out
+ * @param out_size: out buffer size
+ * @param timeout: timeout
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_mbox_to_pf(void *hwdev, u8 mod, u16 cmd, void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size,
+ u32 timeout, u16 channel);
+
+/**
+ * @brief hinic3_mbox_to_vf - mbox message to vf
+ * @param hwdev: device pointer to hwdev
+ * @param vf_id: vf index
+ * @param mod: mod type
+ * @param cmd: cmd
+ * @param buf_in: message buffer in
+ * @param in_size: in buffer size
+ * @param buf_out: message buffer out
+ * @param out_size: out buffer size
+ * @param timeout: timeout
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_mbox_to_vf(void *hwdev, u16 vf_id, u8 mod, u16 cmd, void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size, u32 timeout,
+ u16 channel);
+
+int hinic3_clp_to_mgmt(void *hwdev, u8 mod, u16 cmd, const void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size);
+/**
+ * @brief hinic3_cmdq_async - cmdq asynchronous message
+ * @param hwdev: device pointer to hwdev
+ * @param mod: mod type
+ * @param cmd: cmd
+ * @param buf_in: message buffer in
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_cmdq_async(void *hwdev, u8 mod, u8 cmd, struct hinic3_cmd_buf *buf_in, u16 channel);
+
+/**
+ * @brief hinic3_cmdq_direct_resp - cmdq direct message response
+ * @param hwdev: device pointer to hwdev
+ * @param mod: mod type
+ * @param cmd: cmd
+ * @param buf_in: message buffer in
+ * @param out_param: inline output data
+ * @param timeout: timeout
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_cmdq_direct_resp(void *hwdev, u8 mod, u8 cmd,
+ struct hinic3_cmd_buf *buf_in,
+ u64 *out_param, u32 timeout, u16 channel);
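+
+/*
+ * Sketch of a direct-response command ("mod" and "cmd" are placeholders;
+ * the inline result is returned through out_param):
+ *
+ *	struct hinic3_cmd_buf *buf = hinic3_alloc_cmd_buf(hwdev);
+ *	u64 out_param = 0;
+ *
+ *	if (!buf)
+ *		return -ENOMEM;
+ *	... fill the buffer ...
+ *	err = hinic3_cmdq_direct_resp(hwdev, mod, cmd, buf, &out_param,
+ *				      0, HINIC3_CHANNEL_NIC);
+ *	hinic3_free_cmd_buf(hwdev, buf);
+ */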
+
+/**
+ * @brief hinic3_cmdq_detail_resp - cmdq detail message response
+ * @param hwdev: device pointer to hwdev
+ * @param mod: mod type
+ * @param cmd: cmd
+ * @param buf_in: message buffer in
+ * @param buf_out: message buffer out
+ * @param out_param: inline output data
+ * @param timeout: timeout
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_cmdq_detail_resp(void *hwdev, u8 mod, u8 cmd,
+ struct hinic3_cmd_buf *buf_in,
+ struct hinic3_cmd_buf *buf_out,
+ u64 *out_param, u32 timeout, u16 channel);
+
+/**
+ * @brief hinic3_cos_id_detail_resp - cmdq detail message response with cos id
+ * @param hwdev: device pointer to hwdev
+ * @param mod: mod type
+ * @param cmd: cmd
+ * @param cos_id: cos id
+ * @param buf_in: message buffer in
+ * @param buf_out: message buffer out
+ * @param out_param: inline output data
+ * @param timeout: timeout
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_cos_id_detail_resp(void *hwdev, u8 mod, u8 cmd, u8 cos_id,
+ struct hinic3_cmd_buf *buf_in,
+ struct hinic3_cmd_buf *buf_out,
+ u64 *out_param, u32 timeout, u16 channel);
+
+/**
+ * @brief hinic3_ppf_tmr_start - start ppf timer
+ * @param hwdev: device pointer to hwdev
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_ppf_tmr_start(void *hwdev);
+
+/**
+ * @brief hinic3_ppf_tmr_stop - stop ppf timer
+ * @param hwdev: device pointer to hwdev
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_ppf_tmr_stop(void *hwdev);
+
+/**
+ * @brief hinic3_func_tmr_bitmap_set - set timer bitmap status
+ * @param hwdev: device pointer to hwdev
+ * @param func_id: global function index
+ * @param en: false - disable, true - enable
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_func_tmr_bitmap_set(void *hwdev, u16 func_id, bool en);
+
+/**
+ * @brief hinic3_get_board_info - get board info
+ * @param hwdev: device pointer to hwdev
+ * @param info: board info
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_get_board_info(void *hwdev, struct hinic3_board_info *info,
+ u16 channel);
+
+/**
+ * @brief hinic3_set_wq_page_size - set work queue page size
+ * @param hwdev: device pointer to hwdev
+ * @param func_idx: function id
+ * @param page_size: page size
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_set_wq_page_size(void *hwdev, u16 func_idx, u32 page_size,
+ u16 channel);
+
+/**
+ * @brief hinic3_event_callback - event callback to notify service driver
+ * @param hwdev: device pointer to hwdev
+ * @param event: event info to service driver
+ */
+void hinic3_event_callback(void *hwdev, struct hinic3_event_info *event);
+
+/**
+ * @brief hinic3_dbg_lt_rd_16byte - linear table read
+ * @param hwdev: device pointer to hwdev
+ * @param dest: destination node id
+ * @param instance: instance id
+ * @param lt_index: linear table index
+ * @param data: data
+ */
+int hinic3_dbg_lt_rd_16byte(void *hwdev, u8 dest, u8 instance,
+ u32 lt_index, u8 *data);
+
+/**
+ * @brief hinic3_dbg_lt_wr_16byte_mask - linear table write with mask
+ * @param hwdev: device pointer to hwdev
+ * @param dest: destination node id
+ * @param instance: instance id
+ * @param lt_index: linear table index
+ * @param data: data
+ * @param mask: mask
+ */
+int hinic3_dbg_lt_wr_16byte_mask(void *hwdev, u8 dest, u8 instance,
+ u32 lt_index, u8 *data, u16 mask);
+
+/**
+ * @brief hinic3_link_event_stats - link event stats
+ * @param dev: device pointer to hwdev
+ * @param link: link status
+ */
+void hinic3_link_event_stats(void *dev, u8 link);
+
+/**
+ * @brief hinic3_get_hw_pf_infos - get pf infos
+ * @param hwdev: device pointer to hwdev
+ * @param infos: pf infos
+ * @param channel: channel id
+ */
+int hinic3_get_hw_pf_infos(void *hwdev, struct hinic3_hw_pf_infos *infos,
+ u16 channel);
+
+/**
+ * @brief hinic3_func_reset - reset func
+ * @param dev: device pointer to hwdev
+ * @param func_id: global function index
+ * @param reset_flag: reset flag
+ * @param channel: channel id
+ */
+int hinic3_func_reset(void *dev, u16 func_id, u64 reset_flag, u16 channel);
+
+int hinic3_get_ppf_timer_cfg(void *hwdev);
+
+int hinic3_set_bdf_ctxt(void *hwdev, u8 bus, u8 device, u8 function);
+
+int hinic3_init_func_mbox_msg_channel(void *hwdev, u16 num_func);
+
+int hinic3_ppf_ht_gpa_init(void *dev);
+
+void hinic3_ppf_ht_gpa_deinit(void *dev);
+
+int hinic3_get_sml_table_info(void *hwdev, u32 tbl_id, u8 *node_id, u8 *instance_id);
+
+int hinic3_mbox_ppf_to_host(void *hwdev, u8 mod, u16 cmd, u8 host_id,
+ void *buf_in, u16 in_size, void *buf_out,
+ u16 *out_size, u32 timeout, u16 channel);
+
+void hinic3_force_complete_all(void *dev);
+int hinic3_get_ceq_page_phy_addr(void *hwdev, u16 q_id,
+ u16 page_idx, u64 *page_phy_addr);
+int hinic3_set_ceq_irq_disable(void *hwdev, u16 q_id);
+int hinic3_get_ceq_info(void *hwdev, u16 q_id, struct hinic3_ceq_info *ceq_info);
+
+void hinic3_set_api_stop(void *hwdev);
+
+int hinic3_activate_firmware(void *hwdev, u8 cfg_index);
+int hinic3_switch_config(void *hwdev, u8 cfg_index);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_irq.c b/drivers/net/ethernet/huawei/hinic3/hinic3_irq.c
new file mode 100644
index 000000000000..3c835ff95e89
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_irq.c
@@ -0,0 +1,189 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC]" fmt
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/device.h>
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/interrupt.h>
+#include <linux/etherdevice.h>
+#include <linux/netdevice.h>
+#include <linux/debugfs.h>
+
+#include "hinic3_hw.h"
+#include "hinic3_crm.h"
+#include "hinic3_nic_io.h"
+#include "hinic3_nic_dev.h"
+#include "hinic3_tx.h"
+#include "hinic3_rx.h"
+
+int hinic3_poll(struct napi_struct *napi, int budget)
+{
+ int tx_pkts, rx_pkts;
+ struct hinic3_irq *irq_cfg =
+ container_of(napi, struct hinic3_irq, napi);
+ struct hinic3_nic_dev *nic_dev = netdev_priv(irq_cfg->netdev);
+
+ rx_pkts = hinic3_rx_poll(irq_cfg->rxq, budget);
+
+ tx_pkts = hinic3_tx_poll(irq_cfg->txq, budget);
+ if (tx_pkts >= budget || rx_pkts >= budget)
+ return budget;
+
+ napi_complete(napi);
+
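+	/* All completions fit within the budget: re-enable the queue's MSI-X
+	 * vector so the next event raises an interrupt again.
+	 */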
+ hinic3_set_msix_state(nic_dev->hwdev, irq_cfg->msix_entry_idx,
+ HINIC3_MSIX_ENABLE);
+
+ return max(tx_pkts, rx_pkts);
+}
+
+static void qp_add_napi(struct hinic3_irq *irq_cfg)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(irq_cfg->netdev);
+
+ netif_napi_add(nic_dev->netdev, &irq_cfg->napi,
+ hinic3_poll, nic_dev->poll_weight);
+ napi_enable(&irq_cfg->napi);
+}
+
+static void qp_del_napi(struct hinic3_irq *irq_cfg)
+{
+ napi_disable(&irq_cfg->napi);
+ netif_napi_del(&irq_cfg->napi);
+}
+
+static irqreturn_t qp_irq(int irq, void *data)
+{
+ struct hinic3_irq *irq_cfg = (struct hinic3_irq *)data;
+ struct hinic3_nic_dev *nic_dev = netdev_priv(irq_cfg->netdev);
+
+ hinic3_misx_intr_clear_resend_bit(nic_dev->hwdev, irq_cfg->msix_entry_idx, 1);
+
+ napi_schedule(&irq_cfg->napi);
+
+ return IRQ_HANDLED;
+}
+
+static int hinic3_request_irq(struct hinic3_irq *irq_cfg, u16 q_id)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(irq_cfg->netdev);
+ struct interrupt_info info = {0};
+ int err;
+
+ qp_add_napi(irq_cfg);
+
+ info.msix_index = irq_cfg->msix_entry_idx;
+ info.lli_set = 0;
+ info.interrupt_coalesc_set = 1;
+ info.pending_limt = nic_dev->intr_coalesce[q_id].pending_limt;
+ info.coalesc_timer_cfg =
+ nic_dev->intr_coalesce[q_id].coalesce_timer_cfg;
+ info.resend_timer_cfg = nic_dev->intr_coalesce[q_id].resend_timer_cfg;
+ nic_dev->rxqs[q_id].last_coalesc_timer_cfg =
+ nic_dev->intr_coalesce[q_id].coalesce_timer_cfg;
+ nic_dev->rxqs[q_id].last_pending_limt =
+ nic_dev->intr_coalesce[q_id].pending_limt;
+ err = hinic3_set_interrupt_cfg(nic_dev->hwdev, info,
+ HINIC3_CHANNEL_NIC);
+ if (err) {
+ nicif_err(nic_dev, drv, irq_cfg->netdev,
+ "Failed to set RX interrupt coalescing attribute.\n");
+ qp_del_napi(irq_cfg);
+ return err;
+ }
+
+ err = request_irq(irq_cfg->irq_id, &qp_irq, 0, irq_cfg->irq_name, irq_cfg);
+ if (err) {
+ nicif_err(nic_dev, drv, irq_cfg->netdev, "Failed to request Rx irq\n");
+ qp_del_napi(irq_cfg);
+ return err;
+ }
+
+ irq_set_affinity_hint(irq_cfg->irq_id, &irq_cfg->affinity_mask);
+
+ return 0;
+}
+
+static void hinic3_release_irq(struct hinic3_irq *irq_cfg)
+{
+ irq_set_affinity_hint(irq_cfg->irq_id, NULL);
+ synchronize_irq(irq_cfg->irq_id);
+ free_irq(irq_cfg->irq_id, irq_cfg);
+ qp_del_napi(irq_cfg);
+}
+
+int hinic3_qps_irq_init(struct hinic3_nic_dev *nic_dev)
+{
+ struct pci_dev *pdev = nic_dev->pdev;
+ struct irq_info *qp_irq_info = NULL;
+ struct hinic3_irq *irq_cfg = NULL;
+ u16 q_id, i;
+ u32 local_cpu;
+ int err;
+
+ for (q_id = 0; q_id < nic_dev->q_params.num_qps; q_id++) {
+ qp_irq_info = &nic_dev->qps_irq_info[q_id];
+ irq_cfg = &nic_dev->q_params.irq_cfg[q_id];
+
+ irq_cfg->irq_id = qp_irq_info->irq_id;
+ irq_cfg->msix_entry_idx = qp_irq_info->msix_entry_idx;
+ irq_cfg->netdev = nic_dev->netdev;
+ irq_cfg->txq = &nic_dev->txqs[q_id];
+ irq_cfg->rxq = &nic_dev->rxqs[q_id];
+ nic_dev->rxqs[q_id].irq_cfg = irq_cfg;
+
+ local_cpu = cpumask_local_spread(q_id, dev_to_node(&pdev->dev));
+ cpumask_set_cpu(local_cpu, &irq_cfg->affinity_mask);
+
+ err = snprintf(irq_cfg->irq_name, sizeof(irq_cfg->irq_name),
+ "%s_qp%u", nic_dev->netdev->name, q_id);
+ if (err < 0) {
+ err = -EINVAL;
+			goto req_qp_irq_err;
+ }
+
+ err = hinic3_request_irq(irq_cfg, q_id);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Failed to request Rx irq\n");
+			goto req_qp_irq_err;
+ }
+
+ hinic3_set_msix_auto_mask_state(nic_dev->hwdev, irq_cfg->msix_entry_idx,
+ HINIC3_SET_MSIX_AUTO_MASK);
+ hinic3_set_msix_state(nic_dev->hwdev, irq_cfg->msix_entry_idx, HINIC3_MSIX_ENABLE);
+ }
+
+ INIT_DELAYED_WORK(&nic_dev->moderation_task, hinic3_auto_moderation_work);
+
+ return 0;
+
+req_qp_irq_err:
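+	/* Unwind the queues that were fully initialized before the failure. */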
+ for (i = 0; i < q_id; i++) {
+ irq_cfg = &nic_dev->q_params.irq_cfg[i];
+ hinic3_set_msix_state(nic_dev->hwdev, irq_cfg->msix_entry_idx, HINIC3_MSIX_DISABLE);
+ hinic3_set_msix_auto_mask_state(nic_dev->hwdev, irq_cfg->msix_entry_idx,
+ HINIC3_CLR_MSIX_AUTO_MASK);
+ hinic3_release_irq(irq_cfg);
+ }
+
+ return err;
+}
+
+void hinic3_qps_irq_deinit(struct hinic3_nic_dev *nic_dev)
+{
+ struct hinic3_irq *irq_cfg = NULL;
+ u16 q_id;
+
+ for (q_id = 0; q_id < nic_dev->q_params.num_qps; q_id++) {
+ irq_cfg = &nic_dev->q_params.irq_cfg[q_id];
+ hinic3_set_msix_state(nic_dev->hwdev, irq_cfg->msix_entry_idx,
+ HINIC3_MSIX_DISABLE);
+ hinic3_set_msix_auto_mask_state(nic_dev->hwdev,
+ irq_cfg->msix_entry_idx,
+ HINIC3_CLR_MSIX_AUTO_MASK);
+ hinic3_release_irq(irq_cfg);
+ }
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_lld.h b/drivers/net/ethernet/huawei/hinic3/hinic3_lld.h
new file mode 100644
index 000000000000..656b49f8ad6c
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_lld.h
@@ -0,0 +1,204 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_LLD_H
+#define HINIC3_LLD_H
+
+#include "hinic3_crm.h"
+
+struct hinic3_lld_dev {
+ struct pci_dev *pdev;
+ void *hwdev;
+};
+
+struct hinic3_uld_info {
+	/* If the function does not need to initialize the corresponding uld,
+	 * @probe must return 0 and set uld_dev to NULL; when uld_dev is NULL,
+	 * @remove will not be called at uninstall time.
+	 */
+ int (*probe)(struct hinic3_lld_dev *lld_dev, void **uld_dev, char *uld_dev_name);
+ void (*remove)(struct hinic3_lld_dev *lld_dev, void *uld_dev);
+ int (*suspend)(struct hinic3_lld_dev *lld_dev, void *uld_dev, pm_message_t state);
+ int (*resume)(struct hinic3_lld_dev *lld_dev, void *uld_dev);
+ void (*event)(struct hinic3_lld_dev *lld_dev, void *uld_dev,
+ struct hinic3_event_info *event);
+ int (*ioctl)(void *uld_dev, u32 cmd, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size);
+};
+
+/**
+ * hinic3_register_uld - register an upper-layer driver
+ * @type: uld service type
+ * @uld_info: uld callback
+ *
+ * Registers an upper-layer driver, traversing existing devices and calling
+ * @probe to initialize the uld device on each.
+ */
+int hinic3_register_uld(enum hinic3_service_type type, struct hinic3_uld_info *uld_info);
+
+/**
+ * hinic3_unregister_uld - unregister an upper-layer driver
+ * @type: uld service type
+ *
+ * Unregisters an existing upper-layer driver, traversing existing devices
+ * and calling @remove to uninstall the uld device on each.
+ */
+void hinic3_unregister_uld(enum hinic3_service_type type);
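+
+/*
+ * Usage sketch (illustrative only; the "my_drv_*" callbacks and the use of
+ * SERVICE_T_NIC are placeholders, not part of this patch):
+ *
+ *	static struct hinic3_uld_info my_drv_uld_info = {
+ *		.probe	= my_drv_probe,
+ *		.remove	= my_drv_remove,
+ *	};
+ *
+ *	err = hinic3_register_uld(SERVICE_T_NIC, &my_drv_uld_info);
+ *	if (err)
+ *		return err;
+ *	...
+ *	hinic3_unregister_uld(SERVICE_T_NIC);
+ */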
+
+void lld_hold(void);
+void lld_put(void);
+
+/**
+ * @brief hinic3_get_lld_dev_by_chip_name - get lld device by chip name
+ * @param chip_name: chip name
+ *
+ * The lld_dev reference count is incremented when lld_dev is obtained; the
+ * caller must release the reference by calling lld_dev_put.
+ **/
+struct hinic3_lld_dev *hinic3_get_lld_dev_by_chip_name(const char *chip_name);
+
+/**
+ * @brief lld_dev_hold - get reference to lld_dev
+ * @param dev: lld device
+ *
+ * Hold reference to device to keep it from being freed
+ **/
+void lld_dev_hold(struct hinic3_lld_dev *dev);
+
+/**
+ * @brief lld_dev_put - release reference to lld_dev
+ * @param dev: lld device
+ *
+ * Release reference to device to allow it to be freed
+ **/
+void lld_dev_put(struct hinic3_lld_dev *dev);
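+
+/*
+ * Hold/put pattern (sketch; the chip name is a placeholder and a NULL
+ * return on lookup failure is assumed):
+ *
+ *	struct hinic3_lld_dev *ldev;
+ *
+ *	ldev = hinic3_get_lld_dev_by_chip_name("hinic3_0");
+ *	if (!ldev)
+ *		return -ENODEV;
+ *	... use ldev ...
+ *	lld_dev_put(ldev);
+ */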
+
+/**
+ * @brief hinic3_get_lld_dev_by_dev_name - get lld device by uld device name
+ * @param dev_name: uld device name
+ * @param type: uld service type; when type is SERVICE_T_MAX, all ULD names
+ * are tried in order to match uld_dev
+ *
+ * The lld_dev reference count is incremented when lld_dev is obtained; the
+ * caller must release the reference by calling lld_dev_put.
+ **/
+struct hinic3_lld_dev *hinic3_get_lld_dev_by_dev_name(const char *dev_name,
+ enum hinic3_service_type type);
+
+/**
+ * @brief hinic3_get_lld_dev_by_dev_name_unsafe - get lld device by uld device name
+ * @param dev_name: uld device name
+ * @param type: uld service type; when type is SERVICE_T_MAX, all ULD names
+ * are tried in order to match uld_dev
+ *
+ * hinic3_get_lld_dev_by_dev_name_unsafe() is completely analogous to
+ * hinic3_get_lld_dev_by_dev_name(); the only difference is that the lld_dev
+ * reference count is not incremented when lld_dev is obtained.
+ *
+ * The caller must ensure that lld_dev will not be freed during the remove process
+ * when using lld_dev.
+ **/
+struct hinic3_lld_dev *hinic3_get_lld_dev_by_dev_name_unsafe(const char *dev_name,
+ enum hinic3_service_type type);
+
+/**
+ * @brief hinic3_get_lld_dev_by_chip_and_port - get lld device by chip name and port id
+ * @param chip_name: chip name
+ * @param port_id: port id
+ **/
+struct hinic3_lld_dev *hinic3_get_lld_dev_by_chip_and_port(const char *chip_name, u8 port_id);
+
+/**
+ * @brief hinic3_get_ppf_lld_dev - get ppf lld device by current function's lld device
+ * @param lld_dev: current function's lld device
+ *
+ * The lld_dev reference count is incremented when lld_dev is obtained; the
+ * caller must release the reference by calling lld_dev_put.
+ **/
+struct hinic3_lld_dev *hinic3_get_ppf_lld_dev(struct hinic3_lld_dev *lld_dev);
+
+/**
+ * @brief hinic3_get_ppf_lld_dev_unsafe - get ppf lld device by current function's lld device
+ * @param lld_dev: current function's lld device
+ *
+ * hinic3_get_ppf_lld_dev_unsafe() is completely analogous to hinic3_get_ppf_lld_dev();
+ * the only difference is that the lld_dev reference count is not incremented
+ * when lld_dev is obtained.
+ *
+ * The caller must ensure that ppf's lld_dev will not be freed during the remove process
+ * when using ppf lld_dev.
+ **/
+struct hinic3_lld_dev *hinic3_get_ppf_lld_dev_unsafe(struct hinic3_lld_dev *lld_dev);
+
+/**
+ * @brief uld_dev_hold - get reference to uld_dev
+ * @param lld_dev: lld device
+ * @param type: uld service type
+ *
+ * Hold reference to uld device to keep it from being freed
+ **/
+void uld_dev_hold(struct hinic3_lld_dev *lld_dev, enum hinic3_service_type type);
+
+/**
+ * @brief uld_dev_put - release reference to uld_dev
+ * @param lld_dev: lld device
+ * @param type: uld service type
+ *
+ * Release reference to uld device to allow it to be freed
+ **/
+void uld_dev_put(struct hinic3_lld_dev *lld_dev, enum hinic3_service_type type);
+
+/**
+ * @brief hinic3_get_uld_dev - get uld device by lld device
+ * @param lld_dev: lld device
+ * @param type: uld service type
+ *
+ * The uld_dev reference count is incremented when uld_dev is obtained; the
+ * caller must release the reference by calling uld_dev_put.
+ **/
+void *hinic3_get_uld_dev(struct hinic3_lld_dev *lld_dev, enum hinic3_service_type type);
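+
+/*
+ * Sketch (the service type is a placeholder; a NULL return on failure is
+ * assumed):
+ *
+ *	void *uld = hinic3_get_uld_dev(lld_dev, SERVICE_T_NIC);
+ *
+ *	if (!uld)
+ *		return -ENODEV;
+ *	... use uld ...
+ *	uld_dev_put(lld_dev, SERVICE_T_NIC);
+ */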
+
+/**
+ * @brief hinic3_get_uld_dev_unsafe - get uld device by lld device
+ * @param lld_dev: lld device
+ * @param type: uld service type
+ *
+ * hinic3_get_uld_dev_unsafe() is completely analogous to hinic3_get_uld_dev();
+ * the only difference is that the uld_dev reference count is not incremented
+ * when uld_dev is obtained.
+ *
+ * The caller must ensure that uld_dev will not be freed during the remove process
+ * when using uld_dev.
+ **/
+void *hinic3_get_uld_dev_unsafe(struct hinic3_lld_dev *lld_dev, enum hinic3_service_type type);
+
+/**
+ * @brief hinic3_get_chip_name - get chip name by lld device
+ * @param lld_dev: lld device
+ * @param chip_name: String for storing the chip name
+ * @param max_len: Maximum number of characters to be copied for chip_name
+ **/
+int hinic3_get_chip_name(struct hinic3_lld_dev *lld_dev, char *chip_name, u16 max_len);
+
+struct card_node *hinic3_get_chip_node_by_lld(struct hinic3_lld_dev *lld_dev);
+
+struct hinic3_hwdev *hinic3_get_sdk_hwdev_by_lld(struct hinic3_lld_dev *lld_dev);
+
+bool hinic3_get_vf_service_load(struct pci_dev *pdev, u16 service);
+
+int hinic3_set_vf_service_load(struct pci_dev *pdev, u16 service,
+ bool vf_srv_load);
+
+int hinic3_set_vf_service_state(struct pci_dev *pdev, u16 vf_func_id,
+ u16 service, bool en);
+
+bool hinic3_get_vf_load_state(struct pci_dev *pdev);
+
+int hinic3_set_vf_load_state(struct pci_dev *pdev, bool vf_load_state);
+
+int hinic3_attach_nic(struct hinic3_lld_dev *lld_dev);
+
+void hinic3_detach_nic(const struct hinic3_lld_dev *lld_dev);
+
+int hinic3_attach_service(const struct hinic3_lld_dev *lld_dev, enum hinic3_service_type type);
+void hinic3_detach_service(const struct hinic3_lld_dev *lld_dev, enum hinic3_service_type type);
+const char **hinic3_get_uld_names(void);
+int hinic3_lld_init(void);
+void hinic3_lld_exit(void);
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_mag_cfg.c b/drivers/net/ethernet/huawei/hinic3/hinic3_mag_cfg.c
new file mode 100644
index 000000000000..4049e81ce034
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_mag_cfg.c
@@ -0,0 +1,953 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC]" fmt
+
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/etherdevice.h>
+#include <linux/if_vlan.h>
+#include <linux/ethtool.h>
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/netdevice.h>
+#include <linux/module.h>
+
+#include "ossl_knl.h"
+#include "hinic3_crm.h"
+#include "hinic3_hw.h"
+#include "mag_cmd.h"
+#include "hinic3_nic_io.h"
+#include "hinic3_nic_cfg.h"
+#include "hinic3_srv_nic.h"
+#include "hinic3_nic.h"
+#include "hinic3_common.h"
+
+static int mag_msg_to_mgmt_sync(void *hwdev, u16 cmd, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size);
+static int mag_msg_to_mgmt_sync_ch(void *hwdev, u16 cmd, void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size,
+ u16 channel);
+
+int hinic3_set_port_enable(void *hwdev, bool enable, u16 channel)
+{
+ struct mag_cmd_set_port_enable en_state;
+ u16 out_size = sizeof(en_state);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ if (hinic3_func_type(hwdev) == TYPE_VF)
+ return 0;
+
+ memset(&en_state, 0, sizeof(en_state));
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io)
+ return -EINVAL;
+
+ en_state.function_id = hinic3_global_func_id(hwdev);
+ en_state.state = enable ? MAG_CMD_TX_ENABLE | MAG_CMD_RX_ENABLE :
+ MAG_CMD_PORT_DISABLE;
+
+ err = mag_msg_to_mgmt_sync_ch(hwdev, MAG_CMD_SET_PORT_ENABLE, &en_state,
+ sizeof(en_state), &en_state, &out_size,
+ channel);
+ if (err || !out_size || en_state.head.status) {
+ nic_err(nic_io->dev_hdl, "Failed to set port state, err: %d, status: 0x%x, out size: 0x%x, channel: 0x%x\n",
+ err, en_state.head.status, out_size, channel);
+ return -EIO;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_set_port_enable);
+
+int hinic3_get_phy_port_stats(void *hwdev, struct mag_cmd_port_stats *stats)
+{
+ struct mag_cmd_get_port_stat *port_stats = NULL;
+ struct mag_cmd_port_stats_info stats_info;
+ u16 out_size = sizeof(*port_stats);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+	nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+	if (!nic_io)
+		return -EINVAL;
+
+	port_stats = kzalloc(sizeof(*port_stats), GFP_KERNEL);
+	if (!port_stats)
+		return -ENOMEM;
+
+ memset(&stats_info, 0, sizeof(stats_info));
+ stats_info.port_id = hinic3_physical_port_id(hwdev);
+
+ err = mag_msg_to_mgmt_sync(hwdev, MAG_CMD_GET_PORT_STAT,
+ &stats_info, sizeof(stats_info),
+ port_stats, &out_size);
+ if (err || !out_size || port_stats->head.status) {
+ nic_err(nic_io->dev_hdl,
+ "Failed to get port statistics, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, port_stats->head.status, out_size);
+ err = -EIO;
+ goto out;
+ }
+
+ memcpy(stats, &port_stats->counter, sizeof(*stats));
+
+out:
+ kfree(port_stats);
+
+ return err;
+}
+EXPORT_SYMBOL(hinic3_get_phy_port_stats);
+
+int hinic3_set_port_funcs_state(void *hwdev, bool enable)
+{
+ return 0;
+}
+
+int hinic3_reset_port_link_cfg(void *hwdev)
+{
+ return 0;
+}
+
+int hinic3_force_port_relink(void *hwdev)
+{
+ return 0;
+}
+
+int hinic3_set_autoneg(void *hwdev, bool enable)
+{
+ struct hinic3_link_ksettings settings = {0};
+ struct hinic3_nic_io *nic_io = NULL;
+ u32 set_settings = 0;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io)
+ return -EINVAL;
+
+ set_settings |= HILINK_LINK_SET_AUTONEG;
+ settings.valid_bitmap = set_settings;
+ settings.autoneg = enable ? PORT_CFG_AN_ON : PORT_CFG_AN_OFF;
+
+ return hinic3_set_link_settings(hwdev, &settings);
+}
+
+static int hinic3_cfg_loopback_mode(struct hinic3_nic_io *nic_io, u8 opcode,
+ u8 *mode, u8 *enable)
+{
+ struct mag_cmd_cfg_loopback_mode lp;
+ u16 out_size = sizeof(lp);
+ int err;
+
+ memset(&lp, 0, sizeof(lp));
+ lp.port_id = hinic3_physical_port_id(nic_io->hwdev);
+ lp.opcode = opcode;
+ if (opcode == MGMT_MSG_CMD_OP_SET) {
+ lp.lp_mode = *mode;
+ lp.lp_en = *enable;
+ }
+
+ err = mag_msg_to_mgmt_sync(nic_io->hwdev, MAG_CMD_CFG_LOOPBACK_MODE,
+ &lp, sizeof(lp), &lp, &out_size);
+ if (err || !out_size || lp.head.status) {
+ nic_err(nic_io->dev_hdl,
+ "Failed to %s loopback mode, err: %d, status: 0x%x, out size: 0x%x\n",
+ opcode == MGMT_MSG_CMD_OP_SET ? "set" : "get",
+ err, lp.head.status, out_size);
+ return -EIO;
+ }
+
+ if (opcode == MGMT_MSG_CMD_OP_GET) {
+ *mode = lp.lp_mode;
+ *enable = lp.lp_en;
+ }
+
+ return 0;
+}
+
+int hinic3_get_loopback_mode(void *hwdev, u8 *mode, u8 *enable)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+ if (!hwdev || !mode || !enable)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ return hinic3_cfg_loopback_mode(nic_io, MGMT_MSG_CMD_OP_GET, mode,
+ enable);
+}
+
+#define LOOP_MODE_MIN 1
+#define LOOP_MODE_MAX 6
+int hinic3_set_loopback_mode(void *hwdev, u8 mode, u8 enable)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ if (mode < LOOP_MODE_MIN || mode > LOOP_MODE_MAX) {
+ nic_err(nic_io->dev_hdl, "Invalid loopback mode %u to set\n",
+ mode);
+ return -EINVAL;
+ }
+
+ return hinic3_cfg_loopback_mode(nic_io, MGMT_MSG_CMD_OP_SET, &mode,
+ &enable);
+}
+
+int hinic3_set_led_status(void *hwdev, enum mag_led_type type,
+ enum mag_led_mode mode)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ struct mag_cmd_set_led_cfg led_info;
+ u16 out_size = sizeof(led_info);
+ int err;
+
+ if (!hwdev)
+ return -EFAULT;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ memset(&led_info, 0, sizeof(led_info));
+
+ led_info.function_id = hinic3_global_func_id(hwdev);
+ led_info.type = type;
+ led_info.mode = mode;
+
+ err = mag_msg_to_mgmt_sync(hwdev, MAG_CMD_SET_LED_CFG, &led_info,
+ sizeof(led_info), &led_info, &out_size);
+ if (err || led_info.head.status || !out_size) {
+ nic_err(nic_io->dev_hdl, "Failed to set led status, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, led_info.head.status, out_size);
+ return -EIO;
+ }
+
+ return 0;
+}
+
+int hinic3_get_port_info(void *hwdev, struct nic_port_info *port_info,
+ u16 channel)
+{
+ struct mag_cmd_get_port_info port_msg;
+ u16 out_size = sizeof(port_msg);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev || !port_info)
+ return -EINVAL;
+
+ memset(&port_msg, 0, sizeof(port_msg));
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ port_msg.port_id = hinic3_physical_port_id(hwdev);
+
+ err = mag_msg_to_mgmt_sync_ch(hwdev, MAG_CMD_GET_PORT_INFO, &port_msg,
+ sizeof(port_msg), &port_msg, &out_size,
+ channel);
+ if (err || !out_size || port_msg.head.status) {
+ nic_err(nic_io->dev_hdl,
+ "Failed to get port info, err: %d, status: 0x%x, out size: 0x%x, channel: 0x%x\n",
+ err, port_msg.head.status, out_size, channel);
+ return -EIO;
+ }
+
+ port_info->autoneg_cap = port_msg.an_support;
+ port_info->autoneg_state = port_msg.an_en;
+ port_info->duplex = port_msg.duplex;
+ port_info->port_type = port_msg.wire_type;
+ port_info->speed = port_msg.speed;
+ port_info->fec = port_msg.fec;
+ port_info->supported_mode = port_msg.supported_mode;
+ port_info->advertised_mode = port_msg.advertised_mode;
+
+ return 0;
+}
+
+int hinic3_get_speed(void *hwdev, enum mag_cmd_port_speed *speed, u16 channel)
+{
+ struct nic_port_info port_info = {0};
+ int err;
+
+ if (!hwdev || !speed)
+ return -EINVAL;
+
+ err = hinic3_get_port_info(hwdev, &port_info, channel);
+ if (err)
+ return err;
+
+ *speed = port_info.speed;
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_get_speed);
+
+int hinic3_set_link_settings(void *hwdev,
+ struct hinic3_link_ksettings *settings)
+{
+ struct mag_cmd_set_port_cfg info;
+ u16 out_size = sizeof(info);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev || !settings)
+ return -EINVAL;
+
+ memset(&info, 0, sizeof(info));
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ info.port_id = hinic3_physical_port_id(hwdev);
+ info.config_bitmap = settings->valid_bitmap;
+ info.autoneg = settings->autoneg;
+ info.speed = settings->speed;
+ info.fec = settings->fec;
+
+ err = mag_msg_to_mgmt_sync(hwdev, MAG_CMD_SET_PORT_CFG, &info,
+ sizeof(info), &info, &out_size);
+ if (err || !out_size || info.head.status) {
+ nic_err(nic_io->dev_hdl, "Failed to set link settings, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, info.head.status, out_size);
+ return -EIO;
+ }
+
+ return info.head.status;
+}
+
+int hinic3_get_link_state(void *hwdev, u8 *link_state)
+{
+ struct mag_cmd_get_link_status get_link;
+ u16 out_size = sizeof(get_link);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev || !link_state)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ memset(&get_link, 0, sizeof(get_link));
+ get_link.port_id = hinic3_physical_port_id(hwdev);
+
+ err = mag_msg_to_mgmt_sync(hwdev, MAG_CMD_GET_LINK_STATUS, &get_link,
+ sizeof(get_link), &get_link, &out_size);
+ if (err || !out_size || get_link.head.status) {
+ nic_err(nic_io->dev_hdl, "Failed to get link state, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, get_link.head.status, out_size);
+ return -EIO;
+ }
+
+ *link_state = get_link.status;
+
+ return 0;
+}
+
+void hinic3_notify_vf_link_status(struct hinic3_nic_io *nic_io,
+ u16 vf_id, u8 link_status)
+{
+ struct mag_cmd_get_link_status link;
+ struct vf_data_storage *vf_infos = nic_io->vf_infos;
+ u16 out_size = sizeof(link);
+ int err;
+
+ memset(&link, 0, sizeof(link));
+ if (vf_infos[HW_VF_ID_TO_OS(vf_id)].registered) {
+ link.status = link_status;
+ link.port_id = hinic3_physical_port_id(nic_io->hwdev);
+ err = hinic3_mbox_to_vf(nic_io->hwdev, vf_id, HINIC3_MOD_HILINK,
+ MAG_CMD_GET_LINK_STATUS, &link,
+ sizeof(link), &link, &out_size, 0,
+ HINIC3_CHANNEL_NIC);
+ if (err == MBOX_ERRCODE_UNKNOWN_DES_FUNC) {
+ nic_warn(nic_io->dev_hdl, "VF%d not initialized, disconnect it\n",
+ HW_VF_ID_TO_OS(vf_id));
+ hinic3_unregister_vf(nic_io, vf_id);
+ return;
+ }
+ if (err || !out_size || link.head.status)
+ nic_err(nic_io->dev_hdl,
+ "Send link change event to VF %d failed, err: %d, status: 0x%x, out_size: 0x%x\n",
+ HW_VF_ID_TO_OS(vf_id), err, link.head.status, out_size);
+ }
+}
+
+void hinic3_notify_all_vfs_link_changed(void *hwdev, u8 link_status)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ u16 i;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ nic_io->link_status = link_status;
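+	/* HW VF ids are 1-based; VFs with a forced link state are skipped. */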
+ for (i = 1; i <= nic_io->max_vfs; i++) {
+ if (!nic_io->vf_infos[HW_VF_ID_TO_OS(i)].link_forced)
+ hinic3_notify_vf_link_status(nic_io, i, link_status);
+ }
+}
+
+static int hinic3_get_vf_link_status_msg_handler(struct hinic3_nic_io *nic_io,
+ u16 vf_id, void *buf_in,
+ u16 in_size, void *buf_out,
+ u16 *out_size)
+{
+ struct vf_data_storage *vf_infos = nic_io->vf_infos;
+ struct mag_cmd_get_link_status *get_link = buf_out;
+ bool link_forced, link_up;
+
+ link_forced = vf_infos[HW_VF_ID_TO_OS(vf_id)].link_forced;
+ link_up = vf_infos[HW_VF_ID_TO_OS(vf_id)].link_up;
+
+ if (link_forced)
+ get_link->status = link_up ?
+ HINIC3_LINK_UP : HINIC3_LINK_DOWN;
+ else
+ get_link->status = nic_io->link_status;
+
+ get_link->head.status = 0;
+ *out_size = sizeof(*get_link);
+
+ return 0;
+}
+
+int hinic3_refresh_nic_cfg(void *hwdev, struct nic_port_info *port_info)
+{
+	/* TODO */
+ return 0;
+}
+
+static void get_port_info(void *hwdev,
+ const struct mag_cmd_get_link_status *link_status,
+ struct hinic3_event_link_info *link_info)
+{
+ struct nic_port_info port_info = {0};
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (hinic3_func_type(hwdev) != TYPE_VF && link_status->status) {
+ err = hinic3_get_port_info(hwdev, &port_info, HINIC3_CHANNEL_NIC);
+ if (err) {
+ nic_warn(nic_io->dev_hdl, "Failed to get port info\n");
+ } else {
+ link_info->valid = 1;
+ link_info->port_type = port_info.port_type;
+ link_info->autoneg_cap = port_info.autoneg_cap;
+ link_info->autoneg_state = port_info.autoneg_state;
+ link_info->duplex = port_info.duplex;
+ link_info->speed = port_info.speed;
+ hinic3_refresh_nic_cfg(hwdev, &port_info);
+ }
+ }
+}
+
+static void link_status_event_handler(void *hwdev, void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size)
+{
+ struct mag_cmd_get_link_status *link_status = NULL;
+ struct mag_cmd_get_link_status *ret_link_status = NULL;
+ struct hinic3_event_info event_info = {0};
+ struct hinic3_event_link_info *link_info = (void *)event_info.event_data;
+ struct hinic3_nic_io *nic_io = NULL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ link_status = buf_in;
+ sdk_info(nic_io->dev_hdl, "Link status report received, func_id: %u, status: %u\n",
+ hinic3_global_func_id(hwdev), link_status->status);
+
+ hinic3_link_event_stats(hwdev, link_status->status);
+
+ /* link event reported only after set vport enable */
+ get_port_info(hwdev, link_status, link_info);
+
+ event_info.service = EVENT_SRV_NIC;
+ event_info.type = link_status->status ?
+ EVENT_NIC_LINK_UP : EVENT_NIC_LINK_DOWN;
+
+ hinic3_event_callback(hwdev, &event_info);
+
+ if (hinic3_func_type(hwdev) != TYPE_VF) {
+ hinic3_notify_all_vfs_link_changed(hwdev, link_status->status);
+ ret_link_status = buf_out;
+ ret_link_status->head.status = 0;
+ *out_size = sizeof(*ret_link_status);
+ }
+}
+
+static void cable_plug_event(void *hwdev, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ struct mag_cmd_wire_event *plug_event = buf_in;
+ struct hinic3_port_routine_cmd *rt_cmd = NULL;
+ struct hinic3_nic_io *nic_io = NULL;
+ struct hinic3_event_info event_info;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ rt_cmd = &nic_io->nic_cfg.rt_cmd;
+
+ mutex_lock(&nic_io->nic_cfg.sfp_mutex);
+ rt_cmd->mpu_send_sfp_abs = false;
+ rt_cmd->mpu_send_sfp_info = false;
+ mutex_unlock(&nic_io->nic_cfg.sfp_mutex);
+
+ memset(&event_info, 0, sizeof(event_info));
+ event_info.service = EVENT_SRV_NIC;
+ event_info.type = EVENT_NIC_PORT_MODULE_EVENT;
+ ((struct hinic3_port_module_event *)(void *)event_info.event_data)->type =
+ plug_event->status ? HINIC3_PORT_MODULE_CABLE_PLUGGED :
+ HINIC3_PORT_MODULE_CABLE_UNPLUGGED;
+
+ *out_size = sizeof(*plug_event);
+ plug_event = buf_out;
+ plug_event->head.status = 0;
+
+ hinic3_event_callback(hwdev, &event_info);
+}
+
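+/* The MPU pushes SFP eeprom and presence state proactively; it is cached in
+ * rt_cmd so that hinic3_get_sfp_info()/hinic3_if_sfp_absent() can usually
+ * answer without another management round trip.
+ */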
+static void port_sfp_info_event(void *hwdev, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ struct mag_cmd_get_xsfp_info *sfp_info = buf_in;
+ struct hinic3_port_routine_cmd *rt_cmd = NULL;
+ struct hinic3_nic_io *nic_io = NULL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (in_size != sizeof(*sfp_info)) {
+ sdk_err(nic_io->dev_hdl, "Invalid sfp info cmd, length: %u, should be %ld\n",
+ in_size, sizeof(*sfp_info));
+ return;
+ }
+
+ rt_cmd = &nic_io->nic_cfg.rt_cmd;
+ mutex_lock(&nic_io->nic_cfg.sfp_mutex);
+ memcpy(&rt_cmd->std_sfp_info, sfp_info,
+ sizeof(struct mag_cmd_get_xsfp_info));
+ rt_cmd->mpu_send_sfp_info = true;
+ mutex_unlock(&nic_io->nic_cfg.sfp_mutex);
+}
+
+static void port_sfp_abs_event(void *hwdev, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ struct mag_cmd_get_xsfp_present *sfp_abs = buf_in;
+ struct hinic3_port_routine_cmd *rt_cmd = NULL;
+ struct hinic3_nic_io *nic_io = NULL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (in_size != sizeof(*sfp_abs)) {
+ sdk_err(nic_io->dev_hdl, "Invalid sfp absent cmd, length: %u, should be %ld\n",
+ in_size, sizeof(*sfp_abs));
+ return;
+ }
+
+ rt_cmd = &nic_io->nic_cfg.rt_cmd;
+ mutex_lock(&nic_io->nic_cfg.sfp_mutex);
+ memcpy(&rt_cmd->abs, sfp_abs,
+ sizeof(struct mag_cmd_get_xsfp_present));
+ rt_cmd->mpu_send_sfp_abs = true;
+ mutex_unlock(&nic_io->nic_cfg.sfp_mutex);
+}
+
+bool hinic3_if_sfp_absent(void *hwdev)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ struct hinic3_port_routine_cmd *rt_cmd = NULL;
+ struct mag_cmd_get_xsfp_present sfp_abs;
+ u8 port_id = hinic3_physical_port_id(hwdev);
+ u16 out_size = sizeof(sfp_abs);
+ int err;
+ bool sfp_abs_status;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ memset(&sfp_abs, 0, sizeof(sfp_abs));
+
+ rt_cmd = &nic_io->nic_cfg.rt_cmd;
+ mutex_lock(&nic_io->nic_cfg.sfp_mutex);
+ if (rt_cmd->mpu_send_sfp_abs) {
+ if (rt_cmd->abs.head.status) {
+ mutex_unlock(&nic_io->nic_cfg.sfp_mutex);
+ return true;
+ }
+
+ sfp_abs_status = (bool)rt_cmd->abs.abs_status;
+ mutex_unlock(&nic_io->nic_cfg.sfp_mutex);
+ return sfp_abs_status;
+ }
+ mutex_unlock(&nic_io->nic_cfg.sfp_mutex);
+
+ sfp_abs.port_id = port_id;
+ err = mag_msg_to_mgmt_sync(hwdev, MAG_CMD_GET_XSFP_PRESENT,
+ &sfp_abs, sizeof(sfp_abs), &sfp_abs,
+ &out_size);
+ if (sfp_abs.head.status || err || !out_size) {
+ nic_err(nic_io->dev_hdl,
+ "Failed to get port%u sfp absent status, err: %d, status: 0x%x, out size: 0x%x\n",
+ port_id, err, sfp_abs.head.status, out_size);
+ return true;
+ }
+
+	return sfp_abs.abs_status != 0;
+}
+
+int hinic3_get_sfp_info(void *hwdev, struct mag_cmd_get_xsfp_info *sfp_info)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ struct hinic3_port_routine_cmd *rt_cmd = NULL;
+ u16 out_size = sizeof(*sfp_info);
+ int err;
+
+ if (!hwdev || !sfp_info)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ rt_cmd = &nic_io->nic_cfg.rt_cmd;
+ mutex_lock(&nic_io->nic_cfg.sfp_mutex);
+ if (rt_cmd->mpu_send_sfp_info) {
+ if (rt_cmd->std_sfp_info.head.status) {
+ mutex_unlock(&nic_io->nic_cfg.sfp_mutex);
+ return -EIO;
+ }
+
+ memcpy(sfp_info, &rt_cmd->std_sfp_info, sizeof(*sfp_info));
+ mutex_unlock(&nic_io->nic_cfg.sfp_mutex);
+ return 0;
+ }
+ mutex_unlock(&nic_io->nic_cfg.sfp_mutex);
+
+ sfp_info->port_id = hinic3_physical_port_id(hwdev);
+ err = mag_msg_to_mgmt_sync(hwdev, MAG_CMD_GET_XSFP_INFO, sfp_info,
+ sizeof(*sfp_info), sfp_info, &out_size);
+ if (sfp_info->head.status || err || !out_size) {
+ nic_err(nic_io->dev_hdl,
+ "Failed to get port%u sfp eeprom information, err: %d, status: 0x%x, out size: 0x%x\n",
+ hinic3_physical_port_id(hwdev), err,
+ sfp_info->head.status, out_size);
+ return -EIO;
+ }
+
+ return 0;
+}
+
+int hinic3_get_sfp_eeprom(void *hwdev, u8 *data, u32 len)
+{
+ struct mag_cmd_get_xsfp_info sfp_info;
+ int err;
+
+ if (!hwdev || !data)
+ return -EINVAL;
+
+ if (hinic3_if_sfp_absent(hwdev))
+ return -ENXIO;
+
+ memset(&sfp_info, 0, sizeof(sfp_info));
+
+ err = hinic3_get_sfp_info(hwdev, &sfp_info);
+ if (err)
+ return err;
+
+ memcpy(data, sfp_info.sfp_info, len);
+
+ return 0;
+}
+
+int hinic3_get_sfp_type(void *hwdev, u8 *sfp_type, u8 *sfp_type_ext)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ struct hinic3_port_routine_cmd *rt_cmd = NULL;
+ u8 sfp_data[STD_SFP_INFO_MAX_SIZE];
+ int err;
+
+ if (!hwdev || !sfp_type || !sfp_type_ext)
+ return -EINVAL;
+
+ if (hinic3_if_sfp_absent(hwdev))
+ return -ENXIO;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ rt_cmd = &nic_io->nic_cfg.rt_cmd;
+
+ mutex_lock(&nic_io->nic_cfg.sfp_mutex);
+ if (rt_cmd->mpu_send_sfp_info) {
+ if (rt_cmd->std_sfp_info.head.status) {
+ mutex_unlock(&nic_io->nic_cfg.sfp_mutex);
+ return -EIO;
+ }
+
+ *sfp_type = rt_cmd->std_sfp_info.sfp_info[0];
+ *sfp_type_ext = rt_cmd->std_sfp_info.sfp_info[1];
+ mutex_unlock(&nic_io->nic_cfg.sfp_mutex);
+ return 0;
+ }
+ mutex_unlock(&nic_io->nic_cfg.sfp_mutex);
+
+ err = hinic3_get_sfp_eeprom(hwdev, (u8 *)sfp_data,
+ STD_SFP_INFO_MAX_SIZE);
+ if (err)
+ return err;
+
+ *sfp_type = sfp_data[0];
+ *sfp_type_ext = sfp_data[1];
+
+ return 0;
+}
+
+int hinic3_set_link_status_follow(void *hwdev, enum hinic3_link_follow_status status)
+{
+ struct mag_cmd_set_link_follow follow;
+ struct hinic3_nic_io *nic_io = NULL;
+ u16 out_size = sizeof(follow);
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io)
+ return -EINVAL;
+
+ if (status >= HINIC3_LINK_FOLLOW_STATUS_MAX) {
+ nic_err(nic_io->dev_hdl, "Invalid link follow status: %d\n", status);
+ return -EINVAL;
+ }
+
+ memset(&follow, 0, sizeof(follow));
+ follow.function_id = hinic3_global_func_id(hwdev);
+ follow.follow = status;
+
+ err = mag_msg_to_mgmt_sync(hwdev, MAG_CMD_SET_LINK_FOLLOW, &follow,
+ sizeof(follow), &follow, &out_size);
+ if ((follow.head.status != HINIC3_MGMT_CMD_UNSUPPORTED && follow.head.status) ||
+ err || !out_size) {
+ nic_err(nic_io->dev_hdl, "Failed to set link status follow port status, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, follow.head.status, out_size);
+ return -EFAULT;
+ }
+
+ return follow.head.status;
+}
+
+int hinic3_update_pf_bw(void *hwdev)
+{
+ struct nic_port_info port_info = {0};
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (hinic3_func_type(hwdev) == TYPE_VF || !HINIC3_SUPPORT_RATE_LIMIT(hwdev))
+ return 0;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io)
+ return -EINVAL;
+
+ err = hinic3_get_port_info(hwdev, &port_info, HINIC3_CHANNEL_NIC);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to get port info\n");
+ return -EIO;
+ }
+
+ err = hinic3_set_pf_rate(hwdev, port_info.speed);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to set pf bandwidth\n");
+ return err;
+ }
+
+ return 0;
+}
+
+int hinic3_set_pf_bw_limit(void *hwdev, u32 bw_limit)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ u32 old_bw_limit;
+ u8 link_state = 0;
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ if (hinic3_func_type(hwdev) == TYPE_VF)
+ return 0;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io)
+ return -EINVAL;
+
+ if (bw_limit > MAX_LIMIT_BW) {
+ nic_err(nic_io->dev_hdl, "Invalid bandwidth: %u\n", bw_limit);
+ return -EINVAL;
+ }
+
+ err = hinic3_get_link_state(hwdev, &link_state);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to get link state\n");
+ return -EIO;
+ }
+
+ if (!link_state) {
+ nic_err(nic_io->dev_hdl, "Link status must be up when setting pf tx rate\n");
+ return -EINVAL;
+ }
+
+ old_bw_limit = nic_io->nic_cfg.pf_bw_limit;
+ nic_io->nic_cfg.pf_bw_limit = bw_limit;
+
+ err = hinic3_update_pf_bw(hwdev);
+ if (err) {
+ nic_io->nic_cfg.pf_bw_limit = old_bw_limit;
+ return err;
+ }
+
+ return 0;
+}
+
+static const struct vf_msg_handler vf_mag_cmd_handler[] = {
+ {
+ .cmd = MAG_CMD_GET_LINK_STATUS,
+ .handler = hinic3_get_vf_link_status_msg_handler,
+ },
+};
+
+/* pf/ppf handler mbox msg from vf */
+int hinic3_pf_mag_mbox_handler(void *hwdev, u16 vf_id,
+ u16 cmd, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ u32 index, cmd_size = ARRAY_LEN(vf_mag_cmd_handler);
+ struct hinic3_nic_io *nic_io = NULL;
+ const struct vf_msg_handler *handler = NULL;
+
+ if (!hwdev)
+ return -EFAULT;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ for (index = 0; index < cmd_size; index++) {
+ handler = &vf_mag_cmd_handler[index];
+ if (cmd == handler->cmd)
+ return handler->handler(nic_io, vf_id, buf_in, in_size,
+ buf_out, out_size);
+ }
+
+ nic_warn(nic_io->dev_hdl, "NO handler for mag cmd: %u received from vf id: %u\n",
+ cmd, vf_id);
+
+ return -EINVAL;
+}
+
+static struct nic_event_handler mag_cmd_handler[] = {
+ {
+ .cmd = MAG_CMD_GET_LINK_STATUS,
+ .handler = link_status_event_handler,
+ },
+
+ {
+ .cmd = MAG_CMD_WIRE_EVENT,
+ .handler = cable_plug_event,
+ },
+
+ {
+ .cmd = MAG_CMD_GET_XSFP_INFO,
+ .handler = port_sfp_info_event,
+ },
+
+ {
+ .cmd = MAG_CMD_GET_XSFP_PRESENT,
+ .handler = port_sfp_abs_event,
+ },
+};
+
+static int hinic3_mag_event_handler(void *hwdev, u16 cmd,
+ void *buf_in, u16 in_size, void *buf_out,
+ u16 *out_size)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ u32 size = ARRAY_LEN(mag_cmd_handler);
+ u32 i;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ *out_size = 0;
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ for (i = 0; i < size; i++) {
+ if (cmd == mag_cmd_handler[i].cmd) {
+ mag_cmd_handler[i].handler(hwdev, buf_in, in_size,
+ buf_out, out_size);
+ return 0;
+ }
+ }
+
+ /* can't find this event cmd */
+ sdk_warn(nic_io->dev_hdl, "Unsupported mag event, cmd: %u\n", cmd);
+ *out_size = sizeof(struct mgmt_msg_head);
+ ((struct mgmt_msg_head *)buf_out)->status = HINIC3_MGMT_CMD_UNSUPPORTED;
+
+ return 0;
+}
+
+int hinic3_vf_mag_event_handler(void *hwdev, u16 cmd,
+ void *buf_in, u16 in_size, void *buf_out,
+ u16 *out_size)
+{
+ return hinic3_mag_event_handler(hwdev, cmd, buf_in, in_size,
+ buf_out, out_size);
+}
+
+/* pf/ppf handler mgmt cpu report hilink event */
+void hinic3_pf_mag_event_handler(void *pri_handle, u16 cmd,
+ void *buf_in, u16 in_size, void *buf_out,
+ u16 *out_size)
+{
+ hinic3_mag_event_handler(pri_handle, cmd, buf_in, in_size,
+ buf_out, out_size);
+}
+
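+/* Commands that the PF handles on behalf of its VFs (see vf_mag_cmd_handler)
+ * are sent to the PF over the mailbox when issued on a VF; everything else
+ * goes directly to the management CPU.
+ */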
+static int _mag_msg_to_mgmt_sync(void *hwdev, u16 cmd, void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size,
+ u16 channel)
+{
+ u32 i, cmd_cnt = ARRAY_LEN(vf_mag_cmd_handler);
+ bool cmd_to_pf = false;
+
+ if (hinic3_func_type(hwdev) == TYPE_VF) {
+ for (i = 0; i < cmd_cnt; i++) {
+ if (cmd == vf_mag_cmd_handler[i].cmd) {
+ cmd_to_pf = true;
+ break;
+ }
+ }
+ }
+
+ if (cmd_to_pf)
+ return hinic3_mbox_to_pf(hwdev, HINIC3_MOD_HILINK, cmd, buf_in,
+ in_size, buf_out, out_size, 0,
+ channel);
+
+ return hinic3_msg_to_mgmt_sync(hwdev, HINIC3_MOD_HILINK, cmd, buf_in,
+ in_size, buf_out, out_size, 0, channel);
+}
+
+static int mag_msg_to_mgmt_sync(void *hwdev, u16 cmd, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ return _mag_msg_to_mgmt_sync(hwdev, cmd, buf_in, in_size, buf_out,
+ out_size, HINIC3_CHANNEL_NIC);
+}
+
+static int mag_msg_to_mgmt_sync_ch(void *hwdev, u16 cmd, void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size,
+ u16 channel)
+{
+ return _mag_msg_to_mgmt_sync(hwdev, cmd, buf_in, in_size, buf_out,
+ out_size, channel);
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_main.c b/drivers/net/ethernet/huawei/hinic3/hinic3_main.c
new file mode 100644
index 000000000000..87f6f5417e9e
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_main.c
@@ -0,0 +1,1125 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC]" fmt
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/interrupt.h>
+#include <linux/etherdevice.h>
+#include <linux/netdevice.h>
+#include <linux/if_vlan.h>
+#include <linux/ethtool.h>
+#include <linux/dcbnl.h>
+#include <linux/tcp.h>
+#include <linux/ip.h>
+#include <linux/debugfs.h>
+
+#include "ossl_knl.h"
+#include "hinic3_hw.h"
+#include "hinic3_crm.h"
+#include "hinic3_mt.h"
+#include "hinic3_nic_cfg.h"
+#include "hinic3_srv_nic.h"
+#include "hinic3_nic_io.h"
+#include "hinic3_nic_dev.h"
+#include "hinic3_tx.h"
+#include "hinic3_rx.h"
+#include "hinic3_lld.h"
+#include "hinic3_srv_nic.h"
+#include "hinic3_rss.h"
+#include "hinic3_dcb.h"
+#include "hinic3_nic_prof.h"
+#include "hinic3_profile.h"
+
+/*lint -e806*/
+#define DEFAULT_POLL_WEIGHT 64
+static unsigned int poll_weight = DEFAULT_POLL_WEIGHT;
+module_param(poll_weight, uint, 0444);
+MODULE_PARM_DESC(poll_weight, "Number of packets for the NAPI budget (default=64)");
+
+#define HINIC3_DEAULT_TXRX_MSIX_PENDING_LIMIT 2
+#define HINIC3_DEAULT_TXRX_MSIX_COALESC_TIMER_CFG 25
+#define HINIC3_DEAULT_TXRX_MSIX_RESEND_TIMER_CFG 7
+
+static unsigned char qp_pending_limit = HINIC3_DEAULT_TXRX_MSIX_PENDING_LIMIT;
+module_param(qp_pending_limit, byte, 0444);
+MODULE_PARM_DESC(qp_pending_limit, "QP MSI-X Interrupt coalescing parameter pending_limit (default=2)");
+
+static unsigned char qp_coalesc_timer_cfg =
+ HINIC3_DEAULT_TXRX_MSIX_COALESC_TIMER_CFG;
+module_param(qp_coalesc_timer_cfg, byte, 0444);
+MODULE_PARM_DESC(qp_coalesc_timer_cfg, "QP MSI-X Interrupt coalescing parameter coalesc_timer_cfg (default=25)");
+
+#define DEFAULT_RX_BUFF_LEN 2
+u16 rx_buff = DEFAULT_RX_BUFF_LEN;
+module_param(rx_buff, ushort, 0444);
+MODULE_PARM_DESC(rx_buff, "Set rx_buff size in KB; must be a power of two in [2, 16], default is 2KB");
+
+static unsigned int lro_replenish_thld = 256;
+module_param(lro_replenish_thld, uint, 0444);
+MODULE_PARM_DESC(lro_replenish_thld, "Number of WQEs for the LRO replenish buffer (default=256)");
+
+static unsigned char set_link_status_follow = HINIC3_LINK_FOLLOW_STATUS_MAX;
+module_param(set_link_status_follow, byte, 0444);
+MODULE_PARM_DESC(set_link_status_follow, "Set link status follow port status (0=default,1=follow,2=separate,3=unset)");
+
+/*lint +e806*/
+
+#define HINIC3_NIC_DEV_WQ_NAME "hinic3_nic_dev_wq"
+
+#define DEFAULT_MSG_ENABLE (NETIF_MSG_DRV | NETIF_MSG_LINK)
+
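+/* Masking as modulo; only valid when num_qps is a power of two */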
+#define QID_MASKED(q_id, nic_dev) ((q_id) & ((nic_dev)->num_qps - 1))
+#define WATCHDOG_TIMEOUT 5
+
+#define HINIC3_SQ_DEPTH 1024
+#define HINIC3_RQ_DEPTH 1024
+
+enum hinic3_rx_buff_len {
+ RX_BUFF_VALID_2KB = 2,
+ RX_BUFF_VALID_4KB = 4,
+ RX_BUFF_VALID_8KB = 8,
+ RX_BUFF_VALID_16KB = 16,
+};
+
+#define CONVERT_UNIT 1024
+
+#ifdef HAVE_MULTI_VLAN_OFFLOAD_EN
+static int hinic3_netdev_event(struct notifier_block *notifier, unsigned long event, void *ptr);
+
+/* used for netdev notifier register/unregister */
+static DEFINE_MUTEX(hinic3_netdev_notifiers_mutex);
+static int hinic3_netdev_notifiers_ref_cnt;
+static struct notifier_block hinic3_netdev_notifier = {
+ .notifier_call = hinic3_netdev_event,
+};
+
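+/* A single global netdev notifier is shared by all devices; it is
+ * registered on first use and unregistered when the last user goes away,
+ * tracked by hinic3_netdev_notifiers_ref_cnt.
+ */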
+static void hinic3_register_notifier(struct hinic3_nic_dev *nic_dev)
+{
+ int err;
+
+ mutex_lock(&hinic3_netdev_notifiers_mutex);
+ hinic3_netdev_notifiers_ref_cnt++;
+ if (hinic3_netdev_notifiers_ref_cnt == 1) {
+ err = register_netdevice_notifier(&hinic3_netdev_notifier);
+ if (err) {
+ nic_info(&nic_dev->pdev->dev, "Register netdevice notifier failed, err: %d\n",
+ err);
+ hinic3_netdev_notifiers_ref_cnt--;
+ }
+ }
+ mutex_unlock(&hinic3_netdev_notifiers_mutex);
+}
+
+static void hinic3_unregister_notifier(struct hinic3_nic_dev *nic_dev)
+{
+ mutex_lock(&hinic3_netdev_notifiers_mutex);
+ if (hinic3_netdev_notifiers_ref_cnt == 1)
+ unregister_netdevice_notifier(&hinic3_netdev_notifier);
+
+ if (hinic3_netdev_notifiers_ref_cnt)
+ hinic3_netdev_notifiers_ref_cnt--;
+ mutex_unlock(&hinic3_netdev_notifiers_mutex);
+}
+
+#define HINIC3_MAX_VLAN_DEPTH_OFFLOAD_SUPPORT 1
+#define HINIC3_VLAN_CLEAR_OFFLOAD (NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | \
+ NETIF_F_SCTP_CRC | NETIF_F_RXCSUM | \
+ NETIF_F_ALL_TSO)
+
+static int hinic3_netdev_event(struct notifier_block *notifier, unsigned long event, void *ptr)
+{
+ struct net_device *ndev = netdev_notifier_info_to_dev(ptr);
+ struct net_device *real_dev = NULL;
+ struct net_device *ret = NULL;
+ u16 vlan_depth;
+
+ if (!is_vlan_dev(ndev))
+ return NOTIFY_DONE;
+
+ dev_hold(ndev);
+
+ switch (event) {
+ case NETDEV_REGISTER:
+ real_dev = vlan_dev_real_dev(ndev);
+ if (!hinic3_is_netdev_ops_match(real_dev))
+ goto out;
+
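+ /* Count the VLAN nesting depth; HW offloads are kept only for
+ * single-tagged VLAN devices.
+ */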
+ vlan_depth = 1;
+ ret = vlan_dev_priv(ndev)->real_dev;
+ while (is_vlan_dev(ret)) {
+ ret = vlan_dev_priv(ret)->real_dev;
+ vlan_depth++;
+ }
+
+ if (vlan_depth == HINIC3_MAX_VLAN_DEPTH_OFFLOAD_SUPPORT) {
+ ndev->vlan_features &= (~HINIC3_VLAN_CLEAR_OFFLOAD);
+ } else if (vlan_depth > HINIC3_MAX_VLAN_DEPTH_OFFLOAD_SUPPORT) {
+#ifdef HAVE_NDO_SET_FEATURES
+#ifdef HAVE_RHEL6_NET_DEVICE_OPS_EXT
+ set_netdev_hw_features(ndev,
+ get_netdev_hw_features(ndev) &
+ (~HINIC3_VLAN_CLEAR_OFFLOAD));
+#else
+ ndev->hw_features &= (~HINIC3_VLAN_CLEAR_OFFLOAD);
+#endif
+#endif
+ ndev->features &= (~HINIC3_VLAN_CLEAR_OFFLOAD);
+ }
+
+ break;
+
+ default:
+ break;
+ }
+
+out:
+ dev_put(ndev);
+
+ return NOTIFY_DONE;
+}
+#endif
+
+void hinic3_link_status_change(struct hinic3_nic_dev *nic_dev, bool status)
+{
+ struct net_device *netdev = nic_dev->netdev;
+
+ if (!HINIC3_CHANNEL_RES_VALID(nic_dev) ||
+ test_bit(HINIC3_LP_TEST, &nic_dev->flags) ||
+ test_bit(HINIC3_FORCE_LINK_UP, &nic_dev->flags))
+ return;
+
+ if (status) {
+ if (netif_carrier_ok(netdev))
+ return;
+
+ nic_dev->link_status = status;
+ netif_carrier_on(netdev);
+ nicif_info(nic_dev, link, netdev, "Link is up\n");
+ } else {
+ if (!netif_carrier_ok(netdev))
+ return;
+
+ nic_dev->link_status = status;
+ netif_carrier_off(netdev);
+ nicif_info(nic_dev, link, netdev, "Link is down\n");
+ }
+}
+
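+/* Populate netdev feature flags from the capabilities reported by the HW */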
+static void netdev_feature_init(struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ netdev_features_t dft_fts = 0;
+ netdev_features_t cso_fts = 0;
+ netdev_features_t vlan_fts = 0;
+ netdev_features_t tso_fts = 0;
+ netdev_features_t hw_features = 0;
+
+ dft_fts |= NETIF_F_SG | NETIF_F_HIGHDMA;
+
+ if (HINIC3_SUPPORT_CSUM(nic_dev->hwdev))
+ cso_fts |= NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_RXCSUM;
+ if (HINIC3_SUPPORT_SCTP_CRC(nic_dev->hwdev))
+ cso_fts |= NETIF_F_SCTP_CRC;
+
+ if (HINIC3_SUPPORT_TSO(nic_dev->hwdev))
+ tso_fts |= NETIF_F_TSO | NETIF_F_TSO6;
+
+ if (HINIC3_SUPPORT_VLAN_OFFLOAD(nic_dev->hwdev)) {
+#if defined(NETIF_F_HW_VLAN_CTAG_TX)
+ vlan_fts |= NETIF_F_HW_VLAN_CTAG_TX;
+#elif defined(NETIF_F_HW_VLAN_TX)
+ vlan_fts |= NETIF_F_HW_VLAN_TX;
+#endif
+
+#if defined(NETIF_F_HW_VLAN_CTAG_RX)
+ vlan_fts |= NETIF_F_HW_VLAN_CTAG_RX;
+#elif defined(NETIF_F_HW_VLAN_RX)
+ vlan_fts |= NETIF_F_HW_VLAN_RX;
+#endif
+ }
+
+ if (HINIC3_SUPPORT_RXVLAN_FILTER(nic_dev->hwdev)) {
+#if defined(NETIF_F_HW_VLAN_CTAG_FILTER)
+ vlan_fts |= NETIF_F_HW_VLAN_CTAG_FILTER;
+#elif defined(NETIF_F_HW_VLAN_FILTER)
+ vlan_fts |= NETIF_F_HW_VLAN_FILTER;
+#endif
+ }
+
+#ifdef HAVE_ENCAPSULATION_TSO
+ if (HINIC3_SUPPORT_VXLAN_OFFLOAD(nic_dev->hwdev))
+ tso_fts |= NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_UDP_TUNNEL_CSUM;
+#endif /* HAVE_ENCAPSULATION_TSO */
+
+ /* LRO is disabled by default; only set it in hw features */
+ if (HINIC3_SUPPORT_LRO(nic_dev->hwdev))
+ hw_features |= NETIF_F_LRO;
+
+#if (KERNEL_VERSION(4, 11, 0) > LINUX_VERSION_CODE)
+ if (HINIC3_SUPPORT_UFO(nic_dev->hwdev)) {
+ /* UFO is disabled by default */
+ hw_features |= NETIF_F_UFO;
+ netdev->vlan_features |= NETIF_F_UFO;
+ }
+#endif
+
+ netdev->features |= dft_fts | cso_fts | tso_fts | vlan_fts;
+ netdev->vlan_features |= dft_fts | cso_fts | tso_fts;
+
+#ifdef HAVE_RHEL6_NET_DEVICE_OPS_EXT
+ hw_features |= get_netdev_hw_features(netdev);
+#else
+ hw_features |= netdev->hw_features;
+#endif
+
+ hw_features |= netdev->features;
+
+#ifdef HAVE_RHEL6_NET_DEVICE_OPS_EXT
+ set_netdev_hw_features(netdev, hw_features);
+#else
+ netdev->hw_features = hw_features;
+#endif
+
+#ifdef IFF_UNICAST_FLT
+ netdev->priv_flags |= IFF_UNICAST_FLT;
+#endif
+
+#ifdef HAVE_ENCAPSULATION_CSUM
+ netdev->hw_enc_features |= dft_fts;
+ if (HINIC3_SUPPORT_VXLAN_OFFLOAD(nic_dev->hwdev)) {
+ netdev->hw_enc_features |= cso_fts;
+#ifdef HAVE_ENCAPSULATION_TSO
+ netdev->hw_enc_features |= tso_fts | NETIF_F_TSO_ECN;
+#endif /* HAVE_ENCAPSULATION_TSO */
+ }
+#endif /* HAVE_ENCAPSULATION_CSUM */
+}
+
+static void init_intr_coal_param(struct hinic3_nic_dev *nic_dev)
+{
+ struct hinic3_intr_coal_info *info = NULL;
+ u16 i;
+
+ for (i = 0; i < nic_dev->max_qps; i++) {
+ info = &nic_dev->intr_coalesce[i];
+
+ info->pending_limt = qp_pending_limit;
+ info->coalesce_timer_cfg = qp_coalesc_timer_cfg;
+
+ info->resend_timer_cfg = HINIC3_DEAULT_TXRX_MSIX_RESEND_TIMER_CFG;
+
+ info->pkt_rate_high = HINIC3_RX_RATE_HIGH;
+ info->rx_usecs_high = HINIC3_RX_COAL_TIME_HIGH;
+ info->rx_pending_limt_high = HINIC3_RX_PENDING_LIMIT_HIGH;
+
+ info->pkt_rate_low = HINIC3_RX_RATE_LOW;
+ info->rx_usecs_low = HINIC3_RX_COAL_TIME_LOW;
+ info->rx_pending_limt_low = HINIC3_RX_PENDING_LIMIT_LOW;
+ }
+}
+
+static int hinic3_init_intr_coalesce(struct hinic3_nic_dev *nic_dev)
+{
+ u64 size;
+
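+ /* Remember whether the user overrode the default coalescing
+ * settings via module parameters.
+ */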
+ if (qp_pending_limit != HINIC3_DEAULT_TXRX_MSIX_PENDING_LIMIT ||
+ qp_coalesc_timer_cfg != HINIC3_DEAULT_TXRX_MSIX_COALESC_TIMER_CFG)
+ nic_dev->intr_coal_set_flag = 1;
+ else
+ nic_dev->intr_coal_set_flag = 0;
+
+ size = sizeof(*nic_dev->intr_coalesce) * nic_dev->max_qps;
+ if (!size) {
+ nic_err(&nic_dev->pdev->dev, "Cannot allocate zero size intr coalesce\n");
+ return -EINVAL;
+ }
+ nic_dev->intr_coalesce = kzalloc(size, GFP_KERNEL);
+ if (!nic_dev->intr_coalesce) {
+ nic_err(&nic_dev->pdev->dev, "Failed to alloc intr coalesce\n");
+ return -ENOMEM;
+ }
+
+ init_intr_coal_param(nic_dev);
+
+ if (test_bit(HINIC3_INTR_ADAPT, &nic_dev->flags))
+ nic_dev->adaptive_rx_coal = 1;
+ else
+ nic_dev->adaptive_rx_coal = 0;
+
+ return 0;
+}
+
+static void hinic3_free_intr_coalesce(struct hinic3_nic_dev *nic_dev)
+{
+ kfree(nic_dev->intr_coalesce);
+}
+
+static int hinic3_alloc_txrxqs(struct hinic3_nic_dev *nic_dev)
+{
+ struct net_device *netdev = nic_dev->netdev;
+ int err;
+
+ err = hinic3_alloc_txqs(netdev);
+ if (err) {
+ nic_err(&nic_dev->pdev->dev, "Failed to alloc txqs\n");
+ return err;
+ }
+
+ err = hinic3_alloc_rxqs(netdev);
+ if (err) {
+ nic_err(&nic_dev->pdev->dev, "Failed to alloc rxqs\n");
+ goto alloc_rxqs_err;
+ }
+
+ err = hinic3_init_intr_coalesce(nic_dev);
+ if (err) {
+ nic_err(&nic_dev->pdev->dev, "Failed to init_intr_coalesce\n");
+ goto init_intr_err;
+ }
+
+ return 0;
+
+init_intr_err:
+ hinic3_free_rxqs(netdev);
+
+alloc_rxqs_err:
+ hinic3_free_txqs(netdev);
+
+ return err;
+}
+
+static void hinic3_free_txrxqs(struct hinic3_nic_dev *nic_dev)
+{
+ hinic3_free_intr_coalesce(nic_dev);
+ hinic3_free_rxqs(nic_dev->netdev);
+ hinic3_free_txqs(nic_dev->netdev);
+}
+
+static void hinic3_sw_deinit(struct hinic3_nic_dev *nic_dev)
+{
+ hinic3_free_txrxqs(nic_dev);
+
+ hinic3_clean_mac_list_filter(nic_dev);
+
+ hinic3_del_mac(nic_dev->hwdev, nic_dev->netdev->dev_addr, 0,
+ hinic3_global_func_id(nic_dev->hwdev),
+ HINIC3_CHANNEL_NIC);
+
+ hinic3_clear_rss_config(nic_dev);
+ if (test_bit(HINIC3_DCB_ENABLE, &nic_dev->flags))
+ hinic3_sync_dcb_state(nic_dev->hwdev, 1, 0);
+}
+
+static int hinic3_sw_init(struct hinic3_nic_dev *nic_dev)
+{
+ struct net_device *netdev = nic_dev->netdev;
+ u64 nic_features;
+ int err = 0;
+
+ nic_features = hinic3_get_feature_cap(nic_dev->hwdev);
+ /* The features supported by the driver can be updated here
+ * according to the deployment scenario
+ */
+ nic_features &= NIC_DRV_DEFAULT_FEATURE;
+ hinic3_update_nic_feature(nic_dev->hwdev, nic_features);
+
+ sema_init(&nic_dev->port_state_sem, 1);
+
+ err = hinic3_dcb_init(nic_dev);
+ if (err) {
+ nic_err(&nic_dev->pdev->dev, "Failed to init dcb\n");
+ return -EFAULT;
+ }
+
+ nic_dev->q_params.sq_depth = HINIC3_SQ_DEPTH;
+ nic_dev->q_params.rq_depth = HINIC3_RQ_DEPTH;
+
+ hinic3_try_to_enable_rss(nic_dev);
+
+ err = hinic3_get_default_mac(nic_dev->hwdev, netdev->dev_addr);
+ if (err) {
+ nic_err(&nic_dev->pdev->dev, "Failed to get MAC address\n");
+ goto get_mac_err;
+ }
+
+ if (!is_valid_ether_addr(netdev->dev_addr)) {
+ if (!HINIC3_FUNC_IS_VF(nic_dev->hwdev)) {
+ nic_err(&nic_dev->pdev->dev, "Invalid MAC address %pM\n",
+ netdev->dev_addr);
+ err = -EIO;
+ goto err_mac;
+ }
+
+ nic_info(&nic_dev->pdev->dev, "Invalid MAC address %pM, using random\n",
+ netdev->dev_addr);
+ eth_hw_addr_random(netdev);
+ }
+
+ err = hinic3_set_mac(nic_dev->hwdev, netdev->dev_addr, 0,
+ hinic3_global_func_id(nic_dev->hwdev),
+ HINIC3_CHANNEL_NIC);
+ /* For a VF driver, the PF may already have set the VF's MAC
+ * address; that condition must not be treated as an error during
+ * the driver probe procedure.
+ */
+ if (err && err != HINIC3_PF_SET_VF_ALREADY) {
+ nic_err(&nic_dev->pdev->dev, "Failed to set default MAC\n");
+ goto set_mac_err;
+ }
+
+ /* MTU range: 384 - 9600 */
+#ifdef HAVE_NETDEVICE_MIN_MAX_MTU
+ netdev->min_mtu = HINIC3_MIN_MTU_SIZE;
+ netdev->max_mtu = HINIC3_MAX_JUMBO_FRAME_SIZE;
+#endif
+
+#ifdef HAVE_NETDEVICE_EXTENDED_MIN_MAX_MTU
+ netdev->extended->min_mtu = HINIC3_MIN_MTU_SIZE;
+ netdev->extended->max_mtu = HINIC3_MAX_JUMBO_FRAME_SIZE;
+#endif
+
+ err = hinic3_alloc_txrxqs(nic_dev);
+ if (err) {
+ nic_err(&nic_dev->pdev->dev, "Failed to alloc qps\n");
+ goto alloc_qps_err;
+ }
+
+ return 0;
+
+alloc_qps_err:
+ hinic3_del_mac(nic_dev->hwdev, netdev->dev_addr, 0,
+ hinic3_global_func_id(nic_dev->hwdev),
+ HINIC3_CHANNEL_NIC);
+
+set_mac_err:
+err_mac:
+get_mac_err:
+ hinic3_clear_rss_config(nic_dev);
+
+ return err;
+}
+
+static void hinic3_assign_netdev_ops(struct hinic3_nic_dev *adapter)
+{
+ hinic3_set_netdev_ops(adapter);
+ if (!HINIC3_FUNC_IS_VF(adapter->hwdev))
+ hinic3_set_ethtool_ops(adapter->netdev);
+ else
+ hinic3vf_set_ethtool_ops(adapter->netdev);
+
+ adapter->netdev->watchdog_timeo = WATCHDOG_TIMEOUT * HZ;
+}
+
+static int hinic3_validate_parameters(struct hinic3_lld_dev *lld_dev)
+{
+ struct pci_dev *pdev = lld_dev->pdev;
+
+ /* If weight exceeds the queue depth, the queue resources will be
+ * exhausted, and increasing it has no effect.
+ */
+ if (!poll_weight || poll_weight > HINIC3_MAX_RX_QUEUE_DEPTH) {
+ nic_warn(&pdev->dev, "Module Parameter poll_weight is out of range: [1, %d], resetting to %d\n",
+ HINIC3_MAX_RX_QUEUE_DEPTH, DEFAULT_POLL_WEIGHT);
+ poll_weight = DEFAULT_POLL_WEIGHT;
+ }
+
+ /* Validate rx_buff; the default is 2KB and the only valid values
+ * are 2KB, 4KB, 8KB and 16KB.
+ */
+ if (rx_buff != RX_BUFF_VALID_2KB && rx_buff != RX_BUFF_VALID_4KB &&
+ rx_buff != RX_BUFF_VALID_8KB && rx_buff != RX_BUFF_VALID_16KB) {
+ nic_warn(&pdev->dev, "Module Parameter rx_buff value %u is out of range, must be 2^n. Valid range is 2 - 16, resetting to %dKB",
+ rx_buff, DEFAULT_RX_BUFF_LEN);
+ rx_buff = DEFAULT_RX_BUFF_LEN;
+ }
+
+ return 0;
+}
+
+static void decide_intr_cfg(struct hinic3_nic_dev *nic_dev)
+{
+ set_bit(HINIC3_INTR_ADAPT, &nic_dev->flags);
+}
+
+static void adaptive_configuration_init(struct hinic3_nic_dev *nic_dev)
+{
+ decide_intr_cfg(nic_dev);
+}
+
+static int set_interrupt_moder(struct hinic3_nic_dev *nic_dev, u16 q_id,
+ u8 coalesc_timer_cfg, u8 pending_limt)
+{
+ struct interrupt_info info;
+ int err;
+
+ memset(&info, 0, sizeof(info));
+
+ if (coalesc_timer_cfg == nic_dev->rxqs[q_id].last_coalesc_timer_cfg &&
+ pending_limt == nic_dev->rxqs[q_id].last_pending_limt)
+ return 0;
+
+ /* If the netdev is not running or the qp is not in use, there is
+ * no need to program coalescing into the HW
+ */
+ if (!HINIC3_CHANNEL_RES_VALID(nic_dev) ||
+ q_id >= nic_dev->q_params.num_qps)
+ return 0;
+
+ info.lli_set = 0;
+ info.interrupt_coalesc_set = 1;
+ info.coalesc_timer_cfg = coalesc_timer_cfg;
+ info.pending_limt = pending_limt;
+ info.msix_index = nic_dev->q_params.irq_cfg[q_id].msix_entry_idx;
+ info.resend_timer_cfg =
+ nic_dev->intr_coalesce[q_id].resend_timer_cfg;
+
+ err = hinic3_set_interrupt_cfg(nic_dev->hwdev, info,
+ HINIC3_CHANNEL_NIC);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to modify moderation for Queue: %u\n", q_id);
+ } else {
+ nic_dev->rxqs[q_id].last_coalesc_timer_cfg = coalesc_timer_cfg;
+ nic_dev->rxqs[q_id].last_pending_limt = pending_limt;
+ }
+
+ return err;
+}
+
+static void calc_coal_para(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_intr_coal_info *q_coal, u64 rx_rate,
+ u8 *coalesc_timer_cfg, u8 *pending_limt)
+{
+ if (rx_rate < q_coal->pkt_rate_low) {
+ *coalesc_timer_cfg = q_coal->rx_usecs_low;
+ *pending_limt = q_coal->rx_pending_limt_low;
+ } else if (rx_rate > q_coal->pkt_rate_high) {
+ *coalesc_timer_cfg = q_coal->rx_usecs_high;
+ *pending_limt = q_coal->rx_pending_limt_high;
+ } else {
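+ /* Between the two rate thresholds, linearly interpolate the
+ * timer and pending limit between their low and high settings.
+ */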
+ *coalesc_timer_cfg =
+ (u8)((rx_rate - q_coal->pkt_rate_low) *
+ (q_coal->rx_usecs_high - q_coal->rx_usecs_low) /
+ (q_coal->pkt_rate_high - q_coal->pkt_rate_low) +
+ q_coal->rx_usecs_low);
+
+ *pending_limt =
+ (u8)((rx_rate - q_coal->pkt_rate_low) *
+ (q_coal->rx_pending_limt_high - q_coal->rx_pending_limt_low) /
+ (q_coal->pkt_rate_high - q_coal->pkt_rate_low) +
+ q_coal->rx_pending_limt_low);
+ }
+}
+
+static void update_queue_coal(struct hinic3_nic_dev *nic_dev, u16 qid,
+ u64 rx_rate, u64 avg_pkt_size, u64 tx_rate)
+{
+ struct hinic3_intr_coal_info *q_coal = NULL;
+ u8 coalesc_timer_cfg, pending_limt;
+
+ q_coal = &nic_dev->intr_coalesce[qid];
+
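+ /* A high packet rate with large packets favors throughput
+ * (interpolated coalescing); otherwise pick the lowest-latency
+ * setting.
+ */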
+ if (rx_rate > HINIC3_RX_RATE_THRESH && avg_pkt_size > HINIC3_AVG_PKT_SMALL) {
+ calc_coal_para(nic_dev, q_coal, rx_rate, &coalesc_timer_cfg, &pending_limt);
+ } else {
+ coalesc_timer_cfg = HINIC3_LOWEST_LATENCY;
+ pending_limt = q_coal->rx_pending_limt_low;
+ }
+
+ set_interrupt_moder(nic_dev, qid, coalesc_timer_cfg, pending_limt);
+}
+
+void hinic3_auto_moderation_work(struct work_struct *work)
+{
+ struct delayed_work *delay = to_delayed_work(work);
+ struct hinic3_nic_dev *nic_dev = container_of(delay,
+ struct hinic3_nic_dev,
+ moderation_task);
+ unsigned long period = (unsigned long)(jiffies -
+ nic_dev->last_moder_jiffies);
+ u64 rx_packets, rx_bytes, rx_pkt_diff, rx_rate, avg_pkt_size;
+ u64 tx_packets, tx_bytes, tx_pkt_diff, tx_rate;
+ u16 qid;
+
+ if (!test_bit(HINIC3_INTF_UP, &nic_dev->flags))
+ return;
+
+ queue_delayed_work(nic_dev->workq, &nic_dev->moderation_task,
+ HINIC3_MODERATONE_DELAY);
+
+ if (!nic_dev->adaptive_rx_coal || !period)
+ return;
+
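+ /* Compute per-queue RX/TX packet rates (pkts/s) over the elapsed
+ * period and adapt the interrupt coalescing accordingly.
+ */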
+ for (qid = 0; qid < nic_dev->q_params.num_qps; qid++) {
+ rx_packets = nic_dev->rxqs[qid].rxq_stats.packets;
+ rx_bytes = nic_dev->rxqs[qid].rxq_stats.bytes;
+ tx_packets = nic_dev->txqs[qid].txq_stats.packets;
+ tx_bytes = nic_dev->txqs[qid].txq_stats.bytes;
+
+ rx_pkt_diff =
+ rx_packets - nic_dev->rxqs[qid].last_moder_packets;
+ avg_pkt_size = rx_pkt_diff ?
+ ((unsigned long)(rx_bytes -
+ nic_dev->rxqs[qid].last_moder_bytes)) /
+ rx_pkt_diff : 0;
+
+ rx_rate = rx_pkt_diff * HZ / period;
+ tx_pkt_diff =
+ tx_packets - nic_dev->txqs[qid].last_moder_packets;
+ tx_rate = tx_pkt_diff * HZ / period;
+
+ update_queue_coal(nic_dev, qid, rx_rate, avg_pkt_size,
+ tx_rate);
+
+ nic_dev->rxqs[qid].last_moder_packets = rx_packets;
+ nic_dev->rxqs[qid].last_moder_bytes = rx_bytes;
+ nic_dev->txqs[qid].last_moder_packets = tx_packets;
+ nic_dev->txqs[qid].last_moder_bytes = tx_bytes;
+ }
+
+ nic_dev->last_moder_jiffies = jiffies;
+}
+
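+/* Runs once per second; reports a serious FLR-level fault if a TX timeout
+ * was flagged since the last run.
+ */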
+static void hinic3_periodic_work_handler(struct work_struct *work)
+{
+ struct delayed_work *delay = to_delayed_work(work);
+ struct hinic3_nic_dev *nic_dev = container_of(delay, struct hinic3_nic_dev, periodic_work);
+
+ if (test_and_clear_bit(EVENT_WORK_TX_TIMEOUT, &nic_dev->event_flag))
+ hinic3_fault_event_report(nic_dev->hwdev, HINIC3_FAULT_SRC_TX_TIMEOUT,
+ FAULT_LEVEL_SERIOUS_FLR);
+
+ queue_delayed_work(nic_dev->workq, &nic_dev->periodic_work, HZ);
+}
+
+static void free_nic_dev(struct hinic3_nic_dev *nic_dev)
+{
+ hinic3_deinit_nic_prof_adapter(nic_dev);
+ destroy_workqueue(nic_dev->workq);
+ kfree(nic_dev->vlan_bitmap);
+}
+
+static int setup_nic_dev(struct net_device *netdev,
+ struct hinic3_lld_dev *lld_dev)
+{
+ struct pci_dev *pdev = lld_dev->pdev;
+ struct hinic3_nic_dev *nic_dev;
+ char *netdev_name_fmt;
+ u32 page_num;
+
+ nic_dev = (struct hinic3_nic_dev *)netdev_priv(netdev);
+ nic_dev->netdev = netdev;
+ SET_NETDEV_DEV(netdev, &pdev->dev);
+ nic_dev->lld_dev = lld_dev;
+ nic_dev->hwdev = lld_dev->hwdev;
+ nic_dev->pdev = pdev;
+ nic_dev->poll_weight = (int)poll_weight;
+ nic_dev->msg_enable = DEFAULT_MSG_ENABLE;
+ nic_dev->lro_replenish_thld = lro_replenish_thld;
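+ /* rx_buff is given in KB; derive the DMA buffer size and the page
+ * allocation order for RX buffers from it.
+ */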
+ nic_dev->rx_buff_len = (u16)(rx_buff * CONVERT_UNIT);
+ nic_dev->dma_rx_buff_size = RX_BUFF_NUM_PER_PAGE * nic_dev->rx_buff_len;
+ page_num = nic_dev->dma_rx_buff_size / PAGE_SIZE;
+ nic_dev->page_order = page_num > 0 ? ilog2(page_num) : 0;
+
+ mutex_init(&nic_dev->nic_mutex);
+
+ nic_dev->vlan_bitmap = kzalloc(VLAN_BITMAP_SIZE(nic_dev), GFP_KERNEL);
+ if (!nic_dev->vlan_bitmap) {
+ nic_err(&pdev->dev, "Failed to allocate vlan bitmap\n");
+ return -ENOMEM;
+ }
+
+ nic_dev->workq = create_singlethread_workqueue(HINIC3_NIC_DEV_WQ_NAME);
+ if (!nic_dev->workq) {
+ nic_err(&pdev->dev, "Failed to initialize nic workqueue\n");
+ kfree(nic_dev->vlan_bitmap);
+ return -ENOMEM;
+ }
+
+ INIT_DELAYED_WORK(&nic_dev->periodic_work, hinic3_periodic_work_handler);
+ INIT_DELAYED_WORK(&nic_dev->rxq_check_work, hinic3_rxq_check_work_handler);
+
+ INIT_LIST_HEAD(&nic_dev->uc_filter_list);
+ INIT_LIST_HEAD(&nic_dev->mc_filter_list);
+ INIT_WORK(&nic_dev->rx_mode_work, hinic3_set_rx_mode_work);
+
+ INIT_LIST_HEAD(&nic_dev->rx_flow_rule.rules);
+ INIT_LIST_HEAD(&nic_dev->tcam.tcam_list);
+ INIT_LIST_HEAD(&nic_dev->tcam.tcam_dynamic_info.tcam_dynamic_list);
+
+ hinic3_init_nic_prof_adapter(nic_dev);
+
+ netdev_name_fmt = hinic3_get_dft_netdev_name_fmt(nic_dev);
+ if (netdev_name_fmt)
+ strscpy(netdev->name, netdev_name_fmt, IFNAMSIZ);
+
+ return 0;
+}
+
+static int hinic3_set_default_hw_feature(struct hinic3_nic_dev *nic_dev)
+{
+ int err;
+
+ if (!HINIC3_FUNC_IS_VF(nic_dev->hwdev)) {
+ hinic3_dcb_reset_hw_config(nic_dev);
+
+ if (set_link_status_follow < HINIC3_LINK_FOLLOW_STATUS_MAX) {
+ err = hinic3_set_link_status_follow(nic_dev->hwdev,
+ set_link_status_follow);
+ if (err == HINIC3_MGMT_CMD_UNSUPPORTED)
+ nic_warn(&nic_dev->pdev->dev,
+ "Current version of firmware doesn't support to set link status follow port status\n");
+ }
+ }
+
+ err = hinic3_set_nic_feature_to_hw(nic_dev->hwdev);
+ if (err) {
+ nic_err(&nic_dev->pdev->dev, "Failed to set nic features\n");
+ return err;
+ }
+
+ /* enable all hw features in netdev->features */
+ err = hinic3_set_hw_features(nic_dev);
+ if (err) {
+ hinic3_update_nic_feature(nic_dev->hwdev, 0);
+ hinic3_set_nic_feature_to_hw(nic_dev->hwdev);
+ return err;
+ }
+
+ if (HINIC3_SUPPORT_RXQ_RECOVERY(nic_dev->hwdev))
+ set_bit(HINIC3_RXQ_RECOVERY, &nic_dev->flags);
+
+ return 0;
+}
+
+static int nic_probe(struct hinic3_lld_dev *lld_dev, void **uld_dev,
+ char *uld_dev_name)
+{
+ struct pci_dev *pdev = lld_dev->pdev;
+ struct hinic3_nic_dev *nic_dev = NULL;
+ struct net_device *netdev = NULL;
+ u16 max_qps, glb_func_id;
+ int err;
+
+ if (!hinic3_support_nic(lld_dev->hwdev, NULL)) {
+ nic_info(&pdev->dev, "Hw don't support nic\n");
+ return 0;
+ }
+
+ nic_info(&pdev->dev, "NIC service probe begin\n");
+
+ err = hinic3_validate_parameters(lld_dev);
+ if (err) {
+ err = -EINVAL;
+ goto err_out;
+ }
+
+ glb_func_id = hinic3_global_func_id(lld_dev->hwdev);
+ err = hinic3_func_reset(lld_dev->hwdev, glb_func_id, HINIC3_NIC_RES,
+ HINIC3_CHANNEL_NIC);
+ if (err) {
+ nic_err(&pdev->dev, "Failed to reset function\n");
+ goto err_out;
+ }
+
+ max_qps = hinic3_func_max_nic_qnum(lld_dev->hwdev);
+ netdev = alloc_etherdev_mq(sizeof(*nic_dev), max_qps);
+ if (!netdev) {
+ nic_err(&pdev->dev, "Failed to allocate ETH device\n");
+ err = -ENOMEM;
+ goto err_out;
+ }
+
+ nic_dev = (struct hinic3_nic_dev *)netdev_priv(netdev);
+ err = setup_nic_dev(netdev, lld_dev);
+ if (err)
+ goto setup_dev_err;
+
+ adaptive_configuration_init(nic_dev);
+
+ /* get nic cap from hw */
+ hinic3_support_nic(lld_dev->hwdev, &nic_dev->nic_cap);
+
+ err = hinic3_init_nic_hwdev(nic_dev->hwdev, pdev, &pdev->dev,
+ nic_dev->rx_buff_len);
+ if (err) {
+ nic_err(&pdev->dev, "Failed to init nic hwdev\n");
+ goto init_nic_hwdev_err;
+ }
+
+ err = hinic3_sw_init(nic_dev);
+ if (err)
+ goto sw_init_err;
+
+ hinic3_assign_netdev_ops(nic_dev);
+ netdev_feature_init(netdev);
+
+ err = hinic3_set_default_hw_feature(nic_dev);
+ if (err)
+ goto set_features_err;
+
+#ifdef HAVE_MULTI_VLAN_OFFLOAD_EN
+ hinic3_register_notifier(nic_dev);
+#endif
+
+ err = register_netdev(netdev);
+ if (err) {
+ nic_err(&pdev->dev, "Failed to register netdev\n");
+ goto netdev_err;
+ }
+
+ queue_delayed_work(nic_dev->workq, &nic_dev->periodic_work, HZ);
+ netif_carrier_off(netdev);
+
+ *uld_dev = nic_dev;
+ nicif_info(nic_dev, probe, netdev, "Register netdev succeeded\n");
+ nic_info(&pdev->dev, "NIC service probed\n");
+
+ return 0;
+
+netdev_err:
+#ifdef HAVE_MULTI_VLAN_OFFLOAD_EN
+ hinic3_unregister_notifier(nic_dev);
+#endif
+ hinic3_update_nic_feature(nic_dev->hwdev, 0);
+ hinic3_set_nic_feature_to_hw(nic_dev->hwdev);
+
+set_features_err:
+ hinic3_sw_deinit(nic_dev);
+
+sw_init_err:
+ hinic3_free_nic_hwdev(nic_dev->hwdev);
+
+init_nic_hwdev_err:
+ free_nic_dev(nic_dev);
+setup_dev_err:
+ free_netdev(netdev);
+
+err_out:
+ nic_err(&pdev->dev, "NIC service probe failed\n");
+
+ return err;
+}
+
+static void nic_remove(struct hinic3_lld_dev *lld_dev, void *adapter)
+{
+ struct hinic3_nic_dev *nic_dev = adapter;
+ struct net_device *netdev = NULL;
+
+ if (!nic_dev || !hinic3_support_nic(lld_dev->hwdev, NULL))
+ return;
+
+ nic_info(&lld_dev->pdev->dev, "NIC service remove begin\n");
+
+ netdev = nic_dev->netdev;
+
+ unregister_netdev(netdev);
+#ifdef HAVE_MULTI_VLAN_OFFLOAD_EN
+ hinic3_unregister_notifier(nic_dev);
+#endif
+
+ cancel_delayed_work_sync(&nic_dev->periodic_work);
+ cancel_delayed_work_sync(&nic_dev->rxq_check_work);
+ cancel_work_sync(&nic_dev->rx_mode_work);
+ destroy_workqueue(nic_dev->workq);
+
+ hinic3_flush_rx_flow_rule(nic_dev);
+
+ hinic3_update_nic_feature(nic_dev->hwdev, 0);
+ hinic3_set_nic_feature_to_hw(nic_dev->hwdev);
+
+ hinic3_sw_deinit(nic_dev);
+
+ hinic3_free_nic_hwdev(nic_dev->hwdev);
+
+ hinic3_deinit_nic_prof_adapter(nic_dev);
+ kfree(nic_dev->vlan_bitmap);
+
+ free_netdev(netdev);
+
+ nic_info(&lld_dev->pdev->dev, "NIC service removed\n");
+}
+
+static void sriov_state_change(struct hinic3_nic_dev *nic_dev,
+ const struct hinic3_sriov_state_info *info)
+{
+ if (!info->enable)
+ hinic3_clear_vfs_info(nic_dev->hwdev);
+}
+
+static void hinic3_port_module_event_handler(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_event_info *event)
+{
+ const char *g_hinic3_module_link_err[LINK_ERR_NUM] = { "Unrecognized module" };
+ struct hinic3_port_module_event *module_event = (void *)event->event_data;
+ enum port_module_event_type type = module_event->type;
+ enum link_err_type err_type = module_event->err_type;
+
+ switch (type) {
+ case HINIC3_PORT_MODULE_CABLE_PLUGGED:
+ case HINIC3_PORT_MODULE_CABLE_UNPLUGGED:
+ nicif_info(nic_dev, link, nic_dev->netdev,
+ "Port module event: Cable %s\n",
+ type == HINIC3_PORT_MODULE_CABLE_PLUGGED ?
+ "plugged" : "unplugged");
+ break;
+ case HINIC3_PORT_MODULE_LINK_ERR:
+ if (err_type >= LINK_ERR_NUM) {
+ nicif_info(nic_dev, link, nic_dev->netdev,
+ "Link failed, Unknown error type: 0x%x\n",
+ err_type);
+ } else {
+ nicif_info(nic_dev, link, nic_dev->netdev,
+ "Link failed, error type: 0x%x: %s\n",
+ err_type,
+ g_hinic3_module_link_err[err_type]);
+ }
+ break;
+ default:
+ nicif_err(nic_dev, link, nic_dev->netdev,
+ "Unknown port module type %d\n", type);
+ break;
+ }
+}
+
+static void nic_event(struct hinic3_lld_dev *lld_dev, void *adapter,
+ struct hinic3_event_info *event)
+{
+ struct hinic3_nic_dev *nic_dev = adapter;
+ struct hinic3_fault_event *fault = NULL;
+
+ if (!nic_dev || !event || !hinic3_support_nic(lld_dev->hwdev, NULL))
+ return;
+
+ switch (HINIC3_SRV_EVENT_TYPE(event->service, event->type)) {
+ case HINIC3_SRV_EVENT_TYPE(EVENT_SRV_NIC, EVENT_NIC_LINK_DOWN):
+ hinic3_link_status_change(nic_dev, false);
+ break;
+ case HINIC3_SRV_EVENT_TYPE(EVENT_SRV_NIC, EVENT_NIC_LINK_UP):
+ hinic3_link_status_change(nic_dev, true);
+ break;
+ case HINIC3_SRV_EVENT_TYPE(EVENT_SRV_NIC, EVENT_NIC_PORT_MODULE_EVENT):
+ hinic3_port_module_event_handler(nic_dev, event);
+ break;
+ case HINIC3_SRV_EVENT_TYPE(EVENT_SRV_COMM, EVENT_COMM_SRIOV_STATE_CHANGE):
+ sriov_state_change(nic_dev, (void *)event->event_data);
+ break;
+ case HINIC3_SRV_EVENT_TYPE(EVENT_SRV_COMM, EVENT_COMM_FAULT):
+ fault = (void *)event->event_data;
+ if (fault->fault_level == FAULT_LEVEL_SERIOUS_FLR &&
+ fault->event.chip.func_id == hinic3_global_func_id(lld_dev->hwdev))
+ hinic3_link_status_change(nic_dev, false);
+ break;
+ case HINIC3_SRV_EVENT_TYPE(EVENT_SRV_COMM, EVENT_COMM_PCIE_LINK_DOWN):
+ case HINIC3_SRV_EVENT_TYPE(EVENT_SRV_COMM, EVENT_COMM_HEART_LOST):
+ case HINIC3_SRV_EVENT_TYPE(EVENT_SRV_COMM, EVENT_COMM_MGMT_WATCHDOG):
+ hinic3_link_status_change(nic_dev, false);
+ break;
+ default:
+ break;
+ }
+}
+
+struct net_device *hinic3_get_netdev_by_lld(struct hinic3_lld_dev *lld_dev)
+{
+ struct hinic3_nic_dev *nic_dev = NULL;
+
+ if (!lld_dev || !hinic3_support_nic(lld_dev->hwdev, NULL))
+ return NULL;
+
+ nic_dev = hinic3_get_uld_dev_unsafe(lld_dev, SERVICE_T_NIC);
+ if (!nic_dev) {
+ nic_err(&lld_dev->pdev->dev,
+ "There's no net device attached on the pci device");
+ return NULL;
+ }
+
+ return nic_dev->netdev;
+}
+EXPORT_SYMBOL(hinic3_get_netdev_by_lld);
+
+struct hinic3_lld_dev *hinic3_get_lld_dev_by_netdev(struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = NULL;
+
+ if (!netdev || !hinic3_is_netdev_ops_match(netdev))
+ return NULL;
+
+ nic_dev = netdev_priv(netdev);
+ if (!nic_dev)
+ return NULL;
+
+ return nic_dev->lld_dev;
+}
+EXPORT_SYMBOL(hinic3_get_lld_dev_by_netdev);
+
+struct hinic3_uld_info g_nic_uld_info = {
+ .probe = nic_probe,
+ .remove = nic_remove,
+ .suspend = NULL,
+ .resume = NULL,
+ .event = nic_event,
+ .ioctl = nic_ioctl,
+}; /*lint -e766*/
+
+struct hinic3_uld_info *get_nic_uld_info(void)
+{
+ return &g_nic_uld_info;
+}
+
+#define HINIC3_NIC_DRV_DESC "Intelligent Network Interface Card Driver"
+
+static __init int hinic3_nic_lld_init(void)
+{
+ int err;
+
+ pr_info("%s - version %s\n", HINIC3_NIC_DRV_DESC,
+ HINIC3_NIC_DRV_VERSION);
+
+ err = hinic3_lld_init();
+ if (err) {
+ pr_err("SDK init failed.\n");
+ return err;
+ }
+
+ err = hinic3_register_uld(SERVICE_T_NIC, &g_nic_uld_info);
+ if (err) {
+ pr_err("Register hinic3 uld failed\n");
+ hinic3_lld_exit();
+ return err;
+ }
+
+ err = hinic3_module_pre_init();
+ if (err) {
+ pr_err("Init custom failed\n");
+ hinic3_unregister_uld(SERVICE_T_NIC);
+ hinic3_lld_exit();
+ return err;
+ }
+
+ return 0;
+}
+
+static __exit void hinic3_nic_lld_exit(void)
+{
+ hinic3_unregister_uld(SERVICE_T_NIC);
+
+ hinic3_module_post_exit();
+
+ hinic3_lld_exit();
+}
+
+module_init(hinic3_nic_lld_init);
+module_exit(hinic3_nic_lld_exit);
+
+MODULE_AUTHOR("Huawei Technologies CO., Ltd");
+MODULE_DESCRIPTION(HINIC3_NIC_DRV_DESC);
+MODULE_VERSION(HINIC3_NIC_DRV_VERSION);
+MODULE_LICENSE("GPL");
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_mgmt_interface.h b/drivers/net/ethernet/huawei/hinic3/hinic3_mgmt_interface.h
new file mode 100644
index 000000000000..c4524d703c7d
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_mgmt_interface.h
@@ -0,0 +1,1252 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Huawei HiNIC PCI Express Linux driver
+ * Copyright(c) 2017 Huawei Technologies Co., Ltd
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ * for more details.
+ *
+ */
+
+#ifndef HINIC_MGMT_INTERFACE_H
+#define HINIC_MGMT_INTERFACE_H
+
+#include "nic_cfg_comm.h"
+#include "mgmt_msg_base.h"
+
+#ifndef ETH_ALEN
+#define ETH_ALEN 6
+#endif
+
+#define HINIC3_CMD_OP_SET 1
+#define HINIC3_CMD_OP_GET 0
+
+#define HINIC3_CMD_OP_ADD 1
+#define HINIC3_CMD_OP_DEL 0
+
+#ifndef BIT
+#define BIT(n) (1UL << (n))
+#endif
+
+enum nic_feature_cap {
+ NIC_F_CSUM = BIT(0),
+ NIC_F_SCTP_CRC = BIT(1),
+ NIC_F_TSO = BIT(2),
+ NIC_F_LRO = BIT(3),
+ NIC_F_UFO = BIT(4),
+ NIC_F_RSS = BIT(5),
+ NIC_F_RX_VLAN_FILTER = BIT(6),
+ NIC_F_RX_VLAN_STRIP = BIT(7),
+ NIC_F_TX_VLAN_INSERT = BIT(8),
+ NIC_F_VXLAN_OFFLOAD = BIT(9),
+ NIC_F_IPSEC_OFFLOAD = BIT(10),
+ NIC_F_FDIR = BIT(11),
+ NIC_F_PROMISC = BIT(12),
+ NIC_F_ALLMULTI = BIT(13),
+ NIC_F_XSFP_REPORT = BIT(14),
+ NIC_F_VF_MAC = BIT(15),
+ NIC_F_RATE_LIMIT = BIT(16),
+ NIC_F_RXQ_RECOVERY = BIT(17),
+};
+
+#define NIC_F_ALL_MASK 0x3FFFF /* enable all features */
+
+struct hinic3_mgmt_msg_head {
+ u8 status;
+ u8 version;
+ u8 rsvd0[6];
+};
+
+#define NIC_MAX_FEATURE_QWORD 4
+struct hinic3_cmd_feature_nego {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u8 opcode; /* 1: set, 0: get */
+ u8 rsvd;
+ u64 s_feature[NIC_MAX_FEATURE_QWORD];
+};
+
+struct hinic3_port_mac_set {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u16 vlan_id;
+ u16 rsvd1;
+ u8 mac[ETH_ALEN];
+};
+
+struct hinic3_port_mac_update {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u16 vlan_id;
+ u16 rsvd1;
+ u8 old_mac[ETH_ALEN];
+ u16 rsvd2;
+ u8 new_mac[ETH_ALEN];
+};
+
+struct hinic3_vport_state {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u16 rsvd1;
+ u8 state; /* 0--disable, 1--enable */
+ u8 rsvd2[3];
+};
+
+struct hinic3_port_state {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u16 rsvd1;
+ u8 state; /* 0--disable, 1--enable */
+ u8 rsvd2[3];
+};
+
+#define HINIC3_SET_PORT_CAR_PROFILE 0
+#define HINIC3_SET_PORT_CAR_STATE 1
+
+struct hinic3_port_car_info {
+ u32 cir; /* unit: kbps, range:[1,400*1000*1000], i.e. 1Kbps~400Gbps(400M*kbps) */
+ u32 xir; /* unit: kbps, range:[1,400*1000*1000], i.e. 1Kbps~400Gbps(400M*kbps) */
+ u32 cbs; /* unit: Byte, range:[1,320*1000*1000], i.e. 1byte~2560Mbit */
+ u32 xbs; /* unit: Byte, range:[1,320*1000*1000], i.e. 1byte~2560Mbit */
+};
+
+struct hinic3_cmd_set_port_car {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u8 port_id;
+ u8 opcode; /* 0--set car profile, 1--set car state */
+ u8 state; /* 0--disable, 1--enable */
+ u8 rsvd;
+
+ struct hinic3_port_car_info car;
+};
+
+struct hinic3_cmd_clear_qp_resource {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u16 rsvd1;
+};
+
+struct hinic3_cmd_cache_out_qp_resource {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u16 rsvd1;
+};
+
+struct hinic3_port_stats_info {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u16 rsvd1;
+};
+
+struct hinic3_vport_stats {
+ u64 tx_unicast_pkts_vport;
+ u64 tx_unicast_bytes_vport;
+ u64 tx_multicast_pkts_vport;
+ u64 tx_multicast_bytes_vport;
+ u64 tx_broadcast_pkts_vport;
+ u64 tx_broadcast_bytes_vport;
+
+ u64 rx_unicast_pkts_vport;
+ u64 rx_unicast_bytes_vport;
+ u64 rx_multicast_pkts_vport;
+ u64 rx_multicast_bytes_vport;
+ u64 rx_broadcast_pkts_vport;
+ u64 rx_broadcast_bytes_vport;
+
+ u64 tx_discard_vport;
+ u64 rx_discard_vport;
+ u64 tx_err_vport;
+ u64 rx_err_vport;
+};
+
+struct hinic3_phy_fpga_port_stats {
+ u64 mac_rx_total_octs_port;
+ u64 mac_tx_total_octs_port;
+ u64 mac_rx_under_frame_pkts_port;
+ u64 mac_rx_frag_pkts_port;
+ u64 mac_rx_64_oct_pkts_port;
+ u64 mac_rx_127_oct_pkts_port;
+ u64 mac_rx_255_oct_pkts_port;
+ u64 mac_rx_511_oct_pkts_port;
+ u64 mac_rx_1023_oct_pkts_port;
+ u64 mac_rx_max_oct_pkts_port;
+ u64 mac_rx_over_oct_pkts_port;
+ u64 mac_tx_64_oct_pkts_port;
+ u64 mac_tx_127_oct_pkts_port;
+ u64 mac_tx_255_oct_pkts_port;
+ u64 mac_tx_511_oct_pkts_port;
+ u64 mac_tx_1023_oct_pkts_port;
+ u64 mac_tx_max_oct_pkts_port;
+ u64 mac_tx_over_oct_pkts_port;
+ u64 mac_rx_good_pkts_port;
+ u64 mac_rx_crc_error_pkts_port;
+ u64 mac_rx_broadcast_ok_port;
+ u64 mac_rx_multicast_ok_port;
+ u64 mac_rx_mac_frame_ok_port;
+ u64 mac_rx_length_err_pkts_port;
+ u64 mac_rx_vlan_pkts_port;
+ u64 mac_rx_pause_pkts_port;
+ u64 mac_rx_unknown_mac_frame_port;
+ u64 mac_tx_good_pkts_port;
+ u64 mac_tx_broadcast_ok_port;
+ u64 mac_tx_multicast_ok_port;
+ u64 mac_tx_underrun_pkts_port;
+ u64 mac_tx_mac_frame_ok_port;
+ u64 mac_tx_vlan_pkts_port;
+ u64 mac_tx_pause_pkts_port;
+};
+
+struct hinic3_port_stats {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ struct hinic3_phy_fpga_port_stats stats;
+};
+
+struct hinic3_cmd_vport_stats {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u32 stats_size;
+ u32 rsvd1;
+ struct hinic3_vport_stats stats;
+ u64 rsvd2[6];
+};
+
+struct hinic3_cmd_qpn {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u16 base_qpn;
+};
+
+enum hinic3_func_tbl_cfg_bitmap {
+ FUNC_CFG_INIT,
+ FUNC_CFG_RX_BUF_SIZE,
+ FUNC_CFG_MTU,
+};
+
+struct hinic3_func_tbl_cfg {
+ u16 rx_wqe_buf_size;
+ u16 mtu;
+ u32 rsvd[9];
+};
+
+struct hinic3_cmd_set_func_tbl {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u16 rsvd;
+
+ u32 cfg_bitmap;
+ struct hinic3_func_tbl_cfg tbl_cfg;
+};
+
+struct hinic3_cmd_cons_idx_attr {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_idx;
+ u8 dma_attr_off;
+ u8 pending_limit;
+ u8 coalescing_time;
+ u8 intr_en;
+ u16 intr_idx;
+ u32 l2nic_sqn;
+ u32 rsvd;
+ u64 ci_addr;
+};
+
+typedef union {
+ struct {
+ u32 tbl_index;
+ u32 cnt;
+ u32 total_cnt;
+ } mac_table_arg;
+ struct {
+ u32 er_id;
+ u32 vlan_id;
+ } vlan_elb_table_arg;
+ struct {
+ u32 func_id;
+ } vlan_filter_arg;
+ struct {
+ u32 mc_id;
+ } mc_elb_arg;
+ struct {
+ u32 func_id;
+ } func_tbl_arg;
+ struct {
+ u32 port_id;
+ } port_tbl_arg;
+ struct {
+ u32 tbl_index;
+ u32 cnt;
+ u32 total_cnt;
+ } fdir_io_table_arg;
+ struct {
+ u32 tbl_index;
+ u32 cnt;
+ u32 total_cnt;
+ } flexq_table_arg;
+ u32 args[4];
+} sm_tbl_args;
+
+#define DFX_SM_TBL_BUF_MAX (768)
+
+struct nic_cmd_dfx_sm_table {
+ struct hinic3_mgmt_msg_head msg_head;
+ u32 tbl_type;
+ sm_tbl_args args;
+ u8 tbl_buf[DFX_SM_TBL_BUF_MAX];
+};
+
+struct hinic3_cmd_vlan_offload {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u8 vlan_offload;
+ u8 rsvd1[5];
+};
+
+/* ucode capture cfg info */
+struct nic_cmd_capture_info {
+ struct hinic3_mgmt_msg_head msg_head;
+ u32 op_type;
+ u32 func_port;
+ u32 is_en_trx; /* also used as tx_rx */
+ u32 offset_cos; /* also used as cos */
+ u32 data_vlan; /* also used as vlan */
+};
+
+struct hinic3_cmd_lro_config {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u8 opcode;
+ u8 rsvd1;
+ u8 lro_ipv4_en;
+ u8 lro_ipv6_en;
+ u8 lro_max_pkt_len; /* unit is 1K */
+ u8 resv2[13];
+};
+
+struct hinic3_cmd_lro_timer {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u8 opcode; /* 1: set timer value, 0: get timer value */
+ u8 rsvd1;
+ u16 rsvd2;
+ u32 timer;
+};
+
+struct hinic3_cmd_local_lro_state {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u8 opcode; /* 0: get state, 1: set state */
+ u8 state; /* 0: disable, 1: enable */
+};
+
+struct hinic3_cmd_vf_vlan_config {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u8 opcode;
+ u8 rsvd1;
+ u16 vlan_id;
+ u8 qos;
+ u8 rsvd2[5];
+};
+
+struct hinic3_cmd_spoofchk_set {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u8 state;
+ u8 rsvd1;
+};
+
+struct hinic3_cmd_tx_rate_cfg {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u16 rsvd1;
+ u32 min_rate;
+ u32 max_rate;
+ u8 rsvd2[8];
+};
+
+struct hinic3_cmd_port_info {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u8 port_id;
+ u8 rsvd1[3];
+ u8 port_type;
+ u8 autoneg_cap;
+ u8 autoneg_state;
+ u8 duplex;
+ u8 speed;
+ u8 fec;
+ u16 rsvd2;
+ u32 rsvd3[4];
+};
+
+struct hinic3_cmd_register_vf {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u8 op_register; /* 0 - unregister, 1 - register */
+ u8 rsvd1[3];
+ u32 support_extra_feature;
+ u8 rsvd2[32];
+};
+
+struct hinic3_cmd_link_state {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u8 port_id;
+ u8 state;
+ u16 rsvd1;
+};
+
+struct hinic3_cmd_vlan_config {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u8 opcode;
+ u8 rsvd1;
+ u16 vlan_id;
+ u16 rsvd2;
+};
+
+/* set vlan filter */
+struct hinic3_cmd_set_vlan_filter {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u8 resvd[2];
+ u32 vlan_filter_ctrl; /* bit0:vlan filter en; bit1:broadcast_filter_en */
+};
+
+struct hinic3_cmd_link_ksettings_info {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u8 port_id;
+ u8 rsvd1[3];
+
+ u32 valid_bitmap;
+ u8 speed; /* enum nic_speed_level */
+ u8 autoneg; /* 0 - off, 1 - on */
+ u8 fec; /* 0 - RSFEC, 1 - BASEFEC, 2 - NOFEC */
+ u8 rsvd2[21]; /* reserved for duplex, port, etc. */
+};
+
+struct mpu_lt_info {
+ u8 node;
+ u8 inst;
+ u8 entry_size;
+ u8 rsvd;
+ u32 lt_index;
+ u32 offset;
+ u32 len;
+};
+
+struct nic_mpu_lt_opera {
+ struct hinic3_mgmt_msg_head msg_head;
+ struct mpu_lt_info net_lt_cmd;
+ u8 data[100];
+};
+
+struct hinic3_force_pkt_drop {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u8 port;
+ u8 rsvd1[3];
+};
+struct hinic3_rx_mode_config {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u16 rsvd1;
+ u32 rx_mode;
+};
+
+/* rss */
+struct hinic3_rss_context_table {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u16 rsvd1;
+ u32 context;
+};
+
+struct hinic3_cmd_rss_engine_type {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u8 opcode;
+ u8 hash_engine;
+ u8 rsvd1[4];
+};
+
+struct hinic3_cmd_rss_hash_key {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u8 opcode;
+ u8 rsvd1;
+ u8 key[NIC_RSS_KEY_SIZE];
+};
+
+struct hinic3_rss_indir_table {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u16 rsvd1;
+ u8 indir[NIC_RSS_INDIR_SIZE];
+};
+
+#define NIC_RSS_CMD_TEMP_ALLOC 0x01
+#define NIC_RSS_CMD_TEMP_FREE 0x02
+
+struct hinic3_rss_template_mgmt {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u8 cmd;
+ u8 template_id;
+ u8 rsvd1[4];
+};
+
+struct hinic3_cmd_rss_config {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u8 rss_en;
+ u8 rq_priority_number;
+ u8 prio_tc[NIC_DCB_COS_MAX];
+ u16 num_qps;
+ u16 rsvd1;
+};
+
+struct hinic3_dcb_state {
+ u8 dcb_on;
+ u8 default_cos;
+ u8 trust;
+ u8 rsvd1;
+ u8 pcp2cos[NIC_DCB_UP_MAX];
+ u8 dscp2cos[64];
+ u32 rsvd2[7];
+};
+
+struct hinic3_cmd_vf_dcb_state {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ struct hinic3_dcb_state state;
+};
+
+struct hinic3_up_ets_cfg { /* to be deleted */
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u8 port_id;
+ u8 rsvd1[3];
+
+ u8 cos_tc[NIC_DCB_COS_MAX];
+ u8 tc_bw[NIC_DCB_TC_MAX];
+ u8 cos_prio[NIC_DCB_COS_MAX];
+ u8 cos_bw[NIC_DCB_COS_MAX];
+ u8 tc_prio[NIC_DCB_TC_MAX];
+};
+
+#define CMD_QOS_ETS_COS_TC BIT(0)
+#define CMD_QOS_ETS_TC_BW BIT(1)
+#define CMD_QOS_ETS_COS_PRIO BIT(2)
+#define CMD_QOS_ETS_COS_BW BIT(3)
+#define CMD_QOS_ETS_TC_PRIO BIT(4)
+struct hinic3_cmd_ets_cfg {
+ struct hinic3_mgmt_msg_head head;
+
+ u8 port_id;
+ u8 op_code; /* 1 - set, 0 - get */
+ /* bit0 - cos_tc, bit1 - tc_bw, bit2 - cos_prio, bit3 - cos_bw, bit4 - tc_prio */
+ u8 cfg_bitmap;
+ u8 rsvd;
+
+ u8 cos_tc[NIC_DCB_COS_MAX];
+ u8 tc_bw[NIC_DCB_TC_MAX];
+ u8 cos_prio[NIC_DCB_COS_MAX]; /* 0 - DWRR, 1 - STRICT */
+ u8 cos_bw[NIC_DCB_COS_MAX];
+ u8 tc_prio[NIC_DCB_TC_MAX]; /* 0 - DWRR, 1 - STRICT */
+};
+
+struct hinic3_cmd_set_dcb_state {
+ struct hinic3_mgmt_msg_head head;
+
+ u16 func_id;
+ u8 op_code; /* 0 - get dcb state, 1 - set dcb state */
+ u8 state; /* 0 - disable, 1 - enable dcb */
+ u8 port_state; /* 0 - disable, 1 - enable dcb */
+ u8 rsvd[7];
+};
+
+#define PFC_BIT_MAP_NUM 8
+struct hinic3_cmd_set_pfc {
+ struct hinic3_mgmt_msg_head head;
+
+ u8 port_id;
+ u8 op_code; /* 0:get 1: set pfc_en 2: set pfc_bitmap 3: set all */
+ u8 pfc_en; /* pfc_en and pfc_bitmap must be set together */
+ u8 pfc_bitmap;
+ u8 rsvd[4];
+};
+
+#define CMD_QOS_PORT_TRUST BIT(0)
+#define CMD_QOS_PORT_DFT_COS BIT(1)
+struct hinic3_cmd_qos_port_cfg {
+ struct hinic3_mgmt_msg_head head;
+
+ u8 port_id;
+ u8 op_code; /* 0 - get, 1 - set */
+ u8 cfg_bitmap; /* bit0 - trust, bit1 - dft_cos */
+ u8 rsvd0;
+
+ u8 trust;
+ u8 dft_cos;
+ u8 rsvd1[18];
+};
+
+#define MAP_COS_MAX_NUM 8
+#define CMD_QOS_MAP_PCP2COS BIT(0)
+#define CMD_QOS_MAP_DSCP2COS BIT(1)
+struct hinic3_cmd_qos_map_cfg {
+ struct hinic3_mgmt_msg_head head;
+
+ u8 op_code;
+ u8 cfg_bitmap; /* bit0 - pcp2cos, bit1 - dscp2cos */
+ u16 rsvd0;
+
+ u8 pcp2cos[8]; /* all 8 entries must be configured together */
+ /* When configuring dscp2cos, a cos value of 0xFF makes the MPU ignore
+ * that dscp priority, which allows several dscp-to-cos mappings to be
+ * configured at once
+ */
+ u8 dscp2cos[64];
+ u32 rsvd1[4];
+};
+
+struct hinic3_cos_up_map {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u8 port_id;
+ u8 cos_valid_mask; /* every bit indicate index of map is valid 1 or not 0 */
+ u16 rsvd1;
+
+ /* user priority in cos(index:cos, value: up pri) */
+ u8 map[NIC_DCB_UP_MAX];
+};
+
+struct hinic3_cmd_pause_config {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u8 port_id;
+ u8 opcode;
+ u16 rsvd1;
+ u8 auto_neg;
+ u8 rx_pause;
+ u8 tx_pause;
+ u8 rsvd2[5];
+};
+
+/* PFC storm detection configuration */
+struct nic_cmd_pause_inquiry_cfg {
+ struct hinic3_mgmt_msg_head head;
+
+ u32 valid;
+
+ u32 type; /* 1: set, 2: get */
+
+ u32 rx_inquiry_pause_drop_pkts_en; /* rx packet-drop enable */
+ u32 rx_inquiry_pause_period_ms; /* rx pause detection period, default 200ms */
+ u32 rx_inquiry_pause_times; /* rx pause detection times, default 1 */
+ /* rx pause detection threshold, default PAUSE_FRAME_THD_10G/25G/40G/100 */
+ u32 rx_inquiry_pause_frame_thd;
+ u32 rx_inquiry_tx_total_pkts; /* total tx packets during rx pause detection */
+
+ u32 tx_inquiry_pause_en; /* tx pause detection enable */
+ u32 tx_inquiry_pause_period_ms; /* tx pause detection period, default 200ms */
+ u32 tx_inquiry_pause_times; /* tx pause detection times, default 5 */
+ u32 tx_inquiry_pause_frame_thd; /* tx pause detection threshold */
+ u32 tx_inquiry_rx_total_pkts; /* total rx packets during tx pause detection */
+
+ u32 rsvd[4];
+};
+
+/* PFC/pause storm tx exception report */
+struct nic_cmd_tx_pause_notice {
+ struct hinic3_mgmt_msg_head head;
+
+ u32 tx_pause_except; /* 1: exception, 0: normal */
+ u32 except_level;
+ u32 rsvd;
+};
+
+#define HINIC3_CMD_OP_FREE 0
+#define HINIC3_CMD_OP_ALLOC 1
+
+struct hinic3_cmd_cfg_qps {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u8 opcode; /* 1: alloc qp, 0: free qp */
+ u8 rsvd1;
+ u16 num_qps;
+ u16 rsvd2;
+};
+
+struct hinic3_cmd_led_config {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u8 port;
+ u8 type;
+ u8 mode;
+ u8 rsvd1;
+};
+
+struct hinic3_cmd_port_loopback {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u8 port_id;
+ u8 opcode;
+ u8 mode;
+ u8 en;
+ u32 rsvd1[2];
+};
+
+struct hinic3_cmd_get_light_module_abs {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u8 port_id;
+ u8 abs_status; /* 0:present, 1:absent */
+ u8 rsv[2];
+};
+
+#define STD_SFP_INFO_MAX_SIZE 640
+struct hinic3_cmd_get_std_sfp_info {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u8 port_id;
+ u8 wire_type;
+ u16 eeprom_len;
+ u32 rsvd;
+ u8 sfp_info[STD_SFP_INFO_MAX_SIZE];
+};
+
+struct hinic3_cable_plug_event {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u8 plugged; /* 0: unplugged, 1: plugged */
+ u8 port_id;
+};
+
+/* MAC module interface */
+struct nic_cmd_mac_info {
+ struct hinic3_mgmt_msg_head head;
+
+ u32 valid_bitmap;
+ u16 rsvd;
+
+ u8 host_id[32];
+ u8 port_id[32];
+ u8 mac_addr[192];
+};
+
+struct nic_cmd_set_tcam_enable {
+ struct hinic3_mgmt_msg_head head;
+
+ u16 func_id;
+ u8 tcam_enable;
+ u8 rsvd1;
+ u32 rsvd2;
+};
+
+struct nic_cmd_set_fdir_status {
+ struct hinic3_mgmt_msg_head head;
+
+ u16 func_id;
+ u16 rsvd1;
+ u8 pkt_type_en;
+ u8 pkt_type;
+ u8 qid;
+ u8 rsvd2;
+};
+
+#define HINIC3_TCAM_BLOCK_ENABLE 1
+#define HINIC3_TCAM_BLOCK_DISABLE 0
+#define HINIC3_MAX_TCAM_RULES_NUM 4096
+
+/* tcam block type, according to tcam block size */
+enum {
+ NIC_TCAM_BLOCK_TYPE_LARGE = 0, /* block_size: 16 */
+ NIC_TCAM_BLOCK_TYPE_SMALL, /* block_size: 0 */
+ NIC_TCAM_BLOCK_TYPE_MAX
+};
+
+/* alloc tcam block input struct */
+struct nic_cmd_ctrl_tcam_block_in {
+ struct hinic3_mgmt_msg_head head;
+
+ u16 func_id; /* func_id */
+ u8 alloc_en; /* 0: free the allocated tcam block, 1: allocate a new tcam block */
+ /* 0: allocate a block of size 16, 1: allocate a block of size 0, others reserved */
+ u8 tcam_type;
+ u16 tcam_block_index;
+ /* From driver to uP: the block size the driver wants to allocate.
+ * Returned by the uP: the tcam block size the uP supports allocating.
+ */
+ u16 alloc_block_num;
+};
+
+/* alloc tcam block output struct */
+struct nic_cmd_ctrl_tcam_block_out {
+ struct hinic3_mgmt_msg_head head;
+
+ u16 func_id; /* func_id */
+ u8 alloc_en; /* 0: free the allocated tcam block, 1: allocate a new tcam block */
+ /* 0: allocate a block of size 16, 1: allocate a block of size 0, others reserved */
+ u8 tcam_type;
+ u16 tcam_block_index;
+ /* From driver to uP: the block size the driver wants to allocate.
+ * Returned by the uP: the tcam block size the uP supports allocating.
+ */
+ u16 mpu_alloc_block_size;
+};
+
+struct nic_cmd_flush_tcam_rules {
+ struct hinic3_mgmt_msg_head head;
+
+ u16 func_id; /* func_id */
+ u16 rsvd;
+};
+
+struct nic_cmd_dfx_fdir_tcam_block_table {
+ struct hinic3_mgmt_msg_head head;
+ u8 tcam_type;
+ u8 valid;
+ u16 tcam_block_index;
+ u16 use_function_id;
+ u16 rsvd;
+};
+
+struct tcam_result {
+ u32 qid;
+ u32 rsvd;
+};
+
+#define TCAM_FLOW_KEY_SIZE (44)
+
+struct tcam_key_x_y {
+ u8 x[TCAM_FLOW_KEY_SIZE];
+ u8 y[TCAM_FLOW_KEY_SIZE];
+};
+
+struct nic_tcam_cfg_rule {
+ u32 index;
+ struct tcam_result data;
+ struct tcam_key_x_y key;
+};
+
+#define TCAM_RULE_FDIR_TYPE 0
+#define TCAM_RULE_PPA_TYPE 1
+
+struct nic_cmd_fdir_add_rule {
+ struct hinic3_mgmt_msg_head head;
+
+ u16 func_id;
+ u8 type;
+ u8 rsvd;
+ struct nic_tcam_cfg_rule rule;
+};
+
+struct nic_cmd_fdir_del_rules {
+ struct hinic3_mgmt_msg_head head;
+
+ u16 func_id;
+ u8 type;
+ u8 rsvd;
+ u32 index_start;
+ u32 index_num;
+};
+
+struct nic_cmd_fdir_get_rule {
+ struct hinic3_mgmt_msg_head head;
+
+ u32 index;
+ u8 valid;
+ u8 type;
+ u16 rsvd;
+ struct tcam_key_x_y key;
+ struct tcam_result data;
+ u64 packet_count;
+ u64 byte_count;
+};
+
+struct hinic3_tcam_key_ipv4_mem {
+ u32 rsvd1 : 4;
+ u32 tunnel_type : 4;
+ u32 ip_proto : 8;
+ u32 rsvd0 : 16;
+ u32 sipv4_h : 16;
+ u32 ip_type : 1;
+ u32 function_id : 15;
+ u32 dipv4_h : 16;
+ u32 sipv4_l : 16;
+ u32 rsvd2 : 16;
+ u32 dipv4_l : 16;
+ u32 rsvd3;
+ u32 dport : 16;
+ u32 rsvd4 : 16;
+ u32 rsvd5 : 16;
+ u32 sport : 16;
+ u32 outer_sipv4_h : 16;
+ u32 rsvd6 : 16;
+ u32 outer_dipv4_h : 16;
+ u32 outer_sipv4_l : 16;
+ u32 vni_h : 16;
+ u32 outer_dipv4_l : 16;
+ u32 rsvd7 : 16;
+ u32 vni_l : 16;
+};
+
+struct hinic3_tcam_key_ipv6_mem {
+ u32 rsvd1 : 4;
+ u32 tunnel_type : 4;
+ u32 ip_proto : 8;
+ u32 rsvd0 : 16;
+ u32 sipv6_key0 : 16;
+ u32 ip_type : 1;
+ u32 function_id : 15;
+ u32 sipv6_key2 : 16;
+ u32 sipv6_key1 : 16;
+ u32 sipv6_key4 : 16;
+ u32 sipv6_key3 : 16;
+ u32 sipv6_key6 : 16;
+ u32 sipv6_key5 : 16;
+ u32 dport : 16;
+ u32 sipv6_key7 : 16;
+ u32 dipv6_key0 : 16;
+ u32 sport : 16;
+ u32 dipv6_key2 : 16;
+ u32 dipv6_key1 : 16;
+ u32 dipv6_key4 : 16;
+ u32 dipv6_key3 : 16;
+ u32 dipv6_key6 : 16;
+ u32 dipv6_key5 : 16;
+ u32 rsvd2 : 16;
+ u32 dipv6_key7 : 16;
+};
+
+struct hinic3_tcam_key_vxlan_ipv6_mem {
+ u32 rsvd1 : 4;
+ u32 tunnel_type : 4;
+ u32 ip_proto : 8;
+ u32 rsvd0 : 16;
+
+ u32 dipv6_key0 : 16;
+ u32 ip_type : 1;
+ u32 function_id : 15;
+
+ u32 dipv6_key2 : 16;
+ u32 dipv6_key1 : 16;
+
+ u32 dipv6_key4 : 16;
+ u32 dipv6_key3 : 16;
+
+ u32 dipv6_key6 : 16;
+ u32 dipv6_key5 : 16;
+
+ u32 dport : 16;
+ u32 dipv6_key7 : 16;
+
+ u32 rsvd2 : 16;
+ u32 sport : 16;
+
+ u32 outer_sipv4_h : 16;
+ u32 rsvd3 : 16;
+
+ u32 outer_dipv4_h : 16;
+ u32 outer_sipv4_l : 16;
+
+ u32 vni_h : 16;
+ u32 outer_dipv4_l : 16;
+
+ u32 rsvd4 : 16;
+ u32 vni_l : 16;
+};
+
+struct tag_tcam_key {
+ union {
+ struct hinic3_tcam_key_ipv4_mem key_info;
+ struct hinic3_tcam_key_ipv6_mem key_info_ipv6;
+ struct hinic3_tcam_key_vxlan_ipv6_mem key_info_vxlan_ipv6;
+ };
+
+ union {
+ struct hinic3_tcam_key_ipv4_mem key_mask;
+ struct hinic3_tcam_key_ipv6_mem key_mask_ipv6;
+ struct hinic3_tcam_key_vxlan_ipv6_mem key_mask_vxlan_ipv6;
+ };
+};
+
+enum {
+ PPA_TABLE_ID_CLEAN_CMD = 0,
+ PPA_TABLE_ID_ADD_CMD,
+ PPA_TABLE_ID_DEL_CMD,
+ FDIR_TABLE_ID_ADD_CMD,
+ FDIR_TABLE_ID_DEL_CMD,
+ PPA_TABEL_ID_MAX
+};
+
+struct hinic3_ppa_cfg_table_id_cmd {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 rsvd0;
+ u16 cmd;
+ u16 table_id;
+ u16 rsvd1;
+};
+
+struct hinic3_ppa_cfg_ppa_en_cmd {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 func_id;
+ u8 ppa_en;
+ u8 rsvd;
+};
+
+struct hinic3_ppa_cfg_mode_cmd {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 rsvd0;
+ u8 ppa_mode;
+ u8 qpc_func_nums;
+ u16 base_qpc_func_id;
+ u16 rsvd1;
+};
+
+struct hinic3_ppa_flush_en_cmd {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u16 rsvd0;
+ u8 flush_en; /* 0 flush done, 1 in flush operation */
+ u8 rsvd1;
+};
+
+struct hinic3_ppa_fdir_query_cmd {
+ struct hinic3_mgmt_msg_head msg_head;
+
+ u32 index;
+ u32 rsvd;
+ u64 pkt_nums;
+ u64 pkt_bytes;
+};
+
+/* BIOS CONF */
+enum {
+ NIC_NVM_DATA_SET = BIT(0), /* 1-save, 0-read */
+ NIC_NVM_DATA_PXE = BIT(1),
+ NIC_NVM_DATA_VLAN = BIT(2),
+ NIC_NVM_DATA_VLAN_PRI = BIT(3),
+ NIC_NVM_DATA_VLAN_ID = BIT(4),
+ NIC_NVM_DATA_WORK_MODE = BIT(5),
+ NIC_NVM_DATA_PF_SPEED_LIMIT = BIT(6),
+ NIC_NVM_DATA_GE_MODE = BIT(7),
+ NIC_NVM_DATA_AUTO_NEG = BIT(8),
+ NIC_NVM_DATA_LINK_FEC = BIT(9),
+ NIC_NVM_DATA_PF_ADAPTIVE_LINK = BIT(10),
+ NIC_NVM_DATA_SRIOV_CONTROL = BIT(11),
+ NIC_NVM_DATA_EXTEND_MODE = BIT(12),
+ NIC_NVM_DATA_RESET = BIT(31),
+};
+
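+/* Helpers to decode the op_code bitmap carried in struct nic_cmd_bios_cfg */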
+#define BIOS_CFG_SIGNATURE 0x1923E518
+#define BIOS_OP_CFG_ALL(op_code_val) (((op_code_val) >> 1) & (0xFFFFFFFF))
+#define BIOS_OP_CFG_WRITE(op_code_val) ((op_code_val) & NIC_NVM_DATA_SET)
+#define BIOS_OP_CFG_PXE_EN(op_code_val) ((op_code_val) & NIC_NVM_DATA_PXE)
+#define BIOS_OP_CFG_VLAN_EN(op_code_val) ((op_code_val) & NIC_NVM_DATA_VLAN)
+#define BIOS_OP_CFG_VLAN_PRI(op_code_val) ((op_code_val) & NIC_NVM_DATA_VLAN_PRI)
+#define BIOS_OP_CFG_VLAN_ID(op_code_val) ((op_code_val) & NIC_NVM_DATA_VLAN_ID)
+#define BIOS_OP_CFG_WORK_MODE(op_code_val) ((op_code_val) & NIC_NVM_DATA_WORK_MODE)
+#define BIOS_OP_CFG_PF_BW(op_code_val) ((op_code_val) & NIC_NVM_DATA_PF_SPEED_LIMIT)
+#define BIOS_OP_CFG_GE_SPEED(op_code_val) ((op_code_val) & NIC_NVM_DATA_GE_MODE)
+#define BIOS_OP_CFG_AUTO_NEG(op_code_val) ((op_code_val) & NIC_NVM_DATA_AUTO_NEG)
+#define BIOS_OP_CFG_LINK_FEC(op_code_val) ((op_code_val) & NIC_NVM_DATA_LINK_FEC)
+#define BIOS_OP_CFG_AUTO_ADPAT(op_code_val) ((op_code_val) & NIC_NVM_DATA_PF_ADAPTIVE_LINK)
+#define BIOS_OP_CFG_SRIOV_ENABLE(op_code_val) ((op_code_val) & NIC_NVM_DATA_SRIOV_CONTROL)
+#define BIOS_OP_CFG_EXTEND_MODE(op_code_val) ((op_code_val) & NIC_NVM_DATA_EXTEND_MODE)
+#define BIOS_OP_CFG_RST_DEF_SET(op_code_val) ((op_code_val) & (u32)NIC_NVM_DATA_RESET)
+
+#define NIC_BIOS_CFG_MAX_PF_BW 100
+/* Note: this structure must be kept 4-byte aligned */
+struct nic_bios_cfg {
+ u32 signature; /* signature used to check the validity of the FLASH content */
+ u8 pxe_en; /* PXE enable: 0 - disable 1 - enable */
+ u8 extend_mode;
+ u8 rsvd0[2];
+ u8 pxe_vlan_en; /* PXE VLAN enable: 0 - disable 1 - enable */
+ u8 pxe_vlan_pri; /* PXE VLAN priority: 0-7 */
+ u16 pxe_vlan_id; /* PXE VLAN ID 1-4094 */
+ u32 service_mode; /* see the CHIPIF_SERVICE_MODE_x macros */
+ u32 pf_bw; /* PF rate as a percentage, 0-100 */
+ u8 speed; /* enum of port speed */
+ u8 auto_neg; /* autonegotiation: 0 - field invalid, 1 - on, 2 - off */
+ u8 lanes; /* lane num */
+ u8 fec; /* FEC mode, see enum mag_cmd_port_fec */
+ u8 auto_adapt; /* auto-adaptive mode: 0 - invalid, 1 - on, 2 - off */
+ u8 func_valid; /* whether func_id is valid; 0 - invalid, other - valid */
+ u8 func_id; /* only meaningful when func_valid is non-zero */
+ u8 sriov_en; /* SRIOV-EN: 0 - invalid, 1 - on, 2 - off */
+};
+
+struct nic_cmd_bios_cfg {
+ struct hinic3_mgmt_msg_head head;
+ u32 op_code; /* Operation Code: Bit0 - 0: read, 1: write; Bit1-6: cfg_mask */
+ struct nic_bios_cfg bios_cfg;
+};
+
+struct nic_cmd_vhd_config {
+ struct hinic3_mgmt_msg_head head;
+
+ u16 func_id;
+ u8 vhd_type;
+ u8 virtio_small_enable; /* 0: mergeable mode, 1: small mode */
+};
+
+/* BOND */
+struct hinic3_create_bond_info {
+ u32 bond_id; /* bond device id; valid as output, filled in by the MPU on success */
+ u32 master_slave_port_id;
+ u32 slave_bitmap; /* bond port id bitmap */
+ u32 poll_timeout; /* bond device link check interval */
+ u32 up_delay; /* reserved for now */
+ u32 down_delay; /* reserved for now */
+ u32 bond_mode; /* reserved for now */
+ u32 active_pf; /* active pf id used by the bond */
+ u32 active_port_max_num; /* upper limit on the number of active bond member ports */
+ u32 active_port_min_num; /* lower limit on the number of active bond member ports */
+ u32 xmit_hash_policy; /* hash policy used by the microcode path-selection logic */
+ u32 rsvd[2];
+};
+
+/* message interface for creating a bond */
+struct hinic3_cmd_create_bond {
+ struct hinic3_mgmt_msg_head head;
+ struct hinic3_create_bond_info create_bond_info;
+};
+
+struct hinic3_cmd_delete_bond {
+ struct hinic3_mgmt_msg_head head;
+ u32 bond_id;
+ u32 rsvd[2];
+};
+
+struct hinic3_open_close_bond_info {
+ u32 bond_id; /* bond device id */
+ u32 open_close_flag; /* open/close bond flag: 1 - open, 0 - close */
+ u32 rsvd[2];
+};
+
+/* message interface for opening/closing an MPU bond */
+struct hinic3_cmd_open_close_bond {
+ struct hinic3_mgmt_msg_head head;
+ struct hinic3_open_close_bond_info open_close_bond_info;
+};
+
+/* port-related fields of the LACPDU */
+struct lacp_port_params {
+ u16 port_number;
+ u16 port_priority;
+ u16 key;
+ u16 system_priority;
+ u8 system[ETH_ALEN];
+ u8 port_state;
+ u8 rsvd;
+};
+
+struct lacp_port_info {
+ u32 selected;
+ u32 aggregator_port_id; /* aggregator port ID in use */
+
+ struct lacp_port_params actor; /* actor port parameters */
+ struct lacp_port_params partner; /* partner port parameters */
+
+ u64 tx_lacp_pkts;
+ u64 rx_lacp_pkts;
+ u64 rx_8023ad_drop;
+ u64 tx_8023ad_drop;
+ u64 unknown_pkt_drop;
+ u64 rx_marker_pkts;
+ u64 tx_marker_pkts;
+};
+
+/* lacp status information */
+struct hinic3_bond_status_info {
+ struct hinic3_mgmt_msg_head head;
+ u32 bond_id;
+ u32 bon_mmi_status; /* link status of this bond sub-device */
+ u32 active_bitmap; /* slave port status of this bond sub-device */
+ u32 port_count; /* number of member ports in this bond */
+
+ struct lacp_port_info port_info[4];
+
+ u64 success_report_cnt[4]; /* per-host count of successful lacp negotiation result reports */
+ u64 fail_report_cnt[4]; /* per-host count of failed lacp negotiation result reports */
+
+ u64 poll_timeout;
+ u64 fast_periodic_timeout;
+ u64 slow_periodic_timeout;
+ u64 short_timeout;
+ u64 long_timeout;
+ u64 aggregate_wait_timeout;
+ u64 tx_period_timeout;
+ u64 rx_marker_timer;
+};
+
+/* async notification sent to the host after the lacp negotiation result is updated */
+struct hinic3_bond_active_report_info {
+ struct hinic3_mgmt_msg_head head;
+ u32 bond_id;
+ u32 bon_mmi_status; /* link status of this bond sub-device */
+ u32 active_bitmap; /* slave port status of this bond sub-device */
+
+ u8 rsvd[16];
+};
+
+/* IP checksum error packets, enable rss quadruple hash. */
+struct hinic3_ipcs_err_rss_enable_operation_s {
+ struct hinic3_mgmt_msg_head head;
+
+ u8 en_tag;
+ u8 type; /* 1: set 0: get */
+ u8 rsvd[2];
+};
+
+struct hinic3_smac_check_state {
+ struct hinic3_mgmt_msg_head head;
+ u8 smac_check_en; /* 1: enable 0: disable */
+ u8 op_code; /* 1: set 0: get */
+ u8 rsvd[2];
+};
+
+#endif /* HINIC_MGMT_INTERFACE_H */
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_mt.h b/drivers/net/ethernet/huawei/hinic3/hinic3_mt.h
new file mode 100644
index 000000000000..4e9f38d1ed6a
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_mt.h
@@ -0,0 +1,681 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_MT_H
+#define HINIC3_MT_H
+
+#define HINIC3_DRV_NAME "hisdk3"
+#define HINIC3_CHIP_NAME "hinic"
+/* Interrupts are recorded in the FFM, up to a maximum record count */
+
+#define NICTOOL_CMD_TYPE (0x18)
+
+struct api_cmd_rd {
+ u32 pf_id;
+ u8 dest;
+ u8 *cmd;
+ u16 size;
+ void *ack;
+ u16 ack_size;
+};
+
+struct api_cmd_wr {
+ u32 pf_id;
+ u8 dest;
+ u8 *cmd;
+ u16 size;
+};
+
+#define PF_DEV_INFO_NUM 32
+
+struct pf_dev_info {
+ u64 bar0_size;
+ u8 bus;
+ u8 slot;
+ u8 func;
+ u64 phy_addr;
+};
+
+/* Indicates the maximum number of interrupts that can be recorded.
+ * Subsequent interrupts are not recorded in FFM.
+ */
+#define FFM_RECORD_NUM_MAX 64
+
+struct ffm_intr_info {
+ u8 node_id;
+ /* error level of the interrupt source */
+ u8 err_level;
+ /* Classification by interrupt source properties */
+ u16 err_type;
+ u32 err_csr_addr;
+ u32 err_csr_value;
+};
+
+struct ffm_intr_tm_info {
+ struct ffm_intr_info intr_info;
+ u8 times;
+ u8 sec;
+ u8 min;
+ u8 hour;
+ u8 mday;
+ u8 mon;
+ u16 year;
+};
+
+struct ffm_record_info {
+ u32 ffm_num;
+ u32 last_err_csr_addr;
+ u32 last_err_csr_value;
+ struct ffm_intr_tm_info ffm[FFM_RECORD_NUM_MAX];
+};
+
+struct dbgtool_k_glb_info {
+ struct semaphore dbgtool_sem;
+ struct ffm_record_info *ffm;
+};
+
+struct msg_2_up {
+ u8 pf_id;
+ u8 mod;
+ u8 cmd;
+ void *buf_in;
+ u16 in_size;
+ void *buf_out;
+ u16 *out_size;
+};
+
+struct dbgtool_param {
+ union {
+ struct api_cmd_rd api_rd;
+ struct api_cmd_wr api_wr;
+ struct pf_dev_info *dev_info;
+ struct ffm_record_info *ffm_rd;
+ struct msg_2_up msg2up;
+ } param;
+ char chip_name[16];
+};
+
+/* dbgtool command type */
+/* You can add commands as required. The dbgtool command can be
+ * used to invoke all interfaces of the kernel-mode x86 driver.
+ */
+typedef enum {
+ DBGTOOL_CMD_API_RD = 0,
+ DBGTOOL_CMD_API_WR,
+ DBGTOOL_CMD_FFM_RD,
+ DBGTOOL_CMD_FFM_CLR,
+ DBGTOOL_CMD_PF_DEV_INFO_GET,
+ DBGTOOL_CMD_MSG_2_UP,
+ DBGTOOL_CMD_FREE_MEM,
+ DBGTOOL_CMD_NUM
+} dbgtool_cmd;
+
+#define PF_MAX_SIZE (16)
+#define BUSINFO_LEN (32)
+
+enum module_name {
+ SEND_TO_NPU = 1,
+ SEND_TO_MPU,
+ SEND_TO_SM,
+
+ SEND_TO_HW_DRIVER,
+#define SEND_TO_SRV_DRV_BASE (SEND_TO_HW_DRIVER + 1)
+ SEND_TO_NIC_DRIVER = SEND_TO_SRV_DRV_BASE,
+ SEND_TO_OVS_DRIVER,
+ SEND_TO_ROCE_DRIVER,
+ SEND_TO_TOE_DRIVER,
+ SEND_TO_IOE_DRIVER,
+ SEND_TO_FC_DRIVER,
+ SEND_TO_VBS_DRIVER,
+ SEND_TO_IPSEC_DRIVER,
+ SEND_TO_VIRTIO_DRIVER,
+ SEND_TO_MIGRATE_DRIVER,
+ SEND_TO_PPA_DRIVER,
+ SEND_TO_CUSTOM_DRIVER = SEND_TO_SRV_DRV_BASE + 11,
+ SEND_TO_DRIVER_MAX = SEND_TO_SRV_DRV_BASE + 15, /* reserved */
+};
+
+enum driver_cmd_type {
+ TX_INFO = 1,
+ Q_NUM,
+ TX_WQE_INFO,
+ TX_MAPPING,
+ RX_INFO,
+ RX_WQE_INFO,
+ RX_CQE_INFO,
+ UPRINT_FUNC_EN,
+ UPRINT_FUNC_RESET,
+ UPRINT_SET_PATH,
+ UPRINT_GET_STATISTICS,
+ FUNC_TYPE,
+ GET_FUNC_IDX,
+ GET_INTER_NUM,
+ CLOSE_TX_STREAM,
+ GET_DRV_VERSION,
+ CLEAR_FUNC_STASTIC,
+ GET_HW_STATS,
+ CLEAR_HW_STATS,
+ GET_SELF_TEST_RES,
+ GET_CHIP_FAULT_STATS,
+ NIC_RSVD1,
+ NIC_RSVD2,
+ NIC_RSVD3,
+ GET_CHIP_ID,
+ GET_SINGLE_CARD_INFO,
+ GET_FIRMWARE_ACTIVE_STATUS,
+ ROCE_DFX_FUNC,
+ GET_DEVICE_ID,
+ GET_PF_DEV_INFO,
+ CMD_FREE_MEM,
+ GET_LOOPBACK_MODE = 32,
+ SET_LOOPBACK_MODE,
+ SET_LINK_MODE,
+ SET_PF_BW_LIMIT,
+ GET_PF_BW_LIMIT,
+ ROCE_CMD,
+ GET_POLL_WEIGHT,
+ SET_POLL_WEIGHT,
+ GET_HOMOLOGUE,
+ SET_HOMOLOGUE,
+ GET_SSET_COUNT,
+ GET_SSET_ITEMS,
+ IS_DRV_IN_VM,
+ LRO_ADPT_MGMT,
+ SET_INTER_COAL_PARAM,
+ GET_INTER_COAL_PARAM,
+ GET_CHIP_INFO,
+ GET_NIC_STATS_LEN,
+ GET_NIC_STATS_STRING,
+ GET_NIC_STATS_INFO,
+ GET_PF_ID,
+ NIC_RSVD4,
+ NIC_RSVD5,
+ DCB_QOS_INFO,
+ DCB_PFC_STATE,
+ DCB_ETS_STATE,
+ DCB_STATE,
+ QOS_DEV,
+ GET_QOS_COS,
+ GET_ULD_DEV_NAME,
+ GET_TX_TIMEOUT,
+ SET_TX_TIMEOUT,
+
+ RSS_CFG = 0x40,
+ RSS_INDIR,
+ PORT_ID,
+
+ GET_FUNC_CAP = 0x50,
+ GET_XSFP_PRESENT = 0x51,
+ GET_XSFP_INFO = 0x52,
+ DEV_NAME_TEST = 0x53,
+
+ GET_WIN_STAT = 0x60,
+ WIN_CSR_READ = 0x61,
+ WIN_CSR_WRITE = 0x62,
+ WIN_API_CMD_RD = 0x63,
+
+ VM_COMPAT_TEST = 0xFF
+};
+
+enum api_chain_cmd_type {
+ API_CSR_READ,
+ API_CSR_WRITE
+};
+
+enum sm_cmd_type {
+ SM_CTR_RD16 = 1,
+ SM_CTR_RD32,
+ SM_CTR_RD64_PAIR,
+ SM_CTR_RD64,
+ SM_CTR_RD32_CLEAR,
+ SM_CTR_RD64_PAIR_CLEAR,
+ SM_CTR_RD64_CLEAR
+};
+
+struct cqm_stats {
+ atomic_t cqm_cmd_alloc_cnt;
+ atomic_t cqm_cmd_free_cnt;
+ atomic_t cqm_send_cmd_box_cnt;
+ atomic_t cqm_send_cmd_imm_cnt;
+ atomic_t cqm_db_addr_alloc_cnt;
+ atomic_t cqm_db_addr_free_cnt;
+ atomic_t cqm_fc_srq_create_cnt;
+ atomic_t cqm_srq_create_cnt;
+ atomic_t cqm_rq_create_cnt;
+ atomic_t cqm_qpc_mpt_create_cnt;
+ atomic_t cqm_nonrdma_queue_create_cnt;
+ atomic_t cqm_rdma_queue_create_cnt;
+ atomic_t cqm_rdma_table_create_cnt;
+ atomic_t cqm_qpc_mpt_delete_cnt;
+ atomic_t cqm_nonrdma_queue_delete_cnt;
+ atomic_t cqm_rdma_queue_delete_cnt;
+ atomic_t cqm_rdma_table_delete_cnt;
+ atomic_t cqm_func_timer_clear_cnt;
+ atomic_t cqm_func_hash_buf_clear_cnt;
+ atomic_t cqm_scq_callback_cnt;
+ atomic_t cqm_ecq_callback_cnt;
+ atomic_t cqm_nocq_callback_cnt;
+ atomic_t cqm_aeq_callback_cnt[112];
+};
+
+struct link_event_stats {
+ atomic_t link_down_stats;
+ atomic_t link_up_stats;
+};
+
+enum hinic3_fault_err_level {
+ FAULT_LEVEL_FATAL,
+ FAULT_LEVEL_SERIOUS_RESET,
+ FAULT_LEVEL_HOST,
+ FAULT_LEVEL_SERIOUS_FLR,
+ FAULT_LEVEL_GENERAL,
+ FAULT_LEVEL_SUGGESTION,
+ FAULT_LEVEL_MAX,
+};
+
+enum hinic3_fault_type {
+ FAULT_TYPE_CHIP,
+ FAULT_TYPE_UCODE,
+ FAULT_TYPE_MEM_RD_TIMEOUT,
+ FAULT_TYPE_MEM_WR_TIMEOUT,
+ FAULT_TYPE_REG_RD_TIMEOUT,
+ FAULT_TYPE_REG_WR_TIMEOUT,
+ FAULT_TYPE_PHY_FAULT,
+ FAULT_TYPE_TSENSOR_FAULT,
+ FAULT_TYPE_MAX,
+};
+
+struct fault_event_stats {
+ /* TODO: HINIC_NODE_ID_MAX: temporarily use the 1822 value (22) */
+ atomic_t chip_fault_stats[22][FAULT_LEVEL_MAX];
+ atomic_t fault_type_stat[FAULT_TYPE_MAX];
+ atomic_t pcie_fault_stats;
+};
+
+enum hinic3_ucode_event_type {
+ HINIC3_INTERNAL_OTHER_FATAL_ERROR = 0x0,
+ HINIC3_CHANNEL_BUSY = 0x7,
+ HINIC3_NIC_FATAL_ERROR_MAX = 0x8,
+};
+
+struct hinic3_hw_stats {
+ atomic_t heart_lost_stats;
+ struct cqm_stats cqm_stats;
+ struct link_event_stats link_event_stats;
+ struct fault_event_stats fault_event_stats;
+ atomic_t nic_ucode_event_stats[HINIC3_NIC_FATAL_ERROR_MAX];
+};
+
+#ifndef IFNAMSIZ
+#define IFNAMSIZ 16
+#endif
+
+struct pf_info {
+ char name[IFNAMSIZ];
+ char bus_info[BUSINFO_LEN];
+ u32 pf_type;
+};
+
+struct card_info {
+ struct pf_info pf[PF_MAX_SIZE];
+ u32 pf_num;
+};
+
+struct hinic3_nic_loop_mode {
+ u32 loop_mode;
+ u32 loop_ctrl;
+};
+
+struct hinic3_pf_info {
+ u32 isvalid;
+ u32 pf_id;
+};
+
+enum hinic3_show_set {
+ HINIC3_SHOW_SSET_IO_STATS = 1,
+};
+
+#define HINIC3_SHOW_ITEM_LEN 32
+struct hinic3_show_item {
+ char name[HINIC3_SHOW_ITEM_LEN];
+ u8 hexadecimal; /* 0: decimal, 1: hexadecimal */
+ u8 rsvd[7];
+ u64 value;
+};
+
+#define HINIC3_CHIP_FAULT_SIZE (110 * 1024)
+#define MAX_DRV_BUF_SIZE 4096
+
+struct nic_cmd_chip_fault_stats {
+ u32 offset;
+ u8 chip_fault_stats[MAX_DRV_BUF_SIZE];
+};
+
+#define NIC_TOOL_MAGIC 'x'
+
+#define CARD_MAX_SIZE (64)
+
+struct nic_card_id {
+ u32 id[CARD_MAX_SIZE];
+ u32 num;
+};
+
+struct func_pdev_info {
+ u64 bar0_phy_addr;
+ u64 bar0_size;
+ u64 bar1_phy_addr;
+ u64 bar1_size;
+ u64 bar3_phy_addr;
+ u64 bar3_size;
+ u64 rsvd1[4];
+};
+
+struct hinic3_card_func_info {
+ u32 num_pf;
+ u32 rsvd0;
+ u64 usr_api_phy_addr;
+ struct func_pdev_info pdev_info[CARD_MAX_SIZE];
+};
+
+struct wqe_info {
+ int q_id;
+ void *slq_handle;
+ unsigned int wqe_id;
+};
+
+#define MAX_VER_INFO_LEN 128
+struct drv_version_info {
+ char ver[MAX_VER_INFO_LEN];
+};
+
+struct hinic3_tx_hw_page {
+ u64 phy_addr;
+ u64 *map_addr;
+};
+
+struct nic_sq_info {
+ u16 q_id;
+ u16 pi;
+ u16 ci; /* sw_ci */
+ u16 fi; /* hw_ci */
+ u32 q_depth;
+ u16 pi_reverse; /* TODO: what is this? */
+ u16 wqebb_size;
+ u8 priority;
+ u16 *ci_addr;
+ u64 cla_addr;
+ void *slq_handle;
+ /* TODO: NIC don't use direct wqe */
+ struct hinic3_tx_hw_page direct_wqe;
+ struct hinic3_tx_hw_page doorbell;
+ u32 page_idx;
+ u32 glb_sq_id;
+};
+
+struct nic_rq_info {
+ u16 q_id;
+ u16 delta;
+ u16 hw_pi;
+ u16 ci; /* sw_ci */
+ u16 sw_pi;
+ u16 wqebb_size;
+ u16 q_depth;
+ u16 buf_len;
+
+ void *slq_handle;
+ u64 ci_wqe_page_addr;
+ u64 ci_cla_tbl_addr;
+
+ u8 coalesc_timer_cfg;
+ u8 pending_limt;
+ u16 msix_idx;
+ u32 msix_vector;
+};
+
+#define MT_EPERM 1 /* Operation not permitted */
+#define MT_EIO 2 /* I/O error */
+#define MT_EINVAL 3 /* Invalid argument */
+#define MT_EBUSY 4 /* Device or resource busy */
+#define MT_EOPNOTSUPP 0xFF /* Operation not supported */
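+/* Tool-facing status codes returned in mt_msg_head.status, modeled on the
+ * matching errno names (not kernel errno values).
+ */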
+
+struct mt_msg_head {
+ u8 status;
+ u8 rsvd1[3];
+};
+
+#define MT_DCB_OPCODE_WR BIT(0) /* 1 - write, 0 - read */
+struct hinic3_mt_qos_info { /* delete */
+ struct mt_msg_head head;
+
+ u16 op_code;
+ u8 valid_cos_bitmap;
+ u8 valid_up_bitmap;
+ u32 rsvd1;
+};
+
+struct hinic3_mt_dcb_state {
+ struct mt_msg_head head;
+
+ u16 op_code; /* 0 - get dcb state, 1 - set dcb state */
+ u8 state; /* 0 - disable, 1 - enable dcb */
+ u8 rsvd;
+};
+
+#define MT_DCB_ETS_UP_TC BIT(1)
+#define MT_DCB_ETS_UP_BW BIT(2)
+#define MT_DCB_ETS_UP_PRIO BIT(3)
+#define MT_DCB_ETS_TC_BW BIT(4)
+#define MT_DCB_ETS_TC_PRIO BIT(5)
+
+#define DCB_UP_TC_NUM 0x8
+struct hinic3_mt_ets_state { /* delete */
+ struct mt_msg_head head;
+
+ u16 op_code;
+ u8 up_tc[DCB_UP_TC_NUM];
+ u8 up_bw[DCB_UP_TC_NUM];
+ u8 tc_bw[DCB_UP_TC_NUM];
+ u8 up_prio_bitmap;
+ u8 tc_prio_bitmap;
+ u32 rsvd;
+};
+
+#define MT_DCB_PFC_PFC_STATE BIT(1)
+#define MT_DCB_PFC_PFC_PRI_EN BIT(2)
+
+struct hinic3_mt_pfc_state { /* delete */
+ struct mt_msg_head head;
+
+ u16 op_code;
+ u8 state;
+ u8 pfc_en_bitpamp;
+ u32 rsvd;
+};
+
+#define CMD_QOS_DEV_TRUST BIT(0)
+#define CMD_QOS_DEV_DFT_COS BIT(1)
+#define CMD_QOS_DEV_PCP2COS BIT(2)
+#define CMD_QOS_DEV_DSCP2COS BIT(3)
+
+struct hinic3_mt_qos_dev_cfg {
+ struct mt_msg_head head;
+
+ u8 op_code; /* 0: get, 1: set */
+ u8 rsvd0;
+ /* bit0 - trust, bit1 - dft_cos, bit2 - pcp2cos, bit3 - dscp2cos */
+ u16 cfg_bitmap;
+
+ u8 trust; /* 0 - pcp, 1 - dscp */
+ u8 dft_cos;
+ u16 rsvd1;
+ u8 pcp2cos[8]; /* all 8 entries must be configured together */
+ /* When configuring dscp2cos, a cos value of 0xFF makes the driver ignore
+ * that dscp priority, so multiple dscp-to-cos mappings can be configured
+ * in a single request.
+ */
+ u8 dscp2cos[64];
+ u32 rsvd2[4];
+};
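+/*
+ * Illustrative example: to remap only dscp 46 to cos 5, set op_code to 1,
+ * cfg_bitmap to CMD_QOS_DEV_DSCP2COS, dscp2cos[46] = 5 and every other
+ * dscp2cos entry to 0xFF so the driver skips those priorities.
+ */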
+
+enum mt_api_type {
+ API_TYPE_MBOX = 1,
+ API_TYPE_API_CHAIN_BYPASS,
+ API_TYPE_API_CHAIN_TO_MPU,
+ API_TYPE_CLP,
+};
+
+struct npu_cmd_st {
+ u32 mod : 8;
+ u32 cmd : 8;
+ u32 ack_type : 3;
+ u32 direct_resp : 1;
+ u32 len : 12;
+};
+
+struct mpu_cmd_st {
+ u32 api_type : 8;
+ u32 mod : 8;
+ u32 cmd : 16;
+};
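+/*
+ * npu_cmd_st and mpu_cmd_st are alternative layouts of the same 32-bit
+ * command word in the union inside struct msg_module below; which one
+ * applies depends on whether the message targets the NPU or the MPU
+ * (the third variant, msg_formate, is consumed by the driver itself).
+ */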
+
+struct msg_module {
+ char device_name[IFNAMSIZ];
+ u32 module;
+ union {
+ u32 msg_formate; /* for driver */
+ struct npu_cmd_st npu_cmd;
+ struct mpu_cmd_st mpu_cmd;
+ };
+ u32 timeout; /* for mpu/npu cmd */
+ u32 func_idx;
+ u32 buf_in_size;
+ u32 buf_out_size;
+ void *in_buf;
+ void *out_buf;
+ int bus_num;
+ u8 port_id;
+ u8 rsvd1[3];
+ u32 rsvd2[4];
+};
+
+struct hinic3_mt_qos_cos_cfg {
+ struct mt_msg_head head;
+
+ u8 port_id;
+ u8 func_cos_bitmap;
+ u8 port_cos_bitmap;
+ u8 func_max_cos_num;
+ u32 rsvd2[4];
+};
+
+#define MAX_NETDEV_NUM 4
+
+enum hinic3_bond_cmd_to_custom_e {
+ CMD_CUSTOM_BOND_DEV_CREATE = 1,
+ CMD_CUSTOM_BOND_DEV_DELETE,
+ CMD_CUSTOM_BOND_GET_CHIP_NAME,
+ CMD_CUSTOM_BOND_GET_CARD_INFO
+};
+
+typedef enum {
+ HASH_POLICY_L2 = 0, /* SMAC_DMAC */
+ HASH_POLICY_L23 = 1, /* SIP_DIP_SPORT_DPORT */
+ HASH_POLICY_L34 = 2, /* SMAC_DMAC_SIP_DIP */
+ HASH_POLICY_MAX = 3 /* MAX */
+} xmit_hash_policy_e;
+
+/* bond mode */
+typedef enum tag_bond_mode {
+ BOND_MODE_NONE = 0, /**< bond disable */
+ BOND_MODE_BACKUP = 1, /**< 1 for active-backup */
+ BOND_MODE_BALANCE = 2, /**< 2 for balance-xor */
+ BOND_MODE_LACP = 4, /**< 4 for 802.3ad */
+ BOND_MODE_MAX
+} bond_mode_e;
+
+struct add_bond_dev_s {
+ struct mt_msg_head head;
+ /* input can be empty, indicates that the value
+ * is assigned by the driver
+ */
+ char bond_name[IFNAMSIZ];
+ u8 slave_cnt;
+ u8 rsvd[3];
+ char slave_name[MAX_NETDEV_NUM][IFNAMSIZ];
+ u32 poll_timeout; /* unit: ms, default value = 100 */
+ u32 up_delay; /* default value = 0 */
+ u32 down_delay; /* default value = 0 */
+ u32 bond_mode; /* default value = BOND_MODE_LACP */
+
+ /* maximum number of active bond member interfaces,
+ * default value = 0
+ */
+ u32 active_port_max_num;
+ /* minimum number of active bond member interfaces,
+ * default value = 0
+ */
+ u32 active_port_min_num;
+ /* hash policy, which is used for microcode routing logic,
+ * default value = 0
+ */
+ xmit_hash_policy_e xmit_hash_policy;
+};
+
+struct del_bond_dev_s {
+ struct mt_msg_head head;
+ char bond_name[IFNAMSIZ];
+};
+
+struct get_bond_chip_name_s {
+ char bond_name[IFNAMSIZ];
+ char chip_name[IFNAMSIZ];
+};
+
+struct bond_drv_msg_s {
+ u32 bond_id;
+ u32 slave_cnt;
+ u32 master_slave_index;
+ char bond_name[IFNAMSIZ];
+ char slave_name[MAX_NETDEV_NUM][IFNAMSIZ];
+};
+
+#define MAX_BONDING_CNT_PER_CARD (2)
+
+struct bond_negotiate_status {
+ u8 status;
+ u8 version;
+ u8 rsvd0[6];
+ u32 bond_id;
+ u32 bond_mmi_status; /* link status of this bond sub-device */
+ u32 active_bitmap; /* slave port status of this bond sub-device */
+
+ u8 rsvd[16];
+};
+
+struct bond_all_msg_s {
+ struct bond_drv_msg_s drv_msg;
+ struct bond_negotiate_status active_info;
+};
+
+struct get_card_bond_msg_s {
+ u32 bond_cnt;
+ struct bond_all_msg_s all_msg[MAX_BONDING_CNT_PER_CARD];
+};
+
+int alloc_buff_in(void *hwdev, struct msg_module *nt_msg, u32 in_size, void **buf_in);
+
+int alloc_buff_out(void *hwdev, struct msg_module *nt_msg, u32 out_size, void **buf_out);
+
+void free_buff_in(void *hwdev, const struct msg_module *nt_msg, void *buf_in);
+
+void free_buff_out(void *hwdev, struct msg_module *nt_msg, void *buf_out);
+
+int copy_buf_out_to_user(struct msg_module *nt_msg, u32 out_size, void *buf_out);
+
+int send_to_mpu(void *hwdev, struct msg_module *nt_msg, void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size);
+int send_to_npu(void *hwdev, struct msg_module *nt_msg, void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size);
+int send_to_sm(void *hwdev, struct msg_module *nt_msg, void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size);
+
+#endif /* HINIC3_MT_H */
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_netdev_ops.c b/drivers/net/ethernet/huawei/hinic3/hinic3_netdev_ops.c
new file mode 100644
index 000000000000..67553270f710
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_netdev_ops.c
@@ -0,0 +1,1975 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC] " fmt
+#include <net/dsfield.h>
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/device.h>
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/etherdevice.h>
+#include <linux/netdevice.h>
+#include <linux/netlink.h>
+#include <linux/debugfs.h>
+#include <linux/ip.h>
+
+#include "ossl_knl.h"
+#ifdef HAVE_XDP_SUPPORT
+#include <linux/bpf.h>
+#endif
+#include "hinic3_hw.h"
+#include "hinic3_crm.h"
+#include "hinic3_nic_io.h"
+#include "hinic3_nic_dev.h"
+#include "hinic3_srv_nic.h"
+#include "hinic3_tx.h"
+#include "hinic3_rx.h"
+#include "hinic3_dcb.h"
+#include "hinic3_nic_prof.h"
+
+#define HINIC3_DEFAULT_RX_CSUM_OFFLOAD 0xFFF
+
+#define HINIC3_LRO_DEFAULT_COAL_PKT_SIZE 32
+#define HINIC3_LRO_DEFAULT_TIME_LIMIT 16
+#define HINIC3_WAIT_FLUSH_QP_RESOURCE_TIMEOUT 100
+
+static void hinic3_nic_set_rx_mode(struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+ if (netdev_uc_count(netdev) != nic_dev->netdev_uc_cnt ||
+ netdev_mc_count(netdev) != nic_dev->netdev_mc_cnt) {
+ set_bit(HINIC3_UPDATE_MAC_FILTER, &nic_dev->flags);
+ nic_dev->netdev_uc_cnt = netdev_uc_count(netdev);
+ nic_dev->netdev_mc_cnt = netdev_mc_count(netdev);
+ }
+
+ queue_work(nic_dev->workq, &nic_dev->rx_mode_work);
+}
+
+static int hinic3_alloc_txrxq_resources(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_dyna_txrxq_params *q_params)
+{
+ u32 size;
+ int err;
+
+ size = sizeof(*q_params->txqs_res) * q_params->num_qps;
+ q_params->txqs_res = kzalloc(size, GFP_KERNEL);
+ if (!q_params->txqs_res) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to alloc txqs resources array\n");
+ return -ENOMEM;
+ }
+
+ size = sizeof(*q_params->rxqs_res) * q_params->num_qps;
+ q_params->rxqs_res = kzalloc(size, GFP_KERNEL);
+ if (!q_params->rxqs_res) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to alloc rxqs resource array\n");
+ err = -ENOMEM;
+ goto alloc_rxqs_res_arr_err;
+ }
+
+ size = sizeof(*q_params->irq_cfg) * q_params->num_qps;
+ q_params->irq_cfg = kzalloc(size, GFP_KERNEL);
+ if (!q_params->irq_cfg) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to alloc irq resource array\n");
+ err = -ENOMEM;
+ goto alloc_irq_cfg_err;
+ }
+
+ err = hinic3_alloc_txqs_res(nic_dev, q_params->num_qps,
+ q_params->sq_depth, q_params->txqs_res);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to alloc txqs resource\n");
+ goto alloc_txqs_res_err;
+ }
+
+ err = hinic3_alloc_rxqs_res(nic_dev, q_params->num_qps,
+ q_params->rq_depth, q_params->rxqs_res);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to alloc rxqs resource\n");
+ goto alloc_rxqs_res_err;
+ }
+
+ return 0;
+
+alloc_rxqs_res_err:
+ hinic3_free_txqs_res(nic_dev, q_params->num_qps, q_params->sq_depth,
+ q_params->txqs_res);
+
+alloc_txqs_res_err:
+ kfree(q_params->irq_cfg);
+ q_params->irq_cfg = NULL;
+
+alloc_irq_cfg_err:
+ kfree(q_params->rxqs_res);
+ q_params->rxqs_res = NULL;
+
+alloc_rxqs_res_arr_err:
+ kfree(q_params->txqs_res);
+ q_params->txqs_res = NULL;
+
+ return err;
+}
+
+static void hinic3_free_txrxq_resources(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_dyna_txrxq_params *q_params)
+{
+ hinic3_free_rxqs_res(nic_dev, q_params->num_qps, q_params->rq_depth,
+ q_params->rxqs_res);
+ hinic3_free_txqs_res(nic_dev, q_params->num_qps, q_params->sq_depth,
+ q_params->txqs_res);
+
+ kfree(q_params->irq_cfg);
+ q_params->irq_cfg = NULL;
+
+ kfree(q_params->rxqs_res);
+ q_params->rxqs_res = NULL;
+
+ kfree(q_params->txqs_res);
+ q_params->txqs_res = NULL;
+}
+
+static int hinic3_configure_txrxqs(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_dyna_txrxq_params *q_params)
+{
+ int err;
+
+ err = hinic3_configure_txqs(nic_dev, q_params->num_qps,
+ q_params->sq_depth, q_params->txqs_res);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to configure txqs\n");
+ return err;
+ }
+
+ err = hinic3_configure_rxqs(nic_dev, q_params->num_qps,
+ q_params->rq_depth, q_params->rxqs_res);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to configure rxqs\n");
+ return err;
+ }
+
+ return 0;
+}
+
+static void config_dcb_qps_map(struct hinic3_nic_dev *nic_dev)
+{
+ struct net_device *netdev = nic_dev->netdev;
+ u8 num_cos;
+
+ if (!test_bit(HINIC3_DCB_ENABLE, &nic_dev->flags)) {
+ hinic3_update_tx_db_cos(nic_dev, 0);
+ return;
+ }
+
+ num_cos = hinic3_get_dev_user_cos_num(nic_dev);
+ hinic3_update_qp_cos_cfg(nic_dev, num_cos);
+ /* For now, we don't support to change num_cos */
+ if (num_cos > nic_dev->cos_config_num_max ||
+ nic_dev->q_params.num_qps < num_cos) {
+ nicif_err(nic_dev, drv, netdev, "Invalid num_cos: %u or num_qps: %u, disable DCB\n",
+ num_cos, nic_dev->q_params.num_qps);
+ nic_dev->q_params.num_cos = 0;
+ clear_bit(HINIC3_DCB_ENABLE, &nic_dev->flags);
+ /* if we can't enable rss or get enough num_qps,
+ * need to sync default configure to hw
+ */
+ hinic3_configure_dcb(netdev);
+ }
+
+ hinic3_update_tx_db_cos(nic_dev, 1);
+}
+
+static int hinic3_configure(struct hinic3_nic_dev *nic_dev)
+{
+ struct net_device *netdev = nic_dev->netdev;
+ int err;
+
+ err = hinic3_set_port_mtu(nic_dev->hwdev, (u16)netdev->mtu);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev, "Failed to set mtu\n");
+ return err;
+ }
+
+ config_dcb_qps_map(nic_dev);
+
+ /* rx rss init */
+ err = hinic3_rx_configure(netdev, test_bit(HINIC3_DCB_ENABLE, &nic_dev->flags) ? 1 : 0);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev, "Failed to configure rx\n");
+ return err;
+ }
+
+ return 0;
+}
+
+static void hinic3_remove_configure(struct hinic3_nic_dev *nic_dev)
+{
+ hinic3_rx_remove_configure(nic_dev->netdev);
+}
+
+/* Try to change the number of IRQs to the target number and
+ * return the actual number of IRQs.
+ */
+static u16 hinic3_qp_irq_change(struct hinic3_nic_dev *nic_dev,
+ u16 dst_num_qp_irq)
+{
+ struct irq_info *qps_irq_info = nic_dev->qps_irq_info;
+ u16 resp_irq_num, irq_num_gap, i;
+ u16 idx;
+ int err;
+
+ if (dst_num_qp_irq > nic_dev->num_qp_irq) {
+ irq_num_gap = dst_num_qp_irq - nic_dev->num_qp_irq;
+ err = hinic3_alloc_irqs(nic_dev->hwdev, SERVICE_T_NIC,
+ irq_num_gap,
+ &qps_irq_info[nic_dev->num_qp_irq],
+ &resp_irq_num);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Failed to alloc irqs\n");
+ return nic_dev->num_qp_irq;
+ }
+
+ nic_dev->num_qp_irq += resp_irq_num;
+ } else if (dst_num_qp_irq < nic_dev->num_qp_irq) {
+ irq_num_gap = nic_dev->num_qp_irq - dst_num_qp_irq;
+ for (i = 0; i < irq_num_gap; i++) {
+ idx = (nic_dev->num_qp_irq - i) - 1;
+ hinic3_free_irq(nic_dev->hwdev, SERVICE_T_NIC,
+ qps_irq_info[idx].irq_id);
+ qps_irq_info[idx].irq_id = 0;
+ qps_irq_info[idx].msix_entry_idx = 0;
+ }
+ nic_dev->num_qp_irq = dst_num_qp_irq;
+ }
+
+ return nic_dev->num_qp_irq;
+}
+
+static void config_dcb_num_qps(struct hinic3_nic_dev *nic_dev,
+ const struct hinic3_dyna_txrxq_params *q_params,
+ u16 max_qps)
+{
+ u8 num_cos = q_params->num_cos;
+ u8 user_cos_num = hinic3_get_dev_user_cos_num(nic_dev);
+
+ if (!num_cos || num_cos > nic_dev->cos_config_num_max || num_cos > max_qps)
+ return; /* will disable DCB in config_dcb_qps_map() */
+
+ hinic3_update_qp_cos_cfg(nic_dev, user_cos_num);
+}
+
+static void hinic3_config_num_qps(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_dyna_txrxq_params *q_params)
+{
+ u16 alloc_num_irq, cur_num_irq;
+ u16 dst_num_irq;
+
+ if (!test_bit(HINIC3_RSS_ENABLE, &nic_dev->flags))
+ q_params->num_qps = 1;
+
+ config_dcb_num_qps(nic_dev, q_params, q_params->num_qps);
+
+ if (nic_dev->num_qp_irq >= q_params->num_qps)
+ goto out;
+
+ cur_num_irq = nic_dev->num_qp_irq;
+
+ alloc_num_irq = hinic3_qp_irq_change(nic_dev, q_params->num_qps);
+ if (alloc_num_irq < q_params->num_qps) {
+ q_params->num_qps = alloc_num_irq;
+ config_dcb_num_qps(nic_dev, q_params, q_params->num_qps);
+ nicif_warn(nic_dev, drv, nic_dev->netdev,
+ "Can not get enough irqs, adjust num_qps to %u\n",
+ q_params->num_qps);
+
+ /* The current irq may be in use, we must keep it */
+ dst_num_irq = (u16)max_t(u16, cur_num_irq, q_params->num_qps);
+ hinic3_qp_irq_change(nic_dev, dst_num_irq);
+ }
+
+out:
+ nicif_info(nic_dev, drv, nic_dev->netdev, "Finally num_qps: %u\n",
+ q_params->num_qps);
+}
+
+/* determine num_qps from rss_tmpl_id/irq_num/dcb_en */
+static int hinic3_setup_num_qps(struct hinic3_nic_dev *nic_dev)
+{
+ struct net_device *netdev = nic_dev->netdev;
+ u32 irq_size;
+
+ nic_dev->num_qp_irq = 0;
+
+ irq_size = sizeof(*nic_dev->qps_irq_info) * nic_dev->max_qps;
+ if (!irq_size) {
+ nicif_err(nic_dev, drv, netdev, "Cannot allocate zero size entries\n");
+ return -EINVAL;
+ }
+ nic_dev->qps_irq_info = kzalloc(irq_size, GFP_KERNEL);
+ if (!nic_dev->qps_irq_info) {
+ nicif_err(nic_dev, drv, netdev, "Failed to alloc qps_irq_info\n");
+ return -ENOMEM;
+ }
+
+ hinic3_config_num_qps(nic_dev, &nic_dev->q_params);
+
+ return 0;
+}
+
+static void hinic3_destroy_num_qps(struct hinic3_nic_dev *nic_dev)
+{
+ u16 i;
+
+ for (i = 0; i < nic_dev->num_qp_irq; i++)
+ hinic3_free_irq(nic_dev->hwdev, SERVICE_T_NIC,
+ nic_dev->qps_irq_info[i].irq_id);
+
+ kfree(nic_dev->qps_irq_info);
+}
+
+int hinic3_force_port_disable(struct hinic3_nic_dev *nic_dev)
+{
+ int err;
+
+ down(&nic_dev->port_state_sem);
+
+ err = hinic3_set_port_enable(nic_dev->hwdev, false, HINIC3_CHANNEL_NIC);
+ if (!err)
+ nic_dev->force_port_disable = true;
+
+ up(&nic_dev->port_state_sem);
+
+ return err;
+}
+
+int hinic3_force_set_port_state(struct hinic3_nic_dev *nic_dev, bool enable)
+{
+ int err = 0;
+
+ down(&nic_dev->port_state_sem);
+
+ nic_dev->force_port_disable = false;
+ err = hinic3_set_port_enable(nic_dev->hwdev, enable,
+ HINIC3_CHANNEL_NIC);
+
+ up(&nic_dev->port_state_sem);
+
+ return err;
+}
+
+int hinic3_maybe_set_port_state(struct hinic3_nic_dev *nic_dev, bool enable)
+{
+ int err;
+
+ down(&nic_dev->port_state_sem);
+
+ /* Do nothing when force-disabled: the port is brought down by
+ * hinic3_force_port_disable() and must not be re-enabled while
+ * in force mode.
+ */
+ if (nic_dev->force_port_disable) {
+ up(&nic_dev->port_state_sem);
+ return 0;
+ }
+
+ err = hinic3_set_port_enable(nic_dev->hwdev, enable,
+ HINIC3_CHANNEL_NIC);
+
+ up(&nic_dev->port_state_sem);
+
+ return err;
+}
+
+static void hinic3_print_link_message(struct hinic3_nic_dev *nic_dev,
+ u8 link_status)
+{
+ if (nic_dev->link_status == link_status)
+ return;
+
+ nic_dev->link_status = link_status;
+
+ nicif_info(nic_dev, link, nic_dev->netdev, "Link is %s\n",
+ (link_status ? "up" : "down"));
+}
+
+static int hinic3_alloc_channel_resources(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_dyna_qp_params *qp_params,
+ struct hinic3_dyna_txrxq_params *trxq_params)
+{
+ int err;
+
+ qp_params->num_qps = trxq_params->num_qps;
+ qp_params->sq_depth = trxq_params->sq_depth;
+ qp_params->rq_depth = trxq_params->rq_depth;
+
+ err = hinic3_alloc_qps(nic_dev->hwdev, nic_dev->qps_irq_info,
+ qp_params);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Failed to alloc qps\n");
+ return err;
+ }
+
+ err = hinic3_alloc_txrxq_resources(nic_dev, trxq_params);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Failed to alloc txrxq resources\n");
+ hinic3_free_qps(nic_dev->hwdev, qp_params);
+ return err;
+ }
+
+ return 0;
+}
+
+static void hinic3_free_channel_resources(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_dyna_qp_params *qp_params,
+ struct hinic3_dyna_txrxq_params *trxq_params)
+{
+ mutex_lock(&nic_dev->nic_mutex);
+ hinic3_free_txrxq_resources(nic_dev, trxq_params);
+ hinic3_free_qps(nic_dev->hwdev, qp_params);
+ mutex_unlock(&nic_dev->nic_mutex);
+}
+
+static int hinic3_open_channel(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_dyna_qp_params *qp_params,
+ struct hinic3_dyna_txrxq_params *trxq_params)
+{
+ int err;
+
+ err = hinic3_init_qps(nic_dev->hwdev, qp_params);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Failed to init qps\n");
+ return err;
+ }
+
+ err = hinic3_configure_txrxqs(nic_dev, trxq_params);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Failed to configure txrxqs\n");
+ goto cfg_txrxqs_err;
+ }
+
+ err = hinic3_qps_irq_init(nic_dev);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Failed to init txrxq irq\n");
+ goto init_qp_irq_err;
+ }
+
+ err = hinic3_configure(nic_dev);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Failed to init txrxq irq\n");
+ goto configure_err;
+ }
+
+ return 0;
+
+configure_err:
+ hinic3_qps_irq_deinit(nic_dev);
+
+init_qp_irq_err:
+cfg_txrxqs_err:
+ hinic3_deinit_qps(nic_dev->hwdev, qp_params);
+
+ return err;
+}
+
+static void hinic3_close_channel(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_dyna_qp_params *qp_params)
+{
+ hinic3_remove_configure(nic_dev);
+ hinic3_qps_irq_deinit(nic_dev);
+ hinic3_deinit_qps(nic_dev->hwdev, qp_params);
+}
+
+int hinic3_vport_up(struct hinic3_nic_dev *nic_dev)
+{
+ struct net_device *netdev = nic_dev->netdev;
+ u8 link_status = 0;
+ u16 glb_func_id;
+ int err;
+
+ glb_func_id = hinic3_global_func_id(nic_dev->hwdev);
+ err = hinic3_set_vport_enable(nic_dev->hwdev, glb_func_id, true,
+ HINIC3_CHANNEL_NIC);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev, "Failed to enable vport\n");
+ goto vport_enable_err;
+ }
+
+ err = hinic3_maybe_set_port_state(nic_dev, true);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev, "Failed to enable port\n");
+ goto port_enable_err;
+ }
+
+ netif_set_real_num_tx_queues(netdev, nic_dev->q_params.num_qps);
+ netif_set_real_num_rx_queues(netdev, nic_dev->q_params.num_qps);
+ netif_tx_wake_all_queues(netdev);
+
+ if (test_bit(HINIC3_FORCE_LINK_UP, &nic_dev->flags)) {
+ link_status = true;
+ netif_carrier_on(netdev);
+ } else {
+ err = hinic3_get_link_state(nic_dev->hwdev, &link_status);
+ if (!err && link_status)
+ netif_carrier_on(netdev);
+ }
+
+ queue_delayed_work(nic_dev->workq, &nic_dev->moderation_task,
+ HINIC3_MODERATONE_DELAY);
+ if (test_bit(HINIC3_RXQ_RECOVERY, &nic_dev->flags))
+ queue_delayed_work(nic_dev->workq, &nic_dev->rxq_check_work, HZ);
+
+ hinic3_print_link_message(nic_dev, link_status);
+
+ if (!HINIC3_FUNC_IS_VF(nic_dev->hwdev))
+ hinic3_notify_all_vfs_link_changed(nic_dev->hwdev, link_status);
+
+ return 0;
+
+port_enable_err:
+ hinic3_set_vport_enable(nic_dev->hwdev, glb_func_id, false,
+ HINIC3_CHANNEL_NIC);
+
+vport_enable_err:
+ hinic3_flush_qps_res(nic_dev->hwdev);
+ /* 100 ms after the vport is disabled, no more packets will be sent to the host */
+ msleep(100);
+
+ return err;
+}
+
+void hinic3_vport_down(struct hinic3_nic_dev *nic_dev)
+{
+ u16 glb_func_id;
+
+ netif_carrier_off(nic_dev->netdev);
+ netif_tx_disable(nic_dev->netdev);
+
+ cancel_delayed_work_sync(&nic_dev->rxq_check_work);
+
+ cancel_delayed_work_sync(&nic_dev->moderation_task);
+
+ if (hinic3_get_chip_present_flag(nic_dev->hwdev)) {
+ if (!HINIC3_FUNC_IS_VF(nic_dev->hwdev))
+ hinic3_notify_all_vfs_link_changed(nic_dev->hwdev, 0);
+
+ hinic3_maybe_set_port_state(nic_dev, false);
+
+ glb_func_id = hinic3_global_func_id(nic_dev->hwdev);
+ hinic3_set_vport_enable(nic_dev->hwdev, glb_func_id, false,
+ HINIC3_CHANNEL_NIC);
+
+ hinic3_flush_txqs(nic_dev->netdev);
+ /* 100 ms after the vport is disabled, no more packets
+ * will be sent to the host (an FPGA platform needs 2000 ms)
+ */
+ msleep(HINIC3_WAIT_FLUSH_QP_RESOURCE_TIMEOUT);
+ hinic3_flush_qps_res(nic_dev->hwdev);
+ }
+}
+
+int hinic3_change_channel_settings(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_dyna_txrxq_params *trxq_params,
+ hinic3_reopen_handler reopen_handler,
+ const void *priv_data)
+{
+ struct hinic3_dyna_qp_params new_qp_params = {0};
+ struct hinic3_dyna_qp_params cur_qp_params = {0};
+ int err;
+
+ hinic3_config_num_qps(nic_dev, trxq_params);
+
+ err = hinic3_alloc_channel_resources(nic_dev, &new_qp_params,
+ trxq_params);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to alloc channel resources\n");
+ return err;
+ }
+
+ if (!test_and_set_bit(HINIC3_CHANGE_RES_INVALID, &nic_dev->flags)) {
+ hinic3_vport_down(nic_dev);
+ hinic3_close_channel(nic_dev, &cur_qp_params);
+ hinic3_free_channel_resources(nic_dev, &cur_qp_params,
+ &nic_dev->q_params);
+ }
+
+ if (nic_dev->num_qp_irq > trxq_params->num_qps)
+ hinic3_qp_irq_change(nic_dev, trxq_params->num_qps);
+ nic_dev->q_params = *trxq_params;
+
+ if (reopen_handler)
+ reopen_handler(nic_dev, priv_data);
+
+ err = hinic3_open_channel(nic_dev, &new_qp_params, trxq_params);
+ if (err)
+ goto open_channel_err;
+
+ err = hinic3_vport_up(nic_dev);
+ if (err)
+ goto vport_up_err;
+
+ clear_bit(HINIC3_CHANGE_RES_INVALID, &nic_dev->flags);
+ nicif_info(nic_dev, drv, nic_dev->netdev, "Change channel settings success\n");
+
+ return 0;
+
+vport_up_err:
+ hinic3_close_channel(nic_dev, &new_qp_params);
+
+open_channel_err:
+ hinic3_free_channel_resources(nic_dev, &new_qp_params, trxq_params);
+
+ return err;
+}
+
+int hinic3_open(struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct hinic3_dyna_qp_params qp_params = {0};
+ int err;
+
+ if (test_bit(HINIC3_INTF_UP, &nic_dev->flags)) {
+ nicif_info(nic_dev, drv, netdev, "Netdev already open, do nothing\n");
+ return 0;
+ }
+
+ err = hinic3_init_nicio_res(nic_dev->hwdev);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev, "Failed to init nicio resources\n");
+ return err;
+ }
+
+ err = hinic3_setup_num_qps(nic_dev);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev, "Failed to setup num_qps\n");
+ goto setup_qps_err;
+ }
+
+ err = hinic3_alloc_channel_resources(nic_dev, &qp_params,
+ &nic_dev->q_params);
+ if (err)
+ goto alloc_channel_res_err;
+
+ err = hinic3_open_channel(nic_dev, &qp_params, &nic_dev->q_params);
+ if (err)
+ goto open_channel_err;
+
+ err = hinic3_vport_up(nic_dev);
+ if (err)
+ goto vport_up_err;
+
+ err = hinic3_set_master_dev_state(nic_dev, true);
+ if (err)
+ goto set_master_dev_err;
+
+ set_bit(HINIC3_INTF_UP, &nic_dev->flags);
+ nicif_info(nic_dev, drv, nic_dev->netdev, "Netdev is up\n");
+
+ return 0;
+
+set_master_dev_err:
+ hinic3_vport_down(nic_dev);
+
+vport_up_err:
+ hinic3_close_channel(nic_dev, &qp_params);
+
+open_channel_err:
+ hinic3_free_channel_resources(nic_dev, &qp_params, &nic_dev->q_params);
+
+alloc_channel_res_err:
+ hinic3_destroy_num_qps(nic_dev);
+
+setup_qps_err:
+ hinic3_deinit_nicio_res(nic_dev->hwdev);
+
+ return err;
+}
+
+int hinic3_close(struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct hinic3_dyna_qp_params qp_params = {0};
+
+ if (!test_and_clear_bit(HINIC3_INTF_UP, &nic_dev->flags)) {
+ nicif_info(nic_dev, drv, netdev, "Netdev already close, do nothing\n");
+ return 0;
+ }
+
+ if (test_and_clear_bit(HINIC3_CHANGE_RES_INVALID, &nic_dev->flags))
+ goto out;
+
+ hinic3_set_master_dev_state(nic_dev, false);
+
+ hinic3_vport_down(nic_dev);
+ hinic3_close_channel(nic_dev, &qp_params);
+ hinic3_free_channel_resources(nic_dev, &qp_params, &nic_dev->q_params);
+
+out:
+ hinic3_deinit_nicio_res(nic_dev->hwdev);
+ hinic3_destroy_num_qps(nic_dev);
+
+ nicif_info(nic_dev, drv, nic_dev->netdev, "Netdev is down\n");
+
+ return 0;
+}
+
+#define IPV6_ADDR_LEN 4 /* address length in u32 words */
+#define PKT_INFO_LEN 9
+#define BITS_PER_TUPLE 32
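+/* XOR hash engine: simply folds all tuple bytes together. */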
+static u32 calc_xor_rss(u8 *rss_tunple, u32 len)
+{
+ u32 hash_value;
+ u32 i;
+
+ hash_value = rss_tunple[0];
+ for (i = 1; i < len; i++)
+ hash_value = hash_value ^ rss_tunple[i];
+
+ return hash_value;
+}
+
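+/*
+ * Toeplitz hash: for each set bit j of every 32-bit tuple word, XOR in the
+ * 32-bit window of the hash key starting at that bit offset. The stored key
+ * is therefore expected to be at least one u32 longer than the tuple.
+ */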
+static u32 calc_toep_rss(const u32 *rss_tunple, u32 len, const u32 *rss_key)
+{
+ u32 rss = 0;
+ u32 i, j;
+
+ for (i = 1; i <= len; i++) {
+ for (j = 0; j < BITS_PER_TUPLE; j++)
+ if (rss_tunple[i - 1] & ((u32)1 <<
+ (u32)((BITS_PER_TUPLE - 1) - j)))
+ rss ^= (rss_key[i - 1] << j) |
+ (u32)((u64)rss_key[i] >>
+ (BITS_PER_TUPLE - j));
+ }
+
+ return rss;
+}
+
+#define RSS_VAL(val, type) \
+ (((type) == HINIC3_RSS_HASH_ENGINE_TYPE_TOEP) ? ntohl(val) : (val))
+
+static u8 parse_ipv6_info(struct sk_buff *skb, u32 *rss_tunple,
+ u8 hash_engine, u32 *len)
+{
+ struct ipv6hdr *ipv6hdr = ipv6_hdr(skb);
+ u32 *saddr = (u32 *)&ipv6hdr->saddr;
+ u32 *daddr = (u32 *)&ipv6hdr->daddr;
+ u8 i;
+
+ for (i = 0; i < IPV6_ADDR_LEN; i++) {
+ rss_tunple[i] = RSS_VAL(daddr[i], hash_engine);
+ /* saddr is placed IPV6_ADDR_LEN (4) words after daddr in the tuple */
+ rss_tunple[(u32)(i + IPV6_ADDR_LEN)] =
+ RSS_VAL(saddr[i], hash_engine);
+ }
+ *len = IPV6_ADDR_LEN + IPV6_ADDR_LEN;
+
+ if (skb_network_header(skb) + sizeof(*ipv6hdr) ==
+ skb_transport_header(skb))
+ return ipv6hdr->nexthdr;
+ return 0;
+}
+
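+/*
+ * Pick a tx queue consistent with receive-side RSS: reuse the recorded rx
+ * queue when one exists; otherwise rebuild the address/port tuple from the
+ * packet headers, hash it with the configured engine and look the result up
+ * in the RSS indirection table.
+ */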
+static u16 select_queue_by_hash_func(struct net_device *dev, struct sk_buff *skb,
+ unsigned int num_tx_queues)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(dev);
+ struct nic_rss_type rss_type = nic_dev->rss_type;
+ struct iphdr *iphdr = NULL;
+ u32 rss_tunple[PKT_INFO_LEN] = {0};
+ u32 len = 0;
+ u32 hash = 0;
+ u8 hash_engine = nic_dev->rss_hash_engine;
+ u8 l4_proto;
+ unsigned char *l4_hdr = NULL;
+
+ if (skb_rx_queue_recorded(skb)) {
+ hash = skb_get_rx_queue(skb);
+ if (unlikely(hash >= num_tx_queues))
+ hash %= num_tx_queues;
+
+ return (u16)hash;
+ }
+
+ iphdr = ip_hdr(skb);
+ if (iphdr->version == IPV4_VERSION) {
+ rss_tunple[len++] = RSS_VAL(iphdr->daddr, hash_engine);
+ rss_tunple[len++] = RSS_VAL(iphdr->saddr, hash_engine);
+ l4_proto = iphdr->protocol;
+ } else if (iphdr->version == IPV6_VERSION) {
+ l4_proto = parse_ipv6_info(skb, (u32 *)rss_tunple,
+ hash_engine, &len);
+ } else {
+ return (u16)hash;
+ }
+
+ if ((iphdr->version == IPV4_VERSION &&
+ ((l4_proto == IPPROTO_UDP && rss_type.udp_ipv4) ||
+ (l4_proto == IPPROTO_TCP && rss_type.tcp_ipv4))) ||
+ (iphdr->version == IPV6_VERSION &&
+ ((l4_proto == IPPROTO_UDP && rss_type.udp_ipv6) ||
+ (l4_proto == IPPROTO_TCP && rss_type.tcp_ipv6)))) {
+ l4_hdr = skb_transport_header(skb);
+ /* High 16 bits are dport, low 16 bits are sport. */
+ rss_tunple[len++] = ((u32)ntohs(*((u16 *)l4_hdr + 1U)) << 16) |
+ ntohs(*(u16 *)l4_hdr);
+ } /* rss_type.ipv4 and rss_type.ipv6 default on. */
+
+ if (hash_engine == HINIC3_RSS_HASH_ENGINE_TYPE_TOEP)
+ hash = calc_toep_rss((u32 *)rss_tunple, len,
+ nic_dev->rss_hkey_be);
+ else
+ hash = calc_xor_rss((u8 *)rss_tunple, len * (u32)sizeof(u32));
+
+ return (u16)nic_dev->rss_indir[hash & 0xFF];
+}
+
+#define GET_DSCP_PRI_OFFSET 2
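+/*
+ * The DSCP code point is the upper 6 bits of the IPv4 TOS / IPv6 traffic
+ * class byte, so the dsfield is shifted right past the 2 ECN bits before
+ * indexing the dscp2cos map.
+ */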
+static u8 hinic3_get_dscp_up(struct hinic3_nic_dev *nic_dev, struct sk_buff *skb)
+{
+ int dscp_cp;
+
+ if (skb->protocol == htons(ETH_P_IP))
+ dscp_cp = ipv4_get_dsfield(ip_hdr(skb)) >> GET_DSCP_PRI_OFFSET;
+ else if (skb->protocol == htons(ETH_P_IPV6))
+ dscp_cp = ipv6_get_dsfield(ipv6_hdr(skb)) >> GET_DSCP_PRI_OFFSET;
+ else
+ return nic_dev->hw_dcb_cfg.default_cos;
+ return nic_dev->hw_dcb_cfg.dscp2cos[dscp_cp];
+}
+
+#if defined(HAVE_NDO_SELECT_QUEUE_SB_DEV_ONLY)
+static u16 hinic3_select_queue(struct net_device *netdev, struct sk_buff *skb,
+ struct net_device *sb_dev)
+#elif defined(HAVE_NDO_SELECT_QUEUE_ACCEL_FALLBACK)
+#if defined(HAVE_NDO_SELECT_QUEUE_SB_DEV)
+static u16 hinic3_select_queue(struct net_device *netdev, struct sk_buff *skb,
+ struct net_device *sb_dev,
+ select_queue_fallback_t fallback)
+#else
+static u16 hinic3_select_queue(struct net_device *netdev, struct sk_buff *skb,
+ __always_unused void *accel,
+ select_queue_fallback_t fallback)
+#endif
+
+#elif defined(HAVE_NDO_SELECT_QUEUE_ACCEL)
+static u16 hinic3_select_queue(struct net_device *netdev, struct sk_buff *skb,
+ __always_unused void *accel)
+
+#else
+static u16 hinic3_select_queue(struct net_device *netdev, struct sk_buff *skb)
+#endif /* end of HAVE_NDO_SELECT_QUEUE_ACCEL_FALLBACK */
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ u16 txq;
+ u8 cos, qp_num;
+
+ if (test_bit(HINIC3_SAME_RXTX, &nic_dev->flags))
+ return select_queue_by_hash_func(netdev, skb, netdev->real_num_tx_queues);
+
+ txq =
+#if defined(HAVE_NDO_SELECT_QUEUE_SB_DEV_ONLY)
+ netdev_pick_tx(netdev, skb, NULL);
+#elif defined(HAVE_NDO_SELECT_QUEUE_ACCEL_FALLBACK)
+#ifdef HAVE_NDO_SELECT_QUEUE_SB_DEV
+ fallback(netdev, skb, sb_dev);
+#else
+ fallback(netdev, skb);
+#endif
+#else
+ skb_tx_hash(netdev, skb);
+#endif
+
+ if (test_bit(HINIC3_DCB_ENABLE, &nic_dev->flags)) {
+ if (nic_dev->hw_dcb_cfg.trust == DCB_PCP) {
+ if (skb->vlan_tci)
+ cos = nic_dev->hw_dcb_cfg.pcp2cos[skb->vlan_tci >> VLAN_PRIO_SHIFT];
+ else
+ cos = nic_dev->hw_dcb_cfg.default_cos;
+ } else {
+ cos = hinic3_get_dscp_up(nic_dev, skb);
+ }
+
+ qp_num = nic_dev->hw_dcb_cfg.cos_qp_num[cos] ?
+ txq % nic_dev->hw_dcb_cfg.cos_qp_num[cos] : 0;
+ txq = nic_dev->hw_dcb_cfg.cos_qp_offset[cos] + qp_num;
+ }
+
+ return txq;
+}
+
+#ifdef HAVE_NDO_GET_STATS64
+#ifdef HAVE_VOID_NDO_GET_STATS64
+static void hinic3_get_stats64(struct net_device *netdev,
+ struct rtnl_link_stats64 *stats)
+#else
+static struct rtnl_link_stats64
+ *hinic3_get_stats64(struct net_device *netdev,
+ struct rtnl_link_stats64 *stats)
+#endif
+
+#else /* !HAVE_NDO_GET_STATS64 */
+static struct net_device_stats *hinic3_get_stats(struct net_device *netdev)
+#endif
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+#ifndef HAVE_NDO_GET_STATS64
+#ifdef HAVE_NETDEV_STATS_IN_NETDEV
+ struct net_device_stats *stats = &netdev->stats;
+#else
+ struct net_device_stats *stats = &nic_dev->net_stats;
+#endif /* HAVE_NETDEV_STATS_IN_NETDEV */
+#endif /* HAVE_NDO_GET_STATS64 */
+ struct hinic3_txq_stats *txq_stats = NULL;
+ struct hinic3_rxq_stats *rxq_stats = NULL;
+ struct hinic3_txq *txq = NULL;
+ struct hinic3_rxq *rxq = NULL;
+ u64 bytes, packets, dropped, errors;
+ unsigned int start;
+ int i;
+
+ bytes = 0;
+ packets = 0;
+ dropped = 0;
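+ /* Sum the per-queue counters inside the u64_stats seqcount loop so
+ * 64-bit reads stay consistent on 32-bit systems; retry on writer race.
+ */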
+ for (i = 0; i < nic_dev->max_qps; i++) {
+ if (!nic_dev->txqs)
+ break;
+
+ txq = &nic_dev->txqs[i];
+ txq_stats = &txq->txq_stats;
+ do {
+ start = u64_stats_fetch_begin(&txq_stats->syncp);
+ bytes += txq_stats->bytes;
+ packets += txq_stats->packets;
+ dropped += txq_stats->dropped;
+ } while (u64_stats_fetch_retry(&txq_stats->syncp, start));
+ }
+ stats->tx_packets = packets;
+ stats->tx_bytes = bytes;
+ stats->tx_dropped = dropped;
+
+ bytes = 0;
+ packets = 0;
+ errors = 0;
+ dropped = 0;
+ for (i = 0; i < nic_dev->max_qps; i++) {
+ if (!nic_dev->rxqs)
+ break;
+
+ rxq = &nic_dev->rxqs[i];
+ rxq_stats = &rxq->rxq_stats;
+ do {
+ start = u64_stats_fetch_begin(&rxq_stats->syncp);
+ bytes += rxq_stats->bytes;
+ packets += rxq_stats->packets;
+ errors += rxq_stats->csum_errors +
+ rxq_stats->other_errors;
+ dropped += rxq_stats->dropped;
+ } while (u64_stats_fetch_retry(&rxq_stats->syncp, start));
+ }
+ stats->rx_packets = packets;
+ stats->rx_bytes = bytes;
+ stats->rx_errors = errors;
+ stats->rx_dropped = dropped;
+
+#ifndef HAVE_VOID_NDO_GET_STATS64
+ return stats;
+#endif
+}
+
+#ifdef HAVE_NDO_TX_TIMEOUT_TXQ
+static void hinic3_tx_timeout(struct net_device *netdev, unsigned int txqueue)
+#else
+static void hinic3_tx_timeout(struct net_device *netdev)
+#endif
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct hinic3_io_queue *sq = NULL;
+ bool hw_err = false;
+ u32 sw_pi, hw_ci;
+ u8 q_id;
+
+ HINIC3_NIC_STATS_INC(nic_dev, netdev_tx_timeout);
+ nicif_err(nic_dev, drv, netdev, "Tx timeout\n");
+
+ for (q_id = 0; q_id < nic_dev->q_params.num_qps; q_id++) {
+ if (!netif_xmit_stopped(netdev_get_tx_queue(netdev, q_id)))
+ continue;
+
+ sq = nic_dev->txqs[q_id].sq;
+ sw_pi = hinic3_get_sq_local_pi(sq);
+ hw_ci = hinic3_get_sq_hw_ci(sq);
+ nicif_info(nic_dev, drv, netdev,
+ "txq%u: sw_pi: %hu, hw_ci: %u, sw_ci: %u, napi->state: 0x%lx.\n",
+ q_id, sw_pi, hw_ci, hinic3_get_sq_local_ci(sq),
+ nic_dev->q_params.irq_cfg[q_id].napi.state);
+
+ if (sw_pi != hw_ci)
+ hw_err = true;
+ }
+
+ if (hw_err)
+ set_bit(EVENT_WORK_TX_TIMEOUT, &nic_dev->event_flag);
+}
+
+static int hinic3_change_mtu(struct net_device *netdev, int new_mtu)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ u32 mtu = (u32)new_mtu;
+ int err = 0;
+
+#ifdef HAVE_XDP_SUPPORT
+ u32 xdp_max_mtu;
+
+ if (hinic3_is_xdp_enable(nic_dev)) {
+ xdp_max_mtu = hinic3_xdp_max_mtu(nic_dev);
+ if (mtu > xdp_max_mtu) {
+ nicif_err(nic_dev, drv, netdev,
+ "Max MTU for xdp usage is %d\n", xdp_max_mtu);
+ return -EINVAL;
+ }
+ }
+#endif
+
+ err = hinic3_config_port_mtu(nic_dev, mtu);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev, "Failed to change port mtu to %d\n",
+ new_mtu);
+ } else {
+ nicif_info(nic_dev, drv, nic_dev->netdev, "Change mtu from %u to %d\n",
+ netdev->mtu, new_mtu);
+ netdev->mtu = mtu;
+ }
+
+ return err;
+}
+
+static int hinic3_set_mac_addr(struct net_device *netdev, void *addr)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct sockaddr *saddr = addr;
+ int err;
+
+ if (!is_valid_ether_addr(saddr->sa_data))
+ return -EADDRNOTAVAIL;
+
+ if (ether_addr_equal(netdev->dev_addr, saddr->sa_data)) {
+ nicif_info(nic_dev, drv, netdev,
+ "Already using mac address %pM\n",
+ saddr->sa_data);
+ return 0;
+ }
+
+ err = hinic3_config_port_mac(nic_dev, saddr);
+ if (err)
+ return err;
+
+ ether_addr_copy(netdev->dev_addr, saddr->sa_data);
+
+ nicif_info(nic_dev, drv, netdev, "Set new mac address %pM\n",
+ saddr->sa_data);
+
+ return 0;
+}
+
+#if (KERNEL_VERSION(3, 3, 0) > LINUX_VERSION_CODE)
+static void
+#else
+static int
+#endif
+hinic3_vlan_rx_add_vid(struct net_device *netdev,
+ #if (KERNEL_VERSION(3, 10, 0) <= LINUX_VERSION_CODE)
+ __always_unused __be16 proto,
+ #endif
+ u16 vid)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ unsigned long *vlan_bitmap = nic_dev->vlan_bitmap;
+ u16 func_id;
+ u32 col, line;
+ int err = 0;
+
+ /* VLAN 0 is never added, which matches the VLAN 0 delete handling. */
+ if (vid == 0)
+ goto end;
+
+ col = VID_COL(nic_dev, vid);
+ line = VID_LINE(nic_dev, vid);
+
+ func_id = hinic3_global_func_id(nic_dev->hwdev);
+
+ err = hinic3_add_vlan(nic_dev->hwdev, vid, func_id);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev, "Failed to add vlan %u\n", vid);
+ goto end;
+ }
+
+ set_bit(col, &vlan_bitmap[line]);
+
+ nicif_info(nic_dev, drv, netdev, "Add vlan %u\n", vid);
+
+end:
+#if (KERNEL_VERSION(3, 3, 0) <= LINUX_VERSION_CODE)
+ return err;
+#else
+ return;
+#endif
+}
+
+#if (KERNEL_VERSION(3, 3, 0) > LINUX_VERSION_CODE)
+static void
+#else
+static int
+#endif
+hinic3_vlan_rx_kill_vid(struct net_device *netdev,
+ #if (KERNEL_VERSION(3, 10, 0) <= LINUX_VERSION_CODE)
+ __always_unused __be16 proto,
+ #endif
+ u16 vid)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ unsigned long *vlan_bitmap = nic_dev->vlan_bitmap;
+ u16 func_id;
+ int col, line;
+ int err = 0;
+
+ col = VID_COL(nic_dev, vid);
+ line = VID_LINE(nic_dev, vid);
+
+ /* In the broadcast scenario, ucode finds the corresponding function
+ * based on VLAN 0 of vlan table. If we delete VLAN 0, the VLAN function
+ * is affected.
+ */
+ if (vid == 0)
+ goto end;
+
+ func_id = hinic3_global_func_id(nic_dev->hwdev);
+ err = hinic3_del_vlan(nic_dev->hwdev, vid, func_id);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev, "Failed to delete vlan\n");
+ goto end;
+ }
+
+ clear_bit(col, &vlan_bitmap[line]);
+
+ nicif_info(nic_dev, drv, netdev, "Remove vlan %u\n", vid);
+
+end:
+#if (KERNEL_VERSION(3, 3, 0) <= LINUX_VERSION_CODE)
+ return err;
+#else
+ return;
+#endif
+}
+
+#ifdef NEED_VLAN_RESTORE
+static int hinic3_vlan_restore(struct net_device *netdev)
+{
+ int err = 0;
+#if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
+ struct net_device *vlandev = NULL;
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ unsigned long *vlan_bitmap = nic_dev->vlan_bitmap;
+ u32 col, line;
+ u16 i;
+
+ if (!netdev->netdev_ops->ndo_vlan_rx_add_vid)
+ return -EFAULT;
+ rcu_read_lock();
+ for (i = 0; i < VLAN_N_VID; i++) {
+/* lint -e778 */
+#ifdef HAVE_VLAN_FIND_DEV_DEEP_RCU
+ vlandev =
+ __vlan_find_dev_deep_rcu(netdev, htons(ETH_P_8021Q), i);
+#else
+ vlandev = __vlan_find_dev_deep(netdev, htons(ETH_P_8021Q), i);
+#endif
+/* lint +e778 */
+ col = VID_COL(nic_dev, i);
+ line = VID_LINE(nic_dev, i);
+ if (!vlandev && (vlan_bitmap[line] & (1UL << col)) != 0) {
+#if (KERNEL_VERSION(3, 10, 0) <= LINUX_VERSION_CODE)
+ err = netdev->netdev_ops->ndo_vlan_rx_kill_vid(netdev,
+ htons(ETH_P_8021Q), i);
+ if (err) {
+ hinic3_err(nic_dev, drv, "delete vlan %u failed, err code %d\n",
+ i, err);
+ break;
+ }
+#else
+ netdev->netdev_ops->ndo_vlan_rx_kill_vid(netdev, i);
+#endif
+ } else if (vlandev && (vlan_bitmap[line] & (1UL << col)) == 0) {
+#if (KERNEL_VERSION(3, 10, 0) <= LINUX_VERSION_CODE)
+ err = netdev->netdev_ops->ndo_vlan_rx_add_vid(netdev,
+ htons(ETH_P_8021Q), i);
+ if (err) {
+ hinic3_err(nic_dev, drv, "restore vlan %u failed, err code %d\n",
+ i, err);
+ break;
+ }
+#else
+ netdev->netdev_ops->ndo_vlan_rx_add_vid(netdev, i);
+#endif
+ }
+ }
+ rcu_read_unlock();
+#endif
+
+ return err;
+}
+#endif
+
+#define SET_FEATURES_OP_STR(op) ((op) ? "Enable" : "Disable")
+
+static int set_feature_rx_csum(struct hinic3_nic_dev *nic_dev,
+ netdev_features_t wanted_features,
+ netdev_features_t features,
+ netdev_features_t *failed_features)
+{
+ netdev_features_t changed = wanted_features ^ features;
+
+ if (changed & NETIF_F_RXCSUM)
+ hinic3_info(nic_dev, drv, "%s rx csum success\n",
+ SET_FEATURES_OP_STR(wanted_features &
+ NETIF_F_RXCSUM));
+
+ return 0;
+}
+
+static int set_feature_tso(struct hinic3_nic_dev *nic_dev,
+ netdev_features_t wanted_features,
+ netdev_features_t features,
+ netdev_features_t *failed_features)
+{
+ netdev_features_t changed = wanted_features ^ features;
+
+ if (changed & NETIF_F_TSO)
+ hinic3_info(nic_dev, drv, "%s tso success\n",
+ SET_FEATURES_OP_STR(wanted_features & NETIF_F_TSO));
+
+ return 0;
+}
+
+#ifdef NETIF_F_UFO
+static int set_feature_ufo(struct hinic3_nic_dev *nic_dev,
+ netdev_features_t wanted_features,
+ netdev_features_t features,
+ netdev_features_t *failed_features)
+{
+ netdev_features_t changed = wanted_features ^ features;
+
+ if (changed & NETIF_F_UFO)
+ hinic3_info(nic_dev, drv, "%s ufo success\n",
+ SET_FEATURES_OP_STR(wanted_features & NETIF_F_UFO));
+
+ return 0;
+}
+#endif
+
+static int set_feature_lro(struct hinic3_nic_dev *nic_dev,
+ netdev_features_t wanted_features,
+ netdev_features_t features,
+ netdev_features_t *failed_features)
+{
+ netdev_features_t changed = wanted_features ^ features;
+ bool en = !!(wanted_features & NETIF_F_LRO);
+ int err;
+
+ if (!(changed & NETIF_F_LRO))
+ return 0;
+
+#ifdef HAVE_XDP_SUPPORT
+ if (en && hinic3_is_xdp_enable(nic_dev)) {
+ hinic3_err(nic_dev, drv, "Can not enable LRO when xdp is enable\n");
+ *failed_features |= NETIF_F_LRO;
+ return -EINVAL;
+ }
+#endif
+
+ err = hinic3_set_rx_lro_state(nic_dev->hwdev, en,
+ HINIC3_LRO_DEFAULT_TIME_LIMIT,
+ HINIC3_LRO_DEFAULT_COAL_PKT_SIZE);
+ if (err) {
+ hinic3_err(nic_dev, drv, "%s lro failed\n",
+ SET_FEATURES_OP_STR(en));
+ *failed_features |= NETIF_F_LRO;
+ } else {
+ hinic3_info(nic_dev, drv, "%s lro success\n",
+ SET_FEATURES_OP_STR(en));
+ }
+
+ return err;
+}
+
+static int set_feature_rx_cvlan(struct hinic3_nic_dev *nic_dev,
+ netdev_features_t wanted_features,
+ netdev_features_t features,
+ netdev_features_t *failed_features)
+{
+ netdev_features_t changed = wanted_features ^ features;
+#ifdef NETIF_F_HW_VLAN_CTAG_RX
+ netdev_features_t vlan_feature = NETIF_F_HW_VLAN_CTAG_RX;
+#else
+ netdev_features_t vlan_feature = NETIF_F_HW_VLAN_RX;
+#endif
+ bool en = !!(wanted_features & vlan_feature);
+ int err;
+
+ if (!(changed & vlan_feature))
+ return 0;
+
+ err = hinic3_set_rx_vlan_offload(nic_dev->hwdev, en);
+ if (err) {
+ hinic3_err(nic_dev, drv, "%s rxvlan failed\n",
+ SET_FEATURES_OP_STR(en));
+ *failed_features |= vlan_feature;
+ } else {
+ hinic3_info(nic_dev, drv, "%s rxvlan success\n",
+ SET_FEATURES_OP_STR(en));
+ }
+
+ return err;
+}
+
+static int set_feature_vlan_filter(struct hinic3_nic_dev *nic_dev,
+ netdev_features_t wanted_features,
+ netdev_features_t features,
+ netdev_features_t *failed_features)
+{
+ netdev_features_t changed = wanted_features ^ features;
+#if defined(NETIF_F_HW_VLAN_CTAG_FILTER)
+ netdev_features_t vlan_filter_feature = NETIF_F_HW_VLAN_CTAG_FILTER;
+#elif defined(NETIF_F_HW_VLAN_FILTER)
+ netdev_features_t vlan_filter_feature = NETIF_F_HW_VLAN_FILTER;
+#endif
+ bool en = !!(wanted_features & vlan_filter_feature);
+ int err = 0;
+
+ if (!(changed & vlan_filter_feature))
+ return 0;
+
+#ifdef NEED_VLAN_RESTORE
+ if (en)
+ err = hinic3_vlan_restore(nic_dev->netdev);
+#endif
+
+ if (err == 0)
+ err = hinic3_set_vlan_fliter(nic_dev->hwdev, en);
+ if (err) {
+ hinic3_err(nic_dev, drv, "%s rx vlan filter failed\n",
+ SET_FEATURES_OP_STR(en));
+ *failed_features |= vlan_filter_feature;
+ } else {
+ hinic3_info(nic_dev, drv, "%s rx vlan filter success\n",
+ SET_FEATURES_OP_STR(en));
+ }
+
+ return err;
+}
+
+static int set_features(struct hinic3_nic_dev *nic_dev,
+ netdev_features_t pre_features,
+ netdev_features_t features)
+{
+ netdev_features_t failed_features = 0;
+ u32 err = 0;
+
+ err |= (u32)set_feature_rx_csum(nic_dev, features, pre_features,
+ &failed_features);
+ err |= (u32)set_feature_tso(nic_dev, features, pre_features,
+ &failed_features);
+ err |= (u32)set_feature_lro(nic_dev, features, pre_features,
+ &failed_features);
+#ifdef NETIF_F_UFO
+ err |= (u32)set_feature_ufo(nic_dev, features, pre_features,
+ &failed_features);
+#endif
+ err |= (u32)set_feature_rx_cvlan(nic_dev, features, pre_features,
+ &failed_features);
+ err |= (u32)set_feature_vlan_filter(nic_dev, features, pre_features,
+ &failed_features);
+ if (err) {
+ nic_dev->netdev->features = features ^ failed_features;
+ return -EIO;
+ }
+
+ return 0;
+}
+
+#ifdef HAVE_RHEL6_NET_DEVICE_OPS_EXT
+static int hinic3_set_features(struct net_device *netdev, u32 features)
+#else
+static int hinic3_set_features(struct net_device *netdev,
+ netdev_features_t features)
+#endif
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+ return set_features(nic_dev, nic_dev->netdev->features,
+ features);
+}
+
+int hinic3_set_hw_features(struct hinic3_nic_dev *nic_dev)
+{
+ /* enable all hw features in netdev->features */
+ return set_features(nic_dev, ~nic_dev->netdev->features,
+ nic_dev->netdev->features);
+}
+
+#ifdef HAVE_RHEL6_NET_DEVICE_OPS_EXT
+static u32 hinic3_fix_features(struct net_device *netdev, u32 features)
+#else
+static netdev_features_t hinic3_fix_features(struct net_device *netdev,
+ netdev_features_t features)
+#endif
+{
+ netdev_features_t features_tmp = features;
+
+ /* If Rx checksum is disabled, then LRO should also be disabled */
+ if (!(features_tmp & NETIF_F_RXCSUM))
+ features_tmp &= ~NETIF_F_LRO;
+
+ return features_tmp;
+}
+
+#ifdef CONFIG_NET_POLL_CONTROLLER
+static void hinic3_netpoll(struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ u16 i;
+
+ for (i = 0; i < nic_dev->q_params.num_qps; i++)
+ napi_schedule(&nic_dev->q_params.irq_cfg[i].napi);
+}
+#endif /* CONFIG_NET_POLL_CONTROLLER */
+
+static int hinic3_ndo_set_vf_mac(struct net_device *netdev, int vf, u8 *mac)
+{
+ struct hinic3_nic_dev *adapter = netdev_priv(netdev);
+ int err;
+
+ if (is_multicast_ether_addr(mac) || /*lint !e574*/
+ vf >= pci_num_vf(adapter->pdev)) /*lint !e574*/
+ return -EINVAL;
+
+ err = hinic3_set_vf_mac(adapter->hwdev, OS_VF_ID_TO_HW(vf), mac);
+ if (err)
+ return err;
+
+ if (!is_zero_ether_addr(mac))
+ nic_info(&adapter->pdev->dev, "Setting MAC %pM on VF %d\n",
+ mac, vf);
+ else
+ nic_info(&adapter->pdev->dev, "Deleting MAC on VF %d\n", vf);
+
+ nic_info(&adapter->pdev->dev, "Please reload the VF driver to make this change effective.");
+
+ return 0;
+}
+
+/*lint -save -e574 -e734*/
+#ifdef IFLA_VF_MAX
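+/*
+ * Program a VF VLAN/QoS pair: any previously configured VLAN filter is
+ * removed first, the new one is added (or only removed when vlan and qos
+ * are both 0), and the VF MAC entry is migrated to the new VLAN.
+ */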
+static int set_hw_vf_vlan(void *hwdev, u16 cur_vlanprio, int vf,
+ u16 vlan, u8 qos)
+{
+ int err = 0;
+ u16 old_vlan = cur_vlanprio & VLAN_VID_MASK;
+
+ if (vlan || qos) {
+ if (cur_vlanprio) {
+ err = hinic3_kill_vf_vlan(hwdev, OS_VF_ID_TO_HW(vf));
+ if (err)
+ return err;
+ }
+ err = hinic3_add_vf_vlan(hwdev, OS_VF_ID_TO_HW(vf), vlan, qos);
+ } else {
+ err = hinic3_kill_vf_vlan(hwdev, OS_VF_ID_TO_HW(vf));
+ }
+
+	if (err)
+		return err;
+
+	return hinic3_update_mac_vlan(hwdev, old_vlan, vlan, OS_VF_ID_TO_HW(vf));
+}
+
+#define HINIC3_MAX_VLAN_ID 4094
+#define HINIC3_MAX_QOS_NUM 7
+
+#ifdef IFLA_VF_VLAN_INFO_MAX
+static int hinic3_ndo_set_vf_vlan(struct net_device *netdev, int vf, u16 vlan,
+ u8 qos, __be16 vlan_proto)
+#else
+static int hinic3_ndo_set_vf_vlan(struct net_device *netdev, int vf, u16 vlan,
+ u8 qos)
+#endif
+{
+ struct hinic3_nic_dev *adapter = netdev_priv(netdev);
+ u16 vlanprio, cur_vlanprio;
+
+ if (vf >= pci_num_vf(adapter->pdev) ||
+ vlan > HINIC3_MAX_VLAN_ID || qos > HINIC3_MAX_QOS_NUM)
+ return -EINVAL;
+#ifdef IFLA_VF_VLAN_INFO_MAX
+ if (vlan_proto != htons(ETH_P_8021Q))
+ return -EPROTONOSUPPORT;
+#endif
+ vlanprio = vlan | (qos << HINIC3_VLAN_PRIORITY_SHIFT);
+ cur_vlanprio = hinic3_vf_info_vlanprio(adapter->hwdev,
+ OS_VF_ID_TO_HW(vf));
+ /* duplicate request, so just return success */
+ if (vlanprio == cur_vlanprio)
+ return 0;
+
+ return set_hw_vf_vlan(adapter->hwdev, cur_vlanprio, vf, vlan, qos);
+}
+#endif
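+
+/*
+ * The vlan/qos pair is packed into one u16 the same way the 802.1Q TCI
+ * lays out VID and PCP. Assuming HINIC3_VLAN_PRIORITY_SHIFT is the usual
+ * 802.1Q priority shift of 13, a worked example:
+ *
+ *	u16 vlanprio = 100 | (5 << 13);		// vlan 100, qos 5 -> 0xA064
+ *	u16 vid = vlanprio & VLAN_VID_MASK;	// 100
+ *	u8 qos = (u8)(vlanprio >> 13);		// 5
+ */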
+
+#ifdef HAVE_VF_SPOOFCHK_CONFIGURE
+static int hinic3_ndo_set_vf_spoofchk(struct net_device *netdev, int vf,
+ bool setting)
+{
+ struct hinic3_nic_dev *adapter = netdev_priv(netdev);
+ int err = 0;
+ bool cur_spoofchk = false;
+
+ if (vf >= pci_num_vf(adapter->pdev))
+ return -EINVAL;
+
+ cur_spoofchk = hinic3_vf_info_spoofchk(adapter->hwdev,
+ OS_VF_ID_TO_HW(vf));
+ /* same request, so just return success */
+ if ((setting && cur_spoofchk) || (!setting && !cur_spoofchk))
+ return 0;
+
+ err = hinic3_set_vf_spoofchk(adapter->hwdev,
+ (u16)OS_VF_ID_TO_HW(vf), setting);
+ if (!err)
+ nicif_info(adapter, drv, netdev, "Set VF %d spoofchk %s\n",
+ vf, setting ? "on" : "off");
+
+ return err;
+}
+#endif
+
+#ifdef HAVE_NDO_SET_VF_TRUST
+static int hinic3_ndo_set_vf_trust(struct net_device *netdev, int vf, bool setting)
+{
+ struct hinic3_nic_dev *adapter = netdev_priv(netdev);
+ int err;
+ bool cur_trust;
+
+ if (vf >= pci_num_vf(adapter->pdev))
+ return -EINVAL;
+
+ cur_trust = hinic3_get_vf_trust(adapter->hwdev,
+ OS_VF_ID_TO_HW(vf));
+ /* same request, so just return success */
+ if ((setting && cur_trust) || (!setting && !cur_trust))
+ return 0;
+
+ err = hinic3_set_vf_trust(adapter->hwdev,
+ (u16)OS_VF_ID_TO_HW(vf), setting);
+ if (!err)
+ nicif_info(adapter, drv, netdev, "Set VF %d trusted %s successfully\n",
+ vf, setting ? "on" : "off");
+ else
+		nicif_err(adapter, drv, netdev, "Failed to set VF %d trusted %s\n",
+ vf, setting ? "on" : "off");
+
+ return err;
+}
+#endif
+
+static int hinic3_ndo_get_vf_config(struct net_device *netdev,
+ int vf, struct ifla_vf_info *ivi)
+{
+ struct hinic3_nic_dev *adapter = netdev_priv(netdev);
+
+ if (vf >= pci_num_vf(adapter->pdev))
+ return -EINVAL;
+
+ hinic3_get_vf_config(adapter->hwdev, (u16)OS_VF_ID_TO_HW(vf), ivi);
+
+ return 0;
+}
+
+/**
+ * hinic3_ndo_set_vf_link_state
+ * @netdev: network interface device structure
+ * @vf_id: VF identifier
+ * @link: required link state
+ *
+ * Set the link state of a specified VF, regardless of physical link state
+ **/
+int hinic3_ndo_set_vf_link_state(struct net_device *netdev, int vf_id, int link)
+{
+ static const char * const vf_link[] = {"auto", "enable", "disable"};
+ struct hinic3_nic_dev *adapter = netdev_priv(netdev);
+ int err;
+
+ /* validate the request */
+ if (vf_id >= pci_num_vf(adapter->pdev)) {
+ nicif_err(adapter, drv, netdev,
+ "Invalid VF Identifier %d\n", vf_id);
+ return -EINVAL;
+ }
+
+ err = hinic3_set_vf_link_state(adapter->hwdev,
+ (u16)OS_VF_ID_TO_HW(vf_id), link);
+ if (!err)
+ nicif_info(adapter, drv, netdev, "Set VF %d link state: %s\n",
+ vf_id, vf_link[link]);
+
+ return err;
+}
+
+static int is_set_vf_bw_param_valid(const struct hinic3_nic_dev *adapter,
+ int vf, int min_tx_rate, int max_tx_rate)
+{
+ if (!HINIC3_SUPPORT_RATE_LIMIT(adapter->hwdev)) {
+		nicif_err(adapter, drv, adapter->netdev, "Current function doesn't support setting vf rate limit\n");
+ return -EOPNOTSUPP;
+ }
+
+ /* verify VF is active */
+ if (vf >= pci_num_vf(adapter->pdev)) {
+ nicif_err(adapter, drv, adapter->netdev, "VF number must be less than %d\n",
+ pci_num_vf(adapter->pdev));
+ return -EINVAL;
+ }
+
+ if (max_tx_rate < min_tx_rate) {
+		nicif_err(adapter, drv, adapter->netdev, "Invalid rate, max rate %d must be greater than min rate %d\n",
+ max_tx_rate, min_tx_rate);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+#define HINIC3_TX_RATE_TABLE_FULL 12
+
+#ifdef HAVE_NDO_SET_VF_MIN_MAX_TX_RATE
+static int hinic3_ndo_set_vf_bw(struct net_device *netdev,
+ int vf, int min_tx_rate, int max_tx_rate)
+#else
+static int hinic3_ndo_set_vf_bw(struct net_device *netdev, int vf,
+ int max_tx_rate)
+#endif /* HAVE_NDO_SET_VF_MIN_MAX_TX_RATE */
+{
+ struct hinic3_nic_dev *adapter = netdev_priv(netdev);
+ struct nic_port_info port_info = {0};
+#ifndef HAVE_NDO_SET_VF_MIN_MAX_TX_RATE
+ int min_tx_rate = 0;
+#endif
+ u8 link_status = 0;
+ u32 speeds[] = {0, SPEED_10, SPEED_100, SPEED_1000, SPEED_10000,
+ SPEED_25000, SPEED_40000, SPEED_50000, SPEED_100000,
+ SPEED_200000};
+ int err = 0;
+
+ err = is_set_vf_bw_param_valid(adapter, vf, min_tx_rate, max_tx_rate);
+ if (err)
+ return err;
+
+ err = hinic3_get_link_state(adapter->hwdev, &link_status);
+ if (err) {
+ nicif_err(adapter, drv, netdev,
+			  "Get link status failed when setting vf tx rate\n");
+ return -EIO;
+ }
+
+ if (!link_status) {
+ nicif_err(adapter, drv, netdev,
+			  "Link status must be up when setting vf tx rate\n");
+ return -EINVAL;
+ }
+
+ err = hinic3_get_port_info(adapter->hwdev, &port_info,
+ HINIC3_CHANNEL_NIC);
+ if (err || port_info.speed >= PORT_SPEED_UNKNOWN)
+ return -EIO;
+
+	/* rate limit cannot be less than 0 or greater than link speed */
+ if (max_tx_rate < 0 || max_tx_rate > speeds[port_info.speed]) {
+ nicif_err(adapter, drv, netdev, "Set vf max tx rate must be in [0 - %u]\n",
+ speeds[port_info.speed]);
+ return -EINVAL;
+ }
+
+ err = hinic3_set_vf_tx_rate(adapter->hwdev, (u16)OS_VF_ID_TO_HW(vf),
+ (u32)max_tx_rate, (u32)min_tx_rate);
+ if (err) {
+ nicif_err(adapter, drv, netdev,
+ "Unable to set VF %d max rate %d min rate %d%s\n",
+ vf, max_tx_rate, min_tx_rate,
+ err == HINIC3_TX_RATE_TABLE_FULL ?
+ ", tx rate profile is full" : "");
+ return -EIO;
+ }
+
+#ifdef HAVE_NDO_SET_VF_MIN_MAX_TX_RATE
+ nicif_info(adapter, drv, netdev,
+ "Set VF %d max tx rate %d min tx rate %d successfully\n",
+ vf, max_tx_rate, min_tx_rate);
+#else
+ nicif_info(adapter, drv, netdev,
+ "Set VF %d tx rate %d successfully\n",
+ vf, max_tx_rate);
+#endif
+
+ return 0;
+}
+
+#ifdef HAVE_XDP_SUPPORT
+bool hinic3_is_xdp_enable(struct hinic3_nic_dev *nic_dev)
+{
+ return !!nic_dev->xdp_prog;
+}
+
+int hinic3_xdp_max_mtu(struct hinic3_nic_dev *nic_dev)
+{
+ return nic_dev->rx_buff_len - (ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN);
+}
+
+static int hinic3_xdp_setup(struct hinic3_nic_dev *nic_dev,
+ struct bpf_prog *prog,
+ struct netlink_ext_ack *extack)
+{
+ struct bpf_prog *old_prog = NULL;
+ int max_mtu = hinic3_xdp_max_mtu(nic_dev);
+ int q_id;
+
+ if (nic_dev->netdev->mtu > max_mtu) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to setup xdp program, the current MTU %d is larger than max allowed MTU %d\n",
+ nic_dev->netdev->mtu, max_mtu);
+ NL_SET_ERR_MSG_MOD(extack,
+ "MTU too large for loading xdp program");
+ return -EINVAL;
+ }
+
+ if (nic_dev->netdev->features & NETIF_F_LRO) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to setup xdp program while LRO is on\n");
+ NL_SET_ERR_MSG_MOD(extack,
+				   "Failed to setup xdp program while LRO is on");
+ return -EINVAL;
+ }
+
+ old_prog = xchg(&nic_dev->xdp_prog, prog);
+ for (q_id = 0; q_id < nic_dev->max_qps; q_id++)
+ xchg(&nic_dev->rxqs[q_id].xdp_prog, nic_dev->xdp_prog);
+
+ if (old_prog)
+ bpf_prog_put(old_prog);
+
+ return 0;
+}
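+
+/*
+ * The attach path above relies on xchg() so the datapath never observes a
+ * torn program pointer: the new prog is published first, every RX queue is
+ * switched over, and only then is the old program's reference dropped. The
+ * same pattern, condensed ('priv' is a placeholder for the driver private
+ * struct):
+ *
+ *	struct bpf_prog *old = xchg(&priv->xdp_prog, new_prog);
+ *
+ *	for (q = 0; q < priv->max_qps; q++)
+ *		xchg(&priv->rxqs[q].xdp_prog, new_prog);
+ *	if (old)
+ *		bpf_prog_put(old);	// free once no queue uses it
+ */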
+
+#ifdef HAVE_NDO_BPF_NETDEV_BPF
+static int hinic3_xdp(struct net_device *netdev, struct netdev_bpf *xdp)
+#else
+static int hinic3_xdp(struct net_device *netdev, struct netdev_xdp *xdp)
+#endif
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+ switch (xdp->command) {
+ case XDP_SETUP_PROG:
+ return hinic3_xdp_setup(nic_dev, xdp->prog, xdp->extack);
+#ifdef HAVE_XDP_QUERY_PROG
+ case XDP_QUERY_PROG:
+ xdp->prog_id = nic_dev->xdp_prog ?
+ nic_dev->xdp_prog->aux->id : 0;
+ return 0;
+#endif
+ default:
+ return -EINVAL;
+ }
+}
+#endif
+
+static const struct net_device_ops hinic3_netdev_ops = {
+ .ndo_open = hinic3_open,
+ .ndo_stop = hinic3_close,
+ .ndo_start_xmit = hinic3_xmit_frame,
+
+#ifdef HAVE_NDO_GET_STATS64
+ .ndo_get_stats64 = hinic3_get_stats64,
+#else
+ .ndo_get_stats = hinic3_get_stats,
+#endif /* HAVE_NDO_GET_STATS64 */
+
+ .ndo_tx_timeout = hinic3_tx_timeout,
+ .ndo_select_queue = hinic3_select_queue,
+#ifdef HAVE_RHEL7_NETDEV_OPS_EXT_NDO_CHANGE_MTU
+ .extended.ndo_change_mtu = hinic3_change_mtu,
+#else
+ .ndo_change_mtu = hinic3_change_mtu,
+#endif
+ .ndo_set_mac_address = hinic3_set_mac_addr,
+ .ndo_validate_addr = eth_validate_addr,
+
+#if defined(NETIF_F_HW_VLAN_TX) || defined(NETIF_F_HW_VLAN_CTAG_TX)
+ .ndo_vlan_rx_add_vid = hinic3_vlan_rx_add_vid,
+ .ndo_vlan_rx_kill_vid = hinic3_vlan_rx_kill_vid,
+#endif
+
+#ifdef HAVE_RHEL7_NET_DEVICE_OPS_EXT
+ /* RHEL7 requires this to be defined to enable extended ops. RHEL7
+ * uses the function get_ndo_ext to retrieve offsets for extended
+	 * fields from within the net_device_ops struct and ndo_size is checked
+ * to determine whether or not the offset is valid.
+ */
+ .ndo_size = sizeof(const struct net_device_ops),
+#endif
+
+#ifdef IFLA_VF_MAX
+ .ndo_set_vf_mac = hinic3_ndo_set_vf_mac,
+#ifdef HAVE_RHEL7_NETDEV_OPS_EXT_NDO_SET_VF_VLAN
+ .extended.ndo_set_vf_vlan = hinic3_ndo_set_vf_vlan,
+#else
+ .ndo_set_vf_vlan = hinic3_ndo_set_vf_vlan,
+#endif
+#ifdef HAVE_NDO_SET_VF_MIN_MAX_TX_RATE
+ .ndo_set_vf_rate = hinic3_ndo_set_vf_bw,
+#else
+ .ndo_set_vf_tx_rate = hinic3_ndo_set_vf_bw,
+#endif /* HAVE_NDO_SET_VF_MIN_MAX_TX_RATE */
+#ifdef HAVE_VF_SPOOFCHK_CONFIGURE
+ .ndo_set_vf_spoofchk = hinic3_ndo_set_vf_spoofchk,
+#endif
+
+#ifdef HAVE_NDO_SET_VF_TRUST
+#ifdef HAVE_RHEL7_NET_DEVICE_OPS_EXT
+ .extended.ndo_set_vf_trust = hinic3_ndo_set_vf_trust,
+#else
+ .ndo_set_vf_trust = hinic3_ndo_set_vf_trust,
+#endif /* HAVE_RHEL7_NET_DEVICE_OPS_EXT */
+#endif /* HAVE_NDO_SET_VF_TRUST */
+
+ .ndo_get_vf_config = hinic3_ndo_get_vf_config,
+#endif
+
+#ifdef CONFIG_NET_POLL_CONTROLLER
+ .ndo_poll_controller = hinic3_netpoll,
+#endif /* CONFIG_NET_POLL_CONTROLLER */
+
+ .ndo_set_rx_mode = hinic3_nic_set_rx_mode,
+
+#ifdef HAVE_XDP_SUPPORT
+#ifdef HAVE_NDO_BPF_NETDEV_BPF
+ .ndo_bpf = hinic3_xdp,
+#else
+ .ndo_xdp = hinic3_xdp,
+#endif
+#endif
+#ifdef HAVE_RHEL6_NET_DEVICE_OPS_EXT
+};
+
+/* RHEL6 keeps these operations in a separate structure */
+static const struct net_device_ops_ext hinic3_netdev_ops_ext = {
+ .size = sizeof(struct net_device_ops_ext),
+#endif /* HAVE_RHEL6_NET_DEVICE_OPS_EXT */
+
+#ifdef HAVE_NDO_SET_VF_LINK_STATE
+ .ndo_set_vf_link_state = hinic3_ndo_set_vf_link_state,
+#endif
+
+#ifdef HAVE_NDO_SET_FEATURES
+ .ndo_fix_features = hinic3_fix_features,
+ .ndo_set_features = hinic3_set_features,
+#endif /* HAVE_NDO_SET_FEATURES */
+};
+
+static const struct net_device_ops hinic3vf_netdev_ops = {
+ .ndo_open = hinic3_open,
+ .ndo_stop = hinic3_close,
+ .ndo_start_xmit = hinic3_xmit_frame,
+
+#ifdef HAVE_NDO_GET_STATS64
+ .ndo_get_stats64 = hinic3_get_stats64,
+#else
+ .ndo_get_stats = hinic3_get_stats,
+#endif /* HAVE_NDO_GET_STATS64 */
+
+ .ndo_tx_timeout = hinic3_tx_timeout,
+ .ndo_select_queue = hinic3_select_queue,
+
+#ifdef HAVE_RHEL7_NET_DEVICE_OPS_EXT
+ /* RHEL7 requires this to be defined to enable extended ops. RHEL7
+ * uses the function get_ndo_ext to retrieve offsets for extended
+	 * fields from within the net_device_ops struct and ndo_size is checked
+ * to determine whether or not the offset is valid.
+ */
+ .ndo_size = sizeof(const struct net_device_ops),
+#endif
+
+#ifdef HAVE_RHEL7_NETDEV_OPS_EXT_NDO_CHANGE_MTU
+ .extended.ndo_change_mtu = hinic3_change_mtu,
+#else
+ .ndo_change_mtu = hinic3_change_mtu,
+#endif
+ .ndo_set_mac_address = hinic3_set_mac_addr,
+ .ndo_validate_addr = eth_validate_addr,
+
+#if defined(NETIF_F_HW_VLAN_TX) || defined(NETIF_F_HW_VLAN_CTAG_TX)
+ .ndo_vlan_rx_add_vid = hinic3_vlan_rx_add_vid,
+ .ndo_vlan_rx_kill_vid = hinic3_vlan_rx_kill_vid,
+#endif
+
+#ifdef CONFIG_NET_POLL_CONTROLLER
+ .ndo_poll_controller = hinic3_netpoll,
+#endif /* CONFIG_NET_POLL_CONTROLLER */
+
+ .ndo_set_rx_mode = hinic3_nic_set_rx_mode,
+
+#ifdef HAVE_XDP_SUPPORT
+#ifdef HAVE_NDO_BPF_NETDEV_BPF
+ .ndo_bpf = hinic3_xdp,
+#else
+ .ndo_xdp = hinic3_xdp,
+#endif
+#endif
+#ifdef HAVE_RHEL6_NET_DEVICE_OPS_EXT
+};
+
+/* RHEL6 keeps these operations in a separate structure */
+static const struct net_device_ops_ext hinic3vf_netdev_ops_ext = {
+ .size = sizeof(struct net_device_ops_ext),
+#endif /* HAVE_RHEL6_NET_DEVICE_OPS_EXT */
+
+#ifdef HAVE_NDO_SET_FEATURES
+ .ndo_fix_features = hinic3_fix_features,
+ .ndo_set_features = hinic3_set_features,
+#endif /* HAVE_NDO_SET_FEATURES */
+};
+
+void hinic3_set_netdev_ops(struct hinic3_nic_dev *nic_dev)
+{
+ if (!HINIC3_FUNC_IS_VF(nic_dev->hwdev)) {
+ nic_dev->netdev->netdev_ops = &hinic3_netdev_ops;
+#ifdef HAVE_RHEL6_NET_DEVICE_OPS_EXT
+ set_netdev_ops_ext(nic_dev->netdev, &hinic3_netdev_ops_ext);
+#endif /* HAVE_RHEL6_NET_DEVICE_OPS_EXT */
+ } else {
+ nic_dev->netdev->netdev_ops = &hinic3vf_netdev_ops;
+#ifdef HAVE_RHEL6_NET_DEVICE_OPS_EXT
+ set_netdev_ops_ext(nic_dev->netdev, &hinic3vf_netdev_ops_ext);
+#endif /* HAVE_RHEL6_NET_DEVICE_OPS_EXT */
+ }
+}
+
+bool hinic3_is_netdev_ops_match(const struct net_device *netdev)
+{
+ return netdev->netdev_ops == &hinic3_netdev_ops ||
+ netdev->netdev_ops == &hinic3vf_netdev_ops;
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_nic.h b/drivers/net/ethernet/huawei/hinic3/hinic3_nic.h
new file mode 100644
index 000000000000..69cacbae3b57
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_nic.h
@@ -0,0 +1,183 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_NIC_H
+#define HINIC3_NIC_H
+
+#include <linux/types.h>
+#include <linux/semaphore.h>
+
+#include "hinic3_common.h"
+#include "hinic3_nic_io.h"
+#include "hinic3_nic_cfg.h"
+#include "mag_cmd.h"
+
+/* ************************ array index define ********************* */
+#define ARRAY_INDEX_0 0
+#define ARRAY_INDEX_1 1
+#define ARRAY_INDEX_2 2
+#define ARRAY_INDEX_3 3
+#define ARRAY_INDEX_4 4
+#define ARRAY_INDEX_5 5
+#define ARRAY_INDEX_6 6
+#define ARRAY_INDEX_7 7
+
+struct hinic3_sq_attr {
+ u8 dma_attr_off;
+ u8 pending_limit;
+ u8 coalescing_time;
+ u8 intr_en;
+ u16 intr_idx;
+ u32 l2nic_sqn;
+ u64 ci_dma_base;
+};
+
+struct vf_data_storage {
+ u8 drv_mac_addr[ETH_ALEN];
+ u8 user_mac_addr[ETH_ALEN];
+ bool registered;
+ bool use_specified_mac;
+ u16 pf_vlan;
+ u8 pf_qos;
+ u8 rsvd2;
+ u32 max_rate;
+ u32 min_rate;
+
+ bool link_forced;
+ bool link_up; /* only valid if VF link is forced */
+ bool spoofchk;
+ bool trust;
+ u16 num_qps;
+ u32 support_extra_feature;
+};
+
+struct hinic3_port_routine_cmd {
+ bool mpu_send_sfp_info;
+ bool mpu_send_sfp_abs;
+
+ struct mag_cmd_get_xsfp_info std_sfp_info;
+ struct mag_cmd_get_xsfp_present abs;
+};
+
+struct hinic3_nic_cfg {
+ struct semaphore cfg_lock;
+
+	/* Valid when pfc is disabled */
+ bool pause_set;
+ struct nic_pause_config nic_pause;
+
+ u8 pfc_en;
+ u8 pfc_bitmap;
+
+ struct nic_port_info port_info;
+
+ /* percentage of pf link bandwidth */
+ u32 pf_bw_limit;
+ u32 rsvd2;
+
+ struct hinic3_port_routine_cmd rt_cmd;
+ struct mutex sfp_mutex; /* mutex used for copy sfp info */
+};
+
+struct hinic3_nic_io {
+ void *hwdev;
+ void *pcidev_hdl;
+ void *dev_hdl;
+
+ u8 link_status;
+ u8 rsvd1;
+ u32 rsvd2;
+
+ struct hinic3_io_queue *sq;
+ struct hinic3_io_queue *rq;
+
+ u16 num_qps;
+ u16 max_qps;
+
+ void *ci_vaddr_base;
+ dma_addr_t ci_dma_base;
+
+ u8 __iomem *sqs_db_addr;
+ u8 __iomem *rqs_db_addr;
+
+ u16 max_vfs;
+ u16 rsvd3;
+ u32 rsvd4;
+
+ struct vf_data_storage *vf_infos;
+ struct hinic3_dcb_state dcb_state;
+ struct hinic3_nic_cfg nic_cfg;
+
+ u16 rx_buff_len;
+ u16 rsvd5;
+ u32 rsvd6;
+ u64 feature_cap;
+ u64 rsvd7;
+};
+
+struct vf_msg_handler {
+ u16 cmd;
+ int (*handler)(struct hinic3_nic_io *nic_io, u16 vf,
+ void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size);
+};
+
+struct nic_event_handler {
+ u16 cmd;
+ void (*handler)(void *hwdev, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size);
+};
+
+int hinic3_set_ci_table(void *hwdev, struct hinic3_sq_attr *attr);
+
+int l2nic_msg_to_mgmt_sync(void *hwdev, u16 cmd, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size);
+
+int l2nic_msg_to_mgmt_sync_ch(void *hwdev, u16 cmd, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size, u16 channel);
+
+int hinic3_cfg_vf_vlan(struct hinic3_nic_io *nic_io, u8 opcode, u16 vid,
+ u8 qos, int vf_id);
+
+int hinic3_vf_event_handler(void *hwdev,
+ u16 cmd, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size);
+
+void hinic3_pf_event_handler(void *hwdev, u16 cmd,
+ void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size);
+
+int hinic3_pf_mbox_handler(void *hwdev,
+ u16 vf_id, u16 cmd, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size);
+
+u8 hinic3_nic_sw_aeqe_handler(void *hwdev, u8 event, u8 *data);
+
+int hinic3_vf_func_init(struct hinic3_nic_io *nic_io);
+
+void hinic3_vf_func_free(struct hinic3_nic_io *nic_io);
+
+void hinic3_notify_dcb_state_event(struct hinic3_nic_io *nic_io,
+ struct hinic3_dcb_state *dcb_state);
+
+int hinic3_save_dcb_state(struct hinic3_nic_io *nic_io,
+ struct hinic3_dcb_state *dcb_state);
+
+void hinic3_notify_vf_link_status(struct hinic3_nic_io *nic_io,
+ u16 vf_id, u8 link_status);
+
+int hinic3_vf_mag_event_handler(void *hwdev, u16 cmd,
+ void *buf_in, u16 in_size, void *buf_out,
+ u16 *out_size);
+
+void hinic3_pf_mag_event_handler(void *pri_handle, u16 cmd,
+ void *buf_in, u16 in_size, void *buf_out,
+ u16 *out_size);
+
+int hinic3_pf_mag_mbox_handler(void *hwdev, u16 vf_id,
+ u16 cmd, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size);
+
+void hinic3_unregister_vf(struct hinic3_nic_io *nic_io, u16 vf_id);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_nic_cfg.c b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_cfg.c
new file mode 100644
index 000000000000..2c1b5658b458
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_cfg.c
@@ -0,0 +1,1608 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC]" fmt
+
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/etherdevice.h>
+#include <linux/if_vlan.h>
+#include <linux/ethtool.h>
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/netdevice.h>
+#include <linux/module.h>
+
+#include "ossl_knl.h"
+#include "hinic3_crm.h"
+#include "hinic3_hw.h"
+#include "hinic3_nic_io.h"
+#include "hinic3_srv_nic.h"
+#include "hinic3_nic.h"
+#include "hinic3_nic_cmd.h"
+#include "hinic3_common.h"
+#include "hinic3_nic_cfg.h"
+
+int hinic3_set_ci_table(void *hwdev, struct hinic3_sq_attr *attr)
+{
+ struct hinic3_cmd_cons_idx_attr cons_idx_attr;
+ u16 out_size = sizeof(cons_idx_attr);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev || !attr)
+ return -EINVAL;
+
+ memset(&cons_idx_attr, 0, sizeof(cons_idx_attr));
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ cons_idx_attr.func_idx = hinic3_global_func_id(hwdev);
+
+ cons_idx_attr.dma_attr_off = attr->dma_attr_off;
+ cons_idx_attr.pending_limit = attr->pending_limit;
+ cons_idx_attr.coalescing_time = attr->coalescing_time;
+
+ if (attr->intr_en) {
+ cons_idx_attr.intr_en = attr->intr_en;
+ cons_idx_attr.intr_idx = attr->intr_idx;
+ }
+
+ cons_idx_attr.l2nic_sqn = attr->l2nic_sqn;
+ cons_idx_attr.ci_addr = attr->ci_dma_base;
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_SQ_CI_ATTR_SET,
+ &cons_idx_attr, sizeof(cons_idx_attr),
+ &cons_idx_attr, &out_size);
+ if (err || !out_size || cons_idx_attr.msg_head.status) {
+ sdk_err(nic_io->dev_hdl,
+ "Failed to set ci attribute table, err: %d, status: 0x%x, out_size: 0x%x\n",
+ err, cons_idx_attr.msg_head.status, out_size);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+#define PF_SET_VF_MAC(hwdev, status) \
+ (hinic3_func_type(hwdev) == TYPE_VF && \
+ (status) == HINIC3_PF_SET_VF_ALREADY)
+
+static int hinic3_check_mac_info(void *hwdev, u8 status, u16 vlan_id)
+{
+ if ((status && status != HINIC3_MGMT_STATUS_EXIST) ||
+ ((vlan_id & CHECK_IPSU_15BIT) &&
+ status == HINIC3_MGMT_STATUS_EXIST)) {
+ if (PF_SET_VF_MAC(hwdev, status))
+ return 0;
+
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+#define HINIC_VLAN_ID_MASK 0x7FFF
+
+int hinic3_set_mac(void *hwdev, const u8 *mac_addr, u16 vlan_id, u16 func_id,
+ u16 channel)
+{
+ struct hinic3_port_mac_set mac_info;
+ u16 out_size = sizeof(mac_info);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev || !mac_addr)
+ return -EINVAL;
+
+ memset(&mac_info, 0, sizeof(mac_info));
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io)
+ return -EINVAL;
+
+ if ((vlan_id & HINIC_VLAN_ID_MASK) >= VLAN_N_VID) {
+ nic_err(nic_io->dev_hdl, "Invalid VLAN number: %d\n",
+ (vlan_id & HINIC_VLAN_ID_MASK));
+ return -EINVAL;
+ }
+
+ mac_info.func_id = func_id;
+ mac_info.vlan_id = vlan_id;
+ ether_addr_copy(mac_info.mac, mac_addr);
+
+ err = l2nic_msg_to_mgmt_sync_ch(hwdev, HINIC3_NIC_CMD_SET_MAC,
+ &mac_info, sizeof(mac_info),
+ &mac_info, &out_size, channel);
+ if (err || !out_size ||
+ hinic3_check_mac_info(hwdev, mac_info.msg_head.status,
+ mac_info.vlan_id)) {
+ nic_err(nic_io->dev_hdl,
+ "Failed to update MAC, err: %d, status: 0x%x, out size: 0x%x, channel: 0x%x\n",
+ err, mac_info.msg_head.status, out_size, channel);
+ return -EIO;
+ }
+
+ if (PF_SET_VF_MAC(hwdev, mac_info.msg_head.status)) {
+		nic_warn(nic_io->dev_hdl, "PF has already set VF MAC. Ignore set operation\n");
+ return HINIC3_PF_SET_VF_ALREADY;
+ }
+
+ if (mac_info.msg_head.status == HINIC3_MGMT_STATUS_EXIST) {
+ nic_warn(nic_io->dev_hdl, "MAC is repeated. Ignore update operation\n");
+ return 0;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_set_mac);
+
+int hinic3_del_mac(void *hwdev, const u8 *mac_addr, u16 vlan_id, u16 func_id,
+ u16 channel)
+{
+ struct hinic3_port_mac_set mac_info;
+ u16 out_size = sizeof(mac_info);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev || !mac_addr)
+ return -EINVAL;
+
+ memset(&mac_info, 0, sizeof(mac_info));
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ if ((vlan_id & HINIC_VLAN_ID_MASK) >= VLAN_N_VID) {
+ nic_err(nic_io->dev_hdl, "Invalid VLAN number: %d\n",
+ (vlan_id & HINIC_VLAN_ID_MASK));
+ return -EINVAL;
+ }
+
+ mac_info.func_id = func_id;
+ mac_info.vlan_id = vlan_id;
+ ether_addr_copy(mac_info.mac, mac_addr);
+
+ err = l2nic_msg_to_mgmt_sync_ch(hwdev, HINIC3_NIC_CMD_DEL_MAC,
+ &mac_info, sizeof(mac_info), &mac_info,
+ &out_size, channel);
+ if (err || !out_size ||
+ (mac_info.msg_head.status && !PF_SET_VF_MAC(hwdev, mac_info.msg_head.status))) {
+ nic_err(nic_io->dev_hdl,
+ "Failed to delete MAC, err: %d, status: 0x%x, out size: 0x%x, channel: 0x%x\n",
+ err, mac_info.msg_head.status, out_size, channel);
+ return -EIO;
+ }
+
+ if (PF_SET_VF_MAC(hwdev, mac_info.msg_head.status)) {
+		nic_warn(nic_io->dev_hdl, "PF has already set VF MAC. Ignore delete operation\n");
+ return HINIC3_PF_SET_VF_ALREADY;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_del_mac);
+
+int hinic3_update_mac(void *hwdev, u8 *old_mac, u8 *new_mac, u16 vlan_id,
+ u16 func_id)
+{
+ struct hinic3_port_mac_update mac_info;
+ u16 out_size = sizeof(mac_info);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev || !old_mac || !new_mac)
+ return -EINVAL;
+
+ memset(&mac_info, 0, sizeof(mac_info));
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ if ((vlan_id & HINIC_VLAN_ID_MASK) >= VLAN_N_VID) {
+ nic_err(nic_io->dev_hdl, "Invalid VLAN number: %d\n",
+ (vlan_id & HINIC_VLAN_ID_MASK));
+ return -EINVAL;
+ }
+
+ mac_info.func_id = func_id;
+ mac_info.vlan_id = vlan_id;
+ ether_addr_copy(mac_info.old_mac, old_mac);
+ ether_addr_copy(mac_info.new_mac, new_mac);
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_UPDATE_MAC,
+ &mac_info, sizeof(mac_info),
+ &mac_info, &out_size);
+ if (err || !out_size ||
+ hinic3_check_mac_info(hwdev, mac_info.msg_head.status,
+ mac_info.vlan_id)) {
+ nic_err(nic_io->dev_hdl,
+ "Failed to update MAC, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, mac_info.msg_head.status, out_size);
+ return -EIO;
+ }
+
+ if (PF_SET_VF_MAC(hwdev, mac_info.msg_head.status)) {
+ nic_warn(nic_io->dev_hdl, "PF has already set VF MAC. Ignore update operation\n");
+ return HINIC3_PF_SET_VF_ALREADY;
+ }
+
+ if (mac_info.msg_head.status == HINIC3_MGMT_STATUS_EXIST) {
+ nic_warn(nic_io->dev_hdl, "MAC is repeated. Ignore update operation\n");
+ return 0;
+ }
+
+ return 0;
+}
+
+int hinic3_get_default_mac(void *hwdev, u8 *mac_addr)
+{
+ struct hinic3_port_mac_set mac_info;
+ u16 out_size = sizeof(mac_info);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev || !mac_addr)
+ return -EINVAL;
+
+ memset(&mac_info, 0, sizeof(mac_info));
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ mac_info.func_id = hinic3_global_func_id(hwdev);
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_GET_MAC,
+ &mac_info, sizeof(mac_info),
+ &mac_info, &out_size);
+ if (err || !out_size || mac_info.msg_head.status) {
+ nic_err(nic_io->dev_hdl,
+ "Failed to get mac, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, mac_info.msg_head.status, out_size);
+ return -EINVAL;
+ }
+
+ ether_addr_copy(mac_addr, mac_info.mac);
+
+ return 0;
+}
+
+static int hinic3_config_vlan(struct hinic3_nic_io *nic_io, u8 opcode,
+ u16 vlan_id, u16 func_id)
+{
+ struct hinic3_cmd_vlan_config vlan_info;
+ u16 out_size = sizeof(vlan_info);
+ int err;
+
+ memset(&vlan_info, 0, sizeof(vlan_info));
+ vlan_info.opcode = opcode;
+ vlan_info.func_id = func_id;
+ vlan_info.vlan_id = vlan_id;
+
+ err = l2nic_msg_to_mgmt_sync(nic_io->hwdev,
+ HINIC3_NIC_CMD_CFG_FUNC_VLAN,
+ &vlan_info, sizeof(vlan_info),
+ &vlan_info, &out_size);
+ if (err || !out_size || vlan_info.msg_head.status) {
+ nic_err(nic_io->dev_hdl,
+ "Failed to %s vlan, err: %d, status: 0x%x, out size: 0x%x\n",
+ opcode == HINIC3_CMD_OP_ADD ? "add" : "delete",
+ err, vlan_info.msg_head.status, out_size);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+int hinic3_add_vlan(void *hwdev, u16 vlan_id, u16 func_id)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ return hinic3_config_vlan(nic_io, HINIC3_CMD_OP_ADD, vlan_id, func_id);
+}
+
+int hinic3_del_vlan(void *hwdev, u16 vlan_id, u16 func_id)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ return hinic3_config_vlan(nic_io, HINIC3_CMD_OP_DEL, vlan_id, func_id);
+}
+
+int hinic3_set_vport_enable(void *hwdev, u16 func_id, bool enable, u16 channel)
+{
+ struct hinic3_vport_state en_state;
+ u16 out_size = sizeof(en_state);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ memset(&en_state, 0, sizeof(en_state));
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io)
+ return -EINVAL;
+
+ en_state.func_id = func_id;
+ en_state.state = enable ? 1 : 0;
+
+ err = l2nic_msg_to_mgmt_sync_ch(hwdev, HINIC3_NIC_CMD_SET_VPORT_ENABLE,
+ &en_state, sizeof(en_state),
+ &en_state, &out_size, channel);
+ if (err || !out_size || en_state.msg_head.status) {
+ nic_err(nic_io->dev_hdl, "Failed to set vport state, err: %d, status: 0x%x, out size: 0x%x, channel: 0x%x\n",
+ err, en_state.msg_head.status, out_size, channel);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(hinic3_set_vport_enable);
+
+int hinic3_set_dcb_state(void *hwdev, struct hinic3_dcb_state *dcb_state)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+ if (!hwdev || !dcb_state)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!memcmp(&nic_io->dcb_state, dcb_state, sizeof(nic_io->dcb_state)))
+ return 0;
+
+ /* save in sdk, vf will get dcb state when probing */
+ hinic3_save_dcb_state(nic_io, dcb_state);
+
+	/* notify the stateful module in pf, then notify all vfs */
+ hinic3_notify_dcb_state_event(nic_io, dcb_state);
+
+ return 0;
+}
+
+int hinic3_get_dcb_state(void *hwdev, struct hinic3_dcb_state *dcb_state)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+ if (!hwdev || !dcb_state)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io)
+ return -EINVAL;
+
+ memcpy(dcb_state, &nic_io->dcb_state, sizeof(*dcb_state));
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_get_dcb_state);
+
+int hinic3_get_cos_by_pri(void *hwdev, u8 pri, u8 *cos)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+ if (!hwdev || !cos)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io)
+ return -EINVAL;
+
+ if (pri >= NIC_DCB_UP_MAX && nic_io->dcb_state.trust == HINIC3_DCB_PCP)
+ return -EINVAL;
+
+ if (pri >= NIC_DCB_IP_PRI_MAX && nic_io->dcb_state.trust == HINIC3_DCB_DSCP)
+ return -EINVAL;
+
+/*lint -e662*/
+/*lint -e661*/
+ if (nic_io->dcb_state.dcb_on) {
+ if (nic_io->dcb_state.trust == HINIC3_DCB_PCP)
+ *cos = nic_io->dcb_state.pcp2cos[pri];
+ else
+ *cos = nic_io->dcb_state.dscp2cos[pri];
+ } else {
+ *cos = nic_io->dcb_state.default_cos;
+ }
+/*lint +e662*/
+/*lint +e661*/
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_get_cos_by_pri);
+
+int hinic3_save_dcb_state(struct hinic3_nic_io *nic_io,
+ struct hinic3_dcb_state *dcb_state)
+{
+ memcpy(&nic_io->dcb_state, dcb_state, sizeof(*dcb_state));
+
+ return 0;
+}
+
+int hinic3_get_pf_dcb_state(void *hwdev, struct hinic3_dcb_state *dcb_state)
+{
+ struct hinic3_cmd_vf_dcb_state vf_dcb;
+ struct hinic3_nic_io *nic_io = NULL;
+ u16 out_size = sizeof(vf_dcb);
+ int err;
+
+ if (!hwdev || !dcb_state)
+ return -EINVAL;
+
+ memset(&vf_dcb, 0, sizeof(vf_dcb));
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io)
+ return -EINVAL;
+
+ if (hinic3_func_type(hwdev) != TYPE_VF) {
+		nic_err(nic_io->dev_hdl, "Only a VF needs to get the PF dcb state\n");
+ return -EINVAL;
+ }
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_VF_COS, &vf_dcb,
+ sizeof(vf_dcb), &vf_dcb, &out_size);
+ if (err || !out_size || vf_dcb.msg_head.status) {
+ nic_err(nic_io->dev_hdl, "Failed to get vf default cos, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, vf_dcb.msg_head.status, out_size);
+ return -EFAULT;
+ }
+
+ memcpy(dcb_state, &vf_dcb.state, sizeof(*dcb_state));
+ /* Save dcb_state in hw for stateful module */
+ hinic3_save_dcb_state(nic_io, dcb_state);
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_get_pf_dcb_state);
+
+#define UNSUPPORT_SET_PAUSE 0x10
+static int hinic3_cfg_hw_pause(struct hinic3_nic_io *nic_io, u8 opcode,
+ struct nic_pause_config *nic_pause)
+{
+ struct hinic3_cmd_pause_config pause_info;
+ u16 out_size = sizeof(pause_info);
+ int err;
+
+ memset(&pause_info, 0, sizeof(pause_info));
+
+ pause_info.port_id = hinic3_physical_port_id(nic_io->hwdev);
+ pause_info.opcode = opcode;
+ if (opcode == HINIC3_CMD_OP_SET) {
+ pause_info.auto_neg = nic_pause->auto_neg;
+ pause_info.rx_pause = nic_pause->rx_pause;
+ pause_info.tx_pause = nic_pause->tx_pause;
+ }
+
+ err = l2nic_msg_to_mgmt_sync(nic_io->hwdev,
+ HINIC3_NIC_CMD_CFG_PAUSE_INFO,
+ &pause_info, sizeof(pause_info),
+ &pause_info, &out_size);
+ if (err || !out_size || pause_info.msg_head.status) {
+ if (pause_info.msg_head.status == UNSUPPORT_SET_PAUSE) {
+ err = -EOPNOTSUPP;
+			nic_err(nic_io->dev_hdl, "Cannot set pause when pfc is enabled\n");
+ } else {
+ err = -EFAULT;
+ nic_err(nic_io->dev_hdl, "Failed to %s pause info, err: %d, status: 0x%x, out size: 0x%x\n",
+ opcode == HINIC3_CMD_OP_SET ? "set" : "get",
+ err, pause_info.msg_head.status, out_size);
+ }
+ return err;
+ }
+
+ if (opcode == HINIC3_CMD_OP_GET) {
+ nic_pause->auto_neg = pause_info.auto_neg;
+ nic_pause->rx_pause = pause_info.rx_pause;
+ nic_pause->tx_pause = pause_info.tx_pause;
+ }
+
+ return 0;
+}
+
+int hinic3_set_pause_info(void *hwdev, struct nic_pause_config nic_pause)
+{
+ struct hinic3_nic_cfg *nic_cfg = NULL;
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ nic_cfg = &nic_io->nic_cfg;
+
+ down(&nic_cfg->cfg_lock);
+
+ err = hinic3_cfg_hw_pause(nic_io, HINIC3_CMD_OP_SET, &nic_pause);
+ if (err) {
+ up(&nic_cfg->cfg_lock);
+ return err;
+ }
+
+ nic_cfg->pfc_en = 0;
+ nic_cfg->pfc_bitmap = 0;
+ nic_cfg->pause_set = true;
+ nic_cfg->nic_pause.auto_neg = nic_pause.auto_neg;
+ nic_cfg->nic_pause.rx_pause = nic_pause.rx_pause;
+ nic_cfg->nic_pause.tx_pause = nic_pause.tx_pause;
+
+ up(&nic_cfg->cfg_lock);
+
+ return 0;
+}
+
+int hinic3_get_pause_info(void *hwdev, struct nic_pause_config *nic_pause)
+{
+ struct hinic3_nic_cfg *nic_cfg = NULL;
+ struct hinic3_nic_io *nic_io = NULL;
+ int err = 0;
+
+ if (!hwdev || !nic_pause)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ nic_cfg = &nic_io->nic_cfg;
+
+ err = hinic3_cfg_hw_pause(nic_io, HINIC3_CMD_OP_GET, nic_pause);
+ if (err)
+ return err;
+
+ if (nic_cfg->pause_set || !nic_pause->auto_neg) {
+ nic_pause->rx_pause = nic_cfg->nic_pause.rx_pause;
+ nic_pause->tx_pause = nic_cfg->nic_pause.tx_pause;
+ }
+
+ return 0;
+}
+
+int hinic3_sync_dcb_state(void *hwdev, u8 op_code, u8 state)
+{
+ struct hinic3_cmd_set_dcb_state dcb_state;
+ struct hinic3_nic_io *nic_io = NULL;
+ u16 out_size = sizeof(dcb_state);
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ memset(&dcb_state, 0, sizeof(dcb_state));
+
+ dcb_state.op_code = op_code;
+ dcb_state.state = state;
+ dcb_state.func_id = hinic3_global_func_id(hwdev);
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_QOS_DCB_STATE,
+ &dcb_state, sizeof(dcb_state), &dcb_state, &out_size);
+ if (err || dcb_state.head.status || !out_size) {
+ nic_err(nic_io->dev_hdl,
+ "Failed to set dcb state, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, dcb_state.head.status, out_size);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+int hinic3_dcb_set_rq_iq_mapping(void *hwdev, u32 num_rqs, u8 *map,
+ u32 max_map_num)
+{
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_dcb_set_rq_iq_mapping);
+
+int hinic3_flush_qps_res(void *hwdev)
+{
+ struct hinic3_cmd_clear_qp_resource sq_res;
+ u16 out_size = sizeof(sq_res);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io)
+ return -EINVAL;
+
+ memset(&sq_res, 0, sizeof(sq_res));
+
+ sq_res.func_id = hinic3_global_func_id(hwdev);
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_CLEAR_QP_RESOURCE,
+ &sq_res, sizeof(sq_res), &sq_res,
+ &out_size);
+ if (err || !out_size || sq_res.msg_head.status) {
+ nic_err(nic_io->dev_hdl, "Failed to clear sq resources, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, sq_res.msg_head.status, out_size);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_flush_qps_res);
+
+int hinic3_cache_out_qps_res(void *hwdev)
+{
+ struct hinic3_cmd_cache_out_qp_resource qp_res;
+ u16 out_size = sizeof(qp_res);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io)
+ return -EINVAL;
+
+ memset(&qp_res, 0, sizeof(qp_res));
+
+ qp_res.func_id = hinic3_global_func_id(hwdev);
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_CACHE_OUT_QP_RES,
+ &qp_res, sizeof(qp_res), &qp_res, &out_size);
+ if (err || !out_size || qp_res.msg_head.status) {
+ nic_err(nic_io->dev_hdl, "Failed to cache out qp resources, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, qp_res.msg_head.status, out_size);
+ return -EIO;
+ }
+
+ return 0;
+}
+
+int hinic3_get_fpga_phy_port_stats(void *hwdev, struct hinic3_phy_fpga_port_stats *stats)
+{
+ struct hinic3_port_stats *port_stats = NULL;
+ struct hinic3_port_stats_info stats_info;
+ u16 out_size = sizeof(*port_stats);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+	if (!hwdev || !stats)
+		return -EINVAL;
+
+	nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+	if (!nic_io)
+		return -EINVAL;
+
+	port_stats = kzalloc(sizeof(*port_stats), GFP_KERNEL);
+	if (!port_stats)
+		return -ENOMEM;
+
+ memset(&stats_info, 0, sizeof(stats_info));
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_GET_PORT_STAT,
+ &stats_info, sizeof(stats_info),
+ port_stats, &out_size);
+ if (err || !out_size || port_stats->msg_head.status) {
+ nic_err(nic_io->dev_hdl,
+ "Failed to get port statistics, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, port_stats->msg_head.status, out_size);
+ err = -EIO;
+ goto out;
+ }
+
+ memcpy(stats, &port_stats->stats, sizeof(*stats));
+
+out:
+ kfree(port_stats);
+
+ return err;
+}
+EXPORT_SYMBOL(hinic3_get_fpga_phy_port_stats);
+
+int hinic3_get_vport_stats(void *hwdev, u16 func_id, struct hinic3_vport_stats *stats)
+{
+ struct hinic3_port_stats_info stats_info;
+ struct hinic3_cmd_vport_stats vport_stats;
+ u16 out_size = sizeof(vport_stats);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev || !stats)
+ return -EINVAL;
+
+ memset(&stats_info, 0, sizeof(stats_info));
+ memset(&vport_stats, 0, sizeof(vport_stats));
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ stats_info.func_id = func_id;
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_GET_VPORT_STAT,
+ &stats_info, sizeof(stats_info),
+ &vport_stats, &out_size);
+ if (err || !out_size || vport_stats.msg_head.status) {
+ nic_err(nic_io->dev_hdl,
+ "Failed to get function statistics, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, vport_stats.msg_head.status, out_size);
+ return -EFAULT;
+ }
+
+ memcpy(stats, &vport_stats.stats, sizeof(*stats));
+
+ return 0;
+}
+
+static int hinic3_set_function_table(struct hinic3_nic_io *nic_io, u32 cfg_bitmap,
+ const struct hinic3_func_tbl_cfg *cfg)
+{
+ struct hinic3_cmd_set_func_tbl cmd_func_tbl;
+ u16 out_size = sizeof(cmd_func_tbl);
+ int err;
+
+ memset(&cmd_func_tbl, 0, sizeof(cmd_func_tbl));
+ cmd_func_tbl.func_id = hinic3_global_func_id(nic_io->hwdev);
+ cmd_func_tbl.cfg_bitmap = cfg_bitmap;
+ cmd_func_tbl.tbl_cfg = *cfg;
+
+ err = l2nic_msg_to_mgmt_sync(nic_io->hwdev,
+ HINIC3_NIC_CMD_SET_FUNC_TBL,
+ &cmd_func_tbl, sizeof(cmd_func_tbl),
+ &cmd_func_tbl, &out_size);
+ if (err || cmd_func_tbl.msg_head.status || !out_size) {
+ nic_err(nic_io->dev_hdl,
+ "Failed to set func table, bitmap: 0x%x, err: %d, status: 0x%x, out size: 0x%x\n",
+ cfg_bitmap, err, cmd_func_tbl.msg_head.status,
+ out_size);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+static int hinic3_init_function_table(struct hinic3_nic_io *nic_io)
+{
+ struct hinic3_func_tbl_cfg func_tbl_cfg = {0};
+ u32 cfg_bitmap = BIT(FUNC_CFG_INIT) | BIT(FUNC_CFG_MTU) |
+ BIT(FUNC_CFG_RX_BUF_SIZE);
+
+ func_tbl_cfg.mtu = 0x3FFF; /* default, max mtu */
+ func_tbl_cfg.rx_wqe_buf_size = nic_io->rx_buff_len;
+
+ return hinic3_set_function_table(nic_io, cfg_bitmap, &func_tbl_cfg);
+}
+
+int hinic3_set_port_mtu(void *hwdev, u16 new_mtu)
+{
+ struct hinic3_func_tbl_cfg func_tbl_cfg = {0};
+ struct hinic3_nic_io *nic_io = NULL;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ if (new_mtu < HINIC3_MIN_MTU_SIZE) {
+ nic_err(nic_io->dev_hdl,
+			"Invalid mtu size: %u bytes, mtu size < %u bytes",
+ new_mtu, HINIC3_MIN_MTU_SIZE);
+ return -EINVAL;
+ }
+
+ if (new_mtu > HINIC3_MAX_JUMBO_FRAME_SIZE) {
+		nic_err(nic_io->dev_hdl, "Invalid mtu size: %u bytes, mtu size > %u bytes",
+ new_mtu, HINIC3_MAX_JUMBO_FRAME_SIZE);
+ return -EINVAL;
+ }
+
+ func_tbl_cfg.mtu = new_mtu;
+ return hinic3_set_function_table(nic_io, BIT(FUNC_CFG_MTU),
+ &func_tbl_cfg);
+}
+
+static int nic_feature_nego(void *hwdev, u8 opcode, u64 *s_feature, u16 size)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ struct hinic3_cmd_feature_nego feature_nego;
+ u16 out_size = sizeof(feature_nego);
+ int err;
+
+ if (!hwdev || !s_feature || size > NIC_MAX_FEATURE_QWORD)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ memset(&feature_nego, 0, sizeof(feature_nego));
+ feature_nego.func_id = hinic3_global_func_id(hwdev);
+ feature_nego.opcode = opcode;
+ if (opcode == HINIC3_CMD_OP_SET)
+ memcpy(feature_nego.s_feature, s_feature, size * sizeof(u64));
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_FEATURE_NEGO,
+ &feature_nego, sizeof(feature_nego),
+ &feature_nego, &out_size);
+ if (err || !out_size || feature_nego.msg_head.status) {
+		nic_err(nic_io->dev_hdl, "Failed to negotiate nic feature, err: %d, status: 0x%x, out_size: 0x%x\n",
+ err, feature_nego.msg_head.status, out_size);
+ return -EIO;
+ }
+
+ if (opcode == HINIC3_CMD_OP_GET)
+ memcpy(s_feature, feature_nego.s_feature, size * sizeof(u64));
+
+ return 0;
+}
+
+static int hinic3_get_bios_pf_bw_limit(void *hwdev, u32 *pf_bw_limit)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ struct nic_cmd_bios_cfg cfg = {{0}};
+ u16 out_size = sizeof(cfg);
+ int err;
+
+ if (!hwdev || !pf_bw_limit)
+ return -EINVAL;
+
+ if (hinic3_func_type(hwdev) == TYPE_VF || !HINIC3_SUPPORT_RATE_LIMIT(hwdev))
+ return 0;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ cfg.bios_cfg.func_id = (u8)hinic3_global_func_id(hwdev);
+ cfg.bios_cfg.func_valid = 1;
+	cfg.op_code = NIC_NVM_DATA_PF_SPEED_LIMIT;
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_BIOS_CFG, &cfg, sizeof(cfg),
+ &cfg, &out_size);
+ if (err || !out_size || cfg.head.status) {
+ nic_err(nic_io->dev_hdl,
+ "Failed to get bios pf bandwidth limit, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, cfg.head.status, out_size);
+ return -EIO;
+ }
+
+	/* check whether the data is valid */
+ if (cfg.bios_cfg.signature != BIOS_CFG_SIGNATURE)
+ nic_warn(nic_io->dev_hdl, "Invalid bios configuration data, signature: 0x%x\n",
+ cfg.bios_cfg.signature);
+
+ if (cfg.bios_cfg.pf_bw > MAX_LIMIT_BW) {
+ nic_err(nic_io->dev_hdl, "Invalid bios cfg pf bandwidth limit: %u\n",
+ cfg.bios_cfg.pf_bw);
+ return -EINVAL;
+ }
+
+ *pf_bw_limit = cfg.bios_cfg.pf_bw;
+
+ return 0;
+}
+
+int hinic3_set_pf_rate(void *hwdev, u8 speed_level)
+{
+ struct hinic3_cmd_tx_rate_cfg rate_cfg = {{0}};
+ struct hinic3_nic_io *nic_io = NULL;
+ u16 out_size = sizeof(rate_cfg);
+ u32 pf_rate;
+ int err;
+ u32 speed_convert[PORT_SPEED_UNKNOWN] = {
+ 0, 10, 100, 1000, 10000, 25000, 40000, 50000, 100000, 200000
+ };
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io)
+ return -EINVAL;
+
+ if (speed_level >= PORT_SPEED_UNKNOWN) {
+ nic_err(nic_io->dev_hdl, "Invalid speed level: %hhu\n", speed_level);
+ return -EINVAL;
+ }
+
+ if (nic_io->nic_cfg.pf_bw_limit == MAX_LIMIT_BW) {
+ pf_rate = 0;
+ } else {
+		/* pf_bw_limit is a percentage, so divide the speed by 100 */
+ pf_rate = (speed_convert[speed_level] / 100) * nic_io->nic_cfg.pf_bw_limit;
+		/* bandwidth limit is very small but not unlimited in this case */
+ if (pf_rate == 0 && speed_level != PORT_SPEED_NOT_SET)
+ pf_rate = 1;
+ }
+
+ rate_cfg.func_id = hinic3_global_func_id(hwdev);
+ rate_cfg.min_rate = 0;
+ rate_cfg.max_rate = pf_rate;
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_SET_MAX_MIN_RATE, &rate_cfg,
+ sizeof(rate_cfg), &rate_cfg, &out_size);
+ if (err || !out_size || rate_cfg.msg_head.status) {
+ nic_err(nic_io->dev_hdl, "Failed to set rate(%u), err: %d, status: 0x%x, out size: 0x%x\n",
+ pf_rate, err, rate_cfg.msg_head.status, out_size);
+ return rate_cfg.msg_head.status ? rate_cfg.msg_head.status : -EIO;
+ }
+
+ return 0;
+}
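+
+/*
+ * pf_bw_limit is stored as a percentage, so the absolute rate handed to
+ * firmware is (link speed in Mbps / 100) * percent. A worked example,
+ * assuming a 25 Gbps port limited to 40%:
+ *
+ *	u32 speed_mbps = 25000;		// the 25G entry of speed_convert[]
+ *	u32 pf_bw_limit = 40;		// percent, from BIOS cfg
+ *	u32 pf_rate = (speed_mbps / 100) * pf_bw_limit;	// 10000 Mbps
+ */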
+
+static int hinic3_get_nic_feature_from_hw(void *hwdev, u64 *s_feature, u16 size)
+{
+ return nic_feature_nego(hwdev, HINIC3_CMD_OP_GET, s_feature, size);
+}
+
+int hinic3_set_nic_feature_to_hw(void *hwdev)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ return nic_feature_nego(hwdev, HINIC3_CMD_OP_SET, &nic_io->feature_cap, 1);
+}
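+
+/*
+ * nic_feature_nego() is a single wire command steered by opcode: GET copies
+ * the firmware's capability words out, SET pushes the driver's words in.
+ * The two call shapes, sketched:
+ *
+ *	u64 cap = 0;
+ *
+ *	err = hinic3_get_nic_feature_from_hw(hwdev, &cap, 1);	// read caps
+ *	// ... trim bits the driver does not want ...
+ *	err = hinic3_set_nic_feature_to_hw(hwdev);	// commit feature_cap
+ */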
+
+u64 hinic3_get_feature_cap(void *hwdev)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ return nic_io->feature_cap;
+}
+
+void hinic3_update_nic_feature(void *hwdev, u64 s_feature)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ nic_io->feature_cap = s_feature;
+
+ nic_info(nic_io->dev_hdl, "Update nic feature to 0x%llx\n", nic_io->feature_cap);
+}
+
+static inline int init_nic_hwdev_param_valid(const void *hwdev, const void *pcidev_hdl,
+ const void *dev_hdl)
+{
+ if (!hwdev || !pcidev_hdl || !dev_hdl)
+ return -EINVAL;
+
+ return 0;
+}
+
+static int hinic3_init_nic_io(void *hwdev, void *pcidev_hdl, void *dev_hdl,
+ struct hinic3_nic_io **nic_io)
+{
+ if (init_nic_hwdev_param_valid(hwdev, pcidev_hdl, dev_hdl))
+ return -EINVAL;
+
+ *nic_io = kzalloc(sizeof(**nic_io), GFP_KERNEL);
+ if (!(*nic_io))
+ return -ENOMEM;
+
+ (*nic_io)->dev_hdl = dev_hdl;
+ (*nic_io)->pcidev_hdl = pcidev_hdl;
+ (*nic_io)->hwdev = hwdev;
+
+ sema_init(&((*nic_io)->nic_cfg.cfg_lock), 1);
+ mutex_init(&((*nic_io)->nic_cfg.sfp_mutex));
+
+ (*nic_io)->nic_cfg.rt_cmd.mpu_send_sfp_abs = false;
+ (*nic_io)->nic_cfg.rt_cmd.mpu_send_sfp_info = false;
+
+ return 0;
+}
+
+/**
+ * hinic3_init_nic_hwdev - init nic hwdev
+ * @hwdev: pointer to hwdev
+ * @pcidev_hdl: pointer to pcidev or handler
+ * @dev_hdl: pointer to pcidev->dev or handler, for sdk_err() or dma_alloc()
+ * @rx_buff_len: receive buffer length
+ */
+int hinic3_init_nic_hwdev(void *hwdev, void *pcidev_hdl, void *dev_hdl,
+ u16 rx_buff_len)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ err = hinic3_init_nic_io(hwdev, pcidev_hdl, dev_hdl, &nic_io);
+ if (err)
+ return err;
+
+ err = hinic3_register_service_adapter(hwdev, nic_io, SERVICE_T_NIC);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to register service adapter\n");
+ goto register_sa_err;
+ }
+
+ err = hinic3_set_func_svc_used_state(hwdev, SVC_T_NIC, 1, HINIC3_CHANNEL_NIC);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to set function svc used state\n");
+ goto set_used_state_err;
+ }
+
+ err = hinic3_init_function_table(nic_io);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to init function table\n");
+ goto err_out;
+ }
+
+ err = hinic3_get_nic_feature_from_hw(hwdev, &nic_io->feature_cap, 1);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to get nic features\n");
+ goto err_out;
+ }
+
+ sdk_info(dev_hdl, "nic features: 0x%llx\n", nic_io->feature_cap);
+
+ err = hinic3_get_bios_pf_bw_limit(hwdev, &nic_io->nic_cfg.pf_bw_limit);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to get pf bandwidth limit\n");
+ goto err_out;
+ }
+
+ err = hinic3_vf_func_init(nic_io);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to init vf info\n");
+ goto err_out;
+ }
+
+ nic_io->rx_buff_len = rx_buff_len;
+
+ return 0;
+
+err_out:
+ hinic3_set_func_svc_used_state(hwdev, SVC_T_NIC, 0, HINIC3_CHANNEL_NIC);
+
+set_used_state_err:
+ hinic3_unregister_service_adapter(hwdev, SERVICE_T_NIC);
+
+register_sa_err:
+ mutex_deinit(&nic_io->nic_cfg.sfp_mutex);
+ sema_deinit(&nic_io->nic_cfg.cfg_lock);
+
+ kfree(nic_io);
+
+ return err;
+}
+EXPORT_SYMBOL(hinic3_init_nic_hwdev);
+
+void hinic3_free_nic_hwdev(void *hwdev)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+ if (!hwdev)
+ return;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io)
+ return;
+
+ hinic3_vf_func_free(nic_io);
+
+ hinic3_set_func_svc_used_state(hwdev, SVC_T_NIC, 0, HINIC3_CHANNEL_NIC);
+
+ hinic3_unregister_service_adapter(hwdev, SERVICE_T_NIC);
+
+ mutex_deinit(&nic_io->nic_cfg.sfp_mutex);
+ sema_deinit(&nic_io->nic_cfg.cfg_lock);
+
+ kfree(nic_io);
+}
+EXPORT_SYMBOL(hinic3_free_nic_hwdev);
+
+int hinic3_force_drop_tx_pkt(void *hwdev)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ struct hinic3_force_pkt_drop pkt_drop;
+ u16 out_size = sizeof(pkt_drop);
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ memset(&pkt_drop, 0, sizeof(pkt_drop));
+ pkt_drop.port = hinic3_physical_port_id(hwdev);
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_FORCE_PKT_DROP,
+ &pkt_drop, sizeof(pkt_drop),
+ &pkt_drop, &out_size);
+ if ((pkt_drop.msg_head.status != HINIC3_MGMT_CMD_UNSUPPORTED &&
+ pkt_drop.msg_head.status) || err || !out_size) {
+ nic_err(nic_io->dev_hdl,
+ "Failed to set force tx packets drop, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, pkt_drop.msg_head.status, out_size);
+ return -EFAULT;
+ }
+
+ return pkt_drop.msg_head.status;
+}
+
+int hinic3_set_rx_mode(void *hwdev, u32 enable)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ struct hinic3_rx_mode_config rx_mode_cfg;
+ u16 out_size = sizeof(rx_mode_cfg);
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ memset(&rx_mode_cfg, 0, sizeof(rx_mode_cfg));
+ rx_mode_cfg.func_id = hinic3_global_func_id(hwdev);
+ rx_mode_cfg.rx_mode = enable;
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_SET_RX_MODE,
+ &rx_mode_cfg, sizeof(rx_mode_cfg),
+ &rx_mode_cfg, &out_size);
+ if (err || !out_size || rx_mode_cfg.msg_head.status) {
+ nic_err(nic_io->dev_hdl, "Failed to set rx mode, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, rx_mode_cfg.msg_head.status, out_size);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+int hinic3_set_rx_vlan_offload(void *hwdev, u8 en)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ struct hinic3_cmd_vlan_offload vlan_cfg;
+ u16 out_size = sizeof(vlan_cfg);
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ memset(&vlan_cfg, 0, sizeof(vlan_cfg));
+ vlan_cfg.func_id = hinic3_global_func_id(hwdev);
+ vlan_cfg.vlan_offload = en;
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_SET_RX_VLAN_OFFLOAD,
+ &vlan_cfg, sizeof(vlan_cfg),
+ &vlan_cfg, &out_size);
+ if (err || !out_size || vlan_cfg.msg_head.status) {
+ nic_err(nic_io->dev_hdl, "Failed to set rx vlan offload, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, vlan_cfg.msg_head.status, out_size);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+int hinic3_update_mac_vlan(void *hwdev, u16 old_vlan, u16 new_vlan, int vf_id)
+{
+ struct vf_data_storage *vf_info = NULL;
+ struct hinic3_nic_io *nic_io = NULL;
+ u16 func_id;
+ int err;
+
+ if (!hwdev || old_vlan >= VLAN_N_VID || new_vlan >= VLAN_N_VID)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ vf_info = nic_io->vf_infos + HW_VF_ID_TO_OS(vf_id);
+ if (!nic_io->vf_infos || is_zero_ether_addr(vf_info->drv_mac_addr))
+ return 0;
+
+ func_id = hinic3_glb_pf_vf_offset(nic_io->hwdev) + (u16)vf_id;
+
+ err = hinic3_del_mac(nic_io->hwdev, vf_info->drv_mac_addr,
+ old_vlan, func_id, HINIC3_CHANNEL_NIC);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to delete VF %d MAC %pM vlan %u\n",
+ HW_VF_ID_TO_OS(vf_id), vf_info->drv_mac_addr, old_vlan);
+ return err;
+ }
+
+ err = hinic3_set_mac(nic_io->hwdev, vf_info->drv_mac_addr,
+ new_vlan, func_id, HINIC3_CHANNEL_NIC);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to add VF %d MAC %pM vlan %u\n",
+ HW_VF_ID_TO_OS(vf_id), vf_info->drv_mac_addr, new_vlan);
+ hinic3_set_mac(nic_io->hwdev, vf_info->drv_mac_addr,
+ old_vlan, func_id, HINIC3_CHANNEL_NIC);
+ return err;
+ }
+
+ return 0;
+}
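+
+/*
+ * Note the rollback in the error path above: the old MAC/vlan pairing has
+ * already been deleted when the re-add fails, so it is restored before
+ * returning. The shape of the pattern, reduced to its steps:
+ *
+ *	err = del(old);
+ *	if (err)
+ *		return err;
+ *	err = add(new);
+ *	if (err) {
+ *		add(old);	// best-effort restore of the prior state
+ *		return err;
+ *	}
+ */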
+
+static int hinic3_set_rx_lro(void *hwdev, u8 ipv4_en, u8 ipv6_en,
+ u8 lro_max_pkt_len)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ struct hinic3_cmd_lro_config lro_cfg;
+ u16 out_size = sizeof(lro_cfg);
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ memset(&lro_cfg, 0, sizeof(lro_cfg));
+ lro_cfg.func_id = hinic3_global_func_id(hwdev);
+ lro_cfg.opcode = HINIC3_CMD_OP_SET;
+ lro_cfg.lro_ipv4_en = ipv4_en;
+ lro_cfg.lro_ipv6_en = ipv6_en;
+ lro_cfg.lro_max_pkt_len = lro_max_pkt_len;
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_CFG_RX_LRO,
+ &lro_cfg, sizeof(lro_cfg),
+ &lro_cfg, &out_size);
+ if (err || !out_size || lro_cfg.msg_head.status) {
+ nic_err(nic_io->dev_hdl, "Failed to set lro offload, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, lro_cfg.msg_head.status, out_size);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int hinic3_set_rx_lro_timer(void *hwdev, u32 timer_value)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ struct hinic3_cmd_lro_timer lro_timer;
+ u16 out_size = sizeof(lro_timer);
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ memset(&lro_timer, 0, sizeof(lro_timer));
+ lro_timer.opcode = HINIC3_CMD_OP_SET;
+ lro_timer.timer = timer_value;
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_CFG_LRO_TIMER,
+ &lro_timer, sizeof(lro_timer),
+ &lro_timer, &out_size);
+ if (err || !out_size || lro_timer.msg_head.status) {
+ nic_err(nic_io->dev_hdl, "Failed to set lro timer, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, lro_timer.msg_head.status, out_size);
+
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+int hinic3_set_rx_lro_state(void *hwdev, u8 lro_en, u32 lro_timer,
+ u32 lro_max_pkt_len)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ u8 ipv4_en = 0, ipv6_en = 0;
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ ipv4_en = lro_en ? 1 : 0;
+ ipv6_en = lro_en ? 1 : 0;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ nic_info(nic_io->dev_hdl, "Set LRO max coalesce packet size to %uK\n",
+ lro_max_pkt_len);
+
+ err = hinic3_set_rx_lro(hwdev, ipv4_en, ipv6_en, (u8)lro_max_pkt_len);
+ if (err)
+ return err;
+
+ /* we don't set LRO timer for VF */
+ if (hinic3_func_type(hwdev) == TYPE_VF)
+ return 0;
+
+ nic_info(nic_io->dev_hdl, "Set LRO timer to %u\n", lro_timer);
+
+ return hinic3_set_rx_lro_timer(hwdev, lro_timer);
+}
+
+int hinic3_set_vlan_fliter(void *hwdev, u32 vlan_filter_ctrl)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ struct hinic3_cmd_set_vlan_filter vlan_filter;
+ u16 out_size = sizeof(vlan_filter);
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ memset(&vlan_filter, 0, sizeof(vlan_filter));
+ vlan_filter.func_id = hinic3_global_func_id(hwdev);
+ vlan_filter.vlan_filter_ctrl = vlan_filter_ctrl;
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_SET_VLAN_FILTER_EN,
+ &vlan_filter, sizeof(vlan_filter),
+ &vlan_filter, &out_size);
+ if (err || !out_size || vlan_filter.msg_head.status) {
+ nic_err(nic_io->dev_hdl, "Failed to set vlan filter, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, vlan_filter.msg_head.status, out_size);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+int hinic3_set_func_capture_en(void *hwdev, u16 func_id, bool cap_en)
+{
+ struct nic_cmd_capture_info cap_info = {{0}};
+ u16 out_size = sizeof(cap_info);
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ cap_info.is_en_trx = cap_en;
+ cap_info.func_port = func_id;
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_SET_UCAPTURE_OPT,
+ &cap_info, sizeof(cap_info),
+ &cap_info, &out_size);
+ if (err || !out_size || cap_info.msg_head.status)
+ return -EINVAL;
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_set_func_capture_en);
+
+int hinic3_add_tcam_rule(void *hwdev, struct nic_tcam_cfg_rule *tcam_rule)
+{
+ u16 out_size = sizeof(struct nic_cmd_fdir_add_rule);
+ struct nic_cmd_fdir_add_rule tcam_cmd;
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev || !tcam_rule)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (tcam_rule->index >= HINIC3_MAX_TCAM_RULES_NUM) {
+		nic_err(nic_io->dev_hdl, "Tcam rule index to add is invalid\n");
+ return -EINVAL;
+ }
+
+ memset(&tcam_cmd, 0, sizeof(struct nic_cmd_fdir_add_rule));
+ memcpy((void *)&tcam_cmd.rule, (void *)tcam_rule,
+ sizeof(struct nic_tcam_cfg_rule));
+ tcam_cmd.func_id = hinic3_global_func_id(hwdev);
+ tcam_cmd.type = TCAM_RULE_FDIR_TYPE;
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_ADD_TC_FLOW,
+ &tcam_cmd, sizeof(tcam_cmd),
+ &tcam_cmd, &out_size);
+ if (err || tcam_cmd.head.status || !out_size) {
+ nic_err(nic_io->dev_hdl,
+ "Add tcam rule failed, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, tcam_cmd.head.status, out_size);
+ return -EIO;
+ }
+
+ return 0;
+}
+
+int hinic3_del_tcam_rule(void *hwdev, u32 index)
+{
+ u16 out_size = sizeof(struct nic_cmd_fdir_del_rules);
+ struct nic_cmd_fdir_del_rules tcam_cmd;
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (index >= HINIC3_MAX_TCAM_RULES_NUM) {
+		nic_err(nic_io->dev_hdl, "Tcam rule index to del is invalid\n");
+ return -EINVAL;
+ }
+
+ memset(&tcam_cmd, 0, sizeof(struct nic_cmd_fdir_del_rules));
+ tcam_cmd.index_start = index;
+ tcam_cmd.index_num = 1;
+ tcam_cmd.func_id = hinic3_global_func_id(hwdev);
+ tcam_cmd.type = TCAM_RULE_FDIR_TYPE;
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_DEL_TC_FLOW,
+ &tcam_cmd, sizeof(tcam_cmd),
+ &tcam_cmd, &out_size);
+ if (err || tcam_cmd.head.status || !out_size) {
+ nic_err(nic_io->dev_hdl,
+ "Del tcam rule failed, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, tcam_cmd.head.status, out_size);
+ return -EIO;
+ }
+
+ return 0;
+}
+
+/**
+ * hinic3_mgmt_tcam_block - alloc or free tcam block for IO packet.
+ *
+ * @param hwdev
+ * The hardware interface of a nic device.
+ * @param alloc_en
+ * 1 alloc block.
+ * 0 free block.
+ * @param index
+ * block index from firmware.
+ * @return
+ * 0 on success,
+ * negative error value otherwise.
+ */
+static int hinic3_mgmt_tcam_block(void *hwdev, u8 alloc_en, u16 *index)
+{
+ struct nic_cmd_ctrl_tcam_block_out tcam_block_info;
+ u16 out_size = sizeof(struct nic_cmd_ctrl_tcam_block_out);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev || !index)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ memset(&tcam_block_info, 0,
+ sizeof(struct nic_cmd_ctrl_tcam_block_out));
+
+ tcam_block_info.func_id = hinic3_global_func_id(hwdev);
+ tcam_block_info.alloc_en = alloc_en;
+ tcam_block_info.tcam_type = NIC_TCAM_BLOCK_TYPE_LARGE;
+ tcam_block_info.tcam_block_index = *index;
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_CFG_TCAM_BLOCK,
+ &tcam_block_info, sizeof(tcam_block_info),
+ &tcam_block_info, &out_size);
+ if (err || tcam_block_info.head.status || !out_size) {
+ nic_err(nic_io->dev_hdl,
+ "Set tcam block failed, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, tcam_block_info.head.status, out_size);
+ return -EIO;
+ }
+
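+	/* on allocation the firmware returns the block index it assigned */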
+ if (alloc_en)
+ *index = tcam_block_info.tcam_block_index;
+
+ return 0;
+}
+
+int hinic3_alloc_tcam_block(void *hwdev, u16 *index)
+{
+ return hinic3_mgmt_tcam_block(hwdev, HINIC3_TCAM_BLOCK_ENABLE, index);
+}
+
+int hinic3_free_tcam_block(void *hwdev, u16 *index)
+{
+ return hinic3_mgmt_tcam_block(hwdev, HINIC3_TCAM_BLOCK_DISABLE, index);
+}
+
+int hinic3_set_fdir_tcam_rule_filter(void *hwdev, bool enable)
+{
+ struct nic_cmd_set_tcam_enable port_tcam_cmd;
+ u16 out_size = sizeof(port_tcam_cmd);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ memset(&port_tcam_cmd, 0, sizeof(port_tcam_cmd));
+ port_tcam_cmd.func_id = hinic3_global_func_id(hwdev);
+ port_tcam_cmd.tcam_enable = (u8)enable;
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_ENABLE_TCAM,
+ &port_tcam_cmd, sizeof(port_tcam_cmd),
+ &port_tcam_cmd, &out_size);
+ if (err || port_tcam_cmd.head.status || !out_size) {
+ nic_err(nic_io->dev_hdl, "Set fdir tcam filter failed, err: %d, status: 0x%x, out size: 0x%x, enable: 0x%x\n",
+ err, port_tcam_cmd.head.status, out_size,
+ enable);
+ return -EIO;
+ }
+
+ return 0;
+}
+
+int hinic3_flush_tcam_rule(void *hwdev)
+{
+ struct nic_cmd_flush_tcam_rules tcam_flush;
+ u16 out_size = sizeof(struct nic_cmd_flush_tcam_rules);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ memset(&tcam_flush, 0, sizeof(struct nic_cmd_flush_tcam_rules));
+ tcam_flush.func_id = hinic3_global_func_id(hwdev);
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_FLUSH_TCAM,
+ &tcam_flush,
+ sizeof(struct nic_cmd_flush_tcam_rules),
+ &tcam_flush, &out_size);
+ if (err || tcam_flush.head.status || !out_size) {
+ nic_err(nic_io->dev_hdl,
+ "Flush tcam fdir rules failed, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, tcam_flush.head.status, out_size);
+ return -EIO;
+ }
+
+ return 0;
+}
+
+int hinic3_get_rxq_hw_info(void *hwdev, struct rxq_check_info *rxq_info, u16 num_qps, u16 wqe_type)
+{
+ struct hinic3_cmd_buf *cmd_buf = NULL;
+ struct hinic3_nic_io *nic_io = NULL;
+ struct hinic3_rxq_hw *rxq_hw = NULL;
+ struct rxq_check_info *rxq_info_out = NULL;
+ int err;
+ u16 i;
+
+ if (!hwdev || !rxq_info)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ cmd_buf = hinic3_alloc_cmd_buf(hwdev);
+ if (!cmd_buf) {
+ nic_err(nic_io->dev_hdl, "Failed to allocate cmd_buf.\n");
+ return -ENOMEM;
+ }
+
+ rxq_hw = cmd_buf->buf;
+ rxq_hw->func_id = hinic3_global_func_id(hwdev);
+ rxq_hw->num_queues = num_qps;
+
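+	/* the firmware expects the request fields in big-endian order */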
+ hinic3_cpu_to_be32(rxq_hw, sizeof(struct hinic3_rxq_hw));
+
+ cmd_buf->size = sizeof(struct hinic3_rxq_hw);
+
+ err = hinic3_cmdq_detail_resp(hwdev, HINIC3_MOD_L2NIC, HINIC3_UCODE_CMD_RXQ_INFO_GET,
+ cmd_buf, cmd_buf, NULL, 0, HINIC3_CHANNEL_NIC);
+ if (err)
+ goto get_rxq_info_failed;
+
+ rxq_info_out = cmd_buf->buf;
+ for (i = 0; i < num_qps; i++) {
+ rxq_info[i].hw_pi = rxq_info_out[i].hw_pi >> wqe_type;
+ rxq_info[i].hw_ci = rxq_info_out[i].hw_ci >> wqe_type;
+ }
+
+get_rxq_info_failed:
+ hinic3_free_cmd_buf(hwdev, cmd_buf);
+
+ return err;
+}
+
+int hinic3_pf_set_vf_link_state(void *hwdev, bool vf_link_forced, bool link_state)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ struct vf_data_storage *vf_infos = NULL;
+ int vf_id;
+
+ if (!hwdev) {
+ pr_err("hwdev is null.\n");
+ return -EINVAL;
+ }
+
+ if (hinic3_func_type(hwdev) == TYPE_VF)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io) {
+ pr_err("nic_io is null.\n");
+ return -EINVAL;
+ }
+
+ vf_infos = nic_io->vf_infos;
+ for (vf_id = 0; vf_id < nic_io->max_vfs; vf_id++) {
+ vf_infos[vf_id].link_up = link_state;
+ vf_infos[vf_id].link_forced = vf_link_forced;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_pf_set_vf_link_state);
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_nic_cfg.h b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_cfg.h
new file mode 100644
index 000000000000..dc0a8eb6e2df
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_cfg.h
@@ -0,0 +1,620 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_NIC_CFG_H
+#define HINIC3_NIC_CFG_H
+
+#include <linux/types.h>
+#include <linux/netdevice.h>
+
+#include "hinic3_mgmt_interface.h"
+#include "mag_cmd.h"
+
+#define OS_VF_ID_TO_HW(os_vf_id) ((os_vf_id) + 1)
+#define HW_VF_ID_TO_OS(hw_vf_id) ((hw_vf_id) - 1)
+
+#define HINIC3_VLAN_PRIORITY_SHIFT 13
+
+#define HINIC3_RSS_INDIR_4B_UNIT 3
+#define HINIC3_RSS_INDIR_NUM 2
+
+#define HINIC3_RSS_KEY_RSV_NUM 2
+#define HINIC3_MAX_NUM_RQ 256
+
+#define HINIC3_MIN_MTU_SIZE 256
+#define HINIC3_MAX_JUMBO_FRAME_SIZE 9600
+
+#define HINIC3_PF_SET_VF_ALREADY 0x4
+#define HINIC3_MGMT_STATUS_EXIST 0x6
+#define CHECK_IPSU_15BIT 0x8000
+
+#define HINIC3_MGMT_STATUS_TABLE_EMPTY 0xB /* Table empty */
+#define HINIC3_MGMT_STATUS_TABLE_FULL 0xC /* Table full */
+
+#define HINIC3_LOWEST_LATENCY 3
+#define HINIC3_MULTI_VM_LATENCY 32
+#define HINIC3_MULTI_VM_PENDING_LIMIT 4
+
+#define HINIC3_RX_RATE_LOW 200000
+#define HINIC3_RX_COAL_TIME_LOW 25
+#define HINIC3_RX_PENDING_LIMIT_LOW 2
+
+#define HINIC3_RX_RATE_HIGH 700000
+#define HINIC3_RX_COAL_TIME_HIGH 225
+#define HINIC3_RX_PENDING_LIMIT_HIGH 8
+
+#define HINIC3_RX_RATE_THRESH 50000
+#define HINIC3_TX_RATE_THRESH 50000
+#define HINIC3_RX_RATE_LOW_VM 100000
+#define HINIC3_RX_PENDING_LIMIT_HIGH_VM 87
+
+#define HINIC3_DCB_PCP 0
+#define HINIC3_DCB_DSCP 1
+
+#define MAX_LIMIT_BW 100
+
+enum hinic3_valid_link_settings {
+ HILINK_LINK_SET_SPEED = 0x1,
+ HILINK_LINK_SET_AUTONEG = 0x2,
+ HILINK_LINK_SET_FEC = 0x4,
+};
+
+enum hinic3_link_follow_status {
+ HINIC3_LINK_FOLLOW_DEFAULT,
+ HINIC3_LINK_FOLLOW_PORT,
+ HINIC3_LINK_FOLLOW_SEPARATE,
+ HINIC3_LINK_FOLLOW_STATUS_MAX,
+};
+
+struct hinic3_link_ksettings {
+ u32 valid_bitmap;
+ u8 speed; /* enum nic_speed_level */
+ u8 autoneg; /* 0 - off; 1 - on */
+ u8 fec; /* 0 - RSFEC; 1 - BASEFEC; 2 - NOFEC */
+};
+
+u64 hinic3_get_feature_cap(void *hwdev);
+
+#define HINIC3_SUPPORT_FEATURE(hwdev, feature) \
+ (hinic3_get_feature_cap(hwdev) & NIC_F_##feature)
+#define HINIC3_SUPPORT_CSUM(hwdev) HINIC3_SUPPORT_FEATURE(hwdev, CSUM)
+#define HINIC3_SUPPORT_SCTP_CRC(hwdev) HINIC3_SUPPORT_FEATURE(hwdev, SCTP_CRC)
+#define HINIC3_SUPPORT_TSO(hwdev) HINIC3_SUPPORT_FEATURE(hwdev, TSO)
+#define HINIC3_SUPPORT_UFO(hwdev) HINIC3_SUPPORT_FEATURE(hwdev, UFO)
+#define HINIC3_SUPPORT_LRO(hwdev) HINIC3_SUPPORT_FEATURE(hwdev, LRO)
+#define HINIC3_SUPPORT_RSS(hwdev) HINIC3_SUPPORT_FEATURE(hwdev, RSS)
+#define HINIC3_SUPPORT_RXVLAN_FILTER(hwdev) \
+ HINIC3_SUPPORT_FEATURE(hwdev, RX_VLAN_FILTER)
+#define HINIC3_SUPPORT_VLAN_OFFLOAD(hwdev) \
+ (HINIC3_SUPPORT_FEATURE(hwdev, RX_VLAN_STRIP) && \
+ HINIC3_SUPPORT_FEATURE(hwdev, TX_VLAN_INSERT))
+#define HINIC3_SUPPORT_VXLAN_OFFLOAD(hwdev) \
+ HINIC3_SUPPORT_FEATURE(hwdev, VXLAN_OFFLOAD)
+#define HINIC3_SUPPORT_IPSEC_OFFLOAD(hwdev) \
+ HINIC3_SUPPORT_FEATURE(hwdev, IPSEC_OFFLOAD)
+#define HINIC3_SUPPORT_FDIR(hwdev) HINIC3_SUPPORT_FEATURE(hwdev, FDIR)
+#define HINIC3_SUPPORT_PROMISC(hwdev) HINIC3_SUPPORT_FEATURE(hwdev, PROMISC)
+#define HINIC3_SUPPORT_ALLMULTI(hwdev) HINIC3_SUPPORT_FEATURE(hwdev, ALLMULTI)
+#define HINIC3_SUPPORT_VF_MAC(hwdev) HINIC3_SUPPORT_FEATURE(hwdev, VF_MAC)
+#define HINIC3_SUPPORT_RATE_LIMIT(hwdev) HINIC3_SUPPORT_FEATURE(hwdev, RATE_LIMIT)
+
+#define HINIC3_SUPPORT_RXQ_RECOVERY(hwdev) HINIC3_SUPPORT_FEATURE(hwdev, RXQ_RECOVERY)
+
+struct nic_rss_type {
+ u8 tcp_ipv6_ext;
+ u8 ipv6_ext;
+ u8 tcp_ipv6;
+ u8 ipv6;
+ u8 tcp_ipv4;
+ u8 ipv4;
+ u8 udp_ipv6;
+ u8 udp_ipv4;
+};
+
+enum hinic3_rss_hash_type {
+ HINIC3_RSS_HASH_ENGINE_TYPE_XOR = 0,
+ HINIC3_RSS_HASH_ENGINE_TYPE_TOEP,
+ HINIC3_RSS_HASH_ENGINE_TYPE_MAX,
+};
+
+/* rss */
+struct nic_rss_indirect_tbl {
+ u32 rsvd[4]; /* Make sure that 16B beyond entry[] */
+ u16 entry[NIC_RSS_INDIR_SIZE];
+};
+
+struct nic_rss_context_tbl {
+ u32 rsvd[4];
+ u32 ctx;
+};
+
+#define NIC_CONFIG_ALL_QUEUE_VLAN_CTX 0xFFFF
+struct nic_vlan_ctx {
+ u32 func_id;
+ u32 qid; /* if qid = 0xFFFF, config current function all queue */
+ u32 vlan_tag;
+ u32 vlan_mode;
+ u32 vlan_sel;
+};
+
+enum hinic3_link_status {
+ HINIC3_LINK_DOWN = 0,
+ HINIC3_LINK_UP
+};
+
+struct nic_port_info {
+ u8 port_type;
+ u8 autoneg_cap;
+ u8 autoneg_state;
+ u8 duplex;
+ u8 speed;
+ u8 fec;
+ u32 supported_mode;
+ u32 advertised_mode;
+};
+
+struct nic_pause_config {
+ u8 auto_neg;
+ u8 rx_pause;
+ u8 tx_pause;
+};
+
+struct rxq_check_info {
+ u16 hw_pi;
+ u16 hw_ci;
+};
+
+struct hinic3_rxq_hw {
+ u32 func_id;
+ u32 num_queues;
+
+ u32 rsvd[14];
+};
+
+#define MODULE_TYPE_SFP 0x3
+#define MODULE_TYPE_QSFP28 0x11
+#define MODULE_TYPE_QSFP 0x0C
+#define MODULE_TYPE_QSFP_PLUS 0x0D
+
+#define TCAM_IP_TYPE_MASK 0x1
+#define TCAM_TUNNEL_TYPE_MASK 0xF
+#define TCAM_FUNC_ID_MASK 0x7FFF
+
+int hinic3_add_tcam_rule(void *hwdev, struct nic_tcam_cfg_rule *tcam_rule);
+int hinic3_del_tcam_rule(void *hwdev, u32 index);
+
+int hinic3_alloc_tcam_block(void *hwdev, u16 *index);
+int hinic3_free_tcam_block(void *hwdev, u16 *index);
+
+int hinic3_set_fdir_tcam_rule_filter(void *hwdev, bool enable);
+
+int hinic3_flush_tcam_rule(void *hwdev);
+
+/* *
+ * @brief hinic3_update_mac - update mac address to hardware
+ * @param hwdev: device pointer to hwdev
+ * @param old_mac: old mac to delete
+ * @param new_mac: new mac to update
+ * @param vlan_id: vlan id
+ * @param func_id: function index
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_update_mac(void *hwdev, u8 *old_mac, u8 *new_mac, u16 vlan_id,
+ u16 func_id);
+
+/* *
+ * @brief hinic3_get_default_mac - get default mac address
+ * @param hwdev: device pointer to hwdev
+ * @param mac_addr: mac address from hardware
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_get_default_mac(void *hwdev, u8 *mac_addr);
+
+/* *
+ * @brief hinic3_set_port_mtu - set function mtu
+ * @param hwdev: device pointer to hwdev
+ * @param new_mtu: mtu
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_set_port_mtu(void *hwdev, u16 new_mtu);
+
+/* *
+ * @brief hinic3_get_link_state - get link state
+ * @param hwdev: device pointer to hwdev
+ * @param link_state: link state, 0-link down, 1-link up
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_get_link_state(void *hwdev, u8 *link_state);
+
+/* *
+ * @brief hinic3_get_vport_stats - get function stats
+ * @param hwdev: device pointer to hwdev
+ * @param func_id: function index
+ * @param stats: function stats
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_get_vport_stats(void *hwdev, u16 func_id, struct hinic3_vport_stats *stats);
+
+/* *
+ * @brief hinic3_notify_all_vfs_link_changed - notify all vfs of a link change
+ * @param hwdev: device pointer to hwdev
+ * @param link_status: link state, 0-link down, 1-link up
+ */
+void hinic3_notify_all_vfs_link_changed(void *hwdev, u8 link_status);
+
+/* *
+ * @brief hinic3_force_drop_tx_pkt - force drop tx packet
+ * @param hwdev: device pointer to hwdev
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_force_drop_tx_pkt(void *hwdev);
+
+/* *
+ * @brief hinic3_set_rx_mode - set function rx mode
+ * @param hwdev: device pointer to hwdev
+ * @param enable: rx mode state
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_set_rx_mode(void *hwdev, u32 enable);
+
+/* *
+ * @brief hinic3_set_rx_vlan_offload - set function vlan offload valid state
+ * @param hwdev: device pointer to hwdev
+ * @param en: 0-disable, 1-enable
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_set_rx_vlan_offload(void *hwdev, u8 en);
+
+/* *
+ * @brief hinic3_set_rx_lro_state - set rx LRO configuration
+ * @param hwdev: device pointer to hwdev
+ * @param lro_en: 0-disable, 1-enable
+ * @param lro_timer: LRO aggregation timeout
+ * @param lro_max_pkt_len: LRO coalesce packet size (unit is 1KB)
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_set_rx_lro_state(void *hwdev, u8 lro_en, u32 lro_timer,
+ u32 lro_max_pkt_len);
+
+/* *
+ * @brief hinic3_set_vf_spoofchk - set vf spoofchk
+ * @param hwdev: device pointer to hwdev
+ * @param vf_id: vf id
+ * @param spoofchk: spoofchk
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_set_vf_spoofchk(void *hwdev, u16 vf_id, bool spoofchk);
+
+/* *
+ * @brief hinic3_vf_info_spoofchk - get vf spoofchk info
+ * @param hwdev: device pointer to hwdev
+ * @param vf_id: vf id
+ * @retval spoofchk state
+ */
+bool hinic3_vf_info_spoofchk(void *hwdev, int vf_id);
+
+/* *
+ * @brief hinic3_add_vf_vlan - add vf vlan id
+ * @param hwdev: device pointer to hwdev
+ * @param vf_id: vf id
+ * @param vlan: vlan id
+ * @param qos: qos
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_add_vf_vlan(void *hwdev, int vf_id, u16 vlan, u8 qos);
+
+/* *
+ * @brief hinic3_kill_vf_vlan - kill vf vlan
+ * @param hwdev: device pointer to hwdev
+ * @param vf_id: vf id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_kill_vf_vlan(void *hwdev, int vf_id);
+
+/* *
+ * @brief hinic3_set_vf_mac - set vf mac
+ * @param hwdev: device pointer to hwdev
+ * @param vf_id: vf id
+ * @param mac_addr: vf mac address
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_set_vf_mac(void *hwdev, int vf_id, unsigned char *mac_addr);
+
+/* *
+ * @brief hinic3_vf_info_vlanprio - get vf vlan priority
+ * @param hwdev: device pointer to hwdev
+ * @param vf_id: vf id
+ * @retval vlan priority of the vf
+ */
+u16 hinic3_vf_info_vlanprio(void *hwdev, int vf_id);
+
+/* *
+ * @brief hinic3_set_vf_tx_rate - set vf tx rate
+ * @param hwdev: device pointer to hwdev
+ * @param vf_id: vf id
+ * @param max_rate: max rate
+ * @param min_rate: min rate
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_set_vf_tx_rate(void *hwdev, u16 vf_id, u32 max_rate, u32 min_rate);
+
+/* *
+ * @brief hinic3_get_vf_config - get vf configuration
+ * @param hwdev: device pointer to hwdev
+ * @param vf_id: vf id
+ * @param ivi: vf info
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+void hinic3_get_vf_config(void *hwdev, u16 vf_id, struct ifla_vf_info *ivi);
+
+/* *
+ * @brief hinic3_set_vf_link_state - set vf link state
+ * @param hwdev: device pointer to hwdev
+ * @param vf_id: vf id
+ * @param link: link state
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_set_vf_link_state(void *hwdev, u16 vf_id, int link);
+
+/* *
+ * @brief hinic3_get_port_info - get port info
+ * @param hwdev: device pointer to hwdev
+ * @param port_info: port info
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_get_port_info(void *hwdev, struct nic_port_info *port_info,
+ u16 channel);
+
+/* *
+ * @brief hinic3_set_rss_type - set rss type
+ * @param hwdev: device pointer to hwdev
+ * @param rss_type: rss type
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_set_rss_type(void *hwdev, struct nic_rss_type rss_type);
+
+/* *
+ * @brief hinic3_get_rss_type - get rss type
+ * @param hwdev: device pointer to hwdev
+ * @param rss_type: rss type
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_get_rss_type(void *hwdev, struct nic_rss_type *rss_type);
+
+/* *
+ * @brief hinic3_rss_get_hash_engine - get rss hash engine
+ * @param hwdev: device pointer to hwdev
+ * @param type: hash engine
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_rss_get_hash_engine(void *hwdev, u8 *type);
+
+/* *
+ * @brief hinic3_rss_set_hash_engine - set rss hash engine
+ * @param hwdev: device pointer to hwdev
+ * @param type: hash engine
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_rss_set_hash_engine(void *hwdev, u8 type);
+
+/* *
+ * @brief hinic3_rss_cfg - set rss configuration
+ * @param hwdev: device pointer to hwdev
+ * @param rss_en: enable rss flag
+ * @param cos_num: number of cos
+ * @param prio_tc: priority to tc map
+ * @param num_qps: number of queues
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_rss_cfg(void *hwdev, u8 rss_en, u8 cos_num, u8 *prio_tc,
+ u16 num_qps);
+
+/* *
+ * @brief hinic3_rss_set_hash_key - set rss hash key
+ * @param hwdev: device pointer to hwdev
+ * @param key: rss key
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_rss_set_hash_key(void *hwdev, const u8 *key);
+
+/* *
+ * @brief hinic3_rss_get_hash_key - get rss hash key
+ * @param hwdev: device pointer to hwdev
+ * @param key: rss key
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_rss_get_hash_key(void *hwdev, u8 *key);
+
+/* *
+ * @brief hinic3_refresh_nic_cfg - refresh port cfg
+ * @param hwdev: device pointer to hwdev
+ * @param port_info: port information
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_refresh_nic_cfg(void *hwdev, struct nic_port_info *port_info);
+
+/* *
+ * @brief hinic3_add_vlan - add vlan
+ * @param hwdev: device pointer to hwdev
+ * @param vlan_id: vlan id
+ * @param func_id: function id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_add_vlan(void *hwdev, u16 vlan_id, u16 func_id);
+
+/* *
+ * @brief hinic3_del_vlan - delete vlan
+ * @param hwdev: device pointer to hwdev
+ * @param vlan_id: vlan id
+ * @param func_id: function id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_del_vlan(void *hwdev, u16 vlan_id, u16 func_id);
+
+/* *
+ * @brief hinic3_rss_set_indir_tbl - set rss indirect table
+ * @param hwdev: device pointer to hwdev
+ * @param indir_table: rss indirect table
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_rss_set_indir_tbl(void *hwdev, const u32 *indir_table);
+
+/* *
+ * @brief hinic3_rss_get_indir_tbl - get rss indirect table
+ * @param hwdev: device pointer to hwdev
+ * @param indir_table: rss indirect table
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_rss_get_indir_tbl(void *hwdev, u32 *indir_table);
+
+/* *
+ * @brief hinic3_get_phy_port_stats - get port stats
+ * @param hwdev: device pointer to hwdev
+ * @param stats: port stats
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_get_phy_port_stats(void *hwdev, struct mag_cmd_port_stats *stats);
+
+int hinic3_get_fpga_phy_port_stats(void *hwdev, struct hinic3_phy_fpga_port_stats *stats);
+
+int hinic3_set_port_funcs_state(void *hwdev, bool enable);
+
+int hinic3_reset_port_link_cfg(void *hwdev);
+
+int hinic3_force_port_relink(void *hwdev);
+
+int hinic3_set_dcb_state(void *hwdev, struct hinic3_dcb_state *dcb_state);
+
+int hinic3_dcb_set_pfc(void *hwdev, u8 pfc_en, u8 pfc_bitmap);
+
+int hinic3_dcb_get_pfc(void *hwdev, u8 *pfc_en_bitmap);
+
+int hinic3_dcb_set_ets(void *hwdev, u8 *cos_tc, u8 *cos_bw, u8 *cos_prio,
+ u8 *tc_bw, u8 *tc_prio);
+
+int hinic3_dcb_set_cos_up_map(void *hwdev, u8 cos_valid_bitmap, u8 *cos_up,
+ u8 max_cos_num);
+
+int hinic3_dcb_set_rq_iq_mapping(void *hwdev, u32 num_rqs, u8 *map,
+ u32 max_map_num);
+
+int hinic3_sync_dcb_state(void *hwdev, u8 op_code, u8 state);
+
+int hinic3_get_pause_info(void *hwdev, struct nic_pause_config *nic_pause);
+
+int hinic3_set_pause_info(void *hwdev, struct nic_pause_config nic_pause);
+
+int hinic3_set_link_settings(void *hwdev,
+ struct hinic3_link_ksettings *settings);
+
+int hinic3_set_vlan_fliter(void *hwdev, u32 vlan_filter_ctrl);
+
+void hinic3_clear_vfs_info(void *hwdev);
+
+int hinic3_update_mac_vlan(void *hwdev, u16 old_vlan, u16 new_vlan, int vf_id);
+
+int hinic3_set_led_status(void *hwdev, enum mag_led_type type,
+ enum mag_led_mode mode);
+
+int hinic3_set_func_capture_en(void *hwdev, u16 func_id, bool cap_en);
+
+int hinic3_set_loopback_mode(void *hwdev, u8 mode, u8 enable);
+int hinic3_get_loopback_mode(void *hwdev, u8 *mode, u8 *enable);
+
+#ifdef HAVE_NDO_SET_VF_TRUST
+bool hinic3_get_vf_trust(void *hwdev, int vf_id);
+int hinic3_set_vf_trust(void *hwdev, u16 vf_id, bool trust);
+#endif
+
+int hinic3_set_autoneg(void *hwdev, bool enable);
+
+int hinic3_get_sfp_type(void *hwdev, u8 *sfp_type, u8 *sfp_type_ext);
+int hinic3_get_sfp_eeprom(void *hwdev, u8 *data, u32 len);
+
+bool hinic3_if_sfp_absent(void *hwdev);
+int hinic3_get_sfp_info(void *hwdev, struct mag_cmd_get_xsfp_info *sfp_info);
+
+/* *
+ * @brief hinic3_set_nic_feature_to_hw - sync nic feature to hardware
+ * @param hwdev: device pointer to hwdev
+ */
+int hinic3_set_nic_feature_to_hw(void *hwdev);
+
+/* *
+ * @brief hinic3_update_nic_feature - update nic feature
+ * @param hwdev: device pointer to hwdev
+ * @param s_feature: nic features
+ */
+void hinic3_update_nic_feature(void *hwdev, u64 s_feature);
+
+/* *
+ * @brief hinic3_set_link_status_follow - set link follow status
+ * @param hwdev: device pointer to hwdev
+ * @param status: link follow status
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_set_link_status_follow(void *hwdev, enum hinic3_link_follow_status status);
+
+/* *
+ * @brief hinic3_update_pf_bw - update pf bandwidth
+ * @param hwdev: device pointer to hwdev
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_update_pf_bw(void *hwdev);
+
+/* *
+ * @brief hinic3_set_pf_bw_limit - set pf bandwidth limit
+ * @param hwdev: device pointer to hwdev
+ * @param bw_limit: pf bandwidth limit
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_set_pf_bw_limit(void *hwdev, u32 bw_limit);
+
+/* *
+ * @brief hinic3_set_pf_rate - set pf rate
+ * @param hwdev: device pointer to hwdev
+ * @param speed_level: speed level
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_set_pf_rate(void *hwdev, u8 speed_level);
+
+int hinic3_get_rxq_hw_info(void *hwdev, struct rxq_check_info *rxq_info, u16 num_qps, u16 wqe_type);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_nic_cfg_vf.c b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_cfg_vf.c
new file mode 100644
index 000000000000..b46cf78ce9e3
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_cfg_vf.c
@@ -0,0 +1,637 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC]" fmt
+
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/etherdevice.h>
+#include <linux/if_vlan.h>
+#include <linux/ethtool.h>
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/netdevice.h>
+#include <linux/module.h>
+
+#include "ossl_knl.h"
+#include "hinic3_crm.h"
+#include "hinic3_hw.h"
+#include "hinic3_nic_io.h"
+#include "hinic3_nic_cfg.h"
+#include "hinic3_srv_nic.h"
+#include "hinic3_nic.h"
+#include "hinic3_nic_cmd.h"
+
+/*lint -e806*/
+static unsigned char set_vf_link_state;
+module_param(set_vf_link_state, byte, 0444);
+MODULE_PARM_DESC(set_vf_link_state, "Set vf link state: 0 - link auto, 1 - link always up, 2 - link always down. Default is 0.");
+/*lint +e806*/
+
+/* In order to adapt to different linux versions */
+enum {
+ HINIC3_IFLA_VF_LINK_STATE_AUTO, /* link state of the uplink */
+ HINIC3_IFLA_VF_LINK_STATE_ENABLE, /* link always up */
+ HINIC3_IFLA_VF_LINK_STATE_DISABLE, /* link always down */
+};
+
+#define NIC_CVLAN_INSERT_ENABLE 0x1
+#define NIC_QINQ_INSERT_ENABLE 0x3
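+/* vlan_mode values written to the vlan context: QinQ insert when a tag is added, CVLAN insert otherwise */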
+static int hinic3_set_vlan_ctx(struct hinic3_nic_io *nic_io, u16 func_id,
+ u16 vlan_tag, u16 q_id, bool add)
+{
+ struct nic_vlan_ctx *vlan_ctx = NULL;
+ struct hinic3_cmd_buf *cmd_buf = NULL;
+ u64 out_param = 0;
+ int err;
+
+ cmd_buf = hinic3_alloc_cmd_buf(nic_io->hwdev);
+ if (!cmd_buf) {
+ nic_err(nic_io->dev_hdl, "Failed to allocate cmd buf\n");
+ return -ENOMEM;
+ }
+
+ cmd_buf->size = sizeof(struct nic_vlan_ctx);
+ vlan_ctx = (struct nic_vlan_ctx *)cmd_buf->buf;
+
+ vlan_ctx->func_id = func_id;
+ vlan_ctx->qid = q_id;
+ vlan_ctx->vlan_tag = vlan_tag;
+ vlan_ctx->vlan_sel = 0; /* TPID0 in IPSU */
+ vlan_ctx->vlan_mode = add ?
+ NIC_QINQ_INSERT_ENABLE : NIC_CVLAN_INSERT_ENABLE;
+
+ hinic3_cpu_to_be32(vlan_ctx, sizeof(struct nic_vlan_ctx));
+
+ err = hinic3_cmdq_direct_resp(nic_io->hwdev, HINIC3_MOD_L2NIC,
+ HINIC3_UCODE_CMD_MODIFY_VLAN_CTX,
+ cmd_buf, &out_param, 0,
+ HINIC3_CHANNEL_NIC);
+
+ hinic3_free_cmd_buf(nic_io->hwdev, cmd_buf);
+
+ if (err || out_param != 0) {
+ nic_err(nic_io->dev_hdl, "Failed to set vlan context, err: %d, out_param: 0x%llx\n",
+ err, out_param);
+ return -EFAULT;
+ }
+
+ return err;
+}
+
+int hinic3_cfg_vf_vlan(struct hinic3_nic_io *nic_io, u8 opcode, u16 vid,
+ u8 qos, int vf_id)
+{
+ struct hinic3_cmd_vf_vlan_config vf_vlan;
+ u16 out_size = sizeof(vf_vlan);
+ u16 glb_func_id;
+ int err;
+ u16 vlan_tag;
+
+ /* VLAN 0 is a special case, don't allow it to be removed */
+ if (!vid && opcode == HINIC3_CMD_OP_DEL)
+ return 0;
+
+ memset(&vf_vlan, 0, sizeof(vf_vlan));
+
+ vf_vlan.opcode = opcode;
+ vf_vlan.func_id = hinic3_glb_pf_vf_offset(nic_io->hwdev) + (u16)vf_id;
+ vf_vlan.vlan_id = vid;
+ vf_vlan.qos = qos;
+
+ err = l2nic_msg_to_mgmt_sync(nic_io->hwdev, HINIC3_NIC_CMD_CFG_VF_VLAN,
+ &vf_vlan, sizeof(vf_vlan),
+ &vf_vlan, &out_size);
+ if (err || !out_size || vf_vlan.msg_head.status) {
+ nic_err(nic_io->dev_hdl, "Failed to set VF %d vlan, err: %d, status: 0x%x,out size: 0x%x\n",
+ HW_VF_ID_TO_OS(vf_id), err, vf_vlan.msg_head.status,
+ out_size);
+ return -EFAULT;
+ }
+
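+	/* fold the 3-bit qos (PCP) into the tag bits above the vlan id */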
+ vlan_tag = vid + (u16)(qos << VLAN_PRIO_SHIFT);
+
+ glb_func_id = hinic3_glb_pf_vf_offset(nic_io->hwdev) + (u16)vf_id;
+ err = hinic3_set_vlan_ctx(nic_io, glb_func_id, vlan_tag,
+ NIC_CONFIG_ALL_QUEUE_VLAN_CTX,
+ opcode == HINIC3_CMD_OP_ADD);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to set VF %d vlan ctx, err: %d\n",
+ HW_VF_ID_TO_OS(vf_id), err);
+
+ /* rollback vlan config */
+ if (opcode == HINIC3_CMD_OP_DEL)
+ vf_vlan.opcode = HINIC3_CMD_OP_ADD;
+ else
+ vf_vlan.opcode = HINIC3_CMD_OP_DEL;
+ l2nic_msg_to_mgmt_sync(nic_io->hwdev,
+ HINIC3_NIC_CMD_CFG_VF_VLAN, &vf_vlan,
+ sizeof(vf_vlan), &vf_vlan, &out_size);
+ return err;
+ }
+
+ return 0;
+}
+
+/* This function must only be called by hinic3_ndo_set_vf_mac;
+ * other callers are not permitted.
+ */
+int hinic3_set_vf_mac(void *hwdev, int vf_id, unsigned char *mac_addr)
+{
+ struct vf_data_storage *vf_info;
+ struct hinic3_nic_io *nic_io;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ vf_info = nic_io->vf_infos + HW_VF_ID_TO_OS(vf_id);
+	/* duplicate request, so just return success */
+	if (ether_addr_equal(vf_info->user_mac_addr, mac_addr))
+		return 0;
+
+ ether_addr_copy(vf_info->user_mac_addr, mac_addr);
+
+ return 0;
+}
+
+int hinic3_add_vf_vlan(void *hwdev, int vf_id, u16 vlan, u8 qos)
+{
+ struct hinic3_nic_io *nic_io;
+ int err;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ err = hinic3_cfg_vf_vlan(nic_io, HINIC3_CMD_OP_ADD, vlan, qos, vf_id);
+ if (err)
+ return err;
+
+ nic_io->vf_infos[HW_VF_ID_TO_OS(vf_id)].pf_vlan = vlan;
+ nic_io->vf_infos[HW_VF_ID_TO_OS(vf_id)].pf_qos = qos;
+
+ nic_info(nic_io->dev_hdl, "Setting VLAN %u, QOS 0x%x on VF %d\n",
+ vlan, qos, HW_VF_ID_TO_OS(vf_id));
+
+ return 0;
+}
+
+int hinic3_kill_vf_vlan(void *hwdev, int vf_id)
+{
+ struct vf_data_storage *vf_infos;
+ struct hinic3_nic_io *nic_io;
+ int err;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ vf_infos = nic_io->vf_infos;
+
+ err = hinic3_cfg_vf_vlan(nic_io, HINIC3_CMD_OP_DEL,
+ vf_infos[HW_VF_ID_TO_OS(vf_id)].pf_vlan,
+ vf_infos[HW_VF_ID_TO_OS(vf_id)].pf_qos, vf_id);
+ if (err)
+ return err;
+
+ nic_info(nic_io->dev_hdl, "Remove VLAN %u on VF %d\n",
+ vf_infos[HW_VF_ID_TO_OS(vf_id)].pf_vlan,
+ HW_VF_ID_TO_OS(vf_id));
+
+ vf_infos[HW_VF_ID_TO_OS(vf_id)].pf_vlan = 0;
+ vf_infos[HW_VF_ID_TO_OS(vf_id)].pf_qos = 0;
+
+ return 0;
+}
+
+u16 hinic3_vf_info_vlanprio(void *hwdev, int vf_id)
+{
+ struct hinic3_nic_io *nic_io;
+ u16 pf_vlan, vlanprio;
+ u8 pf_qos;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ pf_vlan = nic_io->vf_infos[HW_VF_ID_TO_OS(vf_id)].pf_vlan;
+ pf_qos = nic_io->vf_infos[HW_VF_ID_TO_OS(vf_id)].pf_qos;
+ vlanprio = (u16)(pf_vlan | (pf_qos << HINIC3_VLAN_PRIORITY_SHIFT));
+
+ return vlanprio;
+}
+
+int hinic3_set_vf_link_state(void *hwdev, u16 vf_id, int link)
+{
+ struct hinic3_nic_io *nic_io =
+ hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ struct vf_data_storage *vf_infos = nic_io->vf_infos;
+ u8 link_status = 0;
+
+ switch (link) {
+ case HINIC3_IFLA_VF_LINK_STATE_AUTO:
+ vf_infos[HW_VF_ID_TO_OS(vf_id)].link_forced = false;
+ vf_infos[HW_VF_ID_TO_OS(vf_id)].link_up = nic_io->link_status ?
+ true : false;
+ link_status = nic_io->link_status;
+ break;
+ case HINIC3_IFLA_VF_LINK_STATE_ENABLE:
+ vf_infos[HW_VF_ID_TO_OS(vf_id)].link_forced = true;
+ vf_infos[HW_VF_ID_TO_OS(vf_id)].link_up = true;
+ link_status = HINIC3_LINK_UP;
+ break;
+ case HINIC3_IFLA_VF_LINK_STATE_DISABLE:
+ vf_infos[HW_VF_ID_TO_OS(vf_id)].link_forced = true;
+ vf_infos[HW_VF_ID_TO_OS(vf_id)].link_up = false;
+ link_status = HINIC3_LINK_DOWN;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ /* Notify the VF of its new link state */
+ hinic3_notify_vf_link_status(nic_io, vf_id, link_status);
+
+ return 0;
+}
+
+int hinic3_set_vf_spoofchk(void *hwdev, u16 vf_id, bool spoofchk)
+{
+ struct hinic3_cmd_spoofchk_set spoofchk_cfg;
+ struct vf_data_storage *vf_infos = NULL;
+ u16 out_size = sizeof(spoofchk_cfg);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ vf_infos = nic_io->vf_infos;
+
+ memset(&spoofchk_cfg, 0, sizeof(spoofchk_cfg));
+
+ spoofchk_cfg.func_id = hinic3_glb_pf_vf_offset(hwdev) + vf_id;
+ spoofchk_cfg.state = spoofchk ? 1 : 0;
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_SET_SPOOPCHK_STATE,
+ &spoofchk_cfg,
+ sizeof(spoofchk_cfg), &spoofchk_cfg,
+ &out_size);
+	if (err || !out_size || spoofchk_cfg.msg_head.status) {
+		nic_err(nic_io->dev_hdl, "Failed to set VF(%d) spoofchk, err: %d, status: 0x%x, out size: 0x%x\n",
+			HW_VF_ID_TO_OS(vf_id), err,
+			spoofchk_cfg.msg_head.status, out_size);
+		return -EINVAL;
+	}
+
+	/* cache the new state only after the firmware has accepted it */
+	vf_infos[HW_VF_ID_TO_OS(vf_id)].spoofchk = spoofchk;
+
+	return 0;
+}
+
+bool hinic3_vf_info_spoofchk(void *hwdev, int vf_id)
+{
+ struct hinic3_nic_io *nic_io;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ return nic_io->vf_infos[HW_VF_ID_TO_OS(vf_id)].spoofchk;
+}
+
+#ifdef HAVE_NDO_SET_VF_TRUST
+int hinic3_set_vf_trust(void *hwdev, u16 vf_id, bool trust)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (vf_id > nic_io->max_vfs)
+ return -EINVAL;
+
+ nic_io->vf_infos[HW_VF_ID_TO_OS(vf_id)].trust = trust;
+
+ return 0;
+}
+
+bool hinic3_get_vf_trust(void *hwdev, int vf_id)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+	if (!hwdev)
+		return false;
+
+	nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+	if (vf_id > nic_io->max_vfs)
+		return false;
+
+ return nic_io->vf_infos[HW_VF_ID_TO_OS(vf_id)].trust;
+}
+#endif
+
+static int hinic3_set_vf_tx_rate_max_min(struct hinic3_nic_io *nic_io,
+ u16 vf_id, u32 max_rate, u32 min_rate)
+{
+ struct hinic3_cmd_tx_rate_cfg rate_cfg;
+ u16 out_size = sizeof(rate_cfg);
+ int err;
+
+ memset(&rate_cfg, 0, sizeof(rate_cfg));
+
+ rate_cfg.func_id = hinic3_glb_pf_vf_offset(nic_io->hwdev) + vf_id;
+ rate_cfg.max_rate = max_rate;
+ rate_cfg.min_rate = min_rate;
+ err = l2nic_msg_to_mgmt_sync(nic_io->hwdev,
+ HINIC3_NIC_CMD_SET_MAX_MIN_RATE,
+ &rate_cfg, sizeof(rate_cfg), &rate_cfg,
+ &out_size);
+ if (rate_cfg.msg_head.status || err || !out_size) {
+ nic_err(nic_io->dev_hdl, "Failed to set VF %d max rate %u, min rate %u, err: %d, status: 0x%x, out size: 0x%x\n",
+ HW_VF_ID_TO_OS(vf_id), max_rate, min_rate, err,
+ rate_cfg.msg_head.status, out_size);
+ return -EIO;
+ }
+
+ return 0;
+}
+
+int hinic3_set_vf_tx_rate(void *hwdev, u16 vf_id, u32 max_rate, u32 min_rate)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!HINIC3_SUPPORT_RATE_LIMIT(hwdev)) {
+ nic_err(nic_io->dev_hdl, "Current function doesn't support to set vf rate limit\n");
+ return -EOPNOTSUPP;
+ }
+
+ err = hinic3_set_vf_tx_rate_max_min(nic_io, vf_id, max_rate, min_rate);
+ if (err)
+ return err;
+
+ nic_io->vf_infos[HW_VF_ID_TO_OS(vf_id)].max_rate = max_rate;
+ nic_io->vf_infos[HW_VF_ID_TO_OS(vf_id)].min_rate = min_rate;
+
+ return 0;
+}
+
+void hinic3_get_vf_config(void *hwdev, u16 vf_id, struct ifla_vf_info *ivi)
+{
+ struct vf_data_storage *vfinfo;
+ struct hinic3_nic_io *nic_io;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ vfinfo = nic_io->vf_infos + HW_VF_ID_TO_OS(vf_id);
+
+ ivi->vf = HW_VF_ID_TO_OS(vf_id);
+ ether_addr_copy(ivi->mac, vfinfo->user_mac_addr);
+ ivi->vlan = vfinfo->pf_vlan;
+ ivi->qos = vfinfo->pf_qos;
+
+#ifdef HAVE_VF_SPOOFCHK_CONFIGURE
+ ivi->spoofchk = vfinfo->spoofchk;
+#endif
+
+#ifdef HAVE_NDO_SET_VF_TRUST
+ ivi->trusted = vfinfo->trust;
+#endif
+
+#ifdef HAVE_NDO_SET_VF_MIN_MAX_TX_RATE
+ ivi->max_tx_rate = vfinfo->max_rate;
+ ivi->min_tx_rate = vfinfo->min_rate;
+#else
+ ivi->tx_rate = vfinfo->max_rate;
+#endif /* HAVE_NDO_SET_VF_MIN_MAX_TX_RATE */
+
+#ifdef HAVE_NDO_SET_VF_LINK_STATE
+ if (!vfinfo->link_forced)
+ ivi->linkstate = IFLA_VF_LINK_STATE_AUTO;
+ else if (vfinfo->link_up)
+ ivi->linkstate = IFLA_VF_LINK_STATE_ENABLE;
+ else
+ ivi->linkstate = IFLA_VF_LINK_STATE_DISABLE;
+#endif
+}
+
+static int hinic3_init_vf_infos(struct hinic3_nic_io *nic_io, u16 vf_id)
+{
+ struct vf_data_storage *vf_infos = nic_io->vf_infos;
+ u8 vf_link_state;
+
+ if (set_vf_link_state > HINIC3_IFLA_VF_LINK_STATE_DISABLE) {
+ nic_warn(nic_io->dev_hdl, "Module Parameter set_vf_link_state value %u is out of range, resetting to %d\n",
+ set_vf_link_state, HINIC3_IFLA_VF_LINK_STATE_AUTO);
+ set_vf_link_state = HINIC3_IFLA_VF_LINK_STATE_AUTO;
+ }
+
+ vf_link_state = set_vf_link_state;
+
+ switch (vf_link_state) {
+ case HINIC3_IFLA_VF_LINK_STATE_AUTO:
+ vf_infos[vf_id].link_forced = false;
+ break;
+ case HINIC3_IFLA_VF_LINK_STATE_ENABLE:
+ vf_infos[vf_id].link_forced = true;
+ vf_infos[vf_id].link_up = true;
+ break;
+ case HINIC3_IFLA_VF_LINK_STATE_DISABLE:
+ vf_infos[vf_id].link_forced = true;
+ vf_infos[vf_id].link_up = false;
+ break;
+ default:
+		nic_err(nic_io->dev_hdl, "Invalid parameter set_vf_link_state: %u\n",
+ vf_link_state);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int vf_func_register(struct hinic3_nic_io *nic_io)
+{
+ struct hinic3_cmd_register_vf register_info;
+ u16 out_size = sizeof(register_info);
+ int err;
+
+ err = hinic3_register_vf_mbox_cb(nic_io->hwdev, HINIC3_MOD_L2NIC,
+ nic_io->hwdev, hinic3_vf_event_handler);
+ if (err)
+ return err;
+
+ err = hinic3_register_vf_mbox_cb(nic_io->hwdev, HINIC3_MOD_HILINK,
+ nic_io->hwdev, hinic3_vf_mag_event_handler);
+ if (err)
+ goto reg_hilink_err;
+
+ memset(®ister_info, 0, sizeof(register_info));
+ register_info.op_register = 1;
+ register_info.support_extra_feature = 0;
+ err = hinic3_mbox_to_pf(nic_io->hwdev, HINIC3_MOD_L2NIC,
+ HINIC3_NIC_CMD_VF_REGISTER,
+ ®ister_info, sizeof(register_info),
+ ®ister_info, &out_size, 0,
+ HINIC3_CHANNEL_NIC);
+ if (err || !out_size || register_info.msg_head.status) {
+ nic_err(nic_io->dev_hdl, "Failed to register VF, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, register_info.msg_head.status, out_size);
+ err = -EIO;
+ goto register_err;
+ }
+
+ return 0;
+
+register_err:
+ hinic3_unregister_vf_mbox_cb(nic_io->hwdev, HINIC3_MOD_HILINK);
+
+reg_hilink_err:
+ hinic3_unregister_vf_mbox_cb(nic_io->hwdev, HINIC3_MOD_L2NIC);
+
+ return err;
+}
+
+static int pf_init_vf_infos(struct hinic3_nic_io *nic_io)
+{
+ u32 size;
+ int err;
+ u16 i;
+
+ nic_io->max_vfs = hinic3_func_max_vf(nic_io->hwdev);
+ size = sizeof(*nic_io->vf_infos) * nic_io->max_vfs;
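+	/* a PF without SR-IOV VFs has nothing to initialize */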
+ if (!size)
+ return 0;
+
+ nic_io->vf_infos = kzalloc(size, GFP_KERNEL);
+ if (!nic_io->vf_infos)
+ return -ENOMEM;
+
+ for (i = 0; i < nic_io->max_vfs; i++) {
+ err = hinic3_init_vf_infos(nic_io, i);
+ if (err)
+ goto init_vf_infos_err;
+ }
+
+ err = hinic3_register_pf_mbox_cb(nic_io->hwdev, HINIC3_MOD_L2NIC,
+ nic_io->hwdev, hinic3_pf_mbox_handler);
+ if (err)
+ goto register_pf_mbox_cb_err;
+
+ err = hinic3_register_pf_mbox_cb(nic_io->hwdev, HINIC3_MOD_HILINK,
+ nic_io->hwdev, hinic3_pf_mag_mbox_handler);
+ if (err)
+ goto register_pf_mag_mbox_cb_err;
+
+ return 0;
+
+register_pf_mag_mbox_cb_err:
+ hinic3_unregister_pf_mbox_cb(nic_io->hwdev, HINIC3_MOD_L2NIC);
+register_pf_mbox_cb_err:
+init_vf_infos_err:
+	kfree(nic_io->vf_infos);
+	nic_io->vf_infos = NULL;
+
+ return err;
+}
+
+int hinic3_vf_func_init(struct hinic3_nic_io *nic_io)
+{
+ int err;
+
+ if (hinic3_func_type(nic_io->hwdev) == TYPE_VF)
+ return vf_func_register(nic_io);
+
+ err = hinic3_register_mgmt_msg_cb(nic_io->hwdev, HINIC3_MOD_L2NIC,
+ nic_io->hwdev, hinic3_pf_event_handler);
+ if (err)
+ return err;
+
+ err = hinic3_register_mgmt_msg_cb(nic_io->hwdev, HINIC3_MOD_HILINK,
+ nic_io->hwdev, hinic3_pf_mag_event_handler);
+ if (err)
+ goto register_mgmt_msg_cb_err;
+
+ err = pf_init_vf_infos(nic_io);
+ if (err)
+ goto pf_init_vf_infos_err;
+
+ return 0;
+
+pf_init_vf_infos_err:
+ hinic3_unregister_mgmt_msg_cb(nic_io->hwdev, HINIC3_MOD_HILINK);
+register_mgmt_msg_cb_err:
+ hinic3_unregister_mgmt_msg_cb(nic_io->hwdev, HINIC3_MOD_L2NIC);
+
+ return err;
+}
+
+void hinic3_vf_func_free(struct hinic3_nic_io *nic_io)
+{
+ struct hinic3_cmd_register_vf unregister;
+ u16 out_size = sizeof(unregister);
+ int err;
+
+ memset(&unregister, 0, sizeof(unregister));
+ unregister.op_register = 0;
+ if (hinic3_func_type(nic_io->hwdev) == TYPE_VF) {
+ err = hinic3_mbox_to_pf(nic_io->hwdev, HINIC3_MOD_L2NIC,
+ HINIC3_NIC_CMD_VF_REGISTER,
+ &unregister, sizeof(unregister),
+ &unregister, &out_size, 0,
+ HINIC3_CHANNEL_NIC);
+ if (err || !out_size || unregister.msg_head.status)
+ nic_err(nic_io->dev_hdl, "Failed to unregister VF, err: %d, status: 0x%x, out_size: 0x%x\n",
+ err, unregister.msg_head.status, out_size);
+
+ hinic3_unregister_vf_mbox_cb(nic_io->hwdev, HINIC3_MOD_L2NIC);
+ } else {
+ if (nic_io->vf_infos) {
+ hinic3_unregister_pf_mbox_cb(nic_io->hwdev, HINIC3_MOD_HILINK);
+ hinic3_unregister_pf_mbox_cb(nic_io->hwdev, HINIC3_MOD_L2NIC);
+ hinic3_clear_vfs_info(nic_io->hwdev);
+ kfree(nic_io->vf_infos);
+ }
+ hinic3_unregister_mgmt_msg_cb(nic_io->hwdev, HINIC3_MOD_HILINK);
+ hinic3_unregister_mgmt_msg_cb(nic_io->hwdev, HINIC3_MOD_L2NIC);
+ }
+}
+
+static void clear_vf_infos(void *hwdev, u16 vf_id)
+{
+ struct vf_data_storage *vf_infos;
+ struct hinic3_nic_io *nic_io;
+ u16 func_id;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ func_id = hinic3_glb_pf_vf_offset(hwdev) + vf_id;
+ vf_infos = nic_io->vf_infos + HW_VF_ID_TO_OS(vf_id);
+ if (vf_infos->use_specified_mac)
+ hinic3_del_mac(hwdev, vf_infos->drv_mac_addr,
+ vf_infos->pf_vlan, func_id, HINIC3_CHANNEL_NIC);
+
+ if (hinic3_vf_info_vlanprio(hwdev, vf_id))
+ hinic3_kill_vf_vlan(hwdev, vf_id);
+
+ if (vf_infos->max_rate)
+ hinic3_set_vf_tx_rate(hwdev, vf_id, 0, 0);
+
+ if (vf_infos->spoofchk)
+ hinic3_set_vf_spoofchk(hwdev, vf_id, false);
+
+#ifdef HAVE_NDO_SET_VF_TRUST
+ if (vf_infos->trust)
+ hinic3_set_vf_trust(hwdev, vf_id, false);
+#endif
+
+ memset(vf_infos, 0, sizeof(*vf_infos));
+ /* set vf_infos to default */
+ hinic3_init_vf_infos(nic_io, HW_VF_ID_TO_OS(vf_id));
+}
+
+void hinic3_clear_vfs_info(void *hwdev)
+{
+ struct hinic3_nic_io *nic_io =
+ hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ u16 i;
+
+ for (i = 0; i < nic_io->max_vfs; i++)
+ clear_vf_infos(hwdev, OS_VF_ID_TO_HW(i));
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_nic_cmd.h b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_cmd.h
new file mode 100644
index 000000000000..31e224ab1095
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_cmd.h
@@ -0,0 +1,159 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C), 2001-2011, Huawei Tech. Co., Ltd.
+ * File Name     : hinic3_nic_cmd.h
+ * Version : Initial Draft
+ * Created : 2019/4/25
+ * Last Modified :
+ * Description : NIC Commands between Driver and MPU
+ * Function List :
+ */
+
+#ifndef HINIC3_NIC_CMD_H
+#define HINIC3_NIC_CMD_H
+
+/* Commands between NIC to MPU
+ */
+enum hinic3_nic_cmd {
+ HINIC3_NIC_CMD_VF_REGISTER = 0, /* only for PFD and VFD */
+
+ /* FUNC CFG */
+ HINIC3_NIC_CMD_SET_FUNC_TBL = 5,
+ HINIC3_NIC_CMD_SET_VPORT_ENABLE,
+ HINIC3_NIC_CMD_SET_RX_MODE,
+ HINIC3_NIC_CMD_SQ_CI_ATTR_SET,
+ HINIC3_NIC_CMD_GET_VPORT_STAT,
+ HINIC3_NIC_CMD_CLEAN_VPORT_STAT,
+ HINIC3_NIC_CMD_CLEAR_QP_RESOURCE,
+ HINIC3_NIC_CMD_CFG_FLEX_QUEUE,
+ /* LRO CFG */
+ HINIC3_NIC_CMD_CFG_RX_LRO,
+ HINIC3_NIC_CMD_CFG_LRO_TIMER,
+ HINIC3_NIC_CMD_FEATURE_NEGO,
+ HINIC3_NIC_CMD_CFG_LOCAL_LRO_STATE,
+
+ HINIC3_NIC_CMD_CACHE_OUT_QP_RES,
+
+ /* MAC & VLAN CFG */
+ HINIC3_NIC_CMD_GET_MAC = 20,
+ HINIC3_NIC_CMD_SET_MAC,
+ HINIC3_NIC_CMD_DEL_MAC,
+ HINIC3_NIC_CMD_UPDATE_MAC,
+ HINIC3_NIC_CMD_GET_ALL_DEFAULT_MAC,
+
+ HINIC3_NIC_CMD_CFG_FUNC_VLAN,
+ HINIC3_NIC_CMD_SET_VLAN_FILTER_EN,
+ HINIC3_NIC_CMD_SET_RX_VLAN_OFFLOAD,
+ HINIC3_NIC_CMD_SMAC_CHECK_STATE,
+
+ /* SR-IOV */
+ HINIC3_NIC_CMD_CFG_VF_VLAN = 40,
+ HINIC3_NIC_CMD_SET_SPOOPCHK_STATE,
+ /* RATE LIMIT */
+ HINIC3_NIC_CMD_SET_MAX_MIN_RATE,
+
+ /* RSS CFG */
+ HINIC3_NIC_CMD_RSS_CFG = 60,
+	HINIC3_NIC_CMD_RSS_TEMP_MGR, /* TODO: delete after implementing the nego cmd */
+ HINIC3_NIC_CMD_GET_RSS_CTX_TBL, /* TODO: delete: move to ucode cmd */
+ HINIC3_NIC_CMD_CFG_RSS_HASH_KEY,
+ HINIC3_NIC_CMD_CFG_RSS_HASH_ENGINE,
+ HINIC3_NIC_CMD_SET_RSS_CTX_TBL_INTO_FUNC,
+ /* IP checksum error packets, enable rss quadruple hash */
+ HINIC3_NIC_CMD_IPCS_ERR_RSS_ENABLE_OP = 66,
+
+ /* PPA/FDIR */
+ HINIC3_NIC_CMD_ADD_TC_FLOW = 80,
+ HINIC3_NIC_CMD_DEL_TC_FLOW,
+ HINIC3_NIC_CMD_GET_TC_FLOW,
+ HINIC3_NIC_CMD_FLUSH_TCAM,
+ HINIC3_NIC_CMD_CFG_TCAM_BLOCK,
+ HINIC3_NIC_CMD_ENABLE_TCAM,
+ HINIC3_NIC_CMD_GET_TCAM_BLOCK,
+ HINIC3_NIC_CMD_CFG_PPA_TABLE_ID,
+ HINIC3_NIC_CMD_SET_PPA_EN = 88,
+ HINIC3_NIC_CMD_CFG_PPA_MODE,
+ HINIC3_NIC_CMD_CFG_PPA_FLUSH,
+ HINIC3_NIC_CMD_SET_FDIR_STATUS,
+ HINIC3_NIC_CMD_GET_PPA_COUNTER,
+
+ /* PORT CFG */
+ HINIC3_NIC_CMD_SET_PORT_ENABLE = 100,
+ HINIC3_NIC_CMD_CFG_PAUSE_INFO,
+
+ HINIC3_NIC_CMD_SET_PORT_CAR,
+ HINIC3_NIC_CMD_SET_ER_DROP_PKT,
+
+ HINIC3_NIC_CMD_VF_COS,
+ HINIC3_NIC_CMD_SETUP_COS_MAPPING,
+ HINIC3_NIC_CMD_SET_ETS,
+ HINIC3_NIC_CMD_SET_PFC,
+ HINIC3_NIC_CMD_QOS_ETS,
+ HINIC3_NIC_CMD_QOS_PFC,
+ HINIC3_NIC_CMD_QOS_DCB_STATE,
+ HINIC3_NIC_CMD_QOS_PORT_CFG,
+ HINIC3_NIC_CMD_QOS_MAP_CFG,
+ HINIC3_NIC_CMD_FORCE_PKT_DROP,
+ HINIC3_NIC_CMD_TX_PAUSE_EXCP_NOTICE = 118,
+ HINIC3_NIC_CMD_INQUIRT_PAUSE_CFG = 119,
+
+ /* MISC */
+ HINIC3_NIC_CMD_BIOS_CFG = 120,
+ HINIC3_NIC_CMD_SET_FIRMWARE_CUSTOM_PACKETS_MSG,
+
+ /* BOND */
+ HINIC3_NIC_CMD_BOND_DEV_CREATE = 134,
+ HINIC3_NIC_CMD_BOND_DEV_DELETE,
+ HINIC3_NIC_CMD_BOND_DEV_OPEN_CLOSE,
+ HINIC3_NIC_CMD_BOND_INFO_GET,
+ HINIC3_NIC_CMD_BOND_ACTIVE_INFO_GET,
+ HINIC3_NIC_CMD_BOND_ACTIVE_NOTICE,
+
+ /* DFX */
+ HINIC3_NIC_CMD_GET_SM_TABLE = 140,
+ HINIC3_NIC_CMD_RD_LINE_TBL,
+
+ HINIC3_NIC_CMD_SET_UCAPTURE_OPT = 160, /* TODO: move to roce */
+ HINIC3_NIC_CMD_SET_VHD_CFG,
+
+ /* TODO: move to HILINK */
+ HINIC3_NIC_CMD_GET_PORT_STAT = 200,
+ HINIC3_NIC_CMD_CLEAN_PORT_STAT,
+ HINIC3_NIC_CMD_CFG_LOOPBACK_MODE,
+ HINIC3_NIC_CMD_GET_SFP_QSFP_INFO,
+ HINIC3_NIC_CMD_SET_SFP_STATUS,
+ HINIC3_NIC_CMD_GET_LIGHT_MODULE_ABS,
+ HINIC3_NIC_CMD_GET_LINK_INFO,
+ HINIC3_NIC_CMD_CFG_AN_TYPE,
+ HINIC3_NIC_CMD_GET_PORT_INFO,
+ HINIC3_NIC_CMD_SET_LINK_SETTINGS,
+ HINIC3_NIC_CMD_ACTIVATE_BIOS_LINK_CFG,
+ HINIC3_NIC_CMD_RESTORE_LINK_CFG,
+ HINIC3_NIC_CMD_SET_LINK_FOLLOW,
+ HINIC3_NIC_CMD_GET_LINK_STATE,
+ HINIC3_NIC_CMD_LINK_STATUS_REPORT,
+ HINIC3_NIC_CMD_CABLE_PLUG_EVENT,
+ HINIC3_NIC_CMD_LINK_ERR_EVENT,
+ HINIC3_NIC_CMD_SET_LED_STATUS,
+
+ HINIC3_NIC_CMD_MAX = 256,
+};
+
+/* NIC CMDQ MODE */
+enum hinic3_ucode_cmd {
+ HINIC3_UCODE_CMD_MODIFY_QUEUE_CTX = 0,
+ HINIC3_UCODE_CMD_CLEAN_QUEUE_CONTEXT,
+ HINIC3_UCODE_CMD_ARM_SQ,
+ HINIC3_UCODE_CMD_ARM_RQ,
+ HINIC3_UCODE_CMD_SET_RSS_INDIR_TABLE,
+ HINIC3_UCODE_CMD_SET_RSS_CONTEXT_TABLE,
+ HINIC3_UCODE_CMD_GET_RSS_INDIR_TABLE,
+ HINIC3_UCODE_CMD_GET_RSS_CONTEXT_TABLE,
+ HINIC3_UCODE_CMD_SET_IQ_ENABLE,
+ HINIC3_UCODE_CMD_SET_RQ_FLUSH = 10,
+ HINIC3_UCODE_CMD_MODIFY_VLAN_CTX,
+ HINIC3_UCODE_CMD_PPA_HASH_TABLE,
+ HINIC3_UCODE_CMD_RXQ_INFO_GET = 13,
+};
+
+#endif /* HINIC3_NIC_CMD_H */
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_nic_dbg.c b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_dbg.c
new file mode 100644
index 000000000000..17d48c4d6e51
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_dbg.c
@@ -0,0 +1,146 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC]" fmt
+
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/types.h>
+
+#include "ossl_knl.h"
+#include "hinic3_crm.h"
+#include "hinic3_hw.h"
+#include "hinic3_mt.h"
+#include "hinic3_nic_qp.h"
+#include "hinic3_nic_io.h"
+#include "hinic3_srv_nic.h"
+#include "hinic3_nic.h"
+
+int hinic3_dbg_get_wqe_info(void *hwdev, u16 q_id, u16 idx, u16 wqebb_cnt,
+ u8 *wqe, const u16 *wqe_size, enum hinic3_queue_type q_type)
+{
+ struct hinic3_io_queue *queue = NULL;
+ struct hinic3_nic_io *nic_io = NULL;
+ void *src_wqebb = NULL;
+ u32 i, offset;
+
+ if (!hwdev) {
+ pr_err("hwdev is NULL.\n");
+ return -EINVAL;
+ }
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (q_id >= nic_io->num_qps) {
+		pr_err("q_id[%u] >= num_qps_cfg[%u].\n", q_id, nic_io->num_qps);
+ return -EINVAL;
+ }
+
+ queue = (q_type == HINIC3_SQ) ? &nic_io->sq[q_id] : &nic_io->rq[q_id];
+
+ if ((idx + wqebb_cnt) > queue->wq.q_depth) {
+		pr_err("(idx[%u] + wqebb_cnt[%u]) > q_depth[%u].\n", idx, wqebb_cnt, queue->wq.q_depth);
+ return -EINVAL;
+ }
+
+ if (*wqe_size != (queue->wq.wqebb_size * wqebb_cnt)) {
+		pr_err("Unexpected out buf size from user: %u, expect: %d\n",
+ *wqe_size, (queue->wq.wqebb_size * wqebb_cnt));
+ return -EINVAL;
+ }
+
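+	/* copy each WQE basic block, wrapping the index with the queue depth mask */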
+ for (i = 0; i < wqebb_cnt; i++) {
+ src_wqebb = hinic3_wq_wqebb_addr(&queue->wq, (u16)WQ_MASK_IDX(&queue->wq, idx + i));
+ offset = queue->wq.wqebb_size * i;
+ memcpy(wqe + offset, src_wqebb, queue->wq.wqebb_size);
+ }
+
+ return 0;
+}
+
+int hinic3_dbg_get_sq_info(void *hwdev, u16 q_id, struct nic_sq_info *sq_info,
+ u32 msg_size)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ struct hinic3_io_queue *sq = NULL;
+
+ if (!hwdev || !sq_info) {
+ pr_err("hwdev or sq_info is NULL.\n");
+ return -EINVAL;
+ }
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (q_id >= nic_io->num_qps) {
+ nic_err(nic_io->dev_hdl, "Input queue id(%u) is larger than the actual queue number\n",
+ q_id);
+ return -EINVAL;
+ }
+
+ if (msg_size != sizeof(*sq_info)) {
+		nic_err(nic_io->dev_hdl, "Unexpected out buf size from user: %u, expect: %lu\n",
+ msg_size, sizeof(*sq_info));
+ return -EINVAL;
+ }
+
+ sq = &nic_io->sq[q_id];
+
+ sq_info->q_id = q_id;
+ sq_info->pi = hinic3_get_sq_local_pi(sq);
+ sq_info->ci = hinic3_get_sq_local_ci(sq);
+ sq_info->fi = hinic3_get_sq_hw_ci(sq);
+ sq_info->q_depth = sq->wq.q_depth;
+ sq_info->wqebb_size = sq->wq.wqebb_size;
+
+ sq_info->ci_addr = sq->tx.cons_idx_addr;
+
+ sq_info->cla_addr = sq->wq.wq_block_paddr;
+ sq_info->slq_handle = sq;
+
+ sq_info->doorbell.map_addr = (u64 *)sq->db_addr;
+
+ return 0;
+}
+
+int hinic3_dbg_get_rq_info(void *hwdev, u16 q_id, struct nic_rq_info *rq_info,
+ u32 msg_size)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ struct hinic3_io_queue *rq = NULL;
+
+ if (!hwdev || !rq_info) {
+ pr_err("hwdev or rq_info is NULL.\n");
+ return -EINVAL;
+ }
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (q_id >= nic_io->num_qps) {
+ nic_err(nic_io->dev_hdl, "Input queue id(%u) is larger than the actual queue number\n",
+ q_id);
+ return -EINVAL;
+ }
+
+ if (msg_size != sizeof(*rq_info)) {
+		nic_err(nic_io->dev_hdl, "Unexpected out buf size from user: %u, expect: %lu\n",
+ msg_size, sizeof(*rq_info));
+ return -EINVAL;
+ }
+
+ rq = &nic_io->rq[q_id];
+
+ rq_info->q_id = q_id;
+
+ rq_info->hw_pi = cpu_to_be16(*rq->rx.pi_virt_addr);
+
+ rq_info->wqebb_size = rq->wq.wqebb_size;
+ rq_info->q_depth = (u16)rq->wq.q_depth;
+
+ rq_info->buf_len = nic_io->rx_buff_len;
+
+ rq_info->slq_handle = rq;
+
+ rq_info->ci_wqe_page_addr = hinic3_wq_get_first_wqe_page_addr(&rq->wq);
+ rq_info->ci_cla_tbl_addr = rq->wq.wq_block_paddr;
+
+ rq_info->msix_idx = rq->msix_entry_idx;
+
+ return 0;
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_nic_dbg.h b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_dbg.h
new file mode 100644
index 000000000000..4ba96d5fbb32
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_dbg.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_NIC_DBG_H
+#define HINIC3_NIC_DBG_H
+
+#include "hinic3_mt.h"
+#include "hinic3_nic_io.h"
+#include "hinic3_srv_nic.h"
+
+int hinic3_dbg_get_sq_info(void *hwdev, u16 q_id, struct nic_sq_info *sq_info,
+ u32 msg_size);
+
+int hinic3_dbg_get_rq_info(void *hwdev, u16 q_id, struct nic_rq_info *rq_info,
+ u32 msg_size);
+
+int hinic3_dbg_get_wqe_info(void *hwdev, u16 q_id, u16 idx, u16 wqebb_cnt,
+ u8 *wqe, const u16 *wqe_size,
+ enum hinic3_queue_type q_type);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_nic_dev.h b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_dev.h
new file mode 100644
index 000000000000..2967311aab76
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_dev.h
@@ -0,0 +1,387 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_NIC_DEV_H
+#define HINIC3_NIC_DEV_H
+
+#include <linux/netdevice.h>
+#include <linux/semaphore.h>
+#include <linux/types.h>
+#include <linux/bitops.h>
+
+#include "ossl_knl.h"
+#include "hinic3_nic_io.h"
+#include "hinic3_nic_cfg.h"
+#include "hinic3_tx.h"
+#include "hinic3_rx.h"
+#include "hinic3_dcb.h"
+
+#define HINIC3_NIC_DRV_NAME "hinic3"
+#define HINIC3_NIC_DRV_VERSION ""
+
+#define HINIC3_FUNC_IS_VF(hwdev) (hinic3_func_type(hwdev) == TYPE_VF)
+
+#define HINIC3_AVG_PKT_SMALL 256U
+#define HINIC3_MODERATONE_DELAY HZ
+
+#define LP_PKT_CNT 64
+
+enum hinic3_flags {
+ HINIC3_INTF_UP,
+ HINIC3_MAC_FILTER_CHANGED,
+ HINIC3_LP_TEST,
+ HINIC3_RSS_ENABLE,
+ HINIC3_DCB_ENABLE,
+ HINIC3_SAME_RXTX,
+ HINIC3_INTR_ADAPT,
+ HINIC3_UPDATE_MAC_FILTER,
+ HINIC3_CHANGE_RES_INVALID,
+ HINIC3_RSS_DEFAULT_INDIR,
+ HINIC3_FORCE_LINK_UP,
+ HINIC3_BONDING_MASTER,
+ HINIC3_AUTONEG_RESET,
+ HINIC3_RXQ_RECOVERY,
+};
+
+#define HINIC3_CHANNEL_RES_VALID(nic_dev) \
+ (test_bit(HINIC3_INTF_UP, &(nic_dev)->flags) && \
+ !test_bit(HINIC3_CHANGE_RES_INVALID, &(nic_dev)->flags))
+
+#define RX_BUFF_NUM_PER_PAGE 2
+
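+/* vlan_bitmap is an array of unsigned longs covering all VLAN_N_VID ids;
+ * VID_LINE/VID_COL locate a vid's word and bit within it.
+ */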
+#define VLAN_BITMAP_BYTE_SIZE(nic_dev) (sizeof(*(nic_dev)->vlan_bitmap))
+#define VLAN_BITMAP_BITS_SIZE(nic_dev) (VLAN_BITMAP_BYTE_SIZE(nic_dev) * 8)
+#define VLAN_NUM_BITMAPS(nic_dev) (VLAN_N_VID / \
+ VLAN_BITMAP_BITS_SIZE(nic_dev))
+#define VLAN_BITMAP_SIZE(nic_dev) (VLAN_N_VID / \
+ VLAN_BITMAP_BYTE_SIZE(nic_dev))
+#define VID_LINE(nic_dev, vid) ((vid) / VLAN_BITMAP_BITS_SIZE(nic_dev))
+#define VID_COL(nic_dev, vid) ((vid) & (VLAN_BITMAP_BITS_SIZE(nic_dev) - 1))
+
+#define NIC_DRV_DEFAULT_FEATURE NIC_F_ALL_MASK
+
+enum hinic3_event_work_flags {
+ EVENT_WORK_TX_TIMEOUT,
+};
+
+enum hinic3_rx_mode_state {
+ HINIC3_HW_PROMISC_ON,
+ HINIC3_HW_ALLMULTI_ON,
+ HINIC3_PROMISC_FORCE_ON,
+ HINIC3_ALLMULTI_FORCE_ON,
+};
+
+enum mac_filter_state {
+ HINIC3_MAC_WAIT_HW_SYNC,
+ HINIC3_MAC_HW_SYNCED,
+ HINIC3_MAC_WAIT_HW_UNSYNC,
+ HINIC3_MAC_HW_UNSYNCED,
+};
+
+struct hinic3_mac_filter {
+ struct list_head list;
+ u8 addr[ETH_ALEN];
+ unsigned long state;
+};
+
+struct hinic3_irq {
+ struct net_device *netdev;
+ /* IRQ corresponding index number */
+ u16 msix_entry_idx;
+ u16 rsvd1;
+ u32 irq_id; /* The IRQ number from OS */
+
+ char irq_name[IFNAMSIZ + 16];
+ struct napi_struct napi;
+ cpumask_t affinity_mask;
+ struct hinic3_txq *txq;
+ struct hinic3_rxq *rxq;
+};
+
+struct hinic3_intr_coal_info {
+ u8 pending_limt;
+ u8 coalesce_timer_cfg;
+ u8 resend_timer_cfg;
+
+ u64 pkt_rate_low;
+ u8 rx_usecs_low;
+ u8 rx_pending_limt_low;
+ u64 pkt_rate_high;
+ u8 rx_usecs_high;
+ u8 rx_pending_limt_high;
+
+ u8 user_set_intr_coal_flag;
+};
+
+struct hinic3_dyna_txrxq_params {
+ u16 num_qps;
+ u8 num_cos;
+ u8 rsvd1;
+ u32 sq_depth;
+ u32 rq_depth;
+
+ struct hinic3_dyna_txq_res *txqs_res;
+ struct hinic3_dyna_rxq_res *rxqs_res;
+ struct hinic3_irq *irq_cfg;
+};
+
+#define HINIC3_NIC_STATS_INC(nic_dev, field) \
+do { \
+ u64_stats_update_begin(&(nic_dev)->stats.syncp); \
+ (nic_dev)->stats.field++; \
+ u64_stats_update_end(&(nic_dev)->stats.syncp); \
+} while (0)
+
+struct hinic3_nic_stats {
+ u64 netdev_tx_timeout;
+
+ /* Subdivision statistics show in private tool */
+ u64 tx_carrier_off_drop;
+ u64 tx_invalid_qid;
+ u64 rsvd1;
+ u64 rsvd2;
+#ifdef HAVE_NDO_GET_STATS64
+ struct u64_stats_sync syncp;
+#else
+ struct u64_stats_sync_empty syncp;
+#endif
+};
+
+#define HINIC3_TCAM_DYNAMIC_BLOCK_SIZE 16
+#define HINIC3_MAX_TCAM_FILTERS 512
+
+#define HINIC3_PKT_TCAM_DYNAMIC_INDEX_START(block_index) \
+ (HINIC3_TCAM_DYNAMIC_BLOCK_SIZE * (block_index))
+
+struct hinic3_rx_flow_rule {
+ struct list_head rules;
+ int tot_num_rules;
+};
+
+struct hinic3_tcam_dynamic_block {
+ struct list_head block_list;
+ u16 dynamic_block_id;
+ u16 dynamic_index_cnt;
+ u8 dynamic_index_used[HINIC3_TCAM_DYNAMIC_BLOCK_SIZE];
+};
+
+struct hinic3_tcam_dynamic_block_info {
+ struct list_head tcam_dynamic_list;
+ u16 dynamic_block_cnt;
+};
+
+struct hinic3_tcam_filter {
+ struct list_head tcam_filter_list;
+ u16 dynamic_block_id;
+ u16 index;
+ struct tag_tcam_key tcam_key;
+ u16 queue;
+};
+
+/* function level struct info */
+struct hinic3_tcam_info {
+ u16 tcam_rule_nums;
+ struct list_head tcam_list;
+ struct hinic3_tcam_dynamic_block_info tcam_dynamic_info;
+};
+
+struct hinic3_nic_dev {
+ struct pci_dev *pdev;
+ struct net_device *netdev;
+ struct hinic3_lld_dev *lld_dev;
+ void *hwdev;
+
+ int poll_weight;
+ u32 rsvd1;
+ unsigned long *vlan_bitmap;
+
+ u16 max_qps;
+
+ u32 msg_enable;
+ unsigned long flags;
+
+ u32 lro_replenish_thld;
+ u32 dma_rx_buff_size;
+ u16 rx_buff_len;
+ u32 page_order;
+
+	/* RSS related variables */
+ u8 rss_hash_engine;
+ struct nic_rss_type rss_type;
+ u8 *rss_hkey;
+ /* hkey in big endian */
+ u32 *rss_hkey_be;
+ u32 *rss_indir;
+
+ u8 cos_config_num_max;
+ u8 func_dft_cos_bitmap;
+	u16 port_dft_cos_bitmap; /* used for tool validity check */
+
+ struct hinic3_dcb_config hw_dcb_cfg;
+ struct hinic3_dcb_config wanted_dcb_cfg;
+ struct hinic3_dcb_config dcb_cfg;
+ unsigned long dcb_flags;
+ int disable_port_cnt;
+	/* lock for disabling or enabling traffic flow */
+ struct semaphore dcb_sem;
+
+ struct hinic3_intr_coal_info *intr_coalesce;
+ unsigned long last_moder_jiffies;
+ u32 adaptive_rx_coal;
+ u8 intr_coal_set_flag;
+
+#ifndef HAVE_NETDEV_STATS_IN_NETDEV
+ struct net_device_stats net_stats;
+#endif
+
+ struct hinic3_nic_stats stats;
+
+ /* lock for nic resource */
+ struct mutex nic_mutex;
+ bool force_port_disable;
+ struct semaphore port_state_sem;
+ u8 link_status;
+
+ struct nic_service_cap nic_cap;
+
+ struct hinic3_txq *txqs;
+ struct hinic3_rxq *rxqs;
+ struct hinic3_dyna_txrxq_params q_params;
+
+ u16 num_qp_irq;
+ struct irq_info *qps_irq_info;
+
+ struct workqueue_struct *workq;
+
+ struct work_struct rx_mode_work;
+ struct delayed_work moderation_task;
+
+ struct list_head uc_filter_list;
+ struct list_head mc_filter_list;
+ unsigned long rx_mod_state;
+ int netdev_uc_cnt;
+ int netdev_mc_cnt;
+
+ int lb_test_rx_idx;
+ int lb_pkt_len;
+ u8 *lb_test_rx_buf;
+
+ struct hinic3_tcam_info tcam;
+ struct hinic3_rx_flow_rule rx_flow_rule;
+
+#ifdef HAVE_XDP_SUPPORT
+ struct bpf_prog *xdp_prog;
+#endif
+
+ struct delayed_work periodic_work;
+ /* reference to enum hinic3_event_work_flags */
+ unsigned long event_flag;
+
+ struct hinic3_nic_prof_attr *prof_attr;
+ struct hinic3_prof_adapter *prof_adap;
+ u64 rsvd8[7];
+ u32 rsvd9;
+ u32 rxq_get_err_times;
+ struct delayed_work rxq_check_work;
+};
+
+#define hinic_msg(level, nic_dev, msglvl, format, arg...) \
+do { \
+ if ((nic_dev)->netdev && (nic_dev)->netdev->reg_state \
+ == NETREG_REGISTERED) \
+ nicif_##level((nic_dev), msglvl, (nic_dev)->netdev, \
+ format, ## arg); \
+ else \
+ nic_##level(&(nic_dev)->pdev->dev, \
+ format, ## arg); \
+} while (0)
+
+#define hinic3_info(nic_dev, msglvl, format, arg...) \
+ hinic_msg(info, nic_dev, msglvl, format, ## arg)
+
+#define hinic3_warn(nic_dev, msglvl, format, arg...) \
+ hinic_msg(warn, nic_dev, msglvl, format, ## arg)
+
+#define hinic3_err(nic_dev, msglvl, format, arg...) \
+ hinic_msg(err, nic_dev, msglvl, format, ## arg)
+
+struct hinic3_uld_info *get_nic_uld_info(void);
+
+u32 hinic3_get_io_stats_size(const struct hinic3_nic_dev *nic_dev);
+
+void hinic3_get_io_stats(const struct hinic3_nic_dev *nic_dev, void *stats);
+
+int hinic3_open(struct net_device *netdev);
+
+int hinic3_close(struct net_device *netdev);
+
+void hinic3_set_ethtool_ops(struct net_device *netdev);
+
+void hinic3vf_set_ethtool_ops(struct net_device *netdev);
+
+int nic_ioctl(void *uld_dev, u32 cmd, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size);
+
+void hinic3_update_num_qps(struct net_device *netdev);
+
+int hinic3_qps_irq_init(struct hinic3_nic_dev *nic_dev);
+
+void hinic3_qps_irq_deinit(struct hinic3_nic_dev *nic_dev);
+
+void hinic3_set_netdev_ops(struct hinic3_nic_dev *nic_dev);
+
+bool hinic3_is_netdev_ops_match(const struct net_device *netdev);
+
+int hinic3_set_hw_features(struct hinic3_nic_dev *nic_dev);
+
+void hinic3_set_rx_mode_work(struct work_struct *work);
+
+void hinic3_clean_mac_list_filter(struct hinic3_nic_dev *nic_dev);
+
+void hinic3_get_strings(struct net_device *netdev, u32 stringset, u8 *data);
+
+void hinic3_get_ethtool_stats(struct net_device *netdev,
+ struct ethtool_stats *stats, u64 *data);
+
+int hinic3_get_sset_count(struct net_device *netdev, int sset);
+
+int hinic3_force_port_disable(struct hinic3_nic_dev *nic_dev);
+
+int hinic3_force_set_port_state(struct hinic3_nic_dev *nic_dev, bool enable);
+
+int hinic3_maybe_set_port_state(struct hinic3_nic_dev *nic_dev, bool enable);
+
+#ifdef ETHTOOL_GLINKSETTINGS
+#ifndef XENSERVER_HAVE_NEW_ETHTOOL_OPS
+int hinic3_get_link_ksettings(struct net_device *netdev,
+ struct ethtool_link_ksettings *link_settings);
+int hinic3_set_link_ksettings(struct net_device *netdev,
+ const struct ethtool_link_ksettings
+ *link_settings);
+#endif
+#endif
+
+#ifndef HAVE_NEW_ETHTOOL_LINK_SETTINGS_ONLY
+int hinic3_get_settings(struct net_device *netdev, struct ethtool_cmd *ep);
+int hinic3_set_settings(struct net_device *netdev,
+ struct ethtool_cmd *link_settings);
+#endif
+
+void hinic3_auto_moderation_work(struct work_struct *work);
+
+typedef void (*hinic3_reopen_handler)(struct hinic3_nic_dev *nic_dev,
+ const void *priv_data);
+int hinic3_change_channel_settings(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_dyna_txrxq_params *trxq_params,
+ hinic3_reopen_handler reopen_handler,
+ const void *priv_data);
+
+void hinic3_link_status_change(struct hinic3_nic_dev *nic_dev, bool status);
+
+#ifdef HAVE_XDP_SUPPORT
+bool hinic3_is_xdp_enable(struct hinic3_nic_dev *nic_dev);
+int hinic3_xdp_max_mtu(struct hinic3_nic_dev *nic_dev);
+#endif
+
+#endif
+
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_nic_event.c b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_event.c
new file mode 100644
index 000000000000..57cf07cee554
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_event.c
@@ -0,0 +1,580 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC]" fmt
+
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/etherdevice.h>
+#include <linux/if_vlan.h>
+#include <linux/ethtool.h>
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/netdevice.h>
+#include <linux/module.h>
+
+#include "ossl_knl.h"
+#include "hinic3_crm.h"
+#include "hinic3_hw.h"
+#include "hinic3_nic_io.h"
+#include "hinic3_nic_cfg.h"
+#include "hinic3_srv_nic.h"
+#include "hinic3_nic.h"
+#include "hinic3_nic_cmd.h"
+
+static int hinic3_init_vf_config(struct hinic3_nic_io *nic_io, u16 vf_id)
+{
+ struct vf_data_storage *vf_info;
+ u16 func_id;
+ int err = 0;
+
+ vf_info = nic_io->vf_infos + HW_VF_ID_TO_OS(vf_id);
+ ether_addr_copy(vf_info->drv_mac_addr, vf_info->user_mac_addr);
+ if (!is_zero_ether_addr(vf_info->drv_mac_addr)) {
+ vf_info->use_specified_mac = true;
+ func_id = hinic3_glb_pf_vf_offset(nic_io->hwdev) + vf_id;
+
+ err = hinic3_set_mac(nic_io->hwdev, vf_info->drv_mac_addr,
+ vf_info->pf_vlan, func_id,
+ HINIC3_CHANNEL_NIC);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to set VF %d MAC\n",
+ HW_VF_ID_TO_OS(vf_id));
+ return err;
+ }
+ } else {
+ vf_info->use_specified_mac = false;
+ }
+
+ if (hinic3_vf_info_vlanprio(nic_io->hwdev, vf_id)) {
+ err = hinic3_cfg_vf_vlan(nic_io, HINIC3_CMD_OP_ADD,
+ vf_info->pf_vlan, vf_info->pf_qos,
+ vf_id);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to add VF %d VLAN_QOS\n",
+ HW_VF_ID_TO_OS(vf_id));
+ return err;
+ }
+ }
+
+ if (vf_info->max_rate) {
+ err = hinic3_set_vf_tx_rate(nic_io->hwdev, vf_id,
+ vf_info->max_rate,
+ vf_info->min_rate);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to set VF %d max rate %u, min rate %u\n",
+ HW_VF_ID_TO_OS(vf_id), vf_info->max_rate,
+ vf_info->min_rate);
+ return err;
+ }
+ }
+
+ return 0;
+}
+
+static int register_vf_msg_handler(struct hinic3_nic_io *nic_io, u16 vf_id)
+{
+ int err;
+
+ if (vf_id > nic_io->max_vfs) {
+		nic_err(nic_io->dev_hdl, "Register VF id %d exceeds limit [0-%d]\n",
+ HW_VF_ID_TO_OS(vf_id), HW_VF_ID_TO_OS(nic_io->max_vfs));
+ return -EFAULT;
+ }
+
+ err = hinic3_init_vf_config(nic_io, vf_id);
+ if (err)
+ return err;
+
+ nic_io->vf_infos[HW_VF_ID_TO_OS(vf_id)].registered = true;
+
+ return 0;
+}
+
+static int unregister_vf_msg_handler(struct hinic3_nic_io *nic_io, u16 vf_id)
+{
+ struct vf_data_storage *vf_info =
+ nic_io->vf_infos + HW_VF_ID_TO_OS(vf_id);
+ struct hinic3_port_mac_set mac_info;
+ u16 out_size = sizeof(mac_info);
+ int err;
+
+ if (vf_id > nic_io->max_vfs)
+ return -EFAULT;
+
+ vf_info->registered = false;
+
+ memset(&mac_info, 0, sizeof(mac_info));
+ mac_info.func_id = hinic3_glb_pf_vf_offset(nic_io->hwdev) + (u16)vf_id;
+ mac_info.vlan_id = vf_info->pf_vlan;
+ ether_addr_copy(mac_info.mac, vf_info->drv_mac_addr);
+
+ if (vf_info->use_specified_mac || vf_info->pf_vlan) {
+ err = l2nic_msg_to_mgmt_sync(nic_io->hwdev,
+ HINIC3_NIC_CMD_DEL_MAC,
+ &mac_info, sizeof(mac_info),
+ &mac_info, &out_size);
+ if (err || mac_info.msg_head.status || !out_size) {
+ nic_err(nic_io->dev_hdl, "Failed to delete VF %d MAC, err: %d, status: 0x%x, out size: 0x%x\n",
+ HW_VF_ID_TO_OS(vf_id), err,
+ mac_info.msg_head.status, out_size);
+ return -EFAULT;
+ }
+ }
+
+ memset(vf_info->drv_mac_addr, 0, ETH_ALEN);
+
+ return 0;
+}
+
+static int hinic3_register_vf_msg_handler(struct hinic3_nic_io *nic_io,
+ u16 vf_id, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ struct hinic3_cmd_register_vf *register_vf = buf_in;
+ struct hinic3_cmd_register_vf *register_info = buf_out;
+ struct vf_data_storage *vf_info = nic_io->vf_infos + HW_VF_ID_TO_OS(vf_id);
+ int err;
+
+ if (register_vf->op_register) {
+ vf_info->support_extra_feature = register_vf->support_extra_feature;
+ err = register_vf_msg_handler(nic_io, vf_id);
+ } else {
+ err = unregister_vf_msg_handler(nic_io, vf_id);
+ vf_info->support_extra_feature = 0;
+ }
+
+ if (err)
+ register_info->msg_head.status = EFAULT;
+
+ *out_size = sizeof(*register_info);
+
+ return 0;
+}
+
+void hinic3_unregister_vf(struct hinic3_nic_io *nic_io, u16 vf_id)
+{
+ struct vf_data_storage *vf_info = nic_io->vf_infos + HW_VF_ID_TO_OS(vf_id);
+
+ unregister_vf_msg_handler(nic_io, vf_id);
+ vf_info->support_extra_feature = 0;
+}
+
+static int hinic3_get_vf_cos_msg_handler(struct hinic3_nic_io *nic_io,
+ u16 vf_id, void *buf_in,
+ u16 in_size, void *buf_out,
+ u16 *out_size)
+{
+ struct hinic3_cmd_vf_dcb_state *dcb_state = buf_out;
+
+ memcpy(&dcb_state->state, &nic_io->dcb_state,
+ sizeof(nic_io->dcb_state));
+
+ dcb_state->msg_head.status = 0;
+ *out_size = sizeof(*dcb_state);
+ return 0;
+}
+
+static int hinic3_get_vf_mac_msg_handler(struct hinic3_nic_io *nic_io, u16 vf,
+ void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ struct vf_data_storage *vf_info = nic_io->vf_infos + HW_VF_ID_TO_OS(vf);
+	struct hinic3_port_mac_set *mac_info = buf_out;
+	int err;
+
+ if (HINIC3_SUPPORT_VF_MAC(nic_io->hwdev)) {
+ err = l2nic_msg_to_mgmt_sync(nic_io->hwdev, HINIC3_NIC_CMD_GET_MAC, buf_in,
+ in_size, buf_out, out_size);
+ if (!err) {
+ if (is_zero_ether_addr(mac_info->mac))
+ ether_addr_copy(mac_info->mac, vf_info->drv_mac_addr);
+ }
+ return err;
+ }
+
+ ether_addr_copy(mac_info->mac, vf_info->drv_mac_addr);
+ mac_info->msg_head.status = 0;
+ *out_size = sizeof(*mac_info);
+
+ return 0;
+}
+
+static int hinic3_set_vf_mac_msg_handler(struct hinic3_nic_io *nic_io, u16 vf,
+ void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ struct vf_data_storage *vf_info = nic_io->vf_infos + HW_VF_ID_TO_OS(vf);
+ struct hinic3_port_mac_set *mac_in = buf_in;
+ struct hinic3_port_mac_set *mac_out = buf_out;
+ int err;
+
+ if (vf_info->use_specified_mac && !vf_info->trust &&
+ is_valid_ether_addr(mac_in->mac)) {
+ nic_warn(nic_io->dev_hdl, "PF has already set VF %d MAC address, and vf trust is off.\n",
+ HW_VF_ID_TO_OS(vf));
+ mac_out->msg_head.status = HINIC3_PF_SET_VF_ALREADY;
+ *out_size = sizeof(*mac_out);
+ return 0;
+ }
+
+ if (is_valid_ether_addr(mac_in->mac))
+ mac_in->vlan_id = vf_info->pf_vlan;
+
+ err = l2nic_msg_to_mgmt_sync(nic_io->hwdev, HINIC3_NIC_CMD_SET_MAC,
+ buf_in, in_size, buf_out, out_size);
+ if (err || !(*out_size)) {
+		nic_err(nic_io->dev_hdl, "Failed to set VF %d MAC address, err: %d, status: 0x%x, out size: 0x%x\n",
+ HW_VF_ID_TO_OS(vf), err, mac_out->msg_head.status,
+ *out_size);
+ return -EFAULT;
+ }
+
+ if (is_valid_ether_addr(mac_in->mac) && !mac_out->msg_head.status)
+ ether_addr_copy(vf_info->drv_mac_addr, mac_in->mac);
+
+ return err;
+}
+
+static int hinic3_del_vf_mac_msg_handler(struct hinic3_nic_io *nic_io, u16 vf,
+ void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ struct vf_data_storage *vf_info = nic_io->vf_infos + HW_VF_ID_TO_OS(vf);
+ struct hinic3_port_mac_set *mac_in = buf_in;
+ struct hinic3_port_mac_set *mac_out = buf_out;
+ int err;
+
+ if (vf_info->use_specified_mac && !vf_info->trust &&
+ is_valid_ether_addr(mac_in->mac)) {
+ nic_warn(nic_io->dev_hdl, "PF has already set VF %d MAC address, and vf trust is off.\n",
+ HW_VF_ID_TO_OS(vf));
+ mac_out->msg_head.status = HINIC3_PF_SET_VF_ALREADY;
+ *out_size = sizeof(*mac_out);
+ return 0;
+ }
+
+ if (is_valid_ether_addr(mac_in->mac))
+ mac_in->vlan_id = vf_info->pf_vlan;
+
+ err = l2nic_msg_to_mgmt_sync(nic_io->hwdev, HINIC3_NIC_CMD_DEL_MAC,
+ buf_in, in_size, buf_out, out_size);
+ if (err || !(*out_size)) {
+ nic_err(nic_io->dev_hdl, "Failed to delete VF %d MAC, err: %d, status: 0x%x, out size: 0x%x\n",
+ HW_VF_ID_TO_OS(vf), err, mac_out->msg_head.status,
+ *out_size);
+ return -EFAULT;
+ }
+
+ if (is_valid_ether_addr(mac_in->mac) && !mac_out->msg_head.status)
+ eth_zero_addr(vf_info->drv_mac_addr);
+
+ return err;
+}
+
+static int hinic3_update_vf_mac_msg_handler(struct hinic3_nic_io *nic_io,
+ u16 vf, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ struct vf_data_storage *vf_info = nic_io->vf_infos + HW_VF_ID_TO_OS(vf);
+ struct hinic3_port_mac_update *mac_in = buf_in;
+ struct hinic3_port_mac_update *mac_out = buf_out;
+ int err;
+
+ if (!is_valid_ether_addr(mac_in->new_mac)) {
+ nic_err(nic_io->dev_hdl, "Update VF MAC is invalid.\n");
+ return -EINVAL;
+ }
+
+#ifndef __VMWARE__
+ if (vf_info->use_specified_mac && !vf_info->trust) {
+ nic_warn(nic_io->dev_hdl, "PF has already set VF %d MAC address, and vf trust is off.\n",
+ HW_VF_ID_TO_OS(vf));
+ mac_out->msg_head.status = HINIC3_PF_SET_VF_ALREADY;
+ *out_size = sizeof(*mac_out);
+ return 0;
+ }
+#else
+ err = hinic_config_vf_request(nic_io->hwdev->pcidev_hdl,
+ HW_VF_ID_TO_OS(vf),
+ HINIC_CFG_VF_MAC_CHANGED,
+ (void *)mac_in->new_mac);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to config VF %d MAC request, err: %d\n",
+ HW_VF_ID_TO_OS(vf), err);
+ return err;
+ }
+#endif
+ mac_in->vlan_id = vf_info->pf_vlan;
+ err = l2nic_msg_to_mgmt_sync(nic_io->hwdev, HINIC3_NIC_CMD_UPDATE_MAC,
+ buf_in, in_size, buf_out, out_size);
+ if (err || !(*out_size)) {
+		nic_warn(nic_io->dev_hdl, "Failed to update VF %d MAC, err: %d, status: 0x%x, out size: 0x%x\n",
+ HW_VF_ID_TO_OS(vf), err, mac_out->msg_head.status,
+ *out_size);
+ return -EFAULT;
+ }
+
+ if (!mac_out->msg_head.status)
+ ether_addr_copy(vf_info->drv_mac_addr, mac_in->new_mac);
+
+ return err;
+}
+
+const struct vf_msg_handler vf_cmd_handler[] = {
+ {
+ .cmd = HINIC3_NIC_CMD_VF_REGISTER,
+ .handler = hinic3_register_vf_msg_handler,
+ },
+
+ {
+ .cmd = HINIC3_NIC_CMD_GET_MAC,
+ .handler = hinic3_get_vf_mac_msg_handler,
+ },
+
+ {
+ .cmd = HINIC3_NIC_CMD_SET_MAC,
+ .handler = hinic3_set_vf_mac_msg_handler,
+ },
+
+ {
+ .cmd = HINIC3_NIC_CMD_DEL_MAC,
+ .handler = hinic3_del_vf_mac_msg_handler,
+ },
+
+ {
+ .cmd = HINIC3_NIC_CMD_UPDATE_MAC,
+ .handler = hinic3_update_vf_mac_msg_handler,
+ },
+
+ {
+ .cmd = HINIC3_NIC_CMD_VF_COS,
+ .handler = hinic3_get_vf_cos_msg_handler
+ },
+};
+
+static int _l2nic_msg_to_mgmt_sync(void *hwdev, u16 cmd, void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size,
+ u16 channel)
+{
+ u32 i, cmd_cnt = ARRAY_LEN(vf_cmd_handler);
+ bool cmd_to_pf = false;
+
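+	/* A VF tunnels commands that the PF must handle on its behalf
+	 * through the PF mailbox; all other commands go directly to the
+	 * management CPU.
+	 */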
+ if (hinic3_func_type(hwdev) == TYPE_VF) {
+ for (i = 0; i < cmd_cnt; i++) {
+ if (cmd == vf_cmd_handler[i].cmd)
+ cmd_to_pf = true;
+ }
+ }
+
+ if (cmd_to_pf)
+ return hinic3_mbox_to_pf(hwdev, HINIC3_MOD_L2NIC, cmd, buf_in,
+ in_size, buf_out, out_size, 0,
+ channel);
+
+ return hinic3_msg_to_mgmt_sync(hwdev, HINIC3_MOD_L2NIC, cmd, buf_in,
+ in_size, buf_out, out_size, 0, channel);
+}
+
+int l2nic_msg_to_mgmt_sync(void *hwdev, u16 cmd, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ return _l2nic_msg_to_mgmt_sync(hwdev, cmd, buf_in, in_size, buf_out,
+ out_size, HINIC3_CHANNEL_NIC);
+}
+
+int l2nic_msg_to_mgmt_sync_ch(void *hwdev, u16 cmd, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size, u16 channel)
+{
+ return _l2nic_msg_to_mgmt_sync(hwdev, cmd, buf_in, in_size, buf_out,
+ out_size, channel);
+}
+
+/* PF/PPF handles mailbox messages from a VF */
+int hinic3_pf_mbox_handler(void *hwdev,
+ u16 vf_id, u16 cmd, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ u32 index, cmd_size = ARRAY_LEN(vf_cmd_handler);
+ struct hinic3_nic_io *nic_io = NULL;
+
+ if (!hwdev)
+ return -EFAULT;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ for (index = 0; index < cmd_size; index++) {
+ if (cmd == vf_cmd_handler[index].cmd)
+ return vf_cmd_handler[index].handler(nic_io, vf_id,
+ buf_in, in_size,
+ buf_out, out_size);
+ }
+
+ nic_warn(nic_io->dev_hdl, "NO handler for nic cmd(%u) received from vf id: %u\n",
+ cmd, vf_id);
+
+ return -EINVAL;
+}
+
+void hinic3_notify_dcb_state_event(struct hinic3_nic_io *nic_io,
+ struct hinic3_dcb_state *dcb_state)
+{
+ struct hinic3_event_info event_info = {0};
+ int i;
+/*lint -e679*/
+ if (dcb_state->trust == HINIC3_DCB_PCP)
+		/* log the 8 user-priority-to-CoS mappings */
+ sdk_info(nic_io->dev_hdl, "DCB %s, default cos %u, pcp2cos %u%u%u%u%u%u%u%u\n",
+ dcb_state->dcb_on ? "on" : "off", dcb_state->default_cos,
+ dcb_state->pcp2cos[ARRAY_INDEX_0], dcb_state->pcp2cos[ARRAY_INDEX_1],
+ dcb_state->pcp2cos[ARRAY_INDEX_2], dcb_state->pcp2cos[ARRAY_INDEX_3],
+ dcb_state->pcp2cos[ARRAY_INDEX_4], dcb_state->pcp2cos[ARRAY_INDEX_5],
+ dcb_state->pcp2cos[ARRAY_INDEX_6], dcb_state->pcp2cos[ARRAY_INDEX_7]);
+ else
+ for (i = 0; i < NIC_DCB_DSCP_NUM; i++) {
+ sdk_info(nic_io->dev_hdl,
+ "DCB %s, default cos %u, dscp2cos %u%u%u%u%u%u%u%u\n",
+ dcb_state->dcb_on ? "on" : "off", dcb_state->default_cos,
+ dcb_state->dscp2cos[ARRAY_INDEX_0 + i * NIC_DCB_DSCP_NUM],
+ dcb_state->dscp2cos[ARRAY_INDEX_1 + i * NIC_DCB_DSCP_NUM],
+ dcb_state->dscp2cos[ARRAY_INDEX_2 + i * NIC_DCB_DSCP_NUM],
+ dcb_state->dscp2cos[ARRAY_INDEX_3 + i * NIC_DCB_DSCP_NUM],
+ dcb_state->dscp2cos[ARRAY_INDEX_4 + i * NIC_DCB_DSCP_NUM],
+ dcb_state->dscp2cos[ARRAY_INDEX_5 + i * NIC_DCB_DSCP_NUM],
+ dcb_state->dscp2cos[ARRAY_INDEX_6 + i * NIC_DCB_DSCP_NUM],
+ dcb_state->dscp2cos[ARRAY_INDEX_7 + i * NIC_DCB_DSCP_NUM]);
+ }
+/*lint +e679*/
+	/* Save the state in the SDK for stateful modules */
+ hinic3_save_dcb_state(nic_io, dcb_state);
+
+ event_info.service = EVENT_SRV_NIC;
+ event_info.type = EVENT_NIC_DCB_STATE_CHANGE;
+ memcpy((void *)event_info.event_data, dcb_state, sizeof(*dcb_state));
+
+ hinic3_event_callback(nic_io->hwdev, &event_info);
+}
+
+static void dcb_state_event(void *hwdev, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ struct hinic3_cmd_vf_dcb_state *vf_dcb;
+ struct hinic3_nic_io *nic_io;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ vf_dcb = buf_in;
+ if (!vf_dcb)
+ return;
+
+ hinic3_notify_dcb_state_event(nic_io, &vf_dcb->state);
+}
+
+static void tx_pause_excp_event_handler(void *hwdev, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ struct nic_cmd_tx_pause_notice *excp_info = buf_in;
+ struct hinic3_nic_io *nic_io = NULL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ if (in_size != sizeof(*excp_info)) {
+		nic_err(nic_io->dev_hdl, "Invalid in_size: %u, should be %zu\n",
+			in_size, sizeof(*excp_info));
+ return;
+ }
+
+ nic_warn(nic_io->dev_hdl, "Receive tx pause exception event, excp: %u, level: %u\n",
+ excp_info->tx_pause_except, excp_info->except_level);
+
+ hinic3_fault_event_report(hwdev, HINIC3_FAULT_SRC_TX_PAUSE_EXCP,
+ (u16)excp_info->except_level);
+}
+
+static void bond_active_event_handler(void *hwdev, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ struct hinic3_bond_active_report_info *active_info = buf_in;
+ struct hinic3_nic_io *nic_io = NULL;
+ struct hinic3_event_info event_info = {0};
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ if (in_size != sizeof(*active_info)) {
+		nic_err(nic_io->dev_hdl, "Invalid in_size: %u, should be %zu\n",
+			in_size, sizeof(*active_info));
+ return;
+ }
+
+ event_info.service = EVENT_SRV_NIC;
+ event_info.type = HINIC3_NIC_CMD_BOND_ACTIVE_NOTICE;
+ memcpy((void *)event_info.event_data, active_info, sizeof(*active_info));
+
+ hinic3_event_callback(nic_io->hwdev, &event_info);
+}
+
+static const struct nic_event_handler nic_cmd_handler[] = {
+ {
+ .cmd = HINIC3_NIC_CMD_VF_COS,
+ .handler = dcb_state_event,
+ },
+ {
+ .cmd = HINIC3_NIC_CMD_TX_PAUSE_EXCP_NOTICE,
+ .handler = tx_pause_excp_event_handler,
+ },
+ {
+ .cmd = HINIC3_NIC_CMD_BOND_ACTIVE_NOTICE,
+ .handler = bond_active_event_handler,
+ },
+};
+
+static int _event_handler(void *hwdev, u16 cmd, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+	u32 size = ARRAY_LEN(nic_cmd_handler);
+ u32 i;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ *out_size = 0;
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ for (i = 0; i < size; i++) {
+ if (cmd == nic_cmd_handler[i].cmd) {
+ nic_cmd_handler[i].handler(hwdev, buf_in, in_size,
+ buf_out, out_size);
+ return 0;
+ }
+ }
+
+ /* can't find this event cmd */
+ sdk_warn(nic_io->dev_hdl, "Unsupported nic event, cmd: %u\n", cmd);
+ *out_size = sizeof(struct mgmt_msg_head);
+ ((struct mgmt_msg_head *)buf_out)->status = HINIC3_MGMT_CMD_UNSUPPORTED;
+
+ return 0;
+}
+
+/* VF handles mailbox messages from the PF/PPF:
+ * link change events and (TBD) fault report events
+ */
+int hinic3_vf_event_handler(void *hwdev,
+ u16 cmd, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ return _event_handler(hwdev, cmd, buf_in, in_size, buf_out, out_size);
+}
+
+/* PF/PPF handles NIC events reported by the management CPU */
+void hinic3_pf_event_handler(void *hwdev, u16 cmd,
+ void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ _event_handler(hwdev, cmd, buf_in, in_size, buf_out, out_size);
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_nic_io.c b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_io.c
new file mode 100644
index 000000000000..22670ffe7ebf
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_io.c
@@ -0,0 +1,1122 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC]" fmt
+
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/types.h>
+#include <linux/module.h>
+
+#include "ossl_knl.h"
+#include "hinic3_crm.h"
+#include "hinic3_hw.h"
+#include "hinic3_common.h"
+#include "hinic3_nic_qp.h"
+#include "hinic3_nic_cfg.h"
+#include "hinic3_srv_nic.h"
+#include "hinic3_nic.h"
+#include "hinic3_nic_cmd.h"
+#include "hinic3_nic_io.h"
+
+#define HINIC3_DEFAULT_TX_CI_PENDING_LIMIT 1
+#define HINIC3_DEFAULT_TX_CI_COALESCING_TIME 1
+#define HINIC3_DEFAULT_DROP_THD_ON (0xFFFF)
+#define HINIC3_DEFAULT_DROP_THD_OFF 0
+/*lint -e806*/
+static unsigned char tx_pending_limit = HINIC3_DEFAULT_TX_CI_PENDING_LIMIT;
+module_param(tx_pending_limit, byte, 0444);
+MODULE_PARM_DESC(tx_pending_limit, "TX CI coalescing parameter pending_limit (default=1)");
+
+static unsigned char tx_coalescing_time = HINIC3_DEFAULT_TX_CI_COALESCING_TIME;
+module_param(tx_coalescing_time, byte, 0444);
+MODULE_PARM_DESC(tx_coalescing_time, "TX CI coalescing parameter coalescing_time (default=1)");
+
+static unsigned char rq_wqe_type = HINIC3_NORMAL_RQ_WQE;
+module_param(rq_wqe_type, byte, 0444);
+MODULE_PARM_DESC(rq_wqe_type, "RQ WQE type 0-8Bytes, 1-16Bytes, 2-32Bytes (default=1)");
+
+/*lint +e806*/
+static u32 tx_drop_thd_on = HINIC3_DEFAULT_DROP_THD_ON;
+module_param(tx_drop_thd_on, uint, 0644);
+MODULE_PARM_DESC(tx_drop_thd_on, "TX parameter drop_thd_on (default=0xffff)");
+
+static u32 tx_drop_thd_off = HINIC3_DEFAULT_DROP_THD_OFF;
+module_param(tx_drop_thd_off, uint, 0644);
+MODULE_PARM_DESC(tx_drop_thd_off, "TX parameter drop_thd_off (default=0)");
+/* performance: CI address is 64B (cache line) aligned */
+#define HINIC3_CI_Q_ADDR_SIZE (64)
+
+#define CI_TABLE_SIZE(num_qps, pg_sz) \
+ (ALIGN((num_qps) * HINIC3_CI_Q_ADDR_SIZE, pg_sz))
+
+#define HINIC3_CI_VADDR(base_addr, q_id) ((u8 *)(base_addr) + \
+ (q_id) * HINIC3_CI_Q_ADDR_SIZE)
+
+#define HINIC3_CI_PADDR(base_paddr, q_id) ((base_paddr) + \
+ (q_id) * HINIC3_CI_Q_ADDR_SIZE)
+
+#define WQ_PREFETCH_MAX 4
+#define WQ_PREFETCH_MIN 1
+#define WQ_PREFETCH_THRESHOLD 256
+
+#define HINIC3_Q_CTXT_MAX 31 /* (2048 - 8) / 64 */
+
+enum hinic3_qp_ctxt_type {
+ HINIC3_QP_CTXT_TYPE_SQ,
+ HINIC3_QP_CTXT_TYPE_RQ,
+};
+
+struct hinic3_qp_ctxt_header {
+ u16 num_queues;
+ u16 queue_type;
+ u16 start_qid;
+ u16 rsvd;
+};
+
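+/* mirrors the hardware SQ context layout; the contents are converted
+ * to big endian before being pushed through the command queue
+ */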
+struct hinic3_sq_ctxt {
+ u32 ci_pi;
+ u32 drop_mode_sp;
+ u32 wq_pfn_hi_owner;
+ u32 wq_pfn_lo;
+
+ u32 rsvd0;
+ u32 pkt_drop_thd;
+ u32 global_sq_id;
+ u32 vlan_ceq_attr;
+
+ u32 pref_cache;
+ u32 pref_ci_owner;
+ u32 pref_wq_pfn_hi_ci;
+ u32 pref_wq_pfn_lo;
+
+ u32 rsvd8;
+ u32 rsvd9;
+ u32 wq_block_pfn_hi;
+ u32 wq_block_pfn_lo;
+};
+
+struct hinic3_rq_ctxt {
+ u32 ci_pi;
+ u32 ceq_attr;
+ u32 wq_pfn_hi_type_owner;
+ u32 wq_pfn_lo;
+
+ u32 rsvd[3];
+ u32 cqe_sge_len;
+
+ u32 pref_cache;
+ u32 pref_ci_owner;
+ u32 pref_wq_pfn_hi_ci;
+ u32 pref_wq_pfn_lo;
+
+ u32 pi_paddr_hi;
+ u32 pi_paddr_lo;
+ u32 wq_block_pfn_hi;
+ u32 wq_block_pfn_lo;
+};
+
+struct hinic3_sq_ctxt_block {
+ struct hinic3_qp_ctxt_header cmdq_hdr;
+ struct hinic3_sq_ctxt sq_ctxt[HINIC3_Q_CTXT_MAX];
+};
+
+struct hinic3_rq_ctxt_block {
+ struct hinic3_qp_ctxt_header cmdq_hdr;
+ struct hinic3_rq_ctxt rq_ctxt[HINIC3_Q_CTXT_MAX];
+};
+
+struct hinic3_clean_queue_ctxt {
+ struct hinic3_qp_ctxt_header cmdq_hdr;
+ u32 rsvd;
+};
+
+#define SQ_CTXT_SIZE(num_sqs) ((u16)(sizeof(struct hinic3_qp_ctxt_header) \
+ + (num_sqs) * sizeof(struct hinic3_sq_ctxt)))
+
+#define RQ_CTXT_SIZE(num_rqs) ((u16)(sizeof(struct hinic3_qp_ctxt_header) \
+ + (num_rqs) * sizeof(struct hinic3_rq_ctxt)))
+
+#define CI_IDX_HIGH_SHIFT 12
+
+#define CI_HIGH_IDX(val) ((val) >> CI_IDX_HIGH_SHIFT)
+
+#define SQ_CTXT_PI_IDX_SHIFT 0
+#define SQ_CTXT_CI_IDX_SHIFT 16
+
+#define SQ_CTXT_PI_IDX_MASK 0xFFFFU
+#define SQ_CTXT_CI_IDX_MASK 0xFFFFU
+
+#define SQ_CTXT_CI_PI_SET(val, member) (((val) & \
+ SQ_CTXT_##member##_MASK) \
+ << SQ_CTXT_##member##_SHIFT)
+
+#define SQ_CTXT_MODE_SP_FLAG_SHIFT 0
+#define SQ_CTXT_MODE_PKT_DROP_SHIFT 1
+
+#define SQ_CTXT_MODE_SP_FLAG_MASK 0x1U
+#define SQ_CTXT_MODE_PKT_DROP_MASK 0x1U
+
+#define SQ_CTXT_MODE_SET(val, member) (((val) & \
+ SQ_CTXT_MODE_##member##_MASK) \
+ << SQ_CTXT_MODE_##member##_SHIFT)
+
+#define SQ_CTXT_WQ_PAGE_HI_PFN_SHIFT 0
+#define SQ_CTXT_WQ_PAGE_OWNER_SHIFT 23
+
+#define SQ_CTXT_WQ_PAGE_HI_PFN_MASK 0xFFFFFU
+#define SQ_CTXT_WQ_PAGE_OWNER_MASK 0x1U
+
+#define SQ_CTXT_WQ_PAGE_SET(val, member) (((val) & \
+ SQ_CTXT_WQ_PAGE_##member##_MASK) \
+ << SQ_CTXT_WQ_PAGE_##member##_SHIFT)
+
+#define SQ_CTXT_PKT_DROP_THD_ON_SHIFT 0
+#define SQ_CTXT_PKT_DROP_THD_OFF_SHIFT 16
+
+#define SQ_CTXT_PKT_DROP_THD_ON_MASK 0xFFFFU
+#define SQ_CTXT_PKT_DROP_THD_OFF_MASK 0xFFFFU
+
+#define SQ_CTXT_PKT_DROP_THD_SET(val, member) (((val) & \
+ SQ_CTXT_PKT_DROP_##member##_MASK) \
+ << SQ_CTXT_PKT_DROP_##member##_SHIFT)
+
+#define SQ_CTXT_GLOBAL_SQ_ID_SHIFT 0
+
+#define SQ_CTXT_GLOBAL_SQ_ID_MASK 0x1FFFU
+
+#define SQ_CTXT_GLOBAL_QUEUE_ID_SET(val, member) (((val) & \
+ SQ_CTXT_##member##_MASK) \
+ << SQ_CTXT_##member##_SHIFT)
+
+#define SQ_CTXT_VLAN_TAG_SHIFT 0
+#define SQ_CTXT_VLAN_TYPE_SEL_SHIFT 16
+#define SQ_CTXT_VLAN_INSERT_MODE_SHIFT 19
+#define SQ_CTXT_VLAN_CEQ_EN_SHIFT 23
+
+#define SQ_CTXT_VLAN_TAG_MASK 0xFFFFU
+#define SQ_CTXT_VLAN_TYPE_SEL_MASK 0x7U
+#define SQ_CTXT_VLAN_INSERT_MODE_MASK 0x3U
+#define SQ_CTXT_VLAN_CEQ_EN_MASK 0x1U
+
+#define SQ_CTXT_VLAN_CEQ_SET(val, member) (((val) & \
+ SQ_CTXT_VLAN_##member##_MASK) \
+ << SQ_CTXT_VLAN_##member##_SHIFT)
+
+#define SQ_CTXT_PREF_CACHE_THRESHOLD_SHIFT 0
+#define SQ_CTXT_PREF_CACHE_MAX_SHIFT 14
+#define SQ_CTXT_PREF_CACHE_MIN_SHIFT 25
+
+#define SQ_CTXT_PREF_CACHE_THRESHOLD_MASK 0x3FFFU
+#define SQ_CTXT_PREF_CACHE_MAX_MASK 0x7FFU
+#define SQ_CTXT_PREF_CACHE_MIN_MASK 0x7FU
+
+#define SQ_CTXT_PREF_CI_HI_SHIFT 0
+#define SQ_CTXT_PREF_OWNER_SHIFT 4
+
+#define SQ_CTXT_PREF_CI_HI_MASK 0xFU
+#define SQ_CTXT_PREF_OWNER_MASK 0x1U
+
+#define SQ_CTXT_PREF_WQ_PFN_HI_SHIFT 0
+#define SQ_CTXT_PREF_CI_LOW_SHIFT 20
+
+#define SQ_CTXT_PREF_WQ_PFN_HI_MASK 0xFFFFFU
+#define SQ_CTXT_PREF_CI_LOW_MASK 0xFFFU
+
+#define SQ_CTXT_PREF_SET(val, member) (((val) & \
+ SQ_CTXT_PREF_##member##_MASK) \
+ << SQ_CTXT_PREF_##member##_SHIFT)
+
+#define SQ_CTXT_WQ_BLOCK_PFN_HI_SHIFT 0
+
+#define SQ_CTXT_WQ_BLOCK_PFN_HI_MASK 0x7FFFFFU
+
+#define SQ_CTXT_WQ_BLOCK_SET(val, member) (((val) & \
+ SQ_CTXT_WQ_BLOCK_##member##_MASK) \
+ << SQ_CTXT_WQ_BLOCK_##member##_SHIFT)
+
+#define RQ_CTXT_PI_IDX_SHIFT 0
+#define RQ_CTXT_CI_IDX_SHIFT 16
+
+#define RQ_CTXT_PI_IDX_MASK 0xFFFFU
+#define RQ_CTXT_CI_IDX_MASK 0xFFFFU
+
+#define RQ_CTXT_CI_PI_SET(val, member) (((val) & \
+ RQ_CTXT_##member##_MASK) \
+ << RQ_CTXT_##member##_SHIFT)
+
+#define RQ_CTXT_CEQ_ATTR_INTR_SHIFT 21
+#define RQ_CTXT_CEQ_ATTR_EN_SHIFT 31
+
+#define RQ_CTXT_CEQ_ATTR_INTR_MASK 0x3FFU
+#define RQ_CTXT_CEQ_ATTR_EN_MASK 0x1U
+
+#define RQ_CTXT_CEQ_ATTR_SET(val, member) (((val) & \
+ RQ_CTXT_CEQ_ATTR_##member##_MASK) \
+ << RQ_CTXT_CEQ_ATTR_##member##_SHIFT)
+
+#define RQ_CTXT_WQ_PAGE_HI_PFN_SHIFT 0
+#define RQ_CTXT_WQ_PAGE_WQE_TYPE_SHIFT 28
+#define RQ_CTXT_WQ_PAGE_OWNER_SHIFT 31
+
+#define RQ_CTXT_WQ_PAGE_HI_PFN_MASK 0xFFFFFU
+#define RQ_CTXT_WQ_PAGE_WQE_TYPE_MASK 0x3U
+#define RQ_CTXT_WQ_PAGE_OWNER_MASK 0x1U
+
+#define RQ_CTXT_WQ_PAGE_SET(val, member) (((val) & \
+ RQ_CTXT_WQ_PAGE_##member##_MASK) << \
+ RQ_CTXT_WQ_PAGE_##member##_SHIFT)
+
+#define RQ_CTXT_CQE_LEN_SHIFT 28
+
+#define RQ_CTXT_CQE_LEN_MASK 0x3U
+
+#define RQ_CTXT_CQE_LEN_SET(val, member) (((val) & \
+ RQ_CTXT_##member##_MASK) << \
+ RQ_CTXT_##member##_SHIFT)
+
+#define RQ_CTXT_PREF_CACHE_THRESHOLD_SHIFT 0
+#define RQ_CTXT_PREF_CACHE_MAX_SHIFT 14
+#define RQ_CTXT_PREF_CACHE_MIN_SHIFT 25
+
+#define RQ_CTXT_PREF_CACHE_THRESHOLD_MASK 0x3FFFU
+#define RQ_CTXT_PREF_CACHE_MAX_MASK 0x7FFU
+#define RQ_CTXT_PREF_CACHE_MIN_MASK 0x7FU
+
+#define RQ_CTXT_PREF_CI_HI_SHIFT 0
+#define RQ_CTXT_PREF_OWNER_SHIFT 4
+
+#define RQ_CTXT_PREF_CI_HI_MASK 0xFU
+#define RQ_CTXT_PREF_OWNER_MASK 0x1U
+
+#define RQ_CTXT_PREF_WQ_PFN_HI_SHIFT 0
+#define RQ_CTXT_PREF_CI_LOW_SHIFT 20
+
+#define RQ_CTXT_PREF_WQ_PFN_HI_MASK 0xFFFFFU
+#define RQ_CTXT_PREF_CI_LOW_MASK 0xFFFU
+
+#define RQ_CTXT_PREF_SET(val, member) (((val) & \
+ RQ_CTXT_PREF_##member##_MASK) << \
+ RQ_CTXT_PREF_##member##_SHIFT)
+
+#define RQ_CTXT_WQ_BLOCK_PFN_HI_SHIFT 0
+
+#define RQ_CTXT_WQ_BLOCK_PFN_HI_MASK 0x7FFFFFU
+
+#define RQ_CTXT_WQ_BLOCK_SET(val, member) (((val) & \
+ RQ_CTXT_WQ_BLOCK_##member##_MASK) << \
+ RQ_CTXT_WQ_BLOCK_##member##_SHIFT)
+
+#define SIZE_16BYTES(size) (ALIGN((size), 16) >> 4)
+
+#define WQ_PAGE_PFN_SHIFT 12
+#define WQ_BLOCK_PFN_SHIFT 9
+
+#define WQ_PAGE_PFN(page_addr) ((page_addr) >> WQ_PAGE_PFN_SHIFT)
+#define WQ_BLOCK_PFN(page_addr) ((page_addr) >> WQ_BLOCK_PFN_SHIFT)
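+/* WQ pages are addressed as 4KB frames, WQ block entries in 512B units */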
+
+/* sq and rq */
+#define TOTAL_DB_NUM(num_qps) ((u16)(2 * (num_qps)))
+
+static int hinic3_create_sq(struct hinic3_nic_io *nic_io, struct hinic3_io_queue *sq,
+ u16 q_id, u32 sq_depth, u16 sq_msix_idx)
+{
+ int err;
+
+	/* the owner bit is used by the SQ; hardware requires it initialized to 1 */
+ sq->owner = 1;
+
+ sq->q_id = q_id;
+ sq->msix_entry_idx = sq_msix_idx;
+
+ err = hinic3_wq_create(nic_io->hwdev, &sq->wq, sq_depth,
+ (u16)BIT(HINIC3_SQ_WQEBB_SHIFT));
+ if (err) {
+ sdk_err(nic_io->dev_hdl, "Failed to create tx queue(%u) wq\n",
+ q_id);
+ return err;
+ }
+
+ return 0;
+}
+
+static void hinic3_destroy_sq(struct hinic3_nic_io *nic_io, struct hinic3_io_queue *sq)
+{
+ hinic3_wq_destroy(&sq->wq);
+}
+
+static int hinic3_create_rq(struct hinic3_nic_io *nic_io, struct hinic3_io_queue *rq,
+ u16 q_id, u32 rq_depth, u16 rq_msix_idx)
+{
+ int err;
+
+ rq->wqe_type = rq_wqe_type;
+ rq->q_id = q_id;
+ rq->msix_entry_idx = rq_msix_idx;
+
+ err = hinic3_wq_create(nic_io->hwdev, &rq->wq, rq_depth,
+ (u16)BIT(HINIC3_RQ_WQEBB_SHIFT + rq_wqe_type));
+ if (err) {
+ sdk_err(nic_io->dev_hdl, "Failed to create rx queue(%u) wq\n",
+ q_id);
+ return err;
+ }
+
+ rq->rx.pi_virt_addr = dma_zalloc_coherent(nic_io->dev_hdl, PAGE_SIZE,
+ &rq->rx.pi_dma_addr,
+ GFP_KERNEL);
+ if (!rq->rx.pi_virt_addr) {
+ hinic3_wq_destroy(&rq->wq);
+ nic_err(nic_io->dev_hdl, "Failed to allocate rq pi virt addr\n");
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static void hinic3_destroy_rq(struct hinic3_nic_io *nic_io, struct hinic3_io_queue *rq)
+{
+ dma_free_coherent(nic_io->dev_hdl, PAGE_SIZE, rq->rx.pi_virt_addr,
+ rq->rx.pi_dma_addr);
+
+ hinic3_wq_destroy(&rq->wq);
+}
+
+static int create_qp(struct hinic3_nic_io *nic_io, struct hinic3_io_queue *sq,
+ struct hinic3_io_queue *rq, u16 q_id, u32 sq_depth,
+ u32 rq_depth, u16 qp_msix_idx)
+{
+ int err;
+
+ err = hinic3_create_sq(nic_io, sq, q_id, sq_depth, qp_msix_idx);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to create sq, qid: %u\n",
+ q_id);
+ return err;
+ }
+
+ err = hinic3_create_rq(nic_io, rq, q_id, rq_depth, qp_msix_idx);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to create rq, qid: %u\n",
+ q_id);
+ goto create_rq_err;
+ }
+
+ return 0;
+
+create_rq_err:
+ hinic3_destroy_sq(nic_io, sq);
+
+ return err;
+}
+
+static void destroy_qp(struct hinic3_nic_io *nic_io, struct hinic3_io_queue *sq,
+ struct hinic3_io_queue *rq)
+{
+ hinic3_destroy_sq(nic_io, sq);
+ hinic3_destroy_rq(nic_io, rq);
+}
+
+int hinic3_init_nicio_res(void *hwdev)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ void __iomem *db_base = NULL;
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io) {
+ pr_err("Failed to get nic service adapter\n");
+ return -EFAULT;
+ }
+
+ nic_io->max_qps = hinic3_func_max_qnum(hwdev);
+
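+	/* allocate two doorbell pages: one shared by all SQs, one by all RQs */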
+ err = hinic3_alloc_db_addr(hwdev, &db_base, NULL);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to allocate doorbell for sqs\n");
+ return -ENOMEM;
+ }
+ nic_io->sqs_db_addr = (u8 *)db_base;
+
+ err = hinic3_alloc_db_addr(hwdev, &db_base, NULL);
+ if (err) {
+ hinic3_free_db_addr(hwdev, nic_io->sqs_db_addr, NULL);
+ nic_err(nic_io->dev_hdl, "Failed to allocate doorbell for rqs\n");
+ return -ENOMEM;
+ }
+ nic_io->rqs_db_addr = (u8 *)db_base;
+
+ nic_io->ci_vaddr_base =
+ dma_zalloc_coherent(nic_io->dev_hdl,
+ CI_TABLE_SIZE(nic_io->max_qps, PAGE_SIZE),
+ &nic_io->ci_dma_base, GFP_KERNEL);
+ if (!nic_io->ci_vaddr_base) {
+ hinic3_free_db_addr(hwdev, nic_io->sqs_db_addr, NULL);
+ hinic3_free_db_addr(hwdev, nic_io->rqs_db_addr, NULL);
+ nic_err(nic_io->dev_hdl, "Failed to allocate ci area\n");
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+void hinic3_deinit_nicio_res(void *hwdev)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+ if (!hwdev)
+ return;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io) {
+ pr_err("Failed to get nic service adapter\n");
+ return;
+ }
+
+ dma_free_coherent(nic_io->dev_hdl,
+ CI_TABLE_SIZE(nic_io->max_qps, PAGE_SIZE),
+ nic_io->ci_vaddr_base, nic_io->ci_dma_base);
+ /* free all doorbell */
+ hinic3_free_db_addr(hwdev, nic_io->sqs_db_addr, NULL);
+ hinic3_free_db_addr(hwdev, nic_io->rqs_db_addr, NULL);
+}
+
+int hinic3_alloc_qps(void *hwdev, struct irq_info *qps_msix_arry,
+ struct hinic3_dyna_qp_params *qp_params)
+{
+ struct hinic3_io_queue *sqs = NULL;
+ struct hinic3_io_queue *rqs = NULL;
+ struct hinic3_nic_io *nic_io = NULL;
+ u16 q_id, i, num_qps;
+ int err;
+
+ if (!hwdev || !qps_msix_arry || !qp_params)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io) {
+ pr_err("Failed to get nic service adapter\n");
+ return -EFAULT;
+ }
+
+ if (qp_params->num_qps > nic_io->max_qps || !qp_params->num_qps)
+ return -EINVAL;
+
+ num_qps = qp_params->num_qps;
+ sqs = kcalloc(num_qps, sizeof(*sqs), GFP_KERNEL);
+ if (!sqs) {
+ nic_err(nic_io->dev_hdl, "Failed to allocate sq\n");
+ err = -ENOMEM;
+ goto alloc_sqs_err;
+ }
+
+ rqs = kcalloc(num_qps, sizeof(*rqs), GFP_KERNEL);
+ if (!rqs) {
+ nic_err(nic_io->dev_hdl, "Failed to allocate rq\n");
+ err = -ENOMEM;
+ goto alloc_rqs_err;
+ }
+
+ for (q_id = 0; q_id < num_qps; q_id++) {
+ err = create_qp(nic_io, &sqs[q_id], &rqs[q_id], q_id, qp_params->sq_depth,
+ qp_params->rq_depth, qps_msix_arry[q_id].msix_entry_idx);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to allocate qp %u, err: %d\n", q_id, err);
+ goto create_qp_err;
+ }
+ }
+
+ qp_params->sqs = sqs;
+ qp_params->rqs = rqs;
+
+ return 0;
+
+create_qp_err:
+ for (i = 0; i < q_id; i++)
+ destroy_qp(nic_io, &sqs[i], &rqs[i]);
+
+ kfree(rqs);
+
+alloc_rqs_err:
+ kfree(sqs);
+
+alloc_sqs_err:
+
+ return err;
+}
+
+void hinic3_free_qps(void *hwdev, struct hinic3_dyna_qp_params *qp_params)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ u16 q_id;
+
+ if (!hwdev || !qp_params)
+ return;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io) {
+ pr_err("Failed to get nic service adapter\n");
+ return;
+ }
+
+ for (q_id = 0; q_id < qp_params->num_qps; q_id++)
+ destroy_qp(nic_io, &qp_params->sqs[q_id],
+ &qp_params->rqs[q_id]);
+
+ kfree(qp_params->sqs);
+ kfree(qp_params->rqs);
+}
+
+static void init_qps_info(struct hinic3_nic_io *nic_io,
+ struct hinic3_dyna_qp_params *qp_params)
+{
+ struct hinic3_io_queue *sqs = qp_params->sqs;
+ struct hinic3_io_queue *rqs = qp_params->rqs;
+ u16 q_id;
+
+ nic_io->num_qps = qp_params->num_qps;
+ nic_io->sq = qp_params->sqs;
+ nic_io->rq = qp_params->rqs;
+ for (q_id = 0; q_id < nic_io->num_qps; q_id++) {
+ sqs[q_id].tx.cons_idx_addr =
+ HINIC3_CI_VADDR(nic_io->ci_vaddr_base, q_id);
+ /* clear ci value */
+ *(u16 *)sqs[q_id].tx.cons_idx_addr = 0;
+ sqs[q_id].db_addr = nic_io->sqs_db_addr;
+
+		/* SQs use the first doorbell page, RQs use the second */
+ rqs[q_id].db_addr = nic_io->rqs_db_addr;
+ }
+}
+
+int hinic3_init_qps(void *hwdev, struct hinic3_dyna_qp_params *qp_params)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+ if (!hwdev || !qp_params)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io) {
+ pr_err("Failed to get nic service adapter\n");
+ return -EFAULT;
+ }
+
+ init_qps_info(nic_io, qp_params);
+
+ return hinic3_init_qp_ctxts(hwdev);
+}
+
+void hinic3_deinit_qps(void *hwdev, struct hinic3_dyna_qp_params *qp_params)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+ if (!hwdev || !qp_params)
+ return;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io) {
+ pr_err("Failed to get nic service adapter\n");
+ return;
+ }
+
+ qp_params->sqs = nic_io->sq;
+ qp_params->rqs = nic_io->rq;
+ qp_params->num_qps = nic_io->num_qps;
+
+ hinic3_free_qp_ctxts(hwdev);
+}
+
+int hinic3_create_qps(void *hwdev, u16 num_qp, u32 sq_depth, u32 rq_depth,
+ struct irq_info *qps_msix_arry)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ struct hinic3_dyna_qp_params qp_params = {0};
+ int err;
+
+ if (!hwdev || !qps_msix_arry)
+ return -EFAULT;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io) {
+ pr_err("Failed to get nic service adapter\n");
+ return -EFAULT;
+ }
+
+ err = hinic3_init_nicio_res(hwdev);
+ if (err)
+ return err;
+
+ qp_params.num_qps = num_qp;
+ qp_params.sq_depth = sq_depth;
+ qp_params.rq_depth = rq_depth;
+ err = hinic3_alloc_qps(hwdev, qps_msix_arry, &qp_params);
+ if (err) {
+ hinic3_deinit_nicio_res(hwdev);
+ nic_err(nic_io->dev_hdl,
+ "Failed to allocate qps, err: %d\n", err);
+ return err;
+ }
+
+ init_qps_info(nic_io, &qp_params);
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_create_qps);
+
+void hinic3_destroy_qps(void *hwdev)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ struct hinic3_dyna_qp_params qp_params = {0};
+
+ if (!hwdev)
+ return;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io)
+ return;
+
+ hinic3_deinit_qps(hwdev, &qp_params);
+ hinic3_free_qps(hwdev, &qp_params);
+ hinic3_deinit_nicio_res(hwdev);
+}
+EXPORT_SYMBOL(hinic3_destroy_qps);
+
+void *hinic3_get_nic_queue(void *hwdev, u16 q_id, enum hinic3_queue_type q_type)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+ if (!hwdev || q_type >= HINIC3_MAX_QUEUE_TYPE)
+ return NULL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io)
+ return NULL;
+
+ return ((q_type == HINIC3_SQ) ? &nic_io->sq[q_id] : &nic_io->rq[q_id]);
+}
+EXPORT_SYMBOL(hinic3_get_nic_queue);
+
+static void hinic3_qp_prepare_cmdq_header(struct hinic3_qp_ctxt_header *qp_ctxt_hdr,
+ enum hinic3_qp_ctxt_type ctxt_type,
+ u16 num_queues, u16 q_id)
+{
+ qp_ctxt_hdr->queue_type = ctxt_type;
+ qp_ctxt_hdr->num_queues = num_queues;
+ qp_ctxt_hdr->start_qid = q_id;
+ qp_ctxt_hdr->rsvd = 0;
+
+ hinic3_cpu_to_be32(qp_ctxt_hdr, sizeof(*qp_ctxt_hdr));
+}
+
+static void hinic3_sq_prepare_ctxt(struct hinic3_io_queue *sq, u16 sq_id,
+ struct hinic3_sq_ctxt *sq_ctxt)
+{
+ u64 wq_page_addr;
+ u64 wq_page_pfn, wq_block_pfn;
+ u32 wq_page_pfn_hi, wq_page_pfn_lo;
+ u32 wq_block_pfn_hi, wq_block_pfn_lo;
+ u16 pi_start, ci_start;
+
+ ci_start = hinic3_get_sq_local_ci(sq);
+ pi_start = hinic3_get_sq_local_pi(sq);
+
+ wq_page_addr = hinic3_wq_get_first_wqe_page_addr(&sq->wq);
+
+ wq_page_pfn = WQ_PAGE_PFN(wq_page_addr);
+ wq_page_pfn_hi = upper_32_bits(wq_page_pfn);
+ wq_page_pfn_lo = lower_32_bits(wq_page_pfn);
+
+ wq_block_pfn = WQ_BLOCK_PFN(sq->wq.wq_block_paddr);
+ wq_block_pfn_hi = upper_32_bits(wq_block_pfn);
+ wq_block_pfn_lo = lower_32_bits(wq_block_pfn);
+
+ sq_ctxt->ci_pi =
+ SQ_CTXT_CI_PI_SET(ci_start, CI_IDX) |
+ SQ_CTXT_CI_PI_SET(pi_start, PI_IDX);
+
+ sq_ctxt->drop_mode_sp =
+ SQ_CTXT_MODE_SET(0, SP_FLAG) |
+ SQ_CTXT_MODE_SET(0, PKT_DROP);
+
+ sq_ctxt->wq_pfn_hi_owner =
+ SQ_CTXT_WQ_PAGE_SET(wq_page_pfn_hi, HI_PFN) |
+ SQ_CTXT_WQ_PAGE_SET(1, OWNER);
+
+ sq_ctxt->wq_pfn_lo = wq_page_pfn_lo;
+
+	/* TODO */
+ sq_ctxt->pkt_drop_thd =
+ SQ_CTXT_PKT_DROP_THD_SET(tx_drop_thd_on, THD_ON) |
+ SQ_CTXT_PKT_DROP_THD_SET(tx_drop_thd_off, THD_OFF);
+
+ sq_ctxt->global_sq_id =
+ SQ_CTXT_GLOBAL_QUEUE_ID_SET(sq_id, GLOBAL_SQ_ID);
+
+	/* enable c-vlan insertion by default */
+ sq_ctxt->vlan_ceq_attr =
+ SQ_CTXT_VLAN_CEQ_SET(0, CEQ_EN) |
+ SQ_CTXT_VLAN_CEQ_SET(1, INSERT_MODE);
+
+ sq_ctxt->rsvd0 = 0;
+
+ sq_ctxt->pref_cache =
+ SQ_CTXT_PREF_SET(WQ_PREFETCH_MIN, CACHE_MIN) |
+ SQ_CTXT_PREF_SET(WQ_PREFETCH_MAX, CACHE_MAX) |
+ SQ_CTXT_PREF_SET(WQ_PREFETCH_THRESHOLD, CACHE_THRESHOLD);
+
+ sq_ctxt->pref_ci_owner =
+		SQ_CTXT_PREF_SET(CI_HIGH_IDX(ci_start), CI_HI) |
+ SQ_CTXT_PREF_SET(1, OWNER);
+
+ sq_ctxt->pref_wq_pfn_hi_ci =
+ SQ_CTXT_PREF_SET(ci_start, CI_LOW) |
+ SQ_CTXT_PREF_SET(wq_page_pfn_hi, WQ_PFN_HI);
+
+ sq_ctxt->pref_wq_pfn_lo = wq_page_pfn_lo;
+
+ sq_ctxt->wq_block_pfn_hi =
+ SQ_CTXT_WQ_BLOCK_SET(wq_block_pfn_hi, PFN_HI);
+
+ sq_ctxt->wq_block_pfn_lo = wq_block_pfn_lo;
+
+ hinic3_cpu_to_be32(sq_ctxt, sizeof(*sq_ctxt));
+}
+
+static void hinic3_rq_prepare_ctxt_get_wq_info(struct hinic3_io_queue *rq,
+ u32 *wq_page_pfn_hi, u32 *wq_page_pfn_lo,
+ u32 *wq_block_pfn_hi, u32 *wq_block_pfn_lo)
+{
+ u64 wq_page_addr;
+ u64 wq_page_pfn, wq_block_pfn;
+
+ wq_page_addr = hinic3_wq_get_first_wqe_page_addr(&rq->wq);
+
+ wq_page_pfn = WQ_PAGE_PFN(wq_page_addr);
+ *wq_page_pfn_hi = upper_32_bits(wq_page_pfn);
+ *wq_page_pfn_lo = lower_32_bits(wq_page_pfn);
+
+ wq_block_pfn = WQ_BLOCK_PFN(rq->wq.wq_block_paddr);
+ *wq_block_pfn_hi = upper_32_bits(wq_block_pfn);
+ *wq_block_pfn_lo = lower_32_bits(wq_block_pfn);
+}
+
+static void hinic3_rq_prepare_ctxt(struct hinic3_io_queue *rq, struct hinic3_rq_ctxt *rq_ctxt)
+{
+ u32 wq_page_pfn_hi, wq_page_pfn_lo;
+ u32 wq_block_pfn_hi, wq_block_pfn_lo;
+ u16 pi_start, ci_start;
+ u16 wqe_type = rq->wqe_type;
+
+	/* hardware counts RQ ci/pi in 8-byte WQEBB units, so scale by wqe_type */
+ ci_start = (u16)((u32)hinic3_get_rq_local_ci(rq) << wqe_type);
+ pi_start = (u16)((u32)hinic3_get_rq_local_pi(rq) << wqe_type);
+
+ hinic3_rq_prepare_ctxt_get_wq_info(rq, &wq_page_pfn_hi, &wq_page_pfn_lo,
+ &wq_block_pfn_hi, &wq_block_pfn_lo);
+
+ rq_ctxt->ci_pi =
+ RQ_CTXT_CI_PI_SET(ci_start, CI_IDX) |
+ RQ_CTXT_CI_PI_SET(pi_start, PI_IDX);
+
+ rq_ctxt->ceq_attr = RQ_CTXT_CEQ_ATTR_SET(0, EN) |
+ RQ_CTXT_CEQ_ATTR_SET(rq->msix_entry_idx, INTR);
+
+ rq_ctxt->wq_pfn_hi_type_owner =
+ RQ_CTXT_WQ_PAGE_SET(wq_page_pfn_hi, HI_PFN) |
+ RQ_CTXT_WQ_PAGE_SET(1, OWNER);
+
+ switch (wqe_type) {
+ case HINIC3_EXTEND_RQ_WQE:
+ /* use 32Byte WQE with SGE for CQE */
+ rq_ctxt->wq_pfn_hi_type_owner |=
+ RQ_CTXT_WQ_PAGE_SET(0, WQE_TYPE);
+ break;
+ case HINIC3_NORMAL_RQ_WQE:
+ /* use 16Byte WQE with 32Bytes SGE for CQE */
+ rq_ctxt->wq_pfn_hi_type_owner |=
+ RQ_CTXT_WQ_PAGE_SET(2, WQE_TYPE);
+ rq_ctxt->cqe_sge_len = RQ_CTXT_CQE_LEN_SET(1, CQE_LEN);
+ break;
+ default:
+		pr_err("Invalid rq wqe type: %u\n", wqe_type);
+ }
+
+ rq_ctxt->wq_pfn_lo = wq_page_pfn_lo;
+
+ rq_ctxt->pref_cache =
+ RQ_CTXT_PREF_SET(WQ_PREFETCH_MIN, CACHE_MIN) |
+ RQ_CTXT_PREF_SET(WQ_PREFETCH_MAX, CACHE_MAX) |
+ RQ_CTXT_PREF_SET(WQ_PREFETCH_THRESHOLD, CACHE_THRESHOLD);
+
+ rq_ctxt->pref_ci_owner =
+		RQ_CTXT_PREF_SET(CI_HIGH_IDX(ci_start), CI_HI) |
+ RQ_CTXT_PREF_SET(1, OWNER);
+
+ rq_ctxt->pref_wq_pfn_hi_ci =
+ RQ_CTXT_PREF_SET(wq_page_pfn_hi, WQ_PFN_HI) |
+ RQ_CTXT_PREF_SET(ci_start, CI_LOW);
+
+ rq_ctxt->pref_wq_pfn_lo = wq_page_pfn_lo;
+
+ rq_ctxt->pi_paddr_hi = upper_32_bits(rq->rx.pi_dma_addr);
+ rq_ctxt->pi_paddr_lo = lower_32_bits(rq->rx.pi_dma_addr);
+
+ rq_ctxt->wq_block_pfn_hi =
+ RQ_CTXT_WQ_BLOCK_SET(wq_block_pfn_hi, PFN_HI);
+
+ rq_ctxt->wq_block_pfn_lo = wq_block_pfn_lo;
+
+ hinic3_cpu_to_be32(rq_ctxt, sizeof(*rq_ctxt));
+}
+
+static int init_sq_ctxts(struct hinic3_nic_io *nic_io)
+{
+ struct hinic3_sq_ctxt_block *sq_ctxt_block = NULL;
+ struct hinic3_sq_ctxt *sq_ctxt = NULL;
+ struct hinic3_cmd_buf *cmd_buf = NULL;
+ struct hinic3_io_queue *sq = NULL;
+ u64 out_param = 0;
+ u16 q_id, curr_id, max_ctxts, i;
+ int err = 0;
+
+ cmd_buf = hinic3_alloc_cmd_buf(nic_io->hwdev);
+ if (!cmd_buf) {
+ nic_err(nic_io->dev_hdl, "Failed to allocate cmd buf\n");
+ return -ENOMEM;
+ }
+
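+	/* a command buffer holds at most HINIC3_Q_CTXT_MAX contexts, so
+	 * push the SQ contexts to hardware in batches
+	 */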
+ q_id = 0;
+ while (q_id < nic_io->num_qps) {
+ sq_ctxt_block = cmd_buf->buf;
+ sq_ctxt = sq_ctxt_block->sq_ctxt;
+
+ max_ctxts = (nic_io->num_qps - q_id) > HINIC3_Q_CTXT_MAX ?
+ HINIC3_Q_CTXT_MAX : (nic_io->num_qps - q_id);
+
+ hinic3_qp_prepare_cmdq_header(&sq_ctxt_block->cmdq_hdr,
+ HINIC3_QP_CTXT_TYPE_SQ, max_ctxts,
+ q_id);
+
+ for (i = 0; i < max_ctxts; i++) {
+ curr_id = q_id + i;
+ sq = &nic_io->sq[curr_id];
+
+ hinic3_sq_prepare_ctxt(sq, curr_id, &sq_ctxt[i]);
+ }
+
+ cmd_buf->size = SQ_CTXT_SIZE(max_ctxts);
+
+ err = hinic3_cmdq_direct_resp(nic_io->hwdev, HINIC3_MOD_L2NIC,
+ HINIC3_UCODE_CMD_MODIFY_QUEUE_CTX,
+ cmd_buf, &out_param, 0,
+ HINIC3_CHANNEL_NIC);
+ if (err || out_param != 0) {
+ nic_err(nic_io->dev_hdl, "Failed to set SQ ctxts, err: %d, out_param: 0x%llx\n",
+ err, out_param);
+
+ err = -EFAULT;
+ break;
+ }
+
+ q_id += max_ctxts;
+ }
+
+ hinic3_free_cmd_buf(nic_io->hwdev, cmd_buf);
+
+ return err;
+}
+
+static int init_rq_ctxts(struct hinic3_nic_io *nic_io)
+{
+ struct hinic3_rq_ctxt_block *rq_ctxt_block = NULL;
+ struct hinic3_rq_ctxt *rq_ctxt = NULL;
+ struct hinic3_cmd_buf *cmd_buf = NULL;
+ struct hinic3_io_queue *rq = NULL;
+ u64 out_param = 0;
+ u16 q_id, curr_id, max_ctxts, i;
+ int err = 0;
+
+ cmd_buf = hinic3_alloc_cmd_buf(nic_io->hwdev);
+ if (!cmd_buf) {
+ nic_err(nic_io->dev_hdl, "Failed to allocate cmd buf\n");
+ return -ENOMEM;
+ }
+
+ q_id = 0;
+ while (q_id < nic_io->num_qps) {
+ rq_ctxt_block = cmd_buf->buf;
+ rq_ctxt = rq_ctxt_block->rq_ctxt;
+
+ max_ctxts = (nic_io->num_qps - q_id) > HINIC3_Q_CTXT_MAX ?
+ HINIC3_Q_CTXT_MAX : (nic_io->num_qps - q_id);
+
+ hinic3_qp_prepare_cmdq_header(&rq_ctxt_block->cmdq_hdr,
+ HINIC3_QP_CTXT_TYPE_RQ, max_ctxts,
+ q_id);
+
+ for (i = 0; i < max_ctxts; i++) {
+ curr_id = q_id + i;
+ rq = &nic_io->rq[curr_id];
+
+ hinic3_rq_prepare_ctxt(rq, &rq_ctxt[i]);
+ }
+
+ cmd_buf->size = RQ_CTXT_SIZE(max_ctxts);
+
+ err = hinic3_cmdq_direct_resp(nic_io->hwdev, HINIC3_MOD_L2NIC,
+ HINIC3_UCODE_CMD_MODIFY_QUEUE_CTX,
+ cmd_buf, &out_param, 0,
+ HINIC3_CHANNEL_NIC);
+ if (err || out_param != 0) {
+ nic_err(nic_io->dev_hdl, "Failed to set RQ ctxts, err: %d, out_param: 0x%llx\n",
+ err, out_param);
+
+ err = -EFAULT;
+ break;
+ }
+
+ q_id += max_ctxts;
+ }
+
+ hinic3_free_cmd_buf(nic_io->hwdev, cmd_buf);
+
+ return err;
+}
+
+static int init_qp_ctxts(struct hinic3_nic_io *nic_io)
+{
+ int err;
+
+ err = init_sq_ctxts(nic_io);
+ if (err)
+ return err;
+
+ err = init_rq_ctxts(nic_io);
+ if (err)
+ return err;
+
+ return 0;
+}
+
+static int clean_queue_offload_ctxt(struct hinic3_nic_io *nic_io,
+ enum hinic3_qp_ctxt_type ctxt_type)
+{
+ struct hinic3_clean_queue_ctxt *ctxt_block = NULL;
+ struct hinic3_cmd_buf *cmd_buf = NULL;
+ u64 out_param = 0;
+ int err;
+
+ cmd_buf = hinic3_alloc_cmd_buf(nic_io->hwdev);
+ if (!cmd_buf) {
+ nic_err(nic_io->dev_hdl, "Failed to allocate cmd buf\n");
+ return -ENOMEM;
+ }
+
+ ctxt_block = cmd_buf->buf;
+ ctxt_block->cmdq_hdr.num_queues = nic_io->max_qps;
+ ctxt_block->cmdq_hdr.queue_type = ctxt_type;
+ ctxt_block->cmdq_hdr.start_qid = 0;
+
+ hinic3_cpu_to_be32(ctxt_block, sizeof(*ctxt_block));
+
+ cmd_buf->size = sizeof(*ctxt_block);
+
+ err = hinic3_cmdq_direct_resp(nic_io->hwdev, HINIC3_MOD_L2NIC,
+ HINIC3_UCODE_CMD_CLEAN_QUEUE_CONTEXT,
+ cmd_buf, &out_param, 0,
+ HINIC3_CHANNEL_NIC);
+	if (err || out_param) {
+		nic_err(nic_io->dev_hdl, "Failed to clean queue offload ctxts, err: %d, out_param: 0x%llx\n",
+ err, out_param);
+
+ err = -EFAULT;
+ }
+
+ hinic3_free_cmd_buf(nic_io->hwdev, cmd_buf);
+
+ return err;
+}
+
+static int clean_qp_offload_ctxt(struct hinic3_nic_io *nic_io)
+{
+ /* clean LRO/TSO context space */
+ return (clean_queue_offload_ctxt(nic_io, HINIC3_QP_CTXT_TYPE_SQ) ||
+ clean_queue_offload_ctxt(nic_io, HINIC3_QP_CTXT_TYPE_RQ));
+}
+
+/* init qps ctxt and set sq ci attr and arm all sq */
+int hinic3_init_qp_ctxts(void *hwdev)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ struct hinic3_sq_attr sq_attr;
+ u32 rq_depth;
+ u16 q_id;
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ if (!nic_io)
+ return -EFAULT;
+
+ err = init_qp_ctxts(nic_io);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to init QP ctxts\n");
+ return err;
+ }
+
+ /* clean LRO/TSO context space */
+ err = clean_qp_offload_ctxt(nic_io);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to clean qp offload ctxts\n");
+ return err;
+ }
+
+ rq_depth = nic_io->rq[0].wq.q_depth << nic_io->rq[0].wqe_type;
+
+ err = hinic3_set_root_ctxt(hwdev, rq_depth, nic_io->sq[0].wq.q_depth,
+ nic_io->rx_buff_len, HINIC3_CHANNEL_NIC);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to set root context\n");
+ return err;
+ }
+
+ for (q_id = 0; q_id < nic_io->num_qps; q_id++) {
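+		/* pass the CI DMA base in 4-byte units (shifted right by 2) */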
+ sq_attr.ci_dma_base =
+ HINIC3_CI_PADDR(nic_io->ci_dma_base, q_id) >> 0x2;
+ sq_attr.pending_limit = tx_pending_limit;
+ sq_attr.coalescing_time = tx_coalescing_time;
+ sq_attr.intr_en = 1;
+ sq_attr.intr_idx = nic_io->sq[q_id].msix_entry_idx;
+ sq_attr.l2nic_sqn = q_id;
+ sq_attr.dma_attr_off = 0;
+ err = hinic3_set_ci_table(hwdev, &sq_attr);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to set ci table\n");
+ goto set_cons_idx_table_err;
+ }
+ }
+
+ return 0;
+
+set_cons_idx_table_err:
+ hinic3_clean_root_ctxt(hwdev, HINIC3_CHANNEL_NIC);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(hinic3_init_qp_ctxts);
+
+void hinic3_free_qp_ctxts(void *hwdev)
+{
+ if (!hwdev)
+ return;
+
+ hinic3_clean_root_ctxt(hwdev, HINIC3_CHANNEL_NIC);
+}
+EXPORT_SYMBOL_GPL(hinic3_free_qp_ctxts);
+
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_nic_io.h b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_io.h
new file mode 100644
index 000000000000..5c5585a7fd74
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_io.h
@@ -0,0 +1,325 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_NIC_IO_H
+#define HINIC3_NIC_IO_H
+
+#include "hinic3_crm.h"
+#include "hinic3_common.h"
+#include "hinic3_wq.h"
+
+#define HINIC3_MAX_TX_QUEUE_DEPTH 65536
+#define HINIC3_MAX_RX_QUEUE_DEPTH 16384
+
+#define HINIC3_MIN_QUEUE_DEPTH 128
+
+#define HINIC3_SQ_WQEBB_SHIFT 4
+#define HINIC3_RQ_WQEBB_SHIFT 3
+
+#define HINIC3_SQ_WQEBB_SIZE BIT(HINIC3_SQ_WQEBB_SHIFT)
+#define HINIC3_CQE_SIZE_SHIFT 4
+
+enum hinic3_rq_wqe_type {
+ HINIC3_COMPACT_RQ_WQE,
+ HINIC3_NORMAL_RQ_WQE,
+ HINIC3_EXTEND_RQ_WQE,
+};
+
+struct hinic3_io_queue {
+ struct hinic3_wq wq;
+ union {
+ u8 wqe_type; /* for rq */
+ u8 owner; /* for sq */
+ };
+ u8 rsvd1;
+ u16 rsvd2;
+
+ u16 q_id;
+ u16 msix_entry_idx;
+
+ u8 __iomem *db_addr;
+
+ union {
+ struct {
+ void *cons_idx_addr;
+ } tx;
+
+ struct {
+ u16 *pi_virt_addr;
+ dma_addr_t pi_dma_addr;
+ } rx;
+ };
+} ____cacheline_aligned;
+
+struct hinic3_nic_db {
+ u32 db_info;
+ u32 pi_hi;
+};
+
+#ifdef static
+#undef static
+#define LLT_STATIC_DEF_SAVED
+#endif
+
+/* *
+ * @brief hinic3_get_sq_free_wqebbs - get send queue free wqebb
+ * @param sq: send queue
+ * @retval : number of free wqebb
+ */
+static inline u16 hinic3_get_sq_free_wqebbs(struct hinic3_io_queue *sq)
+{
+ return hinic3_wq_free_wqebbs(&sq->wq);
+}
+
+/* *
+ * @brief hinic3_update_sq_local_ci - update send queue local consumer index
+ * @param sq: send queue
+ * @param wqebb_cnt: number of wqebbs
+ */
+static inline void hinic3_update_sq_local_ci(struct hinic3_io_queue *sq,
+ u16 wqebb_cnt)
+{
+ hinic3_wq_put_wqebbs(&sq->wq, wqebb_cnt);
+}
+
+/* *
+ * @brief hinic3_get_sq_local_ci - get send queue local consumer index
+ * @param sq: send queue
+ * @retval : local consumer index
+ */
+static inline u16 hinic3_get_sq_local_ci(const struct hinic3_io_queue *sq)
+{
+ return WQ_MASK_IDX(&sq->wq, sq->wq.cons_idx);
+}
+
+/* *
+ * @brief hinic3_get_sq_local_pi - get send queue local producer index
+ * @param sq: send queue
+ * @retval : local producer index
+ */
+static inline u16 hinic3_get_sq_local_pi(const struct hinic3_io_queue *sq)
+{
+ return WQ_MASK_IDX(&sq->wq, sq->wq.prod_idx);
+}
+
+/* *
+ * @brief hinic3_get_sq_hw_ci - get send queue hardware consumer index
+ * @param sq: send queue
+ * @retval : hardware consumer index
+ */
+static inline u16 hinic3_get_sq_hw_ci(const struct hinic3_io_queue *sq)
+{
+ return WQ_MASK_IDX(&sq->wq,
+ hinic3_hw_cpu16(*(u16 *)sq->tx.cons_idx_addr));
+}
+
+/* *
+ * @brief hinic3_get_sq_one_wqebb - get send queue wqe with single wqebb
+ * @param sq: send queue
+ * @param pi: return current pi
+ * @retval : wqe base address
+ */
+static inline void *hinic3_get_sq_one_wqebb(struct hinic3_io_queue *sq, u16 *pi)
+{
+ return hinic3_wq_get_one_wqebb(&sq->wq, pi);
+}
+
+/* *
+ * @brief hinic3_get_sq_multi_wqebbs - get send queue wqe with multiple wqebbs
+ * @param sq: send queue
+ * @param wqebb_cnt: wqebb counter
+ * @param pi: return current pi
+ * @param second_part_wqebbs_addr: second part wqebbs base address
+ * @param first_part_wqebbs_num: number wqebbs of first part
+ * @retval : first part wqebbs base address
+ */
+static inline void *hinic3_get_sq_multi_wqebbs(struct hinic3_io_queue *sq,
+ u16 wqebb_cnt, u16 *pi,
+ void **second_part_wqebbs_addr,
+ u16 *first_part_wqebbs_num)
+{
+ return hinic3_wq_get_multi_wqebbs(&sq->wq, wqebb_cnt, pi,
+ second_part_wqebbs_addr,
+ first_part_wqebbs_num);
+}
+
+/* *
+ * @brief hinic3_get_and_update_sq_owner - get and update send queue owner bit
+ * @param sq: send queue
+ * @param curr_pi: current pi
+ * @param wqebb_cnt: wqebb counter
+ * @retval : owner bit
+ */
+static inline u16 hinic3_get_and_update_sq_owner(struct hinic3_io_queue *sq,
+ u16 curr_pi, u16 wqebb_cnt)
+{
+ u16 owner = sq->owner;
+
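+	/* flip the owner bit whenever the producer index wraps past the end
+	 * of the queue, so hardware can tell fresh WQEBBs from stale ones
+	 */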
+ if (unlikely(curr_pi + wqebb_cnt >= sq->wq.q_depth))
+ sq->owner = !sq->owner;
+
+ return owner;
+}
+
+/* *
+ * @brief hinic3_get_sq_wqe_with_owner - get send queue wqe with owner
+ * @param sq: send queue
+ * @param wqebb_cnt: wqebb counter
+ * @param pi: return current pi
+ * @param owner: return owner bit
+ * @param second_part_wqebbs_addr: second part wqebbs base address
+ * @param first_part_wqebbs_num: number wqebbs of first part
+ * @retval : first part wqebbs base address
+ */
+static inline void *hinic3_get_sq_wqe_with_owner(struct hinic3_io_queue *sq,
+ u16 wqebb_cnt, u16 *pi,
+ u16 *owner,
+ void **second_part_wqebbs_addr,
+ u16 *first_part_wqebbs_num)
+{
+ void *wqe = hinic3_wq_get_multi_wqebbs(&sq->wq, wqebb_cnt, pi,
+ second_part_wqebbs_addr,
+ first_part_wqebbs_num);
+
+ *owner = sq->owner;
+ if (unlikely(*pi + wqebb_cnt >= sq->wq.q_depth))
+ sq->owner = !sq->owner;
+
+ return wqe;
+}
+
+/* *
+ * @brief hinic3_rollback_sq_wqebbs - rollback send queue wqe
+ * @param sq: send queue
+ * @param wqebb_cnt: wqebb counter
+ * @param owner: owner bit
+ */
+static inline void hinic3_rollback_sq_wqebbs(struct hinic3_io_queue *sq,
+ u16 wqebb_cnt, u16 owner)
+{
+ if (owner != sq->owner)
+ sq->owner = (u8)owner;
+ sq->wq.prod_idx -= wqebb_cnt;
+}
+
+/* *
+ * @brief hinic3_rq_wqe_addr - get receive queue wqe address by queue index
+ * @param rq: receive queue
+ * @param idx: wq index
+ * @retval: wqe base address
+ */
+static inline void *hinic3_rq_wqe_addr(struct hinic3_io_queue *rq, u16 idx)
+{
+ return hinic3_wq_wqebb_addr(&rq->wq, idx);
+}
+
+/* *
+ * @brief hinic3_update_rq_hw_pi - update receive queue hardware pi
+ * @param rq: receive queue
+ * @param pi: pi
+ */
+static inline void hinic3_update_rq_hw_pi(struct hinic3_io_queue *rq, u16 pi)
+{
+ *rq->rx.pi_virt_addr = cpu_to_be16((pi & rq->wq.idx_mask) <<
+ rq->wqe_type);
+}
+
+/* *
+ * @brief hinic3_update_rq_local_ci - update receive queue local consumer index
+ * @param rq: receive queue
+ * @param wqebb_cnt: number of wqebbs
+ */
+static inline void hinic3_update_rq_local_ci(struct hinic3_io_queue *rq,
+ u16 wqebb_cnt)
+{
+ hinic3_wq_put_wqebbs(&rq->wq, wqebb_cnt);
+}
+
+/* *
+ * @brief hinic3_get_rq_local_ci - get receive queue local ci
+ * @param rq: receive queue
+ * @retval: receive queue local ci
+ */
+static inline u16 hinic3_get_rq_local_ci(const struct hinic3_io_queue *rq)
+{
+ return WQ_MASK_IDX(&rq->wq, rq->wq.cons_idx);
+}
+
+/* *
+ * @brief hinic3_get_rq_local_pi - get receive queue local pi
+ * @param rq: receive queue
+ * @retval: receive queue local pi
+ */
+static inline u16 hinic3_get_rq_local_pi(const struct hinic3_io_queue *rq)
+{
+ return WQ_MASK_IDX(&rq->wq, rq->wq.prod_idx);
+}
+
+/* ******************** DB INFO ******************** */
+#define DB_INFO_QID_SHIFT 0
+#define DB_INFO_NON_FILTER_SHIFT 22
+#define DB_INFO_CFLAG_SHIFT 23
+#define DB_INFO_COS_SHIFT 24
+#define DB_INFO_TYPE_SHIFT 27
+
+#define DB_INFO_QID_MASK 0x1FFFU
+#define DB_INFO_NON_FILTER_MASK 0x1U
+#define DB_INFO_CFLAG_MASK 0x1U
+#define DB_INFO_COS_MASK 0x7U
+#define DB_INFO_TYPE_MASK 0x1FU
+#define DB_INFO_SET(val, member) \
+ (((u32)(val) & DB_INFO_##member##_MASK) << \
+ DB_INFO_##member##_SHIFT)
+
+#define DB_PI_LOW_MASK 0xFFU
+#define DB_PI_HIGH_MASK 0xFFU
+#define DB_PI_LOW(pi) ((pi) & DB_PI_LOW_MASK)
+#define DB_PI_HI_SHIFT 8
+#define DB_PI_HIGH(pi) (((pi) >> DB_PI_HI_SHIFT) & DB_PI_HIGH_MASK)
+#define DB_ADDR(queue, pi) ((u64 *)((queue)->db_addr) + DB_PI_LOW(pi))
+#define SRC_TYPE 1
+
+/* CFLAG_DATA_PATH */
+#define SQ_CFLAG_DP 0
+#define RQ_CFLAG_DP 1
+/* *
+ * @brief hinic3_write_db - write doorbell
+ * @param queue: nic io queue
+ * @param cos: cos index
+ * @param cflag: 0--sq, 1--rq
+ * @param pi: product index
+ */
+static inline void hinic3_write_db(struct hinic3_io_queue *queue, int cos,
+ u8 cflag, u16 pi)
+{
+ struct hinic3_nic_db db;
+
+ db.db_info = DB_INFO_SET(SRC_TYPE, TYPE) | DB_INFO_SET(cflag, CFLAG) |
+ DB_INFO_SET(cos, COS) | DB_INFO_SET(queue->q_id, QID);
+ db.pi_hi = DB_PI_HIGH(pi);
+ /* Data should be written to HW in Big Endian Format */
+ db.db_info = hinic3_hw_be32(db.db_info);
+ db.pi_hi = hinic3_hw_be32(db.pi_hi);
+
+ wmb(); /* Write all before the doorbell */
+
+ writeq(*((u64 *)&db), DB_ADDR(queue, pi));
+}
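+
+/*
+ * Illustrative usage sketch (call sites are not part of this file; names
+ * other than the helpers defined above are assumptions): after a wqebb is
+ * filled, the new producer index is published to hardware via the doorbell:
+ *
+ *	wqe = hinic3_get_sq_one_wqebb(sq, &pi);
+ *	... fill the wqe ...
+ *	hinic3_write_db(sq, cos, SQ_CFLAG_DP, hinic3_get_sq_local_pi(sq));
+ */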
+
+struct hinic3_dyna_qp_params {
+ u16 num_qps;
+ u32 sq_depth;
+ u32 rq_depth;
+
+ struct hinic3_io_queue *sqs;
+ struct hinic3_io_queue *rqs;
+};
+
+int hinic3_alloc_qps(void *hwdev, struct irq_info *qps_msix_arry,
+ struct hinic3_dyna_qp_params *qp_params);
+void hinic3_free_qps(void *hwdev, struct hinic3_dyna_qp_params *qp_params);
+int hinic3_init_qps(void *hwdev, struct hinic3_dyna_qp_params *qp_params);
+void hinic3_deinit_qps(void *hwdev, struct hinic3_dyna_qp_params *qp_params);
+int hinic3_init_nicio_res(void *hwdev);
+void hinic3_deinit_nicio_res(void *hwdev);
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_nic_prof.c b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_prof.c
new file mode 100644
index 000000000000..78d943d2dab5
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_prof.c
@@ -0,0 +1,47 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC]" fmt
+
+#include <linux/kernel.h>
+#include <linux/netdevice.h>
+#include <linux/device.h>
+#include <linux/types.h>
+#include <linux/errno.h>
+
+#include "ossl_knl.h"
+#include "hinic3_nic_dev.h"
+#include "hinic3_profile.h"
+#include "hinic3_nic_prof.h"
+
+static bool is_match_nic_prof_default_adapter(void *device)
+{
+ /* always match default profile adapter in standard scene */
+ return true;
+}
+
+struct hinic3_prof_adapter nic_prof_adap_objs[] = {
+ /* Add prof adapter before default profile */
+ {
+ .type = PROF_ADAP_TYPE_DEFAULT,
+ .match = is_match_nic_prof_default_adapter,
+ .init = NULL,
+ .deinit = NULL,
+ },
+};
+
+void hinic3_init_nic_prof_adapter(struct hinic3_nic_dev *nic_dev)
+{
+ u16 num_adap = ARRAY_SIZE(nic_prof_adap_objs);
+
+ nic_dev->prof_adap = hinic3_prof_init(nic_dev, nic_prof_adap_objs, num_adap,
+ (void *)&nic_dev->prof_attr);
+ if (nic_dev->prof_adap)
+ nic_info(&nic_dev->pdev->dev, "Find profile adapter type: %d\n",
+ nic_dev->prof_adap->type);
+}
+
+void hinic3_deinit_nic_prof_adapter(struct hinic3_nic_dev *nic_dev)
+{
+ hinic3_prof_deinit(nic_dev->prof_adap, nic_dev->prof_attr);
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_nic_prof.h b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_prof.h
new file mode 100644
index 000000000000..3c279e715b0a
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_prof.h
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_NIC_PROF_H
+#define HINIC3_NIC_PROF_H
+#include <linux/socket.h>
+
+#include <linux/types.h>
+
+#include "hinic3_nic_cfg.h"
+
+struct hinic3_nic_prof_attr {
+ void *priv_data;
+ char netdev_name[IFNAMSIZ];
+};
+
+struct hinic3_nic_dev;
+
+#ifdef static
+#undef static
+#define LLT_STATIC_DEF_SAVED
+#endif
+
+static inline char *hinic3_get_dft_netdev_name_fmt(struct hinic3_nic_dev *nic_dev)
+{
+ if (nic_dev->prof_attr)
+ return nic_dev->prof_attr->netdev_name;
+
+ return NULL;
+}
+
+#ifdef CONFIG_MODULE_PROF
+int hinic3_set_master_dev_state(struct hinic3_nic_dev *nic_dev, u32 flag);
+u32 hinic3_get_link(struct net_device *dev);
+int hinic3_config_port_mtu(struct hinic3_nic_dev *nic_dev, u32 mtu);
+int hinic3_config_port_mac(struct hinic3_nic_dev *nic_dev, struct sockaddr *saddr);
+#else
+static inline int hinic3_set_master_dev_state(struct hinic3_nic_dev *nic_dev, u32 flag)
+{
+ return 0;
+}
+
+static inline int hinic3_config_port_mtu(struct hinic3_nic_dev *nic_dev, u32 mtu)
+{
+ return hinic3_set_port_mtu(nic_dev->hwdev, (u16)mtu);
+}
+
+static inline int hinic3_config_port_mac(struct hinic3_nic_dev *nic_dev, struct sockaddr *saddr)
+{
+ return hinic3_update_mac(nic_dev->hwdev, nic_dev->netdev->dev_addr, saddr->sa_data, 0,
+ hinic3_global_func_id(nic_dev->hwdev));
+}
+
+#endif
+
+void hinic3_init_nic_prof_adapter(struct hinic3_nic_dev *nic_dev);
+void hinic3_deinit_nic_prof_adapter(struct hinic3_nic_dev *nic_dev);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_nic_qp.h b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_qp.h
new file mode 100644
index 000000000000..f492c5d8ad08
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_nic_qp.h
@@ -0,0 +1,384 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_NIC_QP_H
+#define HINIC3_NIC_QP_H
+
+#include "hinic3_common.h"
+
+#define TX_MSS_DEFAULT 0x3E00
+#define TX_MSS_MIN 0x50
+
+#define HINIC3_MAX_SQ_SGE 18
+
+#define RQ_CQE_OFFOLAD_TYPE_PKT_TYPE_SHIFT 0
+#define RQ_CQE_OFFOLAD_TYPE_IP_TYPE_SHIFT 5
+#define RQ_CQE_OFFOLAD_TYPE_ENC_L3_TYPE_SHIFT 7
+#define RQ_CQE_OFFOLAD_TYPE_TUNNEL_PKT_FORMAT_SHIFT 8
+#define RQ_CQE_OFFOLAD_TYPE_PKT_UMBCAST_SHIFT 19
+#define RQ_CQE_OFFOLAD_TYPE_VLAN_EN_SHIFT 21
+#define RQ_CQE_OFFOLAD_TYPE_RSS_TYPE_SHIFT 24
+
+#define RQ_CQE_OFFOLAD_TYPE_PKT_TYPE_MASK 0x1FU
+#define RQ_CQE_OFFOLAD_TYPE_IP_TYPE_MASK 0x3U
+#define RQ_CQE_OFFOLAD_TYPE_ENC_L3_TYPE_MASK 0x1U
+#define RQ_CQE_OFFOLAD_TYPE_TUNNEL_PKT_FORMAT_MASK 0xFU
+#define RQ_CQE_OFFOLAD_TYPE_PKT_UMBCAST_MASK 0x3U
+#define RQ_CQE_OFFOLAD_TYPE_VLAN_EN_MASK 0x1U
+#define RQ_CQE_OFFOLAD_TYPE_RSS_TYPE_MASK 0xFFU
+
+#define RQ_CQE_OFFOLAD_TYPE_GET(val, member) \
+ (((val) >> RQ_CQE_OFFOLAD_TYPE_##member##_SHIFT) & \
+ RQ_CQE_OFFOLAD_TYPE_##member##_MASK)
+
+#define HINIC3_GET_RX_PKT_TYPE(offload_type) \
+ RQ_CQE_OFFOLAD_TYPE_GET(offload_type, PKT_TYPE)
+#define HINIC3_GET_RX_IP_TYPE(offload_type) \
+ RQ_CQE_OFFOLAD_TYPE_GET(offload_type, IP_TYPE)
+#define HINIC3_GET_RX_ENC_L3_TYPE(offload_type) \
+ RQ_CQE_OFFOLAD_TYPE_GET(offload_type, ENC_L3_TYPE)
+#define HINIC3_GET_RX_TUNNEL_PKT_FORMAT(offload_type) \
+ RQ_CQE_OFFOLAD_TYPE_GET(offload_type, TUNNEL_PKT_FORMAT)
+
+#define HINIC3_GET_RX_PKT_UMBCAST(offload_type) \
+ RQ_CQE_OFFOLAD_TYPE_GET(offload_type, PKT_UMBCAST)
+
+#define HINIC3_GET_RX_VLAN_OFFLOAD_EN(offload_type) \
+ RQ_CQE_OFFOLAD_TYPE_GET(offload_type, VLAN_EN)
+
+#define HINIC3_GET_RSS_TYPES(offload_type) \
+ RQ_CQE_OFFOLAD_TYPE_GET(offload_type, RSS_TYPE)
+
+#define RQ_CQE_SGE_VLAN_SHIFT 0
+#define RQ_CQE_SGE_LEN_SHIFT 16
+
+#define RQ_CQE_SGE_VLAN_MASK 0xFFFFU
+#define RQ_CQE_SGE_LEN_MASK 0xFFFFU
+
+#define RQ_CQE_SGE_GET(val, member) \
+ (((val) >> RQ_CQE_SGE_##member##_SHIFT) & RQ_CQE_SGE_##member##_MASK)
+
+#define HINIC3_GET_RX_VLAN_TAG(vlan_len) RQ_CQE_SGE_GET(vlan_len, VLAN)
+
+#define HINIC3_GET_RX_PKT_LEN(vlan_len) RQ_CQE_SGE_GET(vlan_len, LEN)
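+
+/*
+ * Illustrative sketch (receive-path code is elsewhere): a rx handler would
+ * first convert the cqe dword to cpu byte order, then extract the packed
+ * fields, e.g.
+ *
+ *	u32 vlan_len = hinic3_hw_cpu32(cqe->vlan_len);
+ *	u16 pkt_len = HINIC3_GET_RX_PKT_LEN(vlan_len);
+ *	u16 vlan_tag = HINIC3_GET_RX_VLAN_TAG(vlan_len);
+ */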
+
+#define RQ_CQE_STATUS_CSUM_ERR_SHIFT 0
+#define RQ_CQE_STATUS_NUM_LRO_SHIFT 16
+#define RQ_CQE_STATUS_LRO_PUSH_SHIFT 25
+#define RQ_CQE_STATUS_LRO_ENTER_SHIFT 26
+#define RQ_CQE_STATUS_LRO_INTR_SHIFT 27
+
+#define RQ_CQE_STATUS_BP_EN_SHIFT 30
+#define RQ_CQE_STATUS_RXDONE_SHIFT 31
+#define RQ_CQE_STATUS_DECRY_PKT_SHIFT 29
+#define RQ_CQE_STATUS_FLUSH_SHIFT 28
+
+#define RQ_CQE_STATUS_CSUM_ERR_MASK 0xFFFFU
+#define RQ_CQE_STATUS_NUM_LRO_MASK 0xFFU
+#define RQ_CQE_STATUS_LRO_PUSH_MASK 0x1U
+#define RQ_CQE_STATUS_LRO_ENTER_MASK 0x1U
+#define RQ_CQE_STATUS_LRO_INTR_MASK 0x1U
+#define RQ_CQE_STATUS_BP_EN_MASK 0x1U
+#define RQ_CQE_STATUS_RXDONE_MASK 0x1U
+#define RQ_CQE_STATUS_FLUSH_MASK 0x1U
+#define RQ_CQE_STATUS_DECRY_PKT_MASK 0x1U
+
+#define RQ_CQE_STATUS_GET(val, member) \
+ (((val) >> RQ_CQE_STATUS_##member##_SHIFT) & \
+ RQ_CQE_STATUS_##member##_MASK)
+
+#define HINIC3_GET_RX_CSUM_ERR(status) RQ_CQE_STATUS_GET(status, CSUM_ERR)
+
+#define HINIC3_GET_RX_DONE(status) RQ_CQE_STATUS_GET(status, RXDONE)
+
+#define HINIC3_GET_RX_FLUSH(status) RQ_CQE_STATUS_GET(status, FLUSH)
+
+#define HINIC3_GET_RX_BP_EN(status) RQ_CQE_STATUS_GET(status, BP_EN)
+
+#define HINIC3_GET_RX_NUM_LRO(status) RQ_CQE_STATUS_GET(status, NUM_LRO)
+
+#define HINIC3_RX_IS_DECRY_PKT(status) RQ_CQE_STATUS_GET(status, DECRY_PKT)
+
+#define RQ_CQE_SUPER_CQE_EN_SHIFT 0
+#define RQ_CQE_PKT_NUM_SHIFT 1
+#define RQ_CQE_PKT_LAST_LEN_SHIFT 6
+#define RQ_CQE_PKT_FIRST_LEN_SHIFT 19
+
+#define RQ_CQE_SUPER_CQE_EN_MASK 0x1U
+#define RQ_CQE_PKT_NUM_MASK 0x1FU
+#define RQ_CQE_PKT_FIRST_LEN_MASK 0x1FFFU
+#define RQ_CQE_PKT_LAST_LEN_MASK 0x1FFFU
+
+#define RQ_CQE_PKT_NUM_GET(val, member) \
+ (((val) >> RQ_CQE_PKT_##member##_SHIFT) & RQ_CQE_PKT_##member##_MASK)
+#define HINIC3_GET_RQ_CQE_PKT_NUM(pkt_info) RQ_CQE_PKT_NUM_GET(pkt_info, NUM)
+
+#define RQ_CQE_SUPER_CQE_EN_GET(val, member) \
+ (((val) >> RQ_CQE_##member##_SHIFT) & RQ_CQE_##member##_MASK)
+#define HINIC3_GET_SUPER_CQE_EN(pkt_info) \
+ RQ_CQE_SUPER_CQE_EN_GET(pkt_info, SUPER_CQE_EN)
+
+#define RQ_CQE_PKT_LEN_GET(val, member) \
+ (((val) >> RQ_CQE_PKT_##member##_SHIFT) & RQ_CQE_PKT_##member##_MASK)
+
+#define RQ_CQE_DECRY_INFO_DECRY_STATUS_SHIFT 8
+#define RQ_CQE_DECRY_INFO_ESP_NEXT_HEAD_SHIFT 0
+
+#define RQ_CQE_DECRY_INFO_DECRY_STATUS_MASK 0xFFU
+#define RQ_CQE_DECRY_INFO_ESP_NEXT_HEAD_MASK 0xFFU
+
+#define RQ_CQE_DECRY_INFO_GET(val, member) \
+ (((val) >> RQ_CQE_DECRY_INFO_##member##_SHIFT) & \
+ RQ_CQE_DECRY_INFO_##member##_MASK)
+
+#define HINIC3_GET_DECRYPT_STATUS(decry_info) \
+ RQ_CQE_DECRY_INFO_GET(decry_info, DECRY_STATUS)
+
+#define HINIC3_GET_ESP_NEXT_HEAD(decry_info) \
+ RQ_CQE_DECRY_INFO_GET(decry_info, ESP_NEXT_HEAD)
+
+struct hinic3_rq_cqe {
+ u32 status;
+ u32 vlan_len;
+
+ u32 offload_type;
+ u32 hash_val;
+ u32 xid;
+ u32 decrypt_info;
+ u32 rsvd6;
+ u32 pkt_info;
+};
+
+struct hinic3_sge_sect {
+ struct hinic3_sge sge;
+ u32 rsvd;
+};
+
+struct hinic3_rq_extend_wqe {
+ struct hinic3_sge_sect buf_desc;
+ struct hinic3_sge_sect cqe_sect;
+};
+
+struct hinic3_rq_normal_wqe {
+ u32 buf_hi_addr;
+ u32 buf_lo_addr;
+ u32 cqe_hi_addr;
+ u32 cqe_lo_addr;
+};
+
+struct hinic3_rq_wqe {
+ union {
+ struct hinic3_rq_normal_wqe normal_wqe;
+ struct hinic3_rq_extend_wqe extend_wqe;
+ };
+};
+
+struct hinic3_sq_wqe_desc {
+ u32 ctrl_len;
+ u32 queue_info;
+ u32 hi_addr;
+ u32 lo_addr;
+};
+
+/* Engine only passes the first 12B of the TS field directly to uCode
+ * through metadata. vlan_offload is used by hardware when inserting a
+ * vlan tag in tx.
+ */
+struct hinic3_sq_task {
+ u32 pkt_info0;
+ u32 ip_identify;
+ u32 pkt_info2; /* ipsec used as spi */
+ u32 vlan_offload;
+};
+
+struct hinic3_sq_bufdesc {
+ u32 len; /* 31-bits Length, L2NIC only use length[17:0] */
+ u32 rsvd;
+ u32 hi_addr;
+ u32 lo_addr;
+};
+
+struct hinic3_sq_compact_wqe {
+ struct hinic3_sq_wqe_desc wqe_desc;
+};
+
+struct hinic3_sq_extend_wqe {
+ struct hinic3_sq_wqe_desc wqe_desc;
+ struct hinic3_sq_task task;
+	struct hinic3_sq_bufdesc buf_desc[];
+};
+
+struct hinic3_sq_wqe {
+ union {
+ struct hinic3_sq_compact_wqe compact_wqe;
+ struct hinic3_sq_extend_wqe extend_wqe;
+ };
+};
+
+/* use section pointers to support a non-contiguous wqe */
+struct hinic3_sq_wqe_combo {
+ struct hinic3_sq_wqe_desc *ctrl_bd0;
+ struct hinic3_sq_task *task;
+ struct hinic3_sq_bufdesc *bds_head;
+ struct hinic3_sq_bufdesc *bds_sec2;
+ u16 first_bds_num;
+ u32 wqe_type;
+ u32 task_type;
+};
+
+/* ************* SQ_CTRL ************** */
+enum sq_wqe_data_format {
+ SQ_NORMAL_WQE = 0,
+};
+
+enum sq_wqe_ec_type {
+ SQ_WQE_COMPACT_TYPE = 0,
+ SQ_WQE_EXTENDED_TYPE = 1,
+};
+
+enum sq_wqe_tasksect_len_type {
+ SQ_WQE_TASKSECT_46BITS = 0,
+ SQ_WQE_TASKSECT_16BYTES = 1,
+};
+
+#define SQ_CTRL_BD0_LEN_SHIFT 0
+#define SQ_CTRL_RSVD_SHIFT 18
+#define SQ_CTRL_BUFDESC_NUM_SHIFT 19
+#define SQ_CTRL_TASKSECT_LEN_SHIFT 27
+#define SQ_CTRL_DATA_FORMAT_SHIFT 28
+#define SQ_CTRL_DIRECT_SHIFT 29
+#define SQ_CTRL_EXTENDED_SHIFT 30
+#define SQ_CTRL_OWNER_SHIFT 31
+
+#define SQ_CTRL_BD0_LEN_MASK 0x3FFFFU
+#define SQ_CTRL_RSVD_MASK 0x1U
+#define SQ_CTRL_BUFDESC_NUM_MASK 0xFFU
+#define SQ_CTRL_TASKSECT_LEN_MASK 0x1U
+#define SQ_CTRL_DATA_FORMAT_MASK 0x1U
+#define SQ_CTRL_DIRECT_MASK 0x1U
+#define SQ_CTRL_EXTENDED_MASK 0x1U
+#define SQ_CTRL_OWNER_MASK 0x1U
+
+#define SQ_CTRL_SET(val, member) \
+ (((u32)(val) & SQ_CTRL_##member##_MASK) << SQ_CTRL_##member##_SHIFT)
+
+#define SQ_CTRL_GET(val, member) \
+ (((val) >> SQ_CTRL_##member##_SHIFT) & SQ_CTRL_##member##_MASK)
+
+#define SQ_CTRL_CLEAR(val, member) \
+ ((val) & (~(SQ_CTRL_##member##_MASK << SQ_CTRL_##member##_SHIFT)))
+
+#define SQ_CTRL_QUEUE_INFO_PKT_TYPE_SHIFT 0
+#define SQ_CTRL_QUEUE_INFO_PLDOFF_SHIFT 2
+#define SQ_CTRL_QUEUE_INFO_UFO_SHIFT 10
+#define SQ_CTRL_QUEUE_INFO_TSO_SHIFT 11
+#define SQ_CTRL_QUEUE_INFO_TCPUDP_CS_SHIFT 12
+#define SQ_CTRL_QUEUE_INFO_MSS_SHIFT 13
+#define SQ_CTRL_QUEUE_INFO_SCTP_SHIFT 27
+#define SQ_CTRL_QUEUE_INFO_UC_SHIFT 28
+#define SQ_CTRL_QUEUE_INFO_PRI_SHIFT 29
+
+#define SQ_CTRL_QUEUE_INFO_PKT_TYPE_MASK 0x3U
+#define SQ_CTRL_QUEUE_INFO_PLDOFF_MASK 0xFFU
+#define SQ_CTRL_QUEUE_INFO_UFO_MASK 0x1U
+#define SQ_CTRL_QUEUE_INFO_TSO_MASK 0x1U
+#define SQ_CTRL_QUEUE_INFO_TCPUDP_CS_MASK 0x1U
+#define SQ_CTRL_QUEUE_INFO_MSS_MASK 0x3FFFU
+#define SQ_CTRL_QUEUE_INFO_SCTP_MASK 0x1U
+#define SQ_CTRL_QUEUE_INFO_UC_MASK 0x1U
+#define SQ_CTRL_QUEUE_INFO_PRI_MASK 0x7U
+
+#define SQ_CTRL_QUEUE_INFO_SET(val, member) \
+ (((u32)(val) & SQ_CTRL_QUEUE_INFO_##member##_MASK) << \
+ SQ_CTRL_QUEUE_INFO_##member##_SHIFT)
+
+#define SQ_CTRL_QUEUE_INFO_GET(val, member) \
+ (((val) >> SQ_CTRL_QUEUE_INFO_##member##_SHIFT) & \
+ SQ_CTRL_QUEUE_INFO_##member##_MASK)
+
+#define SQ_CTRL_QUEUE_INFO_CLEAR(val, member) \
+ ((val) & (~(SQ_CTRL_QUEUE_INFO_##member##_MASK << \
+ SQ_CTRL_QUEUE_INFO_##member##_SHIFT)))
+
+#define SQ_TASK_INFO0_TUNNEL_FLAG_SHIFT 19
+#define SQ_TASK_INFO0_ESP_NEXT_PROTO_SHIFT 22
+#define SQ_TASK_INFO0_INNER_L4_EN_SHIFT 24
+#define SQ_TASK_INFO0_INNER_L3_EN_SHIFT 25
+#define SQ_TASK_INFO0_INNER_L4_PSEUDO_SHIFT 26
+#define SQ_TASK_INFO0_OUT_L4_EN_SHIFT 27
+#define SQ_TASK_INFO0_OUT_L3_EN_SHIFT 28
+#define SQ_TASK_INFO0_OUT_L4_PSEUDO_SHIFT 29
+#define SQ_TASK_INFO0_ESP_OFFLOAD_SHIFT 30
+#define SQ_TASK_INFO0_IPSEC_PROTO_SHIFT 31
+
+#define SQ_TASK_INFO0_TUNNEL_FLAG_MASK 0x1U
+#define SQ_TASK_INFO0_ESP_NEXT_PROTO_MASK 0x3U
+#define SQ_TASK_INFO0_INNER_L4_EN_MASK 0x1U
+#define SQ_TASK_INFO0_INNER_L3_EN_MASK 0x1U
+#define SQ_TASK_INFO0_INNER_L4_PSEUDO_MASK 0x1U
+#define SQ_TASK_INFO0_OUT_L4_EN_MASK 0x1U
+#define SQ_TASK_INFO0_OUT_L3_EN_MASK 0x1U
+#define SQ_TASK_INFO0_OUT_L4_PSEUDO_MASK 0x1U
+#define SQ_TASK_INFO0_ESP_OFFLOAD_MASK 0x1U
+#define SQ_TASK_INFO0_IPSEC_PROTO_MASK 0x1U
+
+#define SQ_TASK_INFO0_SET(val, member) \
+ (((u32)(val) & SQ_TASK_INFO0_##member##_MASK) << \
+ SQ_TASK_INFO0_##member##_SHIFT)
+#define SQ_TASK_INFO0_GET(val, member) \
+ (((val) >> SQ_TASK_INFO0_##member##_SHIFT) & \
+ SQ_TASK_INFO0_##member##_MASK)
+
+#define SQ_TASK_INFO1_SET(val, member) \
+ (((val) & SQ_TASK_INFO1_##member##_MASK) << \
+ SQ_TASK_INFO1_##member##_SHIFT)
+#define SQ_TASK_INFO1_GET(val, member) \
+ (((val) >> SQ_TASK_INFO1_##member##_SHIFT) & \
+ SQ_TASK_INFO1_##member##_MASK)
+
+#define SQ_TASK_INFO3_VLAN_TAG_SHIFT 0
+#define SQ_TASK_INFO3_VLAN_TYPE_SHIFT 16
+#define SQ_TASK_INFO3_VLAN_TAG_VALID_SHIFT 19
+
+#define SQ_TASK_INFO3_VLAN_TAG_MASK 0xFFFFU
+#define SQ_TASK_INFO3_VLAN_TYPE_MASK 0x7U
+#define SQ_TASK_INFO3_VLAN_TAG_VALID_MASK 0x1U
+
+#define SQ_TASK_INFO3_SET(val, member) \
+ (((val) & SQ_TASK_INFO3_##member##_MASK) << \
+ SQ_TASK_INFO3_##member##_SHIFT)
+#define SQ_TASK_INFO3_GET(val, member) \
+ (((val) >> SQ_TASK_INFO3_##member##_SHIFT) & \
+ SQ_TASK_INFO3_##member##_MASK)
+
+#ifdef static
+#undef static
+#define LLT_STATIC_DEF_SAVED
+#endif
+
+static inline u32 hinic3_get_pkt_len_for_super_cqe(const struct hinic3_rq_cqe *cqe,
+						   bool last)
+{
+	u32 pkt_len = hinic3_hw_cpu32(cqe->pkt_info);
+
+	if (!last)
+		return RQ_CQE_PKT_LEN_GET(pkt_len, FIRST_LEN);
+
+	return RQ_CQE_PKT_LEN_GET(pkt_len, LAST_LEN);
+}
+
+/* *
+ * hinic3_set_vlan_tx_offload - set vlan offload info
+ * @task: wqe task section
+ * @vlan_tag: vlan tag
+ * @vlan_type: 0--select TPID0 in IPSU, 1--select TPID0 in IPSU
+ * 2--select TPID2 in IPSU, 3--select TPID3 in IPSU, 4--select TPID4 in IPSU
+ */
+static inline void hinic3_set_vlan_tx_offload(struct hinic3_sq_task *task,
+ u16 vlan_tag, u8 vlan_type)
+{
+ task->vlan_offload = SQ_TASK_INFO3_SET(vlan_tag, VLAN_TAG) |
+ SQ_TASK_INFO3_SET(vlan_type, VLAN_TYPE) |
+ SQ_TASK_INFO3_SET(1U, VLAN_TAG_VALID);
+}
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_ntuple.c b/drivers/net/ethernet/huawei/hinic3/hinic3_ntuple.c
new file mode 100644
index 000000000000..b992defdea6d
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_ntuple.c
@@ -0,0 +1,907 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC]" fmt
+
+#include <linux/kernel.h>
+#include <linux/etherdevice.h>
+#include <linux/netdevice.h>
+#include <linux/device.h>
+#include <linux/ethtool.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/types.h>
+#include <linux/errno.h>
+
+#include "ossl_knl.h"
+#include "hinic3_crm.h"
+#include "hinic3_nic_cfg.h"
+#include "hinic3_nic_dev.h"
+
+#define MAX_NUM_OF_ETHTOOL_NTUPLE_RULES BIT(9)
+struct hinic3_ethtool_rx_flow_rule {
+ struct list_head list;
+ struct ethtool_rx_flow_spec flow_spec;
+};
+
+static void tcam_translate_key_y(u8 *key_y, const u8 *src_input, const u8 *mask, u8 len)
+{
+ u8 idx;
+
+ for (idx = 0; idx < len; idx++)
+ key_y[idx] = src_input[idx] & mask[idx];
+}
+
+static void tcam_translate_key_x(u8 *key_x, const u8 *key_y, const u8 *mask, u8 len)
+{
+ u8 idx;
+
+ for (idx = 0; idx < len; idx++)
+ key_x[idx] = key_y[idx] ^ mask[idx];
+}
+
+static void tcam_key_calculate(struct tag_tcam_key *tcam_key,
+ struct nic_tcam_cfg_rule *fdir_tcam_rule)
+{
+ tcam_translate_key_y(fdir_tcam_rule->key.y,
+ (u8 *)(&tcam_key->key_info),
+ (u8 *)(&tcam_key->key_mask), TCAM_FLOW_KEY_SIZE);
+ tcam_translate_key_x(fdir_tcam_rule->key.x, fdir_tcam_rule->key.y,
+ (u8 *)(&tcam_key->key_mask), TCAM_FLOW_KEY_SIZE);
+}
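+
+/*
+ * Worked example (explanatory only): TCAM entries are stored as an (x, y)
+ * pair where y = value & mask and x = y ^ mask. For value 0xA5 and mask
+ * 0xF0 this gives y = 0xA0 and x = 0x50, which lets the hardware treat
+ * masked-out bits as don't-care when matching.
+ */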
+
+#define TCAM_IPV4_TYPE 0
+#define TCAM_IPV6_TYPE 1
+
+static int hinic3_base_ipv4_parse(struct hinic3_nic_dev *nic_dev,
+ struct ethtool_rx_flow_spec *fs,
+ struct tag_tcam_key *tcam_key)
+{
+ struct ethtool_tcpip4_spec *mask = &fs->m_u.tcp_ip4_spec;
+ struct ethtool_tcpip4_spec *val = &fs->h_u.tcp_ip4_spec;
+ u32 temp;
+
+ switch (mask->ip4src) {
+ case U32_MAX:
+ temp = ntohl(val->ip4src);
+ tcam_key->key_info.sipv4_h = high_16_bits(temp);
+ tcam_key->key_info.sipv4_l = low_16_bits(temp);
+
+ tcam_key->key_mask.sipv4_h = U16_MAX;
+ tcam_key->key_mask.sipv4_l = U16_MAX;
+ break;
+ case 0:
+ break;
+
+ default:
+ nicif_err(nic_dev, drv, nic_dev->netdev, "invalid src_ip mask\n");
+ return -EINVAL;
+ }
+
+ switch (mask->ip4dst) {
+ case U32_MAX:
+ temp = ntohl(val->ip4dst);
+ tcam_key->key_info.dipv4_h = high_16_bits(temp);
+ tcam_key->key_info.dipv4_l = low_16_bits(temp);
+
+ tcam_key->key_mask.dipv4_h = U16_MAX;
+ tcam_key->key_mask.dipv4_l = U16_MAX;
+ break;
+ case 0:
+ break;
+
+ default:
+ nicif_err(nic_dev, drv, nic_dev->netdev, "invalid src_ip mask\n");
+ return -EINVAL;
+ }
+
+ tcam_key->key_info.ip_type = TCAM_IPV4_TYPE;
+ tcam_key->key_mask.ip_type = TCAM_IP_TYPE_MASK;
+
+ tcam_key->key_info.function_id = hinic3_global_func_id(nic_dev->hwdev);
+ tcam_key->key_mask.function_id = TCAM_FUNC_ID_MASK;
+
+ return 0;
+}
+
+static int hinic3_fdir_tcam_ipv4_l4_init(struct hinic3_nic_dev *nic_dev,
+ struct ethtool_rx_flow_spec *fs,
+ struct tag_tcam_key *tcam_key)
+{
+ struct ethtool_tcpip4_spec *l4_mask = &fs->m_u.tcp_ip4_spec;
+ struct ethtool_tcpip4_spec *l4_val = &fs->h_u.tcp_ip4_spec;
+ int err;
+
+ err = hinic3_base_ipv4_parse(nic_dev, fs, tcam_key);
+ if (err)
+ return err;
+
+ tcam_key->key_info.dport = ntohs(l4_val->pdst);
+ tcam_key->key_mask.dport = l4_mask->pdst;
+
+ tcam_key->key_info.sport = ntohs(l4_val->psrc);
+ tcam_key->key_mask.sport = l4_mask->psrc;
+
+ if (fs->flow_type == TCP_V4_FLOW)
+ tcam_key->key_info.ip_proto = IPPROTO_TCP;
+ else
+ tcam_key->key_info.ip_proto = IPPROTO_UDP;
+ tcam_key->key_mask.ip_proto = U8_MAX;
+
+ return 0;
+}
+
+static int hinic3_fdir_tcam_ipv4_init(struct hinic3_nic_dev *nic_dev,
+ struct ethtool_rx_flow_spec *fs,
+ struct tag_tcam_key *tcam_key)
+{
+ struct ethtool_usrip4_spec *l3_mask = &fs->m_u.usr_ip4_spec;
+ struct ethtool_usrip4_spec *l3_val = &fs->h_u.usr_ip4_spec;
+ int err;
+
+ err = hinic3_base_ipv4_parse(nic_dev, fs, tcam_key);
+ if (err)
+ return err;
+
+ tcam_key->key_info.ip_proto = l3_val->proto;
+ tcam_key->key_mask.ip_proto = l3_mask->proto;
+
+ return 0;
+}
+
+#ifndef UNSUPPORT_NTUPLE_IPV6
+enum ipv6_parse_res {
+ IPV6_MASK_INVALID,
+ IPV6_MASK_ALL_MASK,
+ IPV6_MASK_ALL_ZERO,
+};
+
+enum ipv6_index {
+ IPV6_IDX0,
+ IPV6_IDX1,
+ IPV6_IDX2,
+ IPV6_IDX3,
+};
+
+static int ipv6_mask_parse(const u32 *ipv6_mask)
+{
+ if (ipv6_mask[IPV6_IDX0] == 0 && ipv6_mask[IPV6_IDX1] == 0 &&
+ ipv6_mask[IPV6_IDX2] == 0 && ipv6_mask[IPV6_IDX3] == 0)
+ return IPV6_MASK_ALL_ZERO;
+
+ if (ipv6_mask[IPV6_IDX0] == U32_MAX &&
+ ipv6_mask[IPV6_IDX1] == U32_MAX &&
+ ipv6_mask[IPV6_IDX2] == U32_MAX && ipv6_mask[IPV6_IDX3] == U32_MAX)
+ return IPV6_MASK_ALL_MASK;
+
+ return IPV6_MASK_INVALID;
+}
+
+static int hinic3_base_ipv6_parse(struct hinic3_nic_dev *nic_dev,
+ struct ethtool_rx_flow_spec *fs,
+ struct tag_tcam_key *tcam_key)
+{
+ struct ethtool_tcpip6_spec *mask = &fs->m_u.tcp_ip6_spec;
+ struct ethtool_tcpip6_spec *val = &fs->h_u.tcp_ip6_spec;
+ int parse_res;
+ u32 temp;
+
+ parse_res = ipv6_mask_parse((u32 *)mask->ip6src);
+ if (parse_res == IPV6_MASK_ALL_MASK) {
+ temp = ntohl(val->ip6src[IPV6_IDX0]);
+ tcam_key->key_info_ipv6.sipv6_key0 = high_16_bits(temp);
+ tcam_key->key_info_ipv6.sipv6_key1 = low_16_bits(temp);
+ temp = ntohl(val->ip6src[IPV6_IDX1]);
+ tcam_key->key_info_ipv6.sipv6_key2 = high_16_bits(temp);
+ tcam_key->key_info_ipv6.sipv6_key3 = low_16_bits(temp);
+ temp = ntohl(val->ip6src[IPV6_IDX2]);
+ tcam_key->key_info_ipv6.sipv6_key4 = high_16_bits(temp);
+ tcam_key->key_info_ipv6.sipv6_key5 = low_16_bits(temp);
+ temp = ntohl(val->ip6src[IPV6_IDX3]);
+ tcam_key->key_info_ipv6.sipv6_key6 = high_16_bits(temp);
+ tcam_key->key_info_ipv6.sipv6_key7 = low_16_bits(temp);
+
+ tcam_key->key_mask_ipv6.sipv6_key0 = U16_MAX;
+ tcam_key->key_mask_ipv6.sipv6_key1 = U16_MAX;
+ tcam_key->key_mask_ipv6.sipv6_key2 = U16_MAX;
+ tcam_key->key_mask_ipv6.sipv6_key3 = U16_MAX;
+ tcam_key->key_mask_ipv6.sipv6_key4 = U16_MAX;
+ tcam_key->key_mask_ipv6.sipv6_key5 = U16_MAX;
+ tcam_key->key_mask_ipv6.sipv6_key6 = U16_MAX;
+ tcam_key->key_mask_ipv6.sipv6_key7 = U16_MAX;
+ } else if (parse_res == IPV6_MASK_INVALID) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "invalid src_ipv6 mask\n");
+ return -EINVAL;
+ }
+
+ parse_res = ipv6_mask_parse((u32 *)mask->ip6dst);
+ if (parse_res == IPV6_MASK_ALL_MASK) {
+ temp = ntohl(val->ip6dst[IPV6_IDX0]);
+ tcam_key->key_info_ipv6.dipv6_key0 = high_16_bits(temp);
+ tcam_key->key_info_ipv6.dipv6_key1 = low_16_bits(temp);
+ temp = ntohl(val->ip6dst[IPV6_IDX1]);
+ tcam_key->key_info_ipv6.dipv6_key2 = high_16_bits(temp);
+ tcam_key->key_info_ipv6.dipv6_key3 = low_16_bits(temp);
+ temp = ntohl(val->ip6dst[IPV6_IDX2]);
+ tcam_key->key_info_ipv6.dipv6_key4 = high_16_bits(temp);
+ tcam_key->key_info_ipv6.dipv6_key5 = low_16_bits(temp);
+ temp = ntohl(val->ip6dst[IPV6_IDX3]);
+ tcam_key->key_info_ipv6.dipv6_key6 = high_16_bits(temp);
+ tcam_key->key_info_ipv6.dipv6_key7 = low_16_bits(temp);
+
+ tcam_key->key_mask_ipv6.dipv6_key0 = U16_MAX;
+ tcam_key->key_mask_ipv6.dipv6_key1 = U16_MAX;
+ tcam_key->key_mask_ipv6.dipv6_key2 = U16_MAX;
+ tcam_key->key_mask_ipv6.dipv6_key3 = U16_MAX;
+ tcam_key->key_mask_ipv6.dipv6_key4 = U16_MAX;
+ tcam_key->key_mask_ipv6.dipv6_key5 = U16_MAX;
+ tcam_key->key_mask_ipv6.dipv6_key6 = U16_MAX;
+ tcam_key->key_mask_ipv6.dipv6_key7 = U16_MAX;
+ } else if (parse_res == IPV6_MASK_INVALID) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "invalid dst_ipv6 mask\n");
+ return -EINVAL;
+ }
+
+ tcam_key->key_info_ipv6.ip_type = TCAM_IPV6_TYPE;
+ tcam_key->key_mask_ipv6.ip_type = TCAM_IP_TYPE_MASK;
+
+ tcam_key->key_info_ipv6.function_id =
+ hinic3_global_func_id(nic_dev->hwdev);
+ tcam_key->key_mask_ipv6.function_id = TCAM_FUNC_ID_MASK;
+
+ return 0;
+}
+
+static int hinic3_fdir_tcam_ipv6_l4_init(struct hinic3_nic_dev *nic_dev,
+ struct ethtool_rx_flow_spec *fs,
+ struct tag_tcam_key *tcam_key)
+{
+ struct ethtool_tcpip6_spec *l4_mask = &fs->m_u.tcp_ip6_spec;
+ struct ethtool_tcpip6_spec *l4_val = &fs->h_u.tcp_ip6_spec;
+ int err;
+
+ err = hinic3_base_ipv6_parse(nic_dev, fs, tcam_key);
+ if (err)
+ return err;
+
+ tcam_key->key_info_ipv6.dport = ntohs(l4_val->pdst);
+ tcam_key->key_mask_ipv6.dport = l4_mask->pdst;
+
+ tcam_key->key_info_ipv6.sport = ntohs(l4_val->psrc);
+ tcam_key->key_mask_ipv6.sport = l4_mask->psrc;
+
+ if (fs->flow_type == TCP_V6_FLOW)
+ tcam_key->key_info_ipv6.ip_proto = NEXTHDR_TCP;
+ else
+ tcam_key->key_info_ipv6.ip_proto = NEXTHDR_UDP;
+ tcam_key->key_mask_ipv6.ip_proto = U8_MAX;
+
+ return 0;
+}
+
+static int hinic3_fdir_tcam_ipv6_init(struct hinic3_nic_dev *nic_dev,
+ struct ethtool_rx_flow_spec *fs,
+ struct tag_tcam_key *tcam_key)
+{
+ struct ethtool_usrip6_spec *l3_mask = &fs->m_u.usr_ip6_spec;
+ struct ethtool_usrip6_spec *l3_val = &fs->h_u.usr_ip6_spec;
+ int err;
+
+ err = hinic3_base_ipv6_parse(nic_dev, fs, tcam_key);
+ if (err)
+ return err;
+
+ tcam_key->key_info_ipv6.ip_proto = l3_val->l4_proto;
+ tcam_key->key_mask_ipv6.ip_proto = l3_mask->l4_proto;
+
+ return 0;
+}
+#endif
+
+static int hinic3_fdir_tcam_info_init(struct hinic3_nic_dev *nic_dev,
+ struct ethtool_rx_flow_spec *fs,
+ struct tag_tcam_key *tcam_key,
+ struct nic_tcam_cfg_rule *fdir_tcam_rule)
+{
+ int err;
+
+ switch (fs->flow_type) {
+ case TCP_V4_FLOW:
+ case UDP_V4_FLOW:
+ err = hinic3_fdir_tcam_ipv4_l4_init(nic_dev, fs, tcam_key);
+ if (err)
+ return err;
+ break;
+ case IP_USER_FLOW:
+ err = hinic3_fdir_tcam_ipv4_init(nic_dev, fs, tcam_key);
+ if (err)
+ return err;
+ break;
+#ifndef UNSUPPORT_NTUPLE_IPV6
+ case TCP_V6_FLOW:
+ case UDP_V6_FLOW:
+ err = hinic3_fdir_tcam_ipv6_l4_init(nic_dev, fs, tcam_key);
+ if (err)
+ return err;
+ break;
+ case IPV6_USER_FLOW:
+ err = hinic3_fdir_tcam_ipv6_init(nic_dev, fs, tcam_key);
+ if (err)
+ return err;
+ break;
+#endif
+ default:
+		return -EOPNOTSUPP;
+ }
+
+ tcam_key->key_info.tunnel_type = 0;
+ tcam_key->key_mask.tunnel_type = TCAM_TUNNEL_TYPE_MASK;
+
+ fdir_tcam_rule->data.qid = (u32)fs->ring_cookie;
+ tcam_key_calculate(tcam_key, fdir_tcam_rule);
+
+ return 0;
+}
+
+void hinic3_flush_rx_flow_rule(struct hinic3_nic_dev *nic_dev)
+{
+ struct hinic3_tcam_info *tcam_info = &nic_dev->tcam;
+ struct hinic3_ethtool_rx_flow_rule *eth_rule = NULL;
+ struct hinic3_ethtool_rx_flow_rule *eth_rule_tmp = NULL;
+ struct hinic3_tcam_filter *tcam_iter = NULL;
+ struct hinic3_tcam_filter *tcam_iter_tmp = NULL;
+ struct hinic3_tcam_dynamic_block *block = NULL;
+ struct hinic3_tcam_dynamic_block *block_tmp = NULL;
+ struct list_head *dynamic_list =
+ &tcam_info->tcam_dynamic_info.tcam_dynamic_list;
+
+ if (!list_empty(&tcam_info->tcam_list)) {
+ list_for_each_entry_safe(tcam_iter, tcam_iter_tmp,
+ &tcam_info->tcam_list,
+ tcam_filter_list) {
+ list_del(&tcam_iter->tcam_filter_list);
+ kfree(tcam_iter);
+ }
+ }
+ if (!list_empty(dynamic_list)) {
+ list_for_each_entry_safe(block, block_tmp, dynamic_list,
+ block_list) {
+ list_del(&block->block_list);
+ kfree(block);
+ }
+ }
+
+ if (!list_empty(&nic_dev->rx_flow_rule.rules)) {
+ list_for_each_entry_safe(eth_rule, eth_rule_tmp,
+ &nic_dev->rx_flow_rule.rules, list) {
+ list_del(ð_rule->list);
+ kfree(eth_rule);
+ }
+ }
+
+ if (HINIC3_SUPPORT_FDIR(nic_dev->hwdev)) {
+ hinic3_flush_tcam_rule(nic_dev->hwdev);
+ hinic3_set_fdir_tcam_rule_filter(nic_dev->hwdev, false);
+ }
+}
+
+static struct hinic3_tcam_dynamic_block *
+hinic3_alloc_dynamic_block_resource(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_tcam_info *tcam_info,
+ u16 dynamic_block_id)
+{
+ struct hinic3_tcam_dynamic_block *dynamic_block_ptr = NULL;
+
+ dynamic_block_ptr = kzalloc(sizeof(*dynamic_block_ptr), GFP_KERNEL);
+ if (!dynamic_block_ptr) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "fdir filter dynamic alloc block index %d memory failed\n",
+ dynamic_block_id);
+ return NULL;
+ }
+
+ dynamic_block_ptr->dynamic_block_id = dynamic_block_id;
+ list_add_tail(&dynamic_block_ptr->block_list,
+ &tcam_info->tcam_dynamic_info.tcam_dynamic_list);
+
+ tcam_info->tcam_dynamic_info.dynamic_block_cnt++;
+
+ return dynamic_block_ptr;
+}
+
+static void hinic3_free_dynamic_block_resource(struct hinic3_tcam_info *tcam_info,
+ struct hinic3_tcam_dynamic_block *block_ptr)
+{
+ if (!block_ptr)
+ return;
+
+ list_del(&block_ptr->block_list);
+ kfree(block_ptr);
+
+ tcam_info->tcam_dynamic_info.dynamic_block_cnt--;
+}
+
+static struct hinic3_tcam_dynamic_block *
+hinic3_dynamic_lookup_tcam_filter(struct hinic3_nic_dev *nic_dev,
+ struct nic_tcam_cfg_rule *fdir_tcam_rule,
+ const struct hinic3_tcam_info *tcam_info,
+ struct hinic3_tcam_filter *tcam_filter,
+ u16 *tcam_index)
+{
+	struct hinic3_tcam_dynamic_block *block = NULL;
+	struct hinic3_tcam_dynamic_block *tmp = NULL;
+	u16 index;
+
+	/* find the first dynamic block that still has a free index; tmp
+	 * stays NULL if the list is empty or every block is full
+	 */
+	list_for_each_entry(block,
+			    &tcam_info->tcam_dynamic_info.tcam_dynamic_list,
+			    block_list) {
+		if (block->dynamic_index_cnt < HINIC3_TCAM_DYNAMIC_BLOCK_SIZE) {
+			tmp = block;
+			break;
+		}
+	}
+
+	if (!tmp) {
+		nicif_err(nic_dev, drv, nic_dev->netdev, "Fdir filter dynamic lookup for index failed\n");
+		return NULL;
+	}
+
+ for (index = 0; index < HINIC3_TCAM_DYNAMIC_BLOCK_SIZE; index++)
+ if (tmp->dynamic_index_used[index] == 0)
+ break;
+
+ if (index == HINIC3_TCAM_DYNAMIC_BLOCK_SIZE) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "tcam block 0x%x supports filter rules is full\n",
+ tmp->dynamic_block_id);
+ return NULL;
+ }
+
+ tcam_filter->dynamic_block_id = tmp->dynamic_block_id;
+ tcam_filter->index = index;
+ *tcam_index = index;
+
+ fdir_tcam_rule->index = index +
+ HINIC3_PKT_TCAM_DYNAMIC_INDEX_START(tmp->dynamic_block_id);
+
+ return tmp;
+}
+
+static int hinic3_add_tcam_filter(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_tcam_filter *tcam_filter,
+ struct nic_tcam_cfg_rule *fdir_tcam_rule)
+{
+ struct hinic3_tcam_info *tcam_info = &nic_dev->tcam;
+ struct hinic3_tcam_dynamic_block *dynamic_block_ptr = NULL;
+ struct hinic3_tcam_dynamic_block *tmp = NULL;
+ u16 block_cnt = tcam_info->tcam_dynamic_info.dynamic_block_cnt;
+ u16 tcam_block_index = 0;
+ int block_alloc_flag = 0;
+ u16 index = 0;
+ int err;
+
+ if (tcam_info->tcam_rule_nums >=
+ block_cnt * HINIC3_TCAM_DYNAMIC_BLOCK_SIZE) {
+ if (block_cnt >= (HINIC3_MAX_TCAM_FILTERS /
+ HINIC3_TCAM_DYNAMIC_BLOCK_SIZE)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Dynamic tcam block is full, alloc failed\n");
+ goto failed;
+ }
+
+ err = hinic3_alloc_tcam_block(nic_dev->hwdev,
+ &tcam_block_index);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Fdir filter dynamic tcam alloc block failed\n");
+ goto failed;
+ }
+
+ block_alloc_flag = 1;
+
+ dynamic_block_ptr =
+ hinic3_alloc_dynamic_block_resource(nic_dev, tcam_info,
+ tcam_block_index);
+ if (!dynamic_block_ptr) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Fdir filter dynamic alloc block memory failed\n");
+ goto block_alloc_failed;
+ }
+ }
+
+ tmp = hinic3_dynamic_lookup_tcam_filter(nic_dev,
+ fdir_tcam_rule, tcam_info,
+ tcam_filter, &index);
+ if (!tmp) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Dynamic lookup tcam filter failed\n");
+ goto lookup_tcam_index_failed;
+ }
+
+ err = hinic3_add_tcam_rule(nic_dev->hwdev, fdir_tcam_rule);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Fdir_tcam_rule add failed\n");
+ goto add_tcam_rules_failed;
+ }
+
+ nicif_info(nic_dev, drv, nic_dev->netdev,
+ "Add fdir tcam rule, function_id: 0x%x, tcam_block_id: %d, local_index: %d, global_index: %d, queue: %d, tcam_rule_nums: %d succeed\n",
+ hinic3_global_func_id(nic_dev->hwdev),
+ tcam_filter->dynamic_block_id, index, fdir_tcam_rule->index,
+ fdir_tcam_rule->data.qid, tcam_info->tcam_rule_nums + 1);
+
+ if (tcam_info->tcam_rule_nums == 0) {
+ err = hinic3_set_fdir_tcam_rule_filter(nic_dev->hwdev, true);
+ if (err)
+ goto enable_failed;
+ }
+
+ list_add_tail(&tcam_filter->tcam_filter_list, &tcam_info->tcam_list);
+
+ tmp->dynamic_index_used[index] = 1;
+ tmp->dynamic_index_cnt++;
+
+ tcam_info->tcam_rule_nums++;
+
+ return 0;
+
+enable_failed:
+ hinic3_del_tcam_rule(nic_dev->hwdev, fdir_tcam_rule->index);
+
+add_tcam_rules_failed:
+lookup_tcam_index_failed:
+ if (block_alloc_flag == 1)
+ hinic3_free_dynamic_block_resource(tcam_info,
+ dynamic_block_ptr);
+
+block_alloc_failed:
+ if (block_alloc_flag == 1)
+ hinic3_free_tcam_block(nic_dev->hwdev, &tcam_block_index);
+
+failed:
+ return -EFAULT;
+}
+
+static int hinic3_del_tcam_filter(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_tcam_filter *tcam_filter)
+{
+ struct hinic3_tcam_info *tcam_info = &nic_dev->tcam;
+ u16 dynamic_block_id = tcam_filter->dynamic_block_id;
+	struct hinic3_tcam_dynamic_block *block = NULL;
+	struct hinic3_tcam_dynamic_block *tmp = NULL;
+	u32 index = 0;
+	int err;
+
+	/* tmp stays NULL if no block with a matching id is found */
+	list_for_each_entry(block,
+			    &tcam_info->tcam_dynamic_info.tcam_dynamic_list,
+			    block_list) {
+		if (block->dynamic_block_id == dynamic_block_id) {
+			tmp = block;
+			break;
+		}
+	}
+	if (!tmp) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Fdir filter del dynamic lookup for block failed\n");
+ return -EFAULT;
+ }
+
+ index = HINIC3_PKT_TCAM_DYNAMIC_INDEX_START(tmp->dynamic_block_id) +
+ tcam_filter->index;
+
+ err = hinic3_del_tcam_rule(nic_dev->hwdev, index);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "fdir_tcam_rule del failed\n");
+ return -EFAULT;
+ }
+
+ nicif_info(nic_dev, drv, nic_dev->netdev,
+ "Del fdir_tcam_dynamic_rule function_id: 0x%x, tcam_block_id: %d, local_index: %d, global_index: %d, local_rules_nums: %d, global_rule_nums: %d succeed\n",
+ hinic3_global_func_id(nic_dev->hwdev), dynamic_block_id,
+ tcam_filter->index, index, tmp->dynamic_index_cnt - 1,
+ tcam_info->tcam_rule_nums - 1);
+
+ tmp->dynamic_index_used[tcam_filter->index] = 0;
+ tmp->dynamic_index_cnt--;
+ tcam_info->tcam_rule_nums--;
+ if (tmp->dynamic_index_cnt == 0) {
+ hinic3_free_tcam_block(nic_dev->hwdev, &dynamic_block_id);
+ hinic3_free_dynamic_block_resource(tcam_info, tmp);
+ }
+
+ if (tcam_info->tcam_rule_nums == 0)
+ hinic3_set_fdir_tcam_rule_filter(nic_dev->hwdev, false);
+
+ list_del(&tcam_filter->tcam_filter_list);
+ kfree(tcam_filter);
+
+ return 0;
+}
+
+static inline struct hinic3_tcam_filter *
+hinic3_tcam_filter_lookup(const struct list_head *filter_list,
+ struct tag_tcam_key *key)
+{
+ struct hinic3_tcam_filter *iter;
+
+ list_for_each_entry(iter, filter_list, tcam_filter_list) {
+ if (memcmp(key, &iter->tcam_key,
+ sizeof(struct tag_tcam_key)) == 0) {
+ return iter;
+ }
+ }
+
+ return NULL;
+}
+
+static void del_ethtool_rule(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_ethtool_rx_flow_rule *eth_rule)
+{
+ list_del(ð_rule->list);
+ nic_dev->rx_flow_rule.tot_num_rules--;
+
+ kfree(eth_rule);
+}
+
+static int hinic3_remove_one_rule(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_ethtool_rx_flow_rule *eth_rule)
+{
+ struct hinic3_tcam_info *tcam_info = &nic_dev->tcam;
+ struct hinic3_tcam_filter *tcam_filter;
+ struct nic_tcam_cfg_rule fdir_tcam_rule;
+ struct tag_tcam_key tcam_key;
+ int err;
+
+ memset(&fdir_tcam_rule, 0, sizeof(fdir_tcam_rule));
+ memset(&tcam_key, 0, sizeof(tcam_key));
+
+ err = hinic3_fdir_tcam_info_init(nic_dev, ð_rule->flow_spec,
+ &tcam_key, &fdir_tcam_rule);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Init fdir info failed\n");
+ return err;
+ }
+
+ tcam_filter = hinic3_tcam_filter_lookup(&tcam_info->tcam_list,
+ &tcam_key);
+ if (!tcam_filter) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Filter does not exists\n");
+ return -EEXIST;
+ }
+
+ err = hinic3_del_tcam_filter(nic_dev, tcam_filter);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Delete tcam filter failed\n");
+ return err;
+ }
+
+ del_ethtool_rule(nic_dev, eth_rule);
+
+ return 0;
+}
+
+static void add_rule_to_list(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_ethtool_rx_flow_rule *rule)
+{
+ struct hinic3_ethtool_rx_flow_rule *iter = NULL;
+ struct list_head *head = &nic_dev->rx_flow_rule.rules;
+
+ list_for_each_entry(iter, &nic_dev->rx_flow_rule.rules, list) {
+ if (iter->flow_spec.location > rule->flow_spec.location)
+ break;
+ head = &iter->list;
+ }
+ nic_dev->rx_flow_rule.tot_num_rules++;
+ list_add(&rule->list, head);
+}
+
+static int hinic3_add_one_rule(struct hinic3_nic_dev *nic_dev,
+ struct ethtool_rx_flow_spec *fs)
+{
+ struct nic_tcam_cfg_rule fdir_tcam_rule;
+ struct tag_tcam_key tcam_key;
+ struct hinic3_ethtool_rx_flow_rule *eth_rule = NULL;
+ struct hinic3_tcam_filter *tcam_filter = NULL;
+ struct hinic3_tcam_info *tcam_info = &nic_dev->tcam;
+ int err;
+
+ memset(&fdir_tcam_rule, 0, sizeof(fdir_tcam_rule));
+ memset(&tcam_key, 0, sizeof(tcam_key));
+ err = hinic3_fdir_tcam_info_init(nic_dev, fs, &tcam_key,
+ &fdir_tcam_rule);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Init fdir info failed\n");
+ return err;
+ }
+
+ tcam_filter = hinic3_tcam_filter_lookup(&tcam_info->tcam_list,
+ &tcam_key);
+ if (tcam_filter) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Filter exists\n");
+ return -EEXIST;
+ }
+
+ tcam_filter = kzalloc(sizeof(*tcam_filter), GFP_KERNEL);
+ if (!tcam_filter)
+ return -ENOMEM;
+ memcpy(&tcam_filter->tcam_key,
+ &tcam_key, sizeof(struct tag_tcam_key));
+ tcam_filter->queue = (u16)fdir_tcam_rule.data.qid;
+
+ err = hinic3_add_tcam_filter(nic_dev, tcam_filter, &fdir_tcam_rule);
+ if (err)
+ goto add_tcam_filter_fail;
+
+	/* save the new rule in the driver's filter list */
+ eth_rule = kzalloc(sizeof(*eth_rule), GFP_KERNEL);
+ if (!eth_rule) {
+ err = -ENOMEM;
+ goto alloc_eth_rule_fail;
+ }
+
+ eth_rule->flow_spec = *fs;
+ add_rule_to_list(nic_dev, eth_rule);
+
+ return 0;
+
+alloc_eth_rule_fail:
+ hinic3_del_tcam_filter(nic_dev, tcam_filter);
+add_tcam_filter_fail:
+ kfree(tcam_filter);
+ return err;
+}
+
+static struct hinic3_ethtool_rx_flow_rule *
+find_ethtool_rule(const struct hinic3_nic_dev *nic_dev, u32 location)
+{
+ struct hinic3_ethtool_rx_flow_rule *iter = NULL;
+
+ list_for_each_entry(iter, &nic_dev->rx_flow_rule.rules, list) {
+ if (iter->flow_spec.location == location)
+ return iter;
+ }
+ return NULL;
+}
+
+static int validate_flow(struct hinic3_nic_dev *nic_dev,
+ const struct ethtool_rx_flow_spec *fs)
+{
+ if (fs->location >= MAX_NUM_OF_ETHTOOL_NTUPLE_RULES) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "loc exceed limit[0,%lu]\n",
+ MAX_NUM_OF_ETHTOOL_NTUPLE_RULES);
+ return -EINVAL;
+ }
+
+ if (fs->ring_cookie >= nic_dev->q_params.num_qps) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "action is larger than queue number %u\n",
+ nic_dev->q_params.num_qps);
+ return -EINVAL;
+ }
+
+ switch (fs->flow_type) {
+ case TCP_V4_FLOW:
+ case UDP_V4_FLOW:
+ case IP_USER_FLOW:
+#ifndef UNSUPPORT_NTUPLE_IPV6
+ case TCP_V6_FLOW:
+ case UDP_V6_FLOW:
+ case IPV6_USER_FLOW:
+#endif
+ break;
+ default:
+ nicif_err(nic_dev, drv, nic_dev->netdev, "flow type is not supported\n");
+		return -EOPNOTSUPP;
+ }
+
+ return 0;
+}
+
+int hinic3_ethtool_flow_replace(struct hinic3_nic_dev *nic_dev,
+ struct ethtool_rx_flow_spec *fs)
+{
+ struct hinic3_ethtool_rx_flow_rule *eth_rule = NULL;
+ struct ethtool_rx_flow_spec flow_spec_temp;
+ int loc_exit_flag = 0;
+ int err;
+
+ if (!HINIC3_SUPPORT_FDIR(nic_dev->hwdev)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Unsupported ntuple function\n");
+ return -EOPNOTSUPP;
+ }
+
+ err = validate_flow(nic_dev, fs);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "flow is not valid %d\n", err);
+ return err;
+ }
+
+ eth_rule = find_ethtool_rule(nic_dev, fs->location);
+	/* if a rule already exists at this location, delete it first */
+ if (eth_rule) {
+ memcpy(&flow_spec_temp, ð_rule->flow_spec,
+ sizeof(struct ethtool_rx_flow_spec));
+ err = hinic3_remove_one_rule(nic_dev, eth_rule);
+ if (err)
+ return err;
+
+ loc_exit_flag = 1;
+ }
+
+ /* add new rule filter */
+ err = hinic3_add_one_rule(nic_dev, fs);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Add new rule filter failed\n");
+ if (loc_exit_flag)
+ hinic3_add_one_rule(nic_dev, &flow_spec_temp);
+
+ return -ENOENT;
+ }
+
+ return 0;
+}
+
+int hinic3_ethtool_flow_remove(struct hinic3_nic_dev *nic_dev, u32 location)
+{
+ struct hinic3_ethtool_rx_flow_rule *eth_rule = NULL;
+ int err;
+
+ if (!HINIC3_SUPPORT_FDIR(nic_dev->hwdev)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Unsupported ntuple function\n");
+ return -EOPNOTSUPP;
+ }
+
+ if (location >= MAX_NUM_OF_ETHTOOL_NTUPLE_RULES)
+ return -ENOSPC;
+
+ eth_rule = find_ethtool_rule(nic_dev, location);
+ if (!eth_rule)
+ return -ENOENT;
+
+ err = hinic3_remove_one_rule(nic_dev, eth_rule);
+
+ return err;
+}
+
+int hinic3_ethtool_get_flow(const struct hinic3_nic_dev *nic_dev,
+ struct ethtool_rxnfc *info, u32 location)
+{
+ struct hinic3_ethtool_rx_flow_rule *eth_rule = NULL;
+
+ if (!HINIC3_SUPPORT_FDIR(nic_dev->hwdev)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Unsupported ntuple function\n");
+ return -EOPNOTSUPP;
+ }
+
+ if (location >= MAX_NUM_OF_ETHTOOL_NTUPLE_RULES)
+ return -EINVAL;
+
+ list_for_each_entry(eth_rule, &nic_dev->rx_flow_rule.rules, list) {
+ if (eth_rule->flow_spec.location == location) {
+ info->fs = eth_rule->flow_spec;
+ return 0;
+ }
+ }
+
+ return -ENOENT;
+}
+
+int hinic3_ethtool_get_all_flows(const struct hinic3_nic_dev *nic_dev,
+ struct ethtool_rxnfc *info, u32 *rule_locs)
+{
+ int idx = 0;
+ struct hinic3_ethtool_rx_flow_rule *eth_rule = NULL;
+
+ if (!HINIC3_SUPPORT_FDIR(nic_dev->hwdev)) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Unsupported ntuple function\n");
+ return -EOPNOTSUPP;
+ }
+
+ info->data = MAX_NUM_OF_ETHTOOL_NTUPLE_RULES;
+ list_for_each_entry(eth_rule, &nic_dev->rx_flow_rule.rules, list)
+ rule_locs[idx++] = eth_rule->flow_spec.location;
+
+ return info->rule_cnt == idx ? 0 : -ENOENT;
+}
+
+bool hinic3_validate_channel_setting_in_ntuple(const struct hinic3_nic_dev *nic_dev, u32 q_num)
+{
+ struct hinic3_ethtool_rx_flow_rule *iter = NULL;
+
+ list_for_each_entry(iter, &nic_dev->rx_flow_rule.rules, list) {
+ if (iter->flow_spec.ring_cookie >= q_num) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "User defined filter %u assigns flow to queue %llu. Queue number %u is invalid\n",
+ iter->flow_spec.location, iter->flow_spec.ring_cookie, q_num);
+ return false;
+ }
+ }
+
+ return true;
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_profile.h b/drivers/net/ethernet/huawei/hinic3/hinic3_profile.h
new file mode 100644
index 000000000000..a93f3b60e709
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_profile.h
@@ -0,0 +1,146 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_PROFILE_H
+#define HINIC3_PROFILE_H
+
+typedef bool (*hinic3_is_match_prof)(void *device);
+typedef void *(*hinic3_init_prof_attr)(void *device);
+typedef void (*hinic3_deinit_prof_attr)(void *porf_attr);
+
+enum prof_adapter_type {
+ PROF_ADAP_TYPE_INVALID,
+ PROF_ADAP_TYPE_PANGEA = 1,
+
+ /* Add prof adapter type before default */
+ PROF_ADAP_TYPE_DEFAULT,
+};
+
+/**
+ * struct hinic3_prof_adapter - custom scene's profile adapter
+ * @type: adapter type
+ * @match: Check whether the current function is used in the custom scene.
+ * Implemented in the current source file
+ * @init: When @match return true, the initialization function called in probe.
+ * Implemented in the source file of the custom scene
+ * @deinit: When @match return true, the deinitialization function called when
+ * remove. Implemented in the source file of the custom scene
+ */
+struct hinic3_prof_adapter {
+ enum prof_adapter_type type;
+ hinic3_is_match_prof match;
+ hinic3_init_prof_attr init;
+ hinic3_deinit_prof_attr deinit;
+};
+
+#ifdef static
+#undef static
+#define LLT_STATIC_DEF_SAVED
+#endif
+
+/*lint -save -e661 */
+static inline struct hinic3_prof_adapter *
+hinic3_prof_init(void *device, struct hinic3_prof_adapter *adap_objs, int num_adap,
+ void **prof_attr)
+{
+ struct hinic3_prof_adapter *prof_obj = NULL;
+ u16 i;
+
+ for (i = 0; i < num_adap; i++) {
+ prof_obj = &adap_objs[i];
+ if (!(prof_obj->match && prof_obj->match(device)))
+ continue;
+
+ *prof_attr = prof_obj->init ? prof_obj->init(device) : NULL;
+
+ return prof_obj;
+ }
+
+ return NULL;
+}
+
+static inline void hinic3_prof_deinit(struct hinic3_prof_adapter *prof_obj, void *prof_attr)
+{
+ if (!prof_obj)
+ return;
+
+ if (prof_obj->deinit)
+ prof_obj->deinit(prof_attr);
+}
+
+/*lint -restore*/
+
+/* module-level interface */
+#ifdef CONFIG_MODULE_PROF
+struct hinic3_module_ops {
+ int (*module_prof_init)(void);
+ void (*module_prof_exit)(void);
+ void (*probe_fault_process)(void *pdev, u16 level);
+ int (*probe_pre_process)(void *pdev);
+ void (*probe_pre_unprocess)(void *pdev);
+};
+
+struct hinic3_module_ops *hinic3_get_module_prof_ops(void);
+
+static inline void hinic3_probe_fault_process(void *pdev, u16 level)
+{
+ struct hinic3_module_ops *ops = hinic3_get_module_prof_ops();
+
+ if (ops && ops->probe_fault_process)
+ ops->probe_fault_process(pdev, level);
+}
+
+static inline int hinic3_module_pre_init(void)
+{
+ struct hinic3_module_ops *ops = hinic3_get_module_prof_ops();
+
+ if (!ops || !ops->module_prof_init)
+ return -EINVAL;
+
+ return ops->module_prof_init();
+}
+
+static inline void hinic3_module_post_exit(void)
+{
+ struct hinic3_module_ops *ops = hinic3_get_module_prof_ops();
+
+ if (ops && ops->module_prof_exit)
+ ops->module_prof_exit();
+}
+
+static inline int hinic3_probe_pre_process(void *pdev)
+{
+ struct hinic3_module_ops *ops = hinic3_get_module_prof_ops();
+
+ if (!ops || !ops->probe_pre_process)
+ return -EINVAL;
+
+ return ops->probe_pre_process(pdev);
+}
+
+static inline void hinic3_probe_pre_unprocess(void *pdev)
+{
+ struct hinic3_module_ops *ops = hinic3_get_module_prof_ops();
+
+ if (ops && ops->probe_pre_unprocess)
+ ops->probe_pre_unprocess(pdev);
+}
+#else
+static inline void hinic3_probe_fault_process(void *pdev, u16 level) { }
+
+static inline int hinic3_module_pre_init(void)
+{
+ return 0;
+}
+
+static inline void hinic3_module_post_exit(void) { }
+
+static inline int hinic3_probe_pre_process(void *pdev)
+{
+ return 0;
+}
+
+static inline void hinic3_probe_pre_unprocess(void *pdev) { }
+#endif
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_rss.c b/drivers/net/ethernet/huawei/hinic3/hinic3_rss.c
new file mode 100644
index 000000000000..9b31d89e26ce
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_rss.c
@@ -0,0 +1,978 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC]" fmt
+
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/interrupt.h>
+#include <linux/etherdevice.h>
+#include <linux/netdevice.h>
+#include <linux/device.h>
+#include <linux/ethtool.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/dcbnl.h>
+
+#include "ossl_knl.h"
+#include "hinic3_crm.h"
+#include "hinic3_nic_cfg.h"
+#include "hinic3_nic_dev.h"
+#include "hinic3_hw.h"
+#include "hinic3_rss.h"
+
+/*lint -e806*/
+static u16 num_qps;
+module_param(num_qps, ushort, 0444);
+MODULE_PARM_DESC(num_qps, "Number of Queue Pairs (default=0)");
+
+#define MOD_PARA_VALIDATE_NUM_QPS(nic_dev, num_qps, out_qps) do { \
+ if ((num_qps) > (nic_dev)->max_qps) \
+ nic_warn(&(nic_dev)->pdev->dev, \
+ "Module Parameter %s value %u is out of range, " \
+ "Maximum value for the device: %u, using %u\n", \
+ #num_qps, num_qps, (nic_dev)->max_qps, \
+ (nic_dev)->max_qps); \
+ if ((num_qps) > (nic_dev)->max_qps) \
+ (out_qps) = (nic_dev)->max_qps; \
+ else if ((num_qps) > 0) \
+ (out_qps) = (num_qps); \
+} while (0)
+
+/* In rx, iq means cos */
+static u8 hinic3_get_iqmap_by_tc(const u8 *prio_tc, u8 num_iq, u8 tc)
+{
+ u8 i, map = 0;
+
+ for (i = 0; i < num_iq; i++) {
+ if (prio_tc[i] == tc)
+ map |= (u8)(1U << ((num_iq - 1) - i));
+ }
+
+ return map;
+}
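+
+/*
+ * Worked example (the prio_tc values are assumptions): with num_iq = 8
+ * and prio_tc = {0, 0, 1, 1, 2, 2, 3, 3}, tc 1 is carried by priorities
+ * 2 and 3, so the returned bitmap is (1 << 5) | (1 << 4) = 0x30, since
+ * bit (num_iq - 1 - i) is set for each matching priority i.
+ */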
+
+static u8 hinic3_get_tcid_by_rq(const u32 *indir_tbl, u8 num_tcs, u16 rq_id)
+{
+ u16 tc_group_size;
+ int i;
+ u8 temp_num_tcs = num_tcs;
+
+ if (!num_tcs)
+ temp_num_tcs = 1;
+
+ tc_group_size = NIC_RSS_INDIR_SIZE / temp_num_tcs;
+ for (i = 0; i < NIC_RSS_INDIR_SIZE; i++) {
+ if (indir_tbl[i] == rq_id)
+ return (u8)(i / tc_group_size);
+ }
+
+ return 0xFF; /* Invalid TC */
+}
+
+static int hinic3_get_rq2iq_map(struct hinic3_nic_dev *nic_dev,
+ u16 num_rq, u8 num_tcs, u8 *prio_tc, u8 cos_num,
+ u32 *indir_tbl, u8 *map, u32 map_size)
+{
+ u16 qid;
+ u8 tc_id;
+ u8 temp_num_tcs = num_tcs;
+
+ if (!num_tcs)
+ temp_num_tcs = 1;
+
+ if (num_rq > map_size) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Rq number(%u) exceed max map qid(%u)\n",
+ num_rq, map_size);
+ return -EINVAL;
+ }
+
+ if (cos_num < HINIC_NUM_IQ_PER_FUNC) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Cos number(%u) less then map qid(%d)\n",
+ cos_num, HINIC_NUM_IQ_PER_FUNC);
+ return -EINVAL;
+ }
+
+ for (qid = 0; qid < num_rq; qid++) {
+ tc_id = hinic3_get_tcid_by_rq(indir_tbl, temp_num_tcs, qid);
+ map[qid] = hinic3_get_iqmap_by_tc(prio_tc,
+ HINIC_NUM_IQ_PER_FUNC, tc_id);
+ }
+
+ return 0;
+}
+
+static void hinic3_fillout_indir_tbl(struct hinic3_nic_dev *nic_dev, u8 num_cos, u32 *indir)
+{
+ u16 k, group_size, start_qid = 0, qp_num = 0;
+ int i = 0;
+ u8 j, cur_cos = 0, default_cos;
+ u8 valid_cos_map = hinic3_get_dev_valid_cos_map(nic_dev);
+
+ if (num_cos == 0) {
+ for (i = 0; i < NIC_RSS_INDIR_SIZE; i++)
+ indir[i] = i % nic_dev->q_params.num_qps;
+ } else {
+ group_size = NIC_RSS_INDIR_SIZE / num_cos;
+
+ for (j = 0; j < num_cos; j++) {
+ while (cur_cos < NIC_DCB_COS_MAX &&
+ nic_dev->hw_dcb_cfg.cos_qp_num[cur_cos] == 0)
+ cur_cos++;
+
+ if (cur_cos >= NIC_DCB_COS_MAX) {
+ if (BIT(nic_dev->hw_dcb_cfg.default_cos) & valid_cos_map)
+ default_cos = nic_dev->hw_dcb_cfg.default_cos;
+ else
+ default_cos = (u8)fls(valid_cos_map) - 1;
+
+ start_qid = nic_dev->hw_dcb_cfg.cos_qp_offset[default_cos];
+ qp_num = nic_dev->hw_dcb_cfg.cos_qp_num[default_cos];
+ } else {
+ start_qid = nic_dev->hw_dcb_cfg.cos_qp_offset[cur_cos];
+ qp_num = nic_dev->hw_dcb_cfg.cos_qp_num[cur_cos];
+ }
+
+ for (k = 0; k < group_size; k++)
+ indir[i++] = start_qid + k % qp_num;
+
+ cur_cos++;
+ }
+ }
+}
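+
+/*
+ * Worked example (queue counts are assumptions): with rss only
+ * (num_cos == 0) and 4 queue pairs, the indirection table becomes
+ * 0, 1, 2, 3, 0, 1, 2, 3, ... With cos enabled, the table is split into
+ * num_cos equal groups and each group round-robins over the queue range
+ * assigned to that cos.
+ */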
+
+/*lint -e528*/
+int hinic3_rss_init(struct hinic3_nic_dev *nic_dev, u8 *rq2iq_map, u32 map_size, u8 dcb_en)
+{
+ struct net_device *netdev = nic_dev->netdev;
+ u8 i, cos_num;
+ u8 cos_map[NIC_DCB_UP_MAX] = {0};
+ u8 cfg_map[NIC_DCB_UP_MAX] = {0};
+ int err;
+
+ if (dcb_en) {
+ cos_num = hinic3_get_dev_user_cos_num(nic_dev);
+
+ if (nic_dev->hw_dcb_cfg.trust == 0) {
+ memcpy(cfg_map, nic_dev->hw_dcb_cfg.pcp2cos, sizeof(cfg_map));
+ } else if (nic_dev->hw_dcb_cfg.trust == 1) {
+ for (i = 0; i < NIC_DCB_UP_MAX; i++)
+ cfg_map[i] = nic_dev->hw_dcb_cfg.dscp2cos[i * NIC_DCB_DSCP_NUM];
+ }
+#define COS_CHANGE_OFFSET 4
+ for (i = 0; i < COS_CHANGE_OFFSET; i++)
+ cos_map[COS_CHANGE_OFFSET + i] = cfg_map[i];
+
+ for (i = 0; i < COS_CHANGE_OFFSET; i++)
+ cos_map[i] = cfg_map[NIC_DCB_UP_MAX - (i + 1)];
+
+ while (cos_num & (cos_num - 1))
+ cos_num++;
+ } else {
+ cos_num = 0;
+ }
+
+ err = hinic3_set_hw_rss_parameters(netdev, 1, cos_num, cos_map, dcb_en);
+ if (err)
+ return err;
+
+ err = hinic3_get_rq2iq_map(nic_dev, nic_dev->q_params.num_qps, cos_num, cos_map,
+ NIC_DCB_UP_MAX, nic_dev->rss_indir, rq2iq_map, map_size);
+ if (err)
+ nicif_err(nic_dev, drv, netdev, "Failed to get rq map\n");
+ return err;
+}
+
+/*lint -e528*/
+void hinic3_rss_deinit(struct hinic3_nic_dev *nic_dev)
+{
+ u8 cos_map[NIC_DCB_UP_MAX] = {0};
+
+ hinic3_rss_cfg(nic_dev->hwdev, 0, 0, cos_map, 1);
+}
+
+void hinic3_init_rss_parameters(struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+ nic_dev->rss_hash_engine = HINIC3_RSS_HASH_ENGINE_TYPE_XOR;
+ nic_dev->rss_type.tcp_ipv6_ext = 1;
+ nic_dev->rss_type.ipv6_ext = 1;
+ nic_dev->rss_type.tcp_ipv6 = 1;
+ nic_dev->rss_type.ipv6 = 1;
+ nic_dev->rss_type.tcp_ipv4 = 1;
+ nic_dev->rss_type.ipv4 = 1;
+ nic_dev->rss_type.udp_ipv6 = 1;
+ nic_dev->rss_type.udp_ipv4 = 1;
+}
+
+void hinic3_clear_rss_config(struct hinic3_nic_dev *nic_dev)
+{
+ kfree(nic_dev->rss_hkey);
+ nic_dev->rss_hkey = NULL;
+
+ kfree(nic_dev->rss_indir);
+ nic_dev->rss_indir = NULL;
+}
+
+void hinic3_set_default_rss_indir(struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+ set_bit(HINIC3_RSS_DEFAULT_INDIR, &nic_dev->flags);
+}
+
+static void hinic3_maybe_reconfig_rss_indir(struct net_device *netdev, u8 dcb_en)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ int i;
+
+	/* if dcb is enabled, the user cannot configure the rss indir table */
+ if (dcb_en) {
+ nicif_info(nic_dev, drv, netdev, "DCB is enabled, set default rss indir\n");
+ goto discard_user_rss_indir;
+ }
+
+ for (i = 0; i < NIC_RSS_INDIR_SIZE; i++) {
+ if (nic_dev->rss_indir[i] >= nic_dev->q_params.num_qps)
+ goto discard_user_rss_indir;
+ }
+
+ return;
+
+discard_user_rss_indir:
+ hinic3_set_default_rss_indir(netdev);
+}
+
+static void decide_num_qps(struct hinic3_nic_dev *nic_dev)
+{
+ u16 tmp_num_qps = nic_dev->max_qps;
+ u16 num_cpus = 0;
+ int i, node;
+
+ if (nic_dev->nic_cap.default_num_queues != 0 &&
+ nic_dev->nic_cap.default_num_queues < nic_dev->max_qps)
+ tmp_num_qps = nic_dev->nic_cap.default_num_queues;
+
+ MOD_PARA_VALIDATE_NUM_QPS(nic_dev, num_qps, tmp_num_qps);
+
+ for (i = 0; i < (int)num_online_cpus(); i++) {
+ node = (int)cpu_to_node(i);
+ if (node == dev_to_node(&nic_dev->pdev->dev))
+ num_cpus++;
+ }
+
+ if (!num_cpus)
+ num_cpus = (u16)num_online_cpus();
+
+ nic_dev->q_params.num_qps = (u16)min_t(u16, tmp_num_qps, num_cpus);
+}
+
+static void copy_value_to_rss_hkey(struct hinic3_nic_dev *nic_dev,
+ const u8 *hkey)
+{
+ u32 i;
+ u32 *rss_hkey = (u32 *)nic_dev->rss_hkey;
+
+ memcpy(nic_dev->rss_hkey, hkey, NIC_RSS_KEY_SIZE);
+
+ /* make a copy of the key, and convert it to Big Endian */
+ for (i = 0; i < NIC_RSS_KEY_SIZE / sizeof(u32); i++)
+ nic_dev->rss_hkey_be[i] = cpu_to_be32(rss_hkey[i]);
+}
+
+static int alloc_rss_resource(struct hinic3_nic_dev *nic_dev)
+{
+ u8 default_rss_key[NIC_RSS_KEY_SIZE] = {
+ 0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
+ 0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
+ 0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
+ 0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
+ 0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa};
+
+	/* We request double the space for the hash key;
+	 * the second half holds the key in big-endian
+	 * format.
+	 */
+ nic_dev->rss_hkey =
+ kzalloc(NIC_RSS_KEY_SIZE *
+ HINIC3_RSS_KEY_RSV_NUM, GFP_KERNEL);
+ if (!nic_dev->rss_hkey) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to alloc memory for rss_hkey\n");
+ return -ENOMEM;
+ }
+
+	/* The second space is for the big-endian hash key */
+ nic_dev->rss_hkey_be = (u32 *)(nic_dev->rss_hkey +
+ NIC_RSS_KEY_SIZE);
+ copy_value_to_rss_hkey(nic_dev, (u8 *)default_rss_key);
+
+ nic_dev->rss_indir = kzalloc(sizeof(u32) * NIC_RSS_INDIR_SIZE, GFP_KERNEL);
+ if (!nic_dev->rss_indir) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to alloc memory for rss_indir\n");
+ kfree(nic_dev->rss_hkey);
+ nic_dev->rss_hkey = NULL;
+ return -ENOMEM;
+ }
+
+ set_bit(HINIC3_RSS_DEFAULT_INDIR, &nic_dev->flags);
+
+ return 0;
+}
+
+/*lint -e528*/
+void hinic3_try_to_enable_rss(struct hinic3_nic_dev *nic_dev)
+{
+ u8 cos_map[NIC_DCB_UP_MAX] = {0};
+ int err = 0;
+
+ if (!nic_dev)
+ return;
+
+ nic_dev->max_qps = hinic3_func_max_nic_qnum(nic_dev->hwdev);
+ if (nic_dev->max_qps <= 1 || !HINIC3_SUPPORT_RSS(nic_dev->hwdev))
+ goto set_q_params;
+
+ err = alloc_rss_resource(nic_dev);
+ if (err) {
+ nic_dev->max_qps = 1;
+ goto set_q_params;
+ }
+
+ set_bit(HINIC3_RSS_ENABLE, &nic_dev->flags);
+ nic_dev->max_qps = hinic3_func_max_nic_qnum(nic_dev->hwdev);
+
+ decide_num_qps(nic_dev);
+
+ hinic3_init_rss_parameters(nic_dev->netdev);
+ err = hinic3_set_hw_rss_parameters(nic_dev->netdev, 0, 0, cos_map,
+ test_bit(HINIC3_DCB_ENABLE, &nic_dev->flags) ? 1 : 0);
+ if (err) {
+ nic_err(&nic_dev->pdev->dev, "Failed to set hardware rss parameters\n");
+
+ hinic3_clear_rss_config(nic_dev);
+ nic_dev->max_qps = 1;
+ goto set_q_params;
+ }
+ return;
+
+set_q_params:
+ clear_bit(HINIC3_RSS_ENABLE, &nic_dev->flags);
+ nic_dev->q_params.num_qps = nic_dev->max_qps;
+}
+
+static int hinic3_config_rss_hw_resource(struct hinic3_nic_dev *nic_dev,
+ u32 *indir_tbl)
+{
+ int err;
+
+ err = hinic3_rss_set_indir_tbl(nic_dev->hwdev, indir_tbl);
+ if (err)
+ return err;
+
+ err = hinic3_set_rss_type(nic_dev->hwdev, nic_dev->rss_type);
+ if (err)
+ return err;
+
+ return hinic3_rss_set_hash_engine(nic_dev->hwdev,
+ nic_dev->rss_hash_engine);
+}
+
+int hinic3_set_hw_rss_parameters(struct net_device *netdev, u8 rss_en,
+ u8 cos_num, u8 *cos_map, u8 dcb_en)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ int err;
+
+ /* RSS key */
+ err = hinic3_rss_set_hash_key(nic_dev->hwdev, nic_dev->rss_hkey);
+ if (err)
+ return err;
+
+ hinic3_maybe_reconfig_rss_indir(netdev, dcb_en);
+
+ if (test_bit(HINIC3_RSS_DEFAULT_INDIR, &nic_dev->flags))
+ hinic3_fillout_indir_tbl(nic_dev, cos_num, nic_dev->rss_indir);
+
+ err = hinic3_config_rss_hw_resource(nic_dev, nic_dev->rss_indir);
+ if (err)
+ return err;
+
+ err = hinic3_rss_cfg(nic_dev->hwdev, rss_en, cos_num, cos_map,
+ nic_dev->q_params.num_qps);
+ if (err)
+ return err;
+
+ return 0;
+}
+
+/* for ethtool */
+static int set_l4_rss_hash_ops(const struct ethtool_rxnfc *cmd,
+ struct nic_rss_type *rss_type)
+{
+ u8 rss_l4_en = 0;
+
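+	/* L4 port hashing is all-or-nothing: both RXH_L4 halves must be set or cleared together */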
+ switch (cmd->data & (RXH_L4_B_0_1 | RXH_L4_B_2_3)) {
+ case 0:
+ rss_l4_en = 0;
+ break;
+ case (RXH_L4_B_0_1 | RXH_L4_B_2_3):
+ rss_l4_en = 1;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ switch (cmd->flow_type) {
+ case TCP_V4_FLOW:
+ rss_type->tcp_ipv4 = rss_l4_en;
+ break;
+ case TCP_V6_FLOW:
+ rss_type->tcp_ipv6 = rss_l4_en;
+ break;
+ case UDP_V4_FLOW:
+ rss_type->udp_ipv4 = rss_l4_en;
+ break;
+ case UDP_V6_FLOW:
+ rss_type->udp_ipv6 = rss_l4_en;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int update_rss_hash_opts(struct hinic3_nic_dev *nic_dev,
+ struct ethtool_rxnfc *cmd,
+ struct nic_rss_type *rss_type)
+{
+ int err;
+
+ switch (cmd->flow_type) {
+ case TCP_V4_FLOW:
+ case TCP_V6_FLOW:
+ case UDP_V4_FLOW:
+ case UDP_V6_FLOW:
+ err = set_l4_rss_hash_ops(cmd, rss_type);
+ if (err)
+ return err;
+
+ break;
+ case IPV4_FLOW:
+ rss_type->ipv4 = 1;
+ break;
+ case IPV6_FLOW:
+ rss_type->ipv6 = 1;
+ break;
+ default:
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Unsupported flow type\n");
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int hinic3_set_rss_hash_opts(struct hinic3_nic_dev *nic_dev, struct ethtool_rxnfc *cmd)
+{
+ struct nic_rss_type *rss_type = &nic_dev->rss_type;
+ int err;
+
+ if (!test_bit(HINIC3_RSS_ENABLE, &nic_dev->flags)) {
+ cmd->data = 0;
+		nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "RSS is disabled, setting flow-hash is not supported\n");
+ return -EOPNOTSUPP;
+ }
+
+ /* RSS does not support anything other than hashing
+ * to queues on src and dst IPs and ports
+ */
+ if (cmd->data & ~(RXH_IP_SRC | RXH_IP_DST | RXH_L4_B_0_1 |
+ RXH_L4_B_2_3))
+ return -EINVAL;
+
+ /* We need at least the IP SRC and DEST fields for hashing */
+ if (!(cmd->data & RXH_IP_SRC) || !(cmd->data & RXH_IP_DST))
+ return -EINVAL;
+
+ err = hinic3_get_rss_type(nic_dev->hwdev, rss_type);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Failed to get rss type\n");
+ return -EFAULT;
+ }
+
+ err = update_rss_hash_opts(nic_dev, cmd, rss_type);
+ if (err)
+ return err;
+
+ err = hinic3_set_rss_type(nic_dev->hwdev, *rss_type);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to set rss type\n");
+ return -EFAULT;
+ }
+
+ nicif_info(nic_dev, drv, nic_dev->netdev, "Set rss hash options success\n");
+
+ return 0;
+}
+
+static void convert_rss_type(u8 rss_opt, struct ethtool_rxnfc *cmd)
+{
+ if (rss_opt)
+ cmd->data |= RXH_L4_B_0_1 | RXH_L4_B_2_3;
+}
+
+static int hinic3_convert_rss_type(struct hinic3_nic_dev *nic_dev,
+ struct nic_rss_type *rss_type,
+ struct ethtool_rxnfc *cmd)
+{
+ cmd->data = RXH_IP_SRC | RXH_IP_DST;
+ switch (cmd->flow_type) {
+ case TCP_V4_FLOW:
+ convert_rss_type(rss_type->tcp_ipv4, cmd);
+ break;
+ case TCP_V6_FLOW:
+ convert_rss_type(rss_type->tcp_ipv6, cmd);
+ break;
+ case UDP_V4_FLOW:
+ convert_rss_type(rss_type->udp_ipv4, cmd);
+ break;
+ case UDP_V6_FLOW:
+ convert_rss_type(rss_type->udp_ipv6, cmd);
+ break;
+ case IPV4_FLOW:
+ case IPV6_FLOW:
+ break;
+ default:
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Unsupported flow type\n");
+ cmd->data = 0;
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int hinic3_get_rss_hash_opts(struct hinic3_nic_dev *nic_dev, struct ethtool_rxnfc *cmd)
+{
+ struct nic_rss_type rss_type = {0};
+ int err;
+
+ cmd->data = 0;
+
+ if (!test_bit(HINIC3_RSS_ENABLE, &nic_dev->flags))
+ return 0;
+
+ err = hinic3_get_rss_type(nic_dev->hwdev, &rss_type);
+ if (err) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to get rss type\n");
+ return err;
+ }
+
+ return hinic3_convert_rss_type(nic_dev, &rss_type, cmd);
+}
+
+#if (KERNEL_VERSION(3, 4, 24) > LINUX_VERSION_CODE)
+int hinic3_get_rxnfc(struct net_device *netdev,
+ struct ethtool_rxnfc *cmd, void *rule_locs)
+#else
+int hinic3_get_rxnfc(struct net_device *netdev,
+ struct ethtool_rxnfc *cmd, u32 *rule_locs)
+#endif
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ int err = 0;
+
+ switch (cmd->cmd) {
+ case ETHTOOL_GRXRINGS:
+ cmd->data = nic_dev->q_params.num_qps;
+ break;
+ case ETHTOOL_GRXCLSRLCNT:
+ cmd->rule_cnt = (u32)nic_dev->rx_flow_rule.tot_num_rules;
+ break;
+ case ETHTOOL_GRXCLSRULE:
+ err = hinic3_ethtool_get_flow(nic_dev, cmd, cmd->fs.location);
+ break;
+ case ETHTOOL_GRXCLSRLALL:
+ err = hinic3_ethtool_get_all_flows(nic_dev, cmd, rule_locs);
+ break;
+ case ETHTOOL_GRXFH:
+ err = hinic3_get_rss_hash_opts(nic_dev, cmd);
+ break;
+ default:
+ err = -EOPNOTSUPP;
+ break;
+ }
+
+ return err;
+}
+
+int hinic3_set_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ int err = 0;
+
+ switch (cmd->cmd) {
+ case ETHTOOL_SRXFH:
+ err = hinic3_set_rss_hash_opts(nic_dev, cmd);
+ break;
+ case ETHTOOL_SRXCLSRLINS:
+ err = hinic3_ethtool_flow_replace(nic_dev, &cmd->fs);
+ break;
+ case ETHTOOL_SRXCLSRLDEL:
+ err = hinic3_ethtool_flow_remove(nic_dev, cmd->fs.location);
+ break;
+ default:
+ err = -EOPNOTSUPP;
+ break;
+ }
+
+ return err;
+}
+
+static u16 hinic3_max_channels(struct hinic3_nic_dev *nic_dev)
+{
+ u8 tcs = (u8)netdev_get_num_tc(nic_dev->netdev);
+
+ return tcs ? nic_dev->max_qps / tcs : nic_dev->max_qps;
+}
+
+static u16 hinic3_curr_channels(struct hinic3_nic_dev *nic_dev)
+{
+ if (netif_running(nic_dev->netdev))
+ return nic_dev->q_params.num_qps ?
+ nic_dev->q_params.num_qps : 1;
+ else
+ return (u16)min_t(u16, hinic3_max_channels(nic_dev),
+ nic_dev->q_params.num_qps);
+}
+
+void hinic3_get_channels(struct net_device *netdev,
+ struct ethtool_channels *channels)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+ channels->max_rx = 0;
+ channels->max_tx = 0;
+ channels->max_other = 0;
+ /* report maximum channels */
+ channels->max_combined = hinic3_max_channels(nic_dev);
+ channels->rx_count = 0;
+ channels->tx_count = 0;
+ channels->other_count = 0;
+	/* report the current combined channel count */
+ channels->combined_count = hinic3_curr_channels(nic_dev);
+}
+
+static int hinic3_validate_channel_parameter(struct net_device *netdev,
+ const struct ethtool_channels *channels)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ u16 max_channel = hinic3_max_channels(nic_dev);
+ unsigned int count = channels->combined_count;
+
+ if (!count) {
+ nicif_err(nic_dev, drv, netdev,
+ "Unsupported combined_count=0\n");
+ return -EINVAL;
+ }
+
+ if (channels->tx_count || channels->rx_count || channels->other_count) {
+ nicif_err(nic_dev, drv, netdev,
+ "Setting rx/tx/other count not supported\n");
+ return -EINVAL;
+ }
+
+ if (count > max_channel) {
+ nicif_err(nic_dev, drv, netdev,
+ "Combined count %u exceed limit %u\n", count,
+ max_channel);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static void change_num_channel_reopen_handler(struct hinic3_nic_dev *nic_dev,
+ const void *priv_data)
+{
+ hinic3_set_default_rss_indir(nic_dev->netdev);
+}
+
+int hinic3_set_channels(struct net_device *netdev,
+ struct ethtool_channels *channels)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct hinic3_dyna_txrxq_params q_params = {0};
+ unsigned int count = channels->combined_count;
+ int err;
+ u8 user_cos_num = hinic3_get_dev_user_cos_num(nic_dev);
+
+ if (hinic3_validate_channel_parameter(netdev, channels))
+ return -EINVAL;
+
+ if (!test_bit(HINIC3_RSS_ENABLE, &nic_dev->flags)) {
+		nicif_err(nic_dev, drv, netdev,
+			  "This function doesn't support RSS, only 1 queue pair is supported\n");
+ return -EOPNOTSUPP;
+ }
+
+ if (test_bit(HINIC3_DCB_ENABLE, &nic_dev->flags)) {
+ if (count < user_cos_num) {
+			nicif_err(nic_dev, drv, netdev,
+				  "DCB is on, channel count must be at least the valid cos num: %u\n",
+				  user_cos_num);
+
+ return -EOPNOTSUPP;
+ }
+ }
+
+ if (HINIC3_SUPPORT_FDIR(nic_dev->hwdev) &&
+ !hinic3_validate_channel_setting_in_ntuple(nic_dev, count))
+ return -EOPNOTSUPP;
+
+ nicif_info(nic_dev, drv, netdev, "Set max combined queue number from %u to %u\n",
+ nic_dev->q_params.num_qps, count);
+
+ if (netif_running(netdev)) {
+ q_params = nic_dev->q_params;
+ q_params.num_qps = (u16)count;
+ q_params.txqs_res = NULL;
+ q_params.rxqs_res = NULL;
+ q_params.irq_cfg = NULL;
+
+ nicif_info(nic_dev, drv, netdev, "Restarting channel\n");
+ err = hinic3_change_channel_settings(nic_dev, &q_params,
+ change_num_channel_reopen_handler, NULL);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev, "Failed to change channel settings\n");
+ return -EFAULT;
+ }
+ } else {
+ /* Discard user configured rss */
+ hinic3_set_default_rss_indir(netdev);
+ nic_dev->q_params.num_qps = (u16)count;
+ }
+
+ return 0;
+}
+
+#ifndef NOT_HAVE_GET_RXFH_INDIR_SIZE
+u32 hinic3_get_rxfh_indir_size(struct net_device *netdev)
+{
+ return NIC_RSS_INDIR_SIZE;
+}
+#endif
+
+static int set_rss_rxfh(struct net_device *netdev, const u32 *indir,
+ const u8 *key)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ int err;
+
+ if (indir) {
+ err = hinic3_rss_set_indir_tbl(nic_dev->hwdev, indir);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev,
+ "Failed to set rss indir table\n");
+ return -EFAULT;
+ }
+ clear_bit(HINIC3_RSS_DEFAULT_INDIR, &nic_dev->flags);
+
+ memcpy(nic_dev->rss_indir, indir,
+ sizeof(u32) * NIC_RSS_INDIR_SIZE);
+ nicif_info(nic_dev, drv, netdev, "Change rss indir success\n");
+ }
+
+ if (key) {
+ err = hinic3_rss_set_hash_key(nic_dev->hwdev, key);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev, "Failed to set rss key\n");
+ return -EFAULT;
+ }
+
+ copy_value_to_rss_hkey(nic_dev, key);
+ nicif_info(nic_dev, drv, netdev, "Change rss key success\n");
+ }
+
+ return 0;
+}
+
+#if defined(ETHTOOL_GRSSH) && defined(ETHTOOL_SRSSH)
+u32 hinic3_get_rxfh_key_size(struct net_device *netdev)
+{
+ return NIC_RSS_KEY_SIZE;
+}
+
+#ifdef HAVE_RXFH_HASHFUNC
+int hinic3_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key, u8 *hfunc)
+#else
+int hinic3_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key)
+#endif
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ int err = 0;
+
+ if (!test_bit(HINIC3_RSS_ENABLE, &nic_dev->flags)) {
+		nicif_err(nic_dev, drv, nic_dev->netdev, "RSS is disabled\n");
+ return -EOPNOTSUPP;
+ }
+
+#ifdef HAVE_RXFH_HASHFUNC
+ if (hfunc)
+ *hfunc = nic_dev->rss_hash_engine ?
+ ETH_RSS_HASH_TOP : ETH_RSS_HASH_XOR;
+#endif
+
+ if (indir) {
+ err = hinic3_rss_get_indir_tbl(nic_dev->hwdev, indir);
+ if (err)
+ return -EFAULT;
+ }
+
+ if (key)
+ memcpy(key, nic_dev->rss_hkey, NIC_RSS_KEY_SIZE);
+
+ return err;
+}
+
+#ifdef HAVE_RXFH_HASHFUNC
+int hinic3_set_rxfh(struct net_device *netdev, const u32 *indir, const u8 *key,
+ const u8 hfunc)
+#else
+#ifdef HAVE_RXFH_NONCONST
+int hinic3_set_rxfh(struct net_device *netdev, u32 *indir, u8 *key)
+#else
+int hinic3_set_rxfh(struct net_device *netdev, const u32 *indir, const u8 *key)
+#endif
+#endif /* HAVE_RXFH_HASHFUNC */
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ int err = 0;
+
+ if (!test_bit(HINIC3_RSS_ENABLE, &nic_dev->flags)) {
+		nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Setting rss parameters is not supported when RSS is disabled\n");
+ return -EOPNOTSUPP;
+ }
+
+ if (test_bit(HINIC3_DCB_ENABLE, &nic_dev->flags) && indir) {
+		nicif_err(nic_dev, drv, netdev,
+			  "Setting the indir table is not supported when DCB is enabled\n");
+ return -EOPNOTSUPP;
+ }
+
+#ifdef HAVE_RXFH_HASHFUNC
+ if (hfunc != ETH_RSS_HASH_NO_CHANGE) {
+ if (hfunc != ETH_RSS_HASH_TOP && hfunc != ETH_RSS_HASH_XOR) {
+			nicif_err(nic_dev, drv, netdev,
+				  "Only TOP and XOR hfunc types are supported\n");
+ return -EOPNOTSUPP;
+ }
+
+ nic_dev->rss_hash_engine = (hfunc == ETH_RSS_HASH_XOR) ?
+ HINIC3_RSS_HASH_ENGINE_TYPE_XOR :
+ HINIC3_RSS_HASH_ENGINE_TYPE_TOEP;
+ err = hinic3_rss_set_hash_engine(nic_dev->hwdev,
+ nic_dev->rss_hash_engine);
+ if (err)
+ return -EFAULT;
+
+ nicif_info(nic_dev, drv, netdev,
+ "Change hfunc to RSS_HASH_%s success\n",
+ (hfunc == ETH_RSS_HASH_XOR) ? "XOR" : "TOP");
+ }
+#endif
+ err = set_rss_rxfh(netdev, indir, key);
+
+ return err;
+}
+
+#else /* !(defined(ETHTOOL_GRSSH) && defined(ETHTOOL_SRSSH)) */
+
+#ifdef NOT_HAVE_GET_RXFH_INDIR_SIZE
+int hinic3_get_rxfh_indir(struct net_device *netdev,
+ struct ethtool_rxfh_indir *indir1)
+#else
+int hinic3_get_rxfh_indir(struct net_device *netdev, u32 *indir)
+#endif
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ int err = 0;
+#ifdef NOT_HAVE_GET_RXFH_INDIR_SIZE
+ u32 *indir = NULL;
+
+	/* On older kernels (e.g. SUSE 11.2) this interface is called twice:
+	 * the first call returns the size, and the second call fetches the
+	 * rxfh indir table of that size.
+	 */
+ if (indir1->size == 0) {
+ indir1->size = NIC_RSS_INDIR_SIZE;
+ return 0;
+ }
+
+ if (indir1->size < NIC_RSS_INDIR_SIZE) {
+		nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Failed to get rss indir, driver indir size (%d) is larger than the provided size (%u)\n",
+			  NIC_RSS_INDIR_SIZE, indir1->size);
+ return -EINVAL;
+ }
+
+ indir = indir1->ring_index;
+#endif
+ if (!test_bit(HINIC3_RSS_ENABLE, &nic_dev->flags)) {
+		nicif_err(nic_dev, drv, nic_dev->netdev, "RSS is disabled\n");
+ return -EOPNOTSUPP;
+ }
+
+ if (indir)
+ err = hinic3_rss_get_indir_tbl(nic_dev->hwdev, indir);
+
+ return err;
+}
+
+#ifdef NOT_HAVE_GET_RXFH_INDIR_SIZE
+int hinic3_set_rxfh_indir(struct net_device *netdev,
+ const struct ethtool_rxfh_indir *indir1)
+#else
+int hinic3_set_rxfh_indir(struct net_device *netdev, const u32 *indir)
+#endif
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+#ifdef NOT_HAVE_GET_RXFH_INDIR_SIZE
+ const u32 *indir = NULL;
+
+ if (indir1->size != NIC_RSS_INDIR_SIZE) {
+		nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Failed to set rss indir, driver indir size (%d) does not match the provided size (%u)\n",
+			  NIC_RSS_INDIR_SIZE, indir1->size);
+ return -EINVAL;
+ }
+
+ indir = indir1->ring_index;
+#endif
+
+ if (!test_bit(HINIC3_RSS_ENABLE, &nic_dev->flags)) {
+		nicif_err(nic_dev, drv, nic_dev->netdev,
+			  "Setting the rss indir table is not supported when RSS is disabled\n");
+ return -EOPNOTSUPP;
+ }
+
+ if (test_bit(HINIC3_DCB_ENABLE, &nic_dev->flags) && indir) {
+		nicif_err(nic_dev, drv, netdev,
+			  "Setting the indir table is not supported when DCB is enabled\n");
+ return -EOPNOTSUPP;
+ }
+
+ return set_rss_rxfh(netdev, indir, NULL);
+}
+
+#endif /* defined(ETHTOOL_GRSSH) && defined(ETHTOOL_SRSSH) */
+
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_rss.h b/drivers/net/ethernet/huawei/hinic3/hinic3_rss.h
new file mode 100644
index 000000000000..8961cdd095a1
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_rss.h
@@ -0,0 +1,100 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_RSS_H
+#define HINIC3_RSS_H
+
+#include "hinic3_nic_dev.h"
+
+#define HINIC_NUM_IQ_PER_FUNC 8
+
+int hinic3_rss_init(struct hinic3_nic_dev *nic_dev, u8 *rq2iq_map,
+ u32 map_size, u8 dcb_en);
+
+void hinic3_rss_deinit(struct hinic3_nic_dev *nic_dev);
+
+int hinic3_set_hw_rss_parameters(struct net_device *netdev, u8 rss_en,
+ u8 cos_num, u8 *cos_map, u8 dcb_en);
+
+void hinic3_init_rss_parameters(struct net_device *netdev);
+
+void hinic3_set_default_rss_indir(struct net_device *netdev);
+
+void hinic3_try_to_enable_rss(struct hinic3_nic_dev *nic_dev);
+
+void hinic3_clear_rss_config(struct hinic3_nic_dev *nic_dev);
+
+void hinic3_flush_rx_flow_rule(struct hinic3_nic_dev *nic_dev);
+int hinic3_ethtool_get_flow(const struct hinic3_nic_dev *nic_dev,
+ struct ethtool_rxnfc *info, u32 location);
+
+int hinic3_ethtool_get_all_flows(const struct hinic3_nic_dev *nic_dev,
+ struct ethtool_rxnfc *info, u32 *rule_locs);
+
+int hinic3_ethtool_flow_remove(struct hinic3_nic_dev *nic_dev, u32 location);
+
+int hinic3_ethtool_flow_replace(struct hinic3_nic_dev *nic_dev,
+ struct ethtool_rx_flow_spec *fs);
+
+bool hinic3_validate_channel_setting_in_ntuple(const struct hinic3_nic_dev *nic_dev, u32 q_num);
+
+/* for ethtool */
+#if (KERNEL_VERSION(3, 4, 24) > LINUX_VERSION_CODE)
+int hinic3_get_rxnfc(struct net_device *netdev,
+ struct ethtool_rxnfc *cmd, void *rule_locs);
+#else
+int hinic3_get_rxnfc(struct net_device *netdev,
+ struct ethtool_rxnfc *cmd, u32 *rule_locs);
+#endif
+
+int hinic3_set_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd);
+
+void hinic3_get_channels(struct net_device *netdev,
+ struct ethtool_channels *channels);
+
+int hinic3_set_channels(struct net_device *netdev,
+ struct ethtool_channels *channels);
+
+#ifndef NOT_HAVE_GET_RXFH_INDIR_SIZE
+u32 hinic3_get_rxfh_indir_size(struct net_device *netdev);
+#endif /* NOT_HAVE_GET_RXFH_INDIR_SIZE */
+
+#if defined(ETHTOOL_GRSSH) && defined(ETHTOOL_SRSSH)
+u32 hinic3_get_rxfh_key_size(struct net_device *netdev);
+
+#ifdef HAVE_RXFH_HASHFUNC
+int hinic3_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key, u8 *hfunc);
+#else /* HAVE_RXFH_HASHFUNC */
+int hinic3_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key);
+#endif /* HAVE_RXFH_HASHFUNC */
+
+#ifdef HAVE_RXFH_HASHFUNC
+int hinic3_set_rxfh(struct net_device *netdev, const u32 *indir, const u8 *key,
+ const u8 hfunc);
+#else
+#ifdef HAVE_RXFH_NONCONST
+int hinic3_set_rxfh(struct net_device *netdev, u32 *indir, u8 *key);
+#else
+int hinic3_set_rxfh(struct net_device *netdev, const u32 *indir, const u8 *key);
+#endif /* HAVE_RXFH_NONCONST */
+#endif /* HAVE_RXFH_HASHFUNC */
+
+#else /* !(defined(ETHTOOL_GRSSH) && defined(ETHTOOL_SRSSH)) */
+
+#ifdef NOT_HAVE_GET_RXFH_INDIR_SIZE
+int hinic3_get_rxfh_indir(struct net_device *netdev,
+ struct ethtool_rxfh_indir *indir1);
+#else
+int hinic3_get_rxfh_indir(struct net_device *netdev, u32 *indir);
+#endif
+
+#ifdef NOT_HAVE_GET_RXFH_INDIR_SIZE
+int hinic3_set_rxfh_indir(struct net_device *netdev,
+ const struct ethtool_rxfh_indir *indir1);
+#else
+int hinic3_set_rxfh_indir(struct net_device *netdev, const u32 *indir);
+#endif /* NOT_HAVE_GET_RXFH_INDIR_SIZE */
+
+#endif /* (defined(ETHTOOL_GRSSH) && defined(ETHTOOL_SRSSH)) */
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_rss_cfg.c b/drivers/net/ethernet/huawei/hinic3/hinic3_rss_cfg.c
new file mode 100644
index 000000000000..175c4d68b795
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_rss_cfg.c
@@ -0,0 +1,384 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC]" fmt
+
+#include <linux/kernel.h>
+#include <linux/etherdevice.h>
+#include <linux/netdevice.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/dcbnl.h>
+
+#include "ossl_knl.h"
+#include "hinic3_crm.h"
+#include "hinic3_nic_cfg.h"
+#include "hinic3_nic_cmd.h"
+#include "hinic3_hw.h"
+#include "hinic3_nic.h"
+#include "hinic3_common.h"
+
+static int hinic3_rss_cfg_hash_key(struct hinic3_nic_io *nic_io, u8 opcode,
+ u8 *key)
+{
+ struct hinic3_cmd_rss_hash_key hash_key;
+ u16 out_size = sizeof(hash_key);
+ int err;
+
+ memset(&hash_key, 0, sizeof(struct hinic3_cmd_rss_hash_key));
+ hash_key.func_id = hinic3_global_func_id(nic_io->hwdev);
+ hash_key.opcode = opcode;
+
+ if (opcode == HINIC3_CMD_OP_SET)
+ memcpy(hash_key.key, key, NIC_RSS_KEY_SIZE);
+
+ err = l2nic_msg_to_mgmt_sync(nic_io->hwdev,
+ HINIC3_NIC_CMD_CFG_RSS_HASH_KEY,
+ &hash_key, sizeof(hash_key),
+ &hash_key, &out_size);
+ if (err || !out_size || hash_key.msg_head.status) {
+ nic_err(nic_io->dev_hdl, "Failed to %s hash key, err: %d, status: 0x%x, out size: 0x%x\n",
+ opcode == HINIC3_CMD_OP_SET ? "set" : "get",
+ err, hash_key.msg_head.status, out_size);
+ return -EINVAL;
+ }
+
+ if (opcode == HINIC3_CMD_OP_GET)
+ memcpy(key, hash_key.key, NIC_RSS_KEY_SIZE);
+
+ return 0;
+}
+
+int hinic3_rss_set_hash_key(void *hwdev, const u8 *key)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ u8 hash_key[NIC_RSS_KEY_SIZE];
+
+ if (!hwdev || !key)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ memcpy(hash_key, key, NIC_RSS_KEY_SIZE);
+ return hinic3_rss_cfg_hash_key(nic_io, HINIC3_CMD_OP_SET, hash_key);
+}
+
+int hinic3_rss_get_hash_key(void *hwdev, u8 *key)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+ if (!hwdev || !key)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ return hinic3_rss_cfg_hash_key(nic_io, HINIC3_CMD_OP_GET, key);
+}
+
+int hinic3_rss_get_indir_tbl(void *hwdev, u32 *indir_table)
+{
+ struct hinic3_cmd_buf *cmd_buf = NULL;
+ struct hinic3_nic_io *nic_io = NULL;
+ u16 *indir_tbl = NULL;
+ int err, i;
+
+ if (!hwdev || !indir_table)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ cmd_buf = hinic3_alloc_cmd_buf(hwdev);
+ if (!cmd_buf) {
+		nic_err(nic_io->dev_hdl, "Failed to allocate cmd buf\n");
+ return -ENOMEM;
+ }
+
+ cmd_buf->size = sizeof(struct nic_rss_indirect_tbl);
+ err = hinic3_cmdq_detail_resp(hwdev, HINIC3_MOD_L2NIC,
+ HINIC3_UCODE_CMD_GET_RSS_INDIR_TABLE,
+ cmd_buf, cmd_buf, NULL, 0,
+ HINIC3_CHANNEL_NIC);
+ if (err) {
+ nic_err(nic_io->dev_hdl, "Failed to get rss indir table\n");
+ goto get_indir_tbl_failed;
+ }
+
+ indir_tbl = (u16 *)cmd_buf->buf;
+ for (i = 0; i < NIC_RSS_INDIR_SIZE; i++)
+ indir_table[i] = *(indir_tbl + i);
+
+get_indir_tbl_failed:
+ hinic3_free_cmd_buf(hwdev, cmd_buf);
+
+ return err;
+}
+
+int hinic3_rss_set_indir_tbl(void *hwdev, const u32 *indir_table)
+{
+ struct nic_rss_indirect_tbl *indir_tbl = NULL;
+ struct hinic3_cmd_buf *cmd_buf = NULL;
+ struct hinic3_nic_io *nic_io = NULL;
+ u32 *temp = NULL;
+ u32 i, size;
+ u64 out_param = 0;
+ int err;
+
+ if (!hwdev || !indir_table)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ cmd_buf = hinic3_alloc_cmd_buf(hwdev);
+ if (!cmd_buf) {
+ nic_err(nic_io->dev_hdl, "Failed to allocate cmd buf\n");
+ return -ENOMEM;
+ }
+
+ cmd_buf->size = sizeof(struct nic_rss_indirect_tbl);
+ indir_tbl = (struct nic_rss_indirect_tbl *)cmd_buf->buf;
+ memset(indir_tbl, 0, sizeof(*indir_tbl));
+
+ for (i = 0; i < NIC_RSS_INDIR_SIZE; i++)
+ indir_tbl->entry[i] = (u16)(*(indir_table + i));
+
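+	/* convert the table to big-endian 32-bit words before handing it to the hw */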
+ size = sizeof(indir_tbl->entry) / sizeof(u32);
+ temp = (u32 *)indir_tbl->entry;
+ for (i = 0; i < size; i++)
+ temp[i] = cpu_to_be32(temp[i]);
+
+ err = hinic3_cmdq_direct_resp(hwdev, HINIC3_MOD_L2NIC,
+ HINIC3_UCODE_CMD_SET_RSS_INDIR_TABLE,
+ cmd_buf, &out_param, 0,
+ HINIC3_CHANNEL_NIC);
+ if (err || out_param != 0) {
+ nic_err(nic_io->dev_hdl, "Failed to set rss indir table\n");
+ err = -EFAULT;
+ }
+
+ hinic3_free_cmd_buf(hwdev, cmd_buf);
+ return err;
+}
+
+static int hinic3_cmdq_set_rss_type(void *hwdev, struct nic_rss_type rss_type)
+{
+ struct nic_rss_context_tbl *ctx_tbl = NULL;
+ struct hinic3_cmd_buf *cmd_buf = NULL;
+ struct hinic3_nic_io *nic_io = NULL;
+ u32 ctx = 0;
+ u64 out_param = 0;
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ cmd_buf = hinic3_alloc_cmd_buf(hwdev);
+ if (!cmd_buf) {
+ nic_err(nic_io->dev_hdl, "Failed to allocate cmd buf\n");
+ return -ENOMEM;
+ }
+
+ ctx |= HINIC3_RSS_TYPE_SET(1, VALID) |
+ HINIC3_RSS_TYPE_SET(rss_type.ipv4, IPV4) |
+ HINIC3_RSS_TYPE_SET(rss_type.ipv6, IPV6) |
+ HINIC3_RSS_TYPE_SET(rss_type.ipv6_ext, IPV6_EXT) |
+ HINIC3_RSS_TYPE_SET(rss_type.tcp_ipv4, TCP_IPV4) |
+ HINIC3_RSS_TYPE_SET(rss_type.tcp_ipv6, TCP_IPV6) |
+ HINIC3_RSS_TYPE_SET(rss_type.tcp_ipv6_ext, TCP_IPV6_EXT) |
+ HINIC3_RSS_TYPE_SET(rss_type.udp_ipv4, UDP_IPV4) |
+ HINIC3_RSS_TYPE_SET(rss_type.udp_ipv6, UDP_IPV6);
+
+ cmd_buf->size = sizeof(struct nic_rss_context_tbl);
+ ctx_tbl = (struct nic_rss_context_tbl *)cmd_buf->buf;
+ memset(ctx_tbl, 0, sizeof(*ctx_tbl));
+ ctx_tbl->ctx = cpu_to_be32(ctx);
+
+ /* cfg the rss context table by command queue */
+ err = hinic3_cmdq_direct_resp(hwdev, HINIC3_MOD_L2NIC,
+ HINIC3_UCODE_CMD_SET_RSS_CONTEXT_TABLE,
+ cmd_buf, &out_param, 0,
+ HINIC3_CHANNEL_NIC);
+
+ hinic3_free_cmd_buf(hwdev, cmd_buf);
+
+ if (err || out_param != 0) {
+		nic_err(nic_io->dev_hdl, "cmdq set rss context table failed, err: %d\n",
+			err);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+static int hinic3_mgmt_set_rss_type(void *hwdev, struct nic_rss_type rss_type)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+ struct hinic3_rss_context_table ctx_tbl;
+ u32 ctx = 0;
+ u16 out_size = sizeof(ctx_tbl);
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ memset(&ctx_tbl, 0, sizeof(ctx_tbl));
+ ctx_tbl.func_id = hinic3_global_func_id(hwdev);
+ ctx |= HINIC3_RSS_TYPE_SET(1, VALID) |
+ HINIC3_RSS_TYPE_SET(rss_type.ipv4, IPV4) |
+ HINIC3_RSS_TYPE_SET(rss_type.ipv6, IPV6) |
+ HINIC3_RSS_TYPE_SET(rss_type.ipv6_ext, IPV6_EXT) |
+ HINIC3_RSS_TYPE_SET(rss_type.tcp_ipv4, TCP_IPV4) |
+ HINIC3_RSS_TYPE_SET(rss_type.tcp_ipv6, TCP_IPV6) |
+ HINIC3_RSS_TYPE_SET(rss_type.tcp_ipv6_ext, TCP_IPV6_EXT) |
+ HINIC3_RSS_TYPE_SET(rss_type.udp_ipv4, UDP_IPV4) |
+ HINIC3_RSS_TYPE_SET(rss_type.udp_ipv6, UDP_IPV6);
+ ctx_tbl.context = ctx;
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_SET_RSS_CTX_TBL_INTO_FUNC,
+ &ctx_tbl, sizeof(ctx_tbl),
+ &ctx_tbl, &out_size);
+
+ if (ctx_tbl.msg_head.status == HINIC3_MGMT_CMD_UNSUPPORTED) {
+ return HINIC3_MGMT_CMD_UNSUPPORTED;
+ } else if (err || !out_size || ctx_tbl.msg_head.status) {
+ nic_err(nic_io->dev_hdl, "mgmt Failed to set rss context offload, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, ctx_tbl.msg_head.status, out_size);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+int hinic3_set_rss_type(void *hwdev, struct nic_rss_type rss_type)
+{
+ int err;
+
+ err = hinic3_mgmt_set_rss_type(hwdev, rss_type);
+ if (err == HINIC3_MGMT_CMD_UNSUPPORTED)
+ err = hinic3_cmdq_set_rss_type(hwdev, rss_type);
+
+ return err;
+}
+
+int hinic3_get_rss_type(void *hwdev, struct nic_rss_type *rss_type)
+{
+ struct hinic3_rss_context_table ctx_tbl;
+ u16 out_size = sizeof(ctx_tbl);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+ if (!hwdev || !rss_type)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+
+ memset(&ctx_tbl, 0, sizeof(struct hinic3_rss_context_table));
+ ctx_tbl.func_id = hinic3_global_func_id(hwdev);
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_GET_RSS_CTX_TBL,
+ &ctx_tbl, sizeof(ctx_tbl),
+ &ctx_tbl, &out_size);
+ if (err || !out_size || ctx_tbl.msg_head.status) {
+ nic_err(nic_io->dev_hdl, "Failed to get hash type, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, ctx_tbl.msg_head.status, out_size);
+ return -EINVAL;
+ }
+
+ rss_type->ipv4 = HINIC3_RSS_TYPE_GET(ctx_tbl.context, IPV4);
+ rss_type->ipv6 = HINIC3_RSS_TYPE_GET(ctx_tbl.context, IPV6);
+ rss_type->ipv6_ext = HINIC3_RSS_TYPE_GET(ctx_tbl.context, IPV6_EXT);
+ rss_type->tcp_ipv4 = HINIC3_RSS_TYPE_GET(ctx_tbl.context, TCP_IPV4);
+ rss_type->tcp_ipv6 = HINIC3_RSS_TYPE_GET(ctx_tbl.context, TCP_IPV6);
+ rss_type->tcp_ipv6_ext = HINIC3_RSS_TYPE_GET(ctx_tbl.context,
+ TCP_IPV6_EXT);
+ rss_type->udp_ipv4 = HINIC3_RSS_TYPE_GET(ctx_tbl.context, UDP_IPV4);
+ rss_type->udp_ipv6 = HINIC3_RSS_TYPE_GET(ctx_tbl.context, UDP_IPV6);
+
+ return 0;
+}
+
+static int hinic3_rss_cfg_hash_engine(struct hinic3_nic_io *nic_io, u8 opcode,
+ u8 *type)
+{
+ struct hinic3_cmd_rss_engine_type hash_type;
+ u16 out_size = sizeof(hash_type);
+ int err;
+
+ memset(&hash_type, 0, sizeof(struct hinic3_cmd_rss_engine_type));
+
+ hash_type.func_id = hinic3_global_func_id(nic_io->hwdev);
+ hash_type.opcode = opcode;
+
+ if (opcode == HINIC3_CMD_OP_SET)
+ hash_type.hash_engine = *type;
+
+ err = l2nic_msg_to_mgmt_sync(nic_io->hwdev,
+ HINIC3_NIC_CMD_CFG_RSS_HASH_ENGINE,
+ &hash_type, sizeof(hash_type),
+ &hash_type, &out_size);
+ if (err || !out_size || hash_type.msg_head.status) {
+ nic_err(nic_io->dev_hdl, "Failed to %s hash engine, err: %d, status: 0x%x, out size: 0x%x\n",
+ opcode == HINIC3_CMD_OP_SET ? "set" : "get",
+ err, hash_type.msg_head.status, out_size);
+ return -EIO;
+ }
+
+ if (opcode == HINIC3_CMD_OP_GET)
+ *type = hash_type.hash_engine;
+
+ return 0;
+}
+
+int hinic3_rss_set_hash_engine(void *hwdev, u8 type)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ return hinic3_rss_cfg_hash_engine(nic_io, HINIC3_CMD_OP_SET, &type);
+}
+
+int hinic3_rss_get_hash_engine(void *hwdev, u8 *type)
+{
+ struct hinic3_nic_io *nic_io = NULL;
+
+ if (!hwdev || !type)
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ return hinic3_rss_cfg_hash_engine(nic_io, HINIC3_CMD_OP_GET, type);
+}
+
+int hinic3_rss_cfg(void *hwdev, u8 rss_en, u8 cos_num, u8 *prio_tc, u16 num_qps)
+{
+ struct hinic3_cmd_rss_config rss_cfg;
+ u16 out_size = sizeof(rss_cfg);
+ struct hinic3_nic_io *nic_io = NULL;
+ int err;
+
+	/* microcode requires the number of TCs to be a power of 2 */
+ if (!hwdev || !prio_tc || (cos_num & (cos_num - 1)))
+ return -EINVAL;
+
+ nic_io = hinic3_get_service_adapter(hwdev, SERVICE_T_NIC);
+ memset(&rss_cfg, 0, sizeof(struct hinic3_cmd_rss_config));
+ rss_cfg.func_id = hinic3_global_func_id(hwdev);
+ rss_cfg.rss_en = rss_en;
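+	/* hw takes the priority number as log2 of the cos count */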
+ rss_cfg.rq_priority_number = cos_num ? (u8)ilog2(cos_num) : 0;
+ rss_cfg.num_qps = num_qps;
+
+ memcpy(rss_cfg.prio_tc, prio_tc, NIC_DCB_UP_MAX);
+
+ err = l2nic_msg_to_mgmt_sync(hwdev, HINIC3_NIC_CMD_RSS_CFG,
+ &rss_cfg, sizeof(rss_cfg),
+ &rss_cfg, &out_size);
+ if (err || !out_size || rss_cfg.msg_head.status) {
+ nic_err(nic_io->dev_hdl, "Failed to set rss cfg, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, rss_cfg.msg_head.status, out_size);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_rx.c b/drivers/net/ethernet/huawei/hinic3/hinic3_rx.c
new file mode 100644
index 000000000000..a4085334806d
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_rx.c
@@ -0,0 +1,1344 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC]" fmt
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/skbuff.h>
+#include <linux/dma-mapping.h>
+#include <linux/interrupt.h>
+#include <linux/etherdevice.h>
+#include <linux/netdevice.h>
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/u64_stats_sync.h>
+#include <linux/ip.h>
+#include <linux/tcp.h>
+#include <linux/sctp.h>
+#include <linux/pkt_sched.h>
+#include <linux/ipv6.h>
+#include <linux/module.h>
+#include <linux/compiler.h>
+
+#include "ossl_knl.h"
+#include "hinic3_crm.h"
+#include "hinic3_common.h"
+#include "hinic3_nic_qp.h"
+#include "hinic3_nic_io.h"
+#include "hinic3_srv_nic.h"
+#include "hinic3_nic_dev.h"
+#include "hinic3_rss.h"
+#include "hinic3_rx.h"
+
+static u32 rq_pi_rd_en;
+module_param(rq_pi_rd_en, uint, 0644);
+MODULE_PARM_DESC(rq_pi_rd_en, "Enable rq read pi from host; by default pi is updated by doorbell (default=0)");
+
+/* performance: ci addr RTE_CACHE_SIZE(64B) alignment */
+#define HINIC3_RX_HDR_SIZE 256
+#define HINIC3_RX_BUFFER_WRITE 16
+
+#define HINIC3_RX_TCP_PKT 0x3
+#define HINIC3_RX_UDP_PKT 0x4
+#define HINIC3_RX_SCTP_PKT 0x7
+
+#define HINIC3_RX_IPV4_PKT 0
+#define HINIC3_RX_IPV6_PKT 1
+#define HINIC3_RX_INVALID_IP_TYPE 2
+
+#define HINIC3_RX_PKT_FORMAT_NON_TUNNEL 0
+#define HINIC3_RX_PKT_FORMAT_VXLAN 1
+
+#define RXQ_STATS_INC(rxq, field) \
+do { \
+ u64_stats_update_begin(&(rxq)->rxq_stats.syncp); \
+ (rxq)->rxq_stats.field++; \
+ u64_stats_update_end(&(rxq)->rxq_stats.syncp); \
+} while (0)
+
+static bool rx_alloc_mapped_page(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_rx_info *rx_info)
+{
+ struct pci_dev *pdev = nic_dev->pdev;
+ struct page *page = rx_info->page;
+ dma_addr_t dma = rx_info->buf_dma_addr;
+
+ if (likely(dma))
+ return true;
+
+ /* alloc new page for storage */
+ page = alloc_pages_node(NUMA_NO_NODE, GFP_ATOMIC | __GFP_COLD |
+ __GFP_COMP, nic_dev->page_order);
+ if (unlikely(!page))
+ return false;
+
+ /* map page for use */
+ dma = dma_map_page(&pdev->dev, page, 0, nic_dev->dma_rx_buff_size,
+ DMA_FROM_DEVICE);
+	/* if mapping failed, free the memory back to the system since
+	 * there isn't much point in holding memory we can't use
+	 */
+ if (unlikely(dma_mapping_error(&pdev->dev, dma))) {
+ __free_pages(page, nic_dev->page_order);
+ return false;
+ }
+
+ rx_info->page = page;
+ rx_info->buf_dma_addr = dma;
+ rx_info->page_offset = 0;
+
+ return true;
+}
+
+static u32 hinic3_rx_fill_wqe(struct hinic3_rxq *rxq)
+{
+ struct net_device *netdev = rxq->netdev;
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ int rq_wqe_len = rxq->rq->wq.wqebb_size;
+ struct hinic3_rq_wqe *rq_wqe = NULL;
+ struct hinic3_rx_info *rx_info = NULL;
+ u32 i;
+
+ for (i = 0; i < rxq->q_depth; i++) {
+ rx_info = &rxq->rx_info[i];
+ rq_wqe = hinic3_rq_wqe_addr(rxq->rq, (u16)i);
+
+ if (rxq->rq->wqe_type == HINIC3_EXTEND_RQ_WQE) {
+ /* unit of cqe length is 16B */
+ hinic3_set_sge(&rq_wqe->extend_wqe.cqe_sect.sge,
+ rx_info->cqe_dma,
+ (sizeof(struct hinic3_rq_cqe) >>
+ HINIC3_CQE_SIZE_SHIFT));
+ /* use fixed len */
+ rq_wqe->extend_wqe.buf_desc.sge.len =
+ nic_dev->rx_buff_len;
+ } else {
+ rq_wqe->normal_wqe.cqe_hi_addr =
+ upper_32_bits(rx_info->cqe_dma);
+ rq_wqe->normal_wqe.cqe_lo_addr =
+ lower_32_bits(rx_info->cqe_dma);
+ }
+
+ hinic3_hw_be32_len(rq_wqe, rq_wqe_len);
+ rx_info->rq_wqe = rq_wqe;
+ }
+
+ return i;
+}
+
+static u32 hinic3_rx_fill_buffers(struct hinic3_rxq *rxq)
+{
+ struct net_device *netdev = rxq->netdev;
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct hinic3_rq_wqe *rq_wqe = NULL;
+ struct hinic3_rx_info *rx_info = NULL;
+ dma_addr_t dma_addr;
+ u32 i, free_wqebbs = rxq->delta - 1;
+
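+	/* fill at most delta - 1 wqebbs, leaving one entry unused */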
+ for (i = 0; i < free_wqebbs; i++) {
+ rx_info = &rxq->rx_info[rxq->next_to_update];
+
+ if (unlikely(!rx_alloc_mapped_page(nic_dev, rx_info))) {
+ RXQ_STATS_INC(rxq, alloc_rx_buf_err);
+ break;
+ }
+
+ dma_addr = rx_info->buf_dma_addr + rx_info->page_offset;
+
+ rq_wqe = rx_info->rq_wqe;
+
+ if (rxq->rq->wqe_type == HINIC3_EXTEND_RQ_WQE) {
+ rq_wqe->extend_wqe.buf_desc.sge.hi_addr =
+ hinic3_hw_be32(upper_32_bits(dma_addr));
+ rq_wqe->extend_wqe.buf_desc.sge.lo_addr =
+ hinic3_hw_be32(lower_32_bits(dma_addr));
+ } else {
+ rq_wqe->normal_wqe.buf_hi_addr =
+ hinic3_hw_be32(upper_32_bits(dma_addr));
+ rq_wqe->normal_wqe.buf_lo_addr =
+ hinic3_hw_be32(lower_32_bits(dma_addr));
+ }
+ rxq->next_to_update = (u16)((rxq->next_to_update + 1) & rxq->q_mask);
+ }
+
+ if (likely(i)) {
+ if (!rq_pi_rd_en) {
+ hinic3_write_db(rxq->rq,
+ rxq->q_id & (NIC_DCB_COS_MAX - 1),
+ RQ_CFLAG_DP,
+ (u16)((u32)rxq->next_to_update <<
+ rxq->rq->wqe_type));
+ } else {
+ /* Write all the wqes before pi update */
+ wmb();
+
+ hinic3_update_rq_hw_pi(rxq->rq, rxq->next_to_update);
+ }
+ rxq->delta -= i;
+ rxq->next_to_alloc = rxq->next_to_update;
+ } else if (free_wqebbs == rxq->q_depth - 1) {
+ RXQ_STATS_INC(rxq, rx_buf_empty);
+ }
+
+ return i;
+}
+
+static u32 hinic3_rx_alloc_buffers(struct hinic3_nic_dev *nic_dev, u32 rq_depth,
+ struct hinic3_rx_info *rx_info_arr)
+{
+ u32 free_wqebbs = rq_depth - 1;
+ u32 idx;
+
+ for (idx = 0; idx < free_wqebbs; idx++) {
+ if (!rx_alloc_mapped_page(nic_dev, &rx_info_arr[idx]))
+ break;
+ }
+
+ return idx;
+}
+
+static void hinic3_rx_free_buffers(struct hinic3_nic_dev *nic_dev, u32 q_depth,
+ struct hinic3_rx_info *rx_info_arr)
+{
+ struct hinic3_rx_info *rx_info = NULL;
+ u32 i;
+
+	/* Free all the Rx ring buffer pages */
+ for (i = 0; i < q_depth; i++) {
+ rx_info = &rx_info_arr[i];
+
+ if (rx_info->buf_dma_addr) {
+ dma_unmap_page(&nic_dev->pdev->dev,
+ rx_info->buf_dma_addr,
+ nic_dev->dma_rx_buff_size,
+ DMA_FROM_DEVICE);
+ rx_info->buf_dma_addr = 0;
+ }
+
+ if (rx_info->page) {
+ __free_pages(rx_info->page, nic_dev->page_order);
+ rx_info->page = NULL;
+ }
+ }
+}
+
+static void hinic3_reuse_rx_page(struct hinic3_rxq *rxq,
+ struct hinic3_rx_info *old_rx_info)
+{
+ struct hinic3_rx_info *new_rx_info;
+ u16 nta = rxq->next_to_alloc;
+
+ new_rx_info = &rxq->rx_info[nta];
+
+ /* update, and store next to alloc */
+ nta++;
+ rxq->next_to_alloc = (nta < rxq->q_depth) ? nta : 0;
+
+ new_rx_info->page = old_rx_info->page;
+ new_rx_info->page_offset = old_rx_info->page_offset;
+ new_rx_info->buf_dma_addr = old_rx_info->buf_dma_addr;
+
+ /* sync the buffer for use by the device */
+ dma_sync_single_range_for_device(rxq->dev, new_rx_info->buf_dma_addr,
+ new_rx_info->page_offset,
+ rxq->buf_len,
+ DMA_FROM_DEVICE);
+}
+
+static bool hinic3_add_rx_frag(struct hinic3_rxq *rxq,
+ struct hinic3_rx_info *rx_info,
+ struct sk_buff *skb, u32 size)
+{
+ struct page *page;
+ u8 *va;
+
+ page = rx_info->page;
+ va = (u8 *)page_address(page) + rx_info->page_offset;
+ prefetch(va);
+#if L1_CACHE_BYTES < 128
+ prefetch(va + L1_CACHE_BYTES);
+#endif
+
+ dma_sync_single_range_for_cpu(rxq->dev,
+ rx_info->buf_dma_addr,
+ rx_info->page_offset,
+ rxq->buf_len,
+ DMA_FROM_DEVICE);
+
+ if (size <= HINIC3_RX_HDR_SIZE && !skb_is_nonlinear(skb)) {
+ memcpy(__skb_put(skb, size), va,
+ ALIGN(size, sizeof(long))); /*lint !e666*/
+
+ /* page is not reserved, we can reuse buffer as-is */
+ if (likely(page_to_nid(page) == numa_node_id()))
+ return true;
+
+ /* this page cannot be reused so discard it */
+ put_page(page);
+ return false;
+ }
+
+ skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page,
+ (int)rx_info->page_offset, (int)size, rxq->buf_len);
+
+ /* avoid re-using remote pages */
+ if (unlikely(page_to_nid(page) != numa_node_id()))
+ return false;
+
+ /* if we are only owner of page we can reuse it */
+ if (unlikely(page_count(page) != 1))
+ return false;
+
+ /* flip page offset to other buffer */
+ rx_info->page_offset ^= rxq->buf_len;
+ get_page(page);
+
+ return true;
+}
+
+static void packaging_skb(struct hinic3_rxq *rxq, struct sk_buff *head_skb,
+ u8 sge_num, u32 pkt_len)
+{
+ struct hinic3_rx_info *rx_info = NULL;
+ struct sk_buff *skb = NULL;
+ u8 frag_num = 0;
+ u32 size;
+ u32 sw_ci;
+ u32 temp_pkt_len = pkt_len;
+ u8 temp_sge_num = sge_num;
+
+ sw_ci = rxq->cons_idx & rxq->q_mask;
+ skb = head_skb;
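+	/* attach each rx buffer as a page frag, spilling into the frag_list
+	 * skbs once head_skb runs out of frag slots
+	 */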
+ while (temp_sge_num) {
+ rx_info = &rxq->rx_info[sw_ci];
+ sw_ci = (sw_ci + 1) & rxq->q_mask;
+ if (unlikely(temp_pkt_len > rxq->buf_len)) {
+ size = rxq->buf_len;
+ temp_pkt_len -= rxq->buf_len;
+ } else {
+ size = temp_pkt_len;
+ }
+
+ if (unlikely(frag_num == MAX_SKB_FRAGS)) {
+ frag_num = 0;
+ if (skb == head_skb)
+ skb = skb_shinfo(skb)->frag_list;
+ else
+ skb = skb->next;
+ }
+
+ if (unlikely(skb != head_skb)) {
+ head_skb->len += size;
+ head_skb->data_len += size;
+ head_skb->truesize += rxq->buf_len;
+ }
+
+ if (likely(hinic3_add_rx_frag(rxq, rx_info, skb, size))) {
+ hinic3_reuse_rx_page(rxq, rx_info);
+ } else {
+ /* we are not reusing the buffer so unmap it */
+ dma_unmap_page(rxq->dev, rx_info->buf_dma_addr,
+ rxq->dma_rx_buff_size, DMA_FROM_DEVICE);
+ }
+ /* clear contents of buffer_info */
+ rx_info->buf_dma_addr = 0;
+ rx_info->page = NULL;
+ temp_sge_num--;
+ frag_num++;
+ }
+}
+
+#define HINIC3_GET_SGE_NUM(pkt_len, rxq) \
+ ((u8)(((pkt_len) >> (rxq)->rx_buff_shift) + \
+ (((pkt_len) & ((rxq)->buf_len - 1)) ? 1 : 0)))
+
+static struct sk_buff *hinic3_fetch_rx_buffer(struct hinic3_rxq *rxq,
+ u32 pkt_len)
+{
+ struct sk_buff *head_skb = NULL;
+ struct sk_buff *cur_skb = NULL;
+ struct sk_buff *skb = NULL;
+ struct net_device *netdev = rxq->netdev;
+ u8 sge_num, skb_num;
+ u16 wqebb_cnt = 0;
+
+ head_skb = netdev_alloc_skb_ip_align(netdev, HINIC3_RX_HDR_SIZE);
+ if (unlikely(!head_skb))
+ return NULL;
+
+ sge_num = HINIC3_GET_SGE_NUM(pkt_len, rxq);
+ if (likely(sge_num <= MAX_SKB_FRAGS))
+ skb_num = 1;
+ else
+ skb_num = (sge_num / MAX_SKB_FRAGS) +
+ ((sge_num % MAX_SKB_FRAGS) ? 1 : 0);
+
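+	/* chain extra skbs on the frag_list when one skb cannot hold all the SGEs */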
+ while (unlikely(skb_num > 1)) {
+ cur_skb = netdev_alloc_skb_ip_align(netdev, HINIC3_RX_HDR_SIZE);
+ if (unlikely(!cur_skb))
+ goto alloc_skb_fail;
+
+ if (!skb) {
+ skb_shinfo(head_skb)->frag_list = cur_skb;
+ skb = cur_skb;
+ } else {
+ skb->next = cur_skb;
+ skb = cur_skb;
+ }
+
+ skb_num--;
+ }
+
+ prefetchw(head_skb->data);
+ wqebb_cnt = sge_num;
+
+ packaging_skb(rxq, head_skb, sge_num, pkt_len);
+
+ rxq->cons_idx += wqebb_cnt;
+ rxq->delta += wqebb_cnt;
+
+ return head_skb;
+
+alloc_skb_fail:
+ dev_kfree_skb_any(head_skb);
+ return NULL;
+}
+
+void hinic3_rxq_get_stats(struct hinic3_rxq *rxq,
+ struct hinic3_rxq_stats *stats)
+{
+ struct hinic3_rxq_stats *rxq_stats = &rxq->rxq_stats;
+ unsigned int start;
+
+ u64_stats_update_begin(&stats->syncp);
+ do {
+ start = u64_stats_fetch_begin(&rxq_stats->syncp);
+ stats->bytes = rxq_stats->bytes;
+ stats->packets = rxq_stats->packets;
+ stats->errors = rxq_stats->csum_errors +
+ rxq_stats->other_errors;
+ stats->csum_errors = rxq_stats->csum_errors;
+ stats->other_errors = rxq_stats->other_errors;
+ stats->dropped = rxq_stats->dropped;
+ stats->xdp_dropped = rxq_stats->xdp_dropped;
+ stats->rx_buf_empty = rxq_stats->rx_buf_empty;
+ } while (u64_stats_fetch_retry(&rxq_stats->syncp, start));
+ u64_stats_update_end(&stats->syncp);
+}
+
+void hinic3_rxq_clean_stats(struct hinic3_rxq_stats *rxq_stats)
+{
+ u64_stats_update_begin(&rxq_stats->syncp);
+ rxq_stats->bytes = 0;
+ rxq_stats->packets = 0;
+ rxq_stats->errors = 0;
+ rxq_stats->csum_errors = 0;
+ rxq_stats->other_errors = 0;
+ rxq_stats->dropped = 0;
+ rxq_stats->xdp_dropped = 0;
+ rxq_stats->rx_buf_empty = 0;
+
+ rxq_stats->alloc_skb_err = 0;
+ rxq_stats->alloc_rx_buf_err = 0;
+ rxq_stats->xdp_large_pkt = 0;
+ rxq_stats->restore_drop_sge = 0;
+ rxq_stats->rsvd2 = 0;
+ u64_stats_update_end(&rxq_stats->syncp);
+}
+
+static void rxq_stats_init(struct hinic3_rxq *rxq)
+{
+ struct hinic3_rxq_stats *rxq_stats = &rxq->rxq_stats;
+
+ u64_stats_init(&rxq_stats->syncp);
+ hinic3_rxq_clean_stats(rxq_stats);
+}
+
+#ifndef HAVE_ETH_GET_HEADLEN_FUNC
+static unsigned int hinic3_eth_get_headlen(unsigned char *data, unsigned int max_len)
+{
+#define IP_FRAG_OFFSET 0x1FFF
+#define FCOE_HLEN 38
+#define ETH_P_8021_AD 0x88A8
+#define ETH_P_8021_Q 0x8100
+#define TCP_HEAD_OFFSET 12
+ union {
+ unsigned char *data;
+ struct ethhdr *eth;
+ struct vlan_ethhdr *vlan;
+ struct iphdr *ipv4;
+ struct ipv6hdr *ipv6;
+ } hdr;
+ u16 protocol;
+ u8 nexthdr = 0;
+ u8 hlen;
+
+ if (unlikely(max_len < ETH_HLEN))
+ return max_len;
+
+ hdr.data = data;
+ protocol = hdr.eth->h_proto;
+
+ /* L2 header */
+ /*lint -save -e778*/
+ if (protocol == __constant_htons(ETH_P_8021_AD) ||
+ protocol == __constant_htons(ETH_P_8021_Q)) { /*lint -restore*/
+ if (unlikely(max_len < ETH_HLEN + VLAN_HLEN))
+ return max_len;
+
+ /* L3 protocol */
+ protocol = hdr.vlan->h_vlan_encapsulated_proto;
+ hdr.data += sizeof(struct vlan_ethhdr);
+ } else {
+ hdr.data += ETH_HLEN;
+ }
+
+ /* L3 header */
+ /*lint -save -e778*/
+ switch (protocol) {
+ case __constant_htons(ETH_P_IP): /*lint -restore*/
+ if ((int)(hdr.data - data) >
+ (int)(max_len - sizeof(struct iphdr)))
+ return max_len;
+
+ /* L3 header length = (1st byte & 0x0F) << 2 */
+ hlen = (hdr.data[0] & 0x0F) << 2;
+
+ if (hlen < sizeof(struct iphdr))
+ return (unsigned int)(hdr.data - data);
+
+ if (!(hdr.ipv4->frag_off & htons(IP_FRAG_OFFSET)))
+ nexthdr = hdr.ipv4->protocol;
+
+ hdr.data += hlen;
+ break;
+
+ case __constant_htons(ETH_P_IPV6):
+ if ((int)(hdr.data - data) >
+ (int)(max_len - sizeof(struct ipv6hdr)))
+ return max_len;
+ /* L4 protocol */
+ nexthdr = hdr.ipv6->nexthdr;
+ hdr.data += sizeof(struct ipv6hdr);
+ break;
+
+ case __constant_htons(ETH_P_FCOE):
+ hdr.data += FCOE_HLEN;
+ break;
+
+ default:
+ return (unsigned int)(hdr.data - data);
+ }
+
+ /* L4 header */
+ switch (nexthdr) {
+ case IPPROTO_TCP:
+ if ((int)(hdr.data - data) >
+ (int)(max_len - sizeof(struct tcphdr)))
+ return max_len;
+
+		/* L4 header length = (13th byte & 0xF0) >> 2 */
+ if (((hdr.data[TCP_HEAD_OFFSET] & 0xF0) >>
+ HINIC3_HEADER_DATA_UNIT) > sizeof(struct tcphdr))
+ hdr.data += ((hdr.data[TCP_HEAD_OFFSET] & 0xF0) >>
+ HINIC3_HEADER_DATA_UNIT);
+ else
+ hdr.data += sizeof(struct tcphdr);
+ break;
+ case IPPROTO_UDP:
+ case IPPROTO_UDPLITE:
+ hdr.data += sizeof(struct udphdr);
+ break;
+
+ case IPPROTO_SCTP:
+ hdr.data += sizeof(struct sctphdr);
+ break;
+ default:
+ break;
+ }
+
+ if ((hdr.data - data) > max_len)
+ return max_len;
+ else
+ return (unsigned int)(hdr.data - data);
+}
+#endif
+
+static void hinic3_pull_tail(struct sk_buff *skb)
+{
+ skb_frag_t *frag = &skb_shinfo(skb)->frags[0];
+ unsigned char *va = NULL;
+ unsigned int pull_len;
+
+ /* it is valid to use page_address instead of kmap since we are
+ * working with pages allocated out of the lomem pool per
+ * alloc_page(GFP_ATOMIC)
+ */
+ va = skb_frag_address(frag);
+
+#ifdef HAVE_ETH_GET_HEADLEN_FUNC
+ /* we need the header to contain the greater of either ETH_HLEN or
+ * 60 bytes if the skb->len is less than 60 for skb_pad.
+ */
+#ifdef ETH_GET_HEADLEN_NEED_DEV
+ pull_len = eth_get_headlen(skb->dev, va, HINIC3_RX_HDR_SIZE);
+#else
+ pull_len = eth_get_headlen(va, HINIC3_RX_HDR_SIZE);
+#endif
+
+#else
+ pull_len = hinic3_eth_get_headlen(va, HINIC3_RX_HDR_SIZE);
+#endif
+
+ /* align pull length to size of long to optimize memcpy performance */
+ skb_copy_to_linear_data(skb, va, ALIGN(pull_len, sizeof(long)));
+
+ /* update all of the pointers */
+ skb_frag_size_sub(frag, (int)pull_len);
+ skb_frag_off_add(frag, (int)pull_len);
+
+ skb->data_len -= pull_len;
+ skb->tail += pull_len;
+}
+
+static void hinic3_rx_csum(struct hinic3_rxq *rxq, u32 offload_type,
+ u32 status, struct sk_buff *skb)
+{
+ struct net_device *netdev = rxq->netdev;
+ u32 pkt_type = HINIC3_GET_RX_PKT_TYPE(offload_type);
+ u32 ip_type = HINIC3_GET_RX_IP_TYPE(offload_type);
+ u32 pkt_fmt = HINIC3_GET_RX_TUNNEL_PKT_FORMAT(offload_type);
+
+ u32 csum_err;
+
+ csum_err = HINIC3_GET_RX_CSUM_ERR(status);
+ if (unlikely(csum_err == HINIC3_RX_CSUM_IPSU_OTHER_ERR))
+ rxq->rxq_stats.other_errors++;
+
+ if (!(netdev->features & NETIF_F_RXCSUM))
+ return;
+
+ if (unlikely(csum_err)) {
+ /* pkt type is recognized by HW, and csum is wrong */
+ if (!(csum_err & (HINIC3_RX_CSUM_HW_CHECK_NONE |
+ HINIC3_RX_CSUM_IPSU_OTHER_ERR)))
+ rxq->rxq_stats.csum_errors++;
+ skb->ip_summed = CHECKSUM_NONE;
+ return;
+ }
+
+ if (ip_type == HINIC3_RX_INVALID_IP_TYPE ||
+ !(pkt_fmt == HINIC3_RX_PKT_FORMAT_NON_TUNNEL ||
+ pkt_fmt == HINIC3_RX_PKT_FORMAT_VXLAN)) {
+ skb->ip_summed = CHECKSUM_NONE;
+ return;
+ }
+
+ switch (pkt_type) {
+ case HINIC3_RX_TCP_PKT:
+ case HINIC3_RX_UDP_PKT:
+ case HINIC3_RX_SCTP_PKT:
+ skb->ip_summed = CHECKSUM_UNNECESSARY;
+ break;
+ default:
+ skb->ip_summed = CHECKSUM_NONE;
+ break;
+ }
+}
+
+#ifdef HAVE_SKBUFF_CSUM_LEVEL
+static void hinic3_rx_gro(struct hinic3_rxq *rxq, u32 offload_type,
+ struct sk_buff *skb)
+{
+ struct net_device *netdev = rxq->netdev;
+ bool l2_tunnel = false;
+
+ if (!(netdev->features & NETIF_F_GRO))
+ return;
+
+ l2_tunnel =
+ HINIC3_GET_RX_TUNNEL_PKT_FORMAT(offload_type) ==
+ HINIC3_RX_PKT_FORMAT_VXLAN ? 1 : 0;
+ if (l2_tunnel && skb->ip_summed == CHECKSUM_UNNECESSARY)
+ /* If we checked the outer header let the stack know */
+ skb->csum_level = 1;
+}
+#endif /* HAVE_SKBUFF_CSUM_LEVEL */
+
+static void hinic3_copy_lp_data(struct hinic3_nic_dev *nic_dev,
+ struct sk_buff *skb)
+{
+ struct net_device *netdev = nic_dev->netdev;
+ u8 *lb_buf = nic_dev->lb_test_rx_buf;
+ void *frag_data = NULL;
+ int lb_len = nic_dev->lb_pkt_len;
+ int pkt_offset, frag_len, i;
+
+ if (nic_dev->lb_test_rx_idx == LP_PKT_CNT) {
+ nic_dev->lb_test_rx_idx = 0;
+		nicif_warn(nic_dev, rx_err, netdev, "Loopback test warning, received too many test pkts\n");
+ }
+
+ if (skb->len != nic_dev->lb_pkt_len) {
+ nicif_warn(nic_dev, rx_err, netdev, "Wrong packet length\n");
+ nic_dev->lb_test_rx_idx++;
+ return;
+ }
+
+ pkt_offset = nic_dev->lb_test_rx_idx * lb_len;
+ frag_len = (int)skb_headlen(skb);
+ memcpy(lb_buf + pkt_offset, skb->data, (size_t)(u32)frag_len);
+
+ pkt_offset += frag_len;
+ for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+ frag_data = skb_frag_address(&skb_shinfo(skb)->frags[i]);
+ frag_len = (int)skb_frag_size(&skb_shinfo(skb)->frags[i]);
+ memcpy(lb_buf + pkt_offset, frag_data, (size_t)(u32)frag_len);
+
+ pkt_offset += frag_len;
+ }
+ nic_dev->lb_test_rx_idx++;
+}
+
+static inline void hinic3_lro_set_gso_params(struct sk_buff *skb, u16 num_lro)
+{
+ struct ethhdr *eth = (struct ethhdr *)(skb->data);
+ __be16 proto;
+
+ proto = __vlan_get_protocol(skb, eth->h_proto, NULL);
+
+ skb_shinfo(skb)->gso_size = (u16)DIV_ROUND_UP((skb->len - skb_headlen(skb)), num_lro);
+ skb_shinfo(skb)->gso_type = (proto == htons(ETH_P_IP)) ? SKB_GSO_TCPV4 : SKB_GSO_TCPV6;
+ skb_shinfo(skb)->gso_segs = num_lro;
+}
+
+#ifdef HAVE_XDP_SUPPORT
+enum hinic3_xdp_pkt {
+ HINIC3_XDP_PKT_PASS,
+ HINIC3_XDP_PKT_DROP,
+};
+
+static void update_drop_rx_info(struct hinic3_rxq *rxq, u16 weqbb_num)
+{
+ struct hinic3_rx_info *rx_info = NULL;
+
+ while (weqbb_num) {
+ rx_info = &rxq->rx_info[rxq->cons_idx & rxq->q_mask];
+ if (likely(page_to_nid(rx_info->page) == numa_node_id()))
+ hinic3_reuse_rx_page(rxq, rx_info);
+
+ rx_info->buf_dma_addr = 0;
+ rx_info->page = NULL;
+ rxq->cons_idx++;
+ rxq->delta++;
+
+ weqbb_num--;
+ }
+}
+
+int hinic3_run_xdp(struct hinic3_rxq *rxq, u32 pkt_len)
+{
+ struct bpf_prog *xdp_prog = NULL;
+ struct hinic3_rx_info *rx_info = NULL;
+ struct xdp_buff xdp;
+ int result = HINIC3_XDP_PKT_PASS;
+ u16 weqbb_num = 1; /* xdp can only use one rx_buff */
+ u8 *va = NULL;
+ u32 act;
+
+ rcu_read_lock();
+ xdp_prog = READ_ONCE(rxq->xdp_prog);
+ if (!xdp_prog)
+ goto unlock_rcu;
+
+ if (unlikely(pkt_len > rxq->buf_len)) {
+ RXQ_STATS_INC(rxq, xdp_large_pkt);
+ weqbb_num = (u16)(pkt_len >> rxq->rx_buff_shift) +
+ ((pkt_len & (rxq->buf_len - 1)) ? 1 : 0);
+ result = HINIC3_XDP_PKT_DROP;
+ goto xdp_out;
+ }
+
+ rx_info = &rxq->rx_info[rxq->cons_idx & rxq->q_mask];
+ va = (u8 *)page_address(rx_info->page) + rx_info->page_offset;
+ prefetch(va);
+ dma_sync_single_range_for_cpu(rxq->dev, rx_info->buf_dma_addr,
+ rx_info->page_offset,
+ rxq->buf_len, DMA_FROM_DEVICE);
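+	/* build a single-buffer xdp_buff over the rx page */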
+ xdp.data = va;
+ xdp.data_hard_start = xdp.data;
+ xdp.data_end = xdp.data + pkt_len;
+#ifdef HAVE_XDP_FRAME_SZ
+ xdp.frame_sz = rxq->buf_len;
+#endif
+#ifdef HAVE_XDP_DATA_META
+ xdp_set_data_meta_invalid(&xdp);
+#endif
+ prefetchw(xdp.data_hard_start);
+ act = bpf_prog_run_xdp(xdp_prog, &xdp);
+ switch (act) {
+ case XDP_PASS:
+ break;
+ case XDP_DROP:
+ result = HINIC3_XDP_PKT_DROP;
+ break;
+ default:
+ result = HINIC3_XDP_PKT_DROP;
+ bpf_warn_invalid_xdp_action(act);
+ }
+
+xdp_out:
+ if (result == HINIC3_XDP_PKT_DROP) {
+ RXQ_STATS_INC(rxq, xdp_dropped);
+ update_drop_rx_info(rxq, weqbb_num);
+ }
+
+unlock_rcu:
+ rcu_read_unlock();
+
+ return result;
+}
+#endif
+
+static int recv_one_pkt(struct hinic3_rxq *rxq, struct hinic3_rq_cqe *rx_cqe,
+ u32 pkt_len, u32 vlan_len, u32 status)
+{
+ struct sk_buff *skb;
+ struct net_device *netdev = rxq->netdev;
+ u32 offload_type;
+ u16 num_lro;
+ struct hinic3_nic_dev *nic_dev = netdev_priv(rxq->netdev);
+
+#ifdef HAVE_XDP_SUPPORT
+ u32 xdp_status;
+
+ xdp_status = hinic3_run_xdp(rxq, pkt_len);
+ if (xdp_status == HINIC3_XDP_PKT_DROP)
+ return 0;
+#endif
+
+ skb = hinic3_fetch_rx_buffer(rxq, pkt_len);
+ if (unlikely(!skb)) {
+ RXQ_STATS_INC(rxq, alloc_skb_err);
+ return -ENOMEM;
+ }
+
+ /* place header in linear portion of buffer */
+ if (skb_is_nonlinear(skb))
+ hinic3_pull_tail(skb);
+
+ offload_type = hinic3_hw_cpu32(rx_cqe->offload_type);
+ hinic3_rx_csum(rxq, offload_type, status, skb);
+
+#ifdef HAVE_SKBUFF_CSUM_LEVEL
+ hinic3_rx_gro(rxq, offload_type, skb);
+#endif
+
+#if defined(NETIF_F_HW_VLAN_CTAG_RX)
+ if ((netdev->features & NETIF_F_HW_VLAN_CTAG_RX) &&
+ HINIC3_GET_RX_VLAN_OFFLOAD_EN(offload_type)) {
+#else
+ if ((netdev->features & NETIF_F_HW_VLAN_RX) &&
+ HINIC3_GET_RX_VLAN_OFFLOAD_EN(offload_type)) {
+#endif
+ u16 vid = HINIC3_GET_RX_VLAN_TAG(vlan_len);
+
+ /* if the packet is a vlan pkt, the vid may be 0 */
+ __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vid);
+ }
+
+ if (unlikely(test_bit(HINIC3_LP_TEST, &nic_dev->flags)))
+ hinic3_copy_lp_data(nic_dev, skb);
+
+ num_lro = HINIC3_GET_RX_NUM_LRO(status);
+ if (num_lro)
+ hinic3_lro_set_gso_params(skb, num_lro);
+
+ skb_record_rx_queue(skb, rxq->q_id);
+ skb->protocol = eth_type_trans(skb, netdev);
+
+ if (skb_has_frag_list(skb)) {
+#ifdef HAVE_NAPI_GRO_FLUSH_OLD
+ napi_gro_flush(&rxq->irq_cfg->napi, false);
+#else
+ napi_gro_flush(&rxq->irq_cfg->napi);
+#endif
+ netif_receive_skb(skb);
+ } else {
+ napi_gro_receive(&rxq->irq_cfg->napi, skb);
+ }
+
+ return 0;
+}
+
+#define LRO_PKT_HDR_LEN_IPV4 66
+#define LRO_PKT_HDR_LEN_IPV6 86
+#define LRO_PKT_HDR_LEN(cqe) \
+ (HINIC3_GET_RX_IP_TYPE(hinic3_hw_cpu32((cqe)->offload_type)) == \
+ HINIC3_RX_IPV6_PKT ? LRO_PKT_HDR_LEN_IPV6 : LRO_PKT_HDR_LEN_IPV4)
+
+int hinic3_rx_poll(struct hinic3_rxq *rxq, int budget)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(rxq->netdev);
+ u32 sw_ci, status, pkt_len, vlan_len, dropped = 0;
+ struct hinic3_rq_cqe *rx_cqe = NULL;
+ u64 rx_bytes = 0;
+ u16 num_lro;
+ int pkts = 0, nr_pkts = 0;
+ u16 num_wqe = 0;
+
+ while (likely(pkts < budget)) {
+ sw_ci = rxq->cons_idx & rxq->q_mask;
+ rx_cqe = rxq->rx_info[sw_ci].cqe;
+ status = hinic3_hw_cpu32(rx_cqe->status);
+ if (!HINIC3_GET_RX_DONE(status))
+ break;
+
+ /* make sure we read rx_done before packet length */
+ rmb();
+
+ vlan_len = hinic3_hw_cpu32(rx_cqe->vlan_len);
+ pkt_len = HINIC3_GET_RX_PKT_LEN(vlan_len);
+ if (recv_one_pkt(rxq, rx_cqe, pkt_len, vlan_len, status))
+ break;
+
+ rx_bytes += pkt_len;
+ pkts++;
+ nr_pkts++;
+
+ num_lro = HINIC3_GET_RX_NUM_LRO(status);
+ if (num_lro) {
+ rx_bytes += ((num_lro - 1) * LRO_PKT_HDR_LEN(rx_cqe));
+
+ num_wqe += HINIC3_GET_SGE_NUM(pkt_len, rxq);
+ }
+
+ rx_cqe->status = 0;
+
+ if (num_wqe >= nic_dev->lro_replenish_thld)
+ break;
+ }
+
+ if (rxq->delta >= HINIC3_RX_BUFFER_WRITE)
+ hinic3_rx_fill_buffers(rxq);
+
+ u64_stats_update_begin(&rxq->rxq_stats.syncp);
+ rxq->rxq_stats.packets += (u64)(u32)nr_pkts;
+ rxq->rxq_stats.bytes += rx_bytes;
+ rxq->rxq_stats.dropped += (u64)dropped;
+ u64_stats_update_end(&rxq->rxq_stats.syncp);
+ return pkts;
+}
+
+int hinic3_alloc_rxqs_res(struct hinic3_nic_dev *nic_dev, u16 num_rq,
+ u32 rq_depth, struct hinic3_dyna_rxq_res *rxqs_res)
+{
+ struct hinic3_dyna_rxq_res *rqres = NULL;
+ u64 cqe_mem_size = sizeof(struct hinic3_rq_cqe) * rq_depth;
+ int idx, i;
+ u32 pkts;
+ u64 size;
+
+ for (idx = 0; idx < num_rq; idx++) {
+ rqres = &rxqs_res[idx];
+ size = sizeof(*rqres->rx_info) * rq_depth;
+ rqres->rx_info = kzalloc(size, GFP_KERNEL);
+ if (!rqres->rx_info) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to alloc rxq%d rx info\n", idx);
+ goto err_out;
+ }
+
+ rqres->cqe_start_vaddr =
+ dma_zalloc_coherent(&nic_dev->pdev->dev, cqe_mem_size,
+ &rqres->cqe_start_paddr,
+ GFP_KERNEL);
+ if (!rqres->cqe_start_vaddr) {
+ kfree(rqres->rx_info);
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to alloc rxq%d cqe\n", idx);
+ goto err_out;
+ }
+
+ pkts = hinic3_rx_alloc_buffers(nic_dev, rq_depth,
+ rqres->rx_info);
+ if (!pkts) {
+ dma_free_coherent(&nic_dev->pdev->dev, cqe_mem_size,
+ rqres->cqe_start_vaddr,
+ rqres->cqe_start_paddr);
+ kfree(rqres->rx_info);
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to alloc rxq%d rx buffers\n", idx);
+ goto err_out;
+ }
+ rqres->next_to_alloc = (u16)pkts;
+ }
+ return 0;
+
+err_out:
+ for (i = 0; i < idx; i++) {
+ rqres = &rxqs_res[i];
+
+ hinic3_rx_free_buffers(nic_dev, rq_depth, rqres->rx_info);
+ dma_free_coherent(&nic_dev->pdev->dev, cqe_mem_size,
+ rqres->cqe_start_vaddr,
+ rqres->cqe_start_paddr);
+ kfree(rqres->rx_info);
+ }
+
+ return -ENOMEM;
+}
+
+void hinic3_free_rxqs_res(struct hinic3_nic_dev *nic_dev, u16 num_rq,
+ u32 rq_depth, struct hinic3_dyna_rxq_res *rxqs_res)
+{
+ struct hinic3_dyna_rxq_res *rqres = NULL;
+ u64 cqe_mem_size = sizeof(struct hinic3_rq_cqe) * rq_depth;
+ int idx;
+
+ for (idx = 0; idx < num_rq; idx++) {
+ rqres = &rxqs_res[idx];
+
+ hinic3_rx_free_buffers(nic_dev, rq_depth, rqres->rx_info);
+ dma_free_coherent(&nic_dev->pdev->dev, cqe_mem_size,
+ rqres->cqe_start_vaddr,
+ rqres->cqe_start_paddr);
+ kfree(rqres->rx_info);
+ }
+}
+
+int hinic3_configure_rxqs(struct hinic3_nic_dev *nic_dev, u16 num_rq,
+ u32 rq_depth, struct hinic3_dyna_rxq_res *rxqs_res)
+{
+ struct hinic3_dyna_rxq_res *rqres = NULL;
+ struct irq_info *msix_entry = NULL;
+ struct hinic3_rxq *rxq = NULL;
+ struct hinic3_rq_cqe *cqe_va = NULL;
+ dma_addr_t cqe_pa;
+ u16 q_id;
+ u32 idx;
+ u32 pkts;
+
+ nic_dev->rxq_get_err_times = 0;
+ for (q_id = 0; q_id < num_rq; q_id++) {
+ rxq = &nic_dev->rxqs[q_id];
+ rqres = &rxqs_res[q_id];
+ msix_entry = &nic_dev->qps_irq_info[q_id];
+
+ rxq->irq_id = msix_entry->irq_id;
+ rxq->msix_entry_idx = msix_entry->msix_entry_idx;
+ rxq->next_to_update = 0;
+ rxq->next_to_alloc = rqres->next_to_alloc;
+ rxq->q_depth = rq_depth;
+ rxq->delta = rxq->q_depth;
+ rxq->q_mask = rxq->q_depth - 1;
+ rxq->cons_idx = 0;
+
+ rxq->last_sw_pi = rxq->q_depth - 1;
+ rxq->last_sw_ci = 0;
+ rxq->last_hw_ci = 0;
+ rxq->rx_check_err_cnt = 0;
+ rxq->rxq_print_times = 0;
+ rxq->last_packets = 0;
+ rxq->restore_buf_num = 0;
+
+ rxq->rx_info = rqres->rx_info;
+
+ /* fill cqe */
+ cqe_va = (struct hinic3_rq_cqe *)rqres->cqe_start_vaddr;
+ cqe_pa = rqres->cqe_start_paddr;
+ for (idx = 0; idx < rq_depth; idx++) {
+ rxq->rx_info[idx].cqe = cqe_va;
+ rxq->rx_info[idx].cqe_dma = cqe_pa;
+ cqe_va++;
+ cqe_pa += sizeof(*rxq->rx_info->cqe);
+ }
+
+ rxq->rq = hinic3_get_nic_queue(nic_dev->hwdev, rxq->q_id,
+ HINIC3_RQ);
+ if (!rxq->rq) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Failed to get rq\n");
+ return -EINVAL;
+ }
+
+ pkts = hinic3_rx_fill_wqe(rxq);
+ if (pkts != rxq->q_depth) {
+ nicif_err(nic_dev, drv, nic_dev->netdev, "Failed to fill rx wqe\n");
+ return -EFAULT;
+ }
+
+ pkts = hinic3_rx_fill_buffers(rxq);
+ if (!pkts) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to fill Rx buffer\n");
+ return -ENOMEM;
+ }
+ }
+
+ return 0;
+}
+
+void hinic3_free_rxqs(struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+ kfree(nic_dev->rxqs);
+}
+
+int hinic3_alloc_rxqs(struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct pci_dev *pdev = nic_dev->pdev;
+ struct hinic3_rxq *rxq = NULL;
+ u16 num_rxqs = nic_dev->max_qps;
+ u16 q_id;
+ u64 rxq_size;
+
+ rxq_size = num_rxqs * sizeof(*nic_dev->rxqs);
+ if (!rxq_size) {
+ nic_err(&pdev->dev, "Cannot allocate zero size rxqs\n");
+ return -EINVAL;
+ }
+
+ nic_dev->rxqs = kzalloc(rxq_size, GFP_KERNEL);
+ if (!nic_dev->rxqs) {
+ nic_err(&pdev->dev, "Failed to allocate rxqs\n");
+ return -ENOMEM;
+ }
+
+ for (q_id = 0; q_id < num_rxqs; q_id++) {
+ rxq = &nic_dev->rxqs[q_id];
+ rxq->netdev = netdev;
+ rxq->dev = &pdev->dev;
+ rxq->q_id = q_id;
+ rxq->buf_len = nic_dev->rx_buff_len;
+ rxq->rx_buff_shift = (u32)ilog2(nic_dev->rx_buff_len);
+ rxq->dma_rx_buff_size = nic_dev->dma_rx_buff_size;
+ rxq->q_depth = nic_dev->q_params.rq_depth;
+ rxq->q_mask = nic_dev->q_params.rq_depth - 1;
+
+ rxq_stats_init(rxq);
+ }
+
+ return 0;
+}
+
+int hinic3_rx_configure(struct net_device *netdev, u8 dcb_en)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ u8 rq2iq_map[HINIC3_MAX_NUM_RQ];
+ int err;
+
+	/* map all RQs to all IQs by default */
+	memset(rq2iq_map, 0xFF, sizeof(rq2iq_map));
+
+ if (test_bit(HINIC3_RSS_ENABLE, &nic_dev->flags)) {
+ err = hinic3_rss_init(nic_dev, rq2iq_map, sizeof(rq2iq_map), dcb_en);
+ if (err) {
+ nicif_err(nic_dev, drv, netdev, "Failed to init rss\n");
+ return -EFAULT;
+ }
+ }
+
+ return 0;
+}
+
+void hinic3_rx_remove_configure(struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+ if (test_bit(HINIC3_RSS_ENABLE, &nic_dev->flags))
+ hinic3_rss_deinit(nic_dev);
+}
+
+int rxq_restore(struct hinic3_nic_dev *nic_dev, u16 q_id, u16 hw_ci)
+{
+ struct hinic3_rxq *rxq = &nic_dev->rxqs[q_id];
+ struct hinic3_rq_wqe *rq_wqe = NULL;
+ struct hinic3_rx_info *rx_info = NULL;
+ dma_addr_t dma_addr;
+ u32 free_wqebbs = rxq->delta - rxq->restore_buf_num;
+ u32 buff_pi;
+ u32 i;
+ int err;
+
+ if (rxq->delta < rxq->restore_buf_num)
+ return -EINVAL;
+
+ if (rxq->restore_buf_num == 0) /* start restore process */
+ rxq->restore_pi = rxq->next_to_update;
+
+ buff_pi = rxq->restore_pi;
+
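+	/* the gap between the sw ci and next_to_update must equal delta, the
+	 * number of wqebbs owned by the driver; e.g. with q_depth 256, ci 10
+	 * and next_to_update 250: (10 + 256 - 250) % 256 = 16 must match delta
+	 */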
+ if ((((rxq->cons_idx & rxq->q_mask) + rxq->q_depth -
+ rxq->next_to_update) % rxq->q_depth) != rxq->delta)
+ return -EINVAL;
+
+ for (i = 0; i < free_wqebbs; i++) {
+ rx_info = &rxq->rx_info[buff_pi];
+
+ if (unlikely(!rx_alloc_mapped_page(nic_dev, rx_info))) {
+ RXQ_STATS_INC(rxq, alloc_rx_buf_err);
+ rxq->restore_pi = (u16)((rxq->restore_pi + i) & rxq->q_mask);
+ return -ENOMEM;
+ }
+
+ dma_addr = rx_info->buf_dma_addr + rx_info->page_offset;
+
+ rq_wqe = rx_info->rq_wqe;
+
+ if (rxq->rq->wqe_type == HINIC3_EXTEND_RQ_WQE) {
+ rq_wqe->extend_wqe.buf_desc.sge.hi_addr =
+ hinic3_hw_be32(upper_32_bits(dma_addr));
+ rq_wqe->extend_wqe.buf_desc.sge.lo_addr =
+ hinic3_hw_be32(lower_32_bits(dma_addr));
+ } else {
+ rq_wqe->normal_wqe.buf_hi_addr =
+ hinic3_hw_be32(upper_32_bits(dma_addr));
+ rq_wqe->normal_wqe.buf_lo_addr =
+ hinic3_hw_be32(lower_32_bits(dma_addr));
+ }
+ buff_pi = (u16)((buff_pi + 1) & rxq->q_mask);
+ rxq->restore_buf_num++;
+ }
+
+ nic_info(&nic_dev->pdev->dev, "rxq %u restore_buf_num:%u\n", q_id, rxq->restore_buf_num);
+
+ rx_info = &rxq->rx_info[(hw_ci + rxq->q_depth - 1) & rxq->q_mask];
+ if (rx_info->buf_dma_addr) {
+ dma_unmap_page(&nic_dev->pdev->dev, rx_info->buf_dma_addr,
+ nic_dev->dma_rx_buff_size, DMA_FROM_DEVICE);
+ rx_info->buf_dma_addr = 0;
+ }
+
+ if (rx_info->page) {
+ __free_pages(rx_info->page, nic_dev->page_order);
+ rx_info->page = NULL;
+ }
+
+ rxq->delta = 1;
+ rxq->next_to_update = (u16)((hw_ci + rxq->q_depth - 1) & rxq->q_mask);
+ rxq->cons_idx = (u16)((rxq->next_to_update + 1) & rxq->q_mask);
+ rxq->restore_buf_num = 0;
+ rxq->next_to_alloc = rxq->next_to_update;
+
+ for (i = 0; i < rxq->q_depth; i++) {
+ if (!HINIC3_GET_RX_DONE(hinic3_hw_cpu32(rxq->rx_info[i].cqe->status)))
+ continue;
+
+ RXQ_STATS_INC(rxq, restore_drop_sge);
+ rxq->rx_info[i].cqe->status = 0;
+ }
+
+ err = hinic3_cache_out_qps_res(nic_dev->hwdev);
+ if (err) {
+ clear_bit(HINIC3_RXQ_RECOVERY, &nic_dev->flags);
+ return err;
+ }
+
+ if (!rq_pi_rd_en) {
+ hinic3_write_db(rxq->rq, rxq->q_id & (NIC_DCB_COS_MAX - 1),
+ RQ_CFLAG_DP, (u16)((u32)rxq->next_to_update << rxq->rq->wqe_type));
+ } else {
+ /* Write all the wqes before pi update */
+ wmb();
+
+ hinic3_update_rq_hw_pi(rxq->rq, rxq->next_to_update);
+ }
+
+ return 0;
+}
+
+bool rxq_is_normal(struct hinic3_rxq *rxq, struct rxq_check_info rxq_info)
+{
+ u32 status;
+
+ if (rxq->rxq_stats.packets != rxq->last_packets || rxq_info.hw_pi != rxq_info.hw_ci ||
+ rxq_info.hw_ci != rxq->last_hw_ci || rxq->next_to_update != rxq->last_sw_pi)
+ return true;
+
+	/* the hardware has consumed no RX WQE and the driver has received no packet */
+ status = rxq->rx_info[rxq->cons_idx & rxq->q_mask].cqe->status;
+ if (HINIC3_GET_RX_DONE(hinic3_hw_cpu32(status)))
+ return true;
+
+ if ((rxq->cons_idx & rxq->q_mask) != rxq->last_sw_ci ||
+ rxq->rxq_stats.packets != rxq->last_packets ||
+ rxq->next_to_update != rxq_info.hw_pi)
+ return true;
+
+ return false;
+}
+
+#define RXQ_CHECK_ERR_TIMES 2
+#define RXQ_PRINT_MAX_TIMES 3
+#define RXQ_GET_ERR_MAX_TIMES 3
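+/* a queue is restored only after it looks abnormal for RXQ_CHECK_ERR_TIMES
+ * consecutive checks; the work re-arms itself every HZ while recovery is on
+ */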
+void hinic3_rxq_check_work_handler(struct work_struct *work)
+{
+ struct delayed_work *delay = to_delayed_work(work);
+ struct hinic3_nic_dev *nic_dev = container_of(delay, struct hinic3_nic_dev,
+ rxq_check_work);
+ struct rxq_check_info *rxq_info = NULL;
+ struct hinic3_rxq *rxq = NULL;
+ u64 size;
+ u16 qid;
+ int err;
+
+ if (!test_bit(HINIC3_INTF_UP, &nic_dev->flags))
+ return;
+
+ if (test_bit(HINIC3_RXQ_RECOVERY, &nic_dev->flags))
+ queue_delayed_work(nic_dev->workq, &nic_dev->rxq_check_work, HZ);
+
+ size = sizeof(*rxq_info) * nic_dev->q_params.num_qps;
+ if (!size)
+ return;
+
+ rxq_info = kzalloc(size, GFP_KERNEL);
+ if (!rxq_info)
+ return;
+
+ err = hinic3_get_rxq_hw_info(nic_dev->hwdev, rxq_info, nic_dev->q_params.num_qps,
+ nic_dev->rxqs[0].rq->wqe_type);
+ if (err) {
+ nic_dev->rxq_get_err_times++;
+ if (nic_dev->rxq_get_err_times >= RXQ_GET_ERR_MAX_TIMES)
+ clear_bit(HINIC3_RXQ_RECOVERY, &nic_dev->flags);
+ goto free_rxq_info;
+ }
+
+ for (qid = 0; qid < nic_dev->q_params.num_qps; qid++) {
+ rxq = &nic_dev->rxqs[qid];
+ if (!rxq_is_normal(rxq, rxq_info[qid])) {
+ rxq->rx_check_err_cnt++;
+ if (rxq->rx_check_err_cnt < RXQ_CHECK_ERR_TIMES)
+ continue;
+
+ if (rxq->rxq_print_times <= RXQ_PRINT_MAX_TIMES) {
+ nic_warn(&nic_dev->pdev->dev, "rxq %u wqe abnormal, hw_pi:%u, hw_ci:%u, sw_pi:%u, sw_ci:%u delta:%u\n",
+ qid, rxq_info[qid].hw_pi, rxq_info[qid].hw_ci,
+ rxq->next_to_update,
+ rxq->cons_idx & rxq->q_mask, rxq->delta);
+ rxq->rxq_print_times++;
+ }
+
+ err = rxq_restore(nic_dev, qid, rxq_info[qid].hw_ci);
+ if (err)
+ continue;
+ }
+
+ rxq->rxq_print_times = 0;
+ rxq->rx_check_err_cnt = 0;
+ rxq->last_sw_pi = rxq->next_to_update;
+ rxq->last_sw_ci = rxq->cons_idx & rxq->q_mask;
+ rxq->last_hw_ci = rxq_info[qid].hw_ci;
+ rxq->last_packets = rxq->rxq_stats.packets;
+ }
+
+ nic_dev->rxq_get_err_times = 0;
+
+free_rxq_info:
+ kfree(rxq_info);
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_rx.h b/drivers/net/ethernet/huawei/hinic3/hinic3_rx.h
new file mode 100644
index 000000000000..f4d6f4fdb13e
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_rx.h
@@ -0,0 +1,155 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_RX_H
+#define HINIC3_RX_H
+
+#include <linux/types.h>
+#include <linux/device.h>
+#include <linux/mm_types.h>
+#include <linux/netdevice.h>
+#include <linux/skbuff.h>
+#include <linux/u64_stats_sync.h>
+
+#include "hinic3_nic_io.h"
+#include "hinic3_nic_qp.h"
+#include "hinic3_nic_dev.h"
+
+/* rx cqe checksum err */
+#define HINIC3_RX_CSUM_IP_CSUM_ERR BIT(0)
+#define HINIC3_RX_CSUM_TCP_CSUM_ERR BIT(1)
+#define HINIC3_RX_CSUM_UDP_CSUM_ERR BIT(2)
+#define HINIC3_RX_CSUM_IGMP_CSUM_ERR BIT(3)
+#define HINIC3_RX_CSUM_ICMPV4_CSUM_ERR BIT(4)
+#define HINIC3_RX_CSUM_ICMPV6_CSUM_ERR BIT(5)
+#define HINIC3_RX_CSUM_SCTP_CRC_ERR BIT(6)
+#define HINIC3_RX_CSUM_HW_CHECK_NONE BIT(7)
+#define HINIC3_RX_CSUM_IPSU_OTHER_ERR BIT(8)
+
+#define HINIC3_HEADER_DATA_UNIT 2
+
+struct hinic3_rxq_stats {
+ u64 packets;
+ u64 bytes;
+ u64 errors;
+ u64 csum_errors;
+ u64 other_errors;
+ u64 dropped;
+ u64 xdp_dropped;
+ u64 rx_buf_empty;
+
+ u64 alloc_skb_err;
+ u64 alloc_rx_buf_err;
+ u64 xdp_large_pkt;
+ u64 restore_drop_sge;
+ u64 rsvd2;
+#ifdef HAVE_NDO_GET_STATS64
+ struct u64_stats_sync syncp;
+#else
+ struct u64_stats_sync_empty syncp;
+#endif
+};
+
+struct hinic3_rx_info {
+ dma_addr_t buf_dma_addr;
+
+ struct hinic3_rq_cqe *cqe;
+ dma_addr_t cqe_dma;
+ struct page *page;
+ u32 page_offset;
+ u32 rsvd1;
+ struct hinic3_rq_wqe *rq_wqe;
+ struct sk_buff *saved_skb;
+ u32 skb_len;
+ u32 rsvd2;
+};
+
+struct hinic3_rxq {
+ struct net_device *netdev;
+
+ u16 q_id;
+ u16 rsvd1;
+ u32 q_depth;
+ u32 q_mask;
+
+ u16 buf_len;
+ u16 rsvd2;
+ u32 rx_buff_shift;
+ u32 dma_rx_buff_size;
+
+ struct hinic3_rxq_stats rxq_stats;
+ u32 cons_idx;
+ u32 delta;
+
+ u32 irq_id;
+ u16 msix_entry_idx;
+ u16 rsvd3;
+
+ struct hinic3_rx_info *rx_info;
+ struct hinic3_io_queue *rq;
+#ifdef HAVE_XDP_SUPPORT
+ struct bpf_prog *xdp_prog;
+#endif
+
+ struct hinic3_irq *irq_cfg;
+ u16 next_to_alloc;
+ u16 next_to_update;
+ struct device *dev; /* device for DMA mapping */
+
+ unsigned long status;
+ dma_addr_t cqe_start_paddr;
+ void *cqe_start_vaddr;
+
+ u64 last_moder_packets;
+ u64 last_moder_bytes;
+ u8 last_coalesc_timer_cfg;
+ u8 last_pending_limt;
+ u16 restore_buf_num;
+ u32 rsvd5;
+ u64 rsvd6;
+
+ u32 last_sw_pi;
+ u32 last_sw_ci;
+
+ u32 last_hw_ci;
+ u8 rx_check_err_cnt;
+ u8 rxq_print_times;
+ u16 restore_pi;
+
+ u64 last_packets;
+} ____cacheline_aligned;
+
+struct hinic3_dyna_rxq_res {
+ u16 next_to_alloc;
+ struct hinic3_rx_info *rx_info;
+ dma_addr_t cqe_start_paddr;
+ void *cqe_start_vaddr;
+};
+
+int hinic3_alloc_rxqs(struct net_device *netdev);
+
+void hinic3_free_rxqs(struct net_device *netdev);
+
+int hinic3_alloc_rxqs_res(struct hinic3_nic_dev *nic_dev, u16 num_rq,
+ u32 rq_depth, struct hinic3_dyna_rxq_res *rxqs_res);
+
+void hinic3_free_rxqs_res(struct hinic3_nic_dev *nic_dev, u16 num_rq,
+ u32 rq_depth, struct hinic3_dyna_rxq_res *rxqs_res);
+
+int hinic3_configure_rxqs(struct hinic3_nic_dev *nic_dev, u16 num_rq,
+ u32 rq_depth, struct hinic3_dyna_rxq_res *rxqs_res);
+
+int hinic3_rx_configure(struct net_device *netdev, u8 dcb_en);
+
+void hinic3_rx_remove_configure(struct net_device *netdev);
+
+int hinic3_rx_poll(struct hinic3_rxq *rxq, int budget);
+
+void hinic3_rxq_get_stats(struct hinic3_rxq *rxq,
+ struct hinic3_rxq_stats *stats);
+
+void hinic3_rxq_clean_stats(struct hinic3_rxq_stats *rxq_stats);
+
+void hinic3_rxq_check_work_handler(struct work_struct *work);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_srv_nic.h b/drivers/net/ethernet/huawei/hinic3/hinic3_srv_nic.h
new file mode 100644
index 000000000000..fee4cfca1e4c
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_srv_nic.h
@@ -0,0 +1,213 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) Huawei Technologies Co., Ltd. 2018-2022. All rights reserved.
+ * @file hinic3_srv_nic.h
+ * @details nic service interface
+ * History :
+ * 1.Date : 2018/3/8
+ * Modification: Created file
+ */
+
+#ifndef HINIC3_SRV_NIC_H
+#define HINIC3_SRV_NIC_H
+
+#include "hinic3_mgmt_interface.h"
+#include "mag_cmd.h"
+#include "hinic3_lld.h"
+
+enum hinic3_queue_type {
+ HINIC3_SQ,
+ HINIC3_RQ,
+ HINIC3_MAX_QUEUE_TYPE
+};
+
+struct hinic3_lld_dev *hinic3_get_lld_dev_by_netdev(struct net_device *netdev);
+struct net_device *hinic3_get_netdev_by_lld(struct hinic3_lld_dev *lld_dev);
+
+struct hinic3_event_link_info {
+ u8 valid;
+ u8 port_type;
+ u8 autoneg_cap;
+ u8 autoneg_state;
+ u8 duplex;
+ u8 speed;
+};
+
+enum link_err_type {
+ LINK_ERR_MODULE_UNRECOGENIZED,
+ LINK_ERR_NUM,
+};
+
+enum port_module_event_type {
+ HINIC3_PORT_MODULE_CABLE_PLUGGED,
+ HINIC3_PORT_MODULE_CABLE_UNPLUGGED,
+ HINIC3_PORT_MODULE_LINK_ERR,
+ HINIC3_PORT_MODULE_MAX_EVENT,
+};
+
+struct hinic3_port_module_event {
+ enum port_module_event_type type;
+ enum link_err_type err_type;
+};
+
+struct hinic3_dcb_info {
+ u8 dcb_on;
+ u8 default_cos;
+ u8 up_cos[NIC_DCB_COS_MAX];
+};
+
+enum hinic3_nic_event_type {
+ EVENT_NIC_LINK_DOWN,
+ EVENT_NIC_LINK_UP,
+ EVENT_NIC_PORT_MODULE_EVENT,
+ EVENT_NIC_DCB_STATE_CHANGE,
+};
+
+/* *
+ * @brief hinic3_set_mac - set mac address
+ * @param hwdev: device pointer to hwdev
+ * @param mac_addr: mac address from hardware
+ * @param vlan_id: vlan id
+ * @param func_id: function index
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_set_mac(void *hwdev, const u8 *mac_addr, u16 vlan_id, u16 func_id, u16 channel);
+
+/* *
+ * @brief hinic3_del_mac - delete mac address
+ * @param hwdev: device pointer to hwdev
+ * @param mac_addr: mac address from hardware
+ * @param vlan_id: vlan id
+ * @param func_id: function index
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_del_mac(void *hwdev, const u8 *mac_addr, u16 vlan_id, u16 func_id, u16 channel);
+
+/* *
+ * @brief hinic3_set_vport_enable - set function valid status
+ * @param hwdev: device pointer to hwdev
+ * @param func_id: global function index
+ * @param enable: 0-disable, 1-enable
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_set_vport_enable(void *hwdev, u16 func_id, bool enable, u16 channel);
+
+/* *
+ * @brief hinic3_set_port_enable - set port status
+ * @param hwdev: device pointer to hwdev
+ * @param enable: 0-disable, 1-enable
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_set_port_enable(void *hwdev, bool enable, u16 channel);
+
+/* *
+ * @brief hinic3_flush_qps_res - flush queue pairs resource in hardware
+ * @param hwdev: device pointer to hwdev
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_flush_qps_res(void *hwdev);
+
+/* *
+ * @brief hinic3_cache_out_qps_res - cache out queue pairs wqe resource in hardware
+ * @param hwdev: device pointer to hwdev
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_cache_out_qps_res(void *hwdev);
+
+/* *
+ * @brief hinic3_init_nic_hwdev - init nic hwdev
+ * @param hwdev: device pointer to hwdev
+ * @param pcidev_hdl: pointer to pcidev or handler
+ * @param dev_hdl: pointer to pcidev->dev or handler, for sdk_err() or
+ * dma_alloc()
+ * @param rx_buff_len: receive buffer length
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_init_nic_hwdev(void *hwdev, void *pcidev_hdl, void *dev_hdl, u16 rx_buff_len);
+
+/* *
+ * @brief hinic3_free_nic_hwdev - free nic hwdev
+ * @param hwdev: device pointer to hwdev
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+void hinic3_free_nic_hwdev(void *hwdev);
+
+/* *
+ * @brief hinic3_get_speed - get link speed
+ * @param hwdev: device pointer to hwdev
+ * @param speed: link speed output
+ * @param channel: channel id
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_get_speed(void *hwdev, enum mag_cmd_port_speed *speed, u16 channel);
+
+int hinic3_get_dcb_state(void *hwdev, struct hinic3_dcb_state *dcb_state);
+
+int hinic3_get_pf_dcb_state(void *hwdev, struct hinic3_dcb_state *dcb_state);
+
+int hinic3_get_cos_by_pri(void *hwdev, u8 pri, u8 *cos);
+
+/* *
+ * @brief hinic3_create_qps - create queue pairs
+ * @param hwdev: device pointer to hwdev
+ * @param num_qp: number of queue pairs
+ * @param sq_depth: sq depth
+ * @param rq_depth: rq depth
+ * @param qps_msix_arry: msix info
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_create_qps(void *hwdev, u16 num_qp, u32 sq_depth, u32 rq_depth,
+ struct irq_info *qps_msix_arry);
+
+/* *
+ * @brief hinic3_destroy_qps - destroy queue pairs
+ * @param hwdev: device pointer to hwdev
+ */
+void hinic3_destroy_qps(void *hwdev);
+
+/* *
+ * @brief hinic3_get_nic_queue - get nic queue
+ * @param hwdev: device pointer to hwdev
+ * @param q_id: queue index
+ * @param q_type: queue type
+ * @retval queue address
+ */
+void *hinic3_get_nic_queue(void *hwdev, u16 q_id, enum hinic3_queue_type q_type);
+
+/* *
+ * @brief hinic3_init_qp_ctxts - init queue pair context
+ * @param hwdev: device pointer to hwdev
+ * @retval zero: success
+ * @retval non-zero: failure
+ */
+int hinic3_init_qp_ctxts(void *hwdev);
+
+/* *
+ * @brief hinic3_free_qp_ctxts - free queue pairs
+ * @param hwdev: device pointer to hwdev
+ */
+void hinic3_free_qp_ctxts(void *hwdev);
+
+/* *
+ * @brief hinic3_pf_set_vf_link_state - pf set vf link state
+ * @param hwdev: device pointer to hwdev
+ * @param vf_link_forced: set link forced
+ * @param link_state: link state to set; valid only when vf_link_forced is true
+ */
+int hinic3_pf_set_vf_link_state(void *hwdev, bool vf_link_forced, bool link_state);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_tx.c b/drivers/net/ethernet/huawei/hinic3/hinic3_tx.c
new file mode 100644
index 000000000000..3029cff7f00b
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_tx.c
@@ -0,0 +1,1016 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC]" fmt
+
+#include <net/xfrm.h>
+#include <linux/netdevice.h>
+#include <linux/kernel.h>
+#include <linux/skbuff.h>
+#include <linux/interrupt.h>
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/tcp.h>
+#include <linux/sctp.h>
+#include <linux/dma-mapping.h>
+#include <linux/types.h>
+#include <linux/u64_stats_sync.h>
+#include <linux/module.h>
+#include <linux/vmalloc.h>
+
+#include "ossl_knl.h"
+#include "hinic3_crm.h"
+#include "hinic3_nic_qp.h"
+#include "hinic3_nic_io.h"
+#include "hinic3_nic_cfg.h"
+#include "hinic3_srv_nic.h"
+#include "hinic3_nic_dev.h"
+#include "hinic3_tx.h"
+
+#define MIN_SKB_LEN 32
+
+#define MAX_PAYLOAD_OFFSET 221
+
+#define NIC_QID(q_id, nic_dev) ((q_id) & ((nic_dev)->num_qps - 1))
+
+#define HINIC3_TX_TASK_WRAPPED 1
+#define HINIC3_TX_BD_DESC_WRAPPED 2
+
+#define TXQ_STATS_INC(txq, field) \
+do { \
+ u64_stats_update_begin(&(txq)->txq_stats.syncp); \
+ (txq)->txq_stats.field++; \
+ u64_stats_update_end(&(txq)->txq_stats.syncp); \
+} while (0)
+
+void hinic3_txq_get_stats(struct hinic3_txq *txq,
+ struct hinic3_txq_stats *stats)
+{
+ struct hinic3_txq_stats *txq_stats = &txq->txq_stats;
+ unsigned int start;
+
+ u64_stats_update_begin(&stats->syncp);
+ do {
+ start = u64_stats_fetch_begin(&txq_stats->syncp);
+ stats->bytes = txq_stats->bytes;
+ stats->packets = txq_stats->packets;
+ stats->busy = txq_stats->busy;
+ stats->wake = txq_stats->wake;
+ stats->dropped = txq_stats->dropped;
+ } while (u64_stats_fetch_retry(&txq_stats->syncp, start));
+ u64_stats_update_end(&stats->syncp);
+}
+
+void hinic3_txq_clean_stats(struct hinic3_txq_stats *txq_stats)
+{
+ u64_stats_update_begin(&txq_stats->syncp);
+ txq_stats->bytes = 0;
+ txq_stats->packets = 0;
+ txq_stats->busy = 0;
+ txq_stats->wake = 0;
+ txq_stats->dropped = 0;
+
+ txq_stats->skb_pad_err = 0;
+ txq_stats->frag_len_overflow = 0;
+ txq_stats->offload_cow_skb_err = 0;
+ txq_stats->map_frag_err = 0;
+ txq_stats->unknown_tunnel_pkt = 0;
+ txq_stats->frag_size_err = 0;
+ txq_stats->rsvd1 = 0;
+ txq_stats->rsvd2 = 0;
+ u64_stats_update_end(&txq_stats->syncp);
+}
+
+static void txq_stats_init(struct hinic3_txq *txq)
+{
+ struct hinic3_txq_stats *txq_stats = &txq->txq_stats;
+
+ u64_stats_init(&txq_stats->syncp);
+ hinic3_txq_clean_stats(txq_stats);
+}
+
+static inline void hinic3_set_buf_desc(struct hinic3_sq_bufdesc *buf_descs,
+ dma_addr_t addr, u32 len)
+{
+ buf_descs->hi_addr = hinic3_hw_be32(upper_32_bits(addr));
+ buf_descs->lo_addr = hinic3_hw_be32(lower_32_bits(addr));
+ buf_descs->len = hinic3_hw_be32(len);
+}
+
+static int tx_map_skb(struct hinic3_nic_dev *nic_dev, struct sk_buff *skb,
+ u16 valid_nr_frags, struct hinic3_txq *txq,
+ struct hinic3_tx_info *tx_info,
+ struct hinic3_sq_wqe_combo *wqe_combo)
+{
+ struct hinic3_sq_wqe_desc *wqe_desc = wqe_combo->ctrl_bd0;
+ struct hinic3_sq_bufdesc *buf_desc = wqe_combo->bds_head;
+ struct hinic3_dma_info *dma_info = tx_info->dma_info;
+ struct pci_dev *pdev = nic_dev->pdev;
+ skb_frag_t *frag = NULL;
+ u32 j, i;
+ int err;
+
+ dma_info[0].dma = dma_map_single(&pdev->dev, skb->data, skb_headlen(skb), DMA_TO_DEVICE);
+ if (dma_mapping_error(&pdev->dev, dma_info[0].dma)) {
+ TXQ_STATS_INC(txq, map_frag_err);
+ return -EFAULT;
+ }
+
+ dma_info[0].len = skb_headlen(skb);
+
+ wqe_desc->hi_addr = hinic3_hw_be32(upper_32_bits(dma_info[0].dma));
+ wqe_desc->lo_addr = hinic3_hw_be32(lower_32_bits(dma_info[0].dma));
+
+ wqe_desc->ctrl_len = dma_info[0].len;
+
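+	/* dma_info[0] holds the linear-head mapping, so fragment i lands in
+	 * dma_info[i + 1]; hence the index is advanced before each frag map
+	 */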
+ for (i = 0; i < valid_nr_frags;) {
+ frag = &(skb_shinfo(skb)->frags[i]);
+ if (unlikely(i == wqe_combo->first_bds_num))
+ buf_desc = wqe_combo->bds_sec2;
+
+ i++;
+ dma_info[i].dma = skb_frag_dma_map(&pdev->dev, frag, 0,
+ skb_frag_size(frag),
+ DMA_TO_DEVICE);
+ if (dma_mapping_error(&pdev->dev, dma_info[i].dma)) {
+ TXQ_STATS_INC(txq, map_frag_err);
+ i--;
+ err = -EFAULT;
+ goto frag_map_err;
+ }
+ dma_info[i].len = skb_frag_size(frag);
+
+ hinic3_set_buf_desc(buf_desc, dma_info[i].dma,
+ dma_info[i].len);
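+/* credit the headers LRO stripped from coalesced segments; e.g. an
+ * aggregation of 4 IPv4 segments adds (4 - 1) * 66 header bytes to the
+ * on-wire byte count in hinic3_rx_poll()
+ */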
+ buf_desc++;
+ }
+
+ return 0;
+
+frag_map_err:
+ for (j = 0; j < i;) {
+ j++;
+ dma_unmap_page(&pdev->dev, dma_info[j].dma,
+ dma_info[j].len, DMA_TO_DEVICE);
+ }
+ dma_unmap_single(&pdev->dev, dma_info[0].dma, dma_info[0].len,
+ DMA_TO_DEVICE);
+ return err;
+}
+
+static inline void tx_unmap_skb(struct hinic3_nic_dev *nic_dev,
+ struct sk_buff *skb, u16 valid_nr_frags,
+ struct hinic3_dma_info *dma_info)
+{
+ struct pci_dev *pdev = nic_dev->pdev;
+ int i;
+
+ for (i = 0; i < valid_nr_frags;) {
+ i++;
+ dma_unmap_page(&pdev->dev,
+ dma_info[i].dma,
+ dma_info[i].len, DMA_TO_DEVICE);
+ }
+
+ dma_unmap_single(&pdev->dev, dma_info[0].dma,
+ dma_info[0].len, DMA_TO_DEVICE);
+}
+
+union hinic3_l4 {
+ struct tcphdr *tcp;
+ struct udphdr *udp;
+ unsigned char *hdr;
+};
+
+enum sq_l3_type {
+ UNKNOWN_L3TYPE = 0,
+ IPV6_PKT = 1,
+ IPV4_PKT_NO_CHKSUM_OFFLOAD = 2,
+ IPV4_PKT_WITH_CHKSUM_OFFLOAD = 3,
+};
+
+enum sq_l4offload_type {
+ OFFLOAD_DISABLE = 0,
+ TCP_OFFLOAD_ENABLE = 1,
+ SCTP_OFFLOAD_ENABLE = 2,
+ UDP_OFFLOAD_ENABLE = 3,
+};
+
+/* initialize the l4 offload type and payload offset */
+static void get_inner_l4_info(struct sk_buff *skb, union hinic3_l4 *l4,
+ u8 l4_proto, u32 *offset,
+ enum sq_l4offload_type *l4_offload)
+{
+ switch (l4_proto) {
+ case IPPROTO_TCP:
+ *l4_offload = TCP_OFFLOAD_ENABLE;
+		/* to stay consistent with TSO, the payload offset begins at the payload */
+ *offset = (l4->tcp->doff << TCP_HDR_DATA_OFF_UNIT_SHIFT) +
+ TRANSPORT_OFFSET(l4->hdr, skb);
+ break;
+
+ case IPPROTO_UDP:
+ *l4_offload = UDP_OFFLOAD_ENABLE;
+ *offset = TRANSPORT_OFFSET(l4->hdr, skb);
+ break;
+ default:
+ break;
+ }
+}
+
+static int hinic3_tx_csum(struct hinic3_txq *txq, struct hinic3_sq_task *task,
+ struct sk_buff *skb)
+{
+ if (skb->ip_summed != CHECKSUM_PARTIAL)
+ return 0;
+
+#if (KERNEL_VERSION(3, 8, 0) <= LINUX_VERSION_CODE)
+ if (skb->encapsulation) {
+ union hinic3_ip ip;
+ u8 l4_proto;
+
+ task->pkt_info0 |= SQ_TASK_INFO0_SET(1U, TUNNEL_FLAG);
+
+ ip.hdr = skb_network_header(skb);
+ if (ip.v4->version == IPV4_VERSION) {
+ l4_proto = ip.v4->protocol;
+ } else if (ip.v4->version == IPV6_VERSION) {
+ union hinic3_l4 l4;
+ unsigned char *exthdr;
+ __be16 frag_off;
+
+#ifdef HAVE_OUTER_IPV6_TUNNEL_OFFLOAD
+ task->pkt_info0 |= SQ_TASK_INFO0_SET(1U, OUT_L4_EN);
+#endif
+ exthdr = ip.hdr + sizeof(*ip.v6);
+ l4_proto = ip.v6->nexthdr;
+ l4.hdr = skb_transport_header(skb);
+ if (l4.hdr != exthdr)
+ ipv6_skip_exthdr(skb, exthdr - skb->data,
+ &l4_proto, &frag_off);
+ } else {
+ l4_proto = IPPROTO_RAW;
+ }
+
+ if (l4_proto != IPPROTO_UDP ||
+ ((struct udphdr *)skb_transport_header(skb))->dest != VXLAN_OFFLOAD_PORT_LE) {
+ TXQ_STATS_INC(txq, unknown_tunnel_pkt);
+			/* unsupported tunnel packet, disable csum offload */
+ skb_checksum_help(skb);
+ return 0;
+ }
+ }
+
+ task->pkt_info0 |= SQ_TASK_INFO0_SET(1U, INNER_L4_EN);
+#else
+ task->pkt_info0 |= SQ_TASK_INFO0_SET(1U, INNER_L4_EN);
+#endif
+ return 1;
+}
+
+static void get_inner_l3_l4_type(struct sk_buff *skb, union hinic3_ip *ip,
+ union hinic3_l4 *l4,
+ enum sq_l3_type *l3_type, u8 *l4_proto)
+{
+ unsigned char *exthdr = NULL;
+
+ if (ip->v4->version == IP4_VERSION) {
+ *l3_type = IPV4_PKT_WITH_CHKSUM_OFFLOAD;
+ *l4_proto = ip->v4->protocol;
+
+#ifdef HAVE_OUTER_IPV6_TUNNEL_OFFLOAD
+ /* inner_transport_header is wrong in centos7.0 and suse12.1 */
+ l4->hdr = ip->hdr + ((u8)ip->v4->ihl << IP_HDR_IHL_UNIT_SHIFT);
+#endif
+ } else if (ip->v4->version == IP6_VERSION) {
+ *l3_type = IPV6_PKT;
+ exthdr = ip->hdr + sizeof(*ip->v6);
+ *l4_proto = ip->v6->nexthdr;
+ if (exthdr != l4->hdr) {
+ __be16 frag_off = 0;
+#ifndef HAVE_OUTER_IPV6_TUNNEL_OFFLOAD
+ ipv6_skip_exthdr(skb, (int)(exthdr - skb->data),
+ l4_proto, &frag_off);
+#else
+ int pld_off = 0;
+
+ pld_off = ipv6_skip_exthdr(skb,
+ (int)(exthdr - skb->data),
+ l4_proto, &frag_off);
+ l4->hdr = skb->data + pld_off;
+#endif
+ }
+ } else {
+ *l3_type = UNKNOWN_L3TYPE;
+ *l4_proto = 0;
+ }
+}
+
+static void hinic3_set_tso_info(struct hinic3_sq_task *task, u32 *queue_info,
+ enum sq_l4offload_type l4_offload,
+ u32 offset, u32 mss)
+{
+ if (l4_offload == TCP_OFFLOAD_ENABLE) {
+ *queue_info |= SQ_CTRL_QUEUE_INFO_SET(1U, TSO);
+ task->pkt_info0 |= SQ_TASK_INFO0_SET(1U, INNER_L4_EN);
+ } else if (l4_offload == UDP_OFFLOAD_ENABLE) {
+ *queue_info |= SQ_CTRL_QUEUE_INFO_SET(1U, UFO);
+ task->pkt_info0 |= SQ_TASK_INFO0_SET(1U, INNER_L4_EN);
+ }
+
+	/* enable L3 checksum calculation by default */
+ task->pkt_info0 |= SQ_TASK_INFO0_SET(1U, INNER_L3_EN);
+
+ *queue_info |= SQ_CTRL_QUEUE_INFO_SET(offset >> 1, PLDOFF);
+
+ /* set MSS value */
+ *queue_info = SQ_CTRL_QUEUE_INFO_CLEAR(*queue_info, MSS);
+ *queue_info |= SQ_CTRL_QUEUE_INFO_SET(mss, MSS);
+}
+
+static int hinic3_tso(struct hinic3_sq_task *task, u32 *queue_info,
+ struct sk_buff *skb)
+{
+ enum sq_l4offload_type l4_offload = OFFLOAD_DISABLE;
+ enum sq_l3_type l3_type;
+ union hinic3_ip ip;
+ union hinic3_l4 l4;
+ u32 offset = 0;
+ u8 l4_proto;
+ int err;
+
+ if (!skb_is_gso(skb))
+ return 0;
+
+ err = skb_cow_head(skb, 0);
+ if (err < 0)
+ return err;
+
+#if (KERNEL_VERSION(3, 8, 0) <= LINUX_VERSION_CODE)
+ if (skb->encapsulation) {
+ u32 gso_type = skb_shinfo(skb)->gso_type;
+		/* the outer L3 checksum is always enabled */
+ task->pkt_info0 |= SQ_TASK_INFO0_SET(1U, OUT_L3_EN);
+ task->pkt_info0 |= SQ_TASK_INFO0_SET(1U, TUNNEL_FLAG);
+
+ l4.hdr = skb_transport_header(skb);
+ ip.hdr = skb_network_header(skb);
+
+ if (gso_type & SKB_GSO_UDP_TUNNEL_CSUM) {
+ l4.udp->check = ~csum_magic(&ip, IPPROTO_UDP);
+ task->pkt_info0 |= SQ_TASK_INFO0_SET(1U, OUT_L4_EN);
+ } else if (gso_type & SKB_GSO_UDP_TUNNEL) {
+#ifdef HAVE_OUTER_IPV6_TUNNEL_OFFLOAD
+ if (ip.v4->version == 6) {
+ l4.udp->check = ~csum_magic(&ip, IPPROTO_UDP);
+ task->pkt_info0 |=
+ SQ_TASK_INFO0_SET(1U, OUT_L4_EN);
+ }
+#endif
+ }
+
+ ip.hdr = skb_inner_network_header(skb);
+ l4.hdr = skb_inner_transport_header(skb);
+ } else {
+ ip.hdr = skb_network_header(skb);
+ l4.hdr = skb_transport_header(skb);
+ }
+#else
+ ip.hdr = skb_network_header(skb);
+ l4.hdr = skb_transport_header(skb);
+#endif
+
+ get_inner_l3_l4_type(skb, &ip, &l4, &l3_type, &l4_proto);
+
+ if (l4_proto == IPPROTO_TCP)
+ l4.tcp->check = ~csum_magic(&ip, IPPROTO_TCP);
+#ifdef HAVE_IP6_FRAG_ID_ENABLE_UFO
+ else if (l4_proto == IPPROTO_UDP && ip.v4->version == 6)
+ task->ip_identify =
+ be32_to_cpu(skb_shinfo(skb)->ip6_frag_id);
+#endif
+
+ get_inner_l4_info(skb, &l4, l4_proto, &offset, &l4_offload);
+
+#ifdef HAVE_OUTER_IPV6_TUNNEL_OFFLOAD
+ u32 network_hdr_len;
+
+ if (unlikely(l3_type == UNKNOWN_L3TYPE))
+ network_hdr_len = 0;
+ else
+ network_hdr_len = l4.hdr - ip.hdr;
+
+ if (unlikely(!offset)) {
+ if (l3_type == UNKNOWN_L3TYPE)
+ offset = ip.hdr - skb->data;
+ else if (l4_offload == OFFLOAD_DISABLE)
+ offset = ip.hdr - skb->data + network_hdr_len;
+ }
+#endif
+
+ hinic3_set_tso_info(task, queue_info, l4_offload, offset,
+ skb_shinfo(skb)->gso_size);
+
+ return 1;
+}
+
+static u32 hinic3_tx_offload(struct sk_buff *skb, struct hinic3_sq_task *task,
+ u32 *queue_info, struct hinic3_txq *txq)
+{
+ u32 offload = 0;
+ int tso_cs_en;
+
+ task->pkt_info0 = 0;
+ task->ip_identify = 0;
+ task->pkt_info2 = 0;
+ task->vlan_offload = 0;
+
+ tso_cs_en = hinic3_tso(task, queue_info, skb);
+ if (tso_cs_en < 0) {
+ offload = TX_OFFLOAD_INVALID;
+ return offload;
+ } else if (tso_cs_en) {
+ offload |= TX_OFFLOAD_TSO;
+ } else {
+ tso_cs_en = hinic3_tx_csum(txq, task, skb);
+ if (tso_cs_en)
+ offload |= TX_OFFLOAD_CSUM;
+ }
+
+#define VLAN_INSERT_MODE_MAX 5
+ if (unlikely(skb_vlan_tag_present(skb))) {
+ /* select vlan insert mode by qid, default 802.1Q Tag type */
+ hinic3_set_vlan_tx_offload(task, skb_vlan_tag_get(skb),
+ txq->q_id % VLAN_INSERT_MODE_MAX);
+ offload |= TX_OFFLOAD_VLAN;
+ }
+
+ if (unlikely(SQ_CTRL_QUEUE_INFO_GET(*queue_info, PLDOFF) >
+ MAX_PAYLOAD_OFFSET)) {
+ offload = TX_OFFLOAD_INVALID;
+ return offload;
+ }
+
+ return offload;
+}
+
+static void get_pkt_stats(struct hinic3_tx_info *tx_info, struct sk_buff *skb)
+{
+ u32 ihs, hdr_len;
+
+ if (skb_is_gso(skb)) {
+#if (KERNEL_VERSION(3, 8, 0) <= LINUX_VERSION_CODE)
+#if (defined(HAVE_SKB_INNER_TRANSPORT_HEADER) && \
+ defined(HAVE_SK_BUFF_ENCAPSULATION))
+ if (skb->encapsulation) {
+#ifdef HAVE_SKB_INNER_TRANSPORT_OFFSET
+ ihs = skb_inner_transport_offset(skb) +
+ inner_tcp_hdrlen(skb);
+#else
+ ihs = (skb_inner_transport_header(skb) - skb->data) +
+ inner_tcp_hdrlen(skb);
+#endif
+ } else {
+#endif
+#endif
+ ihs = skb_transport_offset(skb) + tcp_hdrlen(skb);
+#if (KERNEL_VERSION(3, 8, 0) <= LINUX_VERSION_CODE)
+#if (defined(HAVE_SKB_INNER_TRANSPORT_HEADER) && \
+ defined(HAVE_SK_BUFF_ENCAPSULATION))
+ }
+#endif
+#endif
+ hdr_len = (skb_shinfo(skb)->gso_segs - 1) * ihs;
+ tx_info->num_bytes = skb->len + (u64)hdr_len;
+ } else {
+ tx_info->num_bytes = skb->len > ETH_ZLEN ? skb->len : ETH_ZLEN;
+ }
+
+ tx_info->num_pkts = 1;
+}
+
+static inline int hinic3_maybe_stop_tx(struct hinic3_txq *txq, u16 wqebb_cnt)
+{
+ if (likely(hinic3_get_sq_free_wqebbs(txq->sq) >= wqebb_cnt))
+ return 0;
+
+	/* We need to check again in case another CPU has just
+	 * made room available.
+	 */
+ netif_stop_subqueue(txq->netdev, txq->q_id);
+
+ if (likely(hinic3_get_sq_free_wqebbs(txq->sq) < wqebb_cnt))
+ return -EBUSY;
+
+	/* there are enough wqebbs once the queue has been woken up */
+ netif_start_subqueue(txq->netdev, txq->q_id);
+
+ return 0;
+}
+
+static u16 hinic3_set_wqe_combo(struct hinic3_txq *txq,
+ struct hinic3_sq_wqe_combo *wqe_combo,
+ u32 offload, u16 num_sge, u16 *curr_pi)
+{
+ void *second_part_wqebbs_addr = NULL;
+ void *wqe = NULL;
+ u16 first_part_wqebbs_num, tmp_pi;
+
+ wqe_combo->ctrl_bd0 = hinic3_get_sq_one_wqebb(txq->sq, curr_pi);
+ if (!offload && num_sge == 1) {
+ wqe_combo->wqe_type = SQ_WQE_COMPACT_TYPE;
+ return hinic3_get_and_update_sq_owner(txq->sq, *curr_pi, 1);
+ }
+
+ wqe_combo->wqe_type = SQ_WQE_EXTENDED_TYPE;
+
+ if (offload) {
+ wqe_combo->task = hinic3_get_sq_one_wqebb(txq->sq, &tmp_pi);
+ wqe_combo->task_type = SQ_WQE_TASKSECT_16BYTES;
+ } else {
+ wqe_combo->task_type = SQ_WQE_TASKSECT_46BITS;
+ }
+
+ if (num_sge > 1) {
+		/* the first wqebb contains bd0, and the bd size equals the
+		 * sq wqebb size, so we use (num_sge - 1) as the wanted wqebb_cnt
+		 */
+ wqe = hinic3_get_sq_multi_wqebbs(txq->sq, num_sge - 1, &tmp_pi,
+ &second_part_wqebbs_addr,
+ &first_part_wqebbs_num);
+ wqe_combo->bds_head = wqe;
+ wqe_combo->bds_sec2 = second_part_wqebbs_addr;
+ wqe_combo->first_bds_num = first_part_wqebbs_num;
+ }
+
+ return hinic3_get_and_update_sq_owner(txq->sq, *curr_pi,
+ num_sge + (u16)!!offload);
+}
+
+/* *
+ * hinic3_prepare_sq_ctrl - init the sq wqe control section
+ * @nr_descs: total sge num, including bd0, in the control section
+ */
+static void hinic3_prepare_sq_ctrl(struct hinic3_sq_wqe_combo *wqe_combo,
+ u32 queue_info, int nr_descs, u16 owner)
+{
+ struct hinic3_sq_wqe_desc *wqe_desc = wqe_combo->ctrl_bd0;
+
+ if (wqe_combo->wqe_type == SQ_WQE_COMPACT_TYPE) {
+ wqe_desc->ctrl_len |=
+ SQ_CTRL_SET(SQ_NORMAL_WQE, DATA_FORMAT) |
+ SQ_CTRL_SET(wqe_combo->wqe_type, EXTENDED) |
+ SQ_CTRL_SET(owner, OWNER);
+
+ wqe_desc->ctrl_len = hinic3_hw_be32(wqe_desc->ctrl_len);
+		/* the compact wqe queue_info is transferred to the ucode, so clear it */
+ wqe_desc->queue_info = 0;
+ return;
+ }
+
+ wqe_desc->ctrl_len |= SQ_CTRL_SET(nr_descs, BUFDESC_NUM) |
+ SQ_CTRL_SET(wqe_combo->task_type, TASKSECT_LEN) |
+ SQ_CTRL_SET(SQ_NORMAL_WQE, DATA_FORMAT) |
+ SQ_CTRL_SET(wqe_combo->wqe_type, EXTENDED) |
+ SQ_CTRL_SET(owner, OWNER);
+
+ wqe_desc->ctrl_len = hinic3_hw_be32(wqe_desc->ctrl_len);
+
+ wqe_desc->queue_info = queue_info;
+ wqe_desc->queue_info |= SQ_CTRL_QUEUE_INFO_SET(1U, UC);
+
+ if (!SQ_CTRL_QUEUE_INFO_GET(wqe_desc->queue_info, MSS)) {
+ wqe_desc->queue_info |=
+ SQ_CTRL_QUEUE_INFO_SET(TX_MSS_DEFAULT, MSS);
+ } else if (SQ_CTRL_QUEUE_INFO_GET(wqe_desc->queue_info, MSS) <
+ TX_MSS_MIN) {
+		/* the MSS should not be less than 80 */
+ wqe_desc->queue_info =
+ SQ_CTRL_QUEUE_INFO_CLEAR(wqe_desc->queue_info, MSS);
+ wqe_desc->queue_info |= SQ_CTRL_QUEUE_INFO_SET(TX_MSS_MIN, MSS);
+ }
+
+ wqe_desc->queue_info = hinic3_hw_be32(wqe_desc->queue_info);
+}
+
+static netdev_tx_t hinic3_send_one_skb(struct sk_buff *skb,
+ struct net_device *netdev,
+ struct hinic3_txq *txq)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct hinic3_sq_wqe_combo wqe_combo = {0};
+ struct hinic3_tx_info *tx_info = NULL;
+ struct hinic3_sq_task task;
+ u32 offload, queue_info = 0;
+ u16 owner = 0, pi = 0;
+ u16 wqebb_cnt, num_sge, valid_nr_frags;
+ bool find_zero_sge_len = false;
+ int err, i;
+
+ if (unlikely(skb->len < MIN_SKB_LEN)) {
+ if (skb_pad(skb, (int)(MIN_SKB_LEN - skb->len))) {
+ TXQ_STATS_INC(txq, skb_pad_err);
+ goto tx_skb_pad_err;
+ }
+
+ skb->len = MIN_SKB_LEN;
+ }
+
+ valid_nr_frags = 0;
+ for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+ if (!skb_frag_size(&skb_shinfo(skb)->frags[i])) {
+ find_zero_sge_len = true;
+ continue;
+ } else if (find_zero_sge_len) {
+ TXQ_STATS_INC(txq, frag_size_err);
+ goto tx_drop_pkts;
+ }
+
+ valid_nr_frags++;
+ }
+
+ num_sge = valid_nr_frags + 1;
+
+	/* assume a normal TS-format wqe is needed; the task info takes 1 wqebb */
+ wqebb_cnt = num_sge + 1;
+ if (unlikely(hinic3_maybe_stop_tx(txq, wqebb_cnt))) {
+ TXQ_STATS_INC(txq, busy);
+ return NETDEV_TX_BUSY;
+ }
+
+ offload = hinic3_tx_offload(skb, &task, &queue_info, txq);
+ if (unlikely(offload == TX_OFFLOAD_INVALID)) {
+ TXQ_STATS_INC(txq, offload_cow_skb_err);
+ goto tx_drop_pkts;
+ } else if (!offload) {
+ /* no TS in current wqe */
+ wqebb_cnt -= 1;
+ if (unlikely(num_sge == 1 && skb->len > COMPACET_WQ_SKB_MAX_LEN))
+ goto tx_drop_pkts;
+ }
+
+ owner = hinic3_set_wqe_combo(txq, &wqe_combo, offload, num_sge, &pi);
+ if (offload) {
+		/* ip6_frag_id is big endian; no conversion needed */
+ wqe_combo.task->ip_identify = hinic3_hw_be32(task.ip_identify);
+ wqe_combo.task->pkt_info0 = hinic3_hw_be32(task.pkt_info0);
+ wqe_combo.task->pkt_info2 = hinic3_hw_be32(task.pkt_info2);
+ wqe_combo.task->vlan_offload =
+ hinic3_hw_be32(task.vlan_offload);
+ }
+
+ tx_info = &txq->tx_info[pi];
+ tx_info->skb = skb;
+ tx_info->wqebb_cnt = wqebb_cnt;
+ tx_info->valid_nr_frags = valid_nr_frags;
+
+ err = tx_map_skb(nic_dev, skb, valid_nr_frags, txq, tx_info,
+ &wqe_combo);
+ if (err) {
+ hinic3_rollback_sq_wqebbs(txq->sq, wqebb_cnt, owner);
+ goto tx_drop_pkts;
+ }
+
+ get_pkt_stats(tx_info, skb);
+
+ hinic3_prepare_sq_ctrl(&wqe_combo, queue_info, num_sge, owner);
+
+ hinic3_write_db(txq->sq, txq->cos, SQ_CFLAG_DP,
+ hinic3_get_sq_local_pi(txq->sq));
+
+ return NETDEV_TX_OK;
+
+tx_drop_pkts:
+ dev_kfree_skb_any(skb);
+
+tx_skb_pad_err:
+ TXQ_STATS_INC(txq, dropped);
+
+ return NETDEV_TX_OK;
+}
+
+netdev_tx_t hinic3_lb_xmit_frame(struct sk_buff *skb,
+ struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ u16 q_id = skb_get_queue_mapping(skb);
+ struct hinic3_txq *txq = &nic_dev->txqs[q_id];
+
+ return hinic3_send_one_skb(skb, netdev, txq);
+}
+
+netdev_tx_t hinic3_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct hinic3_txq *txq = NULL;
+ u16 q_id = skb_get_queue_mapping(skb);
+
+ if (unlikely(!netif_carrier_ok(netdev))) {
+ dev_kfree_skb_any(skb);
+ HINIC3_NIC_STATS_INC(nic_dev, tx_carrier_off_drop);
+ return NETDEV_TX_OK;
+ }
+
+ if (unlikely(q_id >= nic_dev->q_params.num_qps)) {
+ txq = &nic_dev->txqs[0];
+ HINIC3_NIC_STATS_INC(nic_dev, tx_invalid_qid);
+ goto tx_drop_pkts;
+ }
+ txq = &nic_dev->txqs[q_id];
+
+ return hinic3_send_one_skb(skb, netdev, txq);
+
+tx_drop_pkts:
+ dev_kfree_skb_any(skb);
+ u64_stats_update_begin(&txq->txq_stats.syncp);
+ txq->txq_stats.dropped++;
+ u64_stats_update_end(&txq->txq_stats.syncp);
+
+ return NETDEV_TX_OK;
+}
+
+static inline void tx_free_skb(struct hinic3_nic_dev *nic_dev,
+ struct hinic3_tx_info *tx_info)
+{
+ tx_unmap_skb(nic_dev, tx_info->skb, tx_info->valid_nr_frags,
+ tx_info->dma_info);
+ dev_kfree_skb_any(tx_info->skb);
+ tx_info->skb = NULL;
+}
+
+static void free_all_tx_skbs(struct hinic3_nic_dev *nic_dev, u32 sq_depth,
+ struct hinic3_tx_info *tx_info_arr)
+{
+ struct hinic3_tx_info *tx_info = NULL;
+ u32 idx;
+
+ for (idx = 0; idx < sq_depth; idx++) {
+ tx_info = &tx_info_arr[idx];
+ if (tx_info->skb)
+ tx_free_skb(nic_dev, tx_info);
+ }
+}
+
+int hinic3_tx_poll(struct hinic3_txq *txq, int budget)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(txq->netdev);
+ struct hinic3_tx_info *tx_info = NULL;
+ u64 tx_bytes = 0, wake = 0, nr_pkts = 0;
+ int pkts = 0;
+ u16 wqebb_cnt = 0;
+ u16 hw_ci, sw_ci = 0, q_id = txq->sq->q_id;
+
+ hw_ci = hinic3_get_sq_hw_ci(txq->sq);
+ dma_rmb();
+ sw_ci = hinic3_get_sq_local_ci(txq->sq);
+
+ do {
+ tx_info = &txq->tx_info[sw_ci];
+
+		/* check whether every wqebb of this wqe has completed */
+ if (hw_ci == sw_ci ||
+ ((hw_ci - sw_ci) & txq->q_mask) < tx_info->wqebb_cnt)
+ break;
+
+ sw_ci = (sw_ci + tx_info->wqebb_cnt) & (u16)txq->q_mask;
+ prefetch(&txq->tx_info[sw_ci]);
+
+ wqebb_cnt += tx_info->wqebb_cnt;
+
+ tx_bytes += tx_info->num_bytes;
+ nr_pkts += tx_info->num_pkts;
+ pkts++;
+
+ tx_free_skb(nic_dev, tx_info);
+ } while (likely(pkts < budget));
+
+ hinic3_update_sq_local_ci(txq->sq, wqebb_cnt);
+
+ if (unlikely(__netif_subqueue_stopped(nic_dev->netdev, q_id) &&
+ hinic3_get_sq_free_wqebbs(txq->sq) >= 1 &&
+ test_bit(HINIC3_INTF_UP, &nic_dev->flags))) {
+ struct netdev_queue *netdev_txq =
+ netdev_get_tx_queue(txq->netdev, q_id);
+
+ __netif_tx_lock(netdev_txq, smp_processor_id());
+		/* avoid re-waking the subqueue while xmit_frame is stopping it */
+ if (__netif_subqueue_stopped(nic_dev->netdev, q_id)) {
+ netif_wake_subqueue(nic_dev->netdev, q_id);
+ wake++;
+ }
+ __netif_tx_unlock(netdev_txq);
+ }
+
+ u64_stats_update_begin(&txq->txq_stats.syncp);
+ txq->txq_stats.bytes += tx_bytes;
+ txq->txq_stats.packets += nr_pkts;
+ txq->txq_stats.wake += wake;
+ u64_stats_update_end(&txq->txq_stats.syncp);
+
+ return pkts;
+}
+
+void hinic3_set_txq_cos(struct hinic3_nic_dev *nic_dev, u16 start_qid,
+ u16 q_num, u8 cos)
+{
+ u16 idx;
+
+ for (idx = 0; idx < q_num; idx++)
+ nic_dev->txqs[idx + start_qid].cos = cos;
+}
+
+#define HINIC3_BDS_PER_SQ_WQEBB \
+ (HINIC3_SQ_WQEBB_SIZE / sizeof(struct hinic3_sq_bufdesc))
+
+int hinic3_alloc_txqs_res(struct hinic3_nic_dev *nic_dev, u16 num_sq,
+ u32 sq_depth, struct hinic3_dyna_txq_res *txqs_res)
+{
+ struct hinic3_dyna_txq_res *tqres = NULL;
+ int idx, i;
+ u64 size;
+
+ for (idx = 0; idx < num_sq; idx++) {
+ tqres = &txqs_res[idx];
+
+ size = sizeof(*tqres->tx_info) * sq_depth;
+ tqres->tx_info = kzalloc(size, GFP_KERNEL);
+ if (!tqres->tx_info) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to alloc txq%d tx info\n", idx);
+ goto err_out;
+ }
+
+ size = sizeof(*tqres->bds) *
+ (sq_depth * HINIC3_BDS_PER_SQ_WQEBB +
+ HINIC3_MAX_SQ_SGE);
+ tqres->bds = kzalloc(size, GFP_KERNEL);
+ if (!tqres->bds) {
+ kfree(tqres->tx_info);
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to alloc txq%d bds info\n", idx);
+ goto err_out;
+ }
+ }
+
+ return 0;
+
+err_out:
+ for (i = 0; i < idx; i++) {
+ tqres = &txqs_res[i];
+
+ kfree(tqres->bds);
+ kfree(tqres->tx_info);
+ }
+
+ return -ENOMEM;
+}
+
+void hinic3_free_txqs_res(struct hinic3_nic_dev *nic_dev, u16 num_sq,
+ u32 sq_depth, struct hinic3_dyna_txq_res *txqs_res)
+{
+ struct hinic3_dyna_txq_res *tqres = NULL;
+ int idx;
+
+ for (idx = 0; idx < num_sq; idx++) {
+ tqres = &txqs_res[idx];
+
+ free_all_tx_skbs(nic_dev, sq_depth, tqres->tx_info);
+ kfree(tqres->bds);
+ kfree(tqres->tx_info);
+ }
+}
+
+int hinic3_configure_txqs(struct hinic3_nic_dev *nic_dev, u16 num_sq,
+ u32 sq_depth, struct hinic3_dyna_txq_res *txqs_res)
+{
+ struct hinic3_dyna_txq_res *tqres = NULL;
+ struct hinic3_txq *txq = NULL;
+ u16 q_id;
+ u32 idx;
+
+ for (q_id = 0; q_id < num_sq; q_id++) {
+ txq = &nic_dev->txqs[q_id];
+ tqres = &txqs_res[q_id];
+
+ txq->q_depth = sq_depth;
+ txq->q_mask = sq_depth - 1;
+
+ txq->tx_info = tqres->tx_info;
+ for (idx = 0; idx < sq_depth; idx++)
+ txq->tx_info[idx].dma_info =
+ &tqres->bds[idx * HINIC3_BDS_PER_SQ_WQEBB];
+
+ txq->sq = hinic3_get_nic_queue(nic_dev->hwdev, q_id, HINIC3_SQ);
+ if (!txq->sq) {
+ nicif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to get %u sq\n", q_id);
+ return -EFAULT;
+ }
+ }
+
+ return 0;
+}
+
+int hinic3_alloc_txqs(struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct pci_dev *pdev = nic_dev->pdev;
+ struct hinic3_txq *txq = NULL;
+ u16 q_id, num_txqs = nic_dev->max_qps;
+ u64 txq_size;
+
+ txq_size = num_txqs * sizeof(*nic_dev->txqs);
+ if (!txq_size) {
+ nic_err(&pdev->dev, "Cannot allocate zero size txqs\n");
+ return -EINVAL;
+ }
+
+ nic_dev->txqs = kzalloc(txq_size, GFP_KERNEL);
+ if (!nic_dev->txqs) {
+ nic_err(&pdev->dev, "Failed to allocate txqs\n");
+ return -ENOMEM;
+ }
+
+ for (q_id = 0; q_id < num_txqs; q_id++) {
+ txq = &nic_dev->txqs[q_id];
+ txq->netdev = netdev;
+ txq->q_id = q_id;
+ txq->q_depth = nic_dev->q_params.sq_depth;
+ txq->q_mask = nic_dev->q_params.sq_depth - 1;
+ txq->dev = &pdev->dev;
+
+ txq_stats_init(txq);
+ }
+
+ return 0;
+}
+
+void hinic3_free_txqs(struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+ kfree(nic_dev->txqs);
+}
+
+static bool is_hw_complete_sq_process(struct hinic3_io_queue *sq)
+{
+ u16 sw_pi, hw_ci;
+
+ sw_pi = hinic3_get_sq_local_pi(sq);
+ hw_ci = hinic3_get_sq_hw_ci(sq);
+
+ return sw_pi == hw_ci;
+}
+
+#define HINIC3_FLUSH_QUEUE_TIMEOUT 1000
+static int hinic3_stop_sq(struct hinic3_txq *txq)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(txq->netdev);
+ unsigned long timeout;
+ int err;
+
+ timeout = msecs_to_jiffies(HINIC3_FLUSH_QUEUE_TIMEOUT) + jiffies;
+ do {
+ if (is_hw_complete_sq_process(txq->sq))
+ return 0;
+
+ usleep_range(900, 1000); /* sleep 900 us ~ 1000 us */
+ } while (time_before(jiffies, timeout));
+
+ /* force hardware to drop packets */
+ timeout = msecs_to_jiffies(HINIC3_FLUSH_QUEUE_TIMEOUT) + jiffies;
+ do {
+ if (is_hw_complete_sq_process(txq->sq))
+ return 0;
+
+ err = hinic3_force_drop_tx_pkt(nic_dev->hwdev);
+ if (err)
+ break;
+
+ usleep_range(9900, 10000); /* sleep 9900 us ~ 10000 us */
+ } while (time_before(jiffies, timeout));
+
+	/* avoid msleep taking too long and yielding a stale result */
+ if (is_hw_complete_sq_process(txq->sq))
+ return 0;
+
+ return -EFAULT;
+}
+
+/* all packet transmission should be stopped before calling this function */
+int hinic3_flush_txqs(struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ u16 qid;
+ int err;
+
+ for (qid = 0; qid < nic_dev->q_params.num_qps; qid++) {
+ err = hinic3_stop_sq(&nic_dev->txqs[qid]);
+ if (err)
+ nicif_err(nic_dev, drv, netdev,
+ "Failed to stop sq%u\n", qid);
+ }
+
+ return 0;
+}
+
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_tx.h b/drivers/net/ethernet/huawei/hinic3/hinic3_tx.h
new file mode 100644
index 000000000000..290ef297c45c
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_tx.h
@@ -0,0 +1,157 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_TX_H
+#define HINIC3_TX_H
+
+#include <net/ipv6.h>
+#include <net/checksum.h>
+#include <net/ip6_checksum.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+
+#include "hinic3_nic_qp.h"
+#include "hinic3_nic_io.h"
+
+#define VXLAN_OFFLOAD_PORT_LE 46354 /* byte-swapped form of VXLAN port 4789 */
+
+#define COMPACET_WQ_SKB_MAX_LEN 16383
+
+#define IP4_VERSION 4
+#define IP6_VERSION 6
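+/* the IPv4 ihl field and the TCP data-offset field both count 4-byte
+ * words, hence the shift by 2 to convert them to bytes
+ */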
+#define IP_HDR_IHL_UNIT_SHIFT 2
+#define TCP_HDR_DATA_OFF_UNIT_SHIFT 2
+
+enum tx_offload_type {
+ TX_OFFLOAD_TSO = BIT(0),
+ TX_OFFLOAD_CSUM = BIT(1),
+ TX_OFFLOAD_VLAN = BIT(2),
+ TX_OFFLOAD_INVALID = BIT(3),
+ TX_OFFLOAD_ESP = BIT(4),
+};
+
+struct hinic3_txq_stats {
+ u64 packets;
+ u64 bytes;
+ u64 busy;
+ u64 wake;
+ u64 dropped;
+
+	/* subdivision statistics shown by the private tool */
+ u64 skb_pad_err;
+ u64 frag_len_overflow;
+ u64 offload_cow_skb_err;
+ u64 map_frag_err;
+ u64 unknown_tunnel_pkt;
+ u64 frag_size_err;
+ u64 rsvd1;
+ u64 rsvd2;
+
+#ifdef HAVE_NDO_GET_STATS64
+ struct u64_stats_sync syncp;
+#else
+ struct u64_stats_sync_empty syncp;
+#endif
+};
+
+struct hinic3_dma_info {
+ dma_addr_t dma;
+ u32 len;
+};
+
+#define IPV4_VERSION 4
+#define IPV6_VERSION 6
+#define TCP_HDR_DOFF_UNIT 2
+#define TRANSPORT_OFFSET(l4_hdr, skb) ((u32)((l4_hdr) - (skb)->data))
+
+union hinic3_ip {
+ struct iphdr *v4;
+ struct ipv6hdr *v6;
+ unsigned char *hdr;
+};
+
+struct hinic3_tx_info {
+ struct sk_buff *skb;
+
+ u16 wqebb_cnt;
+ u16 valid_nr_frags;
+
+ int num_sge;
+ u16 num_pkts;
+ u16 rsvd1;
+ u32 rsvd2;
+ u64 num_bytes;
+ struct hinic3_dma_info *dma_info;
+ u64 rsvd3;
+};
+
+struct hinic3_txq {
+ struct net_device *netdev;
+ struct device *dev;
+
+ struct hinic3_txq_stats txq_stats;
+
+ u8 cos;
+ u8 rsvd1;
+ u16 q_id;
+ u32 q_mask;
+ u32 q_depth;
+ u32 rsvd2;
+
+ struct hinic3_tx_info *tx_info;
+ struct hinic3_io_queue *sq;
+
+ u64 last_moder_packets;
+ u64 last_moder_bytes;
+ u64 rsvd3;
+} ____cacheline_aligned;
+
+netdev_tx_t hinic3_lb_xmit_frame(struct sk_buff *skb,
+ struct net_device *netdev);
+
+struct hinic3_dyna_txq_res {
+ struct hinic3_tx_info *tx_info;
+ struct hinic3_dma_info *bds;
+};
+
+netdev_tx_t hinic3_xmit_frame(struct sk_buff *skb, struct net_device *netdev);
+
+void hinic3_txq_get_stats(struct hinic3_txq *txq,
+ struct hinic3_txq_stats *stats);
+
+void hinic3_txq_clean_stats(struct hinic3_txq_stats *txq_stats);
+
+struct hinic3_nic_dev;
+int hinic3_alloc_txqs_res(struct hinic3_nic_dev *nic_dev, u16 num_sq,
+ u32 sq_depth, struct hinic3_dyna_txq_res *txqs_res);
+
+void hinic3_free_txqs_res(struct hinic3_nic_dev *nic_dev, u16 num_sq,
+ u32 sq_depth, struct hinic3_dyna_txq_res *txqs_res);
+
+int hinic3_configure_txqs(struct hinic3_nic_dev *nic_dev, u16 num_sq,
+ u32 sq_depth, struct hinic3_dyna_txq_res *txqs_res);
+
+int hinic3_alloc_txqs(struct net_device *netdev);
+
+void hinic3_free_txqs(struct net_device *netdev);
+
+int hinic3_tx_poll(struct hinic3_txq *txq, int budget);
+
+int hinic3_flush_txqs(struct net_device *netdev);
+
+void hinic3_set_txq_cos(struct hinic3_nic_dev *nic_dev, u16 start_qid,
+ u16 q_num, u8 cos);
+
+#ifdef static
+#undef static
+#define LLT_STATIC_DEF_SAVED
+#endif
+
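+/* pseudo-header checksum helper: callers seed the L4 checksum field with
+ * ~csum_magic() so the hardware only has to fold in the payload sum
+ */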
+static inline __sum16 csum_magic(union hinic3_ip *ip, unsigned short proto)
+{
+ return (ip->v4->version == IPV4_VERSION) ?
+ csum_tcpudp_magic(ip->v4->saddr, ip->v4->daddr, 0, proto, 0) :
+ csum_ipv6_magic(&ip->v6->saddr, &ip->v6->daddr, 0, proto, 0);
+}
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_wq.h b/drivers/net/ethernet/huawei/hinic3/hinic3_wq.h
new file mode 100644
index 000000000000..1b9e509109b8
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_wq.h
@@ -0,0 +1,130 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_WQ_H
+#define HINIC3_WQ_H
+
+struct hinic3_wq {
+ u16 cons_idx;
+ u16 prod_idx;
+
+ u32 q_depth;
+ u16 idx_mask;
+ u16 wqebb_size_shift;
+ u16 rsvd1;
+ u16 num_wq_pages;
+ u32 wqebbs_per_page;
+ u16 wqebbs_per_page_shift;
+ u16 wqebbs_per_page_mask;
+
+ struct hinic3_dma_addr_align *wq_pages;
+
+ dma_addr_t wq_block_paddr;
+ u64 *wq_block_vaddr;
+
+ void *dev_hdl;
+ u32 wq_page_size;
+ u16 wqebb_size;
+} ____cacheline_aligned;
+
+#define WQ_MASK_IDX(wq, idx) ((idx) & (wq)->idx_mask)
+#define WQ_MASK_PAGE(wq, pg_idx) \
+ ((pg_idx) < (wq)->num_wq_pages ? (pg_idx) : 0)
+#define WQ_PAGE_IDX(wq, idx) ((idx) >> (wq)->wqebbs_per_page_shift)
+#define WQ_OFFSET_IN_PAGE(wq, idx) ((idx) & (wq)->wqebbs_per_page_mask)
+#define WQ_GET_WQEBB_ADDR(wq, pg_idx, idx_in_pg) \
+ ((u8 *)(wq)->wq_pages[pg_idx].align_vaddr + \
+ ((idx_in_pg) << (wq)->wqebb_size_shift))
+#define WQ_IS_0_LEVEL_CLA(wq) ((wq)->num_wq_pages == 1)
+
+#ifdef static
+#undef static
+#define LLT_STATIC_DEF_SAVED
+#endif
+
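+/* one slot is kept unused so a full ring is distinguishable from an empty
+ * one; e.g. with q_depth 256, prod_idx 10, cons_idx 5: 5 wqebbs are in
+ * flight and (256 - 5 - 1) = 250 are free
+ */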
+static inline u16 hinic3_wq_free_wqebbs(struct hinic3_wq *wq)
+{
+ return wq->q_depth - ((wq->q_depth + wq->prod_idx - wq->cons_idx) &
+ wq->idx_mask) - 1;
+}
+
+static inline bool hinic3_wq_is_empty(struct hinic3_wq *wq)
+{
+ return WQ_MASK_IDX(wq, wq->prod_idx) == WQ_MASK_IDX(wq, wq->cons_idx);
+}
+
+static inline void *hinic3_wq_get_one_wqebb(struct hinic3_wq *wq, u16 *pi)
+{
+ *pi = WQ_MASK_IDX(wq, wq->prod_idx);
+ wq->prod_idx++;
+
+ return WQ_GET_WQEBB_ADDR(wq, WQ_PAGE_IDX(wq, *pi),
+ WQ_OFFSET_IN_PAGE(wq, *pi));
+}
+
+static inline void *hinic3_wq_get_multi_wqebbs(struct hinic3_wq *wq,
+ u16 num_wqebbs, u16 *prod_idx,
+ void **second_part_wqebbs_addr,
+ u16 *first_part_wqebbs_num)
+{
+ u32 pg_idx, off_in_page;
+
+ *prod_idx = WQ_MASK_IDX(wq, wq->prod_idx);
+ wq->prod_idx += num_wqebbs;
+
+ pg_idx = WQ_PAGE_IDX(wq, *prod_idx);
+ off_in_page = WQ_OFFSET_IN_PAGE(wq, *prod_idx);
+
+ if (off_in_page + num_wqebbs > wq->wqebbs_per_page) {
+ /* wqe across wq page boundary */
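+		/* e.g. with 64 wqebbs per page, off_in_page 62 and
+		 * num_wqebbs 4: 2 wqebbs stay on this page and the other
+		 * 2 start at offset 0 of the next (possibly wrapped) page
+		 */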
+ *second_part_wqebbs_addr =
+ WQ_GET_WQEBB_ADDR(wq, WQ_MASK_PAGE(wq, pg_idx + 1), 0);
+ *first_part_wqebbs_num = wq->wqebbs_per_page - off_in_page;
+ } else {
+ *second_part_wqebbs_addr = NULL;
+ *first_part_wqebbs_num = num_wqebbs;
+ }
+
+ return WQ_GET_WQEBB_ADDR(wq, pg_idx, off_in_page);
+}
+
+static inline void hinic3_wq_put_wqebbs(struct hinic3_wq *wq, u16 num_wqebbs)
+{
+ wq->cons_idx += num_wqebbs;
+}
+
+static inline void *hinic3_wq_wqebb_addr(struct hinic3_wq *wq, u16 idx)
+{
+ return WQ_GET_WQEBB_ADDR(wq, WQ_PAGE_IDX(wq, idx),
+ WQ_OFFSET_IN_PAGE(wq, idx));
+}
+
+static inline void *hinic3_wq_read_one_wqebb(struct hinic3_wq *wq,
+ u16 *cons_idx)
+{
+ *cons_idx = WQ_MASK_IDX(wq, wq->cons_idx);
+
+ return hinic3_wq_wqebb_addr(wq, *cons_idx);
+}
+
+static inline u64 hinic3_wq_get_first_wqe_page_addr(struct hinic3_wq *wq)
+{
+ return wq->wq_pages[0].align_paddr;
+}
+
+static inline void hinic3_wq_reset(struct hinic3_wq *wq)
+{
+ u16 pg_idx;
+
+ wq->cons_idx = 0;
+ wq->prod_idx = 0;
+
+ for (pg_idx = 0; pg_idx < wq->num_wq_pages; pg_idx++)
+ memset(wq->wq_pages[pg_idx].align_vaddr, 0, wq->wq_page_size);
+}
+
+int hinic3_wq_create(void *hwdev, struct hinic3_wq *wq, u32 q_depth,
+ u16 wqebb_size);
+void hinic3_wq_destroy(struct hinic3_wq *wq);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_api_cmd.c b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_api_cmd.c
new file mode 100644
index 000000000000..b742f8a8d9fe
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_api_cmd.c
@@ -0,0 +1,1211 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [COMM]" fmt
+
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/completion.h>
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/dma-mapping.h>
+#include <linux/semaphore.h>
+#include <linux/jiffies.h>
+#include <linux/delay.h>
+
+#include "ossl_knl.h"
+#include "hinic3_crm.h"
+#include "hinic3_hw.h"
+#include "hinic3_common.h"
+#include "hinic3_hwdev.h"
+#include "hinic3_csr.h"
+#include "hinic3_hwif.h"
+#include "hinic3_api_cmd.h"
+
+#define API_CMD_CHAIN_CELL_SIZE_SHIFT 6U
+
+#define API_CMD_CELL_DESC_SIZE 8
+#define API_CMD_CELL_DATA_ADDR_SIZE 8
+
+#define API_CHAIN_NUM_CELLS 32
+#define API_CHAIN_CELL_SIZE 128
+#define API_CHAIN_RSP_DATA_SIZE 128
+
+#define API_CMD_CELL_WB_ADDR_SIZE 8
+
+#define API_CHAIN_CELL_ALIGNMENT 8
+
+#define API_CMD_TIMEOUT 10000
+#define API_CMD_STATUS_TIMEOUT 10000
+
+#define API_CMD_BUF_SIZE 2048ULL
+
+#define API_CMD_NODE_ALIGN_SIZE 512ULL
+#define API_PAYLOAD_ALIGN_SIZE 64ULL
+
+#define API_CHAIN_RESP_ALIGNMENT 128ULL
+
+#define COMPLETION_TIMEOUT_DEFAULT 1000UL
+#define POLLING_COMPLETION_TIMEOUT_DEFAULT 1000U
+
+#define API_CMD_RESPONSE_DATA_PADDR(val) be64_to_cpu(*((u64 *)(val)))
+
+#define READ_API_CMD_PRIV_DATA(id, token) ((((u32)(id)) << 16) + (token))
+#define WRITE_API_CMD_PRIV_DATA(id) (((u8)(id)) << 16)
+
+#define MASKED_IDX(chain, idx) ((idx) & ((chain)->num_cells - 1))
+
+#define SIZE_4BYTES(size) (ALIGN((u32)(size), 4U) >> 2)
+#define SIZE_8BYTES(size) (ALIGN((u32)(size), 8U) >> 3)
+
+enum api_cmd_data_format {
+ SGL_DATA = 1,
+};
+
+enum api_cmd_type {
+ API_CMD_WRITE_TYPE = 0,
+ API_CMD_READ_TYPE = 1,
+};
+
+enum api_cmd_bypass {
+ NOT_BYPASS = 0,
+ BYPASS = 1,
+};
+
+enum api_cmd_resp_aeq {
+ NOT_TRIGGER = 0,
+ TRIGGER = 1,
+};
+
+enum api_cmd_chn_code {
+ APICHN_0 = 0,
+};
+
+enum api_cmd_chn_rsvd {
+ APICHN_VALID = 0,
+ APICHN_INVALID = 1,
+};
+
+#define API_DESC_LEN (7)
+
+static u8 xor_chksum_set(void *data)
+{
+ int idx;
+ u8 checksum = 0;
+ u8 *val = data;
+
+ for (idx = 0; idx < API_DESC_LEN; idx++)
+ checksum ^= val[idx];
+
+ return checksum;
+}
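+/* The checksum covers only the low API_DESC_LEN (7) bytes of the 64-bit
+ * word; the top byte is excluded because it is where the XOR_CHKSUM
+ * field itself is stored (bit 56 and up in the ctrl/desc layout).
+ */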
+
+static void set_prod_idx(struct hinic3_api_cmd_chain *chain)
+{
+ enum hinic3_api_cmd_chain_type chain_type = chain->chain_type;
+ struct hinic3_hwif *hwif = chain->hwdev->hwif;
+ u32 hw_prod_idx_addr = HINIC3_CSR_API_CMD_CHAIN_PI_ADDR(chain_type);
+ u32 prod_idx = chain->prod_idx;
+
+ hinic3_hwif_write_reg(hwif, hw_prod_idx_addr, prod_idx);
+}
+
+static u32 get_hw_cons_idx(struct hinic3_api_cmd_chain *chain)
+{
+ u32 addr, val;
+
+ addr = HINIC3_CSR_API_CMD_STATUS_0_ADDR(chain->chain_type);
+ val = hinic3_hwif_read_reg(chain->hwdev->hwif, addr);
+
+ return HINIC3_API_CMD_STATUS_GET(val, CONS_IDX);
+}
+
+static void dump_api_chain_reg(struct hinic3_api_cmd_chain *chain)
+{
+ void *dev = chain->hwdev->dev_hdl;
+ u32 addr, val;
+ u16 pci_cmd = 0;
+
+ addr = HINIC3_CSR_API_CMD_STATUS_0_ADDR(chain->chain_type);
+ val = hinic3_hwif_read_reg(chain->hwdev->hwif, addr);
+
+ sdk_err(dev, "Chain type: 0x%x, cpld error: 0x%x, check error: 0x%x, current fsm: 0x%x\n",
+ chain->chain_type, HINIC3_API_CMD_STATUS_GET(val, CPLD_ERR),
+ HINIC3_API_CMD_STATUS_GET(val, CHKSUM_ERR),
+ HINIC3_API_CMD_STATUS_GET(val, FSM));
+
+ sdk_err(dev, "Chain hw current ci: 0x%x\n",
+ HINIC3_API_CMD_STATUS_GET(val, CONS_IDX));
+
+ addr = HINIC3_CSR_API_CMD_CHAIN_PI_ADDR(chain->chain_type);
+ val = hinic3_hwif_read_reg(chain->hwdev->hwif, addr);
+ sdk_err(dev, "Chain hw current pi: 0x%x\n", val);
+ pci_read_config_word(chain->hwdev->pcidev_hdl, PCI_COMMAND, &pci_cmd);
+ sdk_err(dev, "PCI command reg: 0x%x\n", pci_cmd);
+}
+
+/**
+ * chain_busy - check if the chain is still processing last requests
+ * @chain: chain to check
+ * Return: 0 - chain is idle, negative - chain is busy or invalid
+ **/
+static int chain_busy(struct hinic3_api_cmd_chain *chain)
+{
+ void *dev = chain->hwdev->dev_hdl;
+ struct hinic3_api_cmd_cell_ctxt *ctxt;
+ u64 resp_header;
+
+ ctxt = &chain->cell_ctxt[chain->prod_idx];
+
+ switch (chain->chain_type) {
+ case HINIC3_API_CMD_MULTI_READ:
+ case HINIC3_API_CMD_POLL_READ:
+ resp_header = be64_to_cpu(ctxt->resp->header);
+ if (ctxt->status &&
+ !HINIC3_API_CMD_RESP_HEADER_VALID(resp_header)) {
+ sdk_err(dev, "Context(0x%x) busy!, pi: %u, resp_header: 0x%08x%08x\n",
+ ctxt->status, chain->prod_idx,
+ upper_32_bits(resp_header),
+ lower_32_bits(resp_header));
+ dump_api_chain_reg(chain);
+ return -EBUSY;
+ }
+ break;
+ case HINIC3_API_CMD_POLL_WRITE:
+ case HINIC3_API_CMD_WRITE_TO_MGMT_CPU:
+ case HINIC3_API_CMD_WRITE_ASYNC_TO_MGMT_CPU:
+ chain->cons_idx = get_hw_cons_idx(chain);
+
+ if (chain->cons_idx == MASKED_IDX(chain, chain->prod_idx + 1)) {
+ sdk_err(dev, "API CMD chain %d is busy, cons_idx = %u, prod_idx = %u\n",
+ chain->chain_type, chain->cons_idx,
+ chain->prod_idx);
+ dump_api_chain_reg(chain);
+ return -EBUSY;
+ }
+ break;
+ default:
+ sdk_err(dev, "Unknown Chain type %d\n", chain->chain_type);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+/**
+ * get_cell_data_size - get the data size of a specific cell type
+ * @type: chain type
+ **/
+static u16 get_cell_data_size(enum hinic3_api_cmd_chain_type type)
+{
+ u16 cell_data_size = 0;
+
+ switch (type) {
+ case HINIC3_API_CMD_POLL_READ:
+ cell_data_size = ALIGN(API_CMD_CELL_DESC_SIZE +
+ API_CMD_CELL_WB_ADDR_SIZE +
+ API_CMD_CELL_DATA_ADDR_SIZE,
+ API_CHAIN_CELL_ALIGNMENT);
+ break;
+
+ case HINIC3_API_CMD_WRITE_TO_MGMT_CPU:
+ case HINIC3_API_CMD_POLL_WRITE:
+ case HINIC3_API_CMD_WRITE_ASYNC_TO_MGMT_CPU:
+ cell_data_size = ALIGN(API_CMD_CELL_DESC_SIZE +
+ API_CMD_CELL_DATA_ADDR_SIZE,
+ API_CHAIN_CELL_ALIGNMENT);
+ break;
+ default:
+ break;
+ }
+
+ return cell_data_size;
+}
+
+/**
+ * prepare_cell_ctrl - prepare the ctrl of the cell for the command
+ * @cell_ctrl: pointer to the cell control field to fill in
+ * @cell_len: the size of the cell
+ **/
+static void prepare_cell_ctrl(u64 *cell_ctrl, u16 cell_len)
+{
+ u64 ctrl;
+ u8 chksum;
+
+ ctrl = HINIC3_API_CMD_CELL_CTRL_SET(SIZE_8BYTES(cell_len), CELL_LEN) |
+ HINIC3_API_CMD_CELL_CTRL_SET(0ULL, RD_DMA_ATTR_OFF) |
+ HINIC3_API_CMD_CELL_CTRL_SET(0ULL, WR_DMA_ATTR_OFF);
+
+ chksum = xor_chksum_set(&ctrl);
+
+ ctrl |= HINIC3_API_CMD_CELL_CTRL_SET(chksum, XOR_CHKSUM);
+
+ /* The data in the HW should be in Big Endian Format */
+ *cell_ctrl = cpu_to_be64(ctrl);
+}
+
+/**
+ * prepare_api_cmd - prepare API CMD command
+ * @chain: chain for the command
+ * @cell: the cell of the command
+ * @node_id: destination node on the card that will receive the command
+ * @cmd: command data
+ * @cmd_size: the command size
+ **/
+static void prepare_api_cmd(struct hinic3_api_cmd_chain *chain,
+ struct hinic3_api_cmd_cell *cell, u8 node_id,
+ const void *cmd, u16 cmd_size)
+{
+ struct hinic3_api_cmd_cell_ctxt *cell_ctxt;
+ u32 priv;
+
+ cell_ctxt = &chain->cell_ctxt[chain->prod_idx];
+
+ switch (chain->chain_type) {
+ case HINIC3_API_CMD_POLL_READ:
+ priv = READ_API_CMD_PRIV_DATA(chain->chain_type,
+ cell_ctxt->saved_prod_idx);
+ cell->desc = HINIC3_API_CMD_DESC_SET(SGL_DATA, API_TYPE) |
+ HINIC3_API_CMD_DESC_SET(API_CMD_READ_TYPE, RD_WR) |
+ HINIC3_API_CMD_DESC_SET(BYPASS, MGMT_BYPASS) |
+ HINIC3_API_CMD_DESC_SET(NOT_TRIGGER,
+ RESP_AEQE_EN) |
+ HINIC3_API_CMD_DESC_SET(priv, PRIV_DATA);
+ break;
+ case HINIC3_API_CMD_POLL_WRITE:
+ priv = WRITE_API_CMD_PRIV_DATA(chain->chain_type);
+ cell->desc = HINIC3_API_CMD_DESC_SET(SGL_DATA, API_TYPE) |
+ HINIC3_API_CMD_DESC_SET(API_CMD_WRITE_TYPE,
+ RD_WR) |
+ HINIC3_API_CMD_DESC_SET(BYPASS, MGMT_BYPASS) |
+ HINIC3_API_CMD_DESC_SET(NOT_TRIGGER,
+ RESP_AEQE_EN) |
+ HINIC3_API_CMD_DESC_SET(priv, PRIV_DATA);
+ break;
+ case HINIC3_API_CMD_WRITE_ASYNC_TO_MGMT_CPU:
+ case HINIC3_API_CMD_WRITE_TO_MGMT_CPU:
+ priv = WRITE_API_CMD_PRIV_DATA(chain->chain_type);
+ cell->desc = HINIC3_API_CMD_DESC_SET(SGL_DATA, API_TYPE) |
+ HINIC3_API_CMD_DESC_SET(API_CMD_WRITE_TYPE,
+ RD_WR) |
+ HINIC3_API_CMD_DESC_SET(NOT_BYPASS, MGMT_BYPASS) |
+ HINIC3_API_CMD_DESC_SET(TRIGGER, RESP_AEQE_EN) |
+ HINIC3_API_CMD_DESC_SET(priv, PRIV_DATA);
+ break;
+ default:
+ sdk_err(chain->hwdev->dev_hdl, "Unknown Chain type: %d\n",
+ chain->chain_type);
+ return;
+ }
+
+ cell->desc |= HINIC3_API_CMD_DESC_SET(APICHN_0, APICHN_CODE) |
+ HINIC3_API_CMD_DESC_SET(APICHN_VALID, APICHN_RSVD);
+
+ cell->desc |= HINIC3_API_CMD_DESC_SET(node_id, DEST) |
+ HINIC3_API_CMD_DESC_SET(SIZE_4BYTES(cmd_size), SIZE);
+
+ cell->desc |= HINIC3_API_CMD_DESC_SET(xor_chksum_set(&cell->desc),
+ XOR_CHKSUM);
+
+ /* The data in the HW should be in Big Endian Format */
+ cell->desc = cpu_to_be64(cell->desc);
+
+ memcpy(cell_ctxt->api_cmd_vaddr, cmd, cmd_size);
+}
+
+/**
+ * prepare_cell - prepare cell ctrl and cmd in the current producer cell
+ * @chain: chain for the command
+ * @node_id: destination node on the card that will receive the command
+ * @cmd: command data
+ * @cmd_size: the command size
+ **/
+static void prepare_cell(struct hinic3_api_cmd_chain *chain, u8 node_id,
+ const void *cmd, u16 cmd_size)
+{
+ struct hinic3_api_cmd_cell *curr_node;
+ u16 cell_size;
+
+ curr_node = chain->curr_node;
+
+ cell_size = get_cell_data_size(chain->chain_type);
+
+ prepare_cell_ctrl(&curr_node->ctrl, cell_size);
+ prepare_api_cmd(chain, curr_node, node_id, cmd, cmd_size);
+}
+
+static inline void cmd_chain_prod_idx_inc(struct hinic3_api_cmd_chain *chain)
+{
+ chain->prod_idx = MASKED_IDX(chain, chain->prod_idx + 1);
+}
+
+static void issue_api_cmd(struct hinic3_api_cmd_chain *chain)
+{
+ set_prod_idx(chain);
+}
+
+/**
+ * api_cmd_status_update - update the status of the chain
+ * @chain: chain to update
+ **/
+static void api_cmd_status_update(struct hinic3_api_cmd_chain *chain)
+{
+ struct hinic3_api_cmd_status *wb_status;
+ enum hinic3_api_cmd_chain_type chain_type;
+ u64 status_header;
+ u32 buf_desc;
+
+ wb_status = chain->wb_status;
+
+ buf_desc = be32_to_cpu(wb_status->buf_desc);
+ if (HINIC3_API_CMD_STATUS_GET(buf_desc, CHKSUM_ERR))
+ return;
+
+ status_header = be64_to_cpu(wb_status->header);
+ chain_type = HINIC3_API_CMD_STATUS_HEADER_GET(status_header, CHAIN_ID);
+ if (chain_type >= HINIC3_API_CMD_MAX)
+ return;
+
+ if (chain_type != chain->chain_type)
+ return;
+
+ chain->cons_idx = HINIC3_API_CMD_STATUS_GET(buf_desc, CONS_IDX);
+}
+
+static enum hinic3_wait_return wait_for_status_poll_handler(void *priv_data)
+{
+ struct hinic3_api_cmd_chain *chain = priv_data;
+
+ if (!chain->hwdev->chip_present_flag)
+ return WAIT_PROCESS_ERR;
+
+ api_cmd_status_update(chain);
+	/* a sync API CMD should start only after the previous one finishes */
+ if (chain->cons_idx == chain->prod_idx)
+ return WAIT_PROCESS_CPL;
+
+ return WAIT_PROCESS_WAITING;
+}
+
+/**
+ * wait_for_status_poll - wait for write to mgmt command to complete
+ * @chain: the chain of the command
+ * Return: 0 - success, negative - failure
+ **/
+static int wait_for_status_poll(struct hinic3_api_cmd_chain *chain)
+{
+ return hinic3_wait_for_timeout(chain,
+ wait_for_status_poll_handler,
+ API_CMD_STATUS_TIMEOUT, 100); /* wait 100 us once */
+}
+
+static void copy_resp_data(struct hinic3_api_cmd_cell_ctxt *ctxt, void *ack,
+ u16 ack_size)
+{
+ struct hinic3_api_cmd_resp_fmt *resp = ctxt->resp;
+
+ memcpy(ack, &resp->resp_data, ack_size);
+ ctxt->status = 0;
+}
+
+static enum hinic3_wait_return check_cmd_resp_handler(void *priv_data)
+{
+ struct hinic3_api_cmd_cell_ctxt *ctxt = priv_data;
+ u64 resp_header;
+ u8 resp_status;
+
+ if (!ctxt->hwdev->chip_present_flag)
+ return WAIT_PROCESS_ERR;
+
+ resp_header = be64_to_cpu(ctxt->resp->header);
+ rmb(); /* read the latest header */
+
+ if (HINIC3_API_CMD_RESP_HEADER_VALID(resp_header)) {
+ resp_status = HINIC3_API_CMD_RESP_HEAD_GET(resp_header, STATUS);
+ if (resp_status) {
+ pr_err("Api chain response data err, status: %u\n",
+ resp_status);
+ return WAIT_PROCESS_ERR;
+ }
+
+ return WAIT_PROCESS_CPL;
+ }
+
+ return WAIT_PROCESS_WAITING;
+}
+
+/**
+ * wait_for_resp_polling - poll for the response data of a read api-command
+ * @ctxt: the cell context of the command
+ *
+ * Return: 0 - success, negative - failure
+ **/
+static int wait_for_resp_polling(struct hinic3_api_cmd_cell_ctxt *ctxt)
+{
+ return hinic3_wait_for_timeout(ctxt, check_cmd_resp_handler,
+ POLLING_COMPLETION_TIMEOUT_DEFAULT,
+ USEC_PER_MSEC);
+}
+
+/**
+ * wait_for_api_cmd_completion - wait for command to complete
+ * @chain: chain for the command
+ * @ctxt: the cell context of the command
+ * @ack: buffer for the response data
+ * @ack_size: size of the response buffer
+ * Return: 0 - success, negative - failure
+ **/
+static int wait_for_api_cmd_completion(struct hinic3_api_cmd_chain *chain,
+ struct hinic3_api_cmd_cell_ctxt *ctxt,
+ void *ack, u16 ack_size)
+{
+ void *dev = chain->hwdev->dev_hdl;
+ int err = 0;
+
+ switch (chain->chain_type) {
+ case HINIC3_API_CMD_POLL_READ:
+ err = wait_for_resp_polling(ctxt);
+ if (err == 0)
+ copy_resp_data(ctxt, ack, ack_size);
+ else
+ sdk_err(dev, "API CMD poll response timeout\n");
+ break;
+ case HINIC3_API_CMD_POLL_WRITE:
+ case HINIC3_API_CMD_WRITE_TO_MGMT_CPU:
+ err = wait_for_status_poll(chain);
+ if (err != 0) {
+ sdk_err(dev, "API CMD Poll status timeout, chain type: %d\n",
+ chain->chain_type);
+ break;
+ }
+ break;
+ case HINIC3_API_CMD_WRITE_ASYNC_TO_MGMT_CPU:
+ /* No need to wait */
+ break;
+ default:
+ sdk_err(dev, "Unknown API CMD Chain type: %d\n",
+ chain->chain_type);
+ err = -EINVAL;
+ break;
+ }
+
+ if (err != 0)
+ dump_api_chain_reg(chain);
+
+ return err;
+}
+
+static inline void update_api_cmd_ctxt(struct hinic3_api_cmd_chain *chain,
+ struct hinic3_api_cmd_cell_ctxt *ctxt)
+{
+ ctxt->status = 1;
+ ctxt->saved_prod_idx = chain->prod_idx;
+ if (ctxt->resp) {
+ ctxt->resp->header = 0;
+
+ /* make sure "header" was cleared */
+ wmb();
+ }
+}
+
+/**
+ * api_cmd - API CMD command
+ * @chain: chain for the command
+ * @node_id: destination node on the card that will receive the command
+ * @cmd: command data
+ * @cmd_size: the command size
+ * @ack: buffer for the response data
+ * @ack_size: size of the response buffer
+ * Return: 0 - success, negative - failure
+ **/
+static int api_cmd(struct hinic3_api_cmd_chain *chain, u8 node_id,
+ const void *cmd, u16 cmd_size, void *ack, u16 ack_size)
+{
+ struct hinic3_api_cmd_cell_ctxt *ctxt = NULL;
+
+ if (chain->chain_type == HINIC3_API_CMD_WRITE_ASYNC_TO_MGMT_CPU)
+ spin_lock(&chain->async_lock);
+ else
+ down(&chain->sem);
+ ctxt = &chain->cell_ctxt[chain->prod_idx];
+ if (chain_busy(chain)) {
+ if (chain->chain_type == HINIC3_API_CMD_WRITE_ASYNC_TO_MGMT_CPU)
+ spin_unlock(&chain->async_lock);
+ else
+ up(&chain->sem);
+ return -EBUSY;
+ }
+ update_api_cmd_ctxt(chain, ctxt);
+
+ prepare_cell(chain, node_id, cmd, cmd_size);
+
+ cmd_chain_prod_idx_inc(chain);
+
+ wmb(); /* issue the command */
+
+ issue_api_cmd(chain);
+
+	/* prod_idx has been advanced; move curr_node to the next cell */
+ chain->curr_node = chain->cell_ctxt[chain->prod_idx].cell_vaddr;
+ if (chain->chain_type == HINIC3_API_CMD_WRITE_ASYNC_TO_MGMT_CPU)
+ spin_unlock(&chain->async_lock);
+ else
+ up(&chain->sem);
+
+ return wait_for_api_cmd_completion(chain, ctxt, ack, ack_size);
+}
+
+/**
+ * hinic3_api_cmd_write - Write API CMD command
+ * @chain: chain for write command
+ * @node_id: destination node on the card that will receive the command
+ * @cmd: command data
+ * @size: the command size
+ * Return: 0 - success, negative - failure
+ **/
+int hinic3_api_cmd_write(struct hinic3_api_cmd_chain *chain, u8 node_id,
+ const void *cmd, u16 size)
+{
+ return api_cmd(chain, node_id, cmd, size, NULL, 0);
+}
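+/* Usage sketch (illustrative; the node id and command struct are
+ * hypothetical placeholders):
+ *
+ *	struct my_set_cmd cmd = { .param = val };
+ *	int err;
+ *
+ *	err = hinic3_api_cmd_write(chain, MY_MGMT_NODE_ID, &cmd,
+ *				   sizeof(cmd));
+ *	if (err)
+ *		return err;
+ */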
+
+/**
+ * hinic3_api_cmd_read - Read API CMD command
+ * @chain: chain for read command
+ * @node_id: destination node on the card that will receive the command
+ * @cmd: command data
+ * @size: the command size
+ * @ack: buffer for the response data
+ * @ack_size: size of the response buffer
+ * Return: 0 - success, negative - failure
+ **/
+int hinic3_api_cmd_read(struct hinic3_api_cmd_chain *chain, u8 node_id,
+ const void *cmd, u16 size, void *ack, u16 ack_size)
+{
+ return api_cmd(chain, node_id, cmd, size, ack, ack_size);
+}
+
+static enum hinic3_wait_return check_chain_restart_handler(void *priv_data)
+{
+ struct hinic3_api_cmd_chain *cmd_chain = priv_data;
+ u32 reg_addr, val;
+
+ if (!cmd_chain->hwdev->chip_present_flag)
+ return WAIT_PROCESS_ERR;
+
+ reg_addr = HINIC3_CSR_API_CMD_CHAIN_REQ_ADDR(cmd_chain->chain_type);
+ val = hinic3_hwif_read_reg(cmd_chain->hwdev->hwif, reg_addr);
+ if (!HINIC3_API_CMD_CHAIN_REQ_GET(val, RESTART))
+ return WAIT_PROCESS_CPL;
+
+ return WAIT_PROCESS_WAITING;
+}
+
+/**
+ * api_cmd_hw_restart - restart the chain in the HW
+ * @cmd_chain: the API CMD specific chain to restart
+ **/
+static int api_cmd_hw_restart(struct hinic3_api_cmd_chain *cmd_chain)
+{
+ struct hinic3_hwif *hwif = cmd_chain->hwdev->hwif;
+ u32 reg_addr, val;
+
+ /* Read Modify Write */
+ reg_addr = HINIC3_CSR_API_CMD_CHAIN_REQ_ADDR(cmd_chain->chain_type);
+ val = hinic3_hwif_read_reg(hwif, reg_addr);
+
+ val = HINIC3_API_CMD_CHAIN_REQ_CLEAR(val, RESTART);
+ val |= HINIC3_API_CMD_CHAIN_REQ_SET(1, RESTART);
+
+ hinic3_hwif_write_reg(hwif, reg_addr, val);
+
+ return hinic3_wait_for_timeout(cmd_chain, check_chain_restart_handler,
+ API_CMD_TIMEOUT, USEC_PER_MSEC);
+}
+
+/**
+ * api_cmd_ctrl_init - set the control register of a chain
+ * @chain: the API CMD specific chain to set control register for
+ **/
+static void api_cmd_ctrl_init(struct hinic3_api_cmd_chain *chain)
+{
+ struct hinic3_hwif *hwif = chain->hwdev->hwif;
+ u32 reg_addr, ctrl;
+ u32 size;
+
+ /* Read Modify Write */
+ reg_addr = HINIC3_CSR_API_CMD_CHAIN_CTRL_ADDR(chain->chain_type);
+
+ size = (u32)ilog2(chain->cell_size >> API_CMD_CHAIN_CELL_SIZE_SHIFT);
+
+ ctrl = hinic3_hwif_read_reg(hwif, reg_addr);
+
+ ctrl = HINIC3_API_CMD_CHAIN_CTRL_CLEAR(ctrl, AEQE_EN) &
+ HINIC3_API_CMD_CHAIN_CTRL_CLEAR(ctrl, CELL_SIZE);
+
+ ctrl |= HINIC3_API_CMD_CHAIN_CTRL_SET(0, AEQE_EN) |
+ HINIC3_API_CMD_CHAIN_CTRL_SET(size, CELL_SIZE);
+
+ hinic3_hwif_write_reg(hwif, reg_addr, ctrl);
+}
+
+/**
+ * api_cmd_set_status_addr - set the status address of a chain in the HW
+ * @chain: the API CMD specific chain to set status address for
+ **/
+static void api_cmd_set_status_addr(struct hinic3_api_cmd_chain *chain)
+{
+ struct hinic3_hwif *hwif = chain->hwdev->hwif;
+ u32 addr, val;
+
+ addr = HINIC3_CSR_API_CMD_STATUS_HI_ADDR(chain->chain_type);
+ val = upper_32_bits(chain->wb_status_paddr);
+ hinic3_hwif_write_reg(hwif, addr, val);
+
+ addr = HINIC3_CSR_API_CMD_STATUS_LO_ADDR(chain->chain_type);
+ val = lower_32_bits(chain->wb_status_paddr);
+ hinic3_hwif_write_reg(hwif, addr, val);
+}
+
+/**
+ * api_cmd_set_num_cells - set the number of cells of a chain in the HW
+ * @chain: the API CMD specific chain to set the number of cells for
+ **/
+static void api_cmd_set_num_cells(struct hinic3_api_cmd_chain *chain)
+{
+ struct hinic3_hwif *hwif = chain->hwdev->hwif;
+ u32 addr, val;
+
+ addr = HINIC3_CSR_API_CMD_CHAIN_NUM_CELLS_ADDR(chain->chain_type);
+ val = chain->num_cells;
+ hinic3_hwif_write_reg(hwif, addr, val);
+}
+
+/**
+ * api_cmd_head_init - set the head cell of a chain in the HW
+ * @chain: the API CMD specific chain to set the head for
+ **/
+static void api_cmd_head_init(struct hinic3_api_cmd_chain *chain)
+{
+ struct hinic3_hwif *hwif = chain->hwdev->hwif;
+ u32 addr, val;
+
+ addr = HINIC3_CSR_API_CMD_CHAIN_HEAD_HI_ADDR(chain->chain_type);
+ val = upper_32_bits(chain->head_cell_paddr);
+ hinic3_hwif_write_reg(hwif, addr, val);
+
+ addr = HINIC3_CSR_API_CMD_CHAIN_HEAD_LO_ADDR(chain->chain_type);
+ val = lower_32_bits(chain->head_cell_paddr);
+ hinic3_hwif_write_reg(hwif, addr, val);
+}
+
+static enum hinic3_wait_return check_chain_ready_handler(void *priv_data)
+{
+ struct hinic3_api_cmd_chain *chain = priv_data;
+ u32 addr, val;
+ u32 hw_cons_idx;
+
+ if (!chain->hwdev->chip_present_flag)
+ return WAIT_PROCESS_ERR;
+
+ addr = HINIC3_CSR_API_CMD_STATUS_0_ADDR(chain->chain_type);
+ val = hinic3_hwif_read_reg(chain->hwdev->hwif, addr);
+ hw_cons_idx = HINIC3_API_CMD_STATUS_GET(val, CONS_IDX);
+ /* wait for HW cons idx to be updated */
+ if (hw_cons_idx == chain->cons_idx)
+ return WAIT_PROCESS_CPL;
+ return WAIT_PROCESS_WAITING;
+}
+
+/**
+ * wait_for_ready_chain - wait for the chain to be ready
+ * @chain: the API CMD specific chain to wait for
+ * Return: 0 - success, negative - failure
+ **/
+static int wait_for_ready_chain(struct hinic3_api_cmd_chain *chain)
+{
+ return hinic3_wait_for_timeout(chain, check_chain_ready_handler,
+ API_CMD_TIMEOUT, USEC_PER_MSEC);
+}
+
+/**
+ * api_cmd_chain_hw_clean - clean the HW
+ * @chain: the API CMD specific chain
+ **/
+static void api_cmd_chain_hw_clean(struct hinic3_api_cmd_chain *chain)
+{
+ struct hinic3_hwif *hwif = chain->hwdev->hwif;
+ u32 addr, ctrl;
+
+ addr = HINIC3_CSR_API_CMD_CHAIN_CTRL_ADDR(chain->chain_type);
+
+ ctrl = hinic3_hwif_read_reg(hwif, addr);
+ ctrl = HINIC3_API_CMD_CHAIN_CTRL_CLEAR(ctrl, RESTART_EN) &
+ HINIC3_API_CMD_CHAIN_CTRL_CLEAR(ctrl, XOR_ERR) &
+ HINIC3_API_CMD_CHAIN_CTRL_CLEAR(ctrl, AEQE_EN) &
+ HINIC3_API_CMD_CHAIN_CTRL_CLEAR(ctrl, XOR_CHK_EN) &
+ HINIC3_API_CMD_CHAIN_CTRL_CLEAR(ctrl, CELL_SIZE);
+
+ hinic3_hwif_write_reg(hwif, addr, ctrl);
+}
+
+/**
+ * api_cmd_chain_hw_init - initialize the chain in the HW
+ * @chain: the API CMD specific chain to initialize in HW
+ * Return: 0 - success, negative - failure
+ **/
+static int api_cmd_chain_hw_init(struct hinic3_api_cmd_chain *chain)
+{
+ api_cmd_chain_hw_clean(chain);
+
+ api_cmd_set_status_addr(chain);
+
+ if (api_cmd_hw_restart(chain)) {
+ sdk_err(chain->hwdev->dev_hdl, "Failed to restart api_cmd_hw\n");
+ return -EBUSY;
+ }
+
+ api_cmd_ctrl_init(chain);
+ api_cmd_set_num_cells(chain);
+ api_cmd_head_init(chain);
+
+ return wait_for_ready_chain(chain);
+}
+
+/**
+ * alloc_cmd_buf - allocate a dma buffer for API CMD command
+ * @chain: the API CMD specific chain for the cmd
+ * @cell: the cell in the HW for the cmd
+ * @cell_idx: the index of the cell
+ * Return: 0 - success, negative - failure
+ **/
+static int alloc_cmd_buf(struct hinic3_api_cmd_chain *chain,
+ struct hinic3_api_cmd_cell *cell, u32 cell_idx)
+{
+ struct hinic3_api_cmd_cell_ctxt *cell_ctxt;
+ void *dev = chain->hwdev->dev_hdl;
+ void *buf_vaddr;
+ u64 buf_paddr;
+ int err = 0;
+
+ buf_vaddr = (u8 *)((u64)chain->buf_vaddr_base +
+ chain->buf_size_align * cell_idx);
+ buf_paddr = chain->buf_paddr_base +
+ chain->buf_size_align * cell_idx;
+
+ cell_ctxt = &chain->cell_ctxt[cell_idx];
+
+ cell_ctxt->api_cmd_vaddr = buf_vaddr;
+
+ /* set the cmd DMA address in the cell */
+ switch (chain->chain_type) {
+ case HINIC3_API_CMD_POLL_READ:
+ cell->read.hw_cmd_paddr = cpu_to_be64(buf_paddr);
+ break;
+ case HINIC3_API_CMD_WRITE_TO_MGMT_CPU:
+ case HINIC3_API_CMD_POLL_WRITE:
+ case HINIC3_API_CMD_WRITE_ASYNC_TO_MGMT_CPU:
+ /* The data in the HW should be in Big Endian Format */
+ cell->write.hw_cmd_paddr = cpu_to_be64(buf_paddr);
+ break;
+ default:
+ sdk_err(dev, "Unknown API CMD Chain type: %d\n",
+ chain->chain_type);
+ err = -EINVAL;
+ break;
+ }
+
+ return err;
+}
+
+/**
+ * alloc_resp_buf - allocate a response buffer for API CMD command
+ * @chain: the API CMD specific chain for the cmd
+ * @cell: the cell in the HW for the cmd
+ * @cell_idx: the index of the cell
+ **/
+static void alloc_resp_buf(struct hinic3_api_cmd_chain *chain,
+ struct hinic3_api_cmd_cell *cell, u32 cell_idx)
+{
+ struct hinic3_api_cmd_cell_ctxt *cell_ctxt;
+ void *resp_vaddr;
+ u64 resp_paddr;
+
+ resp_vaddr = (u8 *)((u64)chain->rsp_vaddr_base +
+ chain->rsp_size_align * cell_idx);
+ resp_paddr = chain->rsp_paddr_base +
+ chain->rsp_size_align * cell_idx;
+
+ cell_ctxt = &chain->cell_ctxt[cell_idx];
+
+ cell_ctxt->resp = resp_vaddr;
+ cell->read.hw_wb_resp_paddr = cpu_to_be64(resp_paddr);
+}
+
+static int hinic3_alloc_api_cmd_cell_buf(struct hinic3_api_cmd_chain *chain,
+ u32 cell_idx,
+ struct hinic3_api_cmd_cell *node)
+{
+ void *dev = chain->hwdev->dev_hdl;
+ int err;
+
+ /* For read chain, we should allocate buffer for the response data */
+ if (chain->chain_type == HINIC3_API_CMD_MULTI_READ ||
+ chain->chain_type == HINIC3_API_CMD_POLL_READ)
+ alloc_resp_buf(chain, node, cell_idx);
+
+ switch (chain->chain_type) {
+ case HINIC3_API_CMD_WRITE_TO_MGMT_CPU:
+ case HINIC3_API_CMD_POLL_WRITE:
+ case HINIC3_API_CMD_POLL_READ:
+ case HINIC3_API_CMD_WRITE_ASYNC_TO_MGMT_CPU:
+ err = alloc_cmd_buf(chain, node, cell_idx);
+ if (err) {
+ sdk_err(dev, "Failed to allocate cmd buffer\n");
+ goto alloc_cmd_buf_err;
+ }
+ break;
+	/* For the multi-read chain, the command data is embedded directly
+	 * in the cell, so no separate buffer needs to be allocated.
+	 */
+ case HINIC3_API_CMD_MULTI_READ:
+ chain->cell_ctxt[cell_idx].api_cmd_vaddr =
+ &node->read.hw_cmd_paddr;
+ break;
+ default:
+ sdk_err(dev, "Unsupported API CMD chain type\n");
+ err = -EINVAL;
+ goto alloc_cmd_buf_err;
+ }
+
+ return 0;
+
+alloc_cmd_buf_err:
+
+ return err;
+}
+
+/**
+ * api_cmd_create_cell - create API CMD cell of specific chain
+ * @chain: the API CMD specific chain to create its cell
+ * @cell_idx: the cell index to create
+ * @pre_node: previous cell
+ * @node_vaddr: the virt addr of the cell
+ * Return: 0 - success, negative - failure
+ **/
+static int api_cmd_create_cell(struct hinic3_api_cmd_chain *chain, u32 cell_idx,
+ struct hinic3_api_cmd_cell *pre_node,
+ struct hinic3_api_cmd_cell **node_vaddr)
+{
+ struct hinic3_api_cmd_cell_ctxt *cell_ctxt;
+ struct hinic3_api_cmd_cell *node;
+ void *cell_vaddr;
+ u64 cell_paddr;
+ int err;
+
+ cell_vaddr = (void *)((u64)chain->cell_vaddr_base +
+ chain->cell_size_align * cell_idx);
+ cell_paddr = chain->cell_paddr_base +
+ chain->cell_size_align * cell_idx;
+
+ cell_ctxt = &chain->cell_ctxt[cell_idx];
+ cell_ctxt->cell_vaddr = cell_vaddr;
+ cell_ctxt->hwdev = chain->hwdev;
+ node = cell_ctxt->cell_vaddr;
+
+ if (!pre_node) {
+ chain->head_node = cell_vaddr;
+ chain->head_cell_paddr = (dma_addr_t)cell_paddr;
+ } else {
+ /* The data in the HW should be in Big Endian Format */
+ pre_node->next_cell_paddr = cpu_to_be64(cell_paddr);
+ }
+
+	/* Driver software should make sure that there is an empty API
+	 * command cell at the end of the chain
+	 */
+ node->next_cell_paddr = 0;
+
+ err = hinic3_alloc_api_cmd_cell_buf(chain, cell_idx, node);
+ if (err)
+ return err;
+
+ *node_vaddr = node;
+
+ return 0;
+}
+
+/**
+ * api_cmd_create_cells - create API CMD cells for specific chain
+ * @chain: the API CMD specific chain
+ * Return: 0 - success, negative - failure
+ **/
+static int api_cmd_create_cells(struct hinic3_api_cmd_chain *chain)
+{
+ struct hinic3_api_cmd_cell *node = NULL, *pre_node = NULL;
+ void *dev = chain->hwdev->dev_hdl;
+ u32 cell_idx;
+ int err;
+
+ for (cell_idx = 0; cell_idx < chain->num_cells; cell_idx++) {
+ err = api_cmd_create_cell(chain, cell_idx, pre_node, &node);
+ if (err) {
+ sdk_err(dev, "Failed to create API CMD cell\n");
+ return err;
+ }
+
+ pre_node = node;
+ }
+
+ if (!node)
+ return -EFAULT;
+
+	/* set the final node to point to the start */
+ node->next_cell_paddr = cpu_to_be64(chain->head_cell_paddr);
+
+ /* set the current node to be the head */
+ chain->curr_node = chain->head_node;
+ return 0;
+}
+
+/**
+ * api_chain_init - initialize API CMD specific chain
+ * @chain: the API CMD specific chain to initialize
+ * @attr: attributes to set in the chain
+ * Return: 0 - success, negative - failure
+ **/
+static int api_chain_init(struct hinic3_api_cmd_chain *chain,
+ struct hinic3_api_cmd_chain_attr *attr)
+{
+ void *dev = chain->hwdev->dev_hdl;
+ size_t cell_ctxt_size;
+ size_t cells_buf_size;
+ int err;
+
+ chain->chain_type = attr->chain_type;
+ chain->num_cells = attr->num_cells;
+ chain->cell_size = attr->cell_size;
+ chain->rsp_size = attr->rsp_size;
+
+ chain->prod_idx = 0;
+ chain->cons_idx = 0;
+
+ if (chain->chain_type == HINIC3_API_CMD_WRITE_ASYNC_TO_MGMT_CPU)
+ spin_lock_init(&chain->async_lock);
+ else
+ sema_init(&chain->sem, 1);
+
+ cell_ctxt_size = chain->num_cells * sizeof(*chain->cell_ctxt);
+ if (!cell_ctxt_size) {
+ sdk_err(dev, "Api chain cell size cannot be zero\n");
+ err = -EINVAL;
+ goto alloc_cell_ctxt_err;
+ }
+
+ chain->cell_ctxt = kzalloc(cell_ctxt_size, GFP_KERNEL);
+ if (!chain->cell_ctxt) {
+ sdk_err(dev, "Failed to allocate cell contexts for a chain\n");
+ err = -ENOMEM;
+ goto alloc_cell_ctxt_err;
+ }
+
+ chain->wb_status = dma_zalloc_coherent(dev,
+ sizeof(*chain->wb_status),
+ &chain->wb_status_paddr,
+ GFP_KERNEL);
+ if (!chain->wb_status) {
+ sdk_err(dev, "Failed to allocate DMA wb status\n");
+ err = -ENOMEM;
+ goto alloc_wb_status_err;
+ }
+
+ chain->cell_size_align = ALIGN((u64)chain->cell_size,
+ API_CMD_NODE_ALIGN_SIZE);
+ chain->rsp_size_align = ALIGN((u64)chain->rsp_size,
+ API_CHAIN_RESP_ALIGNMENT);
+ chain->buf_size_align = ALIGN(API_CMD_BUF_SIZE, API_PAYLOAD_ALIGN_SIZE);
+
+ cells_buf_size = (chain->cell_size_align + chain->rsp_size_align +
+ chain->buf_size_align) * chain->num_cells;
+
+ err = hinic3_dma_zalloc_coherent_align(dev, cells_buf_size,
+ API_CMD_NODE_ALIGN_SIZE,
+ GFP_KERNEL,
+ &chain->cells_addr);
+ if (err) {
+ sdk_err(dev, "Failed to allocate API CMD cells buffer\n");
+ goto alloc_cells_buf_err;
+ }
+
+ chain->cell_vaddr_base = chain->cells_addr.align_vaddr;
+ chain->cell_paddr_base = chain->cells_addr.align_paddr;
+
+ chain->rsp_vaddr_base = (u8 *)((u64)chain->cell_vaddr_base +
+ chain->cell_size_align * chain->num_cells);
+ chain->rsp_paddr_base = chain->cell_paddr_base +
+ chain->cell_size_align * chain->num_cells;
+
+ chain->buf_vaddr_base = (u8 *)((u64)chain->rsp_vaddr_base +
+ chain->rsp_size_align * chain->num_cells);
+ chain->buf_paddr_base = chain->rsp_paddr_base +
+ chain->rsp_size_align * chain->num_cells;
+
+ return 0;
+
+alloc_cells_buf_err:
+ dma_free_coherent(dev, sizeof(*chain->wb_status),
+ chain->wb_status, chain->wb_status_paddr);
+
+alloc_wb_status_err:
+ kfree(chain->cell_ctxt);
+
+/*lint -save -e548*/
+alloc_cell_ctxt_err:
+ if (chain->chain_type == HINIC3_API_CMD_WRITE_ASYNC_TO_MGMT_CPU)
+ spin_lock_deinit(&chain->async_lock);
+ else
+ sema_deinit(&chain->sem);
+/*lint -restore*/
+ return err;
+}
+
+/**
+ * api_chain_free - free API CMD specific chain
+ * @chain: the API CMD specific chain to free
+ **/
+static void api_chain_free(struct hinic3_api_cmd_chain *chain)
+{
+ void *dev = chain->hwdev->dev_hdl;
+
+ hinic3_dma_free_coherent_align(dev, &chain->cells_addr);
+
+ dma_free_coherent(dev, sizeof(*chain->wb_status),
+ chain->wb_status, chain->wb_status_paddr);
+ kfree(chain->cell_ctxt);
+
+ if (chain->chain_type == HINIC3_API_CMD_WRITE_ASYNC_TO_MGMT_CPU)
+ spin_lock_deinit(&chain->async_lock);
+ else
+ sema_deinit(&chain->sem);
+}
+
+/**
+ * api_cmd_create_chain - create API CMD specific chain
+ * @chain: the API CMD specific chain to create
+ * @attr: attributes to set in the chain
+ * Return: 0 - success, negative - failure
+ **/
+static int api_cmd_create_chain(struct hinic3_api_cmd_chain **cmd_chain,
+ struct hinic3_api_cmd_chain_attr *attr)
+{
+ struct hinic3_hwdev *hwdev = attr->hwdev;
+ struct hinic3_api_cmd_chain *chain = NULL;
+ int err;
+
+ if (attr->num_cells & (attr->num_cells - 1)) {
+ sdk_err(hwdev->dev_hdl, "Invalid number of cells, must be power of 2\n");
+ return -EINVAL;
+ }
+
+ chain = kzalloc(sizeof(*chain), GFP_KERNEL);
+ if (!chain)
+ return -ENOMEM;
+
+ chain->hwdev = hwdev;
+
+ err = api_chain_init(chain, attr);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to initialize chain\n");
+ goto chain_init_err;
+ }
+
+ err = api_cmd_create_cells(chain);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to create cells for API CMD chain\n");
+ goto create_cells_err;
+ }
+
+ err = api_cmd_chain_hw_init(chain);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to initialize chain HW\n");
+ goto chain_hw_init_err;
+ }
+
+ *cmd_chain = chain;
+ return 0;
+
+chain_hw_init_err:
+create_cells_err:
+ api_chain_free(chain);
+
+chain_init_err:
+ kfree(chain);
+ return err;
+}
+
+/**
+ * api_cmd_destroy_chain - destroy API CMD specific chain
+ * @chain: the API CMD specific chain to destroy
+ **/
+static void api_cmd_destroy_chain(struct hinic3_api_cmd_chain *chain)
+{
+ api_chain_free(chain);
+ kfree(chain);
+}
+
+/**
+ * hinic3_api_cmd_init - Initialize all the API CMD chains
+ * @hwdev: the hardware device of a pci function
+ * @chain: the API CMD chains that will be initialized
+ * Return: 0 - success, negative - failure
+ **/
+int hinic3_api_cmd_init(struct hinic3_hwdev *hwdev,
+ struct hinic3_api_cmd_chain **chain)
+{
+ void *dev = hwdev->dev_hdl;
+ struct hinic3_api_cmd_chain_attr attr;
+ u8 chain_type, i;
+ int err;
+
+ if (COMM_SUPPORT_API_CHAIN(hwdev) == 0)
+ return 0;
+
+ attr.hwdev = hwdev;
+ attr.num_cells = API_CHAIN_NUM_CELLS;
+ attr.cell_size = API_CHAIN_CELL_SIZE;
+ attr.rsp_size = API_CHAIN_RSP_DATA_SIZE;
+
+ chain_type = HINIC3_API_CMD_WRITE_TO_MGMT_CPU;
+ for (; chain_type < HINIC3_API_CMD_MAX; chain_type++) {
+ attr.chain_type = chain_type;
+
+ err = api_cmd_create_chain(&chain[chain_type], &attr);
+ if (err) {
+ sdk_err(dev, "Failed to create chain %d\n", chain_type);
+ goto create_chain_err;
+ }
+ }
+
+ return 0;
+
+create_chain_err:
+ i = HINIC3_API_CMD_WRITE_TO_MGMT_CPU;
+ for (; i < chain_type; i++)
+ api_cmd_destroy_chain(chain[i]);
+
+ return err;
+}
+
+/**
+ * hinic3_api_cmd_free - free the API CMD chains
+ * @hwdev: the hardware device of a pci function
+ * @chain: the API CMD chains that will be freed
+ **/
+void hinic3_api_cmd_free(const struct hinic3_hwdev *hwdev, struct hinic3_api_cmd_chain **chain)
+{
+ u8 chain_type;
+
+ if (COMM_SUPPORT_API_CHAIN(hwdev) == 0)
+ return;
+
+ chain_type = HINIC3_API_CMD_WRITE_TO_MGMT_CPU;
+
+ for (; chain_type < HINIC3_API_CMD_MAX; chain_type++)
+ api_cmd_destroy_chain(chain[chain_type]);
+}
+
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_api_cmd.h b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_api_cmd.h
new file mode 100644
index 000000000000..727e668bf237
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_api_cmd.h
@@ -0,0 +1,286 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_API_CMD_H
+#define HINIC3_API_CMD_H
+
+#include <linux/semaphore.h>
+
+#include "hinic3_eqs.h"
+#include "hinic3_hwif.h"
+
+/* api_cmd_cell.ctrl structure */
+#define HINIC3_API_CMD_CELL_CTRL_CELL_LEN_SHIFT 0
+#define HINIC3_API_CMD_CELL_CTRL_RD_DMA_ATTR_OFF_SHIFT 16
+#define HINIC3_API_CMD_CELL_CTRL_WR_DMA_ATTR_OFF_SHIFT 24
+#define HINIC3_API_CMD_CELL_CTRL_XOR_CHKSUM_SHIFT 56
+
+#define HINIC3_API_CMD_CELL_CTRL_CELL_LEN_MASK 0x3FU
+#define HINIC3_API_CMD_CELL_CTRL_RD_DMA_ATTR_OFF_MASK 0x3FU
+#define HINIC3_API_CMD_CELL_CTRL_WR_DMA_ATTR_OFF_MASK 0x3FU
+#define HINIC3_API_CMD_CELL_CTRL_XOR_CHKSUM_MASK 0xFFU
+
+#define HINIC3_API_CMD_CELL_CTRL_SET(val, member) \
+ ((((u64)(val)) & HINIC3_API_CMD_CELL_CTRL_##member##_MASK) << \
+ HINIC3_API_CMD_CELL_CTRL_##member##_SHIFT)
+
+/* api_cmd_cell.desc structure */
+#define HINIC3_API_CMD_DESC_API_TYPE_SHIFT 0
+#define HINIC3_API_CMD_DESC_RD_WR_SHIFT 1
+#define HINIC3_API_CMD_DESC_MGMT_BYPASS_SHIFT 2
+#define HINIC3_API_CMD_DESC_RESP_AEQE_EN_SHIFT 3
+#define HINIC3_API_CMD_DESC_APICHN_RSVD_SHIFT 4
+#define HINIC3_API_CMD_DESC_APICHN_CODE_SHIFT 6
+#define HINIC3_API_CMD_DESC_PRIV_DATA_SHIFT 8
+#define HINIC3_API_CMD_DESC_DEST_SHIFT 32
+#define HINIC3_API_CMD_DESC_SIZE_SHIFT 40
+#define HINIC3_API_CMD_DESC_XOR_CHKSUM_SHIFT 56
+
+#define HINIC3_API_CMD_DESC_API_TYPE_MASK 0x1U
+#define HINIC3_API_CMD_DESC_RD_WR_MASK 0x1U
+#define HINIC3_API_CMD_DESC_MGMT_BYPASS_MASK 0x1U
+#define HINIC3_API_CMD_DESC_RESP_AEQE_EN_MASK 0x1U
+#define HINIC3_API_CMD_DESC_APICHN_RSVD_MASK 0x3U
+#define HINIC3_API_CMD_DESC_APICHN_CODE_MASK 0x3U
+#define HINIC3_API_CMD_DESC_PRIV_DATA_MASK 0xFFFFFFU
+#define HINIC3_API_CMD_DESC_DEST_MASK 0x1FU
+#define HINIC3_API_CMD_DESC_SIZE_MASK 0x7FFU
+#define HINIC3_API_CMD_DESC_XOR_CHKSUM_MASK 0xFFU
+
+#define HINIC3_API_CMD_DESC_SET(val, member) \
+ ((((u64)(val)) & HINIC3_API_CMD_DESC_##member##_MASK) << \
+ HINIC3_API_CMD_DESC_##member##_SHIFT)
+
+/* api_cmd_status header */
+#define HINIC3_API_CMD_STATUS_HEADER_VALID_SHIFT 0
+#define HINIC3_API_CMD_STATUS_HEADER_CHAIN_ID_SHIFT 16
+
+#define HINIC3_API_CMD_STATUS_HEADER_VALID_MASK 0xFFU
+#define HINIC3_API_CMD_STATUS_HEADER_CHAIN_ID_MASK 0xFFU
+
+#define HINIC3_API_CMD_STATUS_HEADER_GET(val, member) \
+ (((val) >> HINIC3_API_CMD_STATUS_HEADER_##member##_SHIFT) & \
+ HINIC3_API_CMD_STATUS_HEADER_##member##_MASK)
+
+/* API_CHAIN_REQ CSR: 0x0020+api_idx*0x080 */
+#define HINIC3_API_CMD_CHAIN_REQ_RESTART_SHIFT 1
+#define HINIC3_API_CMD_CHAIN_REQ_WB_TRIGGER_SHIFT 2
+
+#define HINIC3_API_CMD_CHAIN_REQ_RESTART_MASK 0x1U
+#define HINIC3_API_CMD_CHAIN_REQ_WB_TRIGGER_MASK 0x1U
+
+#define HINIC3_API_CMD_CHAIN_REQ_SET(val, member) \
+ (((val) & HINIC3_API_CMD_CHAIN_REQ_##member##_MASK) << \
+ HINIC3_API_CMD_CHAIN_REQ_##member##_SHIFT)
+
+#define HINIC3_API_CMD_CHAIN_REQ_GET(val, member) \
+ (((val) >> HINIC3_API_CMD_CHAIN_REQ_##member##_SHIFT) & \
+ HINIC3_API_CMD_CHAIN_REQ_##member##_MASK)
+
+#define HINIC3_API_CMD_CHAIN_REQ_CLEAR(val, member) \
+ ((val) & (~(HINIC3_API_CMD_CHAIN_REQ_##member##_MASK \
+ << HINIC3_API_CMD_CHAIN_REQ_##member##_SHIFT)))
+
+/* API_CHAIN_CTL CSR: 0x0014+api_idx*0x080 */
+#define HINIC3_API_CMD_CHAIN_CTRL_RESTART_EN_SHIFT 1
+#define HINIC3_API_CMD_CHAIN_CTRL_XOR_ERR_SHIFT 2
+#define HINIC3_API_CMD_CHAIN_CTRL_AEQE_EN_SHIFT 4
+#define HINIC3_API_CMD_CHAIN_CTRL_AEQ_ID_SHIFT 8
+#define HINIC3_API_CMD_CHAIN_CTRL_XOR_CHK_EN_SHIFT 28
+#define HINIC3_API_CMD_CHAIN_CTRL_CELL_SIZE_SHIFT 30
+
+#define HINIC3_API_CMD_CHAIN_CTRL_RESTART_EN_MASK 0x1U
+#define HINIC3_API_CMD_CHAIN_CTRL_XOR_ERR_MASK 0x1U
+#define HINIC3_API_CMD_CHAIN_CTRL_AEQE_EN_MASK 0x1U
+#define HINIC3_API_CMD_CHAIN_CTRL_AEQ_ID_MASK 0x3U
+#define HINIC3_API_CMD_CHAIN_CTRL_XOR_CHK_EN_MASK 0x3U
+#define HINIC3_API_CMD_CHAIN_CTRL_CELL_SIZE_MASK 0x3U
+
+#define HINIC3_API_CMD_CHAIN_CTRL_SET(val, member) \
+ (((val) & HINIC3_API_CMD_CHAIN_CTRL_##member##_MASK) << \
+ HINIC3_API_CMD_CHAIN_CTRL_##member##_SHIFT)
+
+#define HINIC3_API_CMD_CHAIN_CTRL_CLEAR(val, member) \
+ ((val) & (~(HINIC3_API_CMD_CHAIN_CTRL_##member##_MASK \
+ << HINIC3_API_CMD_CHAIN_CTRL_##member##_SHIFT)))
+
+/* api_cmd rsp header */
+#define HINIC3_API_CMD_RESP_HEAD_VALID_SHIFT 0
+#define HINIC3_API_CMD_RESP_HEAD_STATUS_SHIFT 8
+#define HINIC3_API_CMD_RESP_HEAD_CHAIN_ID_SHIFT 16
+#define HINIC3_API_CMD_RESP_HEAD_RESP_LEN_SHIFT 24
+#define HINIC3_API_CMD_RESP_HEAD_DRIVER_PRIV_SHIFT 40
+
+#define HINIC3_API_CMD_RESP_HEAD_VALID_MASK 0xFF
+#define HINIC3_API_CMD_RESP_HEAD_STATUS_MASK 0xFFU
+#define HINIC3_API_CMD_RESP_HEAD_CHAIN_ID_MASK 0xFFU
+#define HINIC3_API_CMD_RESP_HEAD_RESP_LEN_MASK 0x1FFU
+#define HINIC3_API_CMD_RESP_HEAD_DRIVER_PRIV_MASK 0xFFFFFFU
+
+#define HINIC3_API_CMD_RESP_HEAD_VALID_CODE 0xFF
+
+#define HINIC3_API_CMD_RESP_HEADER_VALID(val) \
+ (((val) & HINIC3_API_CMD_RESP_HEAD_VALID_MASK) == \
+ HINIC3_API_CMD_RESP_HEAD_VALID_CODE)
+
+#define HINIC3_API_CMD_RESP_HEAD_GET(val, member) \
+ (((val) >> HINIC3_API_CMD_RESP_HEAD_##member##_SHIFT) & \
+ HINIC3_API_CMD_RESP_HEAD_##member##_MASK)
+
+#define HINIC3_API_CMD_RESP_HEAD_CHAIN_ID(val) \
+ (((val) >> HINIC3_API_CMD_RESP_HEAD_CHAIN_ID_SHIFT) & \
+ HINIC3_API_CMD_RESP_HEAD_CHAIN_ID_MASK)
+
+#define HINIC3_API_CMD_RESP_HEAD_DRIVER_PRIV(val) \
+ ((u16)(((val) >> HINIC3_API_CMD_RESP_HEAD_DRIVER_PRIV_SHIFT) & \
+ HINIC3_API_CMD_RESP_HEAD_DRIVER_PRIV_MASK))
+/* API_STATUS_0 CSR: 0x0030+api_idx*0x080 */
+#define HINIC3_API_CMD_STATUS_CONS_IDX_MASK 0xFFFFFFU
+#define HINIC3_API_CMD_STATUS_CONS_IDX_SHIFT 0
+
+#define HINIC3_API_CMD_STATUS_FSM_MASK 0xFU
+#define HINIC3_API_CMD_STATUS_FSM_SHIFT 24
+
+#define HINIC3_API_CMD_STATUS_CHKSUM_ERR_MASK 0x3U
+#define HINIC3_API_CMD_STATUS_CHKSUM_ERR_SHIFT 28
+
+#define HINIC3_API_CMD_STATUS_CPLD_ERR_MASK 0x1U
+#define HINIC3_API_CMD_STATUS_CPLD_ERR_SHIFT 30
+
+#define HINIC3_API_CMD_STATUS_CONS_IDX(val) \
+ ((val) & HINIC3_API_CMD_STATUS_CONS_IDX_MASK)
+
+#define HINIC3_API_CMD_STATUS_CHKSUM_ERR(val) \
+ (((val) >> HINIC3_API_CMD_STATUS_CHKSUM_ERR_SHIFT) & \
+ HINIC3_API_CMD_STATUS_CHKSUM_ERR_MASK)
+
+#define HINIC3_API_CMD_STATUS_GET(val, member) \
+ (((val) >> HINIC3_API_CMD_STATUS_##member##_SHIFT) & \
+ HINIC3_API_CMD_STATUS_##member##_MASK)
+
+enum hinic3_api_cmd_chain_type {
+ /* write to mgmt cpu command with completion */
+ HINIC3_API_CMD_WRITE_TO_MGMT_CPU = 2,
+ /* multi read command with completion notification - not used */
+ HINIC3_API_CMD_MULTI_READ = 3,
+ /* write command without completion notification */
+ HINIC3_API_CMD_POLL_WRITE = 4,
+ /* read command without completion notification */
+ HINIC3_API_CMD_POLL_READ = 5,
+	/* write to mgmt cpu command without waiting for completion */
+ HINIC3_API_CMD_WRITE_ASYNC_TO_MGMT_CPU = 6,
+ HINIC3_API_CMD_MAX,
+};
+
+struct hinic3_api_cmd_status {
+ u64 header;
+ u32 buf_desc;
+ u32 cell_addr_hi;
+ u32 cell_addr_lo;
+ u32 rsvd0;
+ u64 rsvd1;
+};
+
+/* HW struct */
+struct hinic3_api_cmd_cell {
+ u64 ctrl;
+
+ /* address is 64 bit in HW struct */
+ u64 next_cell_paddr;
+
+ u64 desc;
+
+ /* HW struct */
+ union {
+ struct {
+ u64 hw_cmd_paddr;
+ } write;
+
+ struct {
+ u64 hw_wb_resp_paddr;
+ u64 hw_cmd_paddr;
+ } read;
+ };
+};
+
+struct hinic3_api_cmd_resp_fmt {
+ u64 header;
+ u64 resp_data;
+};
+
+struct hinic3_api_cmd_cell_ctxt {
+ struct hinic3_api_cmd_cell *cell_vaddr;
+
+ void *api_cmd_vaddr;
+
+ struct hinic3_api_cmd_resp_fmt *resp;
+
+ struct completion done;
+ int status;
+
+ u32 saved_prod_idx;
+ struct hinic3_hwdev *hwdev;
+};
+
+struct hinic3_api_cmd_chain_attr {
+ struct hinic3_hwdev *hwdev;
+ enum hinic3_api_cmd_chain_type chain_type;
+
+ u32 num_cells;
+ u16 rsp_size;
+ u16 cell_size;
+};
+
+struct hinic3_api_cmd_chain {
+ struct hinic3_hwdev *hwdev;
+ enum hinic3_api_cmd_chain_type chain_type;
+
+ u32 num_cells;
+ u16 cell_size;
+ u16 rsp_size;
+ u32 rsvd1;
+
+	/* HW members are in 24-bit format */
+ u32 prod_idx;
+ u32 cons_idx;
+
+ struct semaphore sem;
+	/* Async cmd cannot sleep, so a spinlock is used instead of a semaphore */
+ spinlock_t async_lock;
+
+ dma_addr_t wb_status_paddr;
+ struct hinic3_api_cmd_status *wb_status;
+
+ dma_addr_t head_cell_paddr;
+ struct hinic3_api_cmd_cell *head_node;
+
+ struct hinic3_api_cmd_cell_ctxt *cell_ctxt;
+ struct hinic3_api_cmd_cell *curr_node;
+
+ struct hinic3_dma_addr_align cells_addr;
+
+ u8 *cell_vaddr_base;
+ u64 cell_paddr_base;
+ u8 *rsp_vaddr_base;
+ u64 rsp_paddr_base;
+ u8 *buf_vaddr_base;
+ u64 buf_paddr_base;
+ u64 cell_size_align;
+ u64 rsp_size_align;
+ u64 buf_size_align;
+
+ u64 rsvd2;
+};
+
+int hinic3_api_cmd_write(struct hinic3_api_cmd_chain *chain, u8 node_id,
+ const void *cmd, u16 size);
+
+int hinic3_api_cmd_read(struct hinic3_api_cmd_chain *chain, u8 node_id,
+ const void *cmd, u16 size, void *ack, u16 ack_size);
+
+int hinic3_api_cmd_init(struct hinic3_hwdev *hwdev,
+ struct hinic3_api_cmd_chain **chain);
+
+void hinic3_api_cmd_free(const struct hinic3_hwdev *hwdev, struct hinic3_api_cmd_chain **chain);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_cmdq.c b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_cmdq.c
new file mode 100644
index 000000000000..230859adf0b2
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_cmdq.c
@@ -0,0 +1,1543 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [COMM]" fmt
+
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/errno.h>
+#include <linux/completion.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/spinlock.h>
+#include <linux/slab.h>
+#include <linux/module.h>
+
+#include "ossl_knl.h"
+#include "hinic3_crm.h"
+#include "hinic3_hw.h"
+#include "hinic3_hwdev.h"
+#include "hinic3_eqs.h"
+#include "hinic3_common.h"
+#include "hinic3_wq.h"
+#include "hinic3_hw_comm.h"
+#include "hinic3_hwif.h"
+#include "hinic3_cmdq.h"
+
+#define HINIC3_CMDQ_BUF_SIZE 2048U
+
+#define CMDQ_CMD_TIMEOUT 5000 /* millisecond */
+
+#define UPPER_8_BITS(data) (((data) >> 8) & 0xFF)
+#define LOWER_8_BITS(data) ((data) & 0xFF)
+
+#define CMDQ_DB_INFO_HI_PROD_IDX_SHIFT 0
+#define CMDQ_DB_INFO_HI_PROD_IDX_MASK 0xFFU
+#define CMDQ_DB_INFO_SET(val, member) \
+ ((((u32)(val)) & CMDQ_DB_INFO_##member##_MASK) << \
+ CMDQ_DB_INFO_##member##_SHIFT)
+
+#define CMDQ_DB_HEAD_QUEUE_TYPE_SHIFT 23
+#define CMDQ_DB_HEAD_CMDQ_TYPE_SHIFT 24
+#define CMDQ_DB_HEAD_SRC_TYPE_SHIFT 27
+#define CMDQ_DB_HEAD_QUEUE_TYPE_MASK 0x1U
+#define CMDQ_DB_HEAD_CMDQ_TYPE_MASK 0x7U
+#define CMDQ_DB_HEAD_SRC_TYPE_MASK 0x1FU
+#define CMDQ_DB_HEAD_SET(val, member) \
+ ((((u32)(val)) & CMDQ_DB_HEAD_##member##_MASK) << \
+ CMDQ_DB_HEAD_##member##_SHIFT)
+
+#define CMDQ_CTRL_PI_SHIFT 0
+#define CMDQ_CTRL_CMD_SHIFT 16
+#define CMDQ_CTRL_MOD_SHIFT 24
+#define CMDQ_CTRL_ACK_TYPE_SHIFT 29
+#define CMDQ_CTRL_HW_BUSY_BIT_SHIFT 31
+
+#define CMDQ_CTRL_PI_MASK 0xFFFFU
+#define CMDQ_CTRL_CMD_MASK 0xFFU
+#define CMDQ_CTRL_MOD_MASK 0x1FU
+#define CMDQ_CTRL_ACK_TYPE_MASK 0x3U
+#define CMDQ_CTRL_HW_BUSY_BIT_MASK 0x1U
+
+#define CMDQ_CTRL_SET(val, member) \
+ ((((u32)(val)) & CMDQ_CTRL_##member##_MASK) << \
+ CMDQ_CTRL_##member##_SHIFT)
+
+#define CMDQ_CTRL_GET(val, member) \
+ (((val) >> CMDQ_CTRL_##member##_SHIFT) & \
+ CMDQ_CTRL_##member##_MASK)
+
+#define CMDQ_WQE_HEADER_BUFDESC_LEN_SHIFT 0
+#define CMDQ_WQE_HEADER_COMPLETE_FMT_SHIFT 15
+#define CMDQ_WQE_HEADER_DATA_FMT_SHIFT 22
+#define CMDQ_WQE_HEADER_COMPLETE_REQ_SHIFT 23
+#define CMDQ_WQE_HEADER_COMPLETE_SECT_LEN_SHIFT 27
+#define CMDQ_WQE_HEADER_CTRL_LEN_SHIFT 29
+#define CMDQ_WQE_HEADER_HW_BUSY_BIT_SHIFT 31
+
+#define CMDQ_WQE_HEADER_BUFDESC_LEN_MASK 0xFFU
+#define CMDQ_WQE_HEADER_COMPLETE_FMT_MASK 0x1U
+#define CMDQ_WQE_HEADER_DATA_FMT_MASK 0x1U
+#define CMDQ_WQE_HEADER_COMPLETE_REQ_MASK 0x1U
+#define CMDQ_WQE_HEADER_COMPLETE_SECT_LEN_MASK 0x3U
+#define CMDQ_WQE_HEADER_CTRL_LEN_MASK 0x3U
+#define CMDQ_WQE_HEADER_HW_BUSY_BIT_MASK 0x1U
+
+#define CMDQ_WQE_HEADER_SET(val, member) \
+ ((((u32)(val)) & CMDQ_WQE_HEADER_##member##_MASK) << \
+ CMDQ_WQE_HEADER_##member##_SHIFT)
+
+#define CMDQ_WQE_HEADER_GET(val, member) \
+ (((val) >> CMDQ_WQE_HEADER_##member##_SHIFT) & \
+ CMDQ_WQE_HEADER_##member##_MASK)
+
+#define CMDQ_CTXT_CURR_WQE_PAGE_PFN_SHIFT 0
+#define CMDQ_CTXT_EQ_ID_SHIFT 53
+#define CMDQ_CTXT_CEQ_ARM_SHIFT 61
+#define CMDQ_CTXT_CEQ_EN_SHIFT 62
+#define CMDQ_CTXT_HW_BUSY_BIT_SHIFT 63
+
+#define CMDQ_CTXT_CURR_WQE_PAGE_PFN_MASK 0xFFFFFFFFFFFFF
+#define CMDQ_CTXT_EQ_ID_MASK 0xFF
+#define CMDQ_CTXT_CEQ_ARM_MASK 0x1
+#define CMDQ_CTXT_CEQ_EN_MASK 0x1
+#define CMDQ_CTXT_HW_BUSY_BIT_MASK 0x1
+
+#define CMDQ_CTXT_PAGE_INFO_SET(val, member) \
+ (((u64)(val) & CMDQ_CTXT_##member##_MASK) << \
+ CMDQ_CTXT_##member##_SHIFT)
+
+#define CMDQ_CTXT_PAGE_INFO_GET(val, member) \
+ (((u64)(val) >> CMDQ_CTXT_##member##_SHIFT) & \
+ CMDQ_CTXT_##member##_MASK)
+
+#define CMDQ_CTXT_WQ_BLOCK_PFN_SHIFT 0
+#define CMDQ_CTXT_CI_SHIFT 52
+
+#define CMDQ_CTXT_WQ_BLOCK_PFN_MASK 0xFFFFFFFFFFFFF
+#define CMDQ_CTXT_CI_MASK 0xFFF
+
+#define CMDQ_CTXT_BLOCK_INFO_SET(val, member) \
+ (((u64)(val) & CMDQ_CTXT_##member##_MASK) << \
+ CMDQ_CTXT_##member##_SHIFT)
+
+#define CMDQ_CTXT_BLOCK_INFO_GET(val, member) \
+ (((u64)(val) >> CMDQ_CTXT_##member##_SHIFT) & \
+ CMDQ_CTXT_##member##_MASK)
+
+#define SAVED_DATA_ARM_SHIFT 31
+
+#define SAVED_DATA_ARM_MASK 0x1U
+
+#define SAVED_DATA_SET(val, member) \
+ (((val) & SAVED_DATA_##member##_MASK) << \
+ SAVED_DATA_##member##_SHIFT)
+
+#define SAVED_DATA_CLEAR(val, member) \
+ ((val) & (~(SAVED_DATA_##member##_MASK << \
+ SAVED_DATA_##member##_SHIFT)))
+
+#define WQE_ERRCODE_VAL_SHIFT 0
+
+#define WQE_ERRCODE_VAL_MASK 0x7FFFFFFF
+
+#define WQE_ERRCODE_GET(val, member) \
+ (((val) >> WQE_ERRCODE_##member##_SHIFT) & \
+ WQE_ERRCODE_##member##_MASK)
+
+#define CEQE_CMDQ_TYPE_SHIFT 0
+
+#define CEQE_CMDQ_TYPE_MASK 0x7
+
+#define CEQE_CMDQ_GET(val, member) \
+ (((val) >> CEQE_CMDQ_##member##_SHIFT) & \
+ CEQE_CMDQ_##member##_MASK)
+
+#define WQE_COMPLETED(ctrl_info) CMDQ_CTRL_GET(ctrl_info, HW_BUSY_BIT)
+
+#define WQE_HEADER(wqe) ((struct hinic3_cmdq_header *)(wqe))
+
+#define CMDQ_DB_PI_OFF(pi) (((u16)LOWER_8_BITS(pi)) << 3)
+
+#define CMDQ_DB_ADDR(db_base, pi) \
+ (((u8 *)(db_base)) + CMDQ_DB_PI_OFF(pi))
+
+#define CMDQ_PFN_SHIFT 12
+#define CMDQ_PFN(addr) ((addr) >> CMDQ_PFN_SHIFT)
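+/* Doorbell addressing example (illustrative): only the low 8 bits of the
+ * producer index select the doorbell slot, each slot being 8 bytes wide,
+ * so pi = 0x1234 gives CMDQ_DB_PI_OFF = 0x34 << 3 = 0x1a0 from db_base;
+ * the upper bits of pi travel in the db_info HI_PROD_IDX field instead.
+ */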
+
+#define FIRST_DATA_TO_WRITE_LAST sizeof(u64)
+
+#define WQE_LCMD_SIZE 64
+#define WQE_SCMD_SIZE 64
+
+#define COMPLETE_LEN 3
+
+#define CMDQ_WQEBB_SIZE 64
+#define CMDQ_WQE_SIZE 64
+
+#define cmdq_to_cmdqs(cmdq) container_of((cmdq) - (cmdq)->cmdq_type, \
+ struct hinic3_cmdqs, cmdq[0])
+
+#define CMDQ_SEND_CMPT_CODE 10
+#define CMDQ_COMPLETE_CMPT_CODE 11
+#define CMDQ_FORCE_STOP_CMPT_CODE 12
+
+enum cmdq_scmd_type {
+ CMDQ_SET_ARM_CMD = 2,
+};
+
+enum cmdq_wqe_type {
+ WQE_LCMD_TYPE,
+ WQE_SCMD_TYPE,
+};
+
+enum ctrl_sect_len {
+ CTRL_SECT_LEN = 1,
+ CTRL_DIRECT_SECT_LEN = 2,
+};
+
+enum bufdesc_len {
+ BUFDESC_LCMD_LEN = 2,
+ BUFDESC_SCMD_LEN = 3,
+};
+
+enum data_format {
+ DATA_SGE,
+ DATA_DIRECT,
+};
+
+enum completion_format {
+ COMPLETE_DIRECT,
+ COMPLETE_SGE,
+};
+
+enum completion_request {
+ CEQ_SET = 1,
+};
+
+enum cmdq_cmd_type {
+ SYNC_CMD_DIRECT_RESP,
+ SYNC_CMD_SGE_RESP,
+ ASYNC_CMD,
+};
+
+#define NUM_WQEBBS_FOR_CMDQ_WQE 1
+
+bool hinic3_cmdq_idle(struct hinic3_cmdq *cmdq)
+{
+ return hinic3_wq_is_empty(&cmdq->wq);
+}
+
+static void *cmdq_read_wqe(struct hinic3_wq *wq, u16 *ci)
+{
+ if (hinic3_wq_is_empty(wq))
+ return NULL;
+
+ return hinic3_wq_read_one_wqebb(wq, ci);
+}
+
+static void *cmdq_get_wqe(struct hinic3_wq *wq, u16 *pi)
+{
+ if (!hinic3_wq_free_wqebbs(wq))
+ return NULL;
+
+ return hinic3_wq_get_one_wqebb(wq, pi);
+}
+
+struct hinic3_cmd_buf *hinic3_alloc_cmd_buf(void *hwdev)
+{
+ struct hinic3_cmdqs *cmdqs = NULL;
+ struct hinic3_cmd_buf *cmd_buf = NULL;
+ void *dev = NULL;
+
+ if (!hwdev) {
+ pr_err("Failed to alloc cmd buf, Invalid hwdev\n");
+ return NULL;
+ }
+
+ cmdqs = ((struct hinic3_hwdev *)hwdev)->cmdqs;
+ dev = ((struct hinic3_hwdev *)hwdev)->dev_hdl;
+
+ cmd_buf = kzalloc(sizeof(*cmd_buf), GFP_ATOMIC);
+ if (!cmd_buf) {
+ sdk_err(dev, "Failed to allocate cmd buf\n");
+ return NULL;
+ }
+
+ cmd_buf->buf = pci_pool_alloc(cmdqs->cmd_buf_pool, GFP_ATOMIC,
+ &cmd_buf->dma_addr);
+ if (!cmd_buf->buf) {
+ sdk_err(dev, "Failed to allocate cmdq cmd buf from the pool\n");
+ goto alloc_pci_buf_err;
+ }
+
+ cmd_buf->size = HINIC3_CMDQ_BUF_SIZE;
+ atomic_set(&cmd_buf->ref_cnt, 1);
+
+ return cmd_buf;
+
+alloc_pci_buf_err:
+ kfree(cmd_buf);
+ return NULL;
+}
+EXPORT_SYMBOL(hinic3_alloc_cmd_buf);
+
+void hinic3_free_cmd_buf(void *hwdev, struct hinic3_cmd_buf *cmd_buf)
+{
+ struct hinic3_cmdqs *cmdqs = NULL;
+
+ if (!hwdev || !cmd_buf) {
+ pr_err("Failed to free cmd buf, hwdev or cmd_buf is NULL\n");
+ return;
+ }
+
+ if (!atomic_dec_and_test(&cmd_buf->ref_cnt))
+ return;
+
+ cmdqs = ((struct hinic3_hwdev *)hwdev)->cmdqs;
+
+ pci_pool_free(cmdqs->cmd_buf_pool, cmd_buf->buf, cmd_buf->dma_addr);
+ kfree(cmd_buf);
+}
+EXPORT_SYMBOL(hinic3_free_cmd_buf);
+
+static void cmdq_set_completion(struct hinic3_cmdq_completion *complete,
+ struct hinic3_cmd_buf *buf_out)
+{
+ struct hinic3_sge_resp *sge_resp = &complete->sge_resp;
+
+ hinic3_set_sge(&sge_resp->sge, buf_out->dma_addr,
+ HINIC3_CMDQ_BUF_SIZE);
+}
+
+static void cmdq_set_lcmd_bufdesc(struct hinic3_cmdq_wqe_lcmd *wqe,
+ struct hinic3_cmd_buf *buf_in)
+{
+ hinic3_set_sge(&wqe->buf_desc.sge, buf_in->dma_addr, buf_in->size);
+}
+
+static void cmdq_fill_db(struct hinic3_cmdq_db *db,
+ enum hinic3_cmdq_type cmdq_type, u16 prod_idx)
+{
+ db->db_info = CMDQ_DB_INFO_SET(UPPER_8_BITS(prod_idx), HI_PROD_IDX);
+
+ db->db_head = CMDQ_DB_HEAD_SET(HINIC3_DB_CMDQ_TYPE, QUEUE_TYPE) |
+ CMDQ_DB_HEAD_SET(cmdq_type, CMDQ_TYPE) |
+ CMDQ_DB_HEAD_SET(HINIC3_DB_SRC_CMDQ_TYPE, SRC_TYPE);
+}
+
+static void cmdq_set_db(struct hinic3_cmdq *cmdq,
+ enum hinic3_cmdq_type cmdq_type, u16 prod_idx)
+{
+ struct hinic3_cmdq_db db = {0};
+ u8 *db_base = cmdq->hwdev->cmdqs->cmdqs_db_base;
+
+ cmdq_fill_db(&db, cmdq_type, prod_idx);
+
+ /* The data that is written to HW should be in Big Endian Format */
+ db.db_info = hinic3_hw_be32(db.db_info);
+ db.db_head = hinic3_hw_be32(db.db_head);
+
+ wmb(); /* write all before the doorbell */
+ writeq(*((u64 *)&db), CMDQ_DB_ADDR(db_base, prod_idx));
+}
+
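+/* The first 8 bytes of a WQE hold the header with the HW busy bit, so
+ * the body is copied first and the header is written last, behind a
+ * write barrier, to keep HW from consuming a half-written WQE.
+ */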
+static void cmdq_wqe_fill(void *dst, const void *src)
+{
+ memcpy((u8 *)dst + FIRST_DATA_TO_WRITE_LAST,
+ (u8 *)src + FIRST_DATA_TO_WRITE_LAST,
+ CMDQ_WQE_SIZE - FIRST_DATA_TO_WRITE_LAST);
+
+ wmb(); /* The first 8 bytes should be written last */
+
+ *(u64 *)dst = *(u64 *)src;
+}
+
+static void cmdq_prepare_wqe_ctrl(struct hinic3_cmdq_wqe *wqe, int wrapped,
+ u8 mod, u8 cmd, u16 prod_idx,
+ enum completion_format complete_format,
+ enum data_format data_format,
+ enum bufdesc_len buf_len)
+{
+ struct hinic3_ctrl *ctrl = NULL;
+ enum ctrl_sect_len ctrl_len;
+ struct hinic3_cmdq_wqe_lcmd *wqe_lcmd = NULL;
+ struct hinic3_cmdq_wqe_scmd *wqe_scmd = NULL;
+ u32 saved_data = WQE_HEADER(wqe)->saved_data;
+
+ if (data_format == DATA_SGE) {
+ wqe_lcmd = &wqe->wqe_lcmd;
+
+ wqe_lcmd->status.status_info = 0;
+ ctrl = &wqe_lcmd->ctrl;
+ ctrl_len = CTRL_SECT_LEN;
+ } else {
+ wqe_scmd = &wqe->inline_wqe.wqe_scmd;
+
+ wqe_scmd->status.status_info = 0;
+ ctrl = &wqe_scmd->ctrl;
+ ctrl_len = CTRL_DIRECT_SECT_LEN;
+ }
+
+ ctrl->ctrl_info = CMDQ_CTRL_SET(prod_idx, PI) |
+ CMDQ_CTRL_SET(cmd, CMD) |
+ CMDQ_CTRL_SET(mod, MOD) |
+ CMDQ_CTRL_SET(HINIC3_ACK_TYPE_CMDQ, ACK_TYPE);
+
+ WQE_HEADER(wqe)->header_info =
+ CMDQ_WQE_HEADER_SET(buf_len, BUFDESC_LEN) |
+ CMDQ_WQE_HEADER_SET(complete_format, COMPLETE_FMT) |
+ CMDQ_WQE_HEADER_SET(data_format, DATA_FMT) |
+ CMDQ_WQE_HEADER_SET(CEQ_SET, COMPLETE_REQ) |
+ CMDQ_WQE_HEADER_SET(COMPLETE_LEN, COMPLETE_SECT_LEN) |
+ CMDQ_WQE_HEADER_SET(ctrl_len, CTRL_LEN) |
+ CMDQ_WQE_HEADER_SET((u32)wrapped, HW_BUSY_BIT);
+
+ if (cmd == CMDQ_SET_ARM_CMD && mod == HINIC3_MOD_COMM) {
+ saved_data &= SAVED_DATA_CLEAR(saved_data, ARM);
+ WQE_HEADER(wqe)->saved_data = saved_data |
+ SAVED_DATA_SET(1, ARM);
+ } else {
+ saved_data &= SAVED_DATA_CLEAR(saved_data, ARM);
+ WQE_HEADER(wqe)->saved_data = saved_data;
+ }
+}
+
+static void cmdq_set_lcmd_wqe(struct hinic3_cmdq_wqe *wqe,
+ enum cmdq_cmd_type cmd_type,
+ struct hinic3_cmd_buf *buf_in,
+ struct hinic3_cmd_buf *buf_out, int wrapped,
+ u8 mod, u8 cmd, u16 prod_idx)
+{
+ struct hinic3_cmdq_wqe_lcmd *wqe_lcmd = &wqe->wqe_lcmd;
+ enum completion_format complete_format = COMPLETE_DIRECT;
+
+ switch (cmd_type) {
+ case SYNC_CMD_DIRECT_RESP:
+ wqe_lcmd->completion.direct_resp = 0;
+ break;
+ case SYNC_CMD_SGE_RESP:
+ if (buf_out) {
+ complete_format = COMPLETE_SGE;
+ cmdq_set_completion(&wqe_lcmd->completion,
+ buf_out);
+ }
+ break;
+ case ASYNC_CMD:
+ wqe_lcmd->completion.direct_resp = 0;
+ wqe_lcmd->buf_desc.saved_async_buf = (u64)(buf_in);
+ break;
+ }
+
+ cmdq_prepare_wqe_ctrl(wqe, wrapped, mod, cmd, prod_idx, complete_format,
+ DATA_SGE, BUFDESC_LCMD_LEN);
+
+ cmdq_set_lcmd_bufdesc(wqe_lcmd, buf_in);
+}
+
+static void cmdq_update_cmd_status(struct hinic3_cmdq *cmdq, u16 prod_idx,
+ struct hinic3_cmdq_wqe *wqe)
+{
+ struct hinic3_cmdq_cmd_info *cmd_info;
+ struct hinic3_cmdq_wqe_lcmd *wqe_lcmd;
+ u32 status_info;
+
+ wqe_lcmd = &wqe->wqe_lcmd;
+ cmd_info = &cmdq->cmd_infos[prod_idx];
+
+ if (cmd_info->errcode) {
+ status_info = hinic3_hw_cpu32(wqe_lcmd->status.status_info);
+ *cmd_info->errcode = WQE_ERRCODE_GET(status_info, VAL);
+ }
+
+ if (cmd_info->direct_resp)
+ *cmd_info->direct_resp =
+ hinic3_hw_cpu32(wqe_lcmd->completion.direct_resp);
+}
+
+static int hinic3_cmdq_sync_timeout_check(struct hinic3_cmdq *cmdq,
+ struct hinic3_cmdq_wqe *wqe, u16 pi)
+{
+ struct hinic3_cmdq_wqe_lcmd *wqe_lcmd;
+ struct hinic3_ctrl *ctrl;
+ u32 ctrl_info;
+
+ wqe_lcmd = &wqe->wqe_lcmd;
+ ctrl = &wqe_lcmd->ctrl;
+ ctrl_info = hinic3_hw_cpu32((ctrl)->ctrl_info);
+ if (!WQE_COMPLETED(ctrl_info)) {
+ sdk_info(cmdq->hwdev->dev_hdl, "Cmdq sync command check busy bit not set\n");
+ return -EFAULT;
+ }
+
+ cmdq_update_cmd_status(cmdq, pi, wqe);
+
+ sdk_info(cmdq->hwdev->dev_hdl, "Cmdq sync command check succeed\n");
+ return 0;
+}
+
+static void clear_cmd_info(struct hinic3_cmdq_cmd_info *cmd_info,
+ const struct hinic3_cmdq_cmd_info *saved_cmd_info)
+{
+ if (cmd_info->errcode == saved_cmd_info->errcode)
+ cmd_info->errcode = NULL;
+
+ if (cmd_info->done == saved_cmd_info->done)
+ cmd_info->done = NULL;
+
+ if (cmd_info->direct_resp == saved_cmd_info->direct_resp)
+ cmd_info->direct_resp = NULL;
+}
+
+static int cmdq_ceq_handler_status(struct hinic3_cmdq *cmdq,
+ struct hinic3_cmdq_cmd_info *cmd_info,
+ struct hinic3_cmdq_cmd_info *saved_cmd_info,
+ u64 curr_msg_id, u16 curr_prod_idx,
+ struct hinic3_cmdq_wqe *curr_wqe,
+ u32 timeout)
+{
+ ulong timeo;
+ int err;
+ ulong end = jiffies + msecs_to_jiffies(timeout);
+
+ if (cmdq->hwdev->poll) {
+ while (time_before(jiffies, end)) {
+ hinic3_cmdq_ceq_handler(cmdq->hwdev, 0);
+ if (saved_cmd_info->done->done != 0)
+ return 0;
+ usleep_range(9, 10); /* sleep 9 us ~ 10 us */
+ }
+ } else {
+ timeo = msecs_to_jiffies(timeout);
+ if (wait_for_completion_timeout(saved_cmd_info->done, timeo))
+ return 0;
+ }
+
+ spin_lock_bh(&cmdq->cmdq_lock);
+
+ if (cmd_info->cmpt_code == saved_cmd_info->cmpt_code)
+ cmd_info->cmpt_code = NULL;
+
+ if (*saved_cmd_info->cmpt_code == CMDQ_COMPLETE_CMPT_CODE) {
+ sdk_info(cmdq->hwdev->dev_hdl, "Cmdq direct sync command has been completed\n");
+ spin_unlock_bh(&cmdq->cmdq_lock);
+ return 0;
+ }
+
+ if (curr_msg_id == cmd_info->cmdq_msg_id) {
+ err = hinic3_cmdq_sync_timeout_check(cmdq, curr_wqe,
+ curr_prod_idx);
+ if (err)
+ cmd_info->cmd_type = HINIC3_CMD_TYPE_TIMEOUT;
+ else
+ cmd_info->cmd_type = HINIC3_CMD_TYPE_FAKE_TIMEOUT;
+ } else {
+ err = -ETIMEDOUT;
+		sdk_err(cmdq->hwdev->dev_hdl, "Cmdq sync command current msg id mismatches cmd_info msg id\n");
+ }
+
+ clear_cmd_info(cmd_info, saved_cmd_info);
+
+ spin_unlock_bh(&cmdq->cmdq_lock);
+
+ if (err == 0)
+ return 0;
+
+ hinic3_dump_ceq_info(cmdq->hwdev);
+
+ return -ETIMEDOUT;
+}
+
+static int wait_cmdq_sync_cmd_completion(struct hinic3_cmdq *cmdq,
+ struct hinic3_cmdq_cmd_info *cmd_info,
+ struct hinic3_cmdq_cmd_info *saved_cmd_info,
+ u64 curr_msg_id, u16 curr_prod_idx,
+ struct hinic3_cmdq_wqe *curr_wqe, u32 timeout)
+{
+ return cmdq_ceq_handler_status(cmdq, cmd_info, saved_cmd_info,
+ curr_msg_id, curr_prod_idx,
+ curr_wqe, timeout);
+}
+
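+/* Serialize command submission; fails with -EAGAIN when channel locking is
+ * enabled and the channel has been stopped.
+ */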
+static int cmdq_msg_lock(struct hinic3_cmdq *cmdq, u16 channel)
+{
+ struct hinic3_cmdqs *cmdqs = cmdq_to_cmdqs(cmdq);
+
+	/* Keep wrapped and doorbell index consistent; _bh because the ceq tasklet also takes this lock */
+ spin_lock_bh(&cmdq->cmdq_lock);
+
+ if (cmdqs->lock_channel_en && test_bit(channel, &cmdqs->channel_stop)) {
+ spin_unlock_bh(&cmdq->cmdq_lock);
+ return -EAGAIN;
+ }
+
+ return 0;
+}
+
+static void cmdq_msg_unlock(struct hinic3_cmdq *cmdq)
+{
+ spin_unlock_bh(&cmdq->cmdq_lock);
+}
+
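+/* Release the command buffers recorded in cmd_info and clear the pointers */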
+static void cmdq_clear_cmd_buf(struct hinic3_cmdq_cmd_info *cmd_info,
+ struct hinic3_hwdev *hwdev)
+{
+ if (cmd_info->buf_in)
+ hinic3_free_cmd_buf(hwdev, cmd_info->buf_in);
+
+ if (cmd_info->buf_out)
+ hinic3_free_cmd_buf(hwdev, cmd_info->buf_out);
+
+ cmd_info->buf_in = NULL;
+ cmd_info->buf_out = NULL;
+}
+
+static void cmdq_set_cmd_buf(struct hinic3_cmdq_cmd_info *cmd_info,
+ struct hinic3_hwdev *hwdev,
+ struct hinic3_cmd_buf *buf_in,
+ struct hinic3_cmd_buf *buf_out)
+{
+ cmd_info->buf_in = buf_in;
+ cmd_info->buf_out = buf_out;
+
+ if (buf_in)
+ atomic_inc(&buf_in->ref_cnt);
+
+ if (buf_out)
+ atomic_inc(&buf_out->ref_cnt);
+}
+
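+/* Send a synchronous command whose result is returned inline: reserve a wqe,
+ * record the waiter state in cmd_infos[pi], ring the doorbell and wait for
+ * the CEQ handler (or polling mode) to complete the command.
+ */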
+static int cmdq_sync_cmd_direct_resp(struct hinic3_cmdq *cmdq, u8 mod,
+ u8 cmd, struct hinic3_cmd_buf *buf_in,
+ u64 *out_param, u32 timeout, u16 channel)
+{
+ struct hinic3_wq *wq = &cmdq->wq;
+ struct hinic3_cmdq_wqe *curr_wqe = NULL, wqe;
+ struct hinic3_cmdq_cmd_info *cmd_info = NULL, saved_cmd_info;
+ struct completion done;
+ u16 curr_prod_idx, next_prod_idx;
+ int wrapped, errcode = 0, wqe_size = WQE_LCMD_SIZE;
+ int cmpt_code = CMDQ_SEND_CMPT_CODE;
+ u64 curr_msg_id;
+ int err;
+ u32 real_timeout;
+
+ err = cmdq_msg_lock(cmdq, channel);
+ if (err)
+ return err;
+
+ curr_wqe = cmdq_get_wqe(wq, &curr_prod_idx);
+ if (!curr_wqe) {
+ cmdq_msg_unlock(cmdq);
+ return -EBUSY;
+ }
+
+ memset(&wqe, 0, sizeof(wqe));
+
+ wrapped = cmdq->wrapped;
+
+ next_prod_idx = curr_prod_idx + NUM_WQEBBS_FOR_CMDQ_WQE;
+ if (next_prod_idx >= wq->q_depth) {
+ cmdq->wrapped = (cmdq->wrapped == 0) ? 1 : 0;
+ next_prod_idx -= (u16)wq->q_depth;
+ }
+
+ cmd_info = &cmdq->cmd_infos[curr_prod_idx];
+
+ init_completion(&done);
+
+ cmd_info->cmd_type = HINIC3_CMD_TYPE_DIRECT_RESP;
+ cmd_info->done = &done;
+ cmd_info->errcode = &errcode;
+ cmd_info->direct_resp = out_param;
+ cmd_info->cmpt_code = &cmpt_code;
+ cmd_info->channel = channel;
+ cmdq_set_cmd_buf(cmd_info, cmdq->hwdev, buf_in, NULL);
+
+ memcpy(&saved_cmd_info, cmd_info, sizeof(*cmd_info));
+
+ cmdq_set_lcmd_wqe(&wqe, SYNC_CMD_DIRECT_RESP, buf_in, NULL,
+ wrapped, mod, cmd, curr_prod_idx);
+
+ /* The data that is written to HW should be in Big Endian Format */
+ hinic3_hw_be32_len(&wqe, wqe_size);
+
+	/* The CMDQ WQE is not shadowed, so the wqe is written directly to the wq */
+ cmdq_wqe_fill(curr_wqe, &wqe);
+
+ (cmd_info->cmdq_msg_id)++;
+ curr_msg_id = cmd_info->cmdq_msg_id;
+
+ cmdq_set_db(cmdq, HINIC3_CMDQ_SYNC, next_prod_idx);
+
+ cmdq_msg_unlock(cmdq);
+
+ real_timeout = timeout ? timeout : CMDQ_CMD_TIMEOUT;
+ err = wait_cmdq_sync_cmd_completion(cmdq, cmd_info, &saved_cmd_info,
+ curr_msg_id, curr_prod_idx,
+ curr_wqe, real_timeout);
+ if (err) {
+ sdk_err(cmdq->hwdev->dev_hdl, "Cmdq sync command(mod: %u, cmd: %u) timeout, prod idx: 0x%x\n",
+ mod, cmd, curr_prod_idx);
+ err = -ETIMEDOUT;
+ }
+
+ if (cmpt_code == CMDQ_FORCE_STOP_CMPT_CODE) {
+ sdk_info(cmdq->hwdev->dev_hdl, "Force stop cmdq cmd, mod: %u, cmd: %u\n",
+ mod, cmd);
+ err = -EAGAIN;
+ }
+
+ destroy_completion(&done);
+ smp_rmb(); /* read error code after completion */
+
+ return (err != 0) ? err : errcode;
+}
+
+static int cmdq_sync_cmd_detail_resp(struct hinic3_cmdq *cmdq, u8 mod, u8 cmd,
+ struct hinic3_cmd_buf *buf_in,
+ struct hinic3_cmd_buf *buf_out,
+ u64 *out_param, u32 timeout, u16 channel)
+{
+ struct hinic3_wq *wq = &cmdq->wq;
+ struct hinic3_cmdq_wqe *curr_wqe = NULL, wqe;
+ struct hinic3_cmdq_cmd_info *cmd_info = NULL, saved_cmd_info;
+ struct completion done;
+ u16 curr_prod_idx, next_prod_idx;
+ int wrapped, errcode = 0, wqe_size = WQE_LCMD_SIZE;
+ int cmpt_code = CMDQ_SEND_CMPT_CODE;
+ u64 curr_msg_id;
+ int err;
+ u32 real_timeout;
+
+ err = cmdq_msg_lock(cmdq, channel);
+ if (err)
+ return err;
+
+ curr_wqe = cmdq_get_wqe(wq, &curr_prod_idx);
+ if (!curr_wqe) {
+ cmdq_msg_unlock(cmdq);
+ return -EBUSY;
+ }
+
+ memset(&wqe, 0, sizeof(wqe));
+
+ wrapped = cmdq->wrapped;
+
+ next_prod_idx = curr_prod_idx + NUM_WQEBBS_FOR_CMDQ_WQE;
+ if (next_prod_idx >= wq->q_depth) {
+ cmdq->wrapped = (cmdq->wrapped == 0) ? 1 : 0;
+ next_prod_idx -= (u16)wq->q_depth;
+ }
+
+ cmd_info = &cmdq->cmd_infos[curr_prod_idx];
+
+ init_completion(&done);
+
+ cmd_info->cmd_type = HINIC3_CMD_TYPE_SGE_RESP;
+ cmd_info->done = &done;
+ cmd_info->errcode = &errcode;
+ cmd_info->direct_resp = out_param;
+ cmd_info->cmpt_code = &cmpt_code;
+ cmd_info->channel = channel;
+ cmdq_set_cmd_buf(cmd_info, cmdq->hwdev, buf_in, buf_out);
+
+ memcpy(&saved_cmd_info, cmd_info, sizeof(*cmd_info));
+
+ cmdq_set_lcmd_wqe(&wqe, SYNC_CMD_SGE_RESP, buf_in, buf_out,
+ wrapped, mod, cmd, curr_prod_idx);
+
+ hinic3_hw_be32_len(&wqe, wqe_size);
+
+ cmdq_wqe_fill(curr_wqe, &wqe);
+
+ (cmd_info->cmdq_msg_id)++;
+ curr_msg_id = cmd_info->cmdq_msg_id;
+
+ cmdq_set_db(cmdq, cmdq->cmdq_type, next_prod_idx);
+
+ cmdq_msg_unlock(cmdq);
+
+ real_timeout = timeout ? timeout : CMDQ_CMD_TIMEOUT;
+ err = wait_cmdq_sync_cmd_completion(cmdq, cmd_info, &saved_cmd_info,
+ curr_msg_id, curr_prod_idx,
+ curr_wqe, real_timeout);
+ if (err) {
+ sdk_err(cmdq->hwdev->dev_hdl, "Cmdq sync command(mod: %u, cmd: %u) timeout, prod idx: 0x%x\n",
+ mod, cmd, curr_prod_idx);
+ err = -ETIMEDOUT;
+ }
+
+ if (cmpt_code == CMDQ_FORCE_STOP_CMPT_CODE) {
+ sdk_info(cmdq->hwdev->dev_hdl, "Force stop cmdq cmd, mod: %u, cmd: %u\n",
+ mod, cmd);
+ err = -EAGAIN;
+ }
+
+ destroy_completion(&done);
+ smp_rmb(); /* read error code after completion */
+
+ return (err != 0) ? err : errcode;
+}
+
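+/* Send an asynchronous command: the wqe is posted and the doorbell rung
+ * without waiting for completion; the buffer is released from the CEQ
+ * handler once the command finishes.
+ */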
+static int cmdq_async_cmd(struct hinic3_cmdq *cmdq, u8 mod, u8 cmd,
+ struct hinic3_cmd_buf *buf_in, u16 channel)
+{
+ struct hinic3_cmdq_cmd_info *cmd_info = NULL;
+ struct hinic3_wq *wq = &cmdq->wq;
+ int wqe_size = WQE_LCMD_SIZE;
+ u16 curr_prod_idx, next_prod_idx;
+ struct hinic3_cmdq_wqe *curr_wqe = NULL, wqe;
+ int wrapped, err;
+
+ err = cmdq_msg_lock(cmdq, channel);
+ if (err)
+ return err;
+
+ curr_wqe = cmdq_get_wqe(wq, &curr_prod_idx);
+ if (!curr_wqe) {
+ cmdq_msg_unlock(cmdq);
+ return -EBUSY;
+ }
+
+ memset(&wqe, 0, sizeof(wqe));
+
+ wrapped = cmdq->wrapped;
+ next_prod_idx = curr_prod_idx + NUM_WQEBBS_FOR_CMDQ_WQE;
+ if (next_prod_idx >= wq->q_depth) {
+ cmdq->wrapped = (cmdq->wrapped == 0) ? 1 : 0;
+ next_prod_idx -= (u16)wq->q_depth;
+ }
+
+ cmdq_set_lcmd_wqe(&wqe, ASYNC_CMD, buf_in, NULL, wrapped,
+ mod, cmd, curr_prod_idx);
+
+ /* The data that is written to HW should be in Big Endian Format */
+ hinic3_hw_be32_len(&wqe, wqe_size);
+ cmdq_wqe_fill(curr_wqe, &wqe);
+
+ cmd_info = &cmdq->cmd_infos[curr_prod_idx];
+ cmd_info->cmd_type = HINIC3_CMD_TYPE_ASYNC;
+ cmd_info->channel = channel;
+ /* The caller will not free the cmd_buf of the asynchronous command,
+ * so there is no need to increase the reference count here
+ */
+ cmd_info->buf_in = buf_in;
+
+	/* For LB mode 1 compatibility, cmdq 0 is also used for async commands (sync_no_wait) */
+ cmdq_set_db(cmdq, HINIC3_CMDQ_SYNC, next_prod_idx);
+
+ cmdq_msg_unlock(cmdq);
+
+ return 0;
+}
+
+static int cmdq_params_valid(const void *hwdev, const struct hinic3_cmd_buf *buf_in)
+{
+ if (!buf_in || !hwdev) {
+ pr_err("Invalid CMDQ buffer addr or hwdev\n");
+ return -EINVAL;
+ }
+
+ if (!buf_in->size || buf_in->size > HINIC3_CMDQ_BUF_SIZE) {
+ pr_err("Invalid CMDQ buffer size: 0x%x\n", buf_in->size);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+#define WAIT_CMDQ_ENABLE_TIMEOUT 300
+static int wait_cmdqs_enable(struct hinic3_cmdqs *cmdqs)
+{
+ unsigned long end;
+
+ end = jiffies + msecs_to_jiffies(WAIT_CMDQ_ENABLE_TIMEOUT);
+ do {
+ if (cmdqs->status & HINIC3_CMDQ_ENABLE)
+ return 0;
+ } while (time_before(jiffies, end) && cmdqs->hwdev->chip_present_flag &&
+ !cmdqs->disable_flag);
+
+ cmdqs->disable_flag = 1;
+
+ return -EBUSY;
+}
+
+int hinic3_cmdq_direct_resp(void *hwdev, u8 mod, u8 cmd,
+ struct hinic3_cmd_buf *buf_in,
+ u64 *out_param, u32 timeout, u16 channel)
+{
+ struct hinic3_cmdqs *cmdqs = NULL;
+ int err;
+
+ err = cmdq_params_valid(hwdev, buf_in);
+ if (err) {
+ pr_err("Invalid CMDQ parameters\n");
+ return err;
+ }
+
+ if (!get_card_present_state((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ cmdqs = ((struct hinic3_hwdev *)hwdev)->cmdqs;
+ err = wait_cmdqs_enable(cmdqs);
+ if (err) {
+		sdk_err(cmdqs->hwdev->dev_hdl, "Cmdq is disabled\n");
+ return err;
+ }
+
+ err = cmdq_sync_cmd_direct_resp(&cmdqs->cmdq[HINIC3_CMDQ_SYNC],
+ mod, cmd, buf_in, out_param,
+ timeout, channel);
+
+ if (!(((struct hinic3_hwdev *)hwdev)->chip_present_flag))
+ return -ETIMEDOUT;
+ else
+ return err;
+}
+EXPORT_SYMBOL(hinic3_cmdq_direct_resp);
+
+int hinic3_cmdq_detail_resp(void *hwdev, u8 mod, u8 cmd,
+ struct hinic3_cmd_buf *buf_in,
+ struct hinic3_cmd_buf *buf_out,
+ u64 *out_param, u32 timeout, u16 channel)
+{
+ struct hinic3_cmdqs *cmdqs = NULL;
+ int err;
+
+ err = cmdq_params_valid(hwdev, buf_in);
+ if (err)
+ return err;
+
+ cmdqs = ((struct hinic3_hwdev *)hwdev)->cmdqs;
+
+ if (!get_card_present_state((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ err = wait_cmdqs_enable(cmdqs);
+ if (err) {
+		sdk_err(cmdqs->hwdev->dev_hdl, "Cmdq is disabled\n");
+ return err;
+ }
+
+ err = cmdq_sync_cmd_detail_resp(&cmdqs->cmdq[HINIC3_CMDQ_SYNC],
+ mod, cmd, buf_in, buf_out, out_param,
+ timeout, channel);
+ if (!(((struct hinic3_hwdev *)hwdev)->chip_present_flag))
+ return -ETIMEDOUT;
+ else
+ return err;
+}
+EXPORT_SYMBOL(hinic3_cmdq_detail_resp);
+
+int hinic3_cos_id_detail_resp(void *hwdev, u8 mod, u8 cmd, u8 cos_id,
+ struct hinic3_cmd_buf *buf_in,
+ struct hinic3_cmd_buf *buf_out, u64 *out_param,
+ u32 timeout, u16 channel)
+{
+ struct hinic3_cmdqs *cmdqs = NULL;
+ int err;
+
+ err = cmdq_params_valid(hwdev, buf_in);
+ if (err)
+ return err;
+
+ cmdqs = ((struct hinic3_hwdev *)hwdev)->cmdqs;
+
+ if (!get_card_present_state((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ err = wait_cmdqs_enable(cmdqs);
+ if (err) {
+		sdk_err(cmdqs->hwdev->dev_hdl, "Cmdq is disabled\n");
+ return err;
+ }
+
+ if (cos_id >= cmdqs->cmdq_num) {
+ sdk_err(cmdqs->hwdev->dev_hdl, "Cmdq id is invalid\n");
+ return -EINVAL;
+ }
+
+ err = cmdq_sync_cmd_detail_resp(&cmdqs->cmdq[cos_id], mod, cmd,
+ buf_in, buf_out, out_param,
+ timeout, channel);
+ if (!(((struct hinic3_hwdev *)hwdev)->chip_present_flag))
+ return -ETIMEDOUT;
+ else
+ return err;
+}
+EXPORT_SYMBOL(hinic3_cos_id_detail_resp);
+
+int hinic3_cmdq_async(void *hwdev, u8 mod, u8 cmd, struct hinic3_cmd_buf *buf_in, u16 channel)
+{
+ struct hinic3_cmdqs *cmdqs = NULL;
+ int err;
+
+ err = cmdq_params_valid(hwdev, buf_in);
+ if (err)
+ return err;
+
+ cmdqs = ((struct hinic3_hwdev *)hwdev)->cmdqs;
+
+ if (!get_card_present_state((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ err = wait_cmdqs_enable(cmdqs);
+ if (err) {
+		sdk_err(cmdqs->hwdev->dev_hdl, "Cmdq is disabled\n");
+ return err;
+ }
+	/* For LB mode 1 compatibility, cmdq 0 is also used for async commands (sync_no_wait) */
+ return cmdq_async_cmd(&cmdqs->cmdq[HINIC3_CMDQ_SYNC], mod,
+ cmd, buf_in, channel);
+}
+
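+/* Clear the HW busy bit in the wqe ctrl and return the wqebbs to the wq so
+ * the slot can be reused.
+ */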
+static void clear_wqe_complete_bit(struct hinic3_cmdq *cmdq,
+ struct hinic3_cmdq_wqe *wqe, u16 ci)
+{
+ struct hinic3_ctrl *ctrl = NULL;
+ u32 header_info = hinic3_hw_cpu32(WQE_HEADER(wqe)->header_info);
+ enum data_format df = CMDQ_WQE_HEADER_GET(header_info, DATA_FMT);
+
+ if (df == DATA_SGE)
+ ctrl = &wqe->wqe_lcmd.ctrl;
+ else
+ ctrl = &wqe->inline_wqe.wqe_scmd.ctrl;
+
+ /* clear HW busy bit */
+ ctrl->ctrl_info = 0;
+ cmdq->cmd_infos[ci].cmd_type = HINIC3_CMD_TYPE_NONE;
+
+	wmb(); /* make the wqe clear visible before releasing the wqebbs */
+
+ hinic3_wq_put_wqebbs(&cmdq->wq, NUM_WQEBBS_FOR_CMDQ_WQE);
+}
+
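+/* Completion path for sync commands: harvest the errcode/response, signal
+ * the waiter under cmdq_lock, then release the buffers and recycle the wqe.
+ */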
+static void cmdq_sync_cmd_handler(struct hinic3_cmdq *cmdq,
+ struct hinic3_cmdq_wqe *wqe, u16 ci)
+{
+ spin_lock(&cmdq->cmdq_lock);
+
+ cmdq_update_cmd_status(cmdq, ci, wqe);
+
+ if (cmdq->cmd_infos[ci].cmpt_code) {
+ *cmdq->cmd_infos[ci].cmpt_code = CMDQ_COMPLETE_CMPT_CODE;
+ cmdq->cmd_infos[ci].cmpt_code = NULL;
+ }
+
+	/* make sure the cmpt_code update happens before the done completion */
+ smp_rmb();
+
+ if (cmdq->cmd_infos[ci].done) {
+ complete(cmdq->cmd_infos[ci].done);
+ cmdq->cmd_infos[ci].done = NULL;
+ }
+
+ spin_unlock(&cmdq->cmdq_lock);
+
+ cmdq_clear_cmd_buf(&cmdq->cmd_infos[ci], cmdq->hwdev);
+ clear_wqe_complete_bit(cmdq, wqe, ci);
+}
+
+static void cmdq_async_cmd_handler(struct hinic3_hwdev *hwdev,
+ struct hinic3_cmdq *cmdq,
+ struct hinic3_cmdq_wqe *wqe, u16 ci)
+{
+ cmdq_clear_cmd_buf(&cmdq->cmd_infos[ci], hwdev);
+ clear_wqe_complete_bit(cmdq, wqe, ci);
+}
+
+static int cmdq_arm_ceq_handler(struct hinic3_cmdq *cmdq,
+ struct hinic3_cmdq_wqe *wqe, u16 ci)
+{
+ struct hinic3_ctrl *ctrl = &wqe->inline_wqe.wqe_scmd.ctrl;
+ u32 ctrl_info = hinic3_hw_cpu32((ctrl)->ctrl_info);
+
+ if (!WQE_COMPLETED(ctrl_info))
+ return -EBUSY;
+
+ clear_wqe_complete_bit(cmdq, wqe, ci);
+
+ return 0;
+}
+
+#define HINIC3_CMDQ_WQE_HEAD_LEN 32
+static void hinic3_dump_cmdq_wqe_head(struct hinic3_hwdev *hwdev,
+ struct hinic3_cmdq_wqe *wqe)
+{
+ u32 i;
+ u32 *data = (u32 *)wqe;
+
+ for (i = 0; i < (HINIC3_CMDQ_WQE_HEAD_LEN / sizeof(u32)); i += 0x4) {
+ sdk_info(hwdev->dev_hdl, "wqe data: 0x%08x, 0x%08x, 0x%08x, 0x%08x\n",
+ *(data + i), *(data + i + 0x1), *(data + i + 0x2),
+ *(data + i + 0x3));
+ }
+}
+
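+/* CEQ handler: consume completed wqes in order, dispatching on the cmd_type
+ * recorded at send time, and stop at the first wqe that is not finished yet.
+ */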
+void hinic3_cmdq_ceq_handler(void *handle, u32 ceqe_data)
+{
+ struct hinic3_cmdqs *cmdqs = ((struct hinic3_hwdev *)handle)->cmdqs;
+ enum hinic3_cmdq_type cmdq_type = CEQE_CMDQ_GET(ceqe_data, TYPE);
+ struct hinic3_cmdq *cmdq = &cmdqs->cmdq[cmdq_type];
+ struct hinic3_hwdev *hwdev = cmdqs->hwdev;
+ struct hinic3_cmdq_wqe *wqe = NULL;
+ struct hinic3_cmdq_wqe_lcmd *wqe_lcmd = NULL;
+ struct hinic3_ctrl *ctrl = NULL;
+ struct hinic3_cmdq_cmd_info *cmd_info = NULL;
+ u16 ci;
+
+ while ((wqe = cmdq_read_wqe(&cmdq->wq, &ci)) != NULL) {
+ cmd_info = &cmdq->cmd_infos[ci];
+
+ switch (cmd_info->cmd_type) {
+ case HINIC3_CMD_TYPE_NONE:
+ return;
+ case HINIC3_CMD_TYPE_TIMEOUT:
+ sdk_warn(hwdev->dev_hdl, "Cmdq timeout, q_id: %u, ci: %u\n",
+ cmdq_type, ci);
+ hinic3_dump_cmdq_wqe_head(hwdev, wqe);
+ /*lint -fallthrough */
+ case HINIC3_CMD_TYPE_FAKE_TIMEOUT:
+ cmdq_clear_cmd_buf(cmd_info, hwdev);
+ clear_wqe_complete_bit(cmdq, wqe, ci);
+ break;
+ case HINIC3_CMD_TYPE_SET_ARM:
+			/* the arm bit has been set by this point */
+ if (cmdq_arm_ceq_handler(cmdq, wqe, ci))
+ return;
+ break;
+ default:
+			/* only the arm bit uses an scmd wqe; every other command uses an lcmd wqe */
+ wqe_lcmd = &wqe->wqe_lcmd;
+ ctrl = &wqe_lcmd->ctrl;
+ if (!WQE_COMPLETED(hinic3_hw_cpu32((ctrl)->ctrl_info)))
+ return;
+
+ dma_rmb();
+ /* For FORCE_STOP cmd_type, we also need to wait for
+ * the firmware processing to complete to prevent the
+ * firmware from accessing the released cmd_buf
+ */
+ if (cmd_info->cmd_type == HINIC3_CMD_TYPE_FORCE_STOP) {
+ cmdq_clear_cmd_buf(cmd_info, hwdev);
+ clear_wqe_complete_bit(cmdq, wqe, ci);
+ } else if (cmd_info->cmd_type == HINIC3_CMD_TYPE_ASYNC) {
+ cmdq_async_cmd_handler(hwdev, cmdq, wqe, ci);
+ } else {
+ cmdq_sync_cmd_handler(cmdq, wqe, ci);
+ }
+
+ break;
+ }
+ }
+}
+
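+/* Fill the cmdq context that is written to HW: pfn of the current wqe page,
+ * CEQ attributes, the start consumer index and the wq block pfn used for
+ * 1-level CLA.
+ */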
+static void cmdq_init_queue_ctxt(struct hinic3_cmdqs *cmdqs,
+ struct hinic3_cmdq *cmdq,
+ struct cmdq_ctxt_info *ctxt_info)
+{
+ struct hinic3_wq *wq = &cmdq->wq;
+ u64 cmdq_first_block_paddr, pfn;
+ u16 start_ci = (u16)wq->cons_idx;
+
+ pfn = CMDQ_PFN(hinic3_wq_get_first_wqe_page_addr(wq));
+
+ ctxt_info->curr_wqe_page_pfn =
+ CMDQ_CTXT_PAGE_INFO_SET(1, HW_BUSY_BIT) |
+ CMDQ_CTXT_PAGE_INFO_SET(1, CEQ_EN) |
+ CMDQ_CTXT_PAGE_INFO_SET(1, CEQ_ARM) |
+ CMDQ_CTXT_PAGE_INFO_SET(HINIC3_CEQ_ID_CMDQ, EQ_ID) |
+ CMDQ_CTXT_PAGE_INFO_SET(pfn, CURR_WQE_PAGE_PFN);
+
+ if (!WQ_IS_0_LEVEL_CLA(wq)) {
+ cmdq_first_block_paddr = cmdqs->wq_block_paddr;
+ pfn = CMDQ_PFN(cmdq_first_block_paddr);
+ }
+
+ ctxt_info->wq_block_pfn = CMDQ_CTXT_BLOCK_INFO_SET(start_ci, CI) |
+ CMDQ_CTXT_BLOCK_INFO_SET(pfn, WQ_BLOCK_PFN);
+}
+
+static int init_cmdq(struct hinic3_cmdq *cmdq, struct hinic3_hwdev *hwdev,
+ enum hinic3_cmdq_type q_type)
+{
+ int err;
+
+ cmdq->cmdq_type = q_type;
+ cmdq->wrapped = 1;
+ cmdq->hwdev = hwdev;
+
+ spin_lock_init(&cmdq->cmdq_lock);
+
+ cmdq->cmd_infos = kcalloc(cmdq->wq.q_depth, sizeof(*cmdq->cmd_infos),
+ GFP_KERNEL);
+ if (!cmdq->cmd_infos) {
+ sdk_err(hwdev->dev_hdl, "Failed to allocate cmdq infos\n");
+ err = -ENOMEM;
+ goto cmd_infos_err;
+ }
+
+ return 0;
+
+cmd_infos_err:
+ spin_lock_deinit(&cmdq->cmdq_lock);
+
+ return err;
+}
+
+static void free_cmdq(struct hinic3_cmdq *cmdq)
+{
+ kfree(cmdq->cmd_infos);
+ spin_lock_deinit(&cmdq->cmdq_lock);
+}
+
+static int hinic3_set_cmdq_ctxts(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_cmdqs *cmdqs = hwdev->cmdqs;
+ u8 cmdq_type;
+ int err;
+
+ cmdq_type = HINIC3_CMDQ_SYNC;
+ for (; cmdq_type < cmdqs->cmdq_num; cmdq_type++) {
+ err = hinic3_set_cmdq_ctxt(hwdev, cmdq_type,
+ &cmdqs->cmdq[cmdq_type].cmdq_ctxt);
+ if (err)
+ return err;
+ }
+
+ cmdqs->status |= HINIC3_CMDQ_ENABLE;
+ cmdqs->disable_flag = 0;
+
+ return 0;
+}
+
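+/* Force-complete a pending sync command (used when a channel is stopped or
+ * the queue is flushed): mark it FORCE_STOP and wake up the waiter.
+ */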
+static void cmdq_flush_sync_cmd(struct hinic3_cmdq_cmd_info *cmd_info)
+{
+ if (cmd_info->cmd_type != HINIC3_CMD_TYPE_DIRECT_RESP &&
+ cmd_info->cmd_type != HINIC3_CMD_TYPE_SGE_RESP)
+ return;
+
+ cmd_info->cmd_type = HINIC3_CMD_TYPE_FORCE_STOP;
+
+ if (cmd_info->cmpt_code &&
+ *cmd_info->cmpt_code == CMDQ_SEND_CMPT_CODE)
+ *cmd_info->cmpt_code = CMDQ_FORCE_STOP_CMPT_CODE;
+
+ if (cmd_info->done) {
+ complete(cmd_info->done);
+ cmd_info->done = NULL;
+ cmd_info->cmpt_code = NULL;
+ cmd_info->direct_resp = NULL;
+ cmd_info->errcode = NULL;
+ }
+}
+
+void hinic3_cmdq_flush_cmd(struct hinic3_hwdev *hwdev,
+ struct hinic3_cmdq *cmdq)
+{
+ struct hinic3_cmdq_cmd_info *cmd_info = NULL;
+ u16 ci = 0;
+
+ spin_lock_bh(&cmdq->cmdq_lock);
+
+ while (cmdq_read_wqe(&cmdq->wq, &ci)) {
+ hinic3_wq_put_wqebbs(&cmdq->wq, NUM_WQEBBS_FOR_CMDQ_WQE);
+ cmd_info = &cmdq->cmd_infos[ci];
+
+ if (cmd_info->cmd_type == HINIC3_CMD_TYPE_DIRECT_RESP ||
+ cmd_info->cmd_type == HINIC3_CMD_TYPE_SGE_RESP)
+ cmdq_flush_sync_cmd(cmd_info);
+ }
+
+ spin_unlock_bh(&cmdq->cmdq_lock);
+}
+
+static void hinic3_cmdq_flush_channel_sync_cmd(struct hinic3_hwdev *hwdev, u16 channel)
+{
+ struct hinic3_cmdq_cmd_info *cmd_info = NULL;
+ struct hinic3_cmdq *cmdq = NULL;
+ struct hinic3_wq *wq = NULL;
+ u16 wqe_cnt, ci, i;
+
+ if (channel >= HINIC3_CHANNEL_MAX)
+ return;
+
+ cmdq = &hwdev->cmdqs->cmdq[HINIC3_CMDQ_SYNC];
+
+ spin_lock_bh(&cmdq->cmdq_lock);
+
+ wq = &cmdq->wq;
+ ci = wq->cons_idx;
+ wqe_cnt = (u16)WQ_MASK_IDX(wq, wq->prod_idx +
+ wq->q_depth - wq->cons_idx);
+ for (i = 0; i < wqe_cnt; i++) {
+ cmd_info = &cmdq->cmd_infos[WQ_MASK_IDX(wq, ci + i)];
+ if (cmd_info->channel == channel)
+ cmdq_flush_sync_cmd(cmd_info);
+ }
+
+ spin_unlock_bh(&cmdq->cmdq_lock);
+}
+
+void hinic3_cmdq_flush_sync_cmd(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_cmdq_cmd_info *cmd_info = NULL;
+ struct hinic3_cmdq *cmdq = NULL;
+ struct hinic3_wq *wq = NULL;
+ u16 wqe_cnt, ci, i;
+
+ cmdq = &hwdev->cmdqs->cmdq[HINIC3_CMDQ_SYNC];
+
+ spin_lock_bh(&cmdq->cmdq_lock);
+
+ wq = &cmdq->wq;
+ ci = wq->cons_idx;
+ wqe_cnt = (u16)WQ_MASK_IDX(wq, wq->prod_idx +
+ wq->q_depth - wq->cons_idx);
+ for (i = 0; i < wqe_cnt; i++) {
+ cmd_info = &cmdq->cmd_infos[WQ_MASK_IDX(wq, ci + i)];
+ cmdq_flush_sync_cmd(cmd_info);
+ }
+
+ spin_unlock_bh(&cmdq->cmdq_lock);
+}
+
+static void cmdq_reset_all_cmd_buff(struct hinic3_cmdq *cmdq)
+{
+ u16 i;
+
+ for (i = 0; i < cmdq->wq.q_depth; i++)
+ cmdq_clear_cmd_buf(&cmdq->cmd_infos[i], cmdq->hwdev);
+}
+
+int hinic3_cmdq_set_channel_status(struct hinic3_hwdev *hwdev, u16 channel,
+ bool enable)
+{
+ if (channel >= HINIC3_CHANNEL_MAX)
+ return -EINVAL;
+
+ if (enable) {
+ clear_bit(channel, &hwdev->cmdqs->channel_stop);
+ } else {
+ set_bit(channel, &hwdev->cmdqs->channel_stop);
+ hinic3_cmdq_flush_channel_sync_cmd(hwdev, channel);
+ }
+
+ sdk_info(hwdev->dev_hdl, "%s cmdq channel 0x%x\n",
+ enable ? "Enable" : "Disable", channel);
+
+ return 0;
+}
+
+void hinic3_cmdq_enable_channel_lock(struct hinic3_hwdev *hwdev, bool enable)
+{
+ hwdev->cmdqs->lock_channel_en = enable;
+
+ sdk_info(hwdev->dev_hdl, "%s cmdq channel lock\n",
+ enable ? "Enable" : "Disable");
+}
+
+int hinic3_reinit_cmdq_ctxts(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_cmdqs *cmdqs = hwdev->cmdqs;
+ u8 cmdq_type;
+
+ cmdq_type = HINIC3_CMDQ_SYNC;
+ for (; cmdq_type < cmdqs->cmdq_num; cmdq_type++) {
+ hinic3_cmdq_flush_cmd(hwdev, &cmdqs->cmdq[cmdq_type]);
+ cmdq_reset_all_cmd_buff(&cmdqs->cmdq[cmdq_type]);
+ cmdqs->cmdq[cmdq_type].wrapped = 1;
+ hinic3_wq_reset(&cmdqs->cmdq[cmdq_type].wq);
+ }
+
+ return hinic3_set_cmdq_ctxts(hwdev);
+}
+
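+/* Create one wq per cmdq; with 1-level CLA the wq page addresses of all
+ * cmdqs are additionally gathered into a single DMA-coherent wq block.
+ */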
+static int create_cmdq_wq(struct hinic3_cmdqs *cmdqs)
+{
+ u8 type, cmdq_type;
+ int err;
+
+ cmdq_type = HINIC3_CMDQ_SYNC;
+ for (; cmdq_type < cmdqs->cmdq_num; cmdq_type++) {
+ err = hinic3_wq_create(cmdqs->hwdev, &cmdqs->cmdq[cmdq_type].wq,
+ HINIC3_CMDQ_DEPTH, CMDQ_WQEBB_SIZE);
+ if (err) {
+ sdk_err(cmdqs->hwdev->dev_hdl, "Failed to create cmdq wq\n");
+ goto destroy_wq;
+ }
+ }
+
+	/* With 1-level CLA, the wq page addresses of all cmdqs must be put in one wq block */
+ if (!WQ_IS_0_LEVEL_CLA(&cmdqs->cmdq[HINIC3_CMDQ_SYNC].wq)) {
+ /* cmdq wq's CLA table is up to 512B */
+#define CMDQ_WQ_CLA_SIZE 512
+ if (cmdqs->cmdq[HINIC3_CMDQ_SYNC].wq.num_wq_pages >
+ CMDQ_WQ_CLA_SIZE / sizeof(u64)) {
+ err = -EINVAL;
+			sdk_err(cmdqs->hwdev->dev_hdl, "Cmdq wq pages exceed limit: %lu\n",
+ CMDQ_WQ_CLA_SIZE / sizeof(u64));
+ goto destroy_wq;
+ }
+
+ cmdqs->wq_block_vaddr =
+ dma_zalloc_coherent(cmdqs->hwdev->dev_hdl, PAGE_SIZE,
+ &cmdqs->wq_block_paddr, GFP_KERNEL);
+ if (!cmdqs->wq_block_vaddr) {
+ err = -ENOMEM;
+ sdk_err(cmdqs->hwdev->dev_hdl, "Failed to alloc cmdq wq block\n");
+ goto destroy_wq;
+ }
+
+ type = HINIC3_CMDQ_SYNC;
+ for (; type < cmdqs->cmdq_num; type++)
+ memcpy((u8 *)cmdqs->wq_block_vaddr +
+ CMDQ_WQ_CLA_SIZE * type,
+ cmdqs->cmdq[type].wq.wq_block_vaddr,
+ cmdqs->cmdq[type].wq.num_wq_pages * sizeof(u64));
+ }
+
+ return 0;
+
+destroy_wq:
+ type = HINIC3_CMDQ_SYNC;
+ for (; type < cmdq_type; type++)
+ hinic3_wq_destroy(&cmdqs->cmdq[type].wq);
+
+ return err;
+}
+
+static void destroy_cmdq_wq(struct hinic3_cmdqs *cmdqs)
+{
+ u8 cmdq_type;
+
+ if (cmdqs->wq_block_vaddr)
+ dma_free_coherent(cmdqs->hwdev->dev_hdl, PAGE_SIZE,
+ cmdqs->wq_block_vaddr, cmdqs->wq_block_paddr);
+
+ cmdq_type = HINIC3_CMDQ_SYNC;
+ for (; cmdq_type < cmdqs->cmdq_num; cmdq_type++)
+ hinic3_wq_destroy(&cmdqs->cmdq[cmdq_type].wq);
+}
+
+static int init_cmdqs(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_cmdqs *cmdqs = NULL;
+ u8 cmdq_num;
+ int err = -ENOMEM;
+
+ if (COMM_SUPPORT_CMDQ_NUM(hwdev)) {
+ cmdq_num = hwdev->glb_attr.cmdq_num;
+ if (hwdev->glb_attr.cmdq_num > HINIC3_MAX_CMDQ_TYPES) {
+ sdk_warn(hwdev->dev_hdl, "Adjust cmdq num to %d\n", HINIC3_MAX_CMDQ_TYPES);
+ cmdq_num = HINIC3_MAX_CMDQ_TYPES;
+ }
+ } else {
+ cmdq_num = HINIC3_MAX_CMDQ_TYPES;
+ }
+
+ cmdqs = kzalloc(sizeof(*cmdqs), GFP_KERNEL);
+ if (!cmdqs)
+ return err;
+
+ hwdev->cmdqs = cmdqs;
+ cmdqs->hwdev = hwdev;
+ cmdqs->cmdq_num = cmdq_num;
+
+ cmdqs->cmd_buf_pool = dma_pool_create("hinic3_cmdq", hwdev->dev_hdl,
+ HINIC3_CMDQ_BUF_SIZE, HINIC3_CMDQ_BUF_SIZE, 0ULL);
+ if (!cmdqs->cmd_buf_pool) {
+ sdk_err(hwdev->dev_hdl, "Failed to create cmdq buffer pool\n");
+ goto pool_create_err;
+ }
+
+ return 0;
+
+pool_create_err:
+ kfree(cmdqs);
+
+ return err;
+}
+
+int hinic3_cmdqs_init(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_cmdqs *cmdqs = NULL;
+ void __iomem *db_base = NULL;
+ u8 type, cmdq_type;
+ int err = -ENOMEM;
+
+ err = init_cmdqs(hwdev);
+ if (err)
+ return err;
+
+	sdk_info(cmdq->hwdev->dev_hdl, "Cmdq sync command check succeeded\n");
+
+ err = create_cmdq_wq(cmdqs);
+ if (err)
+ goto create_wq_err;
+
+ err = hinic3_alloc_db_addr(hwdev, &db_base, NULL);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to allocate doorbell address\n");
+ goto alloc_db_err;
+ }
+
+ cmdqs->cmdqs_db_base = (u8 *)db_base;
+ for (cmdq_type = HINIC3_CMDQ_SYNC; cmdq_type < cmdqs->cmdq_num; cmdq_type++) {
+ err = init_cmdq(&cmdqs->cmdq[cmdq_type], hwdev, cmdq_type);
+ if (err) {
+			sdk_err(hwdev->dev_hdl, "Failed to initialize cmdq type: %d\n", cmdq_type);
+ goto init_cmdq_err;
+ }
+
+ cmdq_init_queue_ctxt(cmdqs, &cmdqs->cmdq[cmdq_type],
+ &cmdqs->cmdq[cmdq_type].cmdq_ctxt);
+ }
+
+ err = hinic3_set_cmdq_ctxts(hwdev);
+ if (err)
+ goto init_cmdq_err;
+
+ return 0;
+
+init_cmdq_err:
+ for (type = HINIC3_CMDQ_SYNC; type < cmdq_type; type++)
+ free_cmdq(&cmdqs->cmdq[type]);
+
+ hinic3_free_db_addr(hwdev, cmdqs->cmdqs_db_base, NULL);
+
+alloc_db_err:
+ destroy_cmdq_wq(cmdqs);
+
+create_wq_err:
+ dma_pool_destroy(cmdqs->cmd_buf_pool);
+ kfree(cmdqs);
+
+ return err;
+}
+
+void hinic3_cmdqs_free(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_cmdqs *cmdqs = hwdev->cmdqs;
+ u8 cmdq_type = HINIC3_CMDQ_SYNC;
+
+ cmdqs->status &= ~HINIC3_CMDQ_ENABLE;
+
+ for (; cmdq_type < cmdqs->cmdq_num; cmdq_type++) {
+ hinic3_cmdq_flush_cmd(hwdev, &cmdqs->cmdq[cmdq_type]);
+ cmdq_reset_all_cmd_buff(&cmdqs->cmdq[cmdq_type]);
+ free_cmdq(&cmdqs->cmdq[cmdq_type]);
+ }
+
+ hinic3_free_db_addr(hwdev, cmdqs->cmdqs_db_base, NULL);
+ destroy_cmdq_wq(cmdqs);
+
+ dma_pool_destroy(cmdqs->cmd_buf_pool);
+
+ kfree(cmdqs);
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_cmdq.h b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_cmdq.h
new file mode 100644
index 000000000000..ab36dc9c2ba6
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_cmdq.h
@@ -0,0 +1,204 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_CMDQ_H
+#define HINIC3_CMDQ_H
+
+#include <linux/types.h>
+#include <linux/completion.h>
+#include <linux/spinlock.h>
+
+#include "comm_msg_intf.h"
+#include "hinic3_hw.h"
+#include "hinic3_wq.h"
+#include "hinic3_common.h"
+#include "hinic3_hwdev.h"
+
+#define HINIC3_SCMD_DATA_LEN 16
+
+#define HINIC3_CMDQ_DEPTH 4096
+
+enum hinic3_cmdq_type {
+ HINIC3_CMDQ_SYNC,
+ HINIC3_CMDQ_ASYNC,
+ HINIC3_MAX_CMDQ_TYPES = 4
+};
+
+enum hinic3_db_src_type {
+ HINIC3_DB_SRC_CMDQ_TYPE,
+ HINIC3_DB_SRC_L2NIC_SQ_TYPE,
+};
+
+enum hinic3_cmdq_db_type {
+ HINIC3_DB_SQ_RQ_TYPE,
+ HINIC3_DB_CMDQ_TYPE,
+};
+
+/* hardware define: cmdq wqe */
+struct hinic3_cmdq_header {
+ u32 header_info;
+ u32 saved_data;
+};
+
+struct hinic3_scmd_bufdesc {
+ u32 buf_len;
+ u32 rsvd;
+ u8 data[HINIC3_SCMD_DATA_LEN];
+};
+
+struct hinic3_lcmd_bufdesc {
+ struct hinic3_sge sge;
+ u32 rsvd1;
+ u64 saved_async_buf;
+ u64 rsvd3;
+};
+
+struct hinic3_cmdq_db {
+ u32 db_head;
+ u32 db_info;
+};
+
+struct hinic3_status {
+ u32 status_info;
+};
+
+struct hinic3_ctrl {
+ u32 ctrl_info;
+};
+
+struct hinic3_sge_resp {
+ struct hinic3_sge sge;
+ u32 rsvd;
+};
+
+struct hinic3_cmdq_completion {
+ union {
+ struct hinic3_sge_resp sge_resp;
+ u64 direct_resp;
+ };
+};
+
+struct hinic3_cmdq_wqe_scmd {
+ struct hinic3_cmdq_header header;
+ u64 rsvd;
+ struct hinic3_status status;
+ struct hinic3_ctrl ctrl;
+ struct hinic3_cmdq_completion completion;
+ struct hinic3_scmd_bufdesc buf_desc;
+};
+
+struct hinic3_cmdq_wqe_lcmd {
+ struct hinic3_cmdq_header header;
+ struct hinic3_status status;
+ struct hinic3_ctrl ctrl;
+ struct hinic3_cmdq_completion completion;
+ struct hinic3_lcmd_bufdesc buf_desc;
+};
+
+struct hinic3_cmdq_inline_wqe {
+ struct hinic3_cmdq_wqe_scmd wqe_scmd;
+};
+
+struct hinic3_cmdq_wqe {
+ union {
+ struct hinic3_cmdq_inline_wqe inline_wqe;
+ struct hinic3_cmdq_wqe_lcmd wqe_lcmd;
+ };
+};
+
+struct hinic3_cmdq_arm_bit {
+ u32 q_type;
+ u32 q_id;
+};
+
+enum hinic3_cmdq_status {
+ HINIC3_CMDQ_ENABLE = BIT(0),
+};
+
+enum hinic3_cmdq_cmd_type {
+ HINIC3_CMD_TYPE_NONE,
+ HINIC3_CMD_TYPE_SET_ARM,
+ HINIC3_CMD_TYPE_DIRECT_RESP,
+ HINIC3_CMD_TYPE_SGE_RESP,
+ HINIC3_CMD_TYPE_ASYNC,
+ HINIC3_CMD_TYPE_FAKE_TIMEOUT,
+ HINIC3_CMD_TYPE_TIMEOUT,
+ HINIC3_CMD_TYPE_FORCE_STOP,
+};
+
+struct hinic3_cmdq_cmd_info {
+ enum hinic3_cmdq_cmd_type cmd_type;
+ u16 channel;
+ u16 rsvd1;
+
+ struct completion *done;
+ int *errcode;
+ int *cmpt_code;
+ u64 *direct_resp;
+ u64 cmdq_msg_id;
+
+ struct hinic3_cmd_buf *buf_in;
+ struct hinic3_cmd_buf *buf_out;
+};
+
+struct hinic3_cmdq {
+ struct hinic3_wq wq;
+
+ enum hinic3_cmdq_type cmdq_type;
+ int wrapped;
+
+ /* spinlock for send cmdq commands */
+ spinlock_t cmdq_lock;
+
+ struct cmdq_ctxt_info cmdq_ctxt;
+
+ struct hinic3_cmdq_cmd_info *cmd_infos;
+
+ struct hinic3_hwdev *hwdev;
+ u64 rsvd1[2];
+};
+
+struct hinic3_cmdqs {
+ struct hinic3_hwdev *hwdev;
+
+	struct dma_pool *cmd_buf_pool;
+ /* doorbell area */
+ u8 __iomem *cmdqs_db_base;
+
+	/* The CLA tables of all cmdqs of a VF occupy one page when the cmdq wq uses 1-level CLA */
+ dma_addr_t wq_block_paddr;
+ void *wq_block_vaddr;
+ struct hinic3_cmdq cmdq[HINIC3_MAX_CMDQ_TYPES];
+
+ u32 status;
+ u32 disable_flag;
+
+ bool lock_channel_en;
+ unsigned long channel_stop;
+ u8 cmdq_num;
+ u32 rsvd1;
+ u64 rsvd2;
+};
+
+void hinic3_cmdq_ceq_handler(void *handle, u32 ceqe_data);
+
+int hinic3_reinit_cmdq_ctxts(struct hinic3_hwdev *hwdev);
+
+bool hinic3_cmdq_idle(struct hinic3_cmdq *cmdq);
+
+int hinic3_cmdqs_init(struct hinic3_hwdev *hwdev);
+
+void hinic3_cmdqs_free(struct hinic3_hwdev *hwdev);
+
+void hinic3_cmdq_flush_cmd(struct hinic3_hwdev *hwdev,
+ struct hinic3_cmdq *cmdq);
+
+int hinic3_cmdq_set_channel_status(struct hinic3_hwdev *hwdev, u16 channel,
+ bool enable);
+
+void hinic3_cmdq_enable_channel_lock(struct hinic3_hwdev *hwdev, bool enable);
+
+void hinic3_cmdq_flush_sync_cmd(struct hinic3_hwdev *hwdev);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_common.c b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_common.c
new file mode 100644
index 000000000000..a942ef185e6f
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_common.c
@@ -0,0 +1,93 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#include <linux/kernel.h>
+#include <linux/io-mapping.h>
+#include <linux/delay.h>
+
+#include "ossl_knl.h"
+#include "hinic3_common.h"
+
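+/* Allocate coherent DMA memory with the requested alignment: if the first
+ * allocation happens to be aligned it is used as is; otherwise it is freed
+ * and size + align is allocated, returning the aligned address while keeping
+ * the original address/size for freeing later.
+ */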
+int hinic3_dma_zalloc_coherent_align(void *dev_hdl, u64 size, u64 align,
+ unsigned int flag,
+ struct hinic3_dma_addr_align *mem_align)
+{
+ void *vaddr = NULL, *align_vaddr = NULL;
+ dma_addr_t paddr, align_paddr;
+ u64 real_size = size;
+
+ vaddr = dma_zalloc_coherent(dev_hdl, real_size, &paddr, flag);
+ if (!vaddr)
+ return -ENOMEM;
+
+ align_paddr = ALIGN(paddr, align);
+ /* align */
+ if (align_paddr == paddr) {
+ align_vaddr = vaddr;
+ goto out;
+ }
+
+ dma_free_coherent(dev_hdl, real_size, vaddr, paddr);
+
+ /* realloc memory for align */
+ real_size = size + align;
+ vaddr = dma_zalloc_coherent(dev_hdl, real_size, &paddr, flag);
+ if (!vaddr)
+ return -ENOMEM;
+
+ align_paddr = ALIGN(paddr, align);
+ align_vaddr = (void *)((u64)vaddr + (align_paddr - paddr));
+
+out:
+ mem_align->real_size = (u32)real_size;
+ mem_align->ori_vaddr = vaddr;
+ mem_align->ori_paddr = paddr;
+ mem_align->align_vaddr = align_vaddr;
+ mem_align->align_paddr = align_paddr;
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_dma_zalloc_coherent_align);
+
+void hinic3_dma_free_coherent_align(void *dev_hdl,
+ struct hinic3_dma_addr_align *mem_align)
+{
+ dma_free_coherent(dev_hdl, mem_align->real_size,
+ mem_align->ori_vaddr, mem_align->ori_paddr);
+}
+EXPORT_SYMBOL(hinic3_dma_free_coherent_align);
+
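+/* Generic poll helper: call @handler every @wait_once_us until it reports
+ * completion or error, or @wait_total_ms elapses; a final check is made
+ * after the deadline to avoid reporting a spurious timeout.
+ */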
+int hinic3_wait_for_timeout(void *priv_data, wait_cpl_handler handler,
+ u32 wait_total_ms, u32 wait_once_us)
+{
+ enum hinic3_wait_return ret;
+ unsigned long end;
+ /* Take 9/10 * wait_once_us as the minimum sleep time of usleep_range */
+ u32 usleep_min = wait_once_us - wait_once_us / 10;
+
+ if (!handler)
+ return -EINVAL;
+
+ end = jiffies + msecs_to_jiffies(wait_total_ms);
+ do {
+ ret = handler(priv_data);
+ if (ret == WAIT_PROCESS_CPL)
+ return 0;
+ else if (ret == WAIT_PROCESS_ERR)
+ return -EIO;
+
+		/* msleep is accurate enough for sleeps of 20 ms or more */
+ if (wait_once_us >= 20 * USEC_PER_MSEC)
+ msleep(wait_once_us / USEC_PER_MSEC);
+ else
+ usleep_range(usleep_min, wait_once_us);
+ } while (time_before(jiffies, end));
+
+ ret = handler(priv_data);
+ if (ret == WAIT_PROCESS_CPL)
+ return 0;
+ else if (ret == WAIT_PROCESS_ERR)
+ return -EIO;
+
+ return -ETIMEDOUT;
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_csr.h b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_csr.h
new file mode 100644
index 000000000000..b5390c9ed488
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_csr.h
@@ -0,0 +1,187 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_CSR_H
+#define HINIC3_CSR_H
+
+/* bit30/bit31 for bar index flag
+ * 00: bar0
+ * 01: bar1
+ * 10: bar2
+ * 11: bar3
+ */
+#define HINIC3_CFG_REGS_FLAG 0x40000000
+
+#define HINIC3_MGMT_REGS_FLAG 0xC0000000
+
+#define HINIC3_REGS_FLAG_MAKS 0x3FFFFFFF
+
+#define HINIC3_VF_CFG_REG_OFFSET 0x2000
+
+#define HINIC3_HOST_CSR_BASE_ADDR (HINIC3_MGMT_REGS_FLAG + 0x6000)
+#define HINIC3_CSR_GLOBAL_BASE_ADDR (HINIC3_MGMT_REGS_FLAG + 0x6400)
+
+/* HW interface registers */
+#define HINIC3_CSR_FUNC_ATTR0_ADDR (HINIC3_CFG_REGS_FLAG + 0x0)
+#define HINIC3_CSR_FUNC_ATTR1_ADDR (HINIC3_CFG_REGS_FLAG + 0x4)
+#define HINIC3_CSR_FUNC_ATTR2_ADDR (HINIC3_CFG_REGS_FLAG + 0x8)
+#define HINIC3_CSR_FUNC_ATTR3_ADDR (HINIC3_CFG_REGS_FLAG + 0xC)
+#define HINIC3_CSR_FUNC_ATTR4_ADDR (HINIC3_CFG_REGS_FLAG + 0x10)
+#define HINIC3_CSR_FUNC_ATTR5_ADDR (HINIC3_CFG_REGS_FLAG + 0x14)
+#define HINIC3_CSR_FUNC_ATTR6_ADDR (HINIC3_CFG_REGS_FLAG + 0x18)
+
+#define HINIC3_FUNC_CSR_MAILBOX_DATA_OFF 0x80
+#define HINIC3_FUNC_CSR_MAILBOX_CONTROL_OFF \
+ (HINIC3_CFG_REGS_FLAG + 0x0100)
+#define HINIC3_FUNC_CSR_MAILBOX_INT_OFFSET_OFF \
+ (HINIC3_CFG_REGS_FLAG + 0x0104)
+#define HINIC3_FUNC_CSR_MAILBOX_RESULT_H_OFF \
+ (HINIC3_CFG_REGS_FLAG + 0x0108)
+#define HINIC3_FUNC_CSR_MAILBOX_RESULT_L_OFF \
+ (HINIC3_CFG_REGS_FLAG + 0x010C)
+/* CLP registers */
+#define HINIC3_BAR3_CLP_BASE_ADDR (HINIC3_MGMT_REGS_FLAG + 0x0000)
+
+#define HINIC3_UCPU_CLP_SIZE_REG (HINIC3_HOST_CSR_BASE_ADDR + 0x40)
+#define HINIC3_UCPU_CLP_REQBASE_REG (HINIC3_HOST_CSR_BASE_ADDR + 0x44)
+#define HINIC3_UCPU_CLP_RSPBASE_REG (HINIC3_HOST_CSR_BASE_ADDR + 0x48)
+#define HINIC3_UCPU_CLP_REQ_REG (HINIC3_HOST_CSR_BASE_ADDR + 0x4c)
+#define HINIC3_UCPU_CLP_RSP_REG (HINIC3_HOST_CSR_BASE_ADDR + 0x50)
+#define HINIC3_CLP_REG(member) (HINIC3_UCPU_CLP_##member##_REG)
+
+#define HINIC3_CLP_REQ_DATA HINIC3_BAR3_CLP_BASE_ADDR
+#define HINIC3_CLP_RSP_DATA (HINIC3_BAR3_CLP_BASE_ADDR + 0x1000)
+#define HINIC3_CLP_DATA(member) (HINIC3_CLP_##member##_DATA)
+
+#define HINIC3_PPF_ELECTION_OFFSET 0x0
+#define HINIC3_MPF_ELECTION_OFFSET 0x20
+
+#define HINIC3_CSR_PPF_ELECTION_ADDR \
+ (HINIC3_HOST_CSR_BASE_ADDR + HINIC3_PPF_ELECTION_OFFSET)
+
+#define HINIC3_CSR_GLOBAL_MPF_ELECTION_ADDR \
+ (HINIC3_HOST_CSR_BASE_ADDR + HINIC3_MPF_ELECTION_OFFSET)
+
+#define HINIC3_CSR_FUNC_PPF_ELECT_BASE_ADDR (HINIC3_CFG_REGS_FLAG + 0x60)
+#define HINIC3_CSR_FUNC_PPF_ELECT_PORT_STRIDE 0x4
+
+#define HINIC3_CSR_FUNC_PPF_ELECT(host_idx) \
+ (HINIC3_CSR_FUNC_PPF_ELECT_BASE_ADDR + \
+ (host_idx) * HINIC3_CSR_FUNC_PPF_ELECT_PORT_STRIDE)
+
+#define HINIC3_CSR_DMA_ATTR_TBL_ADDR (HINIC3_CFG_REGS_FLAG + 0x380)
+#define HINIC3_CSR_DMA_ATTR_INDIR_IDX_ADDR (HINIC3_CFG_REGS_FLAG + 0x390)
+
+/* MSI-X registers */
+#define HINIC3_CSR_MSIX_INDIR_IDX_ADDR (HINIC3_CFG_REGS_FLAG + 0x310)
+#define HINIC3_CSR_MSIX_CTRL_ADDR (HINIC3_CFG_REGS_FLAG + 0x300)
+#define HINIC3_CSR_MSIX_CNT_ADDR (HINIC3_CFG_REGS_FLAG + 0x304)
+#define HINIC3_CSR_FUNC_MSI_CLR_WR_ADDR (HINIC3_CFG_REGS_FLAG + 0x58)
+
+#define HINIC3_MSI_CLR_INDIR_RESEND_TIMER_CLR_SHIFT 0
+#define HINIC3_MSI_CLR_INDIR_INT_MSK_SET_SHIFT 1
+#define HINIC3_MSI_CLR_INDIR_INT_MSK_CLR_SHIFT 2
+#define HINIC3_MSI_CLR_INDIR_AUTO_MSK_SET_SHIFT 3
+#define HINIC3_MSI_CLR_INDIR_AUTO_MSK_CLR_SHIFT 4
+#define HINIC3_MSI_CLR_INDIR_SIMPLE_INDIR_IDX_SHIFT 22
+
+#define HINIC3_MSI_CLR_INDIR_RESEND_TIMER_CLR_MASK 0x1U
+#define HINIC3_MSI_CLR_INDIR_INT_MSK_SET_MASK 0x1U
+#define HINIC3_MSI_CLR_INDIR_INT_MSK_CLR_MASK 0x1U
+#define HINIC3_MSI_CLR_INDIR_AUTO_MSK_SET_MASK 0x1U
+#define HINIC3_MSI_CLR_INDIR_AUTO_MSK_CLR_MASK 0x1U
+#define HINIC3_MSI_CLR_INDIR_SIMPLE_INDIR_IDX_MASK 0x3FFU
+
+#define HINIC3_MSI_CLR_INDIR_SET(val, member) \
+ (((val) & HINIC3_MSI_CLR_INDIR_##member##_MASK) << \
+ HINIC3_MSI_CLR_INDIR_##member##_SHIFT)
+
+/* EQ registers */
+#define HINIC3_AEQ_INDIR_IDX_ADDR (HINIC3_CFG_REGS_FLAG + 0x210)
+#define HINIC3_CEQ_INDIR_IDX_ADDR (HINIC3_CFG_REGS_FLAG + 0x290)
+
+#define HINIC3_EQ_INDIR_IDX_ADDR(type) \
+		(((type) == HINIC3_AEQ) ? \
+		HINIC3_AEQ_INDIR_IDX_ADDR : HINIC3_CEQ_INDIR_IDX_ADDR)
+
+#define HINIC3_AEQ_MTT_OFF_BASE_ADDR (HINIC3_CFG_REGS_FLAG + 0x240)
+#define HINIC3_CEQ_MTT_OFF_BASE_ADDR (HINIC3_CFG_REGS_FLAG + 0x2C0)
+
+#define HINIC3_CSR_EQ_PAGE_OFF_STRIDE 8
+
+#define HINIC3_AEQ_HI_PHYS_ADDR_REG(pg_num) \
+ (HINIC3_AEQ_MTT_OFF_BASE_ADDR + \
+ (pg_num) * HINIC3_CSR_EQ_PAGE_OFF_STRIDE)
+
+#define HINIC3_AEQ_LO_PHYS_ADDR_REG(pg_num) \
+ (HINIC3_AEQ_MTT_OFF_BASE_ADDR + \
+ (pg_num) * HINIC3_CSR_EQ_PAGE_OFF_STRIDE + 4)
+
+#define HINIC3_CEQ_HI_PHYS_ADDR_REG(pg_num) \
+ (HINIC3_CEQ_MTT_OFF_BASE_ADDR + \
+ (pg_num) * HINIC3_CSR_EQ_PAGE_OFF_STRIDE)
+
+#define HINIC3_CEQ_LO_PHYS_ADDR_REG(pg_num) \
+ (HINIC3_CEQ_MTT_OFF_BASE_ADDR + \
+ (pg_num) * HINIC3_CSR_EQ_PAGE_OFF_STRIDE + 4)
+
+#define HINIC3_CSR_AEQ_CTRL_0_ADDR (HINIC3_CFG_REGS_FLAG + 0x200)
+#define HINIC3_CSR_AEQ_CTRL_1_ADDR (HINIC3_CFG_REGS_FLAG + 0x204)
+#define HINIC3_CSR_AEQ_CONS_IDX_ADDR (HINIC3_CFG_REGS_FLAG + 0x208)
+#define HINIC3_CSR_AEQ_PROD_IDX_ADDR (HINIC3_CFG_REGS_FLAG + 0x20C)
+#define HINIC3_CSR_AEQ_CI_SIMPLE_INDIR_ADDR (HINIC3_CFG_REGS_FLAG + 0x50)
+
+#define HINIC3_CSR_CEQ_CTRL_0_ADDR (HINIC3_CFG_REGS_FLAG + 0x280)
+#define HINIC3_CSR_CEQ_CTRL_1_ADDR (HINIC3_CFG_REGS_FLAG + 0x284)
+#define HINIC3_CSR_CEQ_CONS_IDX_ADDR (HINIC3_CFG_REGS_FLAG + 0x288)
+#define HINIC3_CSR_CEQ_PROD_IDX_ADDR (HINIC3_CFG_REGS_FLAG + 0x28c)
+#define HINIC3_CSR_CEQ_CI_SIMPLE_INDIR_ADDR (HINIC3_CFG_REGS_FLAG + 0x54)
+
+/* API CMD registers */
+#define HINIC3_CSR_API_CMD_BASE (HINIC3_MGMT_REGS_FLAG + 0x2000)
+
+#define HINIC3_CSR_API_CMD_STRIDE 0x80
+
+#define HINIC3_CSR_API_CMD_CHAIN_HEAD_HI_ADDR(idx) \
+ (HINIC3_CSR_API_CMD_BASE + 0x0 + (idx) * HINIC3_CSR_API_CMD_STRIDE)
+
+#define HINIC3_CSR_API_CMD_CHAIN_HEAD_LO_ADDR(idx) \
+ (HINIC3_CSR_API_CMD_BASE + 0x4 + (idx) * HINIC3_CSR_API_CMD_STRIDE)
+
+#define HINIC3_CSR_API_CMD_STATUS_HI_ADDR(idx) \
+ (HINIC3_CSR_API_CMD_BASE + 0x8 + (idx) * HINIC3_CSR_API_CMD_STRIDE)
+
+#define HINIC3_CSR_API_CMD_STATUS_LO_ADDR(idx) \
+ (HINIC3_CSR_API_CMD_BASE + 0xC + (idx) * HINIC3_CSR_API_CMD_STRIDE)
+
+#define HINIC3_CSR_API_CMD_CHAIN_NUM_CELLS_ADDR(idx) \
+ (HINIC3_CSR_API_CMD_BASE + 0x10 + (idx) * HINIC3_CSR_API_CMD_STRIDE)
+
+#define HINIC3_CSR_API_CMD_CHAIN_CTRL_ADDR(idx) \
+ (HINIC3_CSR_API_CMD_BASE + 0x14 + (idx) * HINIC3_CSR_API_CMD_STRIDE)
+
+#define HINIC3_CSR_API_CMD_CHAIN_PI_ADDR(idx) \
+ (HINIC3_CSR_API_CMD_BASE + 0x1C + (idx) * HINIC3_CSR_API_CMD_STRIDE)
+
+#define HINIC3_CSR_API_CMD_CHAIN_REQ_ADDR(idx) \
+ (HINIC3_CSR_API_CMD_BASE + 0x20 + (idx) * HINIC3_CSR_API_CMD_STRIDE)
+
+#define HINIC3_CSR_API_CMD_STATUS_0_ADDR(idx) \
+ (HINIC3_CSR_API_CMD_BASE + 0x30 + (idx) * HINIC3_CSR_API_CMD_STRIDE)
+
+/* self test register */
+#define HINIC3_MGMT_HEALTH_STATUS_ADDR (HINIC3_MGMT_REGS_FLAG + 0x983c)
+
+#define HINIC3_CHIP_BASE_INFO_ADDR (HINIC3_MGMT_REGS_FLAG + 0xB02C)
+
+#define HINIC3_CHIP_ERR_STATUS0_ADDR (HINIC3_MGMT_REGS_FLAG + 0xC0EC)
+#define HINIC3_CHIP_ERR_STATUS1_ADDR (HINIC3_MGMT_REGS_FLAG + 0xC0F0)
+
+#define HINIC3_ERR_INFO0_ADDR (HINIC3_MGMT_REGS_FLAG + 0xC0F4)
+#define HINIC3_ERR_INFO1_ADDR (HINIC3_MGMT_REGS_FLAG + 0xC0F8)
+#define HINIC3_ERR_INFO2_ADDR (HINIC3_MGMT_REGS_FLAG + 0xC0FC)
+
+#define HINIC3_MULT_HOST_SLAVE_STATUS_ADDR (HINIC3_MGMT_REGS_FLAG + 0xDF30)
+#define HINIC3_MULT_MIGRATE_HOST_STATUS_ADDR (HINIC3_MGMT_REGS_FLAG + 0xDF4C)
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_dev_mgmt.c b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_dev_mgmt.c
new file mode 100644
index 000000000000..4c13a2e8ffd6
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_dev_mgmt.c
@@ -0,0 +1,803 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [COMM]" fmt
+
+#include <net/addrconf.h>
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/io-mapping.h>
+#include <linux/interrupt.h>
+#include <linux/time.h>
+#include <linux/timex.h>
+#include <linux/rtc.h>
+#include <linux/debugfs.h>
+
+#include "ossl_knl.h"
+#include "hinic3_mt.h"
+#include "hinic3_crm.h"
+#include "hinic3_lld.h"
+#include "hinic3_sriov.h"
+#include "hinic3_nictool.h"
+#include "hinic3_pci_id_tbl.h"
+#include "hinic3_dev_mgmt.h"
+
+#define HINIC3_WAIT_TOOL_CNT_TIMEOUT 10000
+#define HINIC3_WAIT_TOOL_MIN_USLEEP_TIME 9900
+#define HINIC3_WAIT_TOOL_MAX_USLEEP_TIME 10000
+
+static unsigned long card_bit_map;
+
+LIST_HEAD(g_hinic3_chip_list);
+
+struct list_head *get_hinic3_chip_list(void)
+{
+ return &g_hinic3_chip_list;
+}
+
+void uld_dev_hold(struct hinic3_lld_dev *lld_dev, enum hinic3_service_type type)
+{
+ struct hinic3_pcidev *pci_adapter = pci_get_drvdata(lld_dev->pdev);
+
+ atomic_inc(&pci_adapter->uld_ref_cnt[type]);
+}
+EXPORT_SYMBOL(uld_dev_hold);
+
+void uld_dev_put(struct hinic3_lld_dev *lld_dev, enum hinic3_service_type type)
+{
+ struct hinic3_pcidev *pci_adapter = pci_get_drvdata(lld_dev->pdev);
+
+ atomic_dec(&pci_adapter->uld_ref_cnt[type]);
+}
+EXPORT_SYMBOL(uld_dev_put);
+
+void lld_dev_cnt_init(struct hinic3_pcidev *pci_adapter)
+{
+ atomic_set(&pci_adapter->ref_cnt, 0);
+}
+
+void lld_dev_hold(struct hinic3_lld_dev *dev)
+{
+ struct hinic3_pcidev *pci_adapter = pci_get_drvdata(dev->pdev);
+
+ atomic_inc(&pci_adapter->ref_cnt);
+}
+
+void lld_dev_put(struct hinic3_lld_dev *dev)
+{
+ struct hinic3_pcidev *pci_adapter = pci_get_drvdata(dev->pdev);
+
+ atomic_dec(&pci_adapter->ref_cnt);
+}
+
+void wait_lld_dev_unused(struct hinic3_pcidev *pci_adapter)
+{
+ unsigned long end;
+
+ end = jiffies + msecs_to_jiffies(HINIC3_WAIT_TOOL_CNT_TIMEOUT);
+ do {
+ if (!atomic_read(&pci_adapter->ref_cnt))
+ return;
+
+		/* for a ~10 ms sleep, usleep_range is more precise than msleep */
+ usleep_range(HINIC3_WAIT_TOOL_MIN_USLEEP_TIME,
+ HINIC3_WAIT_TOOL_MAX_USLEEP_TIME);
+ } while (time_before(jiffies, end));
+}
+
+enum hinic3_lld_status {
+ HINIC3_NODE_CHANGE = BIT(0),
+};
+
+struct hinic3_lld_lock {
+ /* lock for chip list */
+ struct mutex lld_mutex;
+ unsigned long status;
+ atomic_t dev_ref_cnt;
+};
+
+struct hinic3_lld_lock g_lld_lock;
+
+#define WAIT_LLD_DEV_HOLD_TIMEOUT (10 * 60 * 1000) /* 10 minutes */
+#define WAIT_LLD_DEV_NODE_CHANGED (10 * 60 * 1000) /* 10 minutes */
+#define WAIT_LLD_DEV_REF_CNT_EMPTY (2 * 60 * 1000) /* 2 minutes */
+#define PRINT_TIMEOUT_INTERVAL 10000
+#define MS_PER_SEC 1000
+#define LLD_LOCK_MIN_USLEEP_TIME 900
+#define LLD_LOCK_MAX_USLEEP_TIME 1000
+
+/* Nodes in the chip list are about to change; tools and drivers cannot
+ * look up a node while this is in progress.
+ */
+void lld_lock_chip_node(void)
+{
+ unsigned long end;
+ bool timeout = true;
+ u32 loop_cnt;
+
+ mutex_lock(&g_lld_lock.lld_mutex);
+
+ loop_cnt = 0;
+ end = jiffies + msecs_to_jiffies(WAIT_LLD_DEV_NODE_CHANGED);
+ do {
+ if (!test_and_set_bit(HINIC3_NODE_CHANGE, &g_lld_lock.status)) {
+ timeout = false;
+ break;
+ }
+
+ loop_cnt++;
+ if (loop_cnt % PRINT_TIMEOUT_INTERVAL == 0)
+			pr_warn("Waiting for lld node change to complete for %us\n",
+ loop_cnt / MS_PER_SEC);
+
+		/* for a ~1 ms sleep, usleep_range is more precise than msleep */
+ usleep_range(LLD_LOCK_MIN_USLEEP_TIME,
+ LLD_LOCK_MAX_USLEEP_TIME);
+ } while (time_before(jiffies, end));
+
+ if (timeout && test_and_set_bit(HINIC3_NODE_CHANGE, &g_lld_lock.status))
+		pr_warn("Waiting for lld node change to complete timed out when trying to get lld lock\n");
+
+ loop_cnt = 0;
+ timeout = true;
+ end = jiffies + msecs_to_jiffies(WAIT_LLD_DEV_NODE_CHANGED);
+ do {
+ if (!atomic_read(&g_lld_lock.dev_ref_cnt)) {
+ timeout = false;
+ break;
+ }
+
+ loop_cnt++;
+ if (loop_cnt % PRINT_TIMEOUT_INTERVAL == 0)
+			pr_warn("Waiting for lld dev to become unused for %us, reference count: %d\n",
+ loop_cnt / MS_PER_SEC,
+ atomic_read(&g_lld_lock.dev_ref_cnt));
+
+		/* for a ~1 ms sleep, usleep_range is more precise than msleep */
+ usleep_range(LLD_LOCK_MIN_USLEEP_TIME,
+ LLD_LOCK_MAX_USLEEP_TIME);
+ } while (time_before(jiffies, end));
+
+ if (timeout && atomic_read(&g_lld_lock.dev_ref_cnt))
+		pr_warn("Waiting for lld dev to become unused timed out\n");
+
+ mutex_unlock(&g_lld_lock.lld_mutex);
+}
+
+void lld_unlock_chip_node(void)
+{
+ clear_bit(HINIC3_NODE_CHANGE, &g_lld_lock.status);
+}
+
+/* When tools or other drivers want to get a node from the chip list, use this
+ * function to prevent the node from being freed.
+ */
+void lld_hold(void)
+{
+ unsigned long end;
+ u32 loop_cnt = 0;
+
+	/* make sure no chip node is being changed right now */
+ mutex_lock(&g_lld_lock.lld_mutex);
+
+ end = jiffies + msecs_to_jiffies(WAIT_LLD_DEV_HOLD_TIMEOUT);
+ do {
+ if (!test_bit(HINIC3_NODE_CHANGE, &g_lld_lock.status))
+ break;
+
+ loop_cnt++;
+
+ if (loop_cnt % PRINT_TIMEOUT_INTERVAL == 0)
+			pr_warn("Waiting for lld node change to complete for %us\n",
+ loop_cnt / MS_PER_SEC);
+		/* for a ~1 ms sleep, usleep_range is more precise than msleep */
+ usleep_range(LLD_LOCK_MIN_USLEEP_TIME,
+ LLD_LOCK_MAX_USLEEP_TIME);
+ } while (time_before(jiffies, end));
+
+ if (test_bit(HINIC3_NODE_CHANGE, &g_lld_lock.status))
+		pr_warn("Waiting for lld node change to complete timed out when trying to hold lld dev\n");
+
+ atomic_inc(&g_lld_lock.dev_ref_cnt);
+ mutex_unlock(&g_lld_lock.lld_mutex);
+}
+
+void lld_put(void)
+{
+ atomic_dec(&g_lld_lock.dev_ref_cnt);
+}
+
+void hinic3_lld_lock_init(void)
+{
+ mutex_init(&g_lld_lock.lld_mutex);
+ atomic_set(&g_lld_lock.dev_ref_cnt, 0);
+}
+
+void hinic3_get_all_chip_id(void *id_info)
+{
+ struct nic_card_id *card_id = (struct nic_card_id *)id_info;
+ struct card_node *chip_node = NULL;
+ int i = 0;
+ int id, err;
+
+ lld_hold();
+ list_for_each_entry(chip_node, &g_hinic3_chip_list, node) {
+ err = sscanf(chip_node->chip_name, HINIC3_CHIP_NAME "%d", &id);
+ if (err < 0) {
+ pr_err("Failed to get hinic3 id\n");
+ continue;
+ }
+ card_id->id[i] = (u32)id;
+ i++;
+ }
+ lld_put();
+ card_id->num = (u32)i;
+}
+
+void hinic3_get_card_func_info_by_card_name(const char *chip_name,
+ struct hinic3_card_func_info *card_func)
+{
+ struct card_node *chip_node = NULL;
+ struct hinic3_pcidev *dev = NULL;
+ struct func_pdev_info *pdev_info = NULL;
+
+ card_func->num_pf = 0;
+
+ lld_hold();
+
+ list_for_each_entry(chip_node, &g_hinic3_chip_list, node) {
+ if (strncmp(chip_node->chip_name, chip_name, IFNAMSIZ))
+ continue;
+
+ list_for_each_entry(dev, &chip_node->func_list, node) {
+ if (hinic3_func_type(dev->hwdev) == TYPE_VF)
+ continue;
+
+ pdev_info = &card_func->pdev_info[card_func->num_pf];
+ pdev_info->bar1_size =
+ pci_resource_len(dev->pcidev,
+ HINIC3_PF_PCI_CFG_REG_BAR);
+ pdev_info->bar1_phy_addr =
+ pci_resource_start(dev->pcidev,
+ HINIC3_PF_PCI_CFG_REG_BAR);
+
+ pdev_info->bar3_size =
+ pci_resource_len(dev->pcidev,
+ HINIC3_PCI_MGMT_REG_BAR);
+ pdev_info->bar3_phy_addr =
+ pci_resource_start(dev->pcidev,
+ HINIC3_PCI_MGMT_REG_BAR);
+
+ card_func->num_pf++;
+ if (card_func->num_pf >= MAX_SIZE) {
+ lld_put();
+ return;
+ }
+ }
+ }
+
+ lld_put();
+}
+
+static bool is_pcidev_match_chip_name(const char *ifname, struct hinic3_pcidev *dev,
+ struct card_node *chip_node, enum func_type type)
+{
+	if (strncmp(chip_node->chip_name, ifname, IFNAMSIZ))
+		return false;
+
+	return hinic3_func_type(dev->hwdev) == type;
+}
+
+static struct hinic3_lld_dev *get_dst_type_lld_dev_by_chip_name(const char *ifname,
+ enum func_type type)
+{
+ struct card_node *chip_node = NULL;
+ struct hinic3_pcidev *dev = NULL;
+
+ list_for_each_entry(chip_node, &g_hinic3_chip_list, node) {
+ list_for_each_entry(dev, &chip_node->func_list, node) {
+ if (is_pcidev_match_chip_name(ifname, dev, chip_node, type))
+ return &dev->lld_dev;
+ }
+ }
+
+ return NULL;
+}
+
+struct hinic3_lld_dev *hinic3_get_lld_dev_by_chip_name(const char *chip_name)
+{
+ struct hinic3_lld_dev *dev = NULL;
+
+ lld_hold();
+
+ dev = get_dst_type_lld_dev_by_chip_name(chip_name, TYPE_PPF);
+ if (dev)
+ goto out;
+
+ dev = get_dst_type_lld_dev_by_chip_name(chip_name, TYPE_PF);
+ if (dev)
+ goto out;
+
+ dev = get_dst_type_lld_dev_by_chip_name(chip_name, TYPE_VF);
+out:
+ if (dev)
+ lld_dev_hold(dev);
+ lld_put();
+
+ return dev;
+}
+
+static int get_dynamic_uld_dev_name(struct hinic3_pcidev *dev, enum hinic3_service_type type,
+ char *ifname)
+{
+ u32 out_size = IFNAMSIZ;
+
+ if (!g_uld_info[type].ioctl)
+ return -EFAULT;
+
+ return g_uld_info[type].ioctl(dev->uld_dev[type], GET_ULD_DEV_NAME,
+ NULL, 0, ifname, &out_size);
+}
+
+static bool is_pcidev_match_dev_name(const char *dev_name, struct hinic3_pcidev *dev,
+ enum hinic3_service_type type)
+{
+ enum hinic3_service_type i;
+ char nic_uld_name[IFNAMSIZ] = {0};
+ int err;
+
+ if (type > SERVICE_T_MAX)
+ return false;
+
+ if (type == SERVICE_T_MAX) {
+ for (i = SERVICE_T_OVS; i < SERVICE_T_MAX; i++) {
+ if (!strncmp(dev->uld_dev_name[i], dev_name, IFNAMSIZ))
+ return true;
+ }
+ } else {
+ if (!strncmp(dev->uld_dev_name[type], dev_name, IFNAMSIZ))
+ return true;
+ }
+
+ err = get_dynamic_uld_dev_name(dev, SERVICE_T_NIC, (char *)nic_uld_name);
+ if (err == 0) {
+ if (!strncmp(nic_uld_name, dev_name, IFNAMSIZ))
+ return true;
+ }
+
+ return false;
+}
+
+static struct hinic3_lld_dev *get_lld_dev_by_dev_name(const char *dev_name,
+ enum hinic3_service_type type, bool hold)
+{
+ struct card_node *chip_node = NULL;
+ struct hinic3_pcidev *dev = NULL;
+
+ lld_hold();
+
+ list_for_each_entry(chip_node, &g_hinic3_chip_list, node) {
+ list_for_each_entry(dev, &chip_node->func_list, node) {
+ if (is_pcidev_match_dev_name(dev_name, dev, type)) {
+ if (hold)
+ lld_dev_hold(&dev->lld_dev);
+ lld_put();
+ return &dev->lld_dev;
+ }
+ }
+ }
+
+ lld_put();
+
+ return NULL;
+}
+
+struct hinic3_lld_dev *hinic3_get_lld_dev_by_chip_and_port(const char *chip_name, u8 port_id)
+{
+ struct card_node *chip_node = NULL;
+ struct hinic3_pcidev *dev = NULL;
+
+ lld_hold();
+ list_for_each_entry(chip_node, &g_hinic3_chip_list, node) {
+ list_for_each_entry(dev, &chip_node->func_list, node) {
+ if (hinic3_func_type(dev->hwdev) == TYPE_VF)
+ continue;
+
+ if (hinic3_physical_port_id(dev->hwdev) == port_id &&
+ !strncmp(chip_node->chip_name, chip_name, IFNAMSIZ)) {
+ lld_dev_hold(&dev->lld_dev);
+ lld_put();
+
+ return &dev->lld_dev;
+ }
+ }
+ }
+ lld_put();
+
+ return NULL;
+}
+
+struct hinic3_lld_dev *hinic3_get_lld_dev_by_dev_name(const char *dev_name,
+ enum hinic3_service_type type)
+{
+ return get_lld_dev_by_dev_name(dev_name, type, true);
+}
+EXPORT_SYMBOL(hinic3_get_lld_dev_by_dev_name);
+
+struct hinic3_lld_dev *hinic3_get_lld_dev_by_dev_name_unsafe(const char *dev_name,
+ enum hinic3_service_type type)
+{
+ return get_lld_dev_by_dev_name(dev_name, type, false);
+}
+EXPORT_SYMBOL(hinic3_get_lld_dev_by_dev_name_unsafe);
+
+static void *get_uld_by_lld_dev(struct hinic3_lld_dev *lld_dev, enum hinic3_service_type type,
+ bool hold)
+{
+ struct hinic3_pcidev *dev = NULL;
+ void *uld = NULL;
+
+ if (!lld_dev)
+ return NULL;
+
+ dev = pci_get_drvdata(lld_dev->pdev);
+ if (!dev)
+ return NULL;
+
+ spin_lock_bh(&dev->uld_lock);
+ if (!dev->uld_dev[type] || !test_bit(type, &dev->uld_state)) {
+ spin_unlock_bh(&dev->uld_lock);
+ return NULL;
+ }
+ uld = dev->uld_dev[type];
+
+ if (hold)
+ atomic_inc(&dev->uld_ref_cnt[type]);
+ spin_unlock_bh(&dev->uld_lock);
+
+ return uld;
+}
+
+void *hinic3_get_uld_dev(struct hinic3_lld_dev *lld_dev, enum hinic3_service_type type)
+{
+ return get_uld_by_lld_dev(lld_dev, type, true);
+}
+EXPORT_SYMBOL(hinic3_get_uld_dev);
+
+void *hinic3_get_uld_dev_unsafe(struct hinic3_lld_dev *lld_dev, enum hinic3_service_type type)
+{
+ return get_uld_by_lld_dev(lld_dev, type, false);
+}
+EXPORT_SYMBOL(hinic3_get_uld_dev_unsafe);
+
+static struct hinic3_lld_dev *get_ppf_lld_dev(struct hinic3_lld_dev *lld_dev, bool hold)
+{
+ struct hinic3_pcidev *pci_adapter = NULL;
+ struct card_node *chip_node = NULL;
+ struct hinic3_pcidev *dev = NULL;
+
+ if (!lld_dev)
+ return NULL;
+
+ pci_adapter = pci_get_drvdata(lld_dev->pdev);
+ if (!pci_adapter)
+ return NULL;
+
+ lld_hold();
+ chip_node = pci_adapter->chip_node;
+ list_for_each_entry(dev, &chip_node->func_list, node) {
+ if (dev->hwdev && hinic3_func_type(dev->hwdev) == TYPE_PPF) {
+ if (hold)
+ lld_dev_hold(&dev->lld_dev);
+ lld_put();
+ return &dev->lld_dev;
+ }
+ }
+ lld_put();
+
+ return NULL;
+}
+
+struct hinic3_lld_dev *hinic3_get_ppf_lld_dev(struct hinic3_lld_dev *lld_dev)
+{
+ return get_ppf_lld_dev(lld_dev, true);
+}
+EXPORT_SYMBOL(hinic3_get_ppf_lld_dev);
+
+struct hinic3_lld_dev *hinic3_get_ppf_lld_dev_unsafe(struct hinic3_lld_dev *lld_dev)
+{
+ return get_ppf_lld_dev(lld_dev, false);
+}
+EXPORT_SYMBOL(hinic3_get_ppf_lld_dev_unsafe);
+
+int hinic3_get_chip_name(struct hinic3_lld_dev *lld_dev, char *chip_name, u16 max_len)
+{
+ struct hinic3_pcidev *pci_adapter = NULL;
+
+ if (!lld_dev || !chip_name || !max_len)
+ return -EINVAL;
+
+ pci_adapter = pci_get_drvdata(lld_dev->pdev);
+ if (!pci_adapter)
+ return -EFAULT;
+
+ lld_hold();
+ strncpy(chip_name, pci_adapter->chip_node->chip_name, max_len);
+ chip_name[max_len - 1] = '\0';
+
+ lld_put();
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_get_chip_name);
+
+struct hinic3_hwdev *hinic3_get_sdk_hwdev_by_lld(struct hinic3_lld_dev *lld_dev)
+{
+ return lld_dev->hwdev;
+}
+
+struct card_node *hinic3_get_chip_node_by_lld(struct hinic3_lld_dev *lld_dev)
+{
+ struct hinic3_pcidev *pci_adapter = pci_get_drvdata(lld_dev->pdev);
+
+ return pci_adapter->chip_node;
+}
+
+static struct card_node *hinic3_get_chip_node_by_hwdev(const void *hwdev)
+{
+ struct card_node *chip_node = NULL;
+ struct card_node *node_tmp = NULL;
+ struct hinic3_pcidev *dev = NULL;
+
+ if (!hwdev)
+ return NULL;
+
+ lld_hold();
+
+ list_for_each_entry(node_tmp, &g_hinic3_chip_list, node) {
+ if (!chip_node) {
+ list_for_each_entry(dev, &node_tmp->func_list, node) {
+ if (dev->hwdev == hwdev) {
+ chip_node = node_tmp;
+ break;
+ }
+ }
+ }
+ }
+
+ lld_put();
+
+ return chip_node;
+}
+
+static bool is_func_valid(struct hinic3_pcidev *dev)
+{
+ if (hinic3_func_type(dev->hwdev) == TYPE_VF)
+ return false;
+
+ return true;
+}
+
+void hinic3_get_card_info(const void *hwdev, void *bufin)
+{
+ struct card_node *chip_node = NULL;
+ struct card_info *info = (struct card_info *)bufin;
+ struct hinic3_pcidev *dev = NULL;
+ void *fun_hwdev = NULL;
+ u32 i = 0;
+
+ info->pf_num = 0;
+
+ chip_node = hinic3_get_chip_node_by_hwdev(hwdev);
+ if (!chip_node)
+ return;
+
+ lld_hold();
+
+ list_for_each_entry(dev, &chip_node->func_list, node) {
+ if (!is_func_valid(dev))
+ continue;
+
+ fun_hwdev = dev->hwdev;
+
+ if (hinic3_support_nic(fun_hwdev, NULL)) {
+ if (dev->uld_dev[SERVICE_T_NIC]) {
+ info->pf[i].pf_type |= (u32)BIT(SERVICE_T_NIC);
+ get_dynamic_uld_dev_name(dev, SERVICE_T_NIC, info->pf[i].name);
+ }
+ }
+
+ if (hinic3_support_ppa(fun_hwdev, NULL)) {
+ if (dev->uld_dev[SERVICE_T_PPA]) {
+ info->pf[i].pf_type |= (u32)BIT(SERVICE_T_PPA);
+ get_dynamic_uld_dev_name(dev, SERVICE_T_PPA, info->pf[i].name);
+ }
+ }
+
+ if (hinic3_func_for_mgmt(fun_hwdev))
+ strlcpy(info->pf[i].name, "FOR_MGMT", IFNAMSIZ);
+
+ strlcpy(info->pf[i].bus_info, pci_name(dev->pcidev),
+ sizeof(info->pf[i].bus_info));
+ info->pf_num++;
+ i = info->pf_num;
+ }
+
+ lld_put();
+}
+
+struct hinic3_sriov_info *hinic3_get_sriov_info_by_pcidev(struct pci_dev *pdev)
+{
+ struct hinic3_pcidev *pci_adapter = NULL;
+
+ if (!pdev)
+ return NULL;
+
+ pci_adapter = pci_get_drvdata(pdev);
+ if (!pci_adapter)
+ return NULL;
+
+ return &pci_adapter->sriov_info;
+}
+
+void *hinic3_get_hwdev_by_pcidev(struct pci_dev *pdev)
+{
+ struct hinic3_pcidev *pci_adapter = NULL;
+
+ if (!pdev)
+ return NULL;
+
+ pci_adapter = pci_get_drvdata(pdev);
+ if (!pci_adapter)
+ return NULL;
+
+ return pci_adapter->hwdev;
+}
+
+bool hinic3_is_in_host(void)
+{
+ struct card_node *chip_node = NULL;
+ struct hinic3_pcidev *dev = NULL;
+
+ lld_hold();
+ list_for_each_entry(chip_node, &g_hinic3_chip_list, node) {
+ list_for_each_entry(dev, &chip_node->func_list, node) {
+ if (hinic3_func_type(dev->hwdev) != TYPE_VF) {
+ lld_put();
+ return true;
+ }
+ }
+ }
+
+ lld_put();
+
+ return false;
+}
+
+static bool chip_node_is_exist(struct hinic3_pcidev *pci_adapter,
+ unsigned char *bus_number)
+{
+ struct card_node *chip_node = NULL;
+ struct pci_dev *pf_pdev = NULL;
+
+ if (!pci_is_root_bus(pci_adapter->pcidev->bus))
+ *bus_number = pci_adapter->pcidev->bus->number;
+
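+	/* a VF shares its parent PF's card, so match using the PF's bus number */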
+ if (*bus_number != 0) {
+ if (pci_adapter->pcidev->is_virtfn) {
+ pf_pdev = pci_adapter->pcidev->physfn;
+ *bus_number = pf_pdev->bus->number;
+ }
+
+ list_for_each_entry(chip_node, &g_hinic3_chip_list, node) {
+ if (chip_node->bus_num == *bus_number) {
+ pci_adapter->chip_node = chip_node;
+ return true;
+ }
+ }
+ } else if (HINIC3_IS_VF_DEV(pci_adapter->pcidev) ||
+ HINIC3_IS_SPU_DEV(pci_adapter->pcidev)) {
+		/* a VF or SPU device joins the first card on the global list */
+		list_for_each_entry(chip_node, &g_hinic3_chip_list, node) {
+			pci_adapter->chip_node = chip_node;
+			return true;
+		}
+ }
+
+ return false;
+}
+
+int alloc_chip_node(struct hinic3_pcidev *pci_adapter)
+{
+ struct card_node *chip_node = NULL;
+ unsigned char i;
+ unsigned char bus_number = 0;
+
+ if (chip_node_is_exist(pci_adapter, &bus_number))
+ return 0;
+
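+	/* reserve the first free card id from the global bitmap */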
+ for (i = 0; i < CARD_MAX_SIZE; i++) {
+ if (test_and_set_bit(i, &card_bit_map) == 0)
+ break;
+ }
+
+ if (i == CARD_MAX_SIZE) {
+ sdk_err(&pci_adapter->pcidev->dev, "Failed to alloc card id\n");
+ return -EFAULT;
+ }
+
+ chip_node = kzalloc(sizeof(*chip_node), GFP_KERNEL);
+ if (!chip_node) {
+ clear_bit(i, &card_bit_map);
+ sdk_err(&pci_adapter->pcidev->dev,
+ "Failed to alloc chip node\n");
+ return -ENOMEM;
+ }
+
+ /* bus number */
+ chip_node->bus_num = bus_number;
+
+	if (snprintf(chip_node->chip_name, IFNAMSIZ, "%s%u", HINIC3_CHIP_NAME, i) >= IFNAMSIZ) {
+ clear_bit(i, &card_bit_map);
+ kfree(chip_node);
+ return -EINVAL;
+ }
+
+ sdk_info(&pci_adapter->pcidev->dev,
+		 "Added new chip %s to global list\n",
+ chip_node->chip_name);
+
+ list_add_tail(&chip_node->node, &g_hinic3_chip_list);
+
+ INIT_LIST_HEAD(&chip_node->func_list);
+ pci_adapter->chip_node = chip_node;
+
+ return 0;
+}
+
+void free_chip_node(struct hinic3_pcidev *pci_adapter)
+{
+ struct card_node *chip_node = pci_adapter->chip_node;
+ int id, err;
+
+ if (list_empty(&chip_node->func_list)) {
+ list_del(&chip_node->node);
+ sdk_info(&pci_adapter->pcidev->dev,
+			 "Deleted chip %s from global list\n",
+ chip_node->chip_name);
+		/* sscanf returns the number of fields matched, so expect exactly 1 */
+		err = sscanf(chip_node->chip_name, HINIC3_CHIP_NAME "%d", &id);
+		if (err == 1)
+			clear_bit(id, &card_bit_map);
+		else
+			sdk_err(&pci_adapter->pcidev->dev, "Failed to get hinic3 id\n");
+
+ kfree(chip_node);
+ }
+}
+
+int hinic3_get_pf_id(struct card_node *chip_node, u32 port_id, u32 *pf_id, u32 *isvalid)
+{
+ struct hinic3_pcidev *dev = NULL;
+
+ lld_hold();
+ list_for_each_entry(dev, &chip_node->func_list, node) {
+ if (hinic3_func_type(dev->hwdev) == TYPE_VF)
+ continue;
+
+ if (hinic3_physical_port_id(dev->hwdev) == port_id) {
+ *pf_id = hinic3_global_func_id(dev->hwdev);
+ *isvalid = 1;
+ break;
+ }
+ }
+ lld_put();
+
+ return 0;
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_dev_mgmt.h b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_dev_mgmt.h
new file mode 100644
index 000000000000..0b7bf8e18732
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_dev_mgmt.h
@@ -0,0 +1,105 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_DEV_MGMT_H
+#define HINIC3_DEV_MGMT_H
+#include <linux/types.h>
+#include <linux/bitops.h>
+
+#include "hinic3_sriov.h"
+#include "hinic3_lld.h"
+
+#define HINIC3_VF_PCI_CFG_REG_BAR 0
+#define HINIC3_PF_PCI_CFG_REG_BAR 1
+
+#define HINIC3_PCI_INTR_REG_BAR 2
+#define HINIC3_PCI_MGMT_REG_BAR		3 /* Only the PF has a mgmt BAR */
+#define HINIC3_PCI_DB_BAR 4
+
+#define PRINT_ULD_DETACH_TIMEOUT_INTERVAL 1000 /* 1 second */
+#define ULD_LOCK_MIN_USLEEP_TIME 900
+#define ULD_LOCK_MAX_USLEEP_TIME 1000
+
+#define HINIC3_IS_VF_DEV(pdev) ((pdev)->device == HINIC3_DEV_ID_VF)
+#define HINIC3_IS_SPU_DEV(pdev) ((pdev)->device == HINIC3_DEV_ID_SPU)
+
+enum {
+ HINIC3_NOT_PROBE = 1,
+ HINIC3_PROBE_START = 2,
+ HINIC3_PROBE_OK = 3,
+ HINIC3_IN_REMOVE = 4,
+};
+
+/* Structure pcidev private */
+struct hinic3_pcidev {
+ struct pci_dev *pcidev;
+ void *hwdev;
+ struct card_node *chip_node;
+ struct hinic3_lld_dev lld_dev;
+	/* Records the addresses of the service objects,
+	 * such as hinic3_dev, toe_dev and fc_dev
+	 */
+ void *uld_dev[SERVICE_T_MAX];
+ /* Record the service object name */
+ char uld_dev_name[SERVICE_T_MAX][IFNAMSIZ];
+	/* Node used to link this function into the
+	 * card's function device list
+	 */
+ struct list_head node;
+
+ bool disable_vf_load;
+ bool disable_srv_load[SERVICE_T_MAX];
+
+ void __iomem *cfg_reg_base;
+ void __iomem *intr_reg_base;
+ void __iomem *mgmt_reg_base;
+ u64 db_dwqe_len;
+ u64 db_base_phy;
+ void __iomem *db_base;
+
+ /* lock for attach/detach uld */
+ struct mutex pdev_mutex;
+ int lld_state;
+ u32 rsvd1;
+
+ struct hinic3_sriov_info sriov_info;
+
+	/* set while a ULD driver is processing an event */
+ unsigned long state;
+ struct pci_device_id id;
+
+ atomic_t ref_cnt;
+
+ atomic_t uld_ref_cnt[SERVICE_T_MAX];
+ unsigned long uld_state;
+ spinlock_t uld_lock;
+
+ u16 probe_fault_level;
+ u16 rsvd2;
+ u64 rsvd4;
+};
+
+struct hinic_chip_info {
+ u8 chip_id; /* chip id within card */
+ u8 card_type; /* hinic_multi_chip_card_type */
+ u8 rsvd[10]; /* reserved 10 bytes */
+};
+
+struct list_head *get_hinic3_chip_list(void);
+
+int alloc_chip_node(struct hinic3_pcidev *pci_adapter);
+
+void free_chip_node(struct hinic3_pcidev *pci_adapter);
+
+void lld_lock_chip_node(void);
+
+void lld_unlock_chip_node(void);
+
+void hinic3_lld_lock_init(void);
+
+void lld_dev_cnt_init(struct hinic3_pcidev *pci_adapter);
+void wait_lld_dev_unused(struct hinic3_pcidev *pci_adapter);
+
+void *hinic3_get_hwdev_by_pcidev(struct pci_dev *pdev);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_devlink.c b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_devlink.c
new file mode 100644
index 000000000000..1949ab879cbc
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_devlink.c
@@ -0,0 +1,437 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [COMM]" fmt
+
+#include <linux/netlink.h>
+#include <linux/pci.h>
+#include <linux/firmware.h>
+
+#include "hinic3_devlink.h"
+#ifdef HAVE_DEVLINK_FLASH_UPDATE_PARAMS
+#include "hinic3_common.h"
+#include "hinic3_api_cmd.h"
+#include "hinic3_mgmt.h"
+#include "hinic3_hw.h"
+
+static bool check_image_valid(struct hinic3_hwdev *hwdev, const u8 *buf,
+ u32 size, struct host_image *host_image)
+{
+ struct firmware_image *fw_image = NULL;
+ u32 len = 0;
+ u32 i;
+
+ fw_image = (struct firmware_image *)buf;
+ if (fw_image->fw_magic != FW_MAGIC_NUM) {
+ sdk_err(hwdev->dev_hdl, "Wrong fw magic read from file, fw_magic: 0x%x\n",
+ fw_image->fw_magic);
+ return false;
+ }
+
+ if (fw_image->fw_info.section_cnt > FW_TYPE_MAX_NUM) {
+ sdk_err(hwdev->dev_hdl, "Wrong fw type number read from file, fw_type_num: 0x%x\n",
+ fw_image->fw_info.section_cnt);
+ return false;
+ }
+
+ for (i = 0; i < fw_image->fw_info.section_cnt; i++) {
+ len += fw_image->section_info[i].section_len;
+ memcpy(&host_image->section_info[i], &fw_image->section_info[i],
+ sizeof(struct firmware_section));
+ }
+
+ if (len != fw_image->fw_len ||
+ (u32)(fw_image->fw_len + FW_IMAGE_HEAD_SIZE) != size) {
+ sdk_err(hwdev->dev_hdl, "Wrong data size read from file\n");
+ return false;
+ }
+
+ host_image->image_info.total_len = fw_image->fw_len;
+ host_image->image_info.fw_version = fw_image->fw_version;
+ host_image->type_num = fw_image->fw_info.section_cnt;
+ host_image->device_id = fw_image->device_id;
+
+ return true;
+}
+
+static bool check_image_integrity(struct hinic3_hwdev *hwdev, struct host_image *host_image)
+{
+ u64 collect_section_type = 0;
+ u32 type, i;
+
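+	/* each section type may appear at most once in the image */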
+ for (i = 0; i < host_image->type_num; i++) {
+ type = host_image->section_info[i].section_type;
+ if (collect_section_type & (1ULL << type)) {
+ sdk_err(hwdev->dev_hdl, "Duplicate section type: %u\n", type);
+ return false;
+ }
+ collect_section_type |= (1ULL << type);
+ }
+
+ if ((collect_section_type & IMAGE_COLD_SUB_MODULES_MUST_IN) ==
+ IMAGE_COLD_SUB_MODULES_MUST_IN &&
+ (collect_section_type & IMAGE_CFG_SUB_MODULES_MUST_IN) != 0)
+ return true;
+
+ sdk_err(hwdev->dev_hdl, "Failed to check file integrity, valid: 0x%llx, current: 0x%llx\n",
+ (IMAGE_COLD_SUB_MODULES_MUST_IN | IMAGE_CFG_SUB_MODULES_MUST_IN),
+ collect_section_type);
+
+ return false;
+}
+
+static bool check_image_device_type(struct hinic3_hwdev *hwdev, u32 device_type)
+{
+ struct comm_cmd_board_info board_info;
+
+ memset(&board_info, 0, sizeof(board_info));
+ if (hinic3_get_board_info(hwdev, &board_info.info, HINIC3_CHANNEL_COMM)) {
+ sdk_err(hwdev->dev_hdl, "Failed to get board info\n");
+ return false;
+ }
+
+ if (device_type == board_info.info.board_type)
+ return true;
+
+ sdk_err(hwdev->dev_hdl, "The image device type: 0x%x doesn't match the firmware device type: 0x%x\n",
+ device_type, board_info.info.board_type);
+
+ return false;
+}
+
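+/* Build one UPDATE_FW request: "sf" marks the first fragment of a section,
+ * "sl" marks the last fragment, and each fragment carries at most
+ * FW_FRAGMENT_MAX_LEN bytes of section data.
+ */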
+static void encapsulate_update_cmd(struct hinic3_cmd_update_firmware *msg,
+ struct firmware_section *section_info,
+ int *remain_len, u32 *send_len, u32 *send_pos)
+{
+ memset(msg->data, 0, sizeof(msg->data));
+	msg->ctl_info.sf = (*remain_len == section_info->section_len);
+ msg->section_info.section_crc = section_info->section_crc;
+ msg->section_info.section_type = section_info->section_type;
+ msg->section_version = section_info->section_version;
+ msg->section_len = section_info->section_len;
+ msg->section_offset = *send_pos;
+ msg->ctl_info.bit_signed = section_info->section_flag & 0x1;
+
+ if (*remain_len <= FW_FRAGMENT_MAX_LEN) {
+ msg->ctl_info.sl = true;
+ msg->ctl_info.fragment_len = (u32)(*remain_len);
+ *send_len += section_info->section_len;
+ } else {
+ msg->ctl_info.sl = false;
+ msg->ctl_info.fragment_len = FW_FRAGMENT_MAX_LEN;
+ *send_len += FW_FRAGMENT_MAX_LEN;
+ }
+}
+
+static int hinic3_flash_firmware(struct hinic3_hwdev *hwdev, const u8 *data,
+ struct host_image *image)
+{
+ u32 send_pos, send_len, section_offset, i;
+ struct hinic3_cmd_update_firmware *update_msg = NULL;
+ u16 out_size = sizeof(*update_msg);
+ bool total_flag = false;
+ int remain_len, err;
+
+ update_msg = kzalloc(sizeof(*update_msg), GFP_KERNEL);
+ if (!update_msg) {
+ sdk_err(hwdev->dev_hdl, "Failed to alloc update message\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < image->type_num; i++) {
+ section_offset = image->section_info[i].section_offset;
+ remain_len = (int)(image->section_info[i].section_len);
+ send_len = 0;
+ send_pos = 0;
+
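+		/* push the section to the management CPU one fragment at a time */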
+ while (remain_len > 0) {
+ if (!total_flag) {
+ update_msg->total_len = image->image_info.total_len;
+ total_flag = true;
+ } else {
+ update_msg->total_len = 0;
+ }
+
+ encapsulate_update_cmd(update_msg, &image->section_info[i],
+ &remain_len, &send_len, &send_pos);
+
+ memcpy(update_msg->data,
+ ((data + FW_IMAGE_HEAD_SIZE) + section_offset) + send_pos,
+ update_msg->ctl_info.fragment_len);
+
+ err = hinic3_pf_to_mgmt_sync(hwdev, HINIC3_MOD_COMM,
+ COMM_MGMT_CMD_UPDATE_FW,
+ update_msg, sizeof(*update_msg),
+ update_msg, &out_size,
+ FW_UPDATE_MGMT_TIMEOUT);
+ if (err || !out_size || update_msg->msg_head.status) {
+ sdk_err(hwdev->dev_hdl, "Failed to update firmware, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, update_msg->msg_head.status, out_size);
+ err = update_msg->msg_head.status ?
+ update_msg->msg_head.status : -EIO;
+ kfree(update_msg);
+ return err;
+ }
+
+ send_pos = send_len;
+ remain_len = (int)(image->section_info[i].section_len - send_len);
+ }
+ }
+
+ kfree(update_msg);
+
+ return 0;
+}
+
+static int hinic3_flash_update_notify(struct devlink *devlink, const struct firmware *fw,
+ struct host_image *image, struct netlink_ext_ack *extack)
+{
+ struct hinic3_devlink *devlink_dev = devlink_priv(devlink);
+ struct hinic3_hwdev *hwdev = devlink_dev->hwdev;
+ int err;
+
+#ifdef HAVE_DEVLINK_FW_FILE_NAME_MEMBER
+ devlink_flash_update_begin_notify(devlink);
+#endif
+ devlink_flash_update_status_notify(devlink, "Flash firmware begin", NULL, 0, 0);
+ sdk_info(hwdev->dev_hdl, "Flash firmware begin\n");
+ err = hinic3_flash_firmware(hwdev, fw->data, image);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to flash firmware, err: %d\n", err);
+ NL_SET_ERR_MSG_MOD(extack, "Flash firmware failed");
+ devlink_flash_update_status_notify(devlink, "Flash firmware failed", NULL, 0, 0);
+ } else {
+ sdk_info(hwdev->dev_hdl, "Flash firmware end\n");
+ devlink_flash_update_status_notify(devlink, "Flash firmware end", NULL, 0, 0);
+ }
+#ifdef HAVE_DEVLINK_FW_FILE_NAME_MEMBER
+ devlink_flash_update_end_notify(devlink);
+#endif
+
+ return err;
+}
+
+#ifdef HAVE_DEVLINK_FW_FILE_NAME_PARAM
+static int hinic3_devlink_flash_update(struct devlink *devlink, const char *file_name,
+ const char *component, struct netlink_ext_ack *extack)
+#else
+static int hinic3_devlink_flash_update(struct devlink *devlink,
+ struct devlink_flash_update_params *params,
+ struct netlink_ext_ack *extack)
+#endif
+{
+ struct hinic3_devlink *devlink_dev = devlink_priv(devlink);
+ struct hinic3_hwdev *hwdev = devlink_dev->hwdev;
+#ifdef HAVE_DEVLINK_FW_FILE_NAME_MEMBER
+ const struct firmware *fw = NULL;
+#else
+ const struct firmware *fw = params->fw;
+#endif
+ struct host_image *image = NULL;
+ int err;
+
+ image = kzalloc(sizeof(*image), GFP_KERNEL);
+ if (!image) {
+ sdk_err(hwdev->dev_hdl, "Failed to alloc host image\n");
+ err = -ENOMEM;
+ goto devlink_param_reset;
+ }
+
+#ifdef HAVE_DEVLINK_FW_FILE_NAME_MEMBER
+#ifdef HAVE_DEVLINK_FW_FILE_NAME_PARAM
+ err = request_firmware_direct(&fw, file_name, hwdev->dev_hdl);
+#else
+ err = request_firmware_direct(&fw, params->file_name, hwdev->dev_hdl);
+#endif
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to request firmware\n");
+ goto devlink_request_fw_err;
+ }
+#endif
+
+ if (!check_image_valid(hwdev, fw->data, (u32)(fw->size), image) ||
+ !check_image_integrity(hwdev, image) ||
+ !check_image_device_type(hwdev, image->device_id)) {
+ sdk_err(hwdev->dev_hdl, "Failed to check image\n");
+ NL_SET_ERR_MSG_MOD(extack, "Check image failed");
+ err = -EINVAL;
+ goto devlink_update_out;
+ }
+
+ err = hinic3_flash_update_notify(devlink, fw, image, extack);
+
+devlink_update_out:
+#ifdef HAVE_DEVLINK_FW_FILE_NAME_MEMBER
+ release_firmware(fw);
+
+devlink_request_fw_err:
+#endif
+ kfree(image);
+
+devlink_param_reset:
+ /* reset activate_fw and switch_cfg after flash update operation */
+ devlink_dev->activate_fw = FW_CFG_DEFAULT_INDEX;
+ devlink_dev->switch_cfg = FW_CFG_DEFAULT_INDEX;
+
+ return err;
+}
+
+static const struct devlink_ops hinic3_devlink_ops = {
+ .flash_update = hinic3_devlink_flash_update,
+};
+
+static int hinic3_devlink_get_activate_firmware_config(struct devlink *devlink, u32 id,
+ struct devlink_param_gset_ctx *ctx)
+{
+ struct hinic3_devlink *devlink_dev = devlink_priv(devlink);
+
+ ctx->val.vu8 = devlink_dev->activate_fw;
+
+ return 0;
+}
+
+static int hinic3_devlink_set_activate_firmware_config(struct devlink *devlink, u32 id,
+ struct devlink_param_gset_ctx *ctx)
+{
+ struct hinic3_devlink *devlink_dev = devlink_priv(devlink);
+ struct hinic3_hwdev *hwdev = devlink_dev->hwdev;
+ int err;
+
+ devlink_dev->activate_fw = ctx->val.vu8;
+ sdk_info(hwdev->dev_hdl, "Activate firmware begin\n");
+
+ err = hinic3_activate_firmware(hwdev, devlink_dev->activate_fw);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to activate firmware, err: %d\n", err);
+ return err;
+ }
+
+ sdk_info(hwdev->dev_hdl, "Activate firmware end\n");
+
+ return 0;
+}
+
+static int hinic3_devlink_get_switch_config(struct devlink *devlink, u32 id,
+ struct devlink_param_gset_ctx *ctx)
+{
+ struct hinic3_devlink *devlink_dev = devlink_priv(devlink);
+
+ ctx->val.vu8 = devlink_dev->switch_cfg;
+
+ return 0;
+}
+
+static int hinic3_devlink_set_switch_config(struct devlink *devlink, u32 id,
+ struct devlink_param_gset_ctx *ctx)
+{
+ struct hinic3_devlink *devlink_dev = devlink_priv(devlink);
+ struct hinic3_hwdev *hwdev = devlink_dev->hwdev;
+ int err;
+
+ devlink_dev->switch_cfg = ctx->val.vu8;
+ sdk_info(hwdev->dev_hdl, "Switch cfg begin");
+
+ err = hinic3_switch_config(hwdev, devlink_dev->switch_cfg);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to switch cfg, err: %d\n", err);
+ return err;
+ }
+
+ sdk_info(hwdev->dev_hdl, "Switch cfg end\n");
+
+ return 0;
+}
+
+static int hinic3_devlink_firmware_config_validate(struct devlink *devlink, u32 id,
+ union devlink_param_value val,
+ struct netlink_ext_ack *extack)
+{
+ struct hinic3_devlink *devlink_dev = devlink_priv(devlink);
+ struct hinic3_hwdev *hwdev = devlink_dev->hwdev;
+ u8 cfg_index = val.vu8;
+
+ if (cfg_index > FW_CFG_MAX_INDEX) {
+ sdk_err(hwdev->dev_hdl, "Firmware cfg index out of range [0,7]\n");
+ NL_SET_ERR_MSG_MOD(extack, "Firmware cfg index out of range [0,7]");
+ return -ERANGE;
+ }
+
+ return 0;
+}
+
+static const struct devlink_param hinic3_devlink_params[] = {
+ DEVLINK_PARAM_DRIVER(HINIC3_DEVLINK_PARAM_ID_ACTIVATE_FW,
+ "activate_fw", DEVLINK_PARAM_TYPE_U8,
+ BIT(DEVLINK_PARAM_CMODE_PERMANENT),
+ hinic3_devlink_get_activate_firmware_config,
+ hinic3_devlink_set_activate_firmware_config,
+ hinic3_devlink_firmware_config_validate),
+ DEVLINK_PARAM_DRIVER(HINIC3_DEVLINK_PARAM_ID_SWITCH_CFG,
+ "switch_cfg", DEVLINK_PARAM_TYPE_U8,
+ BIT(DEVLINK_PARAM_CMODE_PERMANENT),
+ hinic3_devlink_get_switch_config,
+ hinic3_devlink_set_switch_config,
+ hinic3_devlink_firmware_config_validate),
+};
+
+int hinic3_init_devlink(struct hinic3_hwdev *hwdev)
+{
+ struct devlink *devlink = NULL;
+ struct pci_dev *pdev = NULL;
+ int err;
+
+ devlink = devlink_alloc(&hinic3_devlink_ops, sizeof(struct hinic3_devlink));
+ if (!devlink) {
+ sdk_err(hwdev->dev_hdl, "Failed to alloc devlink\n");
+ return -ENOMEM;
+ }
+
+ hwdev->devlink_dev = devlink_priv(devlink);
+ hwdev->devlink_dev->hwdev = hwdev;
+ hwdev->devlink_dev->activate_fw = FW_CFG_DEFAULT_INDEX;
+ hwdev->devlink_dev->switch_cfg = FW_CFG_DEFAULT_INDEX;
+
+ pdev = hwdev->hwif->pdev;
+ err = devlink_register(devlink, &pdev->dev);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to register devlink\n");
+ goto register_devlink_err;
+ }
+
+ err = devlink_params_register(devlink, hinic3_devlink_params,
+ ARRAY_SIZE(hinic3_devlink_params));
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to register devlink params\n");
+ goto register_devlink_params_err;
+ }
+
+ devlink_params_publish(devlink);
+
+ return 0;
+
+register_devlink_params_err:
+ devlink_unregister(devlink);
+
+register_devlink_err:
+ devlink_free(devlink);
+
+	return err;
+}
+
+void hinic3_uninit_devlink(struct hinic3_hwdev *hwdev)
+{
+ struct devlink *devlink = priv_to_devlink(hwdev->devlink_dev);
+
+ devlink_params_unpublish(devlink);
+ devlink_params_unregister(devlink, hinic3_devlink_params,
+ ARRAY_SIZE(hinic3_devlink_params));
+ devlink_unregister(devlink);
+ devlink_free(devlink);
+}
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_devlink.h b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_devlink.h
new file mode 100644
index 000000000000..0b5a086358b9
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_devlink.h
@@ -0,0 +1,149 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_DEVLINK_H
+#define HINIC3_DEVLINK_H
+
+#include "ossl_knl.h"
+#include "hinic3_hwdev.h"
+
+#define FW_MAGIC_NUM 0x5a5a1100
+#define FW_IMAGE_HEAD_SIZE 4096
+#define FW_FRAGMENT_MAX_LEN 1536
+#define FW_CFG_DEFAULT_INDEX 0xFF
+#define FW_TYPE_MAX_NUM 0x40
+#define FW_CFG_MAX_INDEX 7
+
+#ifdef HAVE_DEVLINK_FLASH_UPDATE_PARAMS
+enum hinic3_devlink_param_id {
+ HINIC3_DEVLINK_PARAM_ID_BASE = DEVLINK_PARAM_GENERIC_ID_MAX,
+ HINIC3_DEVLINK_PARAM_ID_ACTIVATE_FW,
+ HINIC3_DEVLINK_PARAM_ID_SWITCH_CFG,
+};
+#endif
+
+enum hinic3_firmware_type {
+ UP_FW_UPDATE_MIN_TYPE1 = 0x0,
+ UP_FW_UPDATE_UP_TEXT = 0x0,
+ UP_FW_UPDATE_UP_DATA = 0x1,
+ UP_FW_UPDATE_UP_DICT = 0x2,
+ UP_FW_UPDATE_TILE_PCPTR = 0x3,
+ UP_FW_UPDATE_TILE_TEXT = 0x4,
+ UP_FW_UPDATE_TILE_DATA = 0x5,
+ UP_FW_UPDATE_TILE_DICT = 0x6,
+ UP_FW_UPDATE_PPE_STATE = 0x7,
+ UP_FW_UPDATE_PPE_BRANCH = 0x8,
+ UP_FW_UPDATE_PPE_EXTACT = 0x9,
+ UP_FW_UPDATE_MAX_TYPE1 = 0x9,
+ UP_FW_UPDATE_CFG0 = 0xa,
+ UP_FW_UPDATE_CFG1 = 0xb,
+ UP_FW_UPDATE_CFG2 = 0xc,
+ UP_FW_UPDATE_CFG3 = 0xd,
+ UP_FW_UPDATE_MAX_TYPE1_CFG = 0xd,
+
+ UP_FW_UPDATE_MIN_TYPE2 = 0x14,
+ UP_FW_UPDATE_MAX_TYPE2 = 0x14,
+
+ UP_FW_UPDATE_MIN_TYPE3 = 0x18,
+ UP_FW_UPDATE_PHY = 0x18,
+ UP_FW_UPDATE_BIOS = 0x19,
+ UP_FW_UPDATE_HLINK_ONE = 0x1a,
+ UP_FW_UPDATE_HLINK_TWO = 0x1b,
+ UP_FW_UPDATE_HLINK_THR = 0x1c,
+ UP_FW_UPDATE_MAX_TYPE3 = 0x1c,
+
+ UP_FW_UPDATE_MIN_TYPE4 = 0x20,
+ UP_FW_UPDATE_L0FW = 0x20,
+ UP_FW_UPDATE_L1FW = 0x21,
+ UP_FW_UPDATE_BOOT = 0x22,
+ UP_FW_UPDATE_SEC_DICT = 0x23,
+ UP_FW_UPDATE_HOT_PATCH0 = 0x24,
+ UP_FW_UPDATE_HOT_PATCH1 = 0x25,
+ UP_FW_UPDATE_HOT_PATCH2 = 0x26,
+ UP_FW_UPDATE_HOT_PATCH3 = 0x27,
+ UP_FW_UPDATE_HOT_PATCH4 = 0x28,
+ UP_FW_UPDATE_HOT_PATCH5 = 0x29,
+ UP_FW_UPDATE_HOT_PATCH6 = 0x2a,
+ UP_FW_UPDATE_HOT_PATCH7 = 0x2b,
+ UP_FW_UPDATE_HOT_PATCH8 = 0x2c,
+ UP_FW_UPDATE_HOT_PATCH9 = 0x2d,
+ UP_FW_UPDATE_HOT_PATCH10 = 0x2e,
+ UP_FW_UPDATE_HOT_PATCH11 = 0x2f,
+ UP_FW_UPDATE_HOT_PATCH12 = 0x30,
+ UP_FW_UPDATE_HOT_PATCH13 = 0x31,
+ UP_FW_UPDATE_HOT_PATCH14 = 0x32,
+ UP_FW_UPDATE_HOT_PATCH15 = 0x33,
+ UP_FW_UPDATE_HOT_PATCH16 = 0x34,
+ UP_FW_UPDATE_HOT_PATCH17 = 0x35,
+ UP_FW_UPDATE_HOT_PATCH18 = 0x36,
+ UP_FW_UPDATE_HOT_PATCH19 = 0x37,
+ UP_FW_UPDATE_MAX_TYPE4 = 0x37,
+
+ UP_FW_UPDATE_MIN_TYPE5 = 0x3a,
+ UP_FW_UPDATE_OPTION_ROM = 0x3a,
+ UP_FW_UPDATE_MAX_TYPE5 = 0x3a,
+
+ UP_FW_UPDATE_MIN_TYPE6 = 0x3e,
+ UP_FW_UPDATE_MAX_TYPE6 = 0x3e,
+
+ UP_FW_UPDATE_MIN_TYPE7 = 0x40,
+ UP_FW_UPDATE_MAX_TYPE7 = 0x40,
+};
+
+#define IMAGE_MPU_ALL_IN (BIT_ULL(UP_FW_UPDATE_UP_TEXT) | \
+ BIT_ULL(UP_FW_UPDATE_UP_DATA) | \
+ BIT_ULL(UP_FW_UPDATE_UP_DICT))
+
+#define IMAGE_NPU_ALL_IN (BIT_ULL(UP_FW_UPDATE_TILE_PCPTR) | \
+ BIT_ULL(UP_FW_UPDATE_TILE_TEXT) | \
+ BIT_ULL(UP_FW_UPDATE_TILE_DATA) | \
+ BIT_ULL(UP_FW_UPDATE_TILE_DICT) | \
+ BIT_ULL(UP_FW_UPDATE_PPE_STATE) | \
+ BIT_ULL(UP_FW_UPDATE_PPE_BRANCH) | \
+ BIT_ULL(UP_FW_UPDATE_PPE_EXTACT))
+
+#define IMAGE_COLD_SUB_MODULES_MUST_IN (IMAGE_MPU_ALL_IN | IMAGE_NPU_ALL_IN)
+
+#define IMAGE_CFG_SUB_MODULES_MUST_IN (BIT_ULL(UP_FW_UPDATE_CFG0) | \
+ BIT_ULL(UP_FW_UPDATE_CFG1) | \
+ BIT_ULL(UP_FW_UPDATE_CFG2) | \
+ BIT_ULL(UP_FW_UPDATE_CFG3))
+
+struct firmware_section {
+ u32 section_len;
+ u32 section_offset;
+ u32 section_version;
+ u32 section_type;
+ u32 section_crc;
+ u32 section_flag;
+};
+
+struct firmware_image {
+ u32 fw_version;
+ u32 fw_len;
+ u32 fw_magic;
+ struct {
+ u32 section_cnt : 16;
+ u32 rsvd : 16;
+ } fw_info;
+ struct firmware_section section_info[FW_TYPE_MAX_NUM];
+ u32 device_id; /* cfg fw board_type value */
+	u32 rsvd0[101]; /* device_id and rsvd0[101] form the update_head_extend_info */
+ u32 rsvd1[534]; /* big bin file total size 4096B */
+ u32 bin_data; /* obtain the address for use */
+};
+
+struct host_image {
+ struct firmware_section section_info[FW_TYPE_MAX_NUM];
+ struct {
+ u32 total_len;
+ u32 fw_version;
+ } image_info;
+ u32 type_num;
+ u32 device_id;
+};
+
+int hinic3_init_devlink(struct hinic3_hwdev *hwdev);
+void hinic3_uninit_devlink(struct hinic3_hwdev *hwdev);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_eqs.c b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_eqs.c
new file mode 100644
index 000000000000..2638dd1865d7
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_eqs.c
@@ -0,0 +1,1386 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [COMM]" fmt
+
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/interrupt.h>
+#include <linux/workqueue.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/dma-mapping.h>
+#include <linux/module.h>
+#include <linux/spinlock.h>
+
+#include "ossl_knl.h"
+#include "hinic3_crm.h"
+#include "hinic3_hw.h"
+#include "hinic3_common.h"
+#include "hinic3_hwdev.h"
+#include "hinic3_hwif.h"
+#include "hinic3_hw.h"
+#include "hinic3_csr.h"
+#include "hinic3_hw_comm.h"
+#include "hinic3_prof_adap.h"
+#include "hinic3_eqs.h"
+
+#define HINIC3_EQS_WQ_NAME "hinic3_eqs"
+
+#define AEQ_CTRL_0_INTR_IDX_SHIFT 0
+#define AEQ_CTRL_0_DMA_ATTR_SHIFT 12
+#define AEQ_CTRL_0_PCI_INTF_IDX_SHIFT 20
+#define AEQ_CTRL_0_INTR_MODE_SHIFT 31
+
+#define AEQ_CTRL_0_INTR_IDX_MASK 0x3FFU
+#define AEQ_CTRL_0_DMA_ATTR_MASK 0x3FU
+#define AEQ_CTRL_0_PCI_INTF_IDX_MASK 0x7U
+#define AEQ_CTRL_0_INTR_MODE_MASK 0x1U
+
+#define AEQ_CTRL_0_SET(val, member) \
+ (((val) & AEQ_CTRL_0_##member##_MASK) << \
+ AEQ_CTRL_0_##member##_SHIFT)
+
+#define AEQ_CTRL_0_CLEAR(val, member) \
+ ((val) & (~(AEQ_CTRL_0_##member##_MASK << \
+ AEQ_CTRL_0_##member##_SHIFT)))
+
+#define AEQ_CTRL_1_LEN_SHIFT 0
+#define AEQ_CTRL_1_ELEM_SIZE_SHIFT 24
+#define AEQ_CTRL_1_PAGE_SIZE_SHIFT 28
+
+#define AEQ_CTRL_1_LEN_MASK 0x1FFFFFU
+#define AEQ_CTRL_1_ELEM_SIZE_MASK 0x3U
+#define AEQ_CTRL_1_PAGE_SIZE_MASK 0xFU
+
+#define AEQ_CTRL_1_SET(val, member) \
+ (((val) & AEQ_CTRL_1_##member##_MASK) << \
+ AEQ_CTRL_1_##member##_SHIFT)
+
+#define AEQ_CTRL_1_CLEAR(val, member) \
+ ((val) & (~(AEQ_CTRL_1_##member##_MASK << \
+ AEQ_CTRL_1_##member##_SHIFT)))
+
+#define HINIC3_EQ_PROD_IDX_MASK 0xFFFFF
+#define HINIC3_TASK_PROCESS_EQE_LIMIT 1024
+#define HINIC3_EQ_UPDATE_CI_STEP 64
+
+/*lint -e806*/
+static uint g_aeq_len = HINIC3_DEFAULT_AEQ_LEN;
+module_param(g_aeq_len, uint, 0444);
+MODULE_PARM_DESC(g_aeq_len,
+ "aeq depth, valid range is " __stringify(HINIC3_MIN_AEQ_LEN)
+ " - " __stringify(HINIC3_MAX_AEQ_LEN));
+
+static uint g_ceq_len = HINIC3_DEFAULT_CEQ_LEN;
+module_param(g_ceq_len, uint, 0444);
+MODULE_PARM_DESC(g_ceq_len,
+ "ceq depth, valid range is " __stringify(HINIC3_MIN_CEQ_LEN)
+ " - " __stringify(HINIC3_MAX_CEQ_LEN));
+
+static uint g_num_ceqe_in_tasklet = HINIC3_TASK_PROCESS_EQE_LIMIT;
+module_param(g_num_ceqe_in_tasklet, uint, 0444);
+MODULE_PARM_DESC(g_num_ceqe_in_tasklet,
+		 "The max number of ceqes that can be processed in one tasklet, default = 1024");
+/*lint +e806*/
+
+#define CEQ_CTRL_0_INTR_IDX_SHIFT 0
+#define CEQ_CTRL_0_DMA_ATTR_SHIFT 12
+#define CEQ_CTRL_0_LIMIT_KICK_SHIFT 20
+#define CEQ_CTRL_0_PCI_INTF_IDX_SHIFT 24
+#define CEQ_CTRL_0_PAGE_SIZE_SHIFT 27
+#define CEQ_CTRL_0_INTR_MODE_SHIFT 31
+
+#define CEQ_CTRL_0_INTR_IDX_MASK 0x3FFU
+#define CEQ_CTRL_0_DMA_ATTR_MASK 0x3FU
+#define CEQ_CTRL_0_LIMIT_KICK_MASK 0xFU
+#define CEQ_CTRL_0_PCI_INTF_IDX_MASK 0x3U
+#define CEQ_CTRL_0_PAGE_SIZE_MASK		0xFU
+#define CEQ_CTRL_0_INTR_MODE_MASK 0x1U
+
+#define CEQ_CTRL_0_SET(val, member) \
+ (((val) & CEQ_CTRL_0_##member##_MASK) << \
+ CEQ_CTRL_0_##member##_SHIFT)
+
+#define CEQ_CTRL_1_LEN_SHIFT 0
+#define CEQ_CTRL_1_GLB_FUNC_ID_SHIFT 20
+
+#define CEQ_CTRL_1_LEN_MASK 0xFFFFFU
+#define CEQ_CTRL_1_GLB_FUNC_ID_MASK 0xFFFU
+
+#define CEQ_CTRL_1_SET(val, member) \
+ (((val) & CEQ_CTRL_1_##member##_MASK) << \
+ CEQ_CTRL_1_##member##_SHIFT)
+
+#define EQ_ELEM_DESC_TYPE_SHIFT 0
+#define EQ_ELEM_DESC_SRC_SHIFT 7
+#define EQ_ELEM_DESC_SIZE_SHIFT 8
+#define EQ_ELEM_DESC_WRAPPED_SHIFT 31
+
+#define EQ_ELEM_DESC_TYPE_MASK 0x7FU
+#define EQ_ELEM_DESC_SRC_MASK 0x1U
+#define EQ_ELEM_DESC_SIZE_MASK 0xFFU
+#define EQ_ELEM_DESC_WRAPPED_MASK 0x1U
+
+#define EQ_ELEM_DESC_GET(val, member) \
+ (((val) >> EQ_ELEM_DESC_##member##_SHIFT) & \
+ EQ_ELEM_DESC_##member##_MASK)
+
+#define EQ_CONS_IDX_CONS_IDX_SHIFT 0
+#define EQ_CONS_IDX_INT_ARMED_SHIFT 31
+
+#define EQ_CONS_IDX_CONS_IDX_MASK 0x1FFFFFU
+#define EQ_CONS_IDX_INT_ARMED_MASK 0x1U
+
+#define EQ_CONS_IDX_SET(val, member) \
+ (((val) & EQ_CONS_IDX_##member##_MASK) << \
+ EQ_CONS_IDX_##member##_SHIFT)
+
+#define EQ_CONS_IDX_CLEAR(val, member) \
+ ((val) & (~(EQ_CONS_IDX_##member##_MASK << \
+ EQ_CONS_IDX_##member##_SHIFT)))
+
+#define EQ_CI_SIMPLE_INDIR_CI_SHIFT 0
+#define EQ_CI_SIMPLE_INDIR_ARMED_SHIFT 21
+#define EQ_CI_SIMPLE_INDIR_AEQ_IDX_SHIFT 30
+#define EQ_CI_SIMPLE_INDIR_CEQ_IDX_SHIFT 24
+
+#define EQ_CI_SIMPLE_INDIR_CI_MASK 0x1FFFFFU
+#define EQ_CI_SIMPLE_INDIR_ARMED_MASK 0x1U
+#define EQ_CI_SIMPLE_INDIR_AEQ_IDX_MASK 0x3U
+#define EQ_CI_SIMPLE_INDIR_CEQ_IDX_MASK 0xFFU
+
+#define EQ_CI_SIMPLE_INDIR_SET(val, member) \
+ (((val) & EQ_CI_SIMPLE_INDIR_##member##_MASK) << \
+ EQ_CI_SIMPLE_INDIR_##member##_SHIFT)
+
+#define EQ_CI_SIMPLE_INDIR_CLEAR(val, member) \
+ ((val) & (~(EQ_CI_SIMPLE_INDIR_##member##_MASK << \
+ EQ_CI_SIMPLE_INDIR_##member##_SHIFT)))
+
+#define EQ_WRAPPED(eq) ((u32)(eq)->wrapped << EQ_VALID_SHIFT)
+
+#define EQ_CONS_IDX(eq) ((eq)->cons_idx | \
+ ((u32)(eq)->wrapped << EQ_WRAPPED_SHIFT))
+
+#define EQ_CONS_IDX_REG_ADDR(eq) \
+ (((eq)->type == HINIC3_AEQ) ? \
+ HINIC3_CSR_AEQ_CONS_IDX_ADDR : \
+ HINIC3_CSR_CEQ_CONS_IDX_ADDR)
+#define EQ_CI_SIMPLE_INDIR_REG_ADDR(eq) \
+ (((eq)->type == HINIC3_AEQ) ? \
+ HINIC3_CSR_AEQ_CI_SIMPLE_INDIR_ADDR : \
+ HINIC3_CSR_CEQ_CI_SIMPLE_INDIR_ADDR)
+
+#define EQ_PROD_IDX_REG_ADDR(eq) \
+ (((eq)->type == HINIC3_AEQ) ? \
+ HINIC3_CSR_AEQ_PROD_IDX_ADDR : \
+ HINIC3_CSR_CEQ_PROD_IDX_ADDR)
+
+#define HINIC3_EQ_HI_PHYS_ADDR_REG(type, pg_num) \
+ ((u32)((type == HINIC3_AEQ) ? \
+ HINIC3_AEQ_HI_PHYS_ADDR_REG(pg_num) : \
+ HINIC3_CEQ_HI_PHYS_ADDR_REG(pg_num)))
+
+#define HINIC3_EQ_LO_PHYS_ADDR_REG(type, pg_num) \
+ ((u32)((type == HINIC3_AEQ) ? \
+ HINIC3_AEQ_LO_PHYS_ADDR_REG(pg_num) : \
+ HINIC3_CEQ_LO_PHYS_ADDR_REG(pg_num)))
+
+#define GET_EQ_NUM_PAGES(eq, size) \
+ ((u16)(ALIGN((u32)((eq)->eq_len * (eq)->elem_size), \
+ (size)) / (size)))
+
+#define HINIC3_EQ_MAX_PAGES(eq) \
+ ((eq)->type == HINIC3_AEQ ? HINIC3_AEQ_MAX_PAGES : \
+ HINIC3_CEQ_MAX_PAGES)
+
+#define GET_EQ_NUM_ELEMS(eq, pg_size) ((pg_size) / (u32)(eq)->elem_size)
+
+#define GET_EQ_ELEMENT(eq, idx) \
+ (((u8 *)(eq)->eq_pages[(idx) / (eq)->num_elem_in_pg].align_vaddr) + \
+ (u32)(((idx) & ((eq)->num_elem_in_pg - 1)) * (eq)->elem_size))
+
+#define GET_AEQ_ELEM(eq, idx) \
+ ((struct hinic3_aeq_elem *)GET_EQ_ELEMENT((eq), (idx)))
+
+#define GET_CEQ_ELEM(eq, idx) ((u32 *)GET_EQ_ELEMENT((eq), (idx)))
+
+#define GET_CURR_AEQ_ELEM(eq) GET_AEQ_ELEM((eq), (eq)->cons_idx)
+
+#define GET_CURR_CEQ_ELEM(eq) GET_CEQ_ELEM((eq), (eq)->cons_idx)
+
+#define PAGE_IN_4K(page_size) ((page_size) >> 12)
+#define EQ_SET_HW_PAGE_SIZE_VAL(eq) \
+ ((u32)ilog2(PAGE_IN_4K((eq)->page_size)))
+
+#define ELEMENT_SIZE_IN_32B(eq) (((eq)->elem_size) >> 5)
+#define EQ_SET_HW_ELEM_SIZE_VAL(eq) ((u32)ilog2(ELEMENT_SIZE_IN_32B(eq)))
+
+#define AEQ_DMA_ATTR_DEFAULT 0
+#define CEQ_DMA_ATTR_DEFAULT 0
+
+#define CEQ_LMT_KICK_DEFAULT 0
+
+#define EQ_MSIX_RESEND_TIMER_CLEAR 1
+
+#define EQ_WRAPPED_SHIFT 20
+
+#define EQ_VALID_SHIFT 31
+
+#define CEQE_TYPE_SHIFT 23
+#define CEQE_TYPE_MASK 0x7
+
+#define CEQE_TYPE(type) (((type) >> CEQE_TYPE_SHIFT) & \
+ CEQE_TYPE_MASK)
+
+#define CEQE_DATA_MASK 0x3FFFFFF
+#define CEQE_DATA(data) ((data) & CEQE_DATA_MASK)
+
+#define aeq_to_aeqs(eq) \
+ container_of((eq) - (eq)->q_id, struct hinic3_aeqs, aeq[0])
+
+#define ceq_to_ceqs(eq) \
+ container_of((eq) - (eq)->q_id, struct hinic3_ceqs, ceq[0])
+
+static irqreturn_t ceq_interrupt(int irq, void *data);
+static irqreturn_t aeq_interrupt(int irq, void *data);
+
+static void ceq_tasklet(ulong ceq_data);
+
+/**
+ * hinic3_aeq_register_hw_cb - register aeq callback for specific event
+ * @hwdev: the pointer to hw device
+ * @pri_handle: the pointer to private invoker device
+ * @event: event for the handler
+ * @hwe_cb: callback function
+ **/
+int hinic3_aeq_register_hw_cb(void *hwdev, void *pri_handle, enum hinic3_aeq_type event,
+ hinic3_aeq_hwe_cb hwe_cb)
+{
+ struct hinic3_aeqs *aeqs = NULL;
+
+ if (!hwdev || !hwe_cb || event >= HINIC3_MAX_AEQ_EVENTS)
+ return -EINVAL;
+
+ aeqs = ((struct hinic3_hwdev *)hwdev)->aeqs;
+
+ aeqs->aeq_hwe_cb[event] = hwe_cb;
+ aeqs->aeq_hwe_cb_data[event] = pri_handle;
+
+ set_bit(HINIC3_AEQ_HW_CB_REG, &aeqs->aeq_hw_cb_state[event]);
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_aeq_register_hw_cb);
+
+/**
+ * hinic3_aeq_unregister_hw_cb - unregister the aeq callback for specific event
+ * @hwdev: the pointer to hw device
+ * @event: event for the handler
+ **/
+void hinic3_aeq_unregister_hw_cb(void *hwdev, enum hinic3_aeq_type event)
+{
+ struct hinic3_aeqs *aeqs = NULL;
+
+ if (!hwdev || event >= HINIC3_MAX_AEQ_EVENTS)
+ return;
+
+ aeqs = ((struct hinic3_hwdev *)hwdev)->aeqs;
+
+ clear_bit(HINIC3_AEQ_HW_CB_REG, &aeqs->aeq_hw_cb_state[event]);
+
+ while (test_bit(HINIC3_AEQ_HW_CB_RUNNING,
+ &aeqs->aeq_hw_cb_state[event]))
+ usleep_range(EQ_USLEEP_LOW_BOUND, EQ_USLEEP_HIG_BOUND);
+
+ aeqs->aeq_hwe_cb[event] = NULL;
+}
+EXPORT_SYMBOL(hinic3_aeq_unregister_hw_cb);
+
+/**
+ * hinic3_aeq_register_swe_cb - register aeq callback for sw event
+ * @hwdev: the pointer to hw device
+ * @pri_handle: the pointer to private invoker device
+ * @event: soft event for the handler
+ * @aeq_swe_cb: callback function
+ **/
+int hinic3_aeq_register_swe_cb(void *hwdev, void *pri_handle, enum hinic3_aeq_sw_type event,
+ hinic3_aeq_swe_cb aeq_swe_cb)
+{
+ struct hinic3_aeqs *aeqs = NULL;
+
+ if (!hwdev || !aeq_swe_cb || event >= HINIC3_MAX_AEQ_SW_EVENTS)
+ return -EINVAL;
+
+ aeqs = ((struct hinic3_hwdev *)hwdev)->aeqs;
+
+ aeqs->aeq_swe_cb[event] = aeq_swe_cb;
+ aeqs->aeq_swe_cb_data[event] = pri_handle;
+
+ set_bit(HINIC3_AEQ_SW_CB_REG, &aeqs->aeq_sw_cb_state[event]);
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_aeq_register_swe_cb);
+
+/**
+ * hinic3_aeq_unregister_swe_cb - unregister the aeq callback for sw event
+ * @hwdev: the pointer to hw device
+ * @event: soft event for the handler
+ **/
+void hinic3_aeq_unregister_swe_cb(void *hwdev, enum hinic3_aeq_sw_type event)
+{
+ struct hinic3_aeqs *aeqs = NULL;
+
+ if (!hwdev || event >= HINIC3_MAX_AEQ_SW_EVENTS)
+ return;
+
+ aeqs = ((struct hinic3_hwdev *)hwdev)->aeqs;
+
+ clear_bit(HINIC3_AEQ_SW_CB_REG, &aeqs->aeq_sw_cb_state[event]);
+
+ while (test_bit(HINIC3_AEQ_SW_CB_RUNNING,
+ &aeqs->aeq_sw_cb_state[event]))
+ usleep_range(EQ_USLEEP_LOW_BOUND, EQ_USLEEP_HIG_BOUND);
+
+ aeqs->aeq_swe_cb[event] = NULL;
+}
+EXPORT_SYMBOL(hinic3_aeq_unregister_swe_cb);
+
+/**
+ * hinic3_ceq_register_cb - register ceq callback for specific event
+ * @hwdev: the pointer to hw device
+ * @pri_handle: the pointer to private invoker device
+ * @event: event for the handler
+ * @callback: callback function
+ **/
+int hinic3_ceq_register_cb(void *hwdev, void *pri_handle, enum hinic3_ceq_event event,
+ hinic3_ceq_event_cb callback)
+{
+ struct hinic3_ceqs *ceqs = NULL;
+
+ if (!hwdev || event >= HINIC3_MAX_CEQ_EVENTS)
+ return -EINVAL;
+
+ ceqs = ((struct hinic3_hwdev *)hwdev)->ceqs;
+
+ ceqs->ceq_cb[event] = callback;
+ ceqs->ceq_cb_data[event] = pri_handle;
+
+ set_bit(HINIC3_CEQ_CB_REG, &ceqs->ceq_cb_state[event]);
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_ceq_register_cb);
+
+/**
+ * hinic3_ceq_unregister_cb - unregister ceq callback for specific event
+ * @hwdev: the pointer to hw device
+ * @event: event for the handler
+ **/
+void hinic3_ceq_unregister_cb(void *hwdev, enum hinic3_ceq_event event)
+{
+ struct hinic3_ceqs *ceqs = NULL;
+
+ if (!hwdev || event >= HINIC3_MAX_CEQ_EVENTS)
+ return;
+
+ ceqs = ((struct hinic3_hwdev *)hwdev)->ceqs;
+
+ clear_bit(HINIC3_CEQ_CB_REG, &ceqs->ceq_cb_state[event]);
+
+ while (test_bit(HINIC3_CEQ_CB_RUNNING, &ceqs->ceq_cb_state[event]))
+ usleep_range(EQ_USLEEP_LOW_BOUND, EQ_USLEEP_HIG_BOUND);
+
+ ceqs->ceq_cb[event] = NULL;
+}
+EXPORT_SYMBOL(hinic3_ceq_unregister_cb);
+
+/**
+ * set_eq_cons_idx - write the cons idx to the hw
+ * @eq: The event queue to update the cons idx for
+ * @arm_state: whether to arm the EQ interrupt (HINIC3_EQ_ARMED or HINIC3_EQ_NOT_ARMED)
+ **/
+static void set_eq_cons_idx(struct hinic3_eq *eq, u32 arm_state)
+{
+ u32 eq_wrap_ci, val;
+ u32 addr = EQ_CI_SIMPLE_INDIR_REG_ADDR(eq);
+
+ eq_wrap_ci = EQ_CONS_IDX(eq);
+
+	/* in poll mode, only eq0 uses the int_arm mode */
+ if (eq->q_id != 0 && eq->hwdev->poll)
+ val = EQ_CI_SIMPLE_INDIR_SET(HINIC3_EQ_NOT_ARMED, ARMED);
+ else
+ val = EQ_CI_SIMPLE_INDIR_SET(arm_state, ARMED);
+ if (eq->type == HINIC3_AEQ) {
+ val = val |
+ EQ_CI_SIMPLE_INDIR_SET(eq_wrap_ci, CI) |
+ EQ_CI_SIMPLE_INDIR_SET(eq->q_id, AEQ_IDX);
+ } else {
+ val = val |
+ EQ_CI_SIMPLE_INDIR_SET(eq_wrap_ci, CI) |
+ EQ_CI_SIMPLE_INDIR_SET(eq->q_id, CEQ_IDX);
+ }
+
+ hinic3_hwif_write_reg(eq->hwdev->hwif, addr, val);
+}
+
+/**
+ * ceq_event_handler - handle for the ceq events
+ * @ceqs: ceqs part of the chip
+ * @ceqe: ceq element of the event
+ **/
+static void ceq_event_handler(struct hinic3_ceqs *ceqs, u32 ceqe)
+{
+ struct hinic3_hwdev *hwdev = ceqs->hwdev;
+ enum hinic3_ceq_event event = CEQE_TYPE(ceqe);
+ u32 ceqe_data = CEQE_DATA(ceqe);
+
+ if (event >= HINIC3_MAX_CEQ_EVENTS) {
+		sdk_err(hwdev->dev_hdl, "Ceq unknown event: %d, ceqe data: 0x%x\n",
+ event, ceqe_data);
+ return;
+ }
+
+ set_bit(HINIC3_CEQ_CB_RUNNING, &ceqs->ceq_cb_state[event]);
+
+ if (ceqs->ceq_cb[event] &&
+ test_bit(HINIC3_CEQ_CB_REG, &ceqs->ceq_cb_state[event]))
+ ceqs->ceq_cb[event](ceqs->ceq_cb_data[event], ceqe_data);
+
+ clear_bit(HINIC3_CEQ_CB_RUNNING, &ceqs->ceq_cb_state[event]);
+}
+
+static void aeq_elem_handler(struct hinic3_eq *eq, u32 aeqe_desc)
+{
+ struct hinic3_aeqs *aeqs = aeq_to_aeqs(eq);
+ struct hinic3_aeq_elem *aeqe_pos;
+ enum hinic3_aeq_type event;
+ enum hinic3_aeq_sw_type sw_type;
+ u32 sw_event;
+ u8 data[HINIC3_AEQE_DATA_SIZE], size;
+
+ aeqe_pos = GET_CURR_AEQ_ELEM(eq);
+
+ eq->hwdev->cur_recv_aeq_cnt++;
+
+ event = EQ_ELEM_DESC_GET(aeqe_desc, TYPE);
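+	/* the SRC bit marks software-generated events */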
+ if (EQ_ELEM_DESC_GET(aeqe_desc, SRC)) {
+ sw_event = event;
+ sw_type = sw_event >= HINIC3_NIC_FATAL_ERROR_MAX ?
+ HINIC3_STATEFUL_EVENT : HINIC3_STATELESS_EVENT;
+ /* SW event uses only the first 8B */
+ memcpy(data, aeqe_pos->aeqe_data, HINIC3_AEQE_DATA_SIZE);
+ hinic3_be32_to_cpu(data, HINIC3_AEQE_DATA_SIZE);
+ set_bit(HINIC3_AEQ_SW_CB_RUNNING,
+ &aeqs->aeq_sw_cb_state[sw_type]);
+ if (aeqs->aeq_swe_cb[sw_type] &&
+ test_bit(HINIC3_AEQ_SW_CB_REG,
+ &aeqs->aeq_sw_cb_state[sw_type]))
+ aeqs->aeq_swe_cb[sw_type](aeqs->aeq_swe_cb_data[sw_type], event, data);
+
+ clear_bit(HINIC3_AEQ_SW_CB_RUNNING,
+ &aeqs->aeq_sw_cb_state[sw_type]);
+ return;
+ }
+
+ if (event < HINIC3_MAX_AEQ_EVENTS) {
+ memcpy(data, aeqe_pos->aeqe_data, HINIC3_AEQE_DATA_SIZE);
+ hinic3_be32_to_cpu(data, HINIC3_AEQE_DATA_SIZE);
+
+ size = EQ_ELEM_DESC_GET(aeqe_desc, SIZE);
+ set_bit(HINIC3_AEQ_HW_CB_RUNNING,
+ &aeqs->aeq_hw_cb_state[event]);
+ if (aeqs->aeq_hwe_cb[event] &&
+ test_bit(HINIC3_AEQ_HW_CB_REG,
+ &aeqs->aeq_hw_cb_state[event]))
+ aeqs->aeq_hwe_cb[event](aeqs->aeq_hwe_cb_data[event], data, size);
+ clear_bit(HINIC3_AEQ_HW_CB_RUNNING,
+ &aeqs->aeq_hw_cb_state[event]);
+ return;
+ }
+ sdk_warn(eq->hwdev->dev_hdl, "Unknown aeq hw event %d\n", event);
+}
+
+/**
+ * aeq_irq_handler - handler for the aeq event
+ * @eq: the async event queue of the event
+ **/
+static bool aeq_irq_handler(struct hinic3_eq *eq)
+{
+ struct hinic3_aeq_elem *aeqe_pos = NULL;
+ u32 aeqe_desc;
+ u32 i, eqe_cnt = 0;
+
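+	/* handle a bounded batch of events; returning true tells the caller to reschedule */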
+ for (i = 0; i < HINIC3_TASK_PROCESS_EQE_LIMIT; i++) {
+ aeqe_pos = GET_CURR_AEQ_ELEM(eq);
+
+		/* Data in HW is in big-endian format */
+ aeqe_desc = be32_to_cpu(aeqe_pos->desc);
+
+		/* HW toggles the wrapped bit when it posts a new eq element */
+ if (EQ_ELEM_DESC_GET(aeqe_desc, WRAPPED) == eq->wrapped)
+ return false;
+
+ dma_rmb();
+
+ aeq_elem_handler(eq, aeqe_desc);
+
+ eq->cons_idx++;
+
+ if (eq->cons_idx == eq->eq_len) {
+ eq->cons_idx = 0;
+ eq->wrapped = !eq->wrapped;
+ }
+
+ if (++eqe_cnt >= HINIC3_EQ_UPDATE_CI_STEP) {
+ eqe_cnt = 0;
+ set_eq_cons_idx(eq, HINIC3_EQ_NOT_ARMED);
+ }
+ }
+
+ return true;
+}
+
+/**
+ * ceq_irq_handler - handler for the ceq event
+ * @eq: the completion event queue of the event
+ **/
+static bool ceq_irq_handler(struct hinic3_eq *eq)
+{
+ struct hinic3_ceqs *ceqs = ceq_to_ceqs(eq);
+ u32 ceqe, eqe_cnt = 0;
+ u32 i;
+
+ for (i = 0; i < g_num_ceqe_in_tasklet; i++) {
+ ceqe = *(GET_CURR_CEQ_ELEM(eq));
+ ceqe = be32_to_cpu(ceqe);
+
+		/* HW toggles the wrapped bit when it posts a new eq element */
+ if (EQ_ELEM_DESC_GET(ceqe, WRAPPED) == eq->wrapped)
+ return false;
+
+ ceq_event_handler(ceqs, ceqe);
+
+ eq->cons_idx++;
+
+ if (eq->cons_idx == eq->eq_len) {
+ eq->cons_idx = 0;
+ eq->wrapped = !eq->wrapped;
+ }
+
+ if (++eqe_cnt >= HINIC3_EQ_UPDATE_CI_STEP) {
+ eqe_cnt = 0;
+ set_eq_cons_idx(eq, HINIC3_EQ_NOT_ARMED);
+ }
+ }
+
+ return true;
+}
+
+static void reschedule_eq_handler(struct hinic3_eq *eq)
+{
+ if (eq->type == HINIC3_AEQ) {
+ struct hinic3_aeqs *aeqs = aeq_to_aeqs(eq);
+
+ queue_work_on(hisdk3_get_work_cpu_affinity(eq->hwdev, WORK_TYPE_AEQ),
+ aeqs->workq, &eq->aeq_work);
+ } else {
+ tasklet_schedule(&eq->ceq_tasklet);
+ }
+}
+
+/**
+ * eq_irq_handler - handler for the eq event
+ * @data: the event queue of the event
+ **/
+static bool eq_irq_handler(void *data)
+{
+ struct hinic3_eq *eq = (struct hinic3_eq *)data;
+ bool uncompleted = false;
+
+ if (eq->type == HINIC3_AEQ)
+ uncompleted = aeq_irq_handler(eq);
+ else
+ uncompleted = ceq_irq_handler(eq);
+
+ set_eq_cons_idx(eq, uncompleted ? HINIC3_EQ_NOT_ARMED :
+ HINIC3_EQ_ARMED);
+
+ return uncompleted;
+}
+
+/**
+ * eq_irq_work - eq work for the event
+ * @work: the work that is associated with the eq
+ **/
+static void eq_irq_work(struct work_struct *work)
+{
+ struct hinic3_eq *eq = container_of(work, struct hinic3_eq, aeq_work);
+
+ if (eq_irq_handler(eq))
+ reschedule_eq_handler(eq);
+}
+
+/**
+ * aeq_interrupt - aeq interrupt handler
+ * @irq: irq number
+ * @data: the async event queue of the event
+ **/
+static irqreturn_t aeq_interrupt(int irq, void *data)
+{
+ struct hinic3_eq *aeq = (struct hinic3_eq *)data;
+ struct hinic3_hwdev *hwdev = aeq->hwdev;
+ struct hinic3_aeqs *aeqs = aeq_to_aeqs(aeq);
+ struct workqueue_struct *workq = aeqs->workq;
+
+ /* clear resend timer cnt register */
+ hinic3_misx_intr_clear_resend_bit(hwdev, aeq->eq_irq.msix_entry_idx,
+ EQ_MSIX_RESEND_TIMER_CLEAR);
+
+ queue_work_on(hisdk3_get_work_cpu_affinity(hwdev, WORK_TYPE_AEQ),
+ workq, &aeq->aeq_work);
+ return IRQ_HANDLED;
+}
+
+/**
+ * ceq_tasklet - ceq tasklet for the event
+ * @ceq_data: data that will be used by the tasklet (the ceq)
+ **/
+static void ceq_tasklet(ulong ceq_data)
+{
+ struct hinic3_eq *eq = (struct hinic3_eq *)ceq_data;
+
+ eq->soft_intr_jif = jiffies;
+
+ if (eq_irq_handler(eq))
+ reschedule_eq_handler(eq);
+}
+
+/**
+ * ceq_interrupt - ceq interrupt handler
+ * @irq: irq number
+ * @data: the completion event queue of the event
+ **/
+static irqreturn_t ceq_interrupt(int irq, void *data)
+{
+ struct hinic3_eq *ceq = (struct hinic3_eq *)data;
+
+ ceq->hard_intr_jif = jiffies;
+
+ /* clear resend timer counters */
+ hinic3_misx_intr_clear_resend_bit(ceq->hwdev,
+ ceq->eq_irq.msix_entry_idx,
+ EQ_MSIX_RESEND_TIMER_CLEAR);
+
+ tasklet_schedule(&ceq->ceq_tasklet);
+
+ return IRQ_HANDLED;
+}
+
+/**
+ * set_eq_ctrls - setting eq's ctrls registers
+ * @eq: the event queue for setting
+ **/
+static int set_eq_ctrls(struct hinic3_eq *eq)
+{
+ enum hinic3_eq_type type = eq->type;
+ struct hinic3_hwif *hwif = eq->hwdev->hwif;
+ struct irq_info *eq_irq = &eq->eq_irq;
+ u32 addr, val, ctrl0, ctrl1, page_size_val, elem_size;
+ u32 pci_intf_idx = HINIC3_PCI_INTF_IDX(hwif);
+ int err;
+
+ if (type == HINIC3_AEQ) {
+ /* set ctrl0 */
+ addr = HINIC3_CSR_AEQ_CTRL_0_ADDR;
+
+ val = hinic3_hwif_read_reg(hwif, addr);
+
+ val = AEQ_CTRL_0_CLEAR(val, INTR_IDX) &
+ AEQ_CTRL_0_CLEAR(val, DMA_ATTR) &
+ AEQ_CTRL_0_CLEAR(val, PCI_INTF_IDX) &
+ AEQ_CTRL_0_CLEAR(val, INTR_MODE);
+
+ ctrl0 = AEQ_CTRL_0_SET(eq_irq->msix_entry_idx, INTR_IDX) |
+ AEQ_CTRL_0_SET(AEQ_DMA_ATTR_DEFAULT, DMA_ATTR) |
+ AEQ_CTRL_0_SET(pci_intf_idx, PCI_INTF_IDX) |
+ AEQ_CTRL_0_SET(HINIC3_INTR_MODE_ARMED, INTR_MODE);
+
+ val |= ctrl0;
+
+ hinic3_hwif_write_reg(hwif, addr, val);
+
+ /* set ctrl1 */
+ addr = HINIC3_CSR_AEQ_CTRL_1_ADDR;
+
+ page_size_val = EQ_SET_HW_PAGE_SIZE_VAL(eq);
+ elem_size = EQ_SET_HW_ELEM_SIZE_VAL(eq);
+
+ ctrl1 = AEQ_CTRL_1_SET(eq->eq_len, LEN) |
+ AEQ_CTRL_1_SET(elem_size, ELEM_SIZE) |
+ AEQ_CTRL_1_SET(page_size_val, PAGE_SIZE);
+
+ hinic3_hwif_write_reg(hwif, addr, ctrl1);
+ } else {
+ page_size_val = EQ_SET_HW_PAGE_SIZE_VAL(eq);
+ ctrl0 = CEQ_CTRL_0_SET(eq_irq->msix_entry_idx, INTR_IDX) |
+ CEQ_CTRL_0_SET(CEQ_DMA_ATTR_DEFAULT, DMA_ATTR) |
+ CEQ_CTRL_0_SET(CEQ_LMT_KICK_DEFAULT, LIMIT_KICK) |
+ CEQ_CTRL_0_SET(pci_intf_idx, PCI_INTF_IDX) |
+ CEQ_CTRL_0_SET(page_size_val, PAGE_SIZE) |
+ CEQ_CTRL_0_SET(HINIC3_INTR_MODE_ARMED, INTR_MODE);
+
+ ctrl1 = CEQ_CTRL_1_SET(eq->eq_len, LEN);
+
+ /* set ceq ctrl reg through mgmt cpu */
+ err = hinic3_set_ceq_ctrl_reg(eq->hwdev, eq->q_id, ctrl0,
+ ctrl1);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+/**
+ * ceq_elements_init - Initialize all the elements in the ceq
+ * @eq: the event queue
+ * @init_val: value to init with it the elements
+ **/
+static void ceq_elements_init(struct hinic3_eq *eq, u32 init_val)
+{
+ u32 *ceqe = NULL;
+ u32 i;
+
+ for (i = 0; i < eq->eq_len; i++) {
+ ceqe = GET_CEQ_ELEM(eq, i);
+ *(ceqe) = cpu_to_be32(init_val);
+ }
+
+ wmb(); /* Write the init values */
+}
+
+/**
+ * aeq_elements_init - initialize all the elements in the aeq
+ * @eq: the event queue
+ * @init_val: value to init with it the elements
+ **/
+static void aeq_elements_init(struct hinic3_eq *eq, u32 init_val)
+{
+ struct hinic3_aeq_elem *aeqe = NULL;
+ u32 i;
+
+ for (i = 0; i < eq->eq_len; i++) {
+ aeqe = GET_AEQ_ELEM(eq, i);
+ aeqe->desc = cpu_to_be32(init_val);
+ }
+
+ wmb(); /* Write the init values */
+}
+
+static void eq_elements_init(struct hinic3_eq *eq, u32 init_val)
+{
+ if (eq->type == HINIC3_AEQ)
+ aeq_elements_init(eq, init_val);
+ else
+ ceq_elements_init(eq, init_val);
+}
+
+/**
+ * alloc_eq_pages - allocate the pages for the queue
+ * @eq: the event queue
+ **/
+static int alloc_eq_pages(struct hinic3_eq *eq)
+{
+ struct hinic3_hwif *hwif = eq->hwdev->hwif;
+ struct hinic3_dma_addr_align *eq_page = NULL;
+ u32 reg, init_val;
+ u16 pg_idx, i;
+ int err;
+
+ eq->eq_pages = kcalloc(eq->num_pages, sizeof(*eq->eq_pages),
+ GFP_KERNEL);
+ if (!eq->eq_pages) {
+ sdk_err(eq->hwdev->dev_hdl, "Failed to alloc eq pages description\n");
+ return -ENOMEM;
+ }
+
+ for (pg_idx = 0; pg_idx < eq->num_pages; pg_idx++) {
+ eq_page = &eq->eq_pages[pg_idx];
+ err = hinic3_dma_zalloc_coherent_align(eq->hwdev->dev_hdl,
+ eq->page_size,
+ HINIC3_MIN_EQ_PAGE_SIZE,
+ GFP_KERNEL, eq_page);
+ if (err) {
+ sdk_err(eq->hwdev->dev_hdl, "Failed to alloc eq page, page index: %hu\n",
+ pg_idx);
+ goto dma_alloc_err;
+ }
+
+ reg = HINIC3_EQ_HI_PHYS_ADDR_REG(eq->type, pg_idx);
+ hinic3_hwif_write_reg(hwif, reg,
+ upper_32_bits(eq_page->align_paddr));
+
+ reg = HINIC3_EQ_LO_PHYS_ADDR_REG(eq->type, pg_idx);
+ hinic3_hwif_write_reg(hwif, reg,
+ lower_32_bits(eq_page->align_paddr));
+ }
+
+ eq->num_elem_in_pg = GET_EQ_NUM_ELEMS(eq, eq->page_size);
+ if (eq->num_elem_in_pg & (eq->num_elem_in_pg - 1)) {
+		sdk_err(eq->hwdev->dev_hdl, "Number of elements in eq page is not a power of 2\n");
+ err = -EINVAL;
+ goto dma_alloc_err;
+ }
+ init_val = EQ_WRAPPED(eq);
+
+ eq_elements_init(eq, init_val);
+
+ return 0;
+
+dma_alloc_err:
+ for (i = 0; i < pg_idx; i++)
+ hinic3_dma_free_coherent_align(eq->hwdev->dev_hdl,
+ &eq->eq_pages[i]);
+
+ kfree(eq->eq_pages);
+
+ return err;
+}
+
+/**
+ * free_eq_pages - free the pages of the queue
+ * @eq: the event queue
+ **/
+static void free_eq_pages(struct hinic3_eq *eq)
+{
+ u16 pg_idx;
+
+ for (pg_idx = 0; pg_idx < eq->num_pages; pg_idx++)
+ hinic3_dma_free_coherent_align(eq->hwdev->dev_hdl,
+ &eq->eq_pages[pg_idx]);
+
+ kfree(eq->eq_pages);
+}
+
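+/* Choose the smallest EQ page size that fits the whole queue within the
+ * per-queue page limit, rounded up to a power of two.
+ */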
+static inline u32 get_page_size(const struct hinic3_eq *eq)
+{
+ u32 total_size;
+ u32 count;
+
+ total_size = ALIGN((eq->eq_len * eq->elem_size),
+ HINIC3_MIN_EQ_PAGE_SIZE);
+ if (total_size <= (HINIC3_EQ_MAX_PAGES(eq) * HINIC3_MIN_EQ_PAGE_SIZE))
+ return HINIC3_MIN_EQ_PAGE_SIZE;
+
+ count = (u32)(ALIGN((total_size / HINIC3_EQ_MAX_PAGES(eq)),
+ HINIC3_MIN_EQ_PAGE_SIZE) / HINIC3_MIN_EQ_PAGE_SIZE);
+
+ /* round up to nearest power of two */
+ count = 1U << (u8)fls((int)(count - 1));
+
+ return ((u32)HINIC3_MIN_EQ_PAGE_SIZE) * count;
+}
+
+static int request_eq_irq(struct hinic3_eq *eq, struct irq_info *entry)
+{
+ int err = 0;
+
+ if (eq->type == HINIC3_AEQ)
+ INIT_WORK(&eq->aeq_work, eq_irq_work);
+ else
+ tasklet_init(&eq->ceq_tasklet, ceq_tasklet, (ulong)eq);
+
+ if (eq->type == HINIC3_AEQ) {
+ snprintf(eq->irq_name, sizeof(eq->irq_name),
+ "hinic3_aeq%u@pci:%s", eq->q_id,
+ pci_name(eq->hwdev->pcidev_hdl));
+
+ err = request_irq(entry->irq_id, aeq_interrupt, 0UL,
+ eq->irq_name, eq);
+ } else {
+ snprintf(eq->irq_name, sizeof(eq->irq_name),
+ "hinic3_ceq%u@pci:%s", eq->q_id,
+ pci_name(eq->hwdev->pcidev_hdl));
+ err = request_irq(entry->irq_id, ceq_interrupt, 0UL,
+ eq->irq_name, eq);
+ }
+
+ return err;
+}
+
+static void reset_eq(struct hinic3_eq *eq)
+{
+ /* clear eq_len to force eqe drop in hardware */
+ if (eq->type == HINIC3_AEQ)
+ hinic3_hwif_write_reg(eq->hwdev->hwif,
+ HINIC3_CSR_AEQ_CTRL_1_ADDR, 0);
+ else
+ hinic3_set_ceq_ctrl_reg(eq->hwdev, eq->q_id, 0, 0);
+
+ wmb(); /* clear eq_len before clear prod idx */
+
+ hinic3_hwif_write_reg(eq->hwdev->hwif, EQ_PROD_IDX_REG_ADDR(eq), 0);
+}
+
+/**
+ * init_eq - initialize eq
+ * @eq: the event queue
+ * @hwdev: the pointer to hw device
+ * @q_id: Queue id number
+ * @q_len: the number of EQ elements
+ * @type: the type of the event queue, ceq or aeq
+ * @entry: msix entry associated with the event queue
+ * Return: 0 - Success, Negative - failure
+ **/
+static int init_eq(struct hinic3_eq *eq, struct hinic3_hwdev *hwdev, u16 q_id,
+ u32 q_len, enum hinic3_eq_type type, struct irq_info *entry)
+{
+ int err = 0;
+
+ eq->hwdev = hwdev;
+ eq->q_id = q_id;
+ eq->type = type;
+ eq->eq_len = q_len;
+
+ /* Indirect access should set q_id first */
+ hinic3_hwif_write_reg(hwdev->hwif, HINIC3_EQ_INDIR_IDX_ADDR(eq->type),
+ eq->q_id);
+ wmb(); /* write index before config */
+
+ reset_eq(eq);
+
+ eq->cons_idx = 0;
+ eq->wrapped = 0;
+
+ eq->elem_size = (type == HINIC3_AEQ) ? HINIC3_AEQE_SIZE : HINIC3_CEQE_SIZE;
+
+ eq->page_size = get_page_size(eq);
+ eq->orig_page_size = eq->page_size;
+ eq->num_pages = GET_EQ_NUM_PAGES(eq, eq->page_size);
+
+ if (eq->num_pages > HINIC3_EQ_MAX_PAGES(eq)) {
+		sdk_err(hwdev->dev_hdl, "Too many pages (%u) for eq\n",
+			eq->num_pages);
+ return -EINVAL;
+ }
+
+ err = alloc_eq_pages(eq);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to allocate pages for eq\n");
+ return err;
+ }
+
+ eq->eq_irq.msix_entry_idx = entry->msix_entry_idx;
+ eq->eq_irq.irq_id = entry->irq_id;
+
+ err = set_eq_ctrls(eq);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to set ctrls for eq\n");
+ goto init_eq_ctrls_err;
+ }
+
+ set_eq_cons_idx(eq, HINIC3_EQ_ARMED);
+
+ err = request_eq_irq(eq, entry);
+ if (err) {
+ sdk_err(hwdev->dev_hdl,
+ "Failed to request irq for the eq, err: %d\n", err);
+ goto req_irq_err;
+ }
+
+ hinic3_set_msix_state(hwdev, entry->msix_entry_idx,
+ HINIC3_MSIX_DISABLE);
+
+ return 0;
+
+init_eq_ctrls_err:
+req_irq_err:
+ free_eq_pages(eq);
+ return err;
+}
+
+/**
+ * remove_eq - remove eq
+ * @eq: the event queue
+ **/
+static void remove_eq(struct hinic3_eq *eq)
+{
+ struct irq_info *entry = &eq->eq_irq;
+
+ hinic3_set_msix_state(eq->hwdev, entry->msix_entry_idx,
+ HINIC3_MSIX_DISABLE);
+ synchronize_irq(entry->irq_id);
+
+ free_irq(entry->irq_id, eq);
+
+ /* Indirect access should set q_id first */
+ hinic3_hwif_write_reg(eq->hwdev->hwif,
+ HINIC3_EQ_INDIR_IDX_ADDR(eq->type),
+ eq->q_id);
+
+ wmb(); /* write index before config */
+
+ if (eq->type == HINIC3_AEQ) {
+ cancel_work_sync(&eq->aeq_work);
+
+ /* clear eq_len to avoid hw access host memory */
+ hinic3_hwif_write_reg(eq->hwdev->hwif,
+ HINIC3_CSR_AEQ_CTRL_1_ADDR, 0);
+ } else {
+ tasklet_kill(&eq->ceq_tasklet);
+
+ hinic3_set_ceq_ctrl_reg(eq->hwdev, eq->q_id, 0, 0);
+ }
+
+ /* update cons_idx to avoid invalid interrupt */
+ eq->cons_idx = hinic3_hwif_read_reg(eq->hwdev->hwif,
+ EQ_PROD_IDX_REG_ADDR(eq));
+ set_eq_cons_idx(eq, HINIC3_EQ_NOT_ARMED);
+
+ free_eq_pages(eq);
+}
+
+/**
+ * hinic3_aeqs_init - init all the aeqs
+ * @hwdev: the pointer to hw device
+ * @num_aeqs: number of AEQs
+ * @msix_entries: msix entries associated with the event queues
+ * Return: 0 - Success, Negative - failure
+ **/
+int hinic3_aeqs_init(struct hinic3_hwdev *hwdev, u16 num_aeqs,
+ struct irq_info *msix_entries)
+{
+ struct hinic3_aeqs *aeqs = NULL;
+ int err;
+ u16 i, q_id;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ aeqs = kzalloc(sizeof(*aeqs), GFP_KERNEL);
+ if (!aeqs)
+ return -ENOMEM;
+
+ hwdev->aeqs = aeqs;
+ aeqs->hwdev = hwdev;
+ aeqs->num_aeqs = num_aeqs;
+ aeqs->workq = alloc_workqueue(HINIC3_EQS_WQ_NAME, WQ_MEM_RECLAIM,
+ HINIC3_MAX_AEQS);
+ if (!aeqs->workq) {
+ sdk_err(hwdev->dev_hdl, "Failed to initialize aeq workqueue\n");
+ err = -ENOMEM;
+ goto create_work_err;
+ }
+
+ if (g_aeq_len < HINIC3_MIN_AEQ_LEN || g_aeq_len > HINIC3_MAX_AEQ_LEN) {
+ sdk_warn(hwdev->dev_hdl, "Module Parameter g_aeq_len value %u out of range, resetting to %d\n",
+ g_aeq_len, HINIC3_DEFAULT_AEQ_LEN);
+ g_aeq_len = HINIC3_DEFAULT_AEQ_LEN;
+ }
+
+ for (q_id = 0; q_id < num_aeqs; q_id++) {
+ err = init_eq(&aeqs->aeq[q_id], hwdev, q_id, g_aeq_len,
+ HINIC3_AEQ, &msix_entries[q_id]);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init aeq %u\n",
+ q_id);
+ goto init_aeq_err;
+ }
+ }
+ for (q_id = 0; q_id < num_aeqs; q_id++)
+ hinic3_set_msix_state(hwdev, msix_entries[q_id].msix_entry_idx,
+ HINIC3_MSIX_ENABLE);
+
+ return 0;
+
+init_aeq_err:
+ for (i = 0; i < q_id; i++)
+ remove_eq(&aeqs->aeq[i]);
+
+ destroy_workqueue(aeqs->workq);
+
+create_work_err:
+ kfree(aeqs);
+
+ return err;
+}
+
+/**
+ * hinic3_aeqs_free - free all the aeqs
+ * @hwdev: the pointer to hw device
+ **/
+void hinic3_aeqs_free(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_aeqs *aeqs = hwdev->aeqs;
+ enum hinic3_aeq_type aeq_event = HINIC3_HW_INTER_INT;
+ enum hinic3_aeq_sw_type sw_aeq_event = HINIC3_STATELESS_EVENT;
+ u16 q_id;
+
+ for (q_id = 0; q_id < aeqs->num_aeqs; q_id++)
+ remove_eq(&aeqs->aeq[q_id]);
+
+ for (; sw_aeq_event < HINIC3_MAX_AEQ_SW_EVENTS; sw_aeq_event++)
+ hinic3_aeq_unregister_swe_cb(hwdev, sw_aeq_event);
+
+ for (; aeq_event < HINIC3_MAX_AEQ_EVENTS; aeq_event++)
+ hinic3_aeq_unregister_hw_cb(hwdev, aeq_event);
+
+ destroy_workqueue(aeqs->workq);
+
+ kfree(aeqs);
+}
+
+/**
+ * hinic3_ceqs_init - init all the ceqs
+ * @hwdev: the pointer to hw device
+ * @num_ceqs: number of CEQs
+ * @msix_entries: msix entries associated with the event queues
+ * Return: 0 - Success, Negative - failure
+ **/
+int hinic3_ceqs_init(struct hinic3_hwdev *hwdev, u16 num_ceqs,
+ struct irq_info *msix_entries)
+{
+ struct hinic3_ceqs *ceqs;
+ int err;
+ u16 i, q_id;
+
+ ceqs = kzalloc(sizeof(*ceqs), GFP_KERNEL);
+ if (!ceqs)
+ return -ENOMEM;
+
+ hwdev->ceqs = ceqs;
+
+ ceqs->hwdev = hwdev;
+ ceqs->num_ceqs = num_ceqs;
+
+ if (g_ceq_len < HINIC3_MIN_CEQ_LEN || g_ceq_len > HINIC3_MAX_CEQ_LEN) {
+ sdk_warn(hwdev->dev_hdl, "Module Parameter g_ceq_len value %u out of range, resetting to %d\n",
+ g_ceq_len, HINIC3_DEFAULT_CEQ_LEN);
+ g_ceq_len = HINIC3_DEFAULT_CEQ_LEN;
+ }
+
+ if (!g_num_ceqe_in_tasklet) {
+		sdk_warn(hwdev->dev_hdl, "Module Parameter g_num_ceqe_in_tasklet cannot be zero, resetting to %d\n",
+ HINIC3_TASK_PROCESS_EQE_LIMIT);
+ g_num_ceqe_in_tasklet = HINIC3_TASK_PROCESS_EQE_LIMIT;
+ }
+ for (q_id = 0; q_id < num_ceqs; q_id++) {
+ err = init_eq(&ceqs->ceq[q_id], hwdev, q_id, g_ceq_len,
+ HINIC3_CEQ, &msix_entries[q_id]);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init ceq %u\n",
+ q_id);
+ goto init_ceq_err;
+ }
+ }
+ for (q_id = 0; q_id < num_ceqs; q_id++)
+ hinic3_set_msix_state(hwdev, msix_entries[q_id].msix_entry_idx,
+ HINIC3_MSIX_ENABLE);
+
+ for (i = 0; i < HINIC3_MAX_CEQ_EVENTS; i++)
+ ceqs->ceq_cb_state[i] = 0;
+
+ return 0;
+
+init_ceq_err:
+ for (i = 0; i < q_id; i++)
+ remove_eq(&ceqs->ceq[i]);
+
+ kfree(ceqs);
+
+ return err;
+}
+
+/**
+ * hinic3_ceqs_free - free all the ceqs
+ * @hwdev: the pointer to hw device
+ **/
+void hinic3_ceqs_free(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_ceqs *ceqs = hwdev->ceqs;
+ enum hinic3_ceq_event ceq_event = HINIC3_CMDQ;
+ u16 q_id;
+
+ for (q_id = 0; q_id < ceqs->num_ceqs; q_id++)
+ remove_eq(&ceqs->ceq[q_id]);
+
+ for (; ceq_event < HINIC3_MAX_CEQ_EVENTS; ceq_event++)
+ hinic3_ceq_unregister_cb(hwdev, ceq_event);
+
+ kfree(ceqs);
+}
+
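+/**
+ * hinic3_get_ceq_irqs - collect the irq info of all initialized ceqs
+ * @hwdev: the pointer to hw device
+ * @irqs: array to fill with the irq info of each ceq
+ * @num_irqs: number of ceq irqs returned
+ **/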
+void hinic3_get_ceq_irqs(struct hinic3_hwdev *hwdev, struct irq_info *irqs,
+ u16 *num_irqs)
+{
+ struct hinic3_ceqs *ceqs = hwdev->ceqs;
+ u16 q_id;
+
+ for (q_id = 0; q_id < ceqs->num_ceqs; q_id++) {
+ irqs[q_id].irq_id = ceqs->ceq[q_id].eq_irq.irq_id;
+ irqs[q_id].msix_entry_idx =
+ ceqs->ceq[q_id].eq_irq.msix_entry_idx;
+ }
+
+ *num_irqs = ceqs->num_ceqs;
+}
+
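+/**
+ * hinic3_get_aeq_irqs - collect the irq info of all initialized aeqs
+ * @hwdev: the pointer to hw device
+ * @irqs: array to fill with the irq info of each aeq
+ * @num_irqs: number of aeq irqs returned
+ **/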
+void hinic3_get_aeq_irqs(struct hinic3_hwdev *hwdev, struct irq_info *irqs,
+ u16 *num_irqs)
+{
+ struct hinic3_aeqs *aeqs = hwdev->aeqs;
+ u16 q_id;
+
+ for (q_id = 0; q_id < aeqs->num_aeqs; q_id++) {
+ irqs[q_id].irq_id = aeqs->aeq[q_id].eq_irq.irq_id;
+ irqs[q_id].msix_entry_idx =
+ aeqs->aeq[q_id].eq_irq.msix_entry_idx;
+ }
+
+ *num_irqs = aeqs->num_aeqs;
+}
+
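+/* Dump the control, index and state registers of every aeq for debugging */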
+void hinic3_dump_aeq_info(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_aeq_elem *aeqe_pos = NULL;
+ struct hinic3_eq *eq = NULL;
+ u32 addr, ci, pi, ctrl0, idx;
+ int q_id;
+
+ for (q_id = 0; q_id < hwdev->aeqs->num_aeqs; q_id++) {
+ eq = &hwdev->aeqs->aeq[q_id];
+ /* Indirect access should set q_id first */
+ hinic3_hwif_write_reg(eq->hwdev->hwif, HINIC3_EQ_INDIR_IDX_ADDR(eq->type),
+ eq->q_id);
+ wmb(); /* write index before config */
+
+ addr = HINIC3_CSR_AEQ_CTRL_0_ADDR;
+
+ ctrl0 = hinic3_hwif_read_reg(hwdev->hwif, addr);
+
+ idx = hinic3_hwif_read_reg(hwdev->hwif, HINIC3_EQ_INDIR_IDX_ADDR(eq->type));
+
+ addr = EQ_CONS_IDX_REG_ADDR(eq);
+ ci = hinic3_hwif_read_reg(hwdev->hwif, addr);
+ addr = EQ_PROD_IDX_REG_ADDR(eq);
+ pi = hinic3_hwif_read_reg(hwdev->hwif, addr);
+ aeqe_pos = GET_CURR_AEQ_ELEM(eq);
+ sdk_err(hwdev->dev_hdl,
+ "Aeq id: %d, idx: %u, ctrl0: 0x%08x, ci: 0x%08x, pi: 0x%x, work_state: 0x%x, wrap: %u, desc: 0x%x swci:0x%x\n",
+ q_id, idx, ctrl0, ci, pi, work_busy(&eq->aeq_work),
+ eq->wrapped, be32_to_cpu(aeqe_pos->desc), eq->cons_idx);
+ }
+
+ hinic3_show_chip_err_info(hwdev);
+}
+
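+/* Dump the index and state registers of every ceq for debugging */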
+void hinic3_dump_ceq_info(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_eq *eq = NULL;
+ u32 addr, ci, pi;
+ int q_id;
+
+ for (q_id = 0; q_id < hwdev->ceqs->num_ceqs; q_id++) {
+ eq = &hwdev->ceqs->ceq[q_id];
+ /* Indirect access should set q_id first */
+ hinic3_hwif_write_reg(eq->hwdev->hwif,
+ HINIC3_EQ_INDIR_IDX_ADDR(eq->type),
+ eq->q_id);
+ wmb(); /* write index before config */
+
+ addr = EQ_CONS_IDX_REG_ADDR(eq);
+ ci = hinic3_hwif_read_reg(hwdev->hwif, addr);
+ addr = EQ_PROD_IDX_REG_ADDR(eq);
+ pi = hinic3_hwif_read_reg(hwdev->hwif, addr);
+ sdk_err(hwdev->dev_hdl,
+ "Ceq id: %d, ci: 0x%08x, sw_ci: 0x%08x, pi: 0x%x, tasklet_state: 0x%lx, wrap: %u, ceqe: 0x%x\n",
+ q_id, ci, eq->cons_idx, pi,
+ tasklet_state(&eq->ceq_tasklet),
+ eq->wrapped, be32_to_cpu(*(GET_CURR_CEQ_ELEM(eq))));
+
+ sdk_err(hwdev->dev_hdl, "Ceq last response hard interrupt time: %u\n",
+ jiffies_to_msecs(jiffies - eq->hard_intr_jif));
+ sdk_err(hwdev->dev_hdl, "Ceq last response soft interrupt time: %u\n",
+ jiffies_to_msecs(jiffies - eq->soft_intr_jif));
+ }
+
+ hinic3_show_chip_err_info(hwdev);
+}
+
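+/* Report the length, page layout and element size of the given ceq */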
+int hinic3_get_ceq_info(void *hwdev, u16 q_id, struct hinic3_ceq_info *ceq_info)
+{
+ struct hinic3_hwdev *dev = hwdev;
+ struct hinic3_eq *eq = NULL;
+
+ if (!hwdev || !ceq_info)
+ return -EINVAL;
+
+ if (q_id >= dev->ceqs->num_ceqs)
+ return -EINVAL;
+
+ eq = &dev->ceqs->ceq[q_id];
+ ceq_info->q_len = eq->eq_len;
+ ceq_info->num_pages = eq->num_pages;
+ ceq_info->page_size = eq->page_size;
+ ceq_info->num_elem_in_pg = eq->num_elem_in_pg;
+ ceq_info->elem_size = eq->elem_size;
+ sdk_info(dev->dev_hdl, "get_ceq_info: qid=0x%x page_size=%ul\n",
+ q_id, eq->page_size);
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_get_ceq_info);
+
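+/* Return the aligned physical address of one queue page of the given ceq */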
+int hinic3_get_ceq_page_phy_addr(void *hwdev, u16 q_id,
+ u16 page_idx, u64 *page_phy_addr)
+{
+ struct hinic3_hwdev *dev = hwdev;
+ struct hinic3_eq *eq = NULL;
+
+ if (!hwdev || !page_phy_addr)
+ return -EINVAL;
+
+ if (q_id >= dev->ceqs->num_ceqs)
+ return -EINVAL;
+
+ eq = &dev->ceqs->ceq[q_id];
+ if (page_idx >= eq->num_pages)
+ return -EINVAL;
+
+ *page_phy_addr = eq->eq_pages[page_idx].align_paddr;
+ sdk_info(dev->dev_hdl, "ceq_page_phy_addr: 0x%llx page_idx=%u\n",
+ eq->eq_pages[page_idx].align_paddr, page_idx);
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_get_ceq_page_phy_addr);
+
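+/* Disable the msix vector bound to the given ceq */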
+int hinic3_set_ceq_irq_disable(void *hwdev, u16 q_id)
+{
+ struct hinic3_hwdev *dev = hwdev;
+ struct hinic3_eq *ceq = NULL;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ if (q_id >= dev->ceqs->num_ceqs)
+ return -EINVAL;
+
+ ceq = &dev->ceqs->ceq[q_id];
+
+ hinic3_set_msix_state(ceq->hwdev, ceq->eq_irq.msix_entry_idx,
+ HINIC3_MSIX_DISABLE);
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_set_ceq_irq_disable);
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_eqs.h b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_eqs.h
new file mode 100644
index 000000000000..a6b83c3bc563
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_eqs.h
@@ -0,0 +1,164 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_EQS_H
+#define HINIC3_EQS_H
+
+#include <linux/types.h>
+#include <linux/interrupt.h>
+#include <linux/workqueue.h>
+
+#include "hinic3_common.h"
+#include "hinic3_crm.h"
+#include "hinic3_hw.h"
+#include "hinic3_hwdev.h"
+
+#define HINIC3_MAX_AEQS 4
+#define HINIC3_MAX_CEQS 32
+
+#define HINIC3_AEQ_MAX_PAGES 4
+#define HINIC3_CEQ_MAX_PAGES 8
+
+#define HINIC3_AEQE_SIZE 64
+#define HINIC3_CEQE_SIZE 4
+
+#define HINIC3_AEQE_DESC_SIZE 4
+#define HINIC3_AEQE_DATA_SIZE \
+ (HINIC3_AEQE_SIZE - HINIC3_AEQE_DESC_SIZE)
+
+#define HINIC3_DEFAULT_AEQ_LEN 0x10000
+#define HINIC3_DEFAULT_CEQ_LEN 0x10000
+
+#define HINIC3_MIN_EQ_PAGE_SIZE 0x1000 /* min eq page size 4K Bytes */
+#define HINIC3_MAX_EQ_PAGE_SIZE 0x400000 /* max eq page size 4M Bytes */
+
+#define HINIC3_MIN_AEQ_LEN 64
+#define HINIC3_MAX_AEQ_LEN \
+ ((HINIC3_MAX_EQ_PAGE_SIZE / HINIC3_AEQE_SIZE) * HINIC3_AEQ_MAX_PAGES)
+
+#define HINIC3_MIN_CEQ_LEN 64
+#define HINIC3_MAX_CEQ_LEN \
+ ((HINIC3_MAX_EQ_PAGE_SIZE / HINIC3_CEQE_SIZE) * HINIC3_CEQ_MAX_PAGES)
+#define HINIC3_CEQ_ID_CMDQ 0
+
+#define EQ_IRQ_NAME_LEN 64
+
+#define EQ_USLEEP_LOW_BOUND 900
+#define EQ_USLEEP_HIG_BOUND 1000
+
+enum hinic3_eq_type {
+ HINIC3_AEQ,
+ HINIC3_CEQ
+};
+
+enum hinic3_eq_intr_mode {
+ HINIC3_INTR_MODE_ARMED,
+ HINIC3_INTR_MODE_ALWAYS,
+};
+
+enum hinic3_eq_ci_arm_state {
+ HINIC3_EQ_NOT_ARMED,
+ HINIC3_EQ_ARMED,
+};
+
+struct hinic3_eq {
+ struct hinic3_hwdev *hwdev;
+ u16 q_id;
+ u16 rsvd1;
+ enum hinic3_eq_type type;
+ u32 page_size;
+ u32 orig_page_size;
+ u32 eq_len;
+
+ u32 cons_idx;
+ u16 wrapped;
+ u16 rsvd2;
+
+ u16 elem_size;
+ u16 num_pages;
+ u32 num_elem_in_pg;
+
+ struct irq_info eq_irq;
+ char irq_name[EQ_IRQ_NAME_LEN];
+
+ struct hinic3_dma_addr_align *eq_pages;
+
+ struct work_struct aeq_work;
+ struct tasklet_struct ceq_tasklet;
+
+ u64 hard_intr_jif;
+ u64 soft_intr_jif;
+
+ u64 rsvd3;
+};
+
+struct hinic3_aeq_elem {
+ u8 aeqe_data[HINIC3_AEQE_DATA_SIZE];
+ u32 desc;
+};
+
+enum hinic3_aeq_cb_state {
+ HINIC3_AEQ_HW_CB_REG = 0,
+ HINIC3_AEQ_HW_CB_RUNNING,
+ HINIC3_AEQ_SW_CB_REG,
+ HINIC3_AEQ_SW_CB_RUNNING,
+};
+
+struct hinic3_aeqs {
+ struct hinic3_hwdev *hwdev;
+
+ hinic3_aeq_hwe_cb aeq_hwe_cb[HINIC3_MAX_AEQ_EVENTS];
+ void *aeq_hwe_cb_data[HINIC3_MAX_AEQ_EVENTS];
+ hinic3_aeq_swe_cb aeq_swe_cb[HINIC3_MAX_AEQ_SW_EVENTS];
+ void *aeq_swe_cb_data[HINIC3_MAX_AEQ_SW_EVENTS];
+ unsigned long aeq_hw_cb_state[HINIC3_MAX_AEQ_EVENTS];
+ unsigned long aeq_sw_cb_state[HINIC3_MAX_AEQ_SW_EVENTS];
+
+ struct hinic3_eq aeq[HINIC3_MAX_AEQS];
+ u16 num_aeqs;
+ u16 rsvd1;
+ u32 rsvd2;
+
+ struct workqueue_struct *workq;
+};
+
+enum hinic3_ceq_cb_state {
+ HINIC3_CEQ_CB_REG = 0,
+ HINIC3_CEQ_CB_RUNNING,
+};
+
+struct hinic3_ceqs {
+ struct hinic3_hwdev *hwdev;
+
+ hinic3_ceq_event_cb ceq_cb[HINIC3_MAX_CEQ_EVENTS];
+ void *ceq_cb_data[HINIC3_MAX_CEQ_EVENTS];
+ void *ceq_data[HINIC3_MAX_CEQ_EVENTS];
+ unsigned long ceq_cb_state[HINIC3_MAX_CEQ_EVENTS];
+
+ struct hinic3_eq ceq[HINIC3_MAX_CEQS];
+ u16 num_ceqs;
+ u16 rsvd1;
+ u32 rsvd2;
+};
+
+int hinic3_aeqs_init(struct hinic3_hwdev *hwdev, u16 num_aeqs,
+ struct irq_info *msix_entries);
+
+void hinic3_aeqs_free(struct hinic3_hwdev *hwdev);
+
+int hinic3_ceqs_init(struct hinic3_hwdev *hwdev, u16 num_ceqs,
+ struct irq_info *msix_entries);
+
+void hinic3_ceqs_free(struct hinic3_hwdev *hwdev);
+
+void hinic3_get_ceq_irqs(struct hinic3_hwdev *hwdev, struct irq_info *irqs,
+ u16 *num_irqs);
+
+void hinic3_get_aeq_irqs(struct hinic3_hwdev *hwdev, struct irq_info *irqs,
+ u16 *num_irqs);
+
+void hinic3_dump_ceq_info(struct hinic3_hwdev *hwdev);
+
+void hinic3_dump_aeq_info(struct hinic3_hwdev *hwdev);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_api.c b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_api.c
new file mode 100644
index 000000000000..a4cbac8e4cc1
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_api.c
@@ -0,0 +1,453 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#include "ossl_knl.h"
+#include "hinic3_hw.h"
+#include "hinic3_common.h"
+#include "hinic3_hwdev.h"
+#include "hinic3_api_cmd.h"
+#include "hinic3_mgmt.h"
+#include "hinic3_hw_api.h"
+#ifndef HTONL
+#define HTONL(x) \
+ ((((x) & 0x000000ff) << 24) \
+ | (((x) & 0x0000ff00) << 8) \
+ | (((x) & 0x00ff0000) >> 8) \
+ | (((x) & 0xff000000) >> 24))
+#endif
+
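+/* Build a counter read request; the head and ctr_id fields are converted to big endian */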
+static void hinic3_sml_ctr_read_build_req(struct chipif_sml_ctr_rd_req *msg,
+ u8 instance_id, u8 op_id,
+ u8 ack, u32 ctr_id, u32 init_val)
+{
+ msg->head.value = 0;
+ msg->head.bs.instance = instance_id;
+ msg->head.bs.op_id = op_id;
+ msg->head.bs.ack = ack;
+ msg->head.value = HTONL(msg->head.value);
+ msg->ctr_id = ctr_id;
+ msg->ctr_id = HTONL(msg->ctr_id);
+ msg->initial = init_val;
+}
+
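+/* Byte-swap an array of len u32 words in place (used to convert big-endian responses) */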
+static void sml_ctr_htonl_n(u32 *node, u32 len)
+{
+ u32 i;
+ u32 *node_new = node;
+
+ for (i = 0; i < len; i++) {
+ *node_new = HTONL(*node_new);
+ node_new++;
+ }
+}
+
+/**
+ * hinic3_sm_ctr_rd16 - read a 16-bit small single counter
+ * @hwdev: the hardware device
+ * @node: the node id
+ * @instance: instance id
+ * @ctr_id: counter id
+ * @value: read counter value ptr
+ * Return: 0 - success, negative - failure
+ **/
+int hinic3_sm_ctr_rd16(void *hwdev, u8 node, u8 instance, u32 ctr_id,
+ u16 *value)
+{
+ struct chipif_sml_ctr_rd_req req;
+ union ctr_rd_rsp rsp;
+ int ret;
+
+ if (!hwdev || !value)
+ return -EFAULT;
+
+ if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ memset(&req, 0, sizeof(req));
+
+ hinic3_sml_ctr_read_build_req(&req, instance, CHIPIF_SM_CTR_OP_READ,
+ CHIPIF_ACK, ctr_id, 0);
+
+ ret = hinic3_api_cmd_read_ack(hwdev, node, (u8 *)&req,
+ (unsigned short)sizeof(req),
+ (void *)&rsp,
+ (unsigned short)sizeof(rsp));
+ if (ret) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Sm 16bit counter read fail, err(%d)\n", ret);
+ return ret;
+ }
+ sml_ctr_htonl_n((u32 *)&rsp, sizeof(rsp) / sizeof(u32));
+ *value = rsp.bs_ss16_rsp.value1;
+
+ return 0;
+}
+
+/**
+ * hinic3_sm_ctr_rd32 - read a 32-bit small single counter
+ * @hwdev: the hardware device
+ * @node: the node id
+ * @instance: instance id
+ * @ctr_id: counter id
+ * @value: read counter value ptr
+ * Return: 0 - success, negative - failure
+ **/
+int hinic3_sm_ctr_rd32(void *hwdev, u8 node, u8 instance, u32 ctr_id,
+ u32 *value)
+{
+ struct chipif_sml_ctr_rd_req req;
+ union ctr_rd_rsp rsp;
+ int ret;
+
+ if (!hwdev || !value)
+ return -EFAULT;
+
+ if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ memset(&req, 0, sizeof(req));
+
+ hinic3_sml_ctr_read_build_req(&req, instance, CHIPIF_SM_CTR_OP_READ,
+ CHIPIF_ACK, ctr_id, 0);
+
+ ret = hinic3_api_cmd_read_ack(hwdev, node, (u8 *)&req,
+ (unsigned short)sizeof(req),
+ (void *)&rsp,
+ (unsigned short)sizeof(rsp));
+ if (ret) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Sm 32bit counter read fail, err(%d)\n", ret);
+ return ret;
+ }
+ sml_ctr_htonl_n((u32 *)&rsp, sizeof(rsp) / sizeof(u32));
+ *value = rsp.bs_ss32_rsp.value1;
+
+ return 0;
+}
+
+/**
+ * hinic3_sm_ctr_rd32_clear - read a 32-bit small single counter and clear it to zero
+ * @hwdev: the hardware device
+ * @node: the node id
+ * @instance: instance id
+ * @ctr_id: counter id
+ * @value: read counter value ptr
+ * Return: 0 - success, negative - failure
+ **/
+int hinic3_sm_ctr_rd32_clear(void *hwdev, u8 node, u8 instance,
+ u32 ctr_id, u32 *value)
+{
+ struct chipif_sml_ctr_rd_req req;
+ union ctr_rd_rsp rsp;
+ int ret;
+
+ if (!hwdev || !value)
+ return -EFAULT;
+
+ if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ memset(&req, 0, sizeof(req));
+
+ hinic3_sml_ctr_read_build_req(&req, instance,
+ CHIPIF_SM_CTR_OP_READ_CLEAR,
+ CHIPIF_ACK, ctr_id, 0);
+
+ ret = hinic3_api_cmd_read_ack(hwdev, node, (u8 *)&req,
+ (unsigned short)sizeof(req),
+ (void *)&rsp,
+ (unsigned short)sizeof(rsp));
+ if (ret) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Sm 32bit counter clear fail, err(%d)\n", ret);
+ return ret;
+ }
+ sml_ctr_htonl_n((u32 *)&rsp, sizeof(rsp) / sizeof(u32));
+ *value = rsp.bs_ss32_rsp.value1;
+
+ return 0;
+}
+
+/**
+ * hinic3_sm_ctr_rd64_pair - read a 64-bit counter pair (128 bits in total)
+ * @hwdev: the hardware device
+ * @node: the node id
+ * @instance: instance id
+ * @ctr_id: counter id, must be an even number
+ * @value1: first read counter value ptr
+ * @value2: second read counter value ptr
+ * Return: 0 - success, negative - failure
+ **/
+int hinic3_sm_ctr_rd64_pair(void *hwdev, u8 node, u8 instance,
+ u32 ctr_id, u64 *value1, u64 *value2)
+{
+ struct chipif_sml_ctr_rd_req req;
+ union ctr_rd_rsp rsp;
+ int ret;
+
+ if (!value1) {
+ pr_err("First value is NULL for read 64 bit pair\n");
+ return -EFAULT;
+ }
+
+ if (!value2) {
+ pr_err("Second value is NULL for read 64 bit pair\n");
+ return -EFAULT;
+ }
+
+ if (!hwdev || ((ctr_id & 0x1) != 0)) {
+ pr_err("Hwdev is NULL or ctr_id(%d) is odd number for read 64 bit pair\n",
+ ctr_id);
+ return -EFAULT;
+ }
+
+ if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ memset(&req, 0, sizeof(req));
+
+ hinic3_sml_ctr_read_build_req(&req, instance, CHIPIF_SM_CTR_OP_READ,
+ CHIPIF_ACK, ctr_id, 0);
+
+ ret = hinic3_api_cmd_read_ack(hwdev, node, (u8 *)&req,
+ (unsigned short)sizeof(req), (void *)&rsp,
+ (unsigned short)sizeof(rsp));
+ if (ret) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Sm 64 bit rd pair ret(%d)\n", ret);
+ return ret;
+ }
+ sml_ctr_htonl_n((u32 *)&rsp, sizeof(rsp) / sizeof(u32));
+ *value1 = ((u64)rsp.bs_bp64_rsp.val1_h << BIT_32) | rsp.bs_bp64_rsp.val1_l;
+ *value2 = ((u64)rsp.bs_bp64_rsp.val2_h << BIT_32) | rsp.bs_bp64_rsp.val2_l;
+
+ return 0;
+}
+
+/**
+ * hinic3_sm_ctr_rd64_pair_clear - read a 64-bit counter pair (128 bits in total) and clear it to zero
+ * @hwdev: the hardware device
+ * @node: the node id
+ * @instance: instance id
+ * @ctr_id: counter id, must be an even number
+ * @value1: first read counter value ptr
+ * @value2: second read counter value ptr
+ * Return: 0 - success, negative - failure
+ **/
+int hinic3_sm_ctr_rd64_pair_clear(void *hwdev, u8 node, u8 instance, u32 ctr_id,
+ u64 *value1, u64 *value2)
+{
+ struct chipif_sml_ctr_rd_req req = {0};
+ union ctr_rd_rsp rsp;
+ int ret;
+
+ if (!hwdev || !value1 || !value2 || ((ctr_id & 0x1) != 0)) {
+ pr_err("Hwdev or value1 or value2 is NULL or ctr_id(%u) is odd number\n", ctr_id);
+ return -EINVAL;
+ }
+
+ if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ hinic3_sml_ctr_read_build_req(&req, instance,
+ CHIPIF_SM_CTR_OP_READ_CLEAR,
+ CHIPIF_ACK, ctr_id, 0);
+
+ ret = hinic3_api_cmd_read_ack(hwdev, node, (u8 *)&req,
+ (unsigned short)sizeof(req), (void *)&rsp,
+ (unsigned short)sizeof(rsp));
+ if (ret) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Sm 64 bit clear pair fail. ret(%d)\n", ret);
+ return ret;
+ }
+ sml_ctr_htonl_n((u32 *)&rsp, sizeof(rsp) / sizeof(u32));
+ *value1 = ((u64)rsp.bs_bp64_rsp.val1_h << BIT_32) | rsp.bs_bp64_rsp.val1_l;
+ *value2 = ((u64)rsp.bs_bp64_rsp.val2_h << BIT_32) | rsp.bs_bp64_rsp.val2_l;
+
+ return 0;
+}
+
+/**
+ * hinic3_sm_ctr_rd64 - read a 64-bit big counter
+ * @hwdev: the hardware device
+ * @node: the node id
+ * @instance: instance id
+ * @ctr_id: counter id
+ * @value: read counter value ptr
+ * Return: 0 - success, negative - failure
+ **/
+int hinic3_sm_ctr_rd64(void *hwdev, u8 node, u8 instance, u32 ctr_id,
+ u64 *value)
+{
+ struct chipif_sml_ctr_rd_req req;
+ union ctr_rd_rsp rsp;
+ int ret;
+
+ if (!hwdev || !value)
+ return -EFAULT;
+
+ if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ memset(&req, 0, sizeof(req));
+
+ hinic3_sml_ctr_read_build_req(&req, instance, CHIPIF_SM_CTR_OP_READ,
+ CHIPIF_ACK, ctr_id, 0);
+
+ ret = hinic3_api_cmd_read_ack(hwdev, node, (u8 *)&req,
+ (unsigned short)sizeof(req), (void *)&rsp,
+ (unsigned short)sizeof(rsp));
+ if (ret) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Sm 64bit counter read fail err(%d)\n", ret);
+ return ret;
+ }
+ sml_ctr_htonl_n((u32 *)&rsp, sizeof(rsp) / sizeof(u32));
+ *value = ((u64)rsp.bs_bs64_rsp.value1 << BIT_32) | rsp.bs_bs64_rsp.value2;
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_sm_ctr_rd64);
+
+/**
+ * hinic3_sm_ctr_rd64_clear - read a 64-bit big counter and clear it to zero
+ * @hwdev: the hardware device
+ * @node: the node id
+ * @instance: instance id
+ * @ctr_id: counter id
+ * @value: read counter value ptr
+ * Return: 0 - success, negative - failure
+ **/
+int hinic3_sm_ctr_rd64_clear(void *hwdev, u8 node, u8 instance, u32 ctr_id,
+ u64 *value)
+{
+ struct chipif_sml_ctr_rd_req req = {0};
+ union ctr_rd_rsp rsp;
+ int ret;
+
+ if (!hwdev || !value)
+ return -EINVAL;
+
+ if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ hinic3_sml_ctr_read_build_req(&req, instance,
+ CHIPIF_SM_CTR_OP_READ_CLEAR,
+ CHIPIF_ACK, ctr_id, 0);
+
+ ret = hinic3_api_cmd_read_ack(hwdev, node, (u8 *)&req,
+ (unsigned short)sizeof(req), (void *)&rsp,
+ (unsigned short)sizeof(rsp));
+ if (ret) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Sm 64bit counter clear fail err(%d)\n", ret);
+ return ret;
+ }
+ sml_ctr_htonl_n((u32 *)&rsp, sizeof(rsp) / sizeof(u32));
+ *value = ((u64)rsp.bs_bs64_rsp.value1 << BIT_32) | rsp.bs_bs64_rsp.value2;
+
+ return 0;
+}
+
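+/**
+ * hinic3_api_csr_rd32 - read a 32-bit csr register via the api chain
+ * @hwdev: the hardware device
+ * @dest: destination node id
+ * @addr: csr address to read
+ * @val: read value ptr
+ * Return: 0 - success, negative - failure
+ **/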
+int hinic3_api_csr_rd32(void *hwdev, u8 dest, u32 addr, u32 *val)
+{
+ struct hinic3_csr_request_api_data api_data = {0};
+ u32 csr_val = 0;
+ u16 in_size = sizeof(api_data);
+ int ret;
+
+ if (!hwdev || !val)
+ return -EFAULT;
+
+ if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ memset(&api_data, 0, sizeof(struct hinic3_csr_request_api_data));
+ api_data.dw0 = 0;
+ api_data.dw1.bits.operation_id = HINIC3_CSR_OPERATION_READ_CSR;
+ api_data.dw1.bits.need_response = HINIC3_CSR_NEED_RESP_DATA;
+ api_data.dw1.bits.data_size = HINIC3_CSR_DATA_SZ_32;
+ api_data.dw1.val32 = cpu_to_be32(api_data.dw1.val32);
+ api_data.dw2.bits.csr_addr = addr;
+ api_data.dw2.val32 = cpu_to_be32(api_data.dw2.val32);
+
+ ret = hinic3_api_cmd_read_ack(hwdev, dest, (u8 *)(&api_data),
+ in_size, &csr_val, 0x4);
+ if (ret) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Read 32 bit csr fail, dest %u addr 0x%x, ret: 0x%x\n",
+ dest, addr, ret);
+ return ret;
+ }
+
+ *val = csr_val;
+
+ return 0;
+}
+
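+/**
+ * hinic3_api_csr_wr32 - write a 32-bit value to a csr register via the api chain
+ * @hwdev: the hardware device
+ * @dest: destination node id
+ * @addr: csr address to write
+ * @val: value to write
+ * Return: 0 - success, negative - failure
+ **/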
+int hinic3_api_csr_wr32(void *hwdev, u8 dest, u32 addr, u32 val)
+{
+ struct hinic3_csr_request_api_data api_data;
+ u16 in_size = sizeof(api_data);
+ int ret;
+
+ if (!hwdev)
+ return -EFAULT;
+
+ if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ memset(&api_data, 0, sizeof(struct hinic3_csr_request_api_data));
+ api_data.dw1.bits.operation_id = HINIC3_CSR_OPERATION_WRITE_CSR;
+ api_data.dw1.bits.need_response = HINIC3_CSR_NO_RESP_DATA;
+ api_data.dw1.bits.data_size = HINIC3_CSR_DATA_SZ_32;
+ api_data.dw1.val32 = cpu_to_be32(api_data.dw1.val32);
+ api_data.dw2.bits.csr_addr = addr;
+ api_data.dw2.val32 = cpu_to_be32(api_data.dw2.val32);
+ api_data.csr_write_data_h = 0xffffffff;
+ api_data.csr_write_data_l = val;
+
+ ret = hinic3_api_cmd_write_nack(hwdev, dest, (u8 *)(&api_data),
+ in_size);
+ if (ret) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Write 32 bit csr fail! dest %u addr 0x%x val 0x%x\n",
+ dest, addr, val);
+ return ret;
+ }
+
+ return 0;
+}
+
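+/**
+ * hinic3_api_csr_rd64 - read a 64-bit csr register via the api chain
+ * @hwdev: the hardware device
+ * @dest: destination node id
+ * @addr: csr address to read
+ * @val: read value ptr
+ * Return: 0 - success, negative - failure
+ **/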
+int hinic3_api_csr_rd64(void *hwdev, u8 dest, u32 addr, u64 *val)
+{
+ struct hinic3_csr_request_api_data api_data = {0};
+ u64 csr_val = 0;
+ u16 in_size = sizeof(api_data);
+ int ret;
+
+ if (!hwdev || !val)
+ return -EFAULT;
+
+ if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ memset(&api_data, 0, sizeof(struct hinic3_csr_request_api_data));
+ api_data.dw0 = 0;
+ api_data.dw1.bits.operation_id = HINIC3_CSR_OPERATION_READ_CSR;
+ api_data.dw1.bits.need_response = HINIC3_CSR_NEED_RESP_DATA;
+ api_data.dw1.bits.data_size = HINIC3_CSR_DATA_SZ_64;
+ api_data.dw1.val32 = cpu_to_be32(api_data.dw1.val32);
+ api_data.dw2.bits.csr_addr = addr;
+ api_data.dw2.val32 = cpu_to_be32(api_data.dw2.val32);
+
+ ret = hinic3_api_cmd_read_ack(hwdev, dest, (u8 *)(&api_data),
+ in_size, &csr_val, 0x8);
+ if (ret) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Read 64 bit csr fail, dest %u addr 0x%x\n",
+ dest, addr);
+ return ret;
+ }
+
+ *val = csr_val;
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_api_csr_rd64);
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_api.h b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_api.h
new file mode 100644
index 000000000000..9ec812eac684
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_api.h
@@ -0,0 +1,141 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_HW_API_H
+#define HINIC3_HW_API_H
+
+#include <linux/types.h>
+
+#define CHIPIF_ACK 1
+#define CHIPIF_NOACK 0
+
+#define CHIPIF_SM_CTR_OP_READ 0x2
+#define CHIPIF_SM_CTR_OP_READ_CLEAR 0x6
+
+#define BIT_32 32
+
+/* request head */
+union chipif_sml_ctr_req_head {
+ struct {
+ u32 pad:15;
+ u32 ack:1;
+ u32 op_id:5;
+ u32 instance:6;
+ u32 src:5;
+ } bs;
+
+ u32 value;
+};
+
+/* counter read request struct */
+struct chipif_sml_ctr_rd_req {
+ u32 extra;
+ union chipif_sml_ctr_req_head head;
+ u32 ctr_id;
+ u32 initial;
+ u32 pad;
+};
+
+struct hinic3_csr_request_api_data {
+ u32 dw0;
+
+ union {
+ struct {
+ u32 reserved1:13;
+ /* this field indicates the write/read data size:
+ * 2'b00: 32 bits
+ * 2'b01: 64 bits
+ * 2'b10~2'b11:reserved
+ */
+ u32 data_size:2;
+			/* this field indicates whether the requestor expects
+			 * to receive response data.
+			 * 1'b0: no response data is expected.
+			 * 1'b1: response data is expected.
+			 */
+			u32 need_response:1;
+			/* this field indicates the operation that the
+			 * requestor expects.
+			 * 5'b1_1110: write value to csr space.
+			 * 5'b1_1111: read register from csr space.
+			 */
+			u32 operation_id:5;
+ u32 reserved2:6;
+ /* this field specifies the Src node ID for this API
+ * request message.
+ */
+ u32 src_node_id:5;
+ } bits;
+
+ u32 val32;
+ } dw1;
+
+ union {
+ struct {
+ /* it specifies the CSR address. */
+ u32 csr_addr:26;
+ u32 reserved3:6;
+ } bits;
+
+ u32 val32;
+ } dw2;
+
+	/* if data_size = 2'b01, this is the high 32 bits of the write data;
+	 * otherwise it is 32'hFFFF_FFFF.
+	 */
+ u32 csr_write_data_h;
+ /* the low 32 bits of write data. */
+ u32 csr_write_data_l;
+};
+
+/* counter read response union */
+union ctr_rd_rsp {
+ struct {
+ u32 value1:16;
+ u32 pad0:16;
+ u32 pad1[3];
+ } bs_ss16_rsp;
+
+ struct {
+ u32 value1;
+ u32 pad[3];
+ } bs_ss32_rsp;
+
+ struct {
+ u32 value1:20;
+ u32 pad0:12;
+ u32 value2:12;
+ u32 pad1:20;
+ u32 pad2[2];
+ } bs_sp_rsp;
+
+ struct {
+ u32 value1;
+ u32 value2;
+ u32 pad[2];
+ } bs_bs64_rsp;
+
+ struct {
+ u32 val1_h;
+ u32 val1_l;
+ u32 val2_h;
+ u32 val2_l;
+ } bs_bp64_rsp;
+};
+
+enum HINIC3_CSR_API_DATA_OPERATION_ID {
+ HINIC3_CSR_OPERATION_WRITE_CSR = 0x1E,
+ HINIC3_CSR_OPERATION_READ_CSR = 0x1F
+};
+
+enum HINIC3_CSR_API_DATA_NEED_RESPONSE_DATA {
+ HINIC3_CSR_NO_RESP_DATA = 0,
+ HINIC3_CSR_NEED_RESP_DATA = 1
+};
+
+enum HINIC3_CSR_API_DATA_DATA_SIZE {
+ HINIC3_CSR_DATA_SZ_32 = 0,
+ HINIC3_CSR_DATA_SZ_64 = 1
+};
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_cfg.c b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_cfg.c
new file mode 100644
index 000000000000..08a1b8f15cb7
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_cfg.c
@@ -0,0 +1,1480 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [COMM]" fmt
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/mutex.h>
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <linux/module.h>
+#include <linux/semaphore.h>
+
+#include "ossl_knl.h"
+#include "hinic3_crm.h"
+#include "hinic3_hw.h"
+#include "hinic3_hwdev.h"
+#include "hinic3_hwif.h"
+#include "cfg_mgt_comm_pub.h"
+#include "hinic3_hw_cfg.h"
+
+static void parse_pub_res_cap_dfx(struct hinic3_hwdev *hwdev,
+ const struct service_cap *cap)
+{
+ sdk_info(hwdev->dev_hdl, "Get public resource capbility: svc_cap_en: 0x%x\n",
+ cap->svc_type);
+ sdk_info(hwdev->dev_hdl, "Host_id: 0x%x, ep_id: 0x%x, er_id: 0x%x, port_id: 0x%x\n",
+ cap->host_id, cap->ep_id, cap->er_id, cap->port_id);
+ sdk_info(hwdev->dev_hdl, "cos_bitmap: 0x%x, flexq: 0x%x, virtio_vq_size: 0x%x\n",
+ cap->cos_valid_bitmap, cap->flexq_en, cap->virtio_vq_size);
+ sdk_info(hwdev->dev_hdl, "Host_total_function: 0x%x, host_oq_id_mask_val: 0x%x, max_vf: 0x%x\n",
+ cap->host_total_function, cap->host_oq_id_mask_val,
+ cap->max_vf);
+ sdk_info(hwdev->dev_hdl, "Host_pf_num: 0x%x, pf_id_start: 0x%x, host_vf_num: 0x%x, vf_id_start: 0x%x\n",
+ cap->pf_num, cap->pf_id_start, cap->vf_num, cap->vf_id_start);
+ sdk_info(hwdev->dev_hdl, "host_valid_bitmap: 0x%x, master_host_id: 0x%x, srv_multi_host_mode: 0x%x\n",
+ cap->host_valid_bitmap, cap->master_host_id, cap->srv_multi_host_mode);
+ sdk_info(hwdev->dev_hdl,
+ "fake_vf_start_id: 0x%x, fake_vf_num: 0x%x, fake_vf_max_pctx: 0x%x\n",
+ cap->fake_vf_start_id, cap->fake_vf_num, cap->fake_vf_max_pctx);
+ sdk_info(hwdev->dev_hdl, "fake_vf_bfilter_start_addr: 0x%x, fake_vf_bfilter_len: 0x%x\n",
+ cap->fake_vf_bfilter_start_addr, cap->fake_vf_bfilter_len);
+}
+
+static void parse_cqm_res_cap(struct hinic3_hwdev *hwdev, struct service_cap *cap,
+ struct cfg_cmd_dev_cap *dev_cap)
+{
+ struct dev_sf_svc_attr *attr = &cap->sf_svc_attr;
+
+ cap->fake_vf_start_id = dev_cap->fake_vf_start_id;
+ cap->fake_vf_num = dev_cap->fake_vf_num;
+ cap->fake_vf_max_pctx = dev_cap->fake_vf_max_pctx;
+ cap->fake_vf_num_cfg = dev_cap->fake_vf_num;
+ cap->fake_vf_bfilter_start_addr = dev_cap->fake_vf_bfilter_start_addr;
+ cap->fake_vf_bfilter_len = dev_cap->fake_vf_bfilter_len;
+
+ if (COMM_SUPPORT_VIRTIO_VQ_SIZE(hwdev))
+ cap->virtio_vq_size = (u16)(VIRTIO_BASE_VQ_SIZE << dev_cap->virtio_vq_size);
+ else
+ cap->virtio_vq_size = VIRTIO_DEFAULT_VQ_SIZE;
+
+ if (dev_cap->sf_svc_attr & SF_SVC_FT_BIT)
+ attr->ft_en = true;
+ else
+ attr->ft_en = false;
+
+ if (dev_cap->sf_svc_attr & SF_SVC_RDMA_BIT)
+ attr->rdma_en = true;
+ else
+ attr->rdma_en = false;
+
+	/* PPF will overwrite it when parsing dynamic resources */
+ if (dev_cap->func_sf_en)
+ cap->sf_en = true;
+ else
+ cap->sf_en = false;
+
+ cap->lb_mode = dev_cap->lb_mode;
+ cap->smf_pg = dev_cap->smf_pg;
+
+ cap->timer_en = dev_cap->timer_en;
+ cap->host_oq_id_mask_val = dev_cap->host_oq_id_mask_val;
+ cap->max_connect_num = dev_cap->max_conn_num;
+ cap->max_stick2cache_num = dev_cap->max_stick2cache_num;
+ cap->bfilter_start_addr = dev_cap->max_bfilter_start_addr;
+ cap->bfilter_len = dev_cap->bfilter_len;
+ cap->hash_bucket_num = dev_cap->hash_bucket_num;
+}
+
+static void parse_pub_res_cap(struct hinic3_hwdev *hwdev,
+ struct service_cap *cap,
+ struct cfg_cmd_dev_cap *dev_cap,
+ enum func_type type)
+{
+ cap->host_id = dev_cap->host_id;
+ cap->ep_id = dev_cap->ep_id;
+ cap->er_id = dev_cap->er_id;
+ cap->port_id = dev_cap->port_id;
+
+ cap->svc_type = dev_cap->svc_cap_en;
+ cap->chip_svc_type = cap->svc_type;
+
+ cap->cos_valid_bitmap = dev_cap->valid_cos_bitmap;
+ cap->port_cos_valid_bitmap = dev_cap->port_cos_valid_bitmap;
+ cap->flexq_en = dev_cap->flexq_en;
+
+ cap->host_total_function = dev_cap->host_total_func;
+ cap->host_valid_bitmap = dev_cap->host_valid_bitmap;
+ cap->master_host_id = dev_cap->master_host_id;
+ cap->srv_multi_host_mode = dev_cap->srv_multi_host_mode;
+
+ if (type != TYPE_VF) {
+ cap->max_vf = dev_cap->max_vf;
+ cap->pf_num = dev_cap->host_pf_num;
+ cap->pf_id_start = dev_cap->pf_id_start;
+ cap->vf_num = dev_cap->host_vf_num;
+ cap->vf_id_start = dev_cap->vf_id_start;
+ } else {
+ cap->max_vf = 0;
+ }
+
+ parse_cqm_res_cap(hwdev, cap, dev_cap);
+ parse_pub_res_cap_dfx(hwdev, cap);
+}
+
+static void parse_dynamic_share_res_cap(struct service_cap *cap,
+ const struct cfg_cmd_dev_cap *dev_cap)
+{
+ if (dev_cap->host_sf_en)
+ cap->sf_en = true;
+ else
+ cap->sf_en = false;
+}
+
+static void parse_l2nic_res_cap(struct hinic3_hwdev *hwdev,
+ struct service_cap *cap,
+ struct cfg_cmd_dev_cap *dev_cap,
+ enum func_type type)
+{
+ struct nic_service_cap *nic_cap = &cap->nic_cap;
+
+ nic_cap->max_sqs = dev_cap->nic_max_sq_id + 1;
+ nic_cap->max_rqs = dev_cap->nic_max_rq_id + 1;
+ nic_cap->default_num_queues = dev_cap->nic_default_num_queues;
+
+ sdk_info(hwdev->dev_hdl, "L2nic resource capbility, max_sqs: 0x%x, max_rqs: 0x%x\n",
+ nic_cap->max_sqs, nic_cap->max_rqs);
+
+ /* Check parameters from firmware */
+ if (nic_cap->max_sqs > HINIC3_CFG_MAX_QP ||
+ nic_cap->max_rqs > HINIC3_CFG_MAX_QP) {
+ sdk_info(hwdev->dev_hdl, "Number of qp exceed limit[1-%d]: sq: %u, rq: %u\n",
+ HINIC3_CFG_MAX_QP, nic_cap->max_sqs, nic_cap->max_rqs);
+ nic_cap->max_sqs = HINIC3_CFG_MAX_QP;
+ nic_cap->max_rqs = HINIC3_CFG_MAX_QP;
+ }
+}
+
+static void parse_fc_res_cap(struct hinic3_hwdev *hwdev,
+ struct service_cap *cap,
+ struct cfg_cmd_dev_cap *dev_cap,
+ enum func_type type)
+{
+ struct dev_fc_svc_cap *fc_cap = &cap->fc_cap.dev_fc_cap;
+
+ fc_cap->max_parent_qpc_num = dev_cap->fc_max_pctx;
+ fc_cap->scq_num = dev_cap->fc_max_scq;
+ fc_cap->srq_num = dev_cap->fc_max_srq;
+ fc_cap->max_child_qpc_num = dev_cap->fc_max_cctx;
+ fc_cap->child_qpc_id_start = dev_cap->fc_cctx_id_start;
+ fc_cap->vp_id_start = dev_cap->fc_vp_id_start;
+ fc_cap->vp_id_end = dev_cap->fc_vp_id_end;
+
+ sdk_info(hwdev->dev_hdl, "Get fc resource capbility\n");
+ sdk_info(hwdev->dev_hdl,
+ "Max_parent_qpc_num: 0x%x, scq_num: 0x%x, srq_num: 0x%x, max_child_qpc_num: 0x%x, child_qpc_id_start: 0x%x\n",
+ fc_cap->max_parent_qpc_num, fc_cap->scq_num, fc_cap->srq_num,
+ fc_cap->max_child_qpc_num, fc_cap->child_qpc_id_start);
+ sdk_info(hwdev->dev_hdl, "Vp_id_start: 0x%x, vp_id_end: 0x%x\n",
+ fc_cap->vp_id_start, fc_cap->vp_id_end);
+}
+
+static void parse_roce_res_cap(struct hinic3_hwdev *hwdev,
+ struct service_cap *cap,
+ struct cfg_cmd_dev_cap *dev_cap,
+ enum func_type type)
+{
+ struct dev_roce_svc_own_cap *roce_cap =
+ &cap->rdma_cap.dev_rdma_cap.roce_own_cap;
+
+ roce_cap->max_qps = dev_cap->roce_max_qp;
+ roce_cap->max_cqs = dev_cap->roce_max_cq;
+ roce_cap->max_srqs = dev_cap->roce_max_srq;
+ roce_cap->max_mpts = dev_cap->roce_max_mpt;
+ roce_cap->max_drc_qps = dev_cap->roce_max_drc_qp;
+
+ roce_cap->wqe_cl_start = dev_cap->roce_wqe_cl_start;
+ roce_cap->wqe_cl_end = dev_cap->roce_wqe_cl_end;
+ roce_cap->wqe_cl_sz = dev_cap->roce_wqe_cl_size;
+
+ sdk_info(hwdev->dev_hdl, "Get roce resource capbility, type: 0x%x\n",
+ type);
+ sdk_info(hwdev->dev_hdl, "Max_qps: 0x%x, max_cqs: 0x%x, max_srqs: 0x%x, max_mpts: 0x%x, max_drcts: 0x%x\n",
+ roce_cap->max_qps, roce_cap->max_cqs, roce_cap->max_srqs,
+ roce_cap->max_mpts, roce_cap->max_drc_qps);
+
+ sdk_info(hwdev->dev_hdl, "Wqe_start: 0x%x, wqe_end: 0x%x, wqe_sz: 0x%x\n",
+ roce_cap->wqe_cl_start, roce_cap->wqe_cl_end,
+ roce_cap->wqe_cl_sz);
+
+ if (roce_cap->max_qps == 0) {
+ if (type == TYPE_PF || type == TYPE_PPF) {
+ roce_cap->max_qps = 0x400;
+ roce_cap->max_cqs = 0x800;
+ roce_cap->max_srqs = 0x400;
+ roce_cap->max_mpts = 0x400;
+ roce_cap->max_drc_qps = 0x40;
+ } else {
+ roce_cap->max_qps = 0x200;
+ roce_cap->max_cqs = 0x400;
+ roce_cap->max_srqs = 0x200;
+ roce_cap->max_mpts = 0x200;
+ roce_cap->max_drc_qps = 0x40;
+ }
+ }
+}
+
+static void parse_rdma_res_cap(struct hinic3_hwdev *hwdev,
+ struct service_cap *cap,
+ struct cfg_cmd_dev_cap *dev_cap,
+ enum func_type type)
+{
+ struct dev_roce_svc_own_cap *roce_cap =
+ &cap->rdma_cap.dev_rdma_cap.roce_own_cap;
+
+ roce_cap->cmtt_cl_start = dev_cap->roce_cmtt_cl_start;
+ roce_cap->cmtt_cl_end = dev_cap->roce_cmtt_cl_end;
+ roce_cap->cmtt_cl_sz = dev_cap->roce_cmtt_cl_size;
+
+ roce_cap->dmtt_cl_start = dev_cap->roce_dmtt_cl_start;
+ roce_cap->dmtt_cl_end = dev_cap->roce_dmtt_cl_end;
+ roce_cap->dmtt_cl_sz = dev_cap->roce_dmtt_cl_size;
+
+ sdk_info(hwdev->dev_hdl, "Get rdma resource capbility, Cmtt_start: 0x%x, cmtt_end: 0x%x, cmtt_sz: 0x%x\n",
+ roce_cap->cmtt_cl_start, roce_cap->cmtt_cl_end,
+ roce_cap->cmtt_cl_sz);
+
+ sdk_info(hwdev->dev_hdl, "Dmtt_start: 0x%x, dmtt_end: 0x%x, dmtt_sz: 0x%x\n",
+ roce_cap->dmtt_cl_start, roce_cap->dmtt_cl_end,
+ roce_cap->dmtt_cl_sz);
+}
+
+static void parse_ovs_res_cap(struct hinic3_hwdev *hwdev,
+ struct service_cap *cap,
+ struct cfg_cmd_dev_cap *dev_cap,
+ enum func_type type)
+{
+ struct ovs_service_cap *ovs_cap = &cap->ovs_cap;
+
+ ovs_cap->dev_ovs_cap.max_pctxs = dev_cap->ovs_max_qpc;
+ ovs_cap->dev_ovs_cap.fake_vf_max_pctx = dev_cap->fake_vf_max_pctx;
+ ovs_cap->dev_ovs_cap.fake_vf_start_id = dev_cap->fake_vf_start_id;
+ ovs_cap->dev_ovs_cap.fake_vf_num = dev_cap->fake_vf_num;
+ ovs_cap->dev_ovs_cap.dynamic_qp_en = dev_cap->flexq_en;
+
+ sdk_info(hwdev->dev_hdl,
+ "Get ovs resource capbility, max_qpc: 0x%x, fake_vf_start_id: 0x%x, fake_vf_num: 0x%x\n",
+ ovs_cap->dev_ovs_cap.max_pctxs,
+ ovs_cap->dev_ovs_cap.fake_vf_start_id,
+ ovs_cap->dev_ovs_cap.fake_vf_num);
+ sdk_info(hwdev->dev_hdl,
+ "fake_vf_max_qpc: 0x%x, dynamic_qp_en: 0x%x\n",
+ ovs_cap->dev_ovs_cap.fake_vf_max_pctx,
+ ovs_cap->dev_ovs_cap.dynamic_qp_en);
+}
+
+static void parse_ppa_res_cap(struct hinic3_hwdev *hwdev,
+ struct service_cap *cap,
+ struct cfg_cmd_dev_cap *dev_cap,
+ enum func_type type)
+{
+ struct ppa_service_cap *dip_cap = &cap->ppa_cap;
+
+ dip_cap->qpc_fake_vf_ctx_num = dev_cap->fake_vf_max_pctx;
+ dip_cap->qpc_fake_vf_start = dev_cap->fake_vf_start_id;
+ dip_cap->qpc_fake_vf_num = dev_cap->fake_vf_num;
+ dip_cap->bloomfilter_en = dev_cap->fake_vf_bfilter_len ? 1 : 0;
+ dip_cap->bloomfilter_length = dev_cap->fake_vf_bfilter_len;
+ sdk_info(hwdev->dev_hdl,
+ "Get ppa resource capbility, fake_vf_start_id: 0x%x, fake_vf_num: 0x%x, fake_vf_max_qpc: 0x%x\n",
+ dip_cap->qpc_fake_vf_start,
+ dip_cap->qpc_fake_vf_num,
+ dip_cap->qpc_fake_vf_ctx_num);
+}
+
+static void parse_toe_res_cap(struct hinic3_hwdev *hwdev,
+ struct service_cap *cap,
+ struct cfg_cmd_dev_cap *dev_cap,
+ enum func_type type)
+{
+ struct dev_toe_svc_cap *toe_cap = &cap->toe_cap.dev_toe_cap;
+
+ toe_cap->max_pctxs = dev_cap->toe_max_pctx;
+ toe_cap->max_cqs = dev_cap->toe_max_cq;
+ toe_cap->max_srqs = dev_cap->toe_max_srq;
+ toe_cap->srq_id_start = dev_cap->toe_srq_id_start;
+ toe_cap->max_mpts = dev_cap->toe_max_mpt;
+ toe_cap->max_cctxt = dev_cap->toe_max_cctxt;
+
+ sdk_info(hwdev->dev_hdl,
+ "Get toe resource capbility, max_pctxs: 0x%x, max_cqs: 0x%x, max_srqs: 0x%x, srq_id_start: 0x%x, max_mpts: 0x%x\n",
+ toe_cap->max_pctxs, toe_cap->max_cqs, toe_cap->max_srqs,
+ toe_cap->srq_id_start, toe_cap->max_mpts);
+}
+
+static void parse_ipsec_res_cap(struct hinic3_hwdev *hwdev,
+ struct service_cap *cap,
+ struct cfg_cmd_dev_cap *dev_cap,
+ enum func_type type)
+{
+ struct ipsec_service_cap *ipsec_cap = &cap->ipsec_cap;
+
+ ipsec_cap->dev_ipsec_cap.max_sactxs = dev_cap->ipsec_max_sactx;
+ ipsec_cap->dev_ipsec_cap.max_cqs = dev_cap->ipsec_max_cq;
+
+ sdk_info(hwdev->dev_hdl, "Get IPsec resource capbility, max_sactxs: 0x%x, max_cqs: 0x%x\n",
+ dev_cap->ipsec_max_sactx, dev_cap->ipsec_max_cq);
+}
+
+static void parse_vbs_res_cap(struct hinic3_hwdev *hwdev,
+ struct service_cap *cap,
+ struct cfg_cmd_dev_cap *dev_cap,
+ enum func_type type)
+{
+ struct vbs_service_cap *vbs_cap = &cap->vbs_cap;
+
+ vbs_cap->vbs_max_volq = dev_cap->vbs_max_volq;
+
+ sdk_info(hwdev->dev_hdl, "Get VBS resource capbility, vbs_max_volq: 0x%x\n",
+ dev_cap->vbs_max_volq);
+}
+
+static void parse_dev_cap(struct hinic3_hwdev *dev,
+ struct cfg_cmd_dev_cap *dev_cap, enum func_type type)
+{
+ struct service_cap *cap = &dev->cfg_mgmt->svc_cap;
+
+ /* Public resource */
+ parse_pub_res_cap(dev, cap, dev_cap, type);
+
+ /* PPF managed dynamic resource */
+ if (type == TYPE_PPF)
+ parse_dynamic_share_res_cap(cap, dev_cap);
+
+ /* L2 NIC resource */
+ if (IS_NIC_TYPE(dev))
+ parse_l2nic_res_cap(dev, cap, dev_cap, type);
+
+	/* FC without virtualization */
+ if (type == TYPE_PF || type == TYPE_PPF) {
+ if (IS_FC_TYPE(dev))
+ parse_fc_res_cap(dev, cap, dev_cap, type);
+ }
+
+ /* toe resource */
+ if (IS_TOE_TYPE(dev))
+ parse_toe_res_cap(dev, cap, dev_cap, type);
+
+ /* mtt cache line */
+ if (IS_RDMA_ENABLE(dev))
+ parse_rdma_res_cap(dev, cap, dev_cap, type);
+
+ /* RoCE resource */
+ if (IS_ROCE_TYPE(dev))
+ parse_roce_res_cap(dev, cap, dev_cap, type);
+
+ if (IS_OVS_TYPE(dev))
+ parse_ovs_res_cap(dev, cap, dev_cap, type);
+
+ if (IS_IPSEC_TYPE(dev))
+ parse_ipsec_res_cap(dev, cap, dev_cap, type);
+
+ if (IS_PPA_TYPE(dev))
+ parse_ppa_res_cap(dev, cap, dev_cap, type);
+
+ if (IS_VBS_TYPE(dev))
+ parse_vbs_res_cap(dev, cap, dev_cap, type);
+}
+
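+/* Query the device capability from the management firmware and parse it */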
+static int get_cap_from_fw(struct hinic3_hwdev *dev, enum func_type type)
+{
+ struct cfg_cmd_dev_cap dev_cap;
+ u16 out_len = sizeof(dev_cap);
+ int err;
+
+ memset(&dev_cap, 0, sizeof(dev_cap));
+ dev_cap.func_id = hinic3_global_func_id(dev);
+ sdk_info(dev->dev_hdl, "Get cap from fw, func_idx: %u\n",
+ dev_cap.func_id);
+
+ err = hinic3_msg_to_mgmt_sync(dev, HINIC3_MOD_CFGM, CFG_CMD_GET_DEV_CAP,
+ &dev_cap, sizeof(dev_cap),
+ &dev_cap, &out_len, 0,
+ HINIC3_CHANNEL_COMM);
+ if (err || dev_cap.head.status || !out_len) {
+ sdk_err(dev->dev_hdl,
+ "Failed to get capability from FW, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, dev_cap.head.status, out_len);
+ return -EIO;
+ }
+
+ parse_dev_cap(dev, &dev_cap, type);
+
+ return 0;
+}
+
+static int hinic3_get_dev_cap(struct hinic3_hwdev *dev)
+{
+ enum func_type type = HINIC3_FUNC_TYPE(dev);
+ int err;
+
+ switch (type) {
+ case TYPE_PF:
+ case TYPE_PPF:
+ case TYPE_VF:
+ err = get_cap_from_fw(dev, type);
+ if (err) {
+ sdk_err(dev->dev_hdl, "Failed to get PF/PPF capability\n");
+ return err;
+ }
+ break;
+ default:
+ sdk_err(dev->dev_hdl, "Unsupported PCI Function type: %d\n",
+ type);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+int hinic3_get_ppf_timer_cfg(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+ struct cfg_cmd_host_timer cfg_host_timer;
+ struct service_cap *cap = &dev->cfg_mgmt->svc_cap;
+ u16 out_len = sizeof(cfg_host_timer);
+ int err;
+
+ memset(&cfg_host_timer, 0, sizeof(cfg_host_timer));
+ cfg_host_timer.host_id = dev->cfg_mgmt->svc_cap.host_id;
+
+ err = hinic3_msg_to_mgmt_sync(dev, HINIC3_MOD_CFGM, CFG_CMD_GET_HOST_TIMER,
+ &cfg_host_timer, sizeof(cfg_host_timer),
+ &cfg_host_timer, &out_len, 0,
+ HINIC3_CHANNEL_COMM);
+ if (err || cfg_host_timer.head.status || !out_len) {
+ sdk_err(dev->dev_hdl,
+ "Failed to get host timer cfg from FW, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, cfg_host_timer.head.status, out_len);
+ return -EIO;
+ }
+
+ cap->timer_pf_id_start = cfg_host_timer.timer_pf_id_start;
+ cap->timer_pf_num = cfg_host_timer.timer_pf_num;
+ cap->timer_vf_id_start = cfg_host_timer.timer_vf_id_start;
+ cap->timer_vf_num = cfg_host_timer.timer_vf_num;
+
+ return 0;
+}
+
+static void nic_param_fix(struct hinic3_hwdev *dev)
+{
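+	/* no NIC-specific parameter fixups are needed currently */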
+}
+
+static void rdma_mtt_fix(struct hinic3_hwdev *dev)
+{
+ struct service_cap *cap = &dev->cfg_mgmt->svc_cap;
+ struct rdma_service_cap *rdma_cap = &cap->rdma_cap;
+
+ rdma_cap->log_mtt = LOG_MTT_SEG;
+ rdma_cap->log_mtt_seg = LOG_MTT_SEG;
+ rdma_cap->mtt_entry_sz = MTT_ENTRY_SZ;
+ rdma_cap->mpt_entry_sz = RDMA_MPT_ENTRY_SZ;
+ rdma_cap->num_mtts = RDMA_NUM_MTTS;
+}
+
+static void rdma_param_fix(struct hinic3_hwdev *dev)
+{
+ struct service_cap *cap = &dev->cfg_mgmt->svc_cap;
+ struct rdma_service_cap *rdma_cap = &cap->rdma_cap;
+ struct dev_roce_svc_own_cap *roce_cap =
+ &rdma_cap->dev_rdma_cap.roce_own_cap;
+
+ rdma_cap->log_mtt = LOG_MTT_SEG;
+ rdma_cap->log_rdmarc = LOG_RDMARC_SEG;
+ rdma_cap->reserved_qps = RDMA_RSVD_QPS;
+ rdma_cap->max_sq_sg = RDMA_MAX_SQ_SGE;
+
+ /* RoCE */
+ if (IS_ROCE_TYPE(dev)) {
+ roce_cap->qpc_entry_sz = ROCE_QPC_ENTRY_SZ;
+ roce_cap->max_wqes = ROCE_MAX_WQES;
+ roce_cap->max_rq_sg = ROCE_MAX_RQ_SGE;
+ roce_cap->max_sq_inline_data_sz = ROCE_MAX_SQ_INLINE_DATA_SZ;
+ roce_cap->max_rq_desc_sz = ROCE_MAX_RQ_DESC_SZ;
+ roce_cap->rdmarc_entry_sz = ROCE_RDMARC_ENTRY_SZ;
+ roce_cap->max_qp_init_rdma = ROCE_MAX_QP_INIT_RDMA;
+ roce_cap->max_qp_dest_rdma = ROCE_MAX_QP_DEST_RDMA;
+ roce_cap->max_srq_wqes = ROCE_MAX_SRQ_WQES;
+ roce_cap->reserved_srqs = ROCE_RSVD_SRQS;
+ roce_cap->max_srq_sge = ROCE_MAX_SRQ_SGE;
+ roce_cap->srqc_entry_sz = ROCE_SRQC_ENTERY_SZ;
+ roce_cap->max_msg_sz = ROCE_MAX_MSG_SZ;
+ }
+
+ rdma_cap->max_sq_desc_sz = RDMA_MAX_SQ_DESC_SZ;
+ rdma_cap->wqebb_size = WQEBB_SZ;
+ rdma_cap->max_cqes = RDMA_MAX_CQES;
+ rdma_cap->reserved_cqs = RDMA_RSVD_CQS;
+ rdma_cap->cqc_entry_sz = RDMA_CQC_ENTRY_SZ;
+ rdma_cap->cqe_size = RDMA_CQE_SZ;
+ rdma_cap->reserved_mrws = RDMA_RSVD_MRWS;
+ rdma_cap->mpt_entry_sz = RDMA_MPT_ENTRY_SZ;
+
+	/* max_fmr_maps = 2^8 - 1
+	 * +--------+----------+----------+
+	 * |   4B   | 1M (20b) | Key (8b) |
+	 * +--------+----------+----------+
+	 * A key is an 8-bit key plus a 24-bit index. The Lkey of an SGE
+	 * now uses 2 bits (bit31 and bit30), so the key only has 10 bits;
+	 * we use the original 8 bits directly for simplicity.
+	 */
+ rdma_cap->max_fmr_maps = 0xff;
+ rdma_cap->num_mtts = RDMA_NUM_MTTS;
+ rdma_cap->log_mtt_seg = LOG_MTT_SEG;
+ rdma_cap->mtt_entry_sz = MTT_ENTRY_SZ;
+ rdma_cap->log_rdmarc_seg = LOG_RDMARC_SEG;
+ rdma_cap->local_ca_ack_delay = LOCAL_ACK_DELAY;
+ rdma_cap->num_ports = RDMA_NUM_PORTS;
+ rdma_cap->db_page_size = DB_PAGE_SZ;
+ rdma_cap->direct_wqe_size = DWQE_SZ;
+ rdma_cap->num_pds = NUM_PD;
+ rdma_cap->reserved_pds = RSVD_PD;
+ rdma_cap->max_xrcds = MAX_XRCDS;
+ rdma_cap->reserved_xrcds = RSVD_XRCDS;
+ rdma_cap->max_gid_per_port = MAX_GID_PER_PORT;
+ rdma_cap->gid_entry_sz = GID_ENTRY_SZ;
+ rdma_cap->reserved_lkey = RSVD_LKEY;
+ rdma_cap->num_comp_vectors = (u32)dev->cfg_mgmt->eq_info.num_ceq;
+ rdma_cap->page_size_cap = PAGE_SZ_CAP;
+ rdma_cap->flags = (RDMA_BMME_FLAG_LOCAL_INV |
+ RDMA_BMME_FLAG_REMOTE_INV |
+ RDMA_BMME_FLAG_FAST_REG_WR |
+ RDMA_DEV_CAP_FLAG_XRC |
+ RDMA_DEV_CAP_FLAG_MEM_WINDOW |
+ RDMA_BMME_FLAG_TYPE_2_WIN |
+ RDMA_BMME_FLAG_WIN_TYPE_2B |
+ RDMA_DEV_CAP_FLAG_ATOMIC);
+ rdma_cap->max_frpl_len = MAX_FRPL_LEN;
+ rdma_cap->max_pkeys = MAX_PKEYS;
+}
+
+static void toe_param_fix(struct hinic3_hwdev *dev)
+{
+ struct service_cap *cap = &dev->cfg_mgmt->svc_cap;
+ struct toe_service_cap *toe_cap = &cap->toe_cap;
+
+ toe_cap->pctx_sz = TOE_PCTX_SZ;
+ toe_cap->scqc_sz = TOE_CQC_SZ;
+}
+
+static void ovs_param_fix(struct hinic3_hwdev *dev)
+{
+ struct service_cap *cap = &dev->cfg_mgmt->svc_cap;
+ struct ovs_service_cap *ovs_cap = &cap->ovs_cap;
+
+ ovs_cap->pctx_sz = OVS_PCTX_SZ;
+}
+
+static void ppa_param_fix(struct hinic3_hwdev *dev)
+{
+ struct service_cap *cap = &dev->cfg_mgmt->svc_cap;
+ struct ppa_service_cap *ppa_cap = &cap->ppa_cap;
+
+ ppa_cap->pctx_sz = PPA_PCTX_SZ;
+}
+
+static void fc_param_fix(struct hinic3_hwdev *dev)
+{
+ struct service_cap *cap = &dev->cfg_mgmt->svc_cap;
+ struct fc_service_cap *fc_cap = &cap->fc_cap;
+
+ fc_cap->parent_qpc_size = FC_PCTX_SZ;
+ fc_cap->child_qpc_size = FC_CCTX_SZ;
+ fc_cap->sqe_size = FC_SQE_SZ;
+
+ fc_cap->scqc_size = FC_SCQC_SZ;
+ fc_cap->scqe_size = FC_SCQE_SZ;
+
+ fc_cap->srqc_size = FC_SRQC_SZ;
+ fc_cap->srqe_size = FC_SRQE_SZ;
+}
+
+static void ipsec_param_fix(struct hinic3_hwdev *dev)
+{
+ struct service_cap *cap = &dev->cfg_mgmt->svc_cap;
+ struct ipsec_service_cap *ipsec_cap = &cap->ipsec_cap;
+
+ ipsec_cap->sactx_sz = IPSEC_SACTX_SZ;
+}
+
+static void init_service_param(struct hinic3_hwdev *dev)
+{
+ if (IS_NIC_TYPE(dev))
+ nic_param_fix(dev);
+ if (IS_RDMA_ENABLE(dev))
+ rdma_mtt_fix(dev);
+ if (IS_ROCE_TYPE(dev))
+ rdma_param_fix(dev);
+ if (IS_FC_TYPE(dev))
+ fc_param_fix(dev);
+ if (IS_TOE_TYPE(dev))
+ toe_param_fix(dev);
+ if (IS_OVS_TYPE(dev))
+ ovs_param_fix(dev);
+ if (IS_IPSEC_TYPE(dev))
+ ipsec_param_fix(dev);
+ if (IS_PPA_TYPE(dev))
+ ppa_param_fix(dev);
+}
+
+static void cfg_get_eq_num(struct hinic3_hwdev *dev)
+{
+ struct cfg_eq_info *eq_info = &dev->cfg_mgmt->eq_info;
+
+ eq_info->num_ceq = dev->hwif->attr.num_ceqs;
+ eq_info->num_ceq_remain = eq_info->num_ceq;
+}
+
+static int cfg_init_eq(struct hinic3_hwdev *dev)
+{
+ struct cfg_mgmt_info *cfg_mgmt = dev->cfg_mgmt;
+ struct cfg_eq *eq = NULL;
+ u8 num_ceq, i = 0;
+
+ cfg_get_eq_num(dev);
+ num_ceq = cfg_mgmt->eq_info.num_ceq;
+
+ sdk_info(dev->dev_hdl, "Cfg mgmt: ceqs=0x%x, remain=0x%x\n",
+ cfg_mgmt->eq_info.num_ceq, cfg_mgmt->eq_info.num_ceq_remain);
+
+ if (!num_ceq) {
+ sdk_err(dev->dev_hdl, "Ceq num cfg in fw is zero\n");
+ return -EFAULT;
+ }
+
+ eq = kcalloc(num_ceq, sizeof(*eq), GFP_KERNEL);
+ if (!eq)
+ return -ENOMEM;
+
+ for (i = 0; i < num_ceq; ++i) {
+ eq[i].eqn = i;
+ eq[i].free = CFG_FREE;
+ eq[i].type = SERVICE_T_MAX;
+ }
+
+ cfg_mgmt->eq_info.eq = eq;
+
+ mutex_init(&cfg_mgmt->eq_info.eq_mutex);
+
+ return 0;
+}
+
+int hinic3_vector_to_eqn(void *hwdev, enum hinic3_service_type type, int vector)
+{
+ struct hinic3_hwdev *dev = hwdev;
+ struct cfg_mgmt_info *cfg_mgmt = NULL;
+ struct cfg_eq *eq = NULL;
+ int eqn = -EINVAL;
+ int vector_num = vector;
+
+ if (!hwdev || vector < 0)
+ return -EINVAL;
+
+ if (type != SERVICE_T_ROCE) {
+ sdk_err(dev->dev_hdl,
+ "Service type :%d, only RDMA service could get eqn by vector.\n",
+ type);
+ return -EINVAL;
+ }
+
+ cfg_mgmt = dev->cfg_mgmt;
+ vector_num = (vector_num % cfg_mgmt->eq_info.num_ceq) + CFG_RDMA_CEQ_BASE;
+
+ eq = cfg_mgmt->eq_info.eq;
+ if (eq[vector_num].type == SERVICE_T_ROCE && eq[vector_num].free == CFG_BUSY)
+ eqn = eq[vector_num].eqn;
+
+ return eqn;
+}
+EXPORT_SYMBOL(hinic3_vector_to_eqn);
+
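+/* Allocate the irq bookkeeping table and select the interrupt type */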
+static int cfg_init_interrupt(struct hinic3_hwdev *dev)
+{
+ struct cfg_mgmt_info *cfg_mgmt = dev->cfg_mgmt;
+ struct cfg_irq_info *irq_info = &cfg_mgmt->irq_param_info;
+ u16 intr_num = dev->hwif->attr.num_irqs;
+ u16 intr_needed = dev->hwif->attr.msix_flex_en ? (dev->hwif->attr.num_aeqs +
+ dev->hwif->attr.num_ceqs + dev->hwif->attr.num_sq) : intr_num;
+
+ if (!intr_num) {
+ sdk_err(dev->dev_hdl, "Irq num cfg in fw is zero, msix_flex_en %d\n",
+ dev->hwif->attr.msix_flex_en);
+ return -EFAULT;
+ }
+
+ if (intr_needed > intr_num) {
+ sdk_warn(dev->dev_hdl, "Irq num cfg(%d) is less than the needed irq num(%d) msix_flex_en %d\n",
+ intr_num, intr_needed, dev->hwif->attr.msix_flex_en);
+ intr_needed = intr_num;
+ }
+
+ irq_info->alloc_info = kcalloc(intr_num, sizeof(*irq_info->alloc_info),
+ GFP_KERNEL);
+ if (!irq_info->alloc_info)
+ return -ENOMEM;
+
+ irq_info->num_irq_hw = intr_needed;
+	/* Production requires that the VF only supports MSI-X */
+ if (HINIC3_FUNC_TYPE(dev) == TYPE_VF)
+ cfg_mgmt->svc_cap.interrupt_type = INTR_TYPE_MSIX;
+ else
+ cfg_mgmt->svc_cap.interrupt_type = 0;
+
+ mutex_init(&irq_info->irq_mutex);
+ return 0;
+}
+
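+/* Enable MSI-X and record each allocated vector in the cfg mgmt table */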
+static int cfg_enable_interrupt(struct hinic3_hwdev *dev)
+{
+ struct cfg_mgmt_info *cfg_mgmt = dev->cfg_mgmt;
+ u16 nreq = cfg_mgmt->irq_param_info.num_irq_hw;
+
+ void *pcidev = dev->pcidev_hdl;
+ struct irq_alloc_info_st *irq_info = NULL;
+ struct msix_entry *entry = NULL;
+ u16 i = 0;
+ int actual_irq;
+
+ irq_info = cfg_mgmt->irq_param_info.alloc_info;
+
+ sdk_info(dev->dev_hdl, "Interrupt type: %u, irq num: %u.\n",
+ cfg_mgmt->svc_cap.interrupt_type, nreq);
+
+ switch (cfg_mgmt->svc_cap.interrupt_type) {
+ case INTR_TYPE_MSIX:
+ if (!nreq) {
+ sdk_err(dev->dev_hdl, "Interrupt number cannot be zero\n");
+ return -EINVAL;
+ }
+ entry = kcalloc(nreq, sizeof(*entry), GFP_KERNEL);
+ if (!entry)
+ return -ENOMEM;
+
+ for (i = 0; i < nreq; i++)
+ entry[i].entry = i;
+
+ actual_irq = pci_enable_msix_range(pcidev, entry,
+ VECTOR_THRESHOLD, nreq);
+ if (actual_irq < 0) {
+ sdk_err(dev->dev_hdl, "Alloc msix entries with threshold 2 failed. actual_irq: %d\n",
+ actual_irq);
+ kfree(entry);
+ return -ENOMEM;
+ }
+
+ nreq = (u16)actual_irq;
+ cfg_mgmt->irq_param_info.num_total = nreq;
+ cfg_mgmt->irq_param_info.num_irq_remain = nreq;
+ sdk_info(dev->dev_hdl, "Request %u msix vector success.\n",
+ nreq);
+
+ for (i = 0; i < nreq; ++i) {
+			/* u16 entry index that the driver specified */
+			irq_info[i].info.msix_entry_idx = entry[i].entry;
+			/* u32 irq vector that the kernel allocated */
+			irq_info[i].info.irq_id = entry[i].vector;
+ irq_info[i].type = SERVICE_T_MAX;
+ irq_info[i].free = CFG_FREE;
+ }
+
+ kfree(entry);
+
+ break;
+
+ default:
+ sdk_err(dev->dev_hdl, "Unsupport interrupt type %d\n",
+ cfg_mgmt->svc_cap.interrupt_type);
+ break;
+ }
+
+ return 0;
+}
+
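+/**
+ * hinic3_alloc_irqs - allocate up to num free irqs for a service
+ * @hwdev: the pointer to hw device
+ * @type: service type that will own the irqs
+ * @num: number of irqs requested
+ * @irq_info_array: array to fill with the allocated irq info
+ * @act_num: number of irqs actually allocated
+ * Return: 0 - Success, Negative - failure
+ **/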
+int hinic3_alloc_irqs(void *hwdev, enum hinic3_service_type type, u16 num,
+ struct irq_info *irq_info_array, u16 *act_num)
+{
+ struct hinic3_hwdev *dev = hwdev;
+ struct cfg_mgmt_info *cfg_mgmt = NULL;
+ struct cfg_irq_info *irq_info = NULL;
+ struct irq_alloc_info_st *alloc_info = NULL;
+ int max_num_irq;
+ u16 free_num_irq;
+ int i, j;
+ u16 num_new = num;
+
+ if (!hwdev || !irq_info_array || !act_num)
+ return -EINVAL;
+
+ cfg_mgmt = dev->cfg_mgmt;
+ irq_info = &cfg_mgmt->irq_param_info;
+ alloc_info = irq_info->alloc_info;
+ max_num_irq = irq_info->num_total;
+ free_num_irq = irq_info->num_irq_remain;
+
+ mutex_lock(&irq_info->irq_mutex);
+
+ if (num > free_num_irq) {
+ if (free_num_irq == 0) {
+ sdk_err(dev->dev_hdl, "no free irq resource in cfg mgmt.\n");
+ mutex_unlock(&irq_info->irq_mutex);
+ return -ENOMEM;
+ }
+
+ sdk_warn(dev->dev_hdl, "only %u irq resource in cfg mgmt.\n", free_num_irq);
+ num_new = free_num_irq;
+ }
+
+ *act_num = 0;
+
+ for (i = 0; i < num_new; i++) {
+ for (j = 0; j < max_num_irq; j++) {
+ if (alloc_info[j].free == CFG_FREE) {
+ if (irq_info->num_irq_remain == 0) {
+ sdk_err(dev->dev_hdl, "No free irq resource in cfg mgmt\n");
+ mutex_unlock(&irq_info->irq_mutex);
+ return -EINVAL;
+ }
+ alloc_info[j].type = type;
+ alloc_info[j].free = CFG_BUSY;
+
+ irq_info_array[i].msix_entry_idx =
+ alloc_info[j].info.msix_entry_idx;
+ irq_info_array[i].irq_id = alloc_info[j].info.irq_id;
+ (*act_num)++;
+ irq_info->num_irq_remain--;
+
+ break;
+ }
+ }
+ }
+
+ mutex_unlock(&irq_info->irq_mutex);
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_alloc_irqs);
+
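+/**
+ * hinic3_free_irq - return an allocated irq to the cfg mgmt free pool
+ * @hwdev: the pointer to hw device
+ * @type: service type that owns the irq
+ * @irq_id: id of the irq to free
+ **/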
+void hinic3_free_irq(void *hwdev, enum hinic3_service_type type, u32 irq_id)
+{
+ struct hinic3_hwdev *dev = hwdev;
+ struct cfg_mgmt_info *cfg_mgmt = NULL;
+ struct cfg_irq_info *irq_info = NULL;
+ struct irq_alloc_info_st *alloc_info = NULL;
+ int max_num_irq;
+ int i;
+
+ if (!hwdev)
+ return;
+
+ cfg_mgmt = dev->cfg_mgmt;
+ irq_info = &cfg_mgmt->irq_param_info;
+ alloc_info = irq_info->alloc_info;
+ max_num_irq = irq_info->num_total;
+
+ mutex_lock(&irq_info->irq_mutex);
+
+ for (i = 0; i < max_num_irq; i++) {
+ if (irq_id == alloc_info[i].info.irq_id &&
+ type == alloc_info[i].type) {
+ if (alloc_info[i].free == CFG_BUSY) {
+ alloc_info[i].free = CFG_FREE;
+ irq_info->num_irq_remain++;
+ if (irq_info->num_irq_remain > max_num_irq) {
+ sdk_err(dev->dev_hdl, "Find target,but over range\n");
+ mutex_unlock(&irq_info->irq_mutex);
+ return;
+ }
+ break;
+ }
+ }
+ }
+
+ if (i >= max_num_irq)
+ sdk_warn(dev->dev_hdl, "Irq %u don`t need to free\n", irq_id);
+
+ mutex_unlock(&irq_info->irq_mutex);
+}
+EXPORT_SYMBOL(hinic3_free_irq);
+
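+/**
+ * hinic3_alloc_ceqs - allocate up to num free ceqs for a service
+ * @hwdev: the pointer to hw device
+ * @type: service type that will own the ceqs
+ * @num: number of ceqs requested
+ * @ceq_id_array: array to fill with the allocated ceq ids
+ * @act_num: number of ceqs actually allocated
+ * Return: 0 - Success, Negative - failure
+ **/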
+int hinic3_alloc_ceqs(void *hwdev, enum hinic3_service_type type, int num,
+ int *ceq_id_array, int *act_num)
+{
+ struct hinic3_hwdev *dev = hwdev;
+ struct cfg_mgmt_info *cfg_mgmt = NULL;
+ struct cfg_eq_info *eq = NULL;
+ int free_ceq;
+ int i, j;
+ int num_new = num;
+
+ if (!hwdev || !ceq_id_array || !act_num)
+ return -EINVAL;
+
+ cfg_mgmt = dev->cfg_mgmt;
+ eq = &cfg_mgmt->eq_info;
+ free_ceq = eq->num_ceq_remain;
+
+ mutex_lock(&eq->eq_mutex);
+
+ if (num > free_ceq) {
+ if (free_ceq <= 0) {
+ sdk_err(dev->dev_hdl, "No free ceq resource in cfg mgmt\n");
+ mutex_unlock(&eq->eq_mutex);
+ return -ENOMEM;
+ }
+
+ sdk_warn(dev->dev_hdl, "Only %d ceq resource in cfg mgmt\n",
+ free_ceq);
+ }
+
+ *act_num = 0;
+
+ num_new = min(num_new, eq->num_ceq - CFG_RDMA_CEQ_BASE);
+ for (i = 0; i < num_new; i++) {
+ if (eq->num_ceq_remain == 0) {
+ sdk_warn(dev->dev_hdl, "Alloc %d ceqs, less than required %d ceqs\n",
+ *act_num, num_new);
+ mutex_unlock(&eq->eq_mutex);
+ return 0;
+ }
+
+ for (j = CFG_RDMA_CEQ_BASE; j < eq->num_ceq; j++) {
+ if (eq->eq[j].free == CFG_FREE) {
+ eq->eq[j].type = type;
+ eq->eq[j].free = CFG_BUSY;
+ eq->num_ceq_remain--;
+ ceq_id_array[i] = eq->eq[j].eqn;
+ (*act_num)++;
+ break;
+ }
+ }
+ }
+
+ mutex_unlock(&eq->eq_mutex);
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_alloc_ceqs);
+
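+/**
+ * hinic3_free_ceq - return an allocated ceq to the cfg mgmt free pool
+ * @hwdev: the pointer to hw device
+ * @type: service type that owns the ceq
+ * @ceq_id: id of the ceq to free
+ **/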
+void hinic3_free_ceq(void *hwdev, enum hinic3_service_type type, int ceq_id)
+{
+ struct hinic3_hwdev *dev = hwdev;
+ struct cfg_mgmt_info *cfg_mgmt = NULL;
+ struct cfg_eq_info *eq = NULL;
+ u8 num_ceq;
+ u8 i = 0;
+
+ if (!hwdev)
+ return;
+
+ cfg_mgmt = dev->cfg_mgmt;
+ eq = &cfg_mgmt->eq_info;
+ num_ceq = eq->num_ceq;
+
+ mutex_lock(&eq->eq_mutex);
+
+ for (i = 0; i < num_ceq; i++) {
+ if (ceq_id == eq->eq[i].eqn &&
+ type == cfg_mgmt->eq_info.eq[i].type) {
+ if (eq->eq[i].free == CFG_BUSY) {
+ eq->eq[i].free = CFG_FREE;
+ eq->num_ceq_remain++;
+ if (eq->num_ceq_remain > num_ceq)
+ eq->num_ceq_remain %= num_ceq;
+
+ mutex_unlock(&eq->eq_mutex);
+ return;
+ }
+ }
+ }
+
+ if (i >= num_ceq)
+ sdk_warn(dev->dev_hdl, "ceq %d don`t need to free.\n", ceq_id);
+
+ mutex_unlock(&eq->eq_mutex);
+}
+EXPORT_SYMBOL(hinic3_free_ceq);
+
+int init_cfg_mgmt(struct hinic3_hwdev *dev)
+{
+ int err;
+ struct cfg_mgmt_info *cfg_mgmt;
+
+ cfg_mgmt = kzalloc(sizeof(*cfg_mgmt), GFP_KERNEL);
+ if (!cfg_mgmt)
+ return -ENOMEM;
+
+ dev->cfg_mgmt = cfg_mgmt;
+ cfg_mgmt->hwdev = dev;
+
+ err = cfg_init_eq(dev);
+ if (err) {
+ sdk_err(dev->dev_hdl, "Failed to init cfg event queue, err: %d\n",
+ err);
+ goto free_mgmt_mem;
+ }
+
+ err = cfg_init_interrupt(dev);
+ if (err) {
+ sdk_err(dev->dev_hdl, "Failed to init cfg interrupt, err: %d\n",
+ err);
+ goto free_eq_mem;
+ }
+
+ err = cfg_enable_interrupt(dev);
+ if (err) {
+ sdk_err(dev->dev_hdl, "Failed to enable cfg interrupt, err: %d\n",
+ err);
+ goto free_interrupt_mem;
+ }
+
+ return 0;
+
+free_interrupt_mem:
+ kfree(cfg_mgmt->irq_param_info.alloc_info);
+ mutex_deinit(&((cfg_mgmt->irq_param_info).irq_mutex));
+ cfg_mgmt->irq_param_info.alloc_info = NULL;
+
+free_eq_mem:
+ kfree(cfg_mgmt->eq_info.eq);
+ mutex_deinit(&cfg_mgmt->eq_info.eq_mutex);
+ cfg_mgmt->eq_info.eq = NULL;
+
+free_mgmt_mem:
+ kfree(cfg_mgmt);
+ return err;
+}
+
+void free_cfg_mgmt(struct hinic3_hwdev *dev)
+{
+ struct cfg_mgmt_info *cfg_mgmt = dev->cfg_mgmt;
+
+	/* check whether all allocated resources have been reclaimed */
+ if (cfg_mgmt->irq_param_info.num_irq_remain !=
+ cfg_mgmt->irq_param_info.num_total ||
+ cfg_mgmt->eq_info.num_ceq_remain != cfg_mgmt->eq_info.num_ceq)
+ sdk_err(dev->dev_hdl, "Can't reclaim all irq and event queue, please check\n");
+
+ switch (cfg_mgmt->svc_cap.interrupt_type) {
+ case INTR_TYPE_MSIX:
+ pci_disable_msix(dev->pcidev_hdl);
+ break;
+
+ case INTR_TYPE_MSI:
+ pci_disable_msi(dev->pcidev_hdl);
+ break;
+
+ case INTR_TYPE_INT:
+ default:
+ break;
+ }
+
+ kfree(cfg_mgmt->irq_param_info.alloc_info);
+ cfg_mgmt->irq_param_info.alloc_info = NULL;
+ mutex_deinit(&((cfg_mgmt->irq_param_info).irq_mutex));
+
+ kfree(cfg_mgmt->eq_info.eq);
+ cfg_mgmt->eq_info.eq = NULL;
+ mutex_deinit(&cfg_mgmt->eq_info.eq_mutex);
+
+ kfree(cfg_mgmt);
+}
+
+int init_capability(struct hinic3_hwdev *dev)
+{
+ int err;
+ struct cfg_mgmt_info *cfg_mgmt = dev->cfg_mgmt;
+
+ cfg_mgmt->svc_cap.sf_svc_attr.ft_pf_en = false;
+ cfg_mgmt->svc_cap.sf_svc_attr.rdma_pf_en = false;
+
+ err = hinic3_get_dev_cap(dev);
+ if (err != 0)
+ return err;
+
+ init_service_param(dev);
+
+ sdk_info(dev->dev_hdl, "Init capability success\n");
+ return 0;
+}
+
+void free_capability(struct hinic3_hwdev *dev)
+{
+	sdk_info(dev->dev_hdl, "Free capability success\n");
+}
+
+bool hinic3_support_nic(void *hwdev, struct nic_service_cap *cap)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev)
+ return false;
+
+ if (!IS_NIC_TYPE(dev))
+ return false;
+
+ if (cap)
+ memcpy(cap, &dev->cfg_mgmt->svc_cap.nic_cap, sizeof(*cap));
+
+ return true;
+}
+EXPORT_SYMBOL(hinic3_support_nic);
+
+bool hinic3_support_ppa(void *hwdev, struct ppa_service_cap *cap)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev)
+ return false;
+
+ if (!IS_PPA_TYPE(dev))
+ return false;
+
+ if (cap)
+ memcpy(cap, &dev->cfg_mgmt->svc_cap.ppa_cap, sizeof(*cap));
+
+ return true;
+}
+EXPORT_SYMBOL(hinic3_support_ppa);
+
+bool hinic3_support_migr(void *hwdev, struct migr_service_cap *cap)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev)
+ return false;
+
+ if (!IS_MIGR_TYPE(dev))
+ return false;
+
+ if (cap)
+ cap->master_host_id = dev->cfg_mgmt->svc_cap.master_host_id;
+
+ return true;
+}
+EXPORT_SYMBOL(hinic3_support_migr);
+
+bool hinic3_support_ipsec(void *hwdev, struct ipsec_service_cap *cap)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev)
+ return false;
+
+ if (!IS_IPSEC_TYPE(dev))
+ return false;
+
+ if (cap)
+ memcpy(cap, &dev->cfg_mgmt->svc_cap.ipsec_cap, sizeof(*cap));
+
+ return true;
+}
+EXPORT_SYMBOL(hinic3_support_ipsec);
+
+bool hinic3_support_roce(void *hwdev, struct rdma_service_cap *cap)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev)
+ return false;
+
+ if (!IS_ROCE_TYPE(dev))
+ return false;
+
+ if (cap)
+ memcpy(cap, &dev->cfg_mgmt->svc_cap.rdma_cap, sizeof(*cap));
+
+ return true;
+}
+EXPORT_SYMBOL(hinic3_support_roce);
+
+bool hinic3_support_fc(void *hwdev, struct fc_service_cap *cap)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev)
+ return false;
+
+ if (!IS_FC_TYPE(dev))
+ return false;
+
+ if (cap)
+ memcpy(cap, &dev->cfg_mgmt->svc_cap.fc_cap, sizeof(*cap));
+
+ return true;
+}
+EXPORT_SYMBOL(hinic3_support_fc);
+
+bool hinic3_support_rdma(void *hwdev, struct rdma_service_cap *cap)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev)
+ return false;
+
+ if (!IS_RDMA_TYPE(dev))
+ return false;
+
+ if (cap)
+ memcpy(cap, &dev->cfg_mgmt->svc_cap.rdma_cap, sizeof(*cap));
+
+ return true;
+}
+EXPORT_SYMBOL(hinic3_support_rdma);
+
+bool hinic3_support_ovs(void *hwdev, struct ovs_service_cap *cap)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev)
+ return false;
+
+ if (!IS_OVS_TYPE(dev))
+ return false;
+
+ if (cap)
+ memcpy(cap, &dev->cfg_mgmt->svc_cap.ovs_cap, sizeof(*cap));
+
+ return true;
+}
+EXPORT_SYMBOL(hinic3_support_ovs);
+
+bool hinic3_support_vbs(void *hwdev, struct vbs_service_cap *cap)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev)
+ return false;
+
+ if (!IS_VBS_TYPE(dev))
+ return false;
+
+ if (cap)
+ memcpy(cap, &dev->cfg_mgmt->svc_cap.vbs_cap, sizeof(*cap));
+
+ return true;
+}
+EXPORT_SYMBOL(hinic3_support_vbs);
+
+/* Only the PPF supports ToE; ordinary PFs do not */
+bool hinic3_support_toe(void *hwdev, struct toe_service_cap *cap)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev)
+ return false;
+
+ if (!IS_TOE_TYPE(dev))
+ return false;
+
+ if (cap)
+ memcpy(cap, &dev->cfg_mgmt->svc_cap.toe_cap, sizeof(*cap));
+
+ return true;
+}
+EXPORT_SYMBOL(hinic3_support_toe);
+
+bool hinic3_func_for_mgmt(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev)
+ return false;
+
+ if (dev->cfg_mgmt->svc_cap.chip_svc_type)
+ return false;
+ else
+ return true;
+}
+
+bool hinic3_get_stateful_enable(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev)
+ return false;
+
+ return dev->cfg_mgmt->svc_cap.sf_en;
+}
+EXPORT_SYMBOL(hinic3_get_stateful_enable);
+
+u8 hinic3_host_oq_id_mask(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!dev) {
+ pr_err("Hwdev pointer is NULL for getting host oq id mask\n");
+ return 0;
+ }
+ return dev->cfg_mgmt->svc_cap.host_oq_id_mask_val;
+}
+EXPORT_SYMBOL(hinic3_host_oq_id_mask);
+
+u8 hinic3_host_id(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!dev) {
+ pr_err("Hwdev pointer is NULL for getting host id\n");
+ return 0;
+ }
+ return dev->cfg_mgmt->svc_cap.host_id;
+}
+EXPORT_SYMBOL(hinic3_host_id);
+
+u16 hinic3_host_total_func(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!dev) {
+ pr_err("Hwdev pointer is NULL for getting host total function number\n");
+ return 0;
+ }
+ return dev->cfg_mgmt->svc_cap.host_total_function;
+}
+EXPORT_SYMBOL(hinic3_host_total_func);
+
+u16 hinic3_func_max_qnum(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!dev) {
+ pr_err("Hwdev pointer is NULL for getting function max queue number\n");
+ return 0;
+ }
+ return dev->cfg_mgmt->svc_cap.nic_cap.max_sqs;
+}
+EXPORT_SYMBOL(hinic3_func_max_qnum);
+
+u16 hinic3_func_max_nic_qnum(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!dev) {
+ pr_err("Hwdev pointer is NULL for getting function max queue number\n");
+ return 0;
+ }
+ return dev->cfg_mgmt->svc_cap.nic_cap.max_sqs;
+}
+EXPORT_SYMBOL(hinic3_func_max_nic_qnum);
+
+u8 hinic3_ep_id(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!dev) {
+ pr_err("Hwdev pointer is NULL for getting ep id\n");
+ return 0;
+ }
+ return dev->cfg_mgmt->svc_cap.ep_id;
+}
+EXPORT_SYMBOL(hinic3_ep_id);
+
+u8 hinic3_er_id(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!dev) {
+ pr_err("Hwdev pointer is NULL for getting er id\n");
+ return 0;
+ }
+ return dev->cfg_mgmt->svc_cap.er_id;
+}
+EXPORT_SYMBOL(hinic3_er_id);
+
+u8 hinic3_physical_port_id(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!dev) {
+ pr_err("Hwdev pointer is NULL for getting physical port id\n");
+ return 0;
+ }
+ return dev->cfg_mgmt->svc_cap.port_id;
+}
+EXPORT_SYMBOL(hinic3_physical_port_id);
+
+u16 hinic3_func_max_vf(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!dev) {
+ pr_err("Hwdev pointer is NULL for getting max vf number\n");
+ return 0;
+ }
+ return dev->cfg_mgmt->svc_cap.max_vf;
+}
+EXPORT_SYMBOL(hinic3_func_max_vf);
+
+int hinic3_cos_valid_bitmap(void *hwdev, u8 *func_dft_cos, u8 *port_cos_bitmap)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!dev) {
+ pr_err("Hwdev pointer is NULL for getting cos valid bitmap\n");
+ return 1;
+ }
+ *func_dft_cos = dev->cfg_mgmt->svc_cap.cos_valid_bitmap;
+ *port_cos_bitmap = dev->cfg_mgmt->svc_cap.port_cos_valid_bitmap;
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_cos_valid_bitmap);
+
+void hinic3_shutdown_hwdev(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev)
+ return;
+
+ if (IS_SLAVE_HOST(dev))
+ set_slave_host_enable(hwdev, hinic3_pcie_itf_id(hwdev), false);
+}
+
+u32 hinic3_host_pf_num(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!dev) {
+ pr_err("Hwdev pointer is NULL for getting pf number capability\n");
+ return 0;
+ }
+
+ return dev->cfg_mgmt->svc_cap.pf_num;
+}
+EXPORT_SYMBOL(hinic3_host_pf_num);
+
+u32 hinic3_host_pf_id_start(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!dev) {
+ pr_err("Hwdev pointer is NULL for getting pf id start capability\n");
+ return 0;
+ }
+
+ return dev->cfg_mgmt->svc_cap.pf_id_start;
+}
+EXPORT_SYMBOL(hinic3_host_pf_id_start);
+
+u8 hinic3_flexq_en(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev)
+ return 0;
+
+ return dev->cfg_mgmt->svc_cap.flexq_en;
+}
+EXPORT_SYMBOL(hinic3_flexq_en);
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_cfg.h b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_cfg.h
new file mode 100644
index 000000000000..0a27530ba522
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_cfg.h
@@ -0,0 +1,332 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_HW_CFG_H
+#define HINIC3_HW_CFG_H
+
+#include <linux/types.h>
+#include "cfg_mgt_comm_pub.h"
+#include "hinic3_hwdev.h"
+
+#define CFG_MAX_CMD_TIMEOUT 30000 /* ms */
+
+enum {
+ CFG_FREE = 0,
+ CFG_BUSY = 1
+};
+
+/* start position for CEQ allocation; the max number of CEQs is 32 */
+/*lint -save -e849*/
+enum {
+ CFG_RDMA_CEQ_BASE = 0
+};
+
+/*lint -restore*/
+
+/* RDMA resource */
+#define K_UNIT BIT(10)
+#define M_UNIT BIT(20)
+#define G_UNIT BIT(30)
+
+#define VIRTIO_BASE_VQ_SIZE 2048U
+#define VIRTIO_DEFAULT_VQ_SIZE 8192U
+
+/* L2NIC */
+#define HINIC3_CFG_MAX_QP 256
+
+/* RDMA */
+#define RDMA_RSVD_QPS 2
+#define ROCE_MAX_WQES (8 * K_UNIT - 1)
+#define IWARP_MAX_WQES (8 * K_UNIT)
+
+#define RDMA_MAX_SQ_SGE 16
+
+#define ROCE_MAX_RQ_SGE 16
+
+/* if this value changes, ROCE_MAX_WQE_BB_PER_WR must be changed with it */
+#define RDMA_MAX_SQ_DESC_SZ (256)
+
+/* (256B(cache_line_len) - 16B(ctrl_seg_len) - 48B(max_task_seg_len)) */
+#define ROCE_MAX_SQ_INLINE_DATA_SZ 192
+
+#define ROCE_MAX_RQ_DESC_SZ 256
+
+#define ROCE_QPC_ENTRY_SZ 512
+
+#define WQEBB_SZ 64
+
+#define ROCE_RDMARC_ENTRY_SZ 32
+#define ROCE_MAX_QP_INIT_RDMA 128
+#define ROCE_MAX_QP_DEST_RDMA 128
+
+#define ROCE_MAX_SRQ_WQES (16 * K_UNIT - 1)
+#define ROCE_RSVD_SRQS 0
+#define ROCE_MAX_SRQ_SGE 15
+#define ROCE_SRQC_ENTERY_SZ 64
+
+#define RDMA_MAX_CQES (8 * M_UNIT - 1)
+#define RDMA_RSVD_CQS 0
+
+#define RDMA_CQC_ENTRY_SZ 128
+
+#define RDMA_CQE_SZ 64
+#define RDMA_RSVD_MRWS 128
+#define RDMA_MPT_ENTRY_SZ 64
+#define RDMA_NUM_MTTS (1 * G_UNIT)
+#define LOG_MTT_SEG 5
+#define MTT_ENTRY_SZ 8
+#define LOG_RDMARC_SEG 3
+
+#define LOCAL_ACK_DELAY 15
+#define RDMA_NUM_PORTS 1
+#define ROCE_MAX_MSG_SZ (2 * G_UNIT)
+
+#define DB_PAGE_SZ (4 * K_UNIT)
+#define DWQE_SZ 256
+
+#define NUM_PD (128 * K_UNIT)
+#define RSVD_PD 0
+
+#define MAX_XRCDS (64 * K_UNIT)
+#define RSVD_XRCDS 0
+
+#define MAX_GID_PER_PORT 128
+#define GID_ENTRY_SZ 32
+#define RSVD_LKEY ((RDMA_RSVD_MRWS - 1) << 8)
+#define NUM_COMP_VECTORS 32
+#define PAGE_SZ_CAP ((1UL << 12) | (1UL << 16) | (1UL << 21))
+#define ROCE_MODE 1
+
+#define MAX_FRPL_LEN 511
+#define MAX_PKEYS 1
+
+/* ToE */
+#define TOE_PCTX_SZ 1024
+#define TOE_CQC_SZ 64
+
+/* IoE */
+#define IOE_PCTX_SZ 512
+
+/* FC */
+#define FC_PCTX_SZ 256
+#define FC_CCTX_SZ 256
+#define FC_SQE_SZ 128
+#define FC_SCQC_SZ 64
+#define FC_SCQE_SZ 64
+#define FC_SRQC_SZ 64
+#define FC_SRQE_SZ 32
+
+/* OVS */
+#define OVS_PCTX_SZ 512
+
+/* PPA */
+#define PPA_PCTX_SZ 512
+
+/* IPsec */
+#define IPSEC_SACTX_SZ 512
+
+struct dev_sf_svc_attr {
+	bool ft_en; /* business enable flag (not including RDMA) */
+	bool ft_pf_en; /* In FPGA test, whether the VF resource resides in
+			* the PF: 0 - VF, 1 - PF. VFs don't need this bit.
+			*/
+	bool rdma_en;
+	bool rdma_pf_en;/* In FPGA test, whether the VF RDMA resource resides
+			* in the PF: 0 - VF, 1 - PF. VFs don't need this bit.
+			*/
+};
+
+enum intr_type {
+ INTR_TYPE_MSIX,
+ INTR_TYPE_MSI,
+ INTR_TYPE_INT,
+ INTR_TYPE_NONE,
+	/* PXE and OVS need single-threaded processing; synchronous
+	 * messages must use the poll-wait interface.
+	 */
+};
+
+/* device capability */
+struct service_cap {
+ struct dev_sf_svc_attr sf_svc_attr;
+ u16 svc_type; /* user input service type */
+	u16 chip_svc_type; /* HW-supported service type; see servic_bit_define_e */
+
+ u8 host_id;
+ u8 ep_id;
+ u8 er_id; /* PF/VF's ER */
+ u8 port_id; /* PF/VF's physical port */
+
+ /* Host global resources */
+ u16 host_total_function;
+ u8 pf_num;
+ u8 pf_id_start;
+ u16 vf_num; /* max numbers of vf in current host */
+ u16 vf_id_start;
+ u8 host_oq_id_mask_val;
+ u8 host_valid_bitmap;
+ u8 master_host_id;
+ u8 srv_multi_host_mode;
+ u16 virtio_vq_size;
+
+ u8 timer_pf_num;
+ u8 timer_pf_id_start;
+ u16 timer_vf_num;
+ u16 timer_vf_id_start;
+
+ u8 flexq_en;
+ u8 cos_valid_bitmap;
+ u8 port_cos_valid_bitmap;
+ u16 max_vf; /* max VF number that PF supported */
+
+ u16 fake_vf_start_id;
+ u16 fake_vf_num;
+ u32 fake_vf_max_pctx;
+ u16 fake_vf_bfilter_start_addr;
+ u16 fake_vf_bfilter_len;
+
+ u16 fake_vf_num_cfg;
+
+ /* DO NOT get interrupt_type from firmware */
+ enum intr_type interrupt_type;
+
+ bool sf_en; /* stateful business status */
+ u8 timer_en; /* 0:disable, 1:enable */
+ u8 bloomfilter_en; /* 0:disable, 1:enable */
+
+ u8 lb_mode;
+ u8 smf_pg;
+
+ /* For test */
+ u32 test_mode;
+ u32 test_qpc_num;
+ u32 test_qpc_resvd_num;
+ u32 test_page_size_reorder;
+ bool test_xid_alloc_mode;
+ bool test_gpa_check_enable;
+ u8 test_qpc_alloc_mode;
+ u8 test_scqc_alloc_mode;
+
+ u32 test_max_conn_num;
+ u32 test_max_cache_conn_num;
+ u32 test_scqc_num;
+ u32 test_mpt_num;
+ u32 test_scq_resvd_num;
+ u32 test_mpt_recvd_num;
+ u32 test_hash_num;
+ u32 test_reorder_num;
+
+	u32 max_connect_num; /* PF/VF maximum connection number (1M) */
+	/* The maximum number of connections that can be held in cache
+	 * memory, max 1K
+	 */
+	u16 max_stick2cache_num;
+	/* Starting address in cache memory for the bloom filter, 64B aligned */
+	u16 bfilter_start_addr;
+	/* Length of the bloom filter, aligned on 64B; the size is length * 64B.
+	 * Bloom filter memory size + 1 must be a power of 2.
+	 * The maximum bloom filter memory size is 4M.
+	 */
+	u16 bfilter_len;
+	/* The size of the hash bucket table, aligned on 64 entries.
+	 * Used to AND (&) the hash value; bucket size + 1 must be a power of 2.
+	 * The maximum number of hash buckets is 4M.
+	 */
+	u16 hash_bucket_num;
+
+ struct nic_service_cap nic_cap; /* NIC capability */
+ struct rdma_service_cap rdma_cap; /* RDMA capability */
+ struct fc_service_cap fc_cap; /* FC capability */
+ struct toe_service_cap toe_cap; /* ToE capability */
+ struct ovs_service_cap ovs_cap; /* OVS capability */
+ struct ipsec_service_cap ipsec_cap; /* IPsec capability */
+ struct ppa_service_cap ppa_cap; /* PPA capability */
+ struct vbs_service_cap vbs_cap; /* VBS capability */
+};
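+
+/* Illustrative sizing only (the values here are made up, not firmware
+ * defaults): with bfilter_len = 128 the bloom filter above occupies
+ * 128 * 64B = 8KB, and max_stick2cache_num = 1K would cap cached
+ * connections at 1K out of a 1M max_connect_num.
+ */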
+
+struct svc_cap_info {
+ u32 func_idx;
+ struct service_cap cap;
+};
+
+struct cfg_eq {
+ enum hinic3_service_type type;
+ int eqn;
+	int free; /* 1 - allocated, 0 - freed */
+};
+
+struct cfg_eq_info {
+ struct cfg_eq *eq;
+
+ u8 num_ceq;
+
+ u8 num_ceq_remain;
+
+	/* mutex used for allocating EQs */
+ struct mutex eq_mutex;
+};
+
+struct irq_alloc_info_st {
+ enum hinic3_service_type type;
+	int free; /* 1 - allocated, 0 - freed */
+ struct irq_info info;
+};
+
+struct cfg_irq_info {
+ struct irq_alloc_info_st *alloc_info;
+ u16 num_total;
+ u16 num_irq_remain;
+ u16 num_irq_hw; /* device max irq number */
+
+	/* mutex used for allocating IRQs */
+ struct mutex irq_mutex;
+};
+
+#define VECTOR_THRESHOLD 2
+
+struct cfg_mgmt_info {
+ struct hinic3_hwdev *hwdev;
+ struct service_cap svc_cap;
+ struct cfg_eq_info eq_info; /* EQ */
+ struct cfg_irq_info irq_param_info; /* IRQ */
+ u32 func_seq_num; /* temporary */
+};
+
+#define CFG_SERVICE_FT_EN (CFG_SERVICE_MASK_VBS | CFG_SERVICE_MASK_TOE | \
+ CFG_SERVICE_MASK_IPSEC | CFG_SERVICE_MASK_FC | \
+ CFG_SERVICE_MASK_VIRTIO | CFG_SERVICE_MASK_OVS)
+#define CFG_SERVICE_RDMA_EN CFG_SERVICE_MASK_ROCE
+
+#define IS_NIC_TYPE(dev) \
+ (((u32)(dev)->cfg_mgmt->svc_cap.chip_svc_type) & CFG_SERVICE_MASK_NIC)
+#define IS_ROCE_TYPE(dev) \
+ (((u32)(dev)->cfg_mgmt->svc_cap.chip_svc_type) & CFG_SERVICE_MASK_ROCE)
+#define IS_VBS_TYPE(dev) \
+ (((u32)(dev)->cfg_mgmt->svc_cap.chip_svc_type) & CFG_SERVICE_MASK_VBS)
+#define IS_TOE_TYPE(dev) \
+ (((u32)(dev)->cfg_mgmt->svc_cap.chip_svc_type) & CFG_SERVICE_MASK_TOE)
+#define IS_IPSEC_TYPE(dev) \
+ (((u32)(dev)->cfg_mgmt->svc_cap.chip_svc_type) & CFG_SERVICE_MASK_IPSEC)
+#define IS_FC_TYPE(dev) \
+ (((u32)(dev)->cfg_mgmt->svc_cap.chip_svc_type) & CFG_SERVICE_MASK_FC)
+#define IS_OVS_TYPE(dev) \
+ (((u32)(dev)->cfg_mgmt->svc_cap.chip_svc_type) & CFG_SERVICE_MASK_OVS)
+#define IS_FT_TYPE(dev) \
+ (((u32)(dev)->cfg_mgmt->svc_cap.chip_svc_type) & CFG_SERVICE_FT_EN)
+#define IS_RDMA_TYPE(dev) \
+ (((u32)(dev)->cfg_mgmt->svc_cap.chip_svc_type) & CFG_SERVICE_RDMA_EN)
+#define IS_RDMA_ENABLE(dev) \
+ ((dev)->cfg_mgmt->svc_cap.sf_svc_attr.rdma_en)
+#define IS_PPA_TYPE(dev) \
+ (((u32)(dev)->cfg_mgmt->svc_cap.chip_svc_type) & CFG_SERVICE_MASK_PPA)
+#define IS_MIGR_TYPE(dev) \
+ (((u32)(dev)->cfg_mgmt->svc_cap.chip_svc_type) & CFG_SERVICE_MASK_MIGRATE)
+
+int init_cfg_mgmt(struct hinic3_hwdev *dev);
+
+void free_cfg_mgmt(struct hinic3_hwdev *dev);
+
+int init_capability(struct hinic3_hwdev *dev);
+
+void free_capability(struct hinic3_hwdev *dev);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_comm.c b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_comm.c
new file mode 100644
index 000000000000..f207408b19d6
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_comm.c
@@ -0,0 +1,1540 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/msi.h>
+#include <linux/types.h>
+#include <linux/delay.h>
+#include <linux/module.h>
+#include <linux/semaphore.h>
+#include <linux/interrupt.h>
+
+#include "ossl_knl.h"
+#include "hinic3_crm.h"
+#include "hinic3_hw.h"
+#include "hinic3_common.h"
+#include "hinic3_csr.h"
+#include "hinic3_hwdev.h"
+#include "hinic3_hwif.h"
+#include "hinic3_mgmt.h"
+#include "hinic3_hw_cfg.h"
+#include "hinic3_cmdq.h"
+#include "comm_msg_intf.h"
+#include "hinic3_hw_comm.h"
+
+#define HINIC3_MSIX_CNT_LLI_TIMER_SHIFT 0
+#define HINIC3_MSIX_CNT_LLI_CREDIT_SHIFT 8
+#define HINIC3_MSIX_CNT_COALESC_TIMER_SHIFT 8
+#define HINIC3_MSIX_CNT_PENDING_SHIFT 8
+#define HINIC3_MSIX_CNT_RESEND_TIMER_SHIFT 29
+
+#define HINIC3_MSIX_CNT_LLI_TIMER_MASK 0xFFU
+#define HINIC3_MSIX_CNT_LLI_CREDIT_MASK 0xFFU
+#define HINIC3_MSIX_CNT_COALESC_TIMER_MASK 0xFFU
+#define HINIC3_MSIX_CNT_PENDING_MASK 0x1FU
+#define HINIC3_MSIX_CNT_RESEND_TIMER_MASK 0x7U
+
+#define HINIC3_MSIX_CNT_SET(val, member) \
+ (((val) & HINIC3_MSIX_CNT_##member##_MASK) << \
+ HINIC3_MSIX_CNT_##member##_SHIFT)
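+
+/*
+ * A minimal usage sketch (hypothetical value, illustration only):
+ *
+ *	u32 cnt = HINIC3_MSIX_CNT_SET(3, RESEND_TIMER);
+ *
+ * masks the value with 0x7 and shifts it into bits [31:29].
+ */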
+
+#define DEFAULT_RX_BUF_SIZE ((u16)0xB)
+
+enum hinic3_rx_buf_size {
+ HINIC3_RX_BUF_SIZE_32B = 0x20,
+ HINIC3_RX_BUF_SIZE_64B = 0x40,
+ HINIC3_RX_BUF_SIZE_96B = 0x60,
+ HINIC3_RX_BUF_SIZE_128B = 0x80,
+ HINIC3_RX_BUF_SIZE_192B = 0xC0,
+ HINIC3_RX_BUF_SIZE_256B = 0x100,
+ HINIC3_RX_BUF_SIZE_384B = 0x180,
+ HINIC3_RX_BUF_SIZE_512B = 0x200,
+ HINIC3_RX_BUF_SIZE_768B = 0x300,
+ HINIC3_RX_BUF_SIZE_1K = 0x400,
+ HINIC3_RX_BUF_SIZE_1_5K = 0x600,
+ HINIC3_RX_BUF_SIZE_2K = 0x800,
+ HINIC3_RX_BUF_SIZE_3K = 0xC00,
+ HINIC3_RX_BUF_SIZE_4K = 0x1000,
+ HINIC3_RX_BUF_SIZE_8K = 0x2000,
+ HINIC3_RX_BUF_SIZE_16K = 0x4000,
+};
+
+const int hinic3_hw_rx_buf_size[] = {
+ HINIC3_RX_BUF_SIZE_32B,
+ HINIC3_RX_BUF_SIZE_64B,
+ HINIC3_RX_BUF_SIZE_96B,
+ HINIC3_RX_BUF_SIZE_128B,
+ HINIC3_RX_BUF_SIZE_192B,
+ HINIC3_RX_BUF_SIZE_256B,
+ HINIC3_RX_BUF_SIZE_384B,
+ HINIC3_RX_BUF_SIZE_512B,
+ HINIC3_RX_BUF_SIZE_768B,
+ HINIC3_RX_BUF_SIZE_1K,
+ HINIC3_RX_BUF_SIZE_1_5K,
+ HINIC3_RX_BUF_SIZE_2K,
+ HINIC3_RX_BUF_SIZE_3K,
+ HINIC3_RX_BUF_SIZE_4K,
+ HINIC3_RX_BUF_SIZE_8K,
+ HINIC3_RX_BUF_SIZE_16K,
+};
+
+static inline int comm_msg_to_mgmt_sync(struct hinic3_hwdev *hwdev, u16 cmd, void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size)
+{
+ return hinic3_msg_to_mgmt_sync(hwdev, HINIC3_MOD_COMM, cmd, buf_in,
+ in_size, buf_out, out_size, 0,
+ HINIC3_CHANNEL_COMM);
+}
+
+static inline int comm_msg_to_mgmt_sync_ch(struct hinic3_hwdev *hwdev, u16 cmd, void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size, u16 channel)
+{
+ return hinic3_msg_to_mgmt_sync(hwdev, HINIC3_MOD_COMM, cmd, buf_in,
+ in_size, buf_out, out_size, 0, channel);
+}
+
+int hinic3_get_interrupt_cfg(void *dev, struct interrupt_info *info,
+ u16 channel)
+{
+ struct hinic3_hwdev *hwdev = dev;
+ struct comm_cmd_msix_config msix_cfg;
+ u16 out_size = sizeof(msix_cfg);
+ int err;
+
+ if (!hwdev || !info)
+ return -EINVAL;
+
+ memset(&msix_cfg, 0, sizeof(msix_cfg));
+ msix_cfg.func_id = hinic3_global_func_id(hwdev);
+ msix_cfg.msix_index = info->msix_index;
+ msix_cfg.opcode = MGMT_MSG_CMD_OP_GET;
+
+ err = comm_msg_to_mgmt_sync_ch(hwdev, COMM_MGMT_CMD_CFG_MSIX_CTRL_REG,
+ &msix_cfg, sizeof(msix_cfg), &msix_cfg,
+ &out_size, channel);
+ if (err || !out_size || msix_cfg.head.status) {
+ sdk_err(hwdev->dev_hdl, "Failed to get interrupt config, err: %d, status: 0x%x, out size: 0x%x, channel: 0x%x\n",
+ err, msix_cfg.head.status, out_size, channel);
+ return -EINVAL;
+ }
+
+ info->lli_credit_limit = msix_cfg.lli_credit_cnt;
+ info->lli_timer_cfg = msix_cfg.lli_timer_cnt;
+ info->pending_limt = msix_cfg.pending_cnt;
+ info->coalesc_timer_cfg = msix_cfg.coalesce_timer_cnt;
+ info->resend_timer_cfg = msix_cfg.resend_timer_cnt;
+
+ return 0;
+}
+
+int hinic3_set_interrupt_cfg_direct(void *hwdev, struct interrupt_info *info,
+ u16 channel)
+{
+ struct comm_cmd_msix_config msix_cfg;
+ u16 out_size = sizeof(msix_cfg);
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ memset(&msix_cfg, 0, sizeof(msix_cfg));
+ msix_cfg.func_id = hinic3_global_func_id(hwdev);
+ msix_cfg.msix_index = (u16)info->msix_index;
+ msix_cfg.opcode = MGMT_MSG_CMD_OP_SET;
+
+ msix_cfg.lli_credit_cnt = info->lli_credit_limit;
+ msix_cfg.lli_timer_cnt = info->lli_timer_cfg;
+ msix_cfg.pending_cnt = info->pending_limt;
+ msix_cfg.coalesce_timer_cnt = info->coalesc_timer_cfg;
+ msix_cfg.resend_timer_cnt = info->resend_timer_cfg;
+
+ err = comm_msg_to_mgmt_sync_ch(hwdev, COMM_MGMT_CMD_CFG_MSIX_CTRL_REG,
+ &msix_cfg, sizeof(msix_cfg), &msix_cfg,
+ &out_size, channel);
+ if (err || !out_size || msix_cfg.head.status) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Failed to set interrupt config, err: %d, status: 0x%x, out size: 0x%x, channel: 0x%x\n",
+ err, msix_cfg.head.status, out_size, channel);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+int hinic3_set_interrupt_cfg(void *dev, struct interrupt_info info, u16 channel)
+{
+ struct interrupt_info temp_info;
+ struct hinic3_hwdev *hwdev = dev;
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ temp_info.msix_index = info.msix_index;
+
+ err = hinic3_get_interrupt_cfg(hwdev, &temp_info, channel);
+ if (err)
+ return -EINVAL;
+
+ if (!info.lli_set) {
+ info.lli_credit_limit = temp_info.lli_credit_limit;
+ info.lli_timer_cfg = temp_info.lli_timer_cfg;
+ }
+
+ if (!info.interrupt_coalesc_set) {
+ info.pending_limt = temp_info.pending_limt;
+ info.coalesc_timer_cfg = temp_info.coalesc_timer_cfg;
+ info.resend_timer_cfg = temp_info.resend_timer_cfg;
+ }
+
+ return hinic3_set_interrupt_cfg_direct(hwdev, &info, channel);
+}
+EXPORT_SYMBOL(hinic3_set_interrupt_cfg);
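+
+/* hinic3_set_interrupt_cfg() is a read-modify-write helper: any field group
+ * whose lli_set / interrupt_coalesc_set flag is clear keeps its current
+ * hardware value. A hypothetical caller updating only the coalescing
+ * settings might look like this (sketch, not taken from the driver):
+ *
+ *	struct interrupt_info info = { 0 };
+ *
+ *	info.msix_index = 0;
+ *	info.interrupt_coalesc_set = 1;
+ *	info.coalesc_timer_cfg = 0x20;
+ *	info.pending_limt = 2;
+ *	hinic3_set_interrupt_cfg(hwdev, info, HINIC3_CHANNEL_COMM);
+ */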
+
+void hinic3_misx_intr_clear_resend_bit(void *hwdev, u16 msix_idx,
+ u8 clear_resend_en)
+{
+ struct hinic3_hwif *hwif = NULL;
+ u32 msix_ctrl = 0, addr;
+
+ if (!hwdev)
+ return;
+
+ hwif = ((struct hinic3_hwdev *)hwdev)->hwif;
+
+ msix_ctrl = HINIC3_MSI_CLR_INDIR_SET(msix_idx, SIMPLE_INDIR_IDX) |
+ HINIC3_MSI_CLR_INDIR_SET(clear_resend_en, RESEND_TIMER_CLR);
+
+ addr = HINIC3_CSR_FUNC_MSI_CLR_WR_ADDR;
+ hinic3_hwif_write_reg(hwif, addr, msix_ctrl);
+}
+EXPORT_SYMBOL(hinic3_misx_intr_clear_resend_bit);
+
+int hinic3_set_wq_page_size(void *hwdev, u16 func_idx, u32 page_size,
+ u16 channel)
+{
+ struct comm_cmd_wq_page_size page_size_info;
+ u16 out_size = sizeof(page_size_info);
+ int err;
+
+ memset(&page_size_info, 0, sizeof(page_size_info));
+ page_size_info.func_id = func_idx;
+ page_size_info.page_size = HINIC3_PAGE_SIZE_HW(page_size);
+ page_size_info.opcode = MGMT_MSG_CMD_OP_SET;
+
+ err = comm_msg_to_mgmt_sync_ch(hwdev, COMM_MGMT_CMD_CFG_PAGESIZE,
+ &page_size_info, sizeof(page_size_info),
+ &page_size_info, &out_size, channel);
+ if (err || !out_size || page_size_info.head.status) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Failed to set wq page size, err: %d, status: 0x%x, out_size: 0x%0x, channel: 0x%x\n",
+ err, page_size_info.head.status, out_size, channel);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+int hinic3_func_reset(void *dev, u16 func_id, u64 reset_flag, u16 channel)
+{
+ struct comm_cmd_func_reset func_reset;
+ struct hinic3_hwdev *hwdev = dev;
+ u16 out_size = sizeof(func_reset);
+ int err = 0;
+
+ if (!dev) {
+ pr_err("Invalid para: dev is null.\n");
+ return -EINVAL;
+ }
+
+	sdk_info(hwdev->dev_hdl, "Function is reset, flag: 0x%llx, channel: 0x%x\n",
+ reset_flag, channel);
+
+ memset(&func_reset, 0, sizeof(func_reset));
+ func_reset.func_id = func_id;
+ func_reset.reset_flag = reset_flag;
+ err = comm_msg_to_mgmt_sync_ch(hwdev, COMM_MGMT_CMD_FUNC_RESET,
+ &func_reset, sizeof(func_reset),
+ &func_reset, &out_size, channel);
+ if (err || !out_size || func_reset.head.status) {
+ sdk_err(hwdev->dev_hdl, "Failed to reset func resources, reset_flag 0x%llx, err: %d, status: 0x%x, out_size: 0x%x\n",
+ reset_flag, err, func_reset.head.status, out_size);
+ return -EIO;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_func_reset);
+
+static u16 get_hw_rx_buf_size(int rx_buf_sz)
+{
+	u16 num_hw_types = ARRAY_SIZE(hinic3_hw_rx_buf_size);
+ u16 i;
+
+ for (i = 0; i < num_hw_types; i++) {
+ if (hinic3_hw_rx_buf_size[i] == rx_buf_sz)
+ return i;
+ }
+
+ pr_err("Chip can't support rx buf size of %d\n", rx_buf_sz);
+
+ return DEFAULT_RX_BUF_SIZE; /* default 2K */
+}
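+
+/* For example, a 2K buffer (0x800) maps to index 0xB in
+ * hinic3_hw_rx_buf_size[], which is exactly DEFAULT_RX_BUF_SIZE, the
+ * fallback used above for unsupported sizes.
+ */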
+
+int hinic3_set_root_ctxt(void *hwdev, u32 rq_depth, u32 sq_depth, int rx_buf_sz,
+ u16 channel)
+{
+ struct comm_cmd_root_ctxt root_ctxt;
+ u16 out_size = sizeof(root_ctxt);
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ memset(&root_ctxt, 0, sizeof(root_ctxt));
+ root_ctxt.func_id = hinic3_global_func_id(hwdev);
+
+ root_ctxt.set_cmdq_depth = 0;
+ root_ctxt.cmdq_depth = 0;
+
+ root_ctxt.lro_en = 1;
+
+ root_ctxt.rq_depth = (u16)ilog2(rq_depth);
+ root_ctxt.rx_buf_sz = get_hw_rx_buf_size(rx_buf_sz);
+ root_ctxt.sq_depth = (u16)ilog2(sq_depth);
+
+ err = comm_msg_to_mgmt_sync_ch(hwdev, COMM_MGMT_CMD_SET_VAT,
+ &root_ctxt, sizeof(root_ctxt),
+ &root_ctxt, &out_size, channel);
+ if (err || !out_size || root_ctxt.head.status) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Failed to set root context, err: %d, status: 0x%x, out_size: 0x%x, channel: 0x%x\n",
+ err, root_ctxt.head.status, out_size, channel);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_set_root_ctxt);
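+
+/* Note that rq_depth/sq_depth are handed to hardware as ilog2() values, so
+ * callers presumably use power-of-two ring depths; e.g. a 4096-entry RQ is
+ * encoded as 12.
+ */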
+
+int hinic3_clean_root_ctxt(void *hwdev, u16 channel)
+{
+ struct comm_cmd_root_ctxt root_ctxt;
+ u16 out_size = sizeof(root_ctxt);
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ memset(&root_ctxt, 0, sizeof(root_ctxt));
+ root_ctxt.func_id = hinic3_global_func_id(hwdev);
+
+ err = comm_msg_to_mgmt_sync_ch(hwdev, COMM_MGMT_CMD_SET_VAT,
+ &root_ctxt, sizeof(root_ctxt),
+ &root_ctxt, &out_size, channel);
+ if (err || !out_size || root_ctxt.head.status) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Failed to set root context, err: %d, status: 0x%x, out_size: 0x%x, channel: 0x%x\n",
+ err, root_ctxt.head.status, out_size, channel);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_clean_root_ctxt);
+
+int hinic3_set_cmdq_depth(void *hwdev, u16 cmdq_depth)
+{
+ struct comm_cmd_root_ctxt root_ctxt;
+ u16 out_size = sizeof(root_ctxt);
+ int err;
+
+ memset(&root_ctxt, 0, sizeof(root_ctxt));
+ root_ctxt.func_id = hinic3_global_func_id(hwdev);
+
+ root_ctxt.set_cmdq_depth = 1;
+ root_ctxt.cmdq_depth = (u8)ilog2(cmdq_depth);
+
+ err = comm_msg_to_mgmt_sync(hwdev, COMM_MGMT_CMD_SET_VAT, &root_ctxt,
+ sizeof(root_ctxt), &root_ctxt, &out_size);
+ if (err || !out_size || root_ctxt.head.status) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Failed to set cmdq depth, err: %d, status: 0x%x, out_size: 0x%x\n",
+ err, root_ctxt.head.status, out_size);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+int hinic3_set_cmdq_ctxt(struct hinic3_hwdev *hwdev, u8 cmdq_id,
+ struct cmdq_ctxt_info *ctxt)
+{
+ struct comm_cmd_cmdq_ctxt cmdq_ctxt;
+ u16 out_size = sizeof(cmdq_ctxt);
+ int err;
+
+ memset(&cmdq_ctxt, 0, sizeof(cmdq_ctxt));
+ memcpy(&cmdq_ctxt.ctxt, ctxt, sizeof(*ctxt));
+ cmdq_ctxt.func_id = hinic3_global_func_id(hwdev);
+ cmdq_ctxt.cmdq_id = cmdq_id;
+
+ err = comm_msg_to_mgmt_sync(hwdev, COMM_MGMT_CMD_SET_CMDQ_CTXT,
+ &cmdq_ctxt, sizeof(cmdq_ctxt),
+ &cmdq_ctxt, &out_size);
+ if (err || !out_size || cmdq_ctxt.head.status) {
+ sdk_err(hwdev->dev_hdl, "Failed to set cmdq ctxt, err: %d, status: 0x%x, out_size: 0x%x\n",
+ err, cmdq_ctxt.head.status, out_size);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+int hinic3_set_ceq_ctrl_reg(struct hinic3_hwdev *hwdev, u16 q_id,
+ u32 ctrl0, u32 ctrl1)
+{
+ struct comm_cmd_ceq_ctrl_reg ceq_ctrl;
+ u16 out_size = sizeof(ceq_ctrl);
+ int err;
+
+ memset(&ceq_ctrl, 0, sizeof(ceq_ctrl));
+ ceq_ctrl.func_id = hinic3_global_func_id(hwdev);
+ ceq_ctrl.q_id = q_id;
+ ceq_ctrl.ctrl0 = ctrl0;
+ ceq_ctrl.ctrl1 = ctrl1;
+
+ err = comm_msg_to_mgmt_sync(hwdev, COMM_MGMT_CMD_SET_CEQ_CTRL_REG,
+ &ceq_ctrl, sizeof(ceq_ctrl),
+ &ceq_ctrl, &out_size);
+ if (err || !out_size || ceq_ctrl.head.status) {
+ sdk_err(hwdev->dev_hdl, "Failed to set ceq %u ctrl reg, err: %d status: 0x%x, out_size: 0x%x\n",
+ q_id, err, ceq_ctrl.head.status, out_size);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+int hinic3_set_dma_attr_tbl(struct hinic3_hwdev *hwdev, u8 entry_idx, u8 st, u8 at, u8 ph,
+ u8 no_snooping, u8 tph_en)
+{
+ struct comm_cmd_dma_attr_config dma_attr;
+ u16 out_size = sizeof(dma_attr);
+ int err;
+
+ memset(&dma_attr, 0, sizeof(dma_attr));
+ dma_attr.func_id = hinic3_global_func_id(hwdev);
+ dma_attr.entry_idx = entry_idx;
+ dma_attr.st = st;
+ dma_attr.at = at;
+ dma_attr.ph = ph;
+ dma_attr.no_snooping = no_snooping;
+ dma_attr.tph_en = tph_en;
+
+ err = comm_msg_to_mgmt_sync(hwdev, COMM_MGMT_CMD_SET_DMA_ATTR, &dma_attr, sizeof(dma_attr),
+ &dma_attr, &out_size);
+ if (err || !out_size || dma_attr.head.status) {
+ sdk_err(hwdev->dev_hdl, "Failed to set dma attr, err: %d, status: 0x%x, out_size: 0x%x\n",
+ err, dma_attr.head.status, out_size);
+ return -EIO;
+ }
+
+ return 0;
+}
+
+int hinic3_set_bdf_ctxt(void *hwdev, u8 bus, u8 device, u8 function)
+{
+ struct comm_cmd_bdf_info bdf_info;
+ u16 out_size = sizeof(bdf_info);
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ memset(&bdf_info, 0, sizeof(bdf_info));
+ bdf_info.function_idx = hinic3_global_func_id(hwdev);
+ bdf_info.bus = bus;
+ bdf_info.device = device;
+ bdf_info.function = function;
+
+ err = comm_msg_to_mgmt_sync(hwdev, COMM_MGMT_CMD_SEND_BDF_INFO,
+ &bdf_info, sizeof(bdf_info),
+ &bdf_info, &out_size);
+ if (err || !out_size || bdf_info.head.status) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Failed to set bdf info to MPU, err: %d, status: 0x%x, out_size: 0x%x\n",
+ err, bdf_info.head.status, out_size);
+ return -EIO;
+ }
+
+ return 0;
+}
+
+int hinic3_sync_time(void *hwdev, u64 time)
+{
+ struct comm_cmd_sync_time time_info;
+ u16 out_size = sizeof(time_info);
+ int err;
+
+ memset(&time_info, 0, sizeof(time_info));
+ time_info.mstime = time;
+ err = comm_msg_to_mgmt_sync(hwdev, COMM_MGMT_CMD_SYNC_TIME, &time_info,
+ sizeof(time_info), &time_info, &out_size);
+ if (err || time_info.head.status || !out_size) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Failed to sync time to mgmt, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, time_info.head.status, out_size);
+ return -EIO;
+ }
+
+ return 0;
+}
+
+int hinic3_set_ppf_flr_type(void *hwdev, enum hinic3_ppf_flr_type flr_type)
+{
+ struct comm_cmd_ppf_flr_type_set flr_type_set;
+	u16 out_size = sizeof(flr_type_set);
+ struct hinic3_hwdev *dev = hwdev;
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ memset(&flr_type_set, 0, sizeof(flr_type_set));
+ flr_type_set.func_id = hinic3_global_func_id(hwdev);
+ flr_type_set.ppf_flr_type = flr_type;
+
+ err = comm_msg_to_mgmt_sync(hwdev, COMM_MGMT_CMD_SET_PPF_FLR_TYPE,
+ &flr_type_set, sizeof(flr_type_set),
+ &flr_type_set, &out_size);
+ if (err || !out_size || flr_type_set.head.status) {
+ sdk_err(dev->dev_hdl, "Failed to set ppf flr type, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, flr_type_set.head.status, out_size);
+ return -EIO;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_set_ppf_flr_type);
+
+static int hinic3_get_fw_ver(struct hinic3_hwdev *hwdev, enum hinic3_fw_ver_type type,
+ u8 *mgmt_ver, u8 version_size, u16 channel)
+{
+ struct comm_cmd_get_fw_version fw_ver;
+ u16 out_size = sizeof(fw_ver);
+ int err;
+
+ if (!hwdev || !mgmt_ver)
+ return -EINVAL;
+
+ memset(&fw_ver, 0, sizeof(fw_ver));
+ fw_ver.fw_type = type;
+ err = comm_msg_to_mgmt_sync_ch(hwdev, COMM_MGMT_CMD_GET_FW_VERSION,
+ &fw_ver, sizeof(fw_ver), &fw_ver,
+ &out_size, channel);
+ if (err || !out_size || fw_ver.head.status) {
+ sdk_err(hwdev->dev_hdl,
+ "Failed to get fw version, err: %d, status: 0x%x, out size: 0x%x, channel: 0x%x\n",
+ err, fw_ver.head.status, out_size, channel);
+ return -EIO;
+ }
+
+ err = snprintf(mgmt_ver, version_size, "%s", fw_ver.ver);
+ if (err < 0)
+ return -EINVAL;
+
+ return 0;
+}
+
+int hinic3_get_mgmt_version(void *hwdev, u8 *mgmt_ver, u8 version_size,
+ u16 channel)
+{
+ return hinic3_get_fw_ver(hwdev, HINIC3_FW_VER_TYPE_MPU, mgmt_ver,
+ version_size, channel);
+}
+EXPORT_SYMBOL(hinic3_get_mgmt_version);
+
+int hinic3_get_fw_version(void *hwdev, struct hinic3_fw_version *fw_ver,
+ u16 channel)
+{
+ int err;
+
+ if (!hwdev || !fw_ver)
+ return -EINVAL;
+
+ err = hinic3_get_fw_ver(hwdev, HINIC3_FW_VER_TYPE_MPU,
+ fw_ver->mgmt_ver, sizeof(fw_ver->mgmt_ver),
+ channel);
+ if (err)
+ return err;
+
+ err = hinic3_get_fw_ver(hwdev, HINIC3_FW_VER_TYPE_NPU,
+ fw_ver->microcode_ver,
+ sizeof(fw_ver->microcode_ver), channel);
+ if (err)
+ return err;
+
+ return hinic3_get_fw_ver(hwdev, HINIC3_FW_VER_TYPE_BOOT,
+ fw_ver->boot_ver, sizeof(fw_ver->boot_ver),
+ channel);
+}
+EXPORT_SYMBOL(hinic3_get_fw_version);
+
+static int hinic3_comm_features_nego(void *hwdev, u8 opcode, u64 *s_feature,
+ u16 size)
+{
+ struct comm_cmd_feature_nego feature_nego;
+ u16 out_size = sizeof(feature_nego);
+ struct hinic3_hwdev *dev = hwdev;
+ int err;
+
+ if (!hwdev || !s_feature || size > COMM_MAX_FEATURE_QWORD)
+ return -EINVAL;
+
+ memset(&feature_nego, 0, sizeof(feature_nego));
+ feature_nego.func_id = hinic3_global_func_id(hwdev);
+ feature_nego.opcode = opcode;
+ if (opcode == MGMT_MSG_CMD_OP_SET)
+ memcpy(feature_nego.s_feature, s_feature, (size * sizeof(u64)));
+
+ err = comm_msg_to_mgmt_sync(hwdev, COMM_MGMT_CMD_FEATURE_NEGO,
+ &feature_nego, sizeof(feature_nego),
+ &feature_nego, &out_size);
+ if (err || !out_size || feature_nego.head.status) {
+ sdk_err(dev->dev_hdl, "Failed to negotiate feature, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, feature_nego.head.status, out_size);
+ return -EINVAL;
+ }
+
+ if (opcode == MGMT_MSG_CMD_OP_GET)
+ memcpy(s_feature, feature_nego.s_feature, (size * sizeof(u64)));
+
+ return 0;
+}
+
+int hinic3_get_comm_features(void *hwdev, u64 *s_feature, u16 size)
+{
+ return hinic3_comm_features_nego(hwdev, MGMT_MSG_CMD_OP_GET, s_feature,
+ size);
+}
+
+int hinic3_set_comm_features(void *hwdev, u64 *s_feature, u16 size)
+{
+ return hinic3_comm_features_nego(hwdev, MGMT_MSG_CMD_OP_SET, s_feature,
+ size);
+}
+
+int hinic3_comm_channel_detect(struct hinic3_hwdev *hwdev)
+{
+ struct comm_cmd_channel_detect channel_detect_info;
+ u16 out_size = sizeof(channel_detect_info);
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ memset(&channel_detect_info, 0, sizeof(channel_detect_info));
+ channel_detect_info.func_id = hinic3_global_func_id(hwdev);
+
+ err = comm_msg_to_mgmt_sync(hwdev, COMM_MGMT_CMD_CHANNEL_DETECT,
+ &channel_detect_info, sizeof(channel_detect_info),
+ &channel_detect_info, &out_size);
+ if ((channel_detect_info.head.status != HINIC3_MGMT_CMD_UNSUPPORTED &&
+ channel_detect_info.head.status) || err || !out_size) {
+ sdk_err(hwdev->dev_hdl, "Failed to send channel detect, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, channel_detect_info.head.status, out_size);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+int hinic3_func_tmr_bitmap_set(void *hwdev, u16 func_id, bool en)
+{
+ struct comm_cmd_func_tmr_bitmap_op bitmap_op;
+ u16 out_size = sizeof(bitmap_op);
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ memset(&bitmap_op, 0, sizeof(bitmap_op));
+ bitmap_op.func_id = func_id;
+ bitmap_op.opcode = en ? FUNC_TMR_BITMAP_ENABLE : FUNC_TMR_BITMAP_DISABLE;
+
+ err = comm_msg_to_mgmt_sync(hwdev, COMM_MGMT_CMD_SET_FUNC_TMR_BITMAT,
+ &bitmap_op, sizeof(bitmap_op),
+ &bitmap_op, &out_size);
+ if (err || !out_size || bitmap_op.head.status) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Failed to set timer bitmap, err: %d, status: 0x%x, out_size: 0x%x\n",
+ err, bitmap_op.head.status, out_size);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+static int ppf_ht_gpa_set(struct hinic3_hwdev *hwdev, struct hinic3_page_addr *pg0,
+ struct hinic3_page_addr *pg1)
+{
+ struct comm_cmd_ht_gpa ht_gpa_set;
+ u16 out_size = sizeof(ht_gpa_set);
+ int ret;
+
+ memset(&ht_gpa_set, 0, sizeof(ht_gpa_set));
+ pg0->virt_addr = dma_zalloc_coherent(hwdev->dev_hdl,
+ HINIC3_HT_GPA_PAGE_SIZE,
+ &pg0->phys_addr, GFP_KERNEL);
+ if (!pg0->virt_addr) {
+ sdk_err(hwdev->dev_hdl, "Alloc pg0 page addr failed\n");
+ return -EFAULT;
+ }
+
+ pg1->virt_addr = dma_zalloc_coherent(hwdev->dev_hdl,
+ HINIC3_HT_GPA_PAGE_SIZE,
+ &pg1->phys_addr, GFP_KERNEL);
+ if (!pg1->virt_addr) {
+ sdk_err(hwdev->dev_hdl, "Alloc pg1 page addr failed\n");
+ return -EFAULT;
+ }
+
+ ht_gpa_set.host_id = hinic3_host_id(hwdev);
+ ht_gpa_set.page_pa0 = pg0->phys_addr;
+ ht_gpa_set.page_pa1 = pg1->phys_addr;
+ sdk_info(hwdev->dev_hdl, "PPF ht gpa set: page_addr0.pa=0x%llx, page_addr1.pa=0x%llx\n",
+ pg0->phys_addr, pg1->phys_addr);
+ ret = comm_msg_to_mgmt_sync(hwdev, COMM_MGMT_CMD_SET_PPF_HT_GPA,
+ &ht_gpa_set, sizeof(ht_gpa_set),
+ &ht_gpa_set, &out_size);
+ if (ret || !out_size || ht_gpa_set.head.status) {
+ sdk_warn(hwdev->dev_hdl, "PPF ht gpa set failed, ret: %d, status: 0x%x, out_size: 0x%x\n",
+ ret, ht_gpa_set.head.status, out_size);
+ return -EFAULT;
+ }
+
+ hwdev->page_pa0.phys_addr = pg0->phys_addr;
+ hwdev->page_pa0.virt_addr = pg0->virt_addr;
+
+ hwdev->page_pa1.phys_addr = pg1->phys_addr;
+ hwdev->page_pa1.virt_addr = pg1->virt_addr;
+
+ return 0;
+}
+
+int hinic3_ppf_ht_gpa_init(void *dev)
+{
+ struct hinic3_page_addr page_addr0[HINIC3_PPF_HT_GPA_SET_RETRY_TIMES];
+ struct hinic3_page_addr page_addr1[HINIC3_PPF_HT_GPA_SET_RETRY_TIMES];
+ struct hinic3_hwdev *hwdev = dev;
+ int ret;
+ int i;
+ int j;
+ size_t size;
+
+ if (!dev) {
+ pr_err("Invalid para: dev is null.\n");
+ return -EINVAL;
+ }
+
+ size = HINIC3_PPF_HT_GPA_SET_RETRY_TIMES * sizeof(page_addr0[0]);
+ memset(page_addr0, 0, size);
+ memset(page_addr1, 0, size);
+
+ for (i = 0; i < HINIC3_PPF_HT_GPA_SET_RETRY_TIMES; i++) {
+ ret = ppf_ht_gpa_set(hwdev, &page_addr0[i], &page_addr1[i]);
+ if (ret == 0)
+ break;
+ }
+
+ for (j = 0; j < i; j++) {
+ if (page_addr0[j].virt_addr) {
+ dma_free_coherent(hwdev->dev_hdl,
+ HINIC3_HT_GPA_PAGE_SIZE,
+ page_addr0[j].virt_addr,
+ (dma_addr_t)page_addr0[j].phys_addr);
+ page_addr0[j].virt_addr = NULL;
+ }
+ if (page_addr1[j].virt_addr) {
+ dma_free_coherent(hwdev->dev_hdl,
+ HINIC3_HT_GPA_PAGE_SIZE,
+ page_addr1[j].virt_addr,
+ (dma_addr_t)page_addr1[j].phys_addr);
+ page_addr1[j].virt_addr = NULL;
+ }
+ }
+
+ if (i >= HINIC3_PPF_HT_GPA_SET_RETRY_TIMES) {
+ sdk_err(hwdev->dev_hdl, "PPF ht gpa init failed, retry times: %d\n",
+ i);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+void hinic3_ppf_ht_gpa_deinit(void *dev)
+{
+ struct hinic3_hwdev *hwdev = dev;
+
+ if (!dev) {
+ pr_err("Invalid para: dev is null.\n");
+ return;
+ }
+
+ if (hwdev->page_pa0.virt_addr) {
+ dma_free_coherent(hwdev->dev_hdl, HINIC3_HT_GPA_PAGE_SIZE,
+ hwdev->page_pa0.virt_addr,
+ (dma_addr_t)(hwdev->page_pa0.phys_addr));
+ hwdev->page_pa0.virt_addr = NULL;
+ }
+
+ if (hwdev->page_pa1.virt_addr) {
+ dma_free_coherent(hwdev->dev_hdl, HINIC3_HT_GPA_PAGE_SIZE,
+ hwdev->page_pa1.virt_addr,
+ (dma_addr_t)hwdev->page_pa1.phys_addr);
+ hwdev->page_pa1.virt_addr = NULL;
+ }
+}
+
+static int set_ppf_tmr_status(struct hinic3_hwdev *hwdev,
+ enum ppf_tmr_status status)
+{
+ struct comm_cmd_ppf_tmr_op op;
+ u16 out_size = sizeof(op);
+ int err = 0;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ memset(&op, 0, sizeof(op));
+
+ if (hinic3_func_type(hwdev) != TYPE_PPF)
+ return -EFAULT;
+
+ op.opcode = status;
+ op.ppf_id = hinic3_ppf_idx(hwdev);
+
+ err = comm_msg_to_mgmt_sync(hwdev, COMM_MGMT_CMD_SET_PPF_TMR, &op,
+ sizeof(op), &op, &out_size);
+ if (err || !out_size || op.head.status) {
+ sdk_err(hwdev->dev_hdl, "Failed to set ppf timer, err: %d, status: 0x%x, out_size: 0x%x\n",
+ err, op.head.status, out_size);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+int hinic3_ppf_tmr_start(void *hwdev)
+{
+ if (!hwdev) {
+ pr_err("Hwdev pointer is NULL for starting ppf timer\n");
+ return -EINVAL;
+ }
+
+ return set_ppf_tmr_status(hwdev, HINIC_PPF_TMR_FLAG_START);
+}
+EXPORT_SYMBOL(hinic3_ppf_tmr_start);
+
+int hinic3_ppf_tmr_stop(void *hwdev)
+{
+ if (!hwdev) {
+		pr_err("Hwdev pointer is NULL for stopping ppf timer\n");
+ return -EINVAL;
+ }
+
+ return set_ppf_tmr_status(hwdev, HINIC_PPF_TMR_FLAG_STOP);
+}
+EXPORT_SYMBOL(hinic3_ppf_tmr_stop);
+
+static int mqm_eqm_try_alloc_mem(struct hinic3_hwdev *hwdev, u32 page_size,
+ u32 page_num)
+{
+ struct hinic3_page_addr *page_addr = hwdev->mqm_att.brm_srch_page_addr;
+ u32 valid_num = 0;
+ u32 flag = 1;
+ u32 i = 0;
+
+ for (i = 0; i < page_num; i++) {
+ page_addr->virt_addr =
+ dma_zalloc_coherent(hwdev->dev_hdl, page_size,
+ &page_addr->phys_addr, GFP_KERNEL);
+ if (!page_addr->virt_addr) {
+ flag = 0;
+ break;
+ }
+ valid_num++;
+ page_addr++;
+ }
+
+ if (flag == 1) {
+ hwdev->mqm_att.page_size = page_size;
+ hwdev->mqm_att.page_num = page_num;
+ } else {
+ page_addr = hwdev->mqm_att.brm_srch_page_addr;
+ for (i = 0; i < valid_num; i++) {
+ dma_free_coherent(hwdev->dev_hdl, page_size,
+ page_addr->virt_addr,
+ (dma_addr_t)page_addr->phys_addr);
+ page_addr++;
+ }
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+static int mqm_eqm_alloc_page_mem(struct hinic3_hwdev *hwdev)
+{
+ int ret = 0;
+ u32 page_num;
+
+ /* apply for 2M page, page number is chunk_num/1024 */
+ page_num = (hwdev->mqm_att.chunk_num + 0x3ff) >> 0xa;
+ ret = mqm_eqm_try_alloc_mem(hwdev, 0x2 * 0x400 * 0x400, page_num);
+ if (ret == 0) {
+ sdk_info(hwdev->dev_hdl, "[mqm_eqm_init] Alloc page_size 2M OK\n");
+ return 0;
+ }
+
+ /* apply for 64KB page, page number is chunk_num/32 */
+ page_num = (hwdev->mqm_att.chunk_num + 0x1f) >> 0x5;
+ ret = mqm_eqm_try_alloc_mem(hwdev, 0x40 * 0x400, page_num);
+ if (ret == 0) {
+ sdk_info(hwdev->dev_hdl, "[mqm_eqm_init] Alloc page_size 64K OK\n");
+ return 0;
+ }
+
+ /* apply for 4KB page, page number is chunk_num/2 */
+ page_num = (hwdev->mqm_att.chunk_num + 1) >> 1;
+ ret = mqm_eqm_try_alloc_mem(hwdev, 0x4 * 0x400, page_num);
+ if (ret == 0) {
+ sdk_info(hwdev->dev_hdl, "[mqm_eqm_init] Alloc page_size 4K OK\n");
+ return 0;
+ }
+
+ return ret;
+}
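+
+/* Worked example (hypothetical chunk count): each chunk needs 2KB, so for
+ * chunk_num = 3000 the fallbacks above request 3 pages of 2M, then 94 pages
+ * of 64K, then 1500 pages of 4K.
+ */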
+
+static void mqm_eqm_free_page_mem(struct hinic3_hwdev *hwdev)
+{
+ u32 i;
+ struct hinic3_page_addr *page_addr;
+ u32 page_size;
+
+ page_size = hwdev->mqm_att.page_size;
+ page_addr = hwdev->mqm_att.brm_srch_page_addr;
+
+ for (i = 0; i < hwdev->mqm_att.page_num; i++) {
+ dma_free_coherent(hwdev->dev_hdl, page_size,
+ page_addr->virt_addr, (dma_addr_t)(page_addr->phys_addr));
+ page_addr++;
+ }
+}
+
+static int mqm_eqm_set_cfg_2_hw(struct hinic3_hwdev *hwdev, u8 valid)
+{
+ struct comm_cmd_eqm_cfg info_eqm_cfg;
+ u16 out_size = sizeof(info_eqm_cfg);
+ int err;
+
+ memset(&info_eqm_cfg, 0, sizeof(info_eqm_cfg));
+
+ info_eqm_cfg.host_id = hinic3_host_id(hwdev);
+ info_eqm_cfg.page_size = hwdev->mqm_att.page_size;
+ info_eqm_cfg.valid = valid;
+ err = comm_msg_to_mgmt_sync(hwdev, COMM_MGMT_CMD_SET_MQM_CFG_INFO,
+ &info_eqm_cfg, sizeof(info_eqm_cfg),
+ &info_eqm_cfg, &out_size);
+ if (err || !out_size || info_eqm_cfg.head.status) {
+		sdk_err(hwdev->dev_hdl, "Failed to set mqm cfg info, err: %d, status: 0x%x, out_size: 0x%x\n",
+ err, info_eqm_cfg.head.status, out_size);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+#define EQM_DATA_BUF_SIZE 1024
+#define MQM_ATT_PAGE_NUM 128
+
+static int mqm_eqm_set_page_2_hw(struct hinic3_hwdev *hwdev)
+{
+ struct comm_cmd_eqm_search_gpa *info = NULL;
+ struct hinic3_page_addr *page_addr = NULL;
+ void *send_buf = NULL;
+ u16 send_buf_size;
+ u32 i;
+ u64 *gpa_hi52 = NULL;
+ u64 gpa;
+ u32 num;
+ u32 start_idx;
+ int err = 0;
+ u16 out_size;
+ u8 cmd;
+
+ send_buf_size = sizeof(struct comm_cmd_eqm_search_gpa) +
+ EQM_DATA_BUF_SIZE;
+ send_buf = kzalloc(send_buf_size, GFP_KERNEL);
+ if (!send_buf) {
+		sdk_err(hwdev->dev_hdl, "Alloc virtual mem failed\n");
+ return -EFAULT;
+ }
+
+ page_addr = hwdev->mqm_att.brm_srch_page_addr;
+ info = (struct comm_cmd_eqm_search_gpa *)send_buf;
+
+ gpa_hi52 = info->gpa_hi52;
+ num = 0;
+ start_idx = 0;
+ cmd = COMM_MGMT_CMD_SET_MQM_SRCH_GPA;
+ for (i = 0; i < hwdev->mqm_att.page_num; i++) {
+ /* gpa align to 4K, save gpa[31:12] */
+ gpa = page_addr->phys_addr >> 12;
+ gpa_hi52[num] = gpa;
+ num++;
+ if (num == MQM_ATT_PAGE_NUM) {
+ info->num = num;
+ info->start_idx = start_idx;
+ info->host_id = hinic3_host_id(hwdev);
+ out_size = send_buf_size;
+ err = comm_msg_to_mgmt_sync(hwdev, cmd, info,
+ (u16)send_buf_size,
+ info, &out_size);
+ if (MSG_TO_MGMT_SYNC_RETURN_ERR(err, out_size,
+ info->head.status)) {
+ sdk_err(hwdev->dev_hdl, "Set mqm srch gpa fail, err: %d, status: 0x%x, out_size: 0x%x\n",
+ err, info->head.status, out_size);
+ err = -EFAULT;
+ goto set_page_2_hw_end;
+ }
+
+ gpa_hi52 = info->gpa_hi52;
+ num = 0;
+ start_idx = i + 1;
+ }
+ page_addr++;
+ }
+
+ if (num != 0) {
+ info->num = num;
+ info->start_idx = start_idx;
+ info->host_id = hinic3_host_id(hwdev);
+ out_size = send_buf_size;
+ err = comm_msg_to_mgmt_sync(hwdev, cmd, info,
+ (u16)send_buf_size, info,
+ &out_size);
+ if (MSG_TO_MGMT_SYNC_RETURN_ERR(err, out_size,
+ info->head.status)) {
+ sdk_err(hwdev->dev_hdl, "Set mqm srch gpa fail, err: %d, status: 0x%x, out_size: 0x%x\n",
+ err, info->head.status, out_size);
+ err = -EFAULT;
+ goto set_page_2_hw_end;
+ }
+ }
+
+set_page_2_hw_end:
+ kfree(send_buf);
+ return err;
+}
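+
+/* The loop above batches page GPAs in groups of MQM_ATT_PAGE_NUM (128):
+ * e.g. 300 pages produce three messages covering start_idx 0, 128 and 256,
+ * the last one carrying the 44 remaining entries.
+ */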
+
+static int get_eqm_num(struct hinic3_hwdev *hwdev, struct comm_cmd_get_eqm_num *info_eqm_fix)
+{
+ int ret;
+ u16 len = sizeof(*info_eqm_fix);
+
+ memset(info_eqm_fix, 0, sizeof(*info_eqm_fix));
+
+ ret = comm_msg_to_mgmt_sync(hwdev, COMM_MGMT_CMD_GET_MQM_FIX_INFO,
+ info_eqm_fix, sizeof(*info_eqm_fix), info_eqm_fix, &len);
+ if (ret || !len || info_eqm_fix->head.status) {
+		sdk_err(hwdev->dev_hdl, "Get mqm fix info fail, err: %d, status: 0x%x, out_size: 0x%x\n",
+ ret, info_eqm_fix->head.status, len);
+ return -EFAULT;
+ }
+
+ sdk_info(hwdev->dev_hdl, "get chunk_num: 0x%x, search_gpa_num: 0x%08x\n",
+ info_eqm_fix->chunk_num, info_eqm_fix->search_gpa_num);
+
+ return 0;
+}
+
+static int mqm_eqm_init(struct hinic3_hwdev *hwdev)
+{
+ struct comm_cmd_get_eqm_num info_eqm_fix;
+ int ret;
+
+ if (hwdev->hwif->attr.func_type != TYPE_PPF)
+ return 0;
+
+ ret = get_eqm_num(hwdev, &info_eqm_fix);
+ if (ret)
+ return ret;
+
+ if (!(info_eqm_fix.chunk_num))
+ return 0;
+
+ hwdev->mqm_att.chunk_num = info_eqm_fix.chunk_num;
+ hwdev->mqm_att.search_gpa_num = info_eqm_fix.search_gpa_num;
+ hwdev->mqm_att.page_size = 0;
+ hwdev->mqm_att.page_num = 0;
+
+ hwdev->mqm_att.brm_srch_page_addr =
+ kcalloc(hwdev->mqm_att.chunk_num, sizeof(struct hinic3_page_addr), GFP_KERNEL);
+ if (!(hwdev->mqm_att.brm_srch_page_addr)) {
+		sdk_err(hwdev->dev_hdl, "Alloc virtual mem failed\n");
+ return -EFAULT;
+ }
+
+ ret = mqm_eqm_alloc_page_mem(hwdev);
+ if (ret) {
+		sdk_err(hwdev->dev_hdl, "Alloc eqm page mem failed\n");
+ goto err_page;
+ }
+
+ ret = mqm_eqm_set_page_2_hw(hwdev);
+ if (ret) {
+		sdk_err(hwdev->dev_hdl, "Set page to hw failed\n");
+ goto err_ecmd;
+ }
+
+ ret = mqm_eqm_set_cfg_2_hw(hwdev, 1);
+ if (ret) {
+		sdk_err(hwdev->dev_hdl, "Set cfg to hw failed\n");
+ goto err_ecmd;
+ }
+
+	sdk_info(hwdev->dev_hdl, "ppf_ext_db_init ok\n");
+
+ return 0;
+
+err_ecmd:
+ mqm_eqm_free_page_mem(hwdev);
+
+err_page:
+ kfree(hwdev->mqm_att.brm_srch_page_addr);
+
+ return ret;
+}
+
+static void mqm_eqm_deinit(struct hinic3_hwdev *hwdev)
+{
+ int ret;
+
+ if (hwdev->hwif->attr.func_type != TYPE_PPF)
+ return;
+
+ if (!(hwdev->mqm_att.chunk_num))
+ return;
+
+ mqm_eqm_free_page_mem(hwdev);
+ kfree(hwdev->mqm_att.brm_srch_page_addr);
+
+ ret = mqm_eqm_set_cfg_2_hw(hwdev, 0);
+ if (ret) {
+ sdk_err(hwdev->dev_hdl, "Set mqm eqm cfg to chip fail! err: %d\n",
+ ret);
+ return;
+ }
+
+ hwdev->mqm_att.chunk_num = 0;
+ hwdev->mqm_att.search_gpa_num = 0;
+ hwdev->mqm_att.page_num = 0;
+ hwdev->mqm_att.page_size = 0;
+}
+
+int hinic3_ppf_ext_db_init(struct hinic3_hwdev *hwdev)
+{
+ int ret;
+
+ ret = mqm_eqm_init(hwdev);
+ if (ret) {
+ sdk_err(hwdev->dev_hdl, "MQM eqm init fail!\n");
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+int hinic3_ppf_ext_db_deinit(struct hinic3_hwdev *hwdev)
+{
+ if (!hwdev)
+ return -EINVAL;
+
+ if (hwdev->hwif->attr.func_type != TYPE_PPF)
+ return 0;
+
+ mqm_eqm_deinit(hwdev);
+
+ return 0;
+}
+
+#define HINIC3_FLR_TIMEOUT 1000
+
+static enum hinic3_wait_return check_flr_finish_handler(void *priv_data)
+{
+ struct hinic3_hwif *hwif = priv_data;
+ enum hinic3_pf_status status;
+
+ status = hinic3_get_pf_status(hwif);
+ if (status == HINIC3_PF_STATUS_FLR_FINISH_FLAG) {
+ hinic3_set_pf_status(hwif, HINIC3_PF_STATUS_ACTIVE_FLAG);
+ return WAIT_PROCESS_CPL;
+ }
+
+ return WAIT_PROCESS_WAITING;
+}
+
+static int wait_for_flr_finish(struct hinic3_hwif *hwif)
+{
+ return hinic3_wait_for_timeout(hwif, check_flr_finish_handler,
+ HINIC3_FLR_TIMEOUT, 0xa * USEC_PER_MSEC);
+}
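+
+/* Assuming hinic3_wait_for_timeout() takes its timeout in milliseconds and
+ * its poll interval in microseconds, the call above polls the PF status
+ * roughly every 10 ms for up to 1 second.
+ */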
+
+#define HINIC3_WAIT_CMDQ_IDLE_TIMEOUT 5000
+
+static enum hinic3_wait_return check_cmdq_stop_handler(void *priv_data)
+{
+ struct hinic3_hwdev *hwdev = priv_data;
+ struct hinic3_cmdqs *cmdqs = hwdev->cmdqs;
+ enum hinic3_cmdq_type cmdq_type;
+
+ /* Stop waiting when card unpresent */
+ if (!hwdev->chip_present_flag)
+ return WAIT_PROCESS_CPL;
+
+ cmdq_type = HINIC3_CMDQ_SYNC;
+ for (; cmdq_type < cmdqs->cmdq_num; cmdq_type++) {
+ if (!hinic3_cmdq_idle(&cmdqs->cmdq[cmdq_type]))
+ return WAIT_PROCESS_WAITING;
+ }
+
+ return WAIT_PROCESS_CPL;
+}
+
+static int wait_cmdq_stop(struct hinic3_hwdev *hwdev)
+{
+ enum hinic3_cmdq_type cmdq_type;
+ struct hinic3_cmdqs *cmdqs = hwdev->cmdqs;
+ int err;
+
+ if (!(cmdqs->status & HINIC3_CMDQ_ENABLE))
+ return 0;
+
+ cmdqs->status &= ~HINIC3_CMDQ_ENABLE;
+
+ err = hinic3_wait_for_timeout(hwdev, check_cmdq_stop_handler,
+ HINIC3_WAIT_CMDQ_IDLE_TIMEOUT,
+ USEC_PER_MSEC);
+ if (err == 0)
+ return 0;
+
+ cmdq_type = HINIC3_CMDQ_SYNC;
+ for (; cmdq_type < cmdqs->cmdq_num; cmdq_type++) {
+ if (!hinic3_cmdq_idle(&cmdqs->cmdq[cmdq_type]))
+ sdk_err(hwdev->dev_hdl, "Cmdq %d is busy\n", cmdq_type);
+ }
+
+ cmdqs->status |= HINIC3_CMDQ_ENABLE;
+
+ return err;
+}
+
+static int hinic3_rx_tx_flush(struct hinic3_hwdev *hwdev, u16 channel)
+{
+ struct hinic3_hwif *hwif = hwdev->hwif;
+ struct comm_cmd_clear_doorbell clear_db;
+ struct comm_cmd_clear_resource clr_res;
+ u16 out_size;
+ int err;
+ int ret = 0;
+
+ if (HINIC3_FUNC_TYPE(hwdev) != TYPE_VF)
+ msleep(100); /* wait ucode 100 ms stop I/O */
+
+ err = wait_cmdq_stop(hwdev);
+ if (err != 0) {
+		sdk_warn(hwdev->dev_hdl, "CMDQ is still working, please check that the CMDQ timeout value is reasonable\n");
+ ret = err;
+ }
+
+ hinic3_disable_doorbell(hwif);
+
+ out_size = sizeof(clear_db);
+ memset(&clear_db, 0, sizeof(clear_db));
+ clear_db.func_id = HINIC3_HWIF_GLOBAL_IDX(hwif);
+
+ err = comm_msg_to_mgmt_sync_ch(hwdev, COMM_MGMT_CMD_FLUSH_DOORBELL,
+ &clear_db, sizeof(clear_db),
+ &clear_db, &out_size, channel);
+ if (err != 0 || !out_size || clear_db.head.status) {
+ sdk_warn(hwdev->dev_hdl, "Failed to flush doorbell, err: %d, status: 0x%x, out_size: 0x%x, channel: 0x%x\n",
+ err, clear_db.head.status, out_size, channel);
+ if (err != 0)
+ ret = err;
+ else
+ ret = -EFAULT;
+ }
+
+ if (HINIC3_FUNC_TYPE(hwdev) != TYPE_VF)
+ hinic3_set_pf_status(hwif, HINIC3_PF_STATUS_FLR_START_FLAG);
+ else
+ msleep(100); /* wait ucode 100 ms stop I/O */
+
+ memset(&clr_res, 0, sizeof(clr_res));
+ clr_res.func_id = HINIC3_HWIF_GLOBAL_IDX(hwif);
+
+ err = hinic3_msg_to_mgmt_no_ack(hwdev, HINIC3_MOD_COMM,
+ COMM_MGMT_CMD_START_FLUSH, &clr_res,
+ sizeof(clr_res), channel);
+ if (err != 0) {
+ sdk_warn(hwdev->dev_hdl, "Failed to notice flush message, err: %d, channel: 0x%x\n",
+ err, channel);
+ ret = err;
+ }
+
+ if (HINIC3_FUNC_TYPE(hwdev) != TYPE_VF) {
+ err = wait_for_flr_finish(hwif);
+ if (err != 0) {
+ sdk_warn(hwdev->dev_hdl, "Wait firmware FLR timeout\n");
+ ret = err;
+ }
+ }
+
+ hinic3_enable_doorbell(hwif);
+
+ err = hinic3_reinit_cmdq_ctxts(hwdev);
+ if (err != 0) {
+ sdk_warn(hwdev->dev_hdl, "Failed to reinit cmdq\n");
+ ret = err;
+ }
+
+ return ret;
+}
+
+int hinic3_func_rx_tx_flush(void *hwdev, u16 channel)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ if (dev->chip_present_flag == 0)
+ return 0;
+
+ return hinic3_rx_tx_flush(dev, channel);
+}
+EXPORT_SYMBOL(hinic3_func_rx_tx_flush);
+
+int hinic3_get_board_info(void *hwdev, struct hinic3_board_info *info,
+ u16 channel)
+{
+ struct comm_cmd_board_info board_info;
+ u16 out_size = sizeof(board_info);
+ int err;
+
+ if (!hwdev || !info)
+ return -EINVAL;
+
+ memset(&board_info, 0, sizeof(board_info));
+ err = comm_msg_to_mgmt_sync_ch(hwdev, COMM_MGMT_CMD_GET_BOARD_INFO,
+ &board_info, sizeof(board_info),
+ &board_info, &out_size, channel);
+ if (err || board_info.head.status || !out_size) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Failed to get board info, err: %d, status: 0x%x, out size: 0x%x, channel: 0x%x\n",
+ err, board_info.head.status, out_size, channel);
+ return -EIO;
+ }
+
+ memcpy(info, &board_info.info, sizeof(*info));
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_get_board_info);
+
+int hinic3_get_hw_pf_infos(void *hwdev, struct hinic3_hw_pf_infos *infos,
+ u16 channel)
+{
+ struct comm_cmd_hw_pf_infos *pf_infos = NULL;
+ u16 out_size = sizeof(*pf_infos);
+ int err = 0;
+
+ if (!hwdev || !infos)
+ return -EINVAL;
+
+ pf_infos = kzalloc(sizeof(*pf_infos), GFP_KERNEL);
+ if (!pf_infos)
+ return -ENOMEM;
+
+ err = comm_msg_to_mgmt_sync_ch(hwdev, COMM_MGMT_CMD_GET_HW_PF_INFOS,
+ pf_infos, sizeof(*pf_infos),
+ pf_infos, &out_size, channel);
+ if (pf_infos->head.status || err || !out_size) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Failed to get hw pf information, err: %d, status: 0x%x, out size: 0x%x, channel: 0x%x\n",
+ err, pf_infos->head.status, out_size, channel);
+ err = -EIO;
+ goto free_buf;
+ }
+
+ memcpy(infos, &pf_infos->infos, sizeof(*infos));
+
+free_buf:
+ kfree(pf_infos);
+ return err;
+}
+EXPORT_SYMBOL(hinic3_get_hw_pf_infos);
+
+int hinic3_get_global_attr(void *hwdev, struct comm_global_attr *attr)
+{
+ struct comm_cmd_get_glb_attr get_attr;
+ u16 out_size = sizeof(get_attr);
+ int err = 0;
+
+	memset(&get_attr, 0, sizeof(get_attr));
+
+ err = comm_msg_to_mgmt_sync(hwdev, COMM_MGMT_CMD_GET_GLOBAL_ATTR,
+ &get_attr, sizeof(get_attr), &get_attr,
+ &out_size);
+ if (err || !out_size || get_attr.head.status) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Failed to get global attribute, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, get_attr.head.status, out_size);
+ return -EIO;
+ }
+
+ memcpy(attr, &get_attr.attr, sizeof(*attr));
+
+ return 0;
+}
+
+int hinic3_set_func_svc_used_state(void *hwdev, u16 svc_type, u8 state,
+ u16 channel)
+{
+ struct comm_cmd_func_svc_used_state used_state;
+ u16 out_size = sizeof(used_state);
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ memset(&used_state, 0, sizeof(used_state));
+ used_state.func_id = hinic3_global_func_id(hwdev);
+ used_state.svc_type = svc_type;
+ used_state.used_state = state;
+
+ err = comm_msg_to_mgmt_sync_ch(hwdev,
+ COMM_MGMT_CMD_SET_FUNC_SVC_USED_STATE,
+ &used_state, sizeof(used_state),
+ &used_state, &out_size, channel);
+ if (err || !out_size || used_state.head.status) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+			"Failed to set func service used state, err: %d, status: 0x%x, out size: 0x%x, channel: 0x%x\n",
+ err, used_state.head.status, out_size, channel);
+ return -EIO;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_set_func_svc_used_state);
+
+int hinic3_get_sml_table_info(void *hwdev, u32 tbl_id, u8 *node_id, u8 *instance_id)
+{
+ struct sml_table_id_info sml_table[TABLE_INDEX_MAX];
+ struct comm_cmd_get_sml_tbl_data sml_tbl;
+ u16 out_size = sizeof(sml_tbl);
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ if (tbl_id >= TABLE_INDEX_MAX) {
+		sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl, "sml table index out of range [0, %u]\n",
+ TABLE_INDEX_MAX - 1);
+ return -EINVAL;
+ }
+
+ err = comm_msg_to_mgmt_sync(hwdev, COMM_MGMT_CMD_GET_SML_TABLE_INFO,
+ &sml_tbl, sizeof(sml_tbl), &sml_tbl, &out_size);
+ if (err || !out_size || sml_tbl.head.status) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Failed to get sml table information, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, sml_tbl.head.status, out_size);
+ return -EIO;
+ }
+
+ memcpy(sml_table, sml_tbl.tbl_data, sizeof(sml_table));
+
+ *node_id = sml_table[tbl_id].node_id;
+ *instance_id = sml_table[tbl_id].instance_id;
+
+ return 0;
+}
+
+int hinic3_activate_firmware(void *hwdev, u8 cfg_index)
+{
+ struct hinic3_cmd_activate_firmware activate_msg;
+ u16 out_size = sizeof(activate_msg);
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ if (hinic3_func_type(hwdev) != TYPE_PF)
+ return -EOPNOTSUPP;
+
+ if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ memset(&activate_msg, 0, sizeof(activate_msg));
+ activate_msg.index = cfg_index;
+
+ err = hinic3_pf_to_mgmt_sync(hwdev, HINIC3_MOD_COMM, COMM_MGMT_CMD_ACTIVE_FW,
+ &activate_msg, sizeof(activate_msg),
+ &activate_msg, &out_size, FW_UPDATE_MGMT_TIMEOUT);
+ if (err || !out_size || activate_msg.msg_head.status) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Failed to activate firmware, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, activate_msg.msg_head.status, out_size);
+ err = activate_msg.msg_head.status ? activate_msg.msg_head.status : -EIO;
+ return err;
+ }
+
+ return 0;
+}
+
+int hinic3_switch_config(void *hwdev, u8 cfg_index)
+{
+ struct hinic3_cmd_switch_config switch_cfg;
+ u16 out_size = sizeof(switch_cfg);
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ if (hinic3_func_type(hwdev) != TYPE_PF)
+ return -EOPNOTSUPP;
+
+ if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ memset(&switch_cfg, 0, sizeof(switch_cfg));
+ switch_cfg.index = cfg_index;
+
+ err = hinic3_pf_to_mgmt_sync(hwdev, HINIC3_MOD_COMM, COMM_MGMT_CMD_SWITCH_CFG,
+ &switch_cfg, sizeof(switch_cfg),
+ &switch_cfg, &out_size, FW_UPDATE_MGMT_TIMEOUT);
+ if (err || !out_size || switch_cfg.msg_head.status) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Failed to switch cfg, err: %d, status: 0x%x, out size: 0x%x\n",
+ err, switch_cfg.msg_head.status, out_size);
+ err = switch_cfg.msg_head.status ? switch_cfg.msg_head.status : -EIO;
+ return err;
+ }
+
+ return 0;
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_comm.h b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_comm.h
new file mode 100644
index 000000000000..be9e4a6b24f9
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_comm.h
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_COMM_H
+#define HINIC3_COMM_H
+
+#include <linux/types.h>
+
+#include "comm_msg_intf.h"
+#include "hinic3_hwdev.h"
+
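+/* True when a sync message to the mgmt CPU failed, returned bad status, or no data */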
+#define MSG_TO_MGMT_SYNC_RETURN_ERR(err, out_size, status) \
+ ((err) || (status) || !(out_size))
+
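+/* Encode a page size in bytes as log2 of the number of 4K units expected by hardware */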
+#define HINIC3_PAGE_SIZE_HW(pg_size) ((u8)ilog2((u32)((pg_size) >> 12)))
+
+enum func_tmr_bitmap_status {
+ FUNC_TMR_BITMAP_DISABLE,
+ FUNC_TMR_BITMAP_ENABLE,
+};
+
+enum ppf_tmr_status {
+ HINIC_PPF_TMR_FLAG_STOP,
+ HINIC_PPF_TMR_FLAG_START,
+};
+
+#define HINIC3_HT_GPA_PAGE_SIZE 4096UL
+#define HINIC3_PPF_HT_GPA_SET_RETRY_TIMES 10
+
+int hinic3_set_cmdq_depth(void *hwdev, u16 cmdq_depth);
+
+int hinic3_set_cmdq_ctxt(struct hinic3_hwdev *hwdev, u8 cmdq_id,
+ struct cmdq_ctxt_info *ctxt);
+
+int hinic3_ppf_ext_db_init(struct hinic3_hwdev *hwdev);
+
+int hinic3_ppf_ext_db_deinit(struct hinic3_hwdev *hwdev);
+
+int hinic3_set_ceq_ctrl_reg(struct hinic3_hwdev *hwdev, u16 q_id,
+ u32 ctrl0, u32 ctrl1);
+
+int hinic3_set_dma_attr_tbl(struct hinic3_hwdev *hwdev, u8 entry_idx, u8 st, u8 at, u8 ph,
+ u8 no_snooping, u8 tph_en);
+
+int hinic3_get_comm_features(void *hwdev, u64 *s_feature, u16 size);
+int hinic3_set_comm_features(void *hwdev, u64 *s_feature, u16 size);
+
+int hinic3_comm_channel_detect(struct hinic3_hwdev *hwdev);
+
+int hinic3_get_global_attr(void *hwdev, struct comm_global_attr *attr);
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_mt.c b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_mt.c
new file mode 100644
index 000000000000..79e4dacbd0c9
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_mt.c
@@ -0,0 +1,599 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#include "ossl_knl.h"
+#include "hinic3_mt.h"
+#include "hinic3_crm.h"
+#include "hinic3_hw.h"
+#include "hinic3_comm_cmd.h"
+#include "hinic3_hw_mt.h"
+
+#define HINIC3_CMDQ_BUF_MAX_SIZE 2048U
+#define DW_WIDTH 4
+
+#define MSG_MAX_IN_SIZE (2048 * 1024)
+#define MSG_MAX_OUT_SIZE (2048 * 1024)
+
+/* completion timeout interval, unit is millisecond */
+#define MGMT_MSG_UPDATE_TIMEOUT 50000U
+
+void free_buff_in(void *hwdev, const struct msg_module *nt_msg, void *buf_in)
+{
+ if (!buf_in)
+ return;
+
+ if (nt_msg->module == SEND_TO_NPU)
+ hinic3_free_cmd_buf(hwdev, buf_in);
+ else
+ kfree(buf_in);
+}
+
+void free_buff_out(void *hwdev, struct msg_module *nt_msg,
+ void *buf_out)
+{
+ if (!buf_out)
+ return;
+
+ if (nt_msg->module == SEND_TO_NPU &&
+ !nt_msg->npu_cmd.direct_resp)
+ hinic3_free_cmd_buf(hwdev, buf_out);
+ else
+ kfree(buf_out);
+}
+
+int alloc_buff_in(void *hwdev, struct msg_module *nt_msg,
+ u32 in_size, void **buf_in)
+{
+ void *msg_buf = NULL;
+
+ if (!in_size)
+ return 0;
+
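+	/* NPU commands need DMA-capable cmdq buffers; other targets use plain kzalloc */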
+ if (nt_msg->module == SEND_TO_NPU) {
+ struct hinic3_cmd_buf *cmd_buf = NULL;
+
+ if (in_size > HINIC3_CMDQ_BUF_MAX_SIZE) {
+ pr_err("Cmdq in size(%u) more than 2KB\n", in_size);
+ return -ENOMEM;
+ }
+
+ cmd_buf = hinic3_alloc_cmd_buf(hwdev);
+ if (!cmd_buf) {
+ pr_err("Alloc cmdq cmd buffer failed in %s\n",
+ __func__);
+ return -ENOMEM;
+ }
+ msg_buf = cmd_buf->buf;
+ *buf_in = (void *)cmd_buf;
+ cmd_buf->size = (u16)in_size;
+ } else {
+ if (in_size > MSG_MAX_IN_SIZE) {
+ pr_err("In size(%u) more than 2M\n", in_size);
+ return -ENOMEM;
+ }
+ msg_buf = kzalloc(in_size, GFP_KERNEL);
+ *buf_in = msg_buf;
+ }
+ if (!(*buf_in)) {
+ pr_err("Alloc buffer in failed\n");
+ return -ENOMEM;
+ }
+
+ if (copy_from_user(msg_buf, nt_msg->in_buf, in_size)) {
+ pr_err("%s:%d: Copy from user failed\n",
+ __func__, __LINE__);
+ free_buff_in(hwdev, nt_msg, *buf_in);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+int alloc_buff_out(void *hwdev, struct msg_module *nt_msg,
+ u32 out_size, void **buf_out)
+{
+ if (!out_size)
+ return 0;
+
+ if (nt_msg->module == SEND_TO_NPU &&
+ !nt_msg->npu_cmd.direct_resp) {
+ struct hinic3_cmd_buf *cmd_buf = NULL;
+
+ if (out_size > HINIC3_CMDQ_BUF_MAX_SIZE) {
+ pr_err("Cmdq out size(%u) more than 2KB\n", out_size);
+ return -ENOMEM;
+ }
+
+ cmd_buf = hinic3_alloc_cmd_buf(hwdev);
+ *buf_out = (void *)cmd_buf;
+ } else {
+ if (out_size > MSG_MAX_OUT_SIZE) {
+ pr_err("out size(%u) more than 2M\n", out_size);
+ return -ENOMEM;
+ }
+ *buf_out = kzalloc(out_size, GFP_KERNEL);
+ }
+ if (!(*buf_out)) {
+ pr_err("Alloc buffer out failed\n");
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+int copy_buf_out_to_user(struct msg_module *nt_msg,
+ u32 out_size, void *buf_out)
+{
+ int ret = 0;
+ void *msg_out = NULL;
+
+ if (nt_msg->module == SEND_TO_NPU &&
+ !nt_msg->npu_cmd.direct_resp)
+ msg_out = ((struct hinic3_cmd_buf *)buf_out)->buf;
+ else
+ msg_out = buf_out;
+
+ if (copy_to_user(nt_msg->out_buf, msg_out, out_size))
+ ret = -EFAULT;
+
+ return ret;
+}
+
+int get_func_type(struct hinic3_lld_dev *lld_dev, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size)
+{
+ u16 func_type;
+
+ if (*out_size != sizeof(u16) || !buf_out) {
+ pr_err("Unexpect out buf size from user :%u, expect: %lu\n",
+ *out_size, sizeof(u16));
+ return -EFAULT;
+ }
+
+ func_type = hinic3_func_type(hinic3_get_sdk_hwdev_by_lld(lld_dev));
+
+ *(u16 *)buf_out = func_type;
+ return 0;
+}
+
+int get_func_id(struct hinic3_lld_dev *lld_dev, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size)
+{
+ u16 func_id;
+
+ if (*out_size != sizeof(u16) || !buf_out) {
+ pr_err("Unexpect out buf size from user :%u, expect: %lu\n",
+ *out_size, sizeof(u16));
+ return -EFAULT;
+ }
+
+ func_id = hinic3_global_func_id(hinic3_get_sdk_hwdev_by_lld(lld_dev));
+ *(u16 *)buf_out = func_id;
+
+ return 0;
+}
+
+int get_hw_driver_stats(struct hinic3_lld_dev *lld_dev, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size)
+{
+ return hinic3_dbg_get_hw_stats(hinic3_get_sdk_hwdev_by_lld(lld_dev),
+ buf_out, (u16 *)out_size);
+}
+
+int clear_hw_driver_stats(struct hinic3_lld_dev *lld_dev, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size)
+{
+ u16 size;
+
+ size = hinic3_dbg_clear_hw_stats(hinic3_get_sdk_hwdev_by_lld(lld_dev));
+ if (*out_size != size) {
+ pr_err("Unexpect out buf size from user :%u, expect: %u\n",
+ *out_size, size);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+int get_self_test_result(struct hinic3_lld_dev *lld_dev, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size)
+{
+ u32 result;
+
+ if (*out_size != sizeof(u32) || !buf_out) {
+ pr_err("Unexpect out buf size from user :%u, expect: %lu\n",
+ *out_size, sizeof(u32));
+ return -EFAULT;
+ }
+
+ result = hinic3_get_self_test_result(hinic3_get_sdk_hwdev_by_lld(lld_dev));
+ *(u32 *)buf_out = result;
+
+ return 0;
+}
+
+int get_chip_faults_stats(struct hinic3_lld_dev *lld_dev, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size)
+{
+ u32 offset = 0;
+ struct nic_cmd_chip_fault_stats *fault_info = NULL;
+
+ if (!buf_in || !buf_out || *out_size != sizeof(*fault_info) ||
+ in_size != sizeof(*fault_info)) {
+ pr_err("Unexpect out buf size from user: %d, expect: %lu\n",
+ *out_size, sizeof(*fault_info));
+ return -EFAULT;
+ }
+ fault_info = (struct nic_cmd_chip_fault_stats *)buf_in;
+ offset = fault_info->offset;
+
+ fault_info = (struct nic_cmd_chip_fault_stats *)buf_out;
+ hinic3_get_chip_fault_stats(hinic3_get_sdk_hwdev_by_lld(lld_dev),
+ fault_info->chip_fault_stats, offset);
+
+ return 0;
+}
+
+static u32 get_up_timeout_val(enum hinic3_mod_type mod, u16 cmd)
+{
+ if (mod == HINIC3_MOD_COMM &&
+ (cmd == COMM_MGMT_CMD_UPDATE_FW ||
+ cmd == COMM_MGMT_CMD_UPDATE_BIOS ||
+ cmd == COMM_MGMT_CMD_ACTIVE_FW ||
+ cmd == COMM_MGMT_CMD_SWITCH_CFG ||
+ cmd == COMM_MGMT_CMD_HOT_ACTIVE_FW))
+ return MGMT_MSG_UPDATE_TIMEOUT;
+
+ return 0; /* use default mbox/apichain timeout time */
+}
+
+static int api_csr_read(void *hwdev, struct msg_module *nt_msg,
+ void *buf_in, u32 in_size, void *buf_out, u32 *out_size)
+{
+ struct up_log_msg_st *up_log_msg = (struct up_log_msg_st *)buf_in;
+ int ret = 0;
+ u32 rd_len;
+ u32 rd_addr;
+ u32 rd_cnt = 0;
+ u32 offset = 0;
+ u8 node_id;
+ u32 i;
+
+ if (!buf_in || !buf_out || in_size != sizeof(*up_log_msg) ||
+ *out_size != up_log_msg->rd_len || up_log_msg->rd_len % DW_WIDTH != 0)
+ return -EINVAL;
+
+ rd_len = up_log_msg->rd_len;
+ rd_addr = up_log_msg->addr;
+ node_id = (u8)nt_msg->mpu_cmd.mod;
+
+ rd_cnt = rd_len / DW_WIDTH;
+
+ for (i = 0; i < rd_cnt; i++) {
+ ret = hinic3_api_csr_rd32(hwdev, node_id,
+ rd_addr + offset,
+ (u32 *)(((u8 *)buf_out) + offset));
+ if (ret) {
+ pr_err("Csr rd fail, err: %d, node_id: %u, csr addr: 0x%08x\n",
+ ret, node_id, rd_addr + offset);
+ return ret;
+ }
+ offset += DW_WIDTH;
+ }
+ *out_size = rd_len;
+
+ return ret;
+}
+
+static int api_csr_write(void *hwdev, struct msg_module *nt_msg,
+ void *buf_in, u32 in_size, void *buf_out,
+ u32 *out_size)
+{
+ struct csr_write_st *csr_write_msg = (struct csr_write_st *)buf_in;
+ int ret = 0;
+ u32 rd_len;
+ u32 rd_addr;
+ u32 rd_cnt = 0;
+ u32 offset = 0;
+ u8 node_id;
+ u32 i;
+ u8 *data = NULL;
+
+ if (!buf_in || in_size != sizeof(*csr_write_msg) || csr_write_msg->rd_len % DW_WIDTH != 0)
+ return -EINVAL;
+
+ rd_len = csr_write_msg->rd_len;
+ rd_addr = csr_write_msg->addr;
+ node_id = (u8)nt_msg->mpu_cmd.mod;
+
+ rd_cnt = rd_len / DW_WIDTH;
+
+ data = kzalloc(rd_len, GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+ if (copy_from_user(data, (void *)csr_write_msg->data, rd_len)) {
+ pr_err("Copy information from user failed\n");
+ kfree(data);
+ return -EFAULT;
+ }
+
+ for (i = 0; i < rd_cnt; i++) {
+ ret = hinic3_api_csr_wr32(hwdev, node_id,
+ rd_addr + offset,
+ *((u32 *)(data + offset)));
+ if (ret) {
+ pr_err("Csr wr fail, ret: %d, node_id: %u, csr addr: 0x%08x\n",
+ ret, rd_addr + offset, node_id);
+ kfree(data);
+ return ret;
+ }
+ offset += DW_WIDTH;
+ }
+
+ *out_size = 0;
+ kfree(data);
+ return ret;
+}
+
+int send_to_mpu(void *hwdev, struct msg_module *nt_msg,
+ void *buf_in, u32 in_size, void *buf_out, u32 *out_size)
+{
+ enum hinic3_mod_type mod;
+ u32 timeout;
+ int ret = 0;
+ u16 cmd;
+
+ mod = (enum hinic3_mod_type)nt_msg->mpu_cmd.mod;
+ cmd = nt_msg->mpu_cmd.cmd;
+
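+	/* Three delivery paths: mailbox/CLP, raw CSR access, or the API chain */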
+ if (nt_msg->mpu_cmd.api_type == API_TYPE_MBOX || nt_msg->mpu_cmd.api_type == API_TYPE_CLP) {
+ timeout = get_up_timeout_val(mod, cmd);
+
+ if (nt_msg->mpu_cmd.api_type == API_TYPE_MBOX)
+ ret = hinic3_msg_to_mgmt_sync(hwdev, mod, cmd, buf_in, (u16)in_size,
+ buf_out, (u16 *)out_size, timeout,
+ HINIC3_CHANNEL_DEFAULT);
+ else
+ ret = hinic3_clp_to_mgmt(hwdev, mod, cmd, buf_in, (u16)in_size,
+ buf_out, (u16 *)out_size);
+ if (ret) {
+ pr_err("Message to mgmt cpu return fail, mod: %d, cmd: %u\n", mod, cmd);
+ return ret;
+ }
+ } else if (nt_msg->mpu_cmd.api_type == API_TYPE_API_CHAIN_BYPASS) {
+ if (nt_msg->mpu_cmd.cmd == API_CSR_WRITE)
+ return api_csr_write(hwdev, nt_msg, buf_in, in_size, buf_out, out_size);
+
+ ret = api_csr_read(hwdev, nt_msg, buf_in, in_size, buf_out, out_size);
+ } else if (nt_msg->mpu_cmd.api_type == API_TYPE_API_CHAIN_TO_MPU) {
+ timeout = get_up_timeout_val(mod, cmd);
+ if (hinic3_pcie_itf_id(hwdev) != SPU_HOST_ID)
+ ret = hinic3_msg_to_mgmt_api_chain_sync(hwdev, mod, cmd, buf_in,
+ (u16)in_size, buf_out,
+ (u16 *)out_size, timeout);
+ else
+ ret = hinic3_msg_to_mgmt_sync(hwdev, mod, cmd, buf_in, (u16)in_size,
+ buf_out, (u16 *)out_size, timeout,
+ HINIC3_CHANNEL_DEFAULT);
+ if (ret) {
+ pr_err("Message to mgmt api chain cpu return fail, mod: %d, cmd: %u\n",
+ mod, cmd);
+ return ret;
+ }
+ } else {
+ pr_err("Unsupported api_type %d\n", nt_msg->mpu_cmd.api_type);
+ return -EINVAL;
+ }
+
+ return ret;
+}
+
+int send_to_npu(void *hwdev, struct msg_module *nt_msg,
+ void *buf_in, u32 in_size, void *buf_out, u32 *out_size)
+{
+ int ret = 0;
+ u8 cmd;
+ enum hinic3_mod_type mod;
+
+ mod = (enum hinic3_mod_type)nt_msg->npu_cmd.mod;
+ cmd = nt_msg->npu_cmd.cmd;
+
+ if (nt_msg->npu_cmd.direct_resp) {
+ ret = hinic3_cmdq_direct_resp(hwdev, mod, cmd,
+ buf_in, buf_out, 0,
+ HINIC3_CHANNEL_DEFAULT);
+ if (ret)
+ pr_err("Send direct cmdq failed, err: %d\n", ret);
+ } else {
+ ret = hinic3_cmdq_detail_resp(hwdev, mod, cmd, buf_in, buf_out,
+ NULL, 0, HINIC3_CHANNEL_DEFAULT);
+ if (ret)
+ pr_err("Send detail cmdq failed, err: %d\n", ret);
+ }
+
+ return ret;
+}
+
+static int sm_rd16(void *hwdev, u32 id, u8 instance,
+ u8 node, struct sm_out_st *buf_out)
+{
+ u16 val1;
+ int ret;
+
+ ret = hinic3_sm_ctr_rd16(hwdev, node, instance, id, &val1);
+ if (ret != 0) {
+ pr_err("Get sm ctr information (16 bits)failed!\n");
+ val1 = 0xffff;
+ }
+
+ buf_out->val1 = val1;
+
+ return ret;
+}
+
+static int sm_rd32(void *hwdev, u32 id, u8 instance,
+ u8 node, struct sm_out_st *buf_out)
+{
+ u32 val1;
+ int ret;
+
+ ret = hinic3_sm_ctr_rd32(hwdev, node, instance, id, &val1);
+ if (ret) {
+ pr_err("Get sm ctr information (32 bits)failed!\n");
+ val1 = ~0;
+ }
+
+ buf_out->val1 = val1;
+
+ return ret;
+}
+
+static int sm_rd32_clear(void *hwdev, u32 id, u8 instance,
+ u8 node, struct sm_out_st *buf_out)
+{
+ u32 val1;
+ int ret;
+
+ ret = hinic3_sm_ctr_rd32_clear(hwdev, node, instance, id, &val1);
+ if (ret) {
+ pr_err("Get sm ctr clear information(32 bits) failed!\n");
+ val1 = ~0;
+ }
+
+ buf_out->val1 = val1;
+
+ return ret;
+}
+
+static int sm_rd64_pair(void *hwdev, u32 id, u8 instance,
+ u8 node, struct sm_out_st *buf_out)
+{
+ u64 val1 = 0, val2 = 0;
+ int ret;
+
+ ret = hinic3_sm_ctr_rd64_pair(hwdev, node, instance, id, &val1, &val2);
+ if (ret) {
+ pr_err("Get sm ctr information (64 bits pair)failed!\n");
+ val1 = ~0;
+ val2 = ~0;
+ }
+
+ buf_out->val1 = val1;
+ buf_out->val2 = val2;
+
+ return ret;
+}
+
+static int sm_rd64_pair_clear(void *hwdev, u32 id, u8 instance,
+ u8 node, struct sm_out_st *buf_out)
+{
+ u64 val1 = 0;
+ u64 val2 = 0;
+ int ret;
+
+ ret = hinic3_sm_ctr_rd64_pair_clear(hwdev, node, instance, id, &val1,
+ &val2);
+ if (ret) {
+ pr_err("Get sm ctr clear information(64 bits pair) failed!\n");
+ val1 = ~0;
+ val2 = ~0;
+ }
+
+ buf_out->val1 = val1;
+ buf_out->val2 = val2;
+
+ return ret;
+}
+
+static int sm_rd64(void *hwdev, u32 id, u8 instance,
+ u8 node, struct sm_out_st *buf_out)
+{
+ u64 val1;
+ int ret;
+
+ ret = hinic3_sm_ctr_rd64(hwdev, node, instance, id, &val1);
+ if (ret) {
+ pr_err("Get sm ctr information (64 bits)failed!\n");
+ val1 = ~0;
+ }
+ buf_out->val1 = val1;
+
+ return ret;
+}
+
+static int sm_rd64_clear(void *hwdev, u32 id, u8 instance,
+ u8 node, struct sm_out_st *buf_out)
+{
+ u64 val1;
+ int ret;
+
+ ret = hinic3_sm_ctr_rd64_clear(hwdev, node, instance, id, &val1);
+ if (ret) {
+ pr_err("Get sm ctr clear information(64 bits) failed!\n");
+ val1 = ~0;
+ }
+ buf_out->val1 = val1;
+
+ return ret;
+}
+
+typedef int (*sm_module)(void *hwdev, u32 id, u8 instance,
+ u8 node, struct sm_out_st *buf_out);
+
+struct sm_module_handle {
+ enum sm_cmd_type sm_cmd_name;
+ sm_module sm_func;
+};
+
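+/* Dispatch table mapping each SM counter read command to its handler */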
+const struct sm_module_handle sm_module_cmd_handle[] = {
+ {SM_CTR_RD16, sm_rd16},
+ {SM_CTR_RD32, sm_rd32},
+ {SM_CTR_RD64_PAIR, sm_rd64_pair},
+ {SM_CTR_RD64, sm_rd64},
+ {SM_CTR_RD32_CLEAR, sm_rd32_clear},
+ {SM_CTR_RD64_PAIR_CLEAR, sm_rd64_pair_clear},
+ {SM_CTR_RD64_CLEAR, sm_rd64_clear}
+};
+
+int send_to_sm(void *hwdev, struct msg_module *nt_msg,
+ void *buf_in, u32 in_size, void *buf_out, u32 *out_size)
+{
+ struct sm_in_st *sm_in = buf_in;
+ struct sm_out_st *sm_out = buf_out;
+ u32 msg_formate = nt_msg->msg_formate;
+	int index, num_cmds = ARRAY_LEN(sm_module_cmd_handle);
+ int ret = 0;
+
+ if (!buf_in || !buf_out || in_size != sizeof(*sm_in) || *out_size != sizeof(*sm_out)) {
+ pr_err("Unexpect out buf size :%u, in buf size: %u\n",
+ *out_size, in_size);
+ return -EINVAL;
+ }
+
+ for (index = 0; index < num_cmds; index++) {
+ if (msg_formate != sm_module_cmd_handle[index].sm_cmd_name)
+ continue;
+
+ ret = sm_module_cmd_handle[index].sm_func(hwdev, (u32)sm_in->id,
+ (u8)sm_in->instance,
+ (u8)sm_in->node, sm_out);
+ break;
+ }
+
+ if (index == num_cmds) {
+ pr_err("Can't find callback for %d\n", msg_formate);
+ return -EINVAL;
+ }
+
+ if (ret != 0)
+ pr_err("Get sm information fail, id:%u, instance:%u, node:%u\n",
+ sm_in->id, sm_in->instance, sm_in->node);
+
+ *out_size = sizeof(struct sm_out_st);
+
+ return ret;
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_mt.h b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_mt.h
new file mode 100644
index 000000000000..9330200823b9
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_mt.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_HW_MT_H
+#define HINIC3_HW_MT_H
+
+#include "hinic3_lld.h"
+
+struct sm_in_st {
+ int node;
+ int id;
+ int instance;
+};
+
+struct sm_out_st {
+ u64 val1;
+ u64 val2;
+};
+
+struct up_log_msg_st {
+ u32 rd_len;
+ u32 addr;
+};
+
+struct csr_write_st {
+ u32 rd_len;
+ u32 addr;
+ u8 *data;
+};
+
+int get_func_type(struct hinic3_lld_dev *lld_dev, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size);
+
+int get_func_id(struct hinic3_lld_dev *lld_dev, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size);
+
+int get_hw_driver_stats(struct hinic3_lld_dev *lld_dev, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size);
+
+int clear_hw_driver_stats(struct hinic3_lld_dev *lld_dev, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size);
+
+int get_self_test_result(struct hinic3_lld_dev *lld_dev, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size);
+
+int get_chip_faults_stats(struct hinic3_lld_dev *lld_dev, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hwdev.c b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hwdev.c
new file mode 100644
index 000000000000..2d29290f59e9
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hwdev.c
@@ -0,0 +1,2141 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [COMM] " fmt
+
+#include <linux/time.h>
+#include <linux/timex.h>
+#include <linux/rtc.h>
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/types.h>
+#include <linux/module.h>
+#include <linux/completion.h>
+#include <linux/semaphore.h>
+#include <linux/interrupt.h>
+#include <linux/vmalloc.h>
+
+#include "ossl_knl.h"
+#include "hinic3_mt.h"
+#include "hinic3_crm.h"
+#include "hinic3_hw.h"
+#include "hinic3_common.h"
+#include "hinic3_csr.h"
+#include "hinic3_hwif.h"
+#include "hinic3_eqs.h"
+#include "hinic3_api_cmd.h"
+#include "hinic3_mgmt.h"
+#include "hinic3_mbox.h"
+#include "hinic3_cmdq.h"
+#include "hinic3_hw_cfg.h"
+#include "hinic3_hw_comm.h"
+#include "hinic3_prof_adap.h"
+#include "hinic3_devlink.h"
+#include "hinic3_hwdev.h"
+
+static unsigned int wq_page_order = HINIC3_MAX_WQ_PAGE_SIZE_ORDER;
+module_param(wq_page_order, uint, 0444);
+MODULE_PARM_DESC(wq_page_order, "Set wq page size order, wq page size is 4K * (2 ^ wq_page_order) - default is 8");
+
+enum hinic3_pcie_nosnoop {
+ HINIC3_PCIE_SNOOP = 0,
+ HINIC3_PCIE_NO_SNOOP = 1,
+};
+
+enum hinic3_pcie_tph {
+ HINIC3_PCIE_TPH_DISABLE = 0,
+ HINIC3_PCIE_TPH_ENABLE = 1,
+};
+
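+/* Bit layout of the indirect DMA attribute index register and table entries */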
+#define HINIC3_DMA_ATTR_INDIR_IDX_SHIFT 0
+
+#define HINIC3_DMA_ATTR_INDIR_IDX_MASK 0x3FF
+
+#define HINIC3_DMA_ATTR_INDIR_IDX_SET(val, member) \
+ (((u32)(val) & HINIC3_DMA_ATTR_INDIR_##member##_MASK) << \
+ HINIC3_DMA_ATTR_INDIR_##member##_SHIFT)
+
+#define HINIC3_DMA_ATTR_INDIR_IDX_CLEAR(val, member) \
+ ((val) & (~(HINIC3_DMA_ATTR_INDIR_##member##_MASK \
+ << HINIC3_DMA_ATTR_INDIR_##member##_SHIFT)))
+
+#define HINIC3_DMA_ATTR_ENTRY_ST_SHIFT 0
+#define HINIC3_DMA_ATTR_ENTRY_AT_SHIFT 8
+#define HINIC3_DMA_ATTR_ENTRY_PH_SHIFT 10
+#define HINIC3_DMA_ATTR_ENTRY_NO_SNOOPING_SHIFT 12
+#define HINIC3_DMA_ATTR_ENTRY_TPH_EN_SHIFT 13
+
+#define HINIC3_DMA_ATTR_ENTRY_ST_MASK 0xFF
+#define HINIC3_DMA_ATTR_ENTRY_AT_MASK 0x3
+#define HINIC3_DMA_ATTR_ENTRY_PH_MASK 0x3
+#define HINIC3_DMA_ATTR_ENTRY_NO_SNOOPING_MASK 0x1
+#define HINIC3_DMA_ATTR_ENTRY_TPH_EN_MASK 0x1
+
+#define HINIC3_DMA_ATTR_ENTRY_SET(val, member) \
+ (((u32)(val) & HINIC3_DMA_ATTR_ENTRY_##member##_MASK) << \
+ HINIC3_DMA_ATTR_ENTRY_##member##_SHIFT)
+
+#define HINIC3_DMA_ATTR_ENTRY_CLEAR(val, member) \
+ ((val) & (~(HINIC3_DMA_ATTR_ENTRY_##member##_MASK \
+ << HINIC3_DMA_ATTR_ENTRY_##member##_SHIFT)))
+
+#define HINIC3_PCIE_ST_DISABLE 0
+#define HINIC3_PCIE_AT_DISABLE 0
+#define HINIC3_PCIE_PH_DISABLE 0
+
+#define PCIE_MSIX_ATTR_ENTRY 0
+
+#define HINIC3_CHIP_PRESENT 1
+#define HINIC3_CHIP_ABSENT 0
+
+#define HINIC3_DEAULT_EQ_MSIX_PENDING_LIMIT 0
+#define HINIC3_DEAULT_EQ_MSIX_COALESC_TIMER_CFG 0xFF
+#define HINIC3_DEAULT_EQ_MSIX_RESEND_TIMER_CFG 7
+
+#define HINIC3_HWDEV_WQ_NAME "hinic3_hardware"
+#define HINIC3_WQ_MAX_REQ 10
+
+#define SLAVE_HOST_STATUS_CLEAR(host_id, val) ((val) & (~(1U << (host_id))))
+#define SLAVE_HOST_STATUS_SET(host_id, enable) (((u8)(enable) & 1U) << (host_id))
+#define SLAVE_HOST_STATUS_GET(host_id, val) (!!((val) & (1U << (host_id))))
+
+void set_slave_host_enable(void *hwdev, u8 host_id, bool enable)
+{
+ u32 reg_val;
+ struct hinic3_hwdev *dev = (struct hinic3_hwdev *)hwdev;
+
+ if (HINIC3_FUNC_TYPE(dev) != TYPE_PPF)
+ return;
+
+ reg_val = hinic3_hwif_read_reg(dev->hwif, HINIC3_MULT_HOST_SLAVE_STATUS_ADDR);
+
+ reg_val = SLAVE_HOST_STATUS_CLEAR(host_id, reg_val);
+ reg_val |= SLAVE_HOST_STATUS_SET(host_id, enable);
+ hinic3_hwif_write_reg(dev->hwif, HINIC3_MULT_HOST_SLAVE_STATUS_ADDR, reg_val);
+
+ sdk_info(dev->dev_hdl, "Set slave host %d status %d, reg value: 0x%x\n",
+ host_id, enable, reg_val);
+}
+
+int hinic3_get_slave_host_enable(void *hwdev, u8 host_id, u8 *slave_en)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ u32 reg_val;
+
+ if (HINIC3_FUNC_TYPE(dev) != TYPE_PPF) {
+ sdk_warn(dev->dev_hdl, "hwdev should be ppf\n");
+ return -EINVAL;
+ }
+
+ reg_val = hinic3_hwif_read_reg(dev->hwif, HINIC3_MULT_HOST_SLAVE_STATUS_ADDR);
+ *slave_en = SLAVE_HOST_STATUS_GET(host_id, reg_val);
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_get_slave_host_enable);
+
+int hinic3_get_slave_bitmap(void *hwdev, u8 *slave_host_bitmap)
+{
+ struct hinic3_hwdev *dev = hwdev;
+ struct service_cap *cap = &dev->cfg_mgmt->svc_cap;
+
+ if (HINIC3_FUNC_TYPE(dev) != TYPE_PPF) {
+ sdk_warn(dev->dev_hdl, "hwdev should be ppf\n");
+ return -EINVAL;
+ }
+
+ *slave_host_bitmap = cap->host_valid_bitmap & (~(1U << cap->master_host_id));
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_get_slave_bitmap);
+
+static void set_func_host_mode(struct hinic3_hwdev *hwdev, enum hinic3_func_mode mode)
+{
+ switch (mode) {
+ case FUNC_MOD_MULTI_BM_MASTER:
+ sdk_info(hwdev->dev_hdl, "Detect multi-host BM master host\n");
+ hwdev->func_mode = FUNC_MOD_MULTI_BM_MASTER;
+ break;
+ case FUNC_MOD_MULTI_BM_SLAVE:
+ sdk_info(hwdev->dev_hdl, "Detect multi-host BM slave host\n");
+ hwdev->func_mode = FUNC_MOD_MULTI_BM_SLAVE;
+ break;
+ case FUNC_MOD_MULTI_VM_MASTER:
+ sdk_info(hwdev->dev_hdl, "Detect multi-host VM master host\n");
+ hwdev->func_mode = FUNC_MOD_MULTI_VM_MASTER;
+ break;
+ case FUNC_MOD_MULTI_VM_SLAVE:
+ sdk_info(hwdev->dev_hdl, "Detect multi-host VM slave host\n");
+ hwdev->func_mode = FUNC_MOD_MULTI_VM_SLAVE;
+ break;
+ default:
+ hwdev->func_mode = FUNC_MOD_NORMAL_HOST;
+ break;
+ }
+}
+
+static void hinic3_init_host_mode_pre(struct hinic3_hwdev *hwdev)
+{
+ struct service_cap *cap = &hwdev->cfg_mgmt->svc_cap;
+ u8 host_id = hwdev->hwif->attr.pci_intf_idx;
+
+ if (HINIC3_FUNC_TYPE(hwdev) == TYPE_VF) {
+ set_func_host_mode(hwdev, FUNC_MOD_NORMAL_HOST);
+ return;
+ }
+
+ switch (cap->srv_multi_host_mode) {
+ case HINIC3_SDI_MODE_BM:
+ if (host_id == cap->master_host_id)
+ set_func_host_mode(hwdev, FUNC_MOD_MULTI_BM_MASTER);
+ else
+ set_func_host_mode(hwdev, FUNC_MOD_MULTI_BM_SLAVE);
+ break;
+ case HINIC3_SDI_MODE_VM:
+ if (host_id == cap->master_host_id)
+ set_func_host_mode(hwdev, FUNC_MOD_MULTI_VM_MASTER);
+ else
+ set_func_host_mode(hwdev, FUNC_MOD_MULTI_VM_SLAVE);
+ break;
+ default:
+ set_func_host_mode(hwdev, FUNC_MOD_NORMAL_HOST);
+ break;
+ }
+}
+
+static int hinic3_multi_host_init(struct hinic3_hwdev *hwdev)
+{
+ if (!IS_MULTI_HOST(hwdev) || !HINIC3_IS_PPF(hwdev))
+ return 0;
+
+ if (IS_SLAVE_HOST(hwdev))
+ set_slave_host_enable(hwdev, hinic3_pcie_itf_id(hwdev), true);
+
+ return 0;
+}
+
+static int hinic3_multi_host_free(struct hinic3_hwdev *hwdev)
+{
+ if (!IS_MULTI_HOST(hwdev) || !HINIC3_IS_PPF(hwdev))
+ return 0;
+
+ if (IS_SLAVE_HOST(hwdev))
+ set_slave_host_enable(hwdev, hinic3_pcie_itf_id(hwdev), false);
+
+ return 0;
+}
+
+static u8 hinic3_nic_sw_aeqe_handler(void *hwdev, u8 event, u8 *data)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!dev)
+ return 0;
+
+ sdk_err(dev->dev_hdl, "Received nic ucode aeq event type: 0x%x, data: 0x%llx\n",
+ event, *((u64 *)data));
+
+ if (event < HINIC3_NIC_FATAL_ERROR_MAX)
+ atomic_inc(&dev->hw_stats.nic_ucode_event_stats[event]);
+
+ return 0;
+}
+
+static void hinic3_init_heartbeat_detect(struct hinic3_hwdev *hwdev);
+static void hinic3_destroy_heartbeat_detect(struct hinic3_hwdev *hwdev);
+
+typedef void (*mgmt_event_cb)(void *handle, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size);
+
+struct mgmt_event_handle {
+ u16 cmd;
+ mgmt_event_cb proc;
+};
+
+static int pf_handle_vf_comm_mbox(void *pri_handle,
+ u16 vf_id, u16 cmd, void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size)
+{
+ struct hinic3_hwdev *hwdev = pri_handle;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ sdk_warn(hwdev->dev_hdl, "Unsupported vf mbox event %u to process\n",
+ cmd);
+
+ return 0;
+}
+
+static int vf_handle_pf_comm_mbox(void *pri_handle,
+ u16 cmd, void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size)
+{
+ struct hinic3_hwdev *hwdev = pri_handle;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ sdk_warn(hwdev->dev_hdl, "Unsupported pf mbox event %u to process\n",
+ cmd);
+ return 0;
+}
+
+static void chip_fault_show(struct hinic3_hwdev *hwdev,
+ struct hinic3_fault_event *event)
+{
+ char fault_level[FAULT_LEVEL_MAX][FAULT_SHOW_STR_LEN + 1] = {
+ "fatal", "reset", "host", "flr", "general", "suggestion"};
+ char level_str[FAULT_SHOW_STR_LEN + 1];
+ u8 level;
+
+ memset(level_str, 0, FAULT_SHOW_STR_LEN + 1);
+ level = event->event.chip.err_level;
+ if (level < FAULT_LEVEL_MAX)
+ strncpy(level_str, fault_level[level],
+ FAULT_SHOW_STR_LEN);
+ else
+ strncpy(level_str, "Unknown", FAULT_SHOW_STR_LEN);
+
+ if (level == FAULT_LEVEL_SERIOUS_FLR)
+ dev_err(hwdev->dev_hdl, "err_level: %u [%s], flr func_id: %u\n",
+ level, level_str, event->event.chip.func_id);
+
+ dev_err(hwdev->dev_hdl,
+ "Module_id: 0x%x, err_type: 0x%x, err_level: %u[%s], err_csr_addr: 0x%08x, err_csr_value: 0x%08x\n",
+ event->event.chip.node_id,
+ event->event.chip.err_type, level, level_str,
+ event->event.chip.err_csr_addr,
+ event->event.chip.err_csr_value);
+}
+
+static void fault_report_show(struct hinic3_hwdev *hwdev,
+ struct hinic3_fault_event *event)
+{
+ char fault_type[FAULT_TYPE_MAX][FAULT_SHOW_STR_LEN + 1] = {
+ "chip", "ucode", "mem rd timeout", "mem wr timeout",
+ "reg rd timeout", "reg wr timeout", "phy fault", "tsensor fault"};
+ char type_str[FAULT_SHOW_STR_LEN + 1] = {0};
+ struct fault_event_stats *fault = NULL;
+
+ sdk_err(hwdev->dev_hdl, "Fault event report received, func_id: %u\n",
+ hinic3_global_func_id(hwdev));
+
+ fault = &hwdev->hw_stats.fault_event_stats;
+
+ if (event->type < FAULT_TYPE_MAX) {
+ strncpy(type_str, fault_type[event->type], sizeof(type_str));
+ atomic_inc(&fault->fault_type_stat[event->type]);
+ } else {
+ strncpy(type_str, "Unknown", sizeof(type_str));
+ }
+
+ sdk_err(hwdev->dev_hdl, "Fault type: %u [%s]\n", event->type, type_str);
+	/* val[0]..val[3] are the four 32-bit words of the raw event payload */
+ sdk_err(hwdev->dev_hdl, "Fault val[0]: 0x%08x, val[1]: 0x%08x, val[2]: 0x%08x, val[3]: 0x%08x\n",
+ event->event.val[0x0], event->event.val[0x1],
+ event->event.val[0x2], event->event.val[0x3]);
+
+ hinic3_show_chip_err_info(hwdev);
+
+ switch (event->type) {
+ case FAULT_TYPE_CHIP:
+ chip_fault_show(hwdev, event);
+ break;
+ case FAULT_TYPE_UCODE:
+ sdk_err(hwdev->dev_hdl, "Cause_id: %u, core_id: %u, c_id: %u, epc: 0x%08x\n",
+ event->event.ucode.cause_id, event->event.ucode.core_id,
+ event->event.ucode.c_id, event->event.ucode.epc);
+ break;
+ case FAULT_TYPE_MEM_RD_TIMEOUT:
+ case FAULT_TYPE_MEM_WR_TIMEOUT:
+ sdk_err(hwdev->dev_hdl, "Err_csr_ctrl: 0x%08x, err_csr_data: 0x%08x, ctrl_tab: 0x%08x, mem_index: 0x%08x\n",
+ event->event.mem_timeout.err_csr_ctrl,
+ event->event.mem_timeout.err_csr_data,
+ event->event.mem_timeout.ctrl_tab, event->event.mem_timeout.mem_index);
+ break;
+ case FAULT_TYPE_REG_RD_TIMEOUT:
+ case FAULT_TYPE_REG_WR_TIMEOUT:
+ sdk_err(hwdev->dev_hdl, "Err_csr: 0x%08x\n", event->event.reg_timeout.err_csr);
+ break;
+ case FAULT_TYPE_PHY_FAULT:
+ sdk_err(hwdev->dev_hdl, "Op_type: %u, port_id: %u, dev_ad: %u, csr_addr: 0x%08x, op_data: 0x%08x\n",
+ event->event.phy_fault.op_type, event->event.phy_fault.port_id,
+ event->event.phy_fault.dev_ad, event->event.phy_fault.csr_addr,
+ event->event.phy_fault.op_data);
+ break;
+ default:
+ break;
+ }
+}
+
+static void fault_event_handler(void *dev, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ struct hinic3_cmd_fault_event *fault_event = NULL;
+ struct hinic3_fault_event *fault = NULL;
+ struct hinic3_event_info event_info;
+ struct hinic3_hwdev *hwdev = dev;
+ u8 fault_src = HINIC3_FAULT_SRC_TYPE_MAX;
+ u8 fault_level;
+
+ if (in_size != sizeof(*fault_event)) {
+ sdk_err(hwdev->dev_hdl, "Invalid fault event report, length: %u, should be %ld\n",
+ in_size, sizeof(*fault_event));
+ return;
+ }
+
+ fault_event = buf_in;
+ fault_report_show(hwdev, &fault_event->event);
+
+ if (fault_event->event.type == FAULT_TYPE_CHIP)
+ fault_level = fault_event->event.event.chip.err_level;
+ else
+ fault_level = FAULT_LEVEL_FATAL;
+
+ if (hwdev->event_callback) {
+ event_info.service = EVENT_SRV_COMM;
+ event_info.type = EVENT_COMM_FAULT;
+ fault = (void *)event_info.event_data;
+ memcpy(fault, &fault_event->event, sizeof(struct hinic3_fault_event));
+ fault->fault_level = fault_level;
+ hwdev->event_callback(hwdev->event_pri_handle, &event_info);
+ }
+
+ if (fault_event->event.type <= FAULT_TYPE_REG_WR_TIMEOUT)
+ fault_src = fault_event->event.type;
+ else if (fault_event->event.type == FAULT_TYPE_PHY_FAULT)
+ fault_src = HINIC3_FAULT_SRC_HW_PHY_FAULT;
+
+ hisdk3_fault_post_process(hwdev, fault_src, fault_level);
+}
+
+static void ffm_event_record(struct hinic3_hwdev *dev, struct dbgtool_k_glb_info *dbgtool_info,
+ struct ffm_intr_info *intr)
+{
+ struct rtc_time rctm;
+ struct timeval txc;
+ u32 ffm_idx;
+ u32 last_err_csr_addr;
+ u32 last_err_csr_value;
+
+ ffm_idx = dbgtool_info->ffm->ffm_num;
+ last_err_csr_addr = dbgtool_info->ffm->last_err_csr_addr;
+ last_err_csr_value = dbgtool_info->ffm->last_err_csr_value;
+ if (ffm_idx < FFM_RECORD_NUM_MAX) {
+		if (ffm_idx && intr->err_csr_addr == last_err_csr_addr &&
+		    intr->err_csr_value == last_err_csr_value) {
+ dbgtool_info->ffm->ffm[ffm_idx - 1].times++;
+ sdk_err(dev->dev_hdl, "Receive intr same, ffm_idx: %u\n", ffm_idx - 1);
+ return;
+ }
+ sdk_err(dev->dev_hdl, "Receive intr, ffm_idx: %u\n", ffm_idx);
+
+ dbgtool_info->ffm->ffm[ffm_idx].intr_info.node_id = intr->node_id;
+ dbgtool_info->ffm->ffm[ffm_idx].intr_info.err_level = intr->err_level;
+ dbgtool_info->ffm->ffm[ffm_idx].intr_info.err_type = intr->err_type;
+ dbgtool_info->ffm->ffm[ffm_idx].intr_info.err_csr_addr = intr->err_csr_addr;
+ dbgtool_info->ffm->ffm[ffm_idx].intr_info.err_csr_value = intr->err_csr_value;
+ dbgtool_info->ffm->last_err_csr_addr = intr->err_csr_addr;
+ dbgtool_info->ffm->last_err_csr_value = intr->err_csr_value;
+ dbgtool_info->ffm->ffm[ffm_idx].times = 1;
+
+ /* Obtain the current UTC time */
+ do_gettimeofday(&txc);
+
+		/* Convert to local time (UTC+8) by adding 8 * 60 * 60 seconds before conversion */
+ rtc_time_to_tm((unsigned long)txc.tv_sec + 60 * 60 * 8, &rctm);
+
+ /* tm_year starts from 1900; 0->1900, 1->1901, and so on */
+ dbgtool_info->ffm->ffm[ffm_idx].year = (u16)(rctm.tm_year + 1900);
+ /* tm_mon starts from 0, 0 indicates January, and so on */
+ dbgtool_info->ffm->ffm[ffm_idx].mon = (u8)rctm.tm_mon + 1;
+ dbgtool_info->ffm->ffm[ffm_idx].mday = (u8)rctm.tm_mday;
+ dbgtool_info->ffm->ffm[ffm_idx].hour = (u8)rctm.tm_hour;
+ dbgtool_info->ffm->ffm[ffm_idx].min = (u8)rctm.tm_min;
+ dbgtool_info->ffm->ffm[ffm_idx].sec = (u8)rctm.tm_sec;
+
+ dbgtool_info->ffm->ffm_num++;
+ }
+}
+
+static void ffm_event_msg_handler(void *hwdev, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ struct dbgtool_k_glb_info *dbgtool_info = NULL;
+ struct hinic3_hwdev *dev = hwdev;
+ struct card_node *card_info = NULL;
+ struct ffm_intr_info *intr = NULL;
+
+ if (in_size != sizeof(*intr)) {
+ sdk_err(dev->dev_hdl, "Invalid fault event report, length: %u, should be %ld.\n",
+ in_size, sizeof(*intr));
+ return;
+ }
+
+ intr = buf_in;
+
+ sdk_err(dev->dev_hdl, "node_id: 0x%x, err_type: 0x%x, err_level: %u, err_csr_addr: 0x%08x, err_csr_value: 0x%08x\n",
+ intr->node_id, intr->err_type, intr->err_level,
+ intr->err_csr_addr, intr->err_csr_value);
+
+ hinic3_show_chip_err_info(hwdev);
+
+ card_info = dev->chip_node;
+ dbgtool_info = card_info->dbgtool_info;
+
+ *out_size = sizeof(*intr);
+
+ if (!dbgtool_info)
+ return;
+
+ if (!dbgtool_info->ffm)
+ return;
+
+ ffm_event_record(dev, dbgtool_info, intr);
+}
+
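+/* Highest AArch64 general-purpose register index in the dump (x30 down to x0) */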
+#define X_CSR_INDEX 30
+
+static void sw_watchdog_timeout_info_show(struct hinic3_hwdev *hwdev,
+ void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ struct comm_info_sw_watchdog *watchdog_info = buf_in;
+ u32 stack_len, i, j, tmp;
+ u32 *dump_addr = NULL;
+ u64 *reg = NULL;
+
+ if (in_size != sizeof(*watchdog_info)) {
+ sdk_err(hwdev->dev_hdl, "Invalid mgmt watchdog report, length: %d, should be %ld\n",
+ in_size, sizeof(*watchdog_info));
+ return;
+ }
+
+ sdk_err(hwdev->dev_hdl, "Mgmt deadloop time: 0x%x 0x%x, task id: 0x%x, sp: 0x%llx\n",
+ watchdog_info->curr_time_h, watchdog_info->curr_time_l,
+ watchdog_info->task_id, watchdog_info->sp);
+ sdk_err(hwdev->dev_hdl,
+ "Stack current used: 0x%x, peak used: 0x%x, overflow flag: 0x%x, top: 0x%llx, bottom: 0x%llx\n",
+ watchdog_info->curr_used, watchdog_info->peak_used,
+ watchdog_info->is_overflow, watchdog_info->stack_top, watchdog_info->stack_bottom);
+
+ sdk_err(hwdev->dev_hdl, "Mgmt pc: 0x%llx, elr: 0x%llx, spsr: 0x%llx, far: 0x%llx, esr: 0x%llx, xzr: 0x%llx\n",
+ watchdog_info->pc, watchdog_info->elr, watchdog_info->spsr, watchdog_info->far,
+ watchdog_info->esr, watchdog_info->xzr); /*lint !e10 !e26 */
+
+ sdk_err(hwdev->dev_hdl, "Mgmt register info\n");
+ reg = &watchdog_info->x30;
+ for (i = 0; i <= X_CSR_INDEX; i++)
+ sdk_err(hwdev->dev_hdl, "x%02u:0x%llx\n",
+ X_CSR_INDEX - i, reg[i]); /*lint !e661 !e662 */
+
+ if (watchdog_info->stack_actlen <= DATA_LEN_1K) {
+ stack_len = watchdog_info->stack_actlen;
+ } else {
+ sdk_err(hwdev->dev_hdl, "Oops stack length: 0x%x is wrong\n",
+ watchdog_info->stack_actlen);
+ stack_len = DATA_LEN_1K;
+ }
+
+ sdk_err(hwdev->dev_hdl, "Mgmt dump stack, 16 bytes per line(start from sp)\n");
+ for (i = 0; i < (stack_len / DUMP_16B_PER_LINE); i++) {
+ dump_addr = (u32 *)(watchdog_info->stack_data + (u32)(i * DUMP_16B_PER_LINE));
+ sdk_err(hwdev->dev_hdl, "0x%08x 0x%08x 0x%08x 0x%08x\n",
+ *dump_addr, *(dump_addr + 0x1), *(dump_addr + 0x2), *(dump_addr + 0x3));
+ }
+
+ tmp = (stack_len % DUMP_16B_PER_LINE) / DUMP_4_VAR_PER_LINE;
+ for (j = 0; j < tmp; j++) {
+ dump_addr = (u32 *)(watchdog_info->stack_data +
+ (u32)(i * DUMP_16B_PER_LINE + j * DUMP_4_VAR_PER_LINE));
+ sdk_err(hwdev->dev_hdl, "0x%08x ", *dump_addr);
+ }
+
+ *out_size = sizeof(*watchdog_info);
+ watchdog_info = buf_out;
+ watchdog_info->head.status = 0;
+}
+
+static void mgmt_watchdog_timeout_event_handler(void *hwdev, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ struct hinic3_event_info event_info = { 0 };
+ struct hinic3_hwdev *dev = hwdev;
+
+ sw_watchdog_timeout_info_show(dev, buf_in, in_size, buf_out, out_size);
+
+ if (dev->event_callback) {
+ event_info.type = EVENT_COMM_MGMT_WATCHDOG;
+ dev->event_callback(dev->event_pri_handle, &event_info);
+ }
+}
+
+static void show_exc_info(struct hinic3_hwdev *hwdev, EXC_INFO_S *exc_info)
+{
+ u32 i;
+
+ /* key information */
+ sdk_err(hwdev->dev_hdl, "==================== Exception Info Begin ====================\n");
+ sdk_err(hwdev->dev_hdl, "Exception CpuTick : 0x%08x 0x%08x\n",
+ exc_info->cpu_tick.cnt_hi, exc_info->cpu_tick.cnt_lo);
+ sdk_err(hwdev->dev_hdl, "Exception Cause : %u\n", exc_info->exc_cause);
+ sdk_err(hwdev->dev_hdl, "Os Version : %s\n", exc_info->os_ver);
+ sdk_err(hwdev->dev_hdl, "App Version : %s\n", exc_info->app_ver);
+ sdk_err(hwdev->dev_hdl, "CPU Type : 0x%08x\n", exc_info->cpu_type);
+ sdk_err(hwdev->dev_hdl, "CPU ID : 0x%08x\n", exc_info->cpu_id);
+ sdk_err(hwdev->dev_hdl, "Thread Type : 0x%08x\n", exc_info->thread_type);
+ sdk_err(hwdev->dev_hdl, "Thread ID : 0x%08x\n", exc_info->thread_id);
+ sdk_err(hwdev->dev_hdl, "Byte Order : 0x%08x\n", exc_info->byte_order);
+ sdk_err(hwdev->dev_hdl, "Nest Count : 0x%08x\n", exc_info->nest_cnt);
+ sdk_err(hwdev->dev_hdl, "Fatal Error Num : 0x%08x\n", exc_info->fatal_errno);
+ sdk_err(hwdev->dev_hdl, "Current SP : 0x%016llx\n", exc_info->uw_sp);
+ sdk_err(hwdev->dev_hdl, "Stack Bottom : 0x%016llx\n", exc_info->stack_bottom);
+
+ /* register field */
+ sdk_err(hwdev->dev_hdl, "Register contents when exception occur.\n");
+ sdk_err(hwdev->dev_hdl, "%-14s: 0x%016llx \t %-14s: 0x%016llx\n", "TTBR0",
+ exc_info->reg_info.ttbr0, "TTBR1", exc_info->reg_info.ttbr1);
+ sdk_err(hwdev->dev_hdl, "%-14s: 0x%016llx \t %-14s: 0x%016llx\n", "TCR",
+ exc_info->reg_info.tcr, "MAIR", exc_info->reg_info.mair);
+ sdk_err(hwdev->dev_hdl, "%-14s: 0x%016llx \t %-14s: 0x%016llx\n", "SCTLR",
+ exc_info->reg_info.sctlr, "VBAR", exc_info->reg_info.vbar);
+ sdk_err(hwdev->dev_hdl, "%-14s: 0x%016llx \t %-14s: 0x%016llx\n", "CURRENTE1",
+ exc_info->reg_info.current_el, "SP", exc_info->reg_info.sp);
+ sdk_err(hwdev->dev_hdl, "%-14s: 0x%016llx \t %-14s: 0x%016llx\n", "ELR",
+ exc_info->reg_info.elr, "SPSR", exc_info->reg_info.spsr);
+ sdk_err(hwdev->dev_hdl, "%-14s: 0x%016llx \t %-14s: 0x%016llx\n", "FAR",
+ exc_info->reg_info.far_r, "ESR", exc_info->reg_info.esr);
+ sdk_err(hwdev->dev_hdl, "%-14s: 0x%016llx\n", "XZR", exc_info->reg_info.xzr);
+
+ for (i = 0; i < XREGS_NUM - 1; i += 0x2)
+ sdk_err(hwdev->dev_hdl, "XREGS[%02u]%-5s: 0x%016llx \t XREGS[%02u]%-5s: 0x%016llx",
+ i, " ", exc_info->reg_info.xregs[i],
+ (u32)(i + 0x1U), " ", exc_info->reg_info.xregs[(u32)(i + 0x1U)]);
+
+ sdk_err(hwdev->dev_hdl, "XREGS[%02u]%-5s: 0x%016llx \t ", XREGS_NUM - 1, " ",
+ exc_info->reg_info.xregs[XREGS_NUM - 1]);
+}
+
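+/* Four 32-bit words (16 bytes) of stack data are printed per line */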
+#define FOUR_REG_LEN 16
+
+static void mgmt_lastword_report_event_handler(void *hwdev, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size)
+{
+ comm_info_up_lastword_s *lastword_info = buf_in;
+ EXC_INFO_S *exc_info = &lastword_info->stack_info;
+ u32 stack_len = lastword_info->stack_actlen;
+ struct hinic3_hwdev *dev = hwdev;
+ u32 *curr_reg = NULL;
+ u32 reg_i, cnt;
+
+ if (in_size != sizeof(*lastword_info)) {
+ sdk_err(dev->dev_hdl, "Invalid mgmt lastword, length: %u, should be %ld\n",
+ in_size, sizeof(*lastword_info));
+ return;
+ }
+
+ show_exc_info(dev, exc_info);
+
+ /* call stack dump */
+ sdk_err(dev->dev_hdl, "Dump stack when exceptioin occurs, 16Bytes per line.\n");
+
+ cnt = stack_len / FOUR_REG_LEN;
+ for (reg_i = 0; reg_i < cnt; reg_i++) {
+ curr_reg = (u32 *)(lastword_info->stack_data + ((u64)(u32)(reg_i * FOUR_REG_LEN)));
+ sdk_err(dev->dev_hdl, "0x%08x 0x%08x 0x%08x 0x%08x\n",
+ *curr_reg, *(curr_reg + 0x1), *(curr_reg + 0x2), *(curr_reg + 0x3));
+ }
+
+ sdk_err(dev->dev_hdl, "==================== Exception Info End ====================\n");
+}
+
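+/* Dispatch table for events reported by the management CPU */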
+const struct mgmt_event_handle mgmt_event_proc[] = {
+ {
+ .cmd = COMM_MGMT_CMD_FAULT_REPORT,
+ .proc = fault_event_handler,
+ },
+
+ {
+ .cmd = COMM_MGMT_CMD_FFM_SET,
+ .proc = ffm_event_msg_handler,
+ },
+
+ {
+ .cmd = COMM_MGMT_CMD_WATCHDOG_INFO,
+ .proc = mgmt_watchdog_timeout_event_handler,
+ },
+
+ {
+ .cmd = COMM_MGMT_CMD_LASTWORD_GET,
+ .proc = mgmt_lastword_report_event_handler,
+ },
+};
+
+static void pf_handle_mgmt_comm_event(void *handle, u16 cmd,
+ void *buf_in, u16 in_size, void *buf_out,
+ u16 *out_size)
+{
+ struct hinic3_hwdev *hwdev = handle;
+ u32 i, event_num = ARRAY_LEN(mgmt_event_proc);
+
+ if (!hwdev)
+ return;
+
+ for (i = 0; i < event_num; i++) {
+ if (cmd == mgmt_event_proc[i].cmd) {
+ if (mgmt_event_proc[i].proc)
+ mgmt_event_proc[i].proc(handle, buf_in, in_size,
+ buf_out, out_size);
+
+ return;
+ }
+ }
+
+ sdk_warn(hwdev->dev_hdl, "Unsupported mgmt cpu event %u to process\n",
+ cmd);
+ *out_size = sizeof(struct mgmt_msg_head);
+ ((struct mgmt_msg_head *)buf_out)->status = HINIC3_MGMT_CMD_UNSUPPORTED;
+}
+
+static void hinic3_set_chip_present(struct hinic3_hwdev *hwdev)
+{
+ hwdev->chip_present_flag = HINIC3_CHIP_PRESENT;
+}
+
+static void hinic3_set_chip_absent(struct hinic3_hwdev *hwdev)
+{
+ sdk_err(hwdev->dev_hdl, "Card not present\n");
+ hwdev->chip_present_flag = HINIC3_CHIP_ABSENT;
+}
+
+int hinic3_get_chip_present_flag(const void *hwdev)
+{
+ if (!hwdev)
+ return 0;
+
+ return ((struct hinic3_hwdev *)hwdev)->chip_present_flag;
+}
+EXPORT_SYMBOL(hinic3_get_chip_present_flag);
+
+void hinic3_force_complete_all(void *dev)
+{
+ struct hinic3_recv_msg *recv_resp_msg = NULL;
+ struct hinic3_hwdev *hwdev = dev;
+ struct hinic3_mbox *func_to_func = NULL;
+
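+	/* Wake any waiters on in-flight mgmt, mbox and cmdq requests so they can bail out */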
+ spin_lock_bh(&hwdev->channel_lock);
+ if (hinic3_func_type(hwdev) != TYPE_VF &&
+ test_bit(HINIC3_HWDEV_MGMT_INITED, &hwdev->func_state)) {
+ recv_resp_msg = &hwdev->pf_to_mgmt->recv_resp_msg_from_mgmt;
+ spin_lock_bh(&hwdev->pf_to_mgmt->sync_event_lock);
+ if (hwdev->pf_to_mgmt->event_flag == SEND_EVENT_START) {
+ complete(&recv_resp_msg->recv_done);
+ hwdev->pf_to_mgmt->event_flag = SEND_EVENT_TIMEOUT;
+ }
+ spin_unlock_bh(&hwdev->pf_to_mgmt->sync_event_lock);
+ }
+
+ if (test_bit(HINIC3_HWDEV_MBOX_INITED, &hwdev->func_state)) {
+ func_to_func = hwdev->func_to_func;
+ spin_lock(&func_to_func->mbox_lock);
+ if (func_to_func->event_flag == EVENT_START)
+ func_to_func->event_flag = EVENT_TIMEOUT;
+ spin_unlock(&func_to_func->mbox_lock);
+ }
+
+ if (test_bit(HINIC3_HWDEV_CMDQ_INITED, &hwdev->func_state))
+ hinic3_cmdq_flush_sync_cmd(hwdev);
+
+ spin_unlock_bh(&hwdev->channel_lock);
+}
+EXPORT_SYMBOL(hinic3_force_complete_all);
+
+void hinic3_detect_hw_present(void *hwdev)
+{
+ if (!get_card_present_state((struct hinic3_hwdev *)hwdev)) {
+ hinic3_set_chip_absent(hwdev);
+ hinic3_force_complete_all(hwdev);
+ }
+}
+
+/**
+ * dma_attr_table_init - initialize the default dma attributes
+ * @hwdev: the pointer to hw device
+ */
+static int dma_attr_table_init(struct hinic3_hwdev *hwdev)
+{
+ u32 addr, val, dst_attr;
+
+	/* Indirect access: the entry index must be programmed before the table register */
+ addr = HINIC3_CSR_DMA_ATTR_INDIR_IDX_ADDR;
+ val = hinic3_hwif_read_reg(hwdev->hwif, addr);
+ val = HINIC3_DMA_ATTR_INDIR_IDX_CLEAR(val, IDX);
+
+ val |= HINIC3_DMA_ATTR_INDIR_IDX_SET(PCIE_MSIX_ATTR_ENTRY, IDX);
+
+ hinic3_hwif_write_reg(hwdev->hwif, addr, val);
+
+ wmb(); /* write index before config */
+
+ addr = HINIC3_CSR_DMA_ATTR_TBL_ADDR;
+ val = hinic3_hwif_read_reg(hwdev->hwif, addr);
+
+ dst_attr = HINIC3_DMA_ATTR_ENTRY_SET(HINIC3_PCIE_ST_DISABLE, ST) |
+ HINIC3_DMA_ATTR_ENTRY_SET(HINIC3_PCIE_AT_DISABLE, AT) |
+ HINIC3_DMA_ATTR_ENTRY_SET(HINIC3_PCIE_PH_DISABLE, PH) |
+ HINIC3_DMA_ATTR_ENTRY_SET(HINIC3_PCIE_SNOOP, NO_SNOOPING) |
+ HINIC3_DMA_ATTR_ENTRY_SET(HINIC3_PCIE_TPH_DISABLE, TPH_EN);
+
+ if (val == dst_attr)
+ return 0;
+
+ return hinic3_set_dma_attr_tbl(hwdev, PCIE_MSIX_ATTR_ENTRY, HINIC3_PCIE_ST_DISABLE,
+ HINIC3_PCIE_AT_DISABLE, HINIC3_PCIE_PH_DISABLE,
+ HINIC3_PCIE_SNOOP, HINIC3_PCIE_TPH_DISABLE);
+}
+
+static int init_aeqs_msix_attr(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_aeqs *aeqs = hwdev->aeqs;
+ struct interrupt_info info = {0};
+ struct hinic3_eq *eq = NULL;
+ int q_id;
+ int err;
+
+ info.lli_set = 0;
+ info.interrupt_coalesc_set = 1;
+ info.pending_limt = HINIC3_DEAULT_EQ_MSIX_PENDING_LIMIT;
+ info.coalesc_timer_cfg = HINIC3_DEAULT_EQ_MSIX_COALESC_TIMER_CFG;
+ info.resend_timer_cfg = HINIC3_DEAULT_EQ_MSIX_RESEND_TIMER_CFG;
+
+ for (q_id = aeqs->num_aeqs - 1; q_id >= 0; q_id--) {
+ eq = &aeqs->aeq[q_id];
+ info.msix_index = eq->eq_irq.msix_entry_idx;
+ err = hinic3_set_interrupt_cfg_direct(hwdev, &info,
+ HINIC3_CHANNEL_COMM);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Set msix attr for aeq %d failed\n",
+ q_id);
+ return -EFAULT;
+ }
+ }
+
+ return 0;
+}
+
+static int init_ceqs_msix_attr(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_ceqs *ceqs = hwdev->ceqs;
+ struct interrupt_info info = {0};
+ struct hinic3_eq *eq = NULL;
+ u16 q_id;
+ int err;
+
+ info.lli_set = 0;
+ info.interrupt_coalesc_set = 1;
+ info.pending_limt = HINIC3_DEAULT_EQ_MSIX_PENDING_LIMIT;
+ info.coalesc_timer_cfg = HINIC3_DEAULT_EQ_MSIX_COALESC_TIMER_CFG;
+ info.resend_timer_cfg = HINIC3_DEAULT_EQ_MSIX_RESEND_TIMER_CFG;
+
+ for (q_id = 0; q_id < ceqs->num_ceqs; q_id++) {
+ eq = &ceqs->ceq[q_id];
+ info.msix_index = eq->eq_irq.msix_entry_idx;
+ err = hinic3_set_interrupt_cfg(hwdev, info,
+ HINIC3_CHANNEL_COMM);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Set msix attr for ceq %u failed\n",
+ q_id);
+ return -EFAULT;
+ }
+ }
+
+ return 0;
+}
+
+static int hinic3_comm_clp_to_mgmt_init(struct hinic3_hwdev *hwdev)
+{
+ int err;
+
+ if (hinic3_func_type(hwdev) == TYPE_VF || !COMM_SUPPORT_CLP(hwdev))
+ return 0;
+
+ err = hinic3_clp_pf_to_mgmt_init(hwdev);
+ if (err)
+ return err;
+
+ return 0;
+}
+
+static void hinic3_comm_clp_to_mgmt_free(struct hinic3_hwdev *hwdev)
+{
+ if (hinic3_func_type(hwdev) == TYPE_VF || !COMM_SUPPORT_CLP(hwdev))
+ return;
+
+ hinic3_clp_pf_to_mgmt_free(hwdev);
+}
+
+static int hinic3_comm_aeqs_init(struct hinic3_hwdev *hwdev)
+{
+ struct irq_info aeq_irqs[HINIC3_MAX_AEQS] = {{0} };
+ u16 num_aeqs, resp_num_irq = 0, i;
+ int err;
+
+ num_aeqs = HINIC3_HWIF_NUM_AEQS(hwdev->hwif);
+ if (num_aeqs > HINIC3_MAX_AEQS) {
+ sdk_warn(hwdev->dev_hdl, "Adjust aeq num to %d\n",
+ HINIC3_MAX_AEQS);
+ num_aeqs = HINIC3_MAX_AEQS;
+ }
+ err = hinic3_alloc_irqs(hwdev, SERVICE_T_INTF, num_aeqs, aeq_irqs,
+ &resp_num_irq);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to alloc aeq irqs, num_aeqs: %u\n",
+ num_aeqs);
+ return err;
+ }
+
+ if (resp_num_irq < num_aeqs) {
+ sdk_warn(hwdev->dev_hdl, "Adjust aeq num to %u\n",
+ resp_num_irq);
+ num_aeqs = resp_num_irq;
+ }
+
+ err = hinic3_aeqs_init(hwdev, num_aeqs, aeq_irqs);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init aeqs\n");
+ goto aeqs_init_err;
+ }
+
+ return 0;
+
+aeqs_init_err:
+ for (i = 0; i < num_aeqs; i++)
+ hinic3_free_irq(hwdev, SERVICE_T_INTF, aeq_irqs[i].irq_id);
+
+ return err;
+}
+
+static void hinic3_comm_aeqs_free(struct hinic3_hwdev *hwdev)
+{
+ struct irq_info aeq_irqs[HINIC3_MAX_AEQS] = {{0} };
+ u16 num_irqs, i;
+
+ hinic3_get_aeq_irqs(hwdev, aeq_irqs, &num_irqs);
+
+ hinic3_aeqs_free(hwdev);
+
+ for (i = 0; i < num_irqs; i++)
+ hinic3_free_irq(hwdev, SERVICE_T_INTF, aeq_irqs[i].irq_id);
+}
+
+static int hinic3_comm_ceqs_init(struct hinic3_hwdev *hwdev)
+{
+ struct irq_info ceq_irqs[HINIC3_MAX_CEQS] = {{0} };
+ u16 num_ceqs, resp_num_irq = 0, i;
+ int err;
+
+ num_ceqs = HINIC3_HWIF_NUM_CEQS(hwdev->hwif);
+ if (num_ceqs > HINIC3_MAX_CEQS) {
+ sdk_warn(hwdev->dev_hdl, "Adjust ceq num to %d\n",
+ HINIC3_MAX_CEQS);
+ num_ceqs = HINIC3_MAX_CEQS;
+ }
+
+ err = hinic3_alloc_irqs(hwdev, SERVICE_T_INTF, num_ceqs, ceq_irqs,
+ &resp_num_irq);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to alloc ceq irqs, num_ceqs: %u\n",
+ num_ceqs);
+ return err;
+ }
+
+ if (resp_num_irq < num_ceqs) {
+ sdk_warn(hwdev->dev_hdl, "Adjust ceq num to %u\n",
+ resp_num_irq);
+ num_ceqs = resp_num_irq;
+ }
+
+ err = hinic3_ceqs_init(hwdev, num_ceqs, ceq_irqs);
+ if (err) {
+ sdk_err(hwdev->dev_hdl,
+ "Failed to init ceqs, err:%d\n", err);
+ goto ceqs_init_err;
+ }
+
+ return 0;
+
+ceqs_init_err:
+ for (i = 0; i < num_ceqs; i++)
+ hinic3_free_irq(hwdev, SERVICE_T_INTF, ceq_irqs[i].irq_id);
+
+ return err;
+}
+
+static void hinic3_comm_ceqs_free(struct hinic3_hwdev *hwdev)
+{
+ struct irq_info ceq_irqs[HINIC3_MAX_CEQS] = {{0} };
+ u16 num_irqs;
+ int i;
+
+ hinic3_get_ceq_irqs(hwdev, ceq_irqs, &num_irqs);
+
+ hinic3_ceqs_free(hwdev);
+
+ for (i = 0; i < num_irqs; i++)
+ hinic3_free_irq(hwdev, SERVICE_T_INTF, ceq_irqs[i].irq_id);
+}
+
+static int hinic3_comm_func_to_func_init(struct hinic3_hwdev *hwdev)
+{
+ int err;
+
+ err = hinic3_func_to_func_init(hwdev);
+ if (err)
+ return err;
+
+ hinic3_aeq_register_hw_cb(hwdev, hwdev, HINIC3_MBX_FROM_FUNC,
+ hinic3_mbox_func_aeqe_handler);
+ hinic3_aeq_register_hw_cb(hwdev, hwdev, HINIC3_MSG_FROM_MGMT_CPU,
+ hinic3_mgmt_msg_aeqe_handler);
+
+ if (!HINIC3_IS_VF(hwdev))
+ hinic3_register_pf_mbox_cb(hwdev, HINIC3_MOD_COMM,
+ hwdev,
+ pf_handle_vf_comm_mbox);
+ else
+ hinic3_register_vf_mbox_cb(hwdev, HINIC3_MOD_COMM,
+ hwdev,
+ vf_handle_pf_comm_mbox);
+
+ set_bit(HINIC3_HWDEV_MBOX_INITED, &hwdev->func_state);
+
+ return 0;
+}
+
+static void hinic3_comm_func_to_func_free(struct hinic3_hwdev *hwdev)
+{
+ spin_lock_bh(&hwdev->channel_lock);
+ clear_bit(HINIC3_HWDEV_MBOX_INITED, &hwdev->func_state);
+ spin_unlock_bh(&hwdev->channel_lock);
+
+ hinic3_aeq_unregister_hw_cb(hwdev, HINIC3_MBX_FROM_FUNC);
+
+ if (!HINIC3_IS_VF(hwdev)) {
+ hinic3_unregister_pf_mbox_cb(hwdev, HINIC3_MOD_COMM);
+ } else {
+ hinic3_unregister_vf_mbox_cb(hwdev, HINIC3_MOD_COMM);
+
+ hinic3_aeq_unregister_hw_cb(hwdev, HINIC3_MSG_FROM_MGMT_CPU);
+ }
+
+ hinic3_func_to_func_free(hwdev);
+}
+
+static int hinic3_comm_pf_to_mgmt_init(struct hinic3_hwdev *hwdev)
+{
+ int err;
+
+ if (hinic3_func_type(hwdev) == TYPE_VF)
+ return 0;
+
+ err = hinic3_pf_to_mgmt_init(hwdev);
+ if (err)
+ return err;
+
+ hinic3_register_mgmt_msg_cb(hwdev, HINIC3_MOD_COMM, hwdev,
+ pf_handle_mgmt_comm_event);
+
+ set_bit(HINIC3_HWDEV_MGMT_INITED, &hwdev->func_state);
+
+ return 0;
+}
+
+static void hinic3_comm_pf_to_mgmt_free(struct hinic3_hwdev *hwdev)
+{
+ if (hinic3_func_type(hwdev) == TYPE_VF)
+ return;
+
+ spin_lock_bh(&hwdev->channel_lock);
+ clear_bit(HINIC3_HWDEV_MGMT_INITED, &hwdev->func_state);
+ spin_unlock_bh(&hwdev->channel_lock);
+
+ hinic3_unregister_mgmt_msg_cb(hwdev, HINIC3_MOD_COMM);
+
+ hinic3_aeq_unregister_hw_cb(hwdev, HINIC3_MSG_FROM_MGMT_CPU);
+
+ hinic3_pf_to_mgmt_free(hwdev);
+}
+
+static int hinic3_comm_cmdqs_init(struct hinic3_hwdev *hwdev)
+{
+ int err;
+
+ err = hinic3_cmdqs_init(hwdev);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init cmd queues\n");
+ return err;
+ }
+
+ hinic3_ceq_register_cb(hwdev, hwdev, HINIC3_CMDQ, hinic3_cmdq_ceq_handler);
+
+ err = hinic3_set_cmdq_depth(hwdev, HINIC3_CMDQ_DEPTH);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to set cmdq depth\n");
+ goto set_cmdq_depth_err;
+ }
+
+ set_bit(HINIC3_HWDEV_CMDQ_INITED, &hwdev->func_state);
+
+ return 0;
+
+set_cmdq_depth_err:
+ hinic3_cmdqs_free(hwdev);
+
+ return err;
+}
+
+static void hinic3_comm_cmdqs_free(struct hinic3_hwdev *hwdev)
+{
+ spin_lock_bh(&hwdev->channel_lock);
+ clear_bit(HINIC3_HWDEV_CMDQ_INITED, &hwdev->func_state);
+ spin_unlock_bh(&hwdev->channel_lock);
+
+ hinic3_ceq_unregister_cb(hwdev, HINIC3_CMDQ);
+ hinic3_cmdqs_free(hwdev);
+}
+
+static void hinic3_sync_mgmt_func_state(struct hinic3_hwdev *hwdev)
+{
+ hinic3_set_pf_status(hwdev->hwif, HINIC3_PF_STATUS_ACTIVE_FLAG);
+}
+
+static void hinic3_unsync_mgmt_func_state(struct hinic3_hwdev *hwdev)
+{
+ hinic3_set_pf_status(hwdev->hwif, HINIC3_PF_STATUS_INIT);
+}
+
+static int init_basic_attributes(struct hinic3_hwdev *hwdev)
+{
+ u64 drv_features[COMM_MAX_FEATURE_QWORD] = {HINIC3_DRV_FEATURE_QW0, 0, 0, 0};
+ int err, i;
+
+ if (hinic3_func_type(hwdev) == TYPE_PPF)
+ drv_features[0] |= COMM_F_CHANNEL_DETECT;
+
+ err = hinic3_get_board_info(hwdev, &hwdev->board_info,
+ HINIC3_CHANNEL_COMM);
+ if (err)
+ return err;
+
+ err = hinic3_get_comm_features(hwdev, hwdev->features,
+ COMM_MAX_FEATURE_QWORD);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Get comm features failed\n");
+ return err;
+ }
+
+ sdk_info(hwdev->dev_hdl, "Comm hw features: 0x%llx, drv features: 0x%llx\n",
+ hwdev->features[0], drv_features[0]);
+
+ for (i = 0; i < COMM_MAX_FEATURE_QWORD; i++)
+ hwdev->features[i] &= drv_features[i];
+
+ err = hinic3_get_global_attr(hwdev, &hwdev->glb_attr);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to get global attribute\n");
+ return err;
+ }
+
+ sdk_info(hwdev->dev_hdl,
+ "global attribute: max_host: 0x%x, max_pf: 0x%x, vf_id_start: 0x%x, mgmt node id: 0x%x, cmdq_num: 0x%x\n",
+ hwdev->glb_attr.max_host_num, hwdev->glb_attr.max_pf_num,
+ hwdev->glb_attr.vf_id_start,
+ hwdev->glb_attr.mgmt_host_node_id,
+ hwdev->glb_attr.cmdq_num);
+
+ return 0;
+}
+
+static int init_basic_mgmt_channel(struct hinic3_hwdev *hwdev)
+{
+ int err;
+
+ err = hinic3_comm_aeqs_init(hwdev);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init async event queues\n");
+ return err;
+ }
+
+ err = hinic3_comm_func_to_func_init(hwdev);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init mailbox\n");
+ goto func_to_func_init_err;
+ }
+
+ err = init_aeqs_msix_attr(hwdev);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init aeqs msix attr\n");
+ goto aeqs_msix_attr_init_err;
+ }
+
+ return 0;
+
+aeqs_msix_attr_init_err:
+ hinic3_comm_func_to_func_free(hwdev);
+
+func_to_func_init_err:
+ hinic3_comm_aeqs_free(hwdev);
+
+ return err;
+}
+
+static void free_base_mgmt_channel(struct hinic3_hwdev *hwdev)
+{
+ hinic3_comm_func_to_func_free(hwdev);
+ hinic3_comm_aeqs_free(hwdev);
+}
+
+static int init_pf_mgmt_channel(struct hinic3_hwdev *hwdev)
+{
+ int err;
+
+ err = hinic3_comm_clp_to_mgmt_init(hwdev);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init clp\n");
+ return err;
+ }
+
+ err = hinic3_comm_pf_to_mgmt_init(hwdev);
+ if (err) {
+ hinic3_comm_clp_to_mgmt_free(hwdev);
+ sdk_err(hwdev->dev_hdl, "Failed to init pf to mgmt\n");
+ return err;
+ }
+
+ return 0;
+}
+
+static void free_pf_mgmt_channel(struct hinic3_hwdev *hwdev)
+{
+ hinic3_comm_clp_to_mgmt_free(hwdev);
+ hinic3_comm_pf_to_mgmt_free(hwdev);
+}
+
+static int init_mgmt_channel_post(struct hinic3_hwdev *hwdev)
+{
+ int err;
+
+ /* mbox host channel resources will be freed in
+ * hinic3_func_to_func_free
+ */
+ if (HINIC3_IS_PPF(hwdev)) {
+ err = hinic3_mbox_init_host_msg_channel(hwdev);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init mbox host channel\n");
+ return err;
+ }
+ }
+
+ err = init_pf_mgmt_channel(hwdev);
+ if (err)
+ return err;
+
+ return 0;
+}
+
+static void free_mgmt_msg_channel_post(struct hinic3_hwdev *hwdev)
+{
+ free_pf_mgmt_channel(hwdev);
+}
+
+static int init_cmdqs_channel(struct hinic3_hwdev *hwdev)
+{
+ int err;
+
+ err = dma_attr_table_init(hwdev);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init dma attr table\n");
+ goto dma_attr_init_err;
+ }
+
+ err = hinic3_comm_ceqs_init(hwdev);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init completion event queues\n");
+ goto ceqs_init_err;
+ }
+
+ err = init_ceqs_msix_attr(hwdev);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init ceqs msix attr\n");
+ goto init_ceq_msix_err;
+ }
+
+ /* set default wq page_size */
+ if (wq_page_order > HINIC3_MAX_WQ_PAGE_SIZE_ORDER) {
+		sdk_info(hwdev->dev_hdl, "wq_page_order exceeds limit [0, %d], reset to %d\n",
+ HINIC3_MAX_WQ_PAGE_SIZE_ORDER,
+ HINIC3_MAX_WQ_PAGE_SIZE_ORDER);
+ wq_page_order = HINIC3_MAX_WQ_PAGE_SIZE_ORDER;
+ }
+ hwdev->wq_page_size = HINIC3_HW_WQ_PAGE_SIZE * (1U << wq_page_order);
+ sdk_info(hwdev->dev_hdl, "WQ page size: 0x%x\n", hwdev->wq_page_size);
+ err = hinic3_set_wq_page_size(hwdev, hinic3_global_func_id(hwdev),
+ hwdev->wq_page_size, HINIC3_CHANNEL_COMM);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to set wq page size\n");
+ goto init_wq_pg_size_err;
+ }
+
+ err = hinic3_comm_cmdqs_init(hwdev);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init cmd queues\n");
+ goto cmdq_init_err;
+ }
+
+ return 0;
+
+cmdq_init_err:
+ if (HINIC3_FUNC_TYPE(hwdev) != TYPE_VF)
+ hinic3_set_wq_page_size(hwdev, hinic3_global_func_id(hwdev),
+ HINIC3_HW_WQ_PAGE_SIZE,
+ HINIC3_CHANNEL_COMM);
+init_wq_pg_size_err:
+init_ceq_msix_err:
+ hinic3_comm_ceqs_free(hwdev);
+
+ceqs_init_err:
+dma_attr_init_err:
+
+ return err;
+}
+
+static void hinic3_free_cmdqs_channel(struct hinic3_hwdev *hwdev)
+{
+ hinic3_comm_cmdqs_free(hwdev);
+
+ if (HINIC3_FUNC_TYPE(hwdev) != TYPE_VF)
+ hinic3_set_wq_page_size(hwdev, hinic3_global_func_id(hwdev),
+ HINIC3_HW_WQ_PAGE_SIZE, HINIC3_CHANNEL_COMM);
+
+ hinic3_comm_ceqs_free(hwdev);
+}
+
+static int hinic3_init_comm_ch(struct hinic3_hwdev *hwdev)
+{
+ int err;
+
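+	/* The AEQ/mailbox channel must come up first: every subsequent step
+	 * in this function talks to the management plane through it.
+	 */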
+ err = init_basic_mgmt_channel(hwdev);
+ if (err)
+ return err;
+
+ err = hinic3_func_reset(hwdev, hinic3_global_func_id(hwdev),
+ HINIC3_COMM_RES, HINIC3_CHANNEL_COMM);
+ if (err)
+ goto func_reset_err;
+
+ err = init_basic_attributes(hwdev);
+ if (err)
+ goto init_basic_attr_err;
+
+ err = init_mgmt_channel_post(hwdev);
+ if (err)
+ goto init_mgmt_channel_post_err;
+
+ err = hinic3_set_func_svc_used_state(hwdev, SVC_T_COMM, 1, HINIC3_CHANNEL_COMM);
+ if (err)
+ goto set_used_state_err;
+
+ err = init_cmdqs_channel(hwdev);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init cmdq channel\n");
+ goto init_cmdqs_channel_err;
+ }
+
+ hinic3_sync_mgmt_func_state(hwdev);
+
+ if (HISDK3_F_CHANNEL_LOCK_EN(hwdev)) {
+ hinic3_mbox_enable_channel_lock(hwdev, true);
+ hinic3_cmdq_enable_channel_lock(hwdev, true);
+ }
+
+ err = hinic3_aeq_register_swe_cb(hwdev, hwdev, HINIC3_STATELESS_EVENT,
+ hinic3_nic_sw_aeqe_handler);
+ if (err) {
+ sdk_err(hwdev->dev_hdl,
+ "Failed to register sw aeqe handler\n");
+ goto register_ucode_aeqe_err;
+ }
+
+ return 0;
+
+register_ucode_aeqe_err:
+ hinic3_unsync_mgmt_func_state(hwdev);
+ hinic3_free_cmdqs_channel(hwdev);
+init_cmdqs_channel_err:
+ hinic3_set_func_svc_used_state(hwdev, SVC_T_COMM, 0, HINIC3_CHANNEL_COMM);
+set_used_state_err:
+ free_mgmt_msg_channel_post(hwdev);
+init_mgmt_channel_post_err:
+init_basic_attr_err:
+func_reset_err:
+ free_base_mgmt_channel(hwdev);
+
+ return err;
+}
+
+static void hinic3_uninit_comm_ch(struct hinic3_hwdev *hwdev)
+{
+ hinic3_aeq_unregister_swe_cb(hwdev, HINIC3_STATELESS_EVENT);
+
+ hinic3_unsync_mgmt_func_state(hwdev);
+
+ hinic3_free_cmdqs_channel(hwdev);
+
+ hinic3_set_func_svc_used_state(hwdev, SVC_T_COMM, 0, HINIC3_CHANNEL_COMM);
+
+ free_mgmt_msg_channel_post(hwdev);
+
+ free_base_mgmt_channel(hwdev);
+}
+
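+/* Periodically push the host UTC time to the firmware and re-arm the
+ * delayed work, so the firmware clock tracks the host clock.
+ */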
+static void hinic3_auto_sync_time_work(struct work_struct *work)
+{
+ struct delayed_work *delay = to_delayed_work(work);
+ struct hinic3_hwdev *hwdev = container_of(delay, struct hinic3_hwdev, sync_time_task);
+ int err;
+
+ err = hinic3_sync_time(hwdev, ossl_get_real_time());
+ if (err)
+		sdk_err(hwdev->dev_hdl, "Failed to synchronize UTC time to firmware, err: %d\n",
+			err);
+
+ queue_delayed_work(hwdev->workq, &hwdev->sync_time_task,
+ msecs_to_jiffies(HINIC3_SYNFW_TIME_PERIOD));
+}
+
+static void hinic3_auto_channel_detect_work(struct work_struct *work)
+{
+ struct delayed_work *delay = to_delayed_work(work);
+ struct hinic3_hwdev *hwdev = container_of(delay, struct hinic3_hwdev, channel_detect_task);
+ struct card_node *chip_node = NULL;
+
+ hinic3_comm_channel_detect(hwdev);
+
+ chip_node = hwdev->chip_node;
+ if (!atomic_read(&chip_node->channel_busy_cnt))
+ queue_delayed_work(hwdev->workq, &hwdev->channel_detect_task,
+ msecs_to_jiffies(HINIC3_CHANNEL_DETECT_PERIOD));
+}
+
+static int hinic3_init_ppf_work(struct hinic3_hwdev *hwdev)
+{
+ if (hinic3_func_type(hwdev) != TYPE_PPF)
+ return 0;
+
+ INIT_DELAYED_WORK(&hwdev->sync_time_task, hinic3_auto_sync_time_work);
+ queue_delayed_work(hwdev->workq, &hwdev->sync_time_task,
+ msecs_to_jiffies(HINIC3_SYNFW_TIME_PERIOD));
+
+ if (COMM_SUPPORT_CHANNEL_DETECT(hwdev)) {
+ INIT_DELAYED_WORK(&hwdev->channel_detect_task,
+ hinic3_auto_channel_detect_work);
+ queue_delayed_work(hwdev->workq, &hwdev->channel_detect_task,
+ msecs_to_jiffies(HINIC3_CHANNEL_DETECT_PERIOD));
+ }
+
+	return 0;
+}
+
+static void hinic3_free_ppf_work(struct hinic3_hwdev *hwdev)
+{
+ if (hinic3_func_type(hwdev) != TYPE_PPF)
+ return;
+
+ if (COMM_SUPPORT_CHANNEL_DETECT(hwdev)) {
+ hwdev->features[0] &= ~(COMM_F_CHANNEL_DETECT);
+ cancel_delayed_work_sync(&hwdev->channel_detect_task);
+ }
+
+ cancel_delayed_work_sync(&hwdev->sync_time_task);
+}
+
+static int init_hwdev(struct hinic3_init_para *para)
+{
+ struct hinic3_hwdev *hwdev;
+
+ hwdev = kzalloc(sizeof(*hwdev), GFP_KERNEL);
+ if (!hwdev)
+ return -ENOMEM;
+
+ *para->hwdev = hwdev;
+ hwdev->adapter_hdl = para->adapter_hdl;
+ hwdev->pcidev_hdl = para->pcidev_hdl;
+ hwdev->dev_hdl = para->dev_hdl;
+ hwdev->chip_node = para->chip_node;
+ hwdev->poll = para->poll;
+ hwdev->probe_fault_level = para->probe_fault_level;
+ hwdev->func_state = 0;
+
+ hwdev->chip_fault_stats = vzalloc(HINIC3_CHIP_FAULT_SIZE);
+ if (!hwdev->chip_fault_stats)
+ goto alloc_chip_fault_stats_err;
+
+ hwdev->stateful_ref_cnt = 0;
+ memset(hwdev->features, 0, sizeof(hwdev->features));
+
+ spin_lock_init(&hwdev->channel_lock);
+ mutex_init(&hwdev->stateful_mutex);
+
+ return 0;
+
+alloc_chip_fault_stats_err:
+ para->probe_fault_level = hwdev->probe_fault_level;
+ kfree(hwdev);
+ *para->hwdev = NULL;
+ return -EFAULT;
+}
+
+int hinic3_init_hwdev(struct hinic3_init_para *para)
+{
+ struct hinic3_hwdev *hwdev;
+ int err;
+
+	err = init_hwdev(para);
+ if (err)
+ return err;
+
+ hwdev = *para->hwdev;
+
+ err = hinic3_init_hwif(hwdev, para->cfg_reg_base, para->intr_reg_base, para->mgmt_reg_base,
+ para->db_base_phy, para->db_base, para->db_dwqe_len);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init hwif\n");
+ goto init_hwif_err;
+ }
+
+ hinic3_set_chip_present(hwdev);
+
+ hisdk3_init_profile_adapter(hwdev);
+
+ hwdev->workq = alloc_workqueue(HINIC3_HWDEV_WQ_NAME, WQ_MEM_RECLAIM, HINIC3_WQ_MAX_REQ);
+ if (!hwdev->workq) {
+ sdk_err(hwdev->dev_hdl, "Failed to alloc hardware workq\n");
+ goto alloc_workq_err;
+ }
+
+ hinic3_init_heartbeat_detect(hwdev);
+
+ err = init_cfg_mgmt(hwdev);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init config mgmt\n");
+ goto init_cfg_mgmt_err;
+ }
+
+ err = hinic3_init_comm_ch(hwdev);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init communication channel\n");
+ goto init_comm_ch_err;
+ }
+
+#ifdef HAVE_DEVLINK_FLASH_UPDATE_PARAMS
+ err = hinic3_init_devlink(hwdev);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init devlink\n");
+ goto init_devlink_err;
+ }
+#endif
+
+ err = init_capability(hwdev);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init capability\n");
+ goto init_cap_err;
+ }
+
+ hinic3_init_host_mode_pre(hwdev);
+
+ err = hinic3_multi_host_init(hwdev);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init function mode\n");
+ goto init_multi_host_fail;
+ }
+
+ err = hinic3_init_ppf_work(hwdev);
+ if (err)
+ goto init_ppf_work_fail;
+
+ err = hinic3_set_comm_features(hwdev, hwdev->features, COMM_MAX_FEATURE_QWORD);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to set comm features\n");
+ goto set_feature_err;
+ }
+
+ return 0;
+
+set_feature_err:
+ hinic3_free_ppf_work(hwdev);
+
+init_ppf_work_fail:
+ hinic3_multi_host_free(hwdev);
+
+init_multi_host_fail:
+ free_capability(hwdev);
+
+init_cap_err:
+#ifdef HAVE_DEVLINK_FLASH_UPDATE_PARAMS
+ hinic3_uninit_devlink(hwdev);
+
+init_devlink_err:
+#endif
+ hinic3_uninit_comm_ch(hwdev);
+
+init_comm_ch_err:
+ free_cfg_mgmt(hwdev);
+
+init_cfg_mgmt_err:
+ hinic3_destroy_heartbeat_detect(hwdev);
+ destroy_workqueue(hwdev->workq);
+
+alloc_workq_err:
+ hisdk3_deinit_profile_adapter(hwdev);
+
+ hinic3_free_hwif(hwdev);
+
+init_hwif_err:
+ spin_lock_deinit(&hwdev->channel_lock);
+ vfree(hwdev->chip_fault_stats);
+ para->probe_fault_level = hwdev->probe_fault_level;
+ kfree(hwdev);
+ *para->hwdev = NULL;
+
+ return -EFAULT;
+}
+
+void hinic3_free_hwdev(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+ u64 drv_features[COMM_MAX_FEATURE_QWORD];
+
+ memset(drv_features, 0, sizeof(drv_features));
+ hinic3_set_comm_features(hwdev, drv_features, COMM_MAX_FEATURE_QWORD);
+
+ hinic3_free_ppf_work(dev);
+
+ hinic3_multi_host_free(dev);
+
+ hinic3_func_rx_tx_flush(hwdev, HINIC3_CHANNEL_COMM);
+
+ free_capability(dev);
+
+#ifdef HAVE_DEVLINK_FLASH_UPDATE_PARAMS
+ hinic3_uninit_devlink(dev);
+#endif
+
+ hinic3_uninit_comm_ch(dev);
+
+ free_cfg_mgmt(dev);
+ hinic3_destroy_heartbeat_detect(hwdev);
+ destroy_workqueue(dev->workq);
+
+ hisdk3_deinit_profile_adapter(hwdev);
+ hinic3_free_hwif(dev);
+
+ spin_lock_deinit(&dev->channel_lock);
+ vfree(dev->chip_fault_stats);
+
+ kfree(dev);
+}
+
+void *hinic3_get_pcidev_hdl(void *hwdev)
+{
+ struct hinic3_hwdev *dev = (struct hinic3_hwdev *)hwdev;
+
+ if (!hwdev)
+ return NULL;
+
+ return dev->pcidev_hdl;
+}
+
+int hinic3_register_service_adapter(void *hwdev, void *service_adapter,
+ enum hinic3_service_type type)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev || !service_adapter || type >= SERVICE_T_MAX)
+ return -EINVAL;
+
+ if (dev->service_adapter[type])
+ return -EINVAL;
+
+ dev->service_adapter[type] = service_adapter;
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_register_service_adapter);
+
+void hinic3_unregister_service_adapter(void *hwdev,
+ enum hinic3_service_type type)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev || type >= SERVICE_T_MAX)
+ return;
+
+ dev->service_adapter[type] = NULL;
+}
+EXPORT_SYMBOL(hinic3_unregister_service_adapter);
+
+void *hinic3_get_service_adapter(void *hwdev, enum hinic3_service_type type)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev || type >= SERVICE_T_MAX)
+ return NULL;
+
+ return dev->service_adapter[type];
+}
+EXPORT_SYMBOL(hinic3_get_service_adapter);
+
+int hinic3_dbg_get_hw_stats(const void *hwdev, u8 *hw_stats, const u16 *out_size)
+{
+ struct hinic3_hw_stats *tmp_hw_stats = (struct hinic3_hw_stats *)hw_stats;
+ struct card_node *chip_node = NULL;
+
+	if (!hwdev || !hw_stats || !out_size)
+		return -EINVAL;
+
+	if (*out_size != sizeof(struct hinic3_hw_stats)) {
+		pr_err("Unexpected out buf size from user: %u, expect: %lu\n",
+		       *out_size, sizeof(struct hinic3_hw_stats));
+ return -EFAULT;
+ }
+
+ memcpy(hw_stats, &((struct hinic3_hwdev *)hwdev)->hw_stats,
+ sizeof(struct hinic3_hw_stats));
+
+ chip_node = ((struct hinic3_hwdev *)hwdev)->chip_node;
+
+ atomic_set(&tmp_hw_stats->nic_ucode_event_stats[HINIC3_CHANNEL_BUSY],
+ atomic_read(&chip_node->channel_busy_cnt));
+
+ return 0;
+}
+
+u16 hinic3_dbg_clear_hw_stats(void *hwdev)
+{
+ struct card_node *chip_node = NULL;
+ struct hinic3_hwdev *dev = hwdev;
+
+ memset((void *)&dev->hw_stats, 0, sizeof(struct hinic3_hw_stats));
+ memset((void *)dev->chip_fault_stats, 0, HINIC3_CHIP_FAULT_SIZE);
+
+ chip_node = dev->chip_node;
+ if (COMM_SUPPORT_CHANNEL_DETECT(dev) && atomic_read(&chip_node->channel_busy_cnt)) {
+ atomic_set(&chip_node->channel_busy_cnt, 0);
+ dev->aeq_busy_cnt = 0;
+ queue_delayed_work(dev->workq, &dev->channel_detect_task,
+ msecs_to_jiffies(HINIC3_CHANNEL_DETECT_PERIOD));
+ }
+
+ return sizeof(struct hinic3_hw_stats);
+}
+
+void hinic3_get_chip_fault_stats(const void *hwdev, u8 *chip_fault_stats,
+ u32 offset)
+{
+ if (offset >= HINIC3_CHIP_FAULT_SIZE) {
+		pr_err("Invalid chip offset value: %u\n", offset);
+ return;
+ }
+
+ if (offset + MAX_DRV_BUF_SIZE <= HINIC3_CHIP_FAULT_SIZE)
+ memcpy(chip_fault_stats,
+ ((struct hinic3_hwdev *)hwdev)->chip_fault_stats
+ + offset, MAX_DRV_BUF_SIZE);
+ else
+ memcpy(chip_fault_stats,
+ ((struct hinic3_hwdev *)hwdev)->chip_fault_stats
+ + offset, HINIC3_CHIP_FAULT_SIZE - offset);
+}
+
+void hinic3_event_register(void *dev, void *pri_handle,
+ hinic3_event_handler callback)
+{
+ struct hinic3_hwdev *hwdev = dev;
+
+ if (!dev) {
+ pr_err("Hwdev pointer is NULL for register event\n");
+ return;
+ }
+
+ hwdev->event_callback = callback;
+ hwdev->event_pri_handle = pri_handle;
+}
+
+void hinic3_event_unregister(void *dev)
+{
+ struct hinic3_hwdev *hwdev = dev;
+
+ if (!dev) {
+		pr_err("Hwdev pointer is NULL for unregister event\n");
+ return;
+ }
+
+ hwdev->event_callback = NULL;
+ hwdev->event_pri_handle = NULL;
+}
+
+void hinic3_event_callback(void *hwdev, struct hinic3_event_info *event)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev) {
+ pr_err("Hwdev pointer is NULL for event callback\n");
+ return;
+ }
+
+ if (!dev->event_callback) {
+		sdk_info(dev->dev_hdl, "Event callback function not registered\n");
+ return;
+ }
+
+ dev->event_callback(dev->event_pri_handle, event);
+}
+EXPORT_SYMBOL(hinic3_event_callback);
+
+void hinic3_set_pcie_order_cfg(void *handle)
+{
+}
+
+void hinic3_disable_mgmt_msg_report(void *hwdev)
+{
+ struct hinic3_hwdev *hw_dev = (struct hinic3_hwdev *)hwdev;
+
+ hinic3_set_pf_status(hw_dev->hwif, HINIC3_PF_STATUS_INIT);
+}
+
+void hinic3_record_pcie_error(void *hwdev)
+{
+ struct hinic3_hwdev *dev = (struct hinic3_hwdev *)hwdev;
+
+ if (!hwdev)
+ return;
+
+ atomic_inc(&dev->hw_stats.fault_event_stats.pcie_fault_stats);
+}
+
+bool hinic3_need_init_stateful_default(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+ u16 chip_svc_type = dev->cfg_mgmt->svc_cap.svc_type;
+
+	/* Currently, virtio net has to init cqm in the PPF. */
+	if (hinic3_func_type(hwdev) == TYPE_PPF && (chip_svc_type & CFG_SERVICE_MASK_VIRTIO) != 0)
+		return true;
+
+	/* Other service types init cqm when the ULD calls. */
+	return false;
+}
+
+static inline void stateful_uninit(struct hinic3_hwdev *hwdev)
+{
+ u32 stateful_en;
+
+ stateful_en = IS_FT_TYPE(hwdev) | IS_RDMA_TYPE(hwdev);
+ if (stateful_en)
+ hinic3_ppf_ext_db_deinit(hwdev);
+}
+
+int hinic3_stateful_init(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+ int stateful_en;
+ int err;
+
+ if (!dev)
+ return -EINVAL;
+
+ if (!hinic3_get_stateful_enable(dev))
+ return 0;
+
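+	/* Reference-counted init: only the first caller actually sets up the
+	 * stateful resources; later callers just bump the count.
+	 */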
+ mutex_lock(&dev->stateful_mutex);
+ if (dev->stateful_ref_cnt++) {
+ mutex_unlock(&dev->stateful_mutex);
+ return 0;
+ }
+
+ stateful_en = (int)(IS_FT_TYPE(dev) | IS_RDMA_TYPE(dev));
+ if (stateful_en != 0 && HINIC3_IS_PPF(dev)) {
+ err = hinic3_ppf_ext_db_init(dev);
+ if (err)
+ goto out;
+ }
+
+	mutex_unlock(&dev->stateful_mutex);
+	sdk_info(dev->dev_hdl, "Initialized stateful resources successfully\n");
+
+	return 0;
+
+out:
+ dev->stateful_ref_cnt--;
+ mutex_unlock(&dev->stateful_mutex);
+
+ return err;
+}
+EXPORT_SYMBOL(hinic3_stateful_init);
+
+void hinic3_stateful_deinit(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!dev || !hinic3_get_stateful_enable(dev))
+ return;
+
+ mutex_lock(&dev->stateful_mutex);
+ if (!dev->stateful_ref_cnt || --dev->stateful_ref_cnt) {
+ mutex_unlock(&dev->stateful_mutex);
+ return;
+ }
+
+ stateful_uninit(hwdev);
+ mutex_unlock(&dev->stateful_mutex);
+
+	sdk_info(dev->dev_hdl, "Cleared stateful resources successfully\n");
+}
+EXPORT_SYMBOL(hinic3_stateful_deinit);
+
+void hinic3_free_stateful(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!dev || !hinic3_get_stateful_enable(dev) || !dev->stateful_ref_cnt)
+ return;
+
+ if (!hinic3_need_init_stateful_default(hwdev) || dev->stateful_ref_cnt > 1)
+		sdk_info(dev->dev_hdl, "Current stateful resource ref is incorrect, ref_cnt: %u\n",
+			 dev->stateful_ref_cnt);
+
+ stateful_uninit(hwdev);
+
+	sdk_info(dev->dev_hdl, "Cleared stateful resources successfully\n");
+}
+
+int hinic3_get_card_present_state(void *hwdev, bool *card_present_state)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev || !card_present_state)
+ return -EINVAL;
+
+ *card_present_state = get_card_present_state(dev);
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_get_card_present_state);
+
+void hinic3_link_event_stats(void *dev, u8 link)
+{
+ struct hinic3_hwdev *hwdev = dev;
+
+ if (link)
+ atomic_inc(&hwdev->hw_stats.link_event_stats.link_up_stats);
+ else
+ atomic_inc(&hwdev->hw_stats.link_event_stats.link_down_stats);
+}
+EXPORT_SYMBOL(hinic3_link_event_stats);
+
+u8 hinic3_max_pf_num(void *hwdev)
+{
+ if (!hwdev)
+ return 0;
+
+ return HINIC3_MAX_PF_NUM((struct hinic3_hwdev *)hwdev);
+}
+EXPORT_SYMBOL(hinic3_max_pf_num);
+
+void hinic3_fault_event_report(void *hwdev, u16 src, u16 level)
+{
+ if (!hwdev)
+ return;
+
+ sdk_info(((struct hinic3_hwdev *)hwdev)->dev_hdl, "Fault event report, src: %u, level: %u\n",
+ src, level);
+
+ hisdk3_fault_post_process(hwdev, src, level);
+}
+EXPORT_SYMBOL(hinic3_fault_event_report);
+
+void hinic3_probe_success(void *hwdev)
+{
+ if (!hwdev)
+ return;
+
+ hisdk3_probe_success(hwdev);
+}
+
+#define HINIC3_CHANNEL_BUSY_TIMEOUT 25
+
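+/* Called from the heartbeat timer on the PPF: if the AEQ receive counter
+ * has not advanced for more than HINIC3_CHANNEL_BUSY_TIMEOUT consecutive
+ * ticks, the management channel is assumed stuck and channel_busy_cnt is
+ * raised so that periodic channel detection backs off.
+ */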
+static void hinic3_update_channel_status(struct hinic3_hwdev *hwdev)
+{
+ struct card_node *chip_node = hwdev->chip_node;
+
+ if (!chip_node)
+ return;
+
+ if (hinic3_func_type(hwdev) != TYPE_PPF || !COMM_SUPPORT_CHANNEL_DETECT(hwdev) ||
+ atomic_read(&chip_node->channel_busy_cnt))
+ return;
+
+ if (test_bit(HINIC3_HWDEV_MBOX_INITED, &hwdev->func_state)) {
+ if (hwdev->last_recv_aeq_cnt != hwdev->cur_recv_aeq_cnt) {
+ hwdev->aeq_busy_cnt = 0;
+ hwdev->last_recv_aeq_cnt = hwdev->cur_recv_aeq_cnt;
+ } else {
+ hwdev->aeq_busy_cnt++;
+ }
+
+ if (hwdev->aeq_busy_cnt > HINIC3_CHANNEL_BUSY_TIMEOUT) {
+ atomic_inc(&chip_node->channel_busy_cnt);
+ sdk_err(hwdev->dev_hdl, "Detect channel busy\n");
+ }
+ }
+}
+
+static void hinic3_heartbeat_lost_handler(struct work_struct *work)
+{
+ struct hinic3_event_info event_info = { 0 };
+ struct hinic3_hwdev *hwdev = container_of(work, struct hinic3_hwdev,
+ heartbeat_lost_work);
+ u16 src, level;
+
+ atomic_inc(&hwdev->hw_stats.heart_lost_stats);
+
+ if (hwdev->event_callback) {
+ event_info.service = EVENT_SRV_COMM;
+ event_info.type =
+ hwdev->pcie_link_down ? EVENT_COMM_PCIE_LINK_DOWN :
+ EVENT_COMM_HEART_LOST;
+ hwdev->event_callback(hwdev->event_pri_handle, &event_info);
+ }
+
+ if (hwdev->pcie_link_down) {
+ src = HINIC3_FAULT_SRC_PCIE_LINK_DOWN;
+ level = FAULT_LEVEL_HOST;
+ sdk_err(hwdev->dev_hdl, "Detect pcie is link down\n");
+ } else {
+ src = HINIC3_FAULT_SRC_HOST_HEARTBEAT_LOST;
+ level = FAULT_LEVEL_FATAL;
+ sdk_err(hwdev->dev_hdl, "Heart lost report received, func_id: %d\n",
+ hinic3_global_func_id(hwdev));
+ }
+
+ hinic3_show_chip_err_info(hwdev);
+
+ hisdk3_fault_post_process(hwdev, src, level);
+}
+
+#define DETECT_PCIE_LINK_DOWN_RETRY 2
+#define HINIC3_HEARTBEAT_START_EXPIRE 5000
+#define HINIC3_HEARTBEAT_PERIOD 1000
+
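+/* A single failed BAR read may be transient: only after
+ * DETECT_PCIE_LINK_DOWN_RETRY consecutive failed reads is the PCIe link
+ * treated as down and the chip marked absent.
+ */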
+static bool hinic3_is_hw_abnormal(struct hinic3_hwdev *hwdev)
+{
+ u32 status;
+
+ if (!hinic3_get_chip_present_flag(hwdev))
+ return false;
+
+ status = hinic3_get_heartbeat_status(hwdev);
+ if (status == HINIC3_PCIE_LINK_DOWN) {
+ sdk_warn(hwdev->dev_hdl, "Detect BAR register read failed\n");
+ hwdev->rd_bar_err_cnt++;
+ if (hwdev->rd_bar_err_cnt >= DETECT_PCIE_LINK_DOWN_RETRY) {
+ hinic3_set_chip_absent(hwdev);
+ hinic3_force_complete_all(hwdev);
+ hwdev->pcie_link_down = true;
+ return true;
+ }
+
+ return false;
+ }
+
+ if (status) {
+ hwdev->heartbeat_lost = true;
+ return true;
+ }
+
+ hwdev->rd_bar_err_cnt = 0;
+
+ return false;
+}
+
+#ifdef HAVE_TIMER_SETUP
+static void hinic3_heartbeat_timer_handler(struct timer_list *t)
+#else
+static void hinic3_heartbeat_timer_handler(unsigned long data)
+#endif
+{
+#ifdef HAVE_TIMER_SETUP
+ struct hinic3_hwdev *hwdev = from_timer(hwdev, t, heartbeat_timer);
+#else
+ struct hinic3_hwdev *hwdev = (struct hinic3_hwdev *)data;
+#endif
+
+ if (hinic3_is_hw_abnormal(hwdev)) {
+ stop_timer(&hwdev->heartbeat_timer);
+ queue_work(hwdev->workq, &hwdev->heartbeat_lost_work);
+ } else {
+ mod_timer(&hwdev->heartbeat_timer,
+ jiffies + msecs_to_jiffies(HINIC3_HEARTBEAT_PERIOD));
+ }
+
+ hinic3_update_channel_status(hwdev);
+}
+
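+/* Heartbeat detection: a periodic timer polls the firmware heartbeat
+ * status. On failure the timer is stopped and the heavy-weight recovery
+ * path is deferred to heartbeat_lost_work in process context.
+ */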
+static void hinic3_init_heartbeat_detect(struct hinic3_hwdev *hwdev)
+{
+#ifdef HAVE_TIMER_SETUP
+ timer_setup(&hwdev->heartbeat_timer, hinic3_heartbeat_timer_handler, 0);
+#else
+ initialize_timer(hwdev->adapter_hdl, &hwdev->heartbeat_timer);
+ hwdev->heartbeat_timer.data = (u64)hwdev;
+ hwdev->heartbeat_timer.function = hinic3_heartbeat_timer_handler;
+#endif
+
+ hwdev->heartbeat_timer.expires =
+ jiffies + msecs_to_jiffies(HINIC3_HEARTBEAT_START_EXPIRE);
+
+ add_to_timer(&hwdev->heartbeat_timer, HINIC3_HEARTBEAT_PERIOD);
+
+ INIT_WORK(&hwdev->heartbeat_lost_work, hinic3_heartbeat_lost_handler);
+}
+
+static void hinic3_destroy_heartbeat_detect(struct hinic3_hwdev *hwdev)
+{
+ destroy_work(&hwdev->heartbeat_lost_work);
+ stop_timer(&hwdev->heartbeat_timer);
+ delete_timer(&hwdev->heartbeat_timer);
+}
+
+void hinic3_set_api_stop(void *hwdev)
+{
+ struct hinic3_hwdev *dev = hwdev;
+
+ if (!hwdev)
+ return;
+
+ dev->chip_present_flag = HINIC3_CHIP_ABSENT;
+ sdk_info(dev->dev_hdl, "Set card absent\n");
+ hinic3_force_complete_all(dev);
+ sdk_info(dev->dev_hdl, "All messages interacting with the chip will stop\n");
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hwdev.h b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hwdev.h
new file mode 100644
index 000000000000..9f7d8a4859ec
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hwdev.h
@@ -0,0 +1,175 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_HWDEV_H
+#define HINIC3_HWDEV_H
+
+#include "hinic3_mt.h"
+#include "hinic3_crm.h"
+#include "hinic3_hw.h"
+#include "hinic3_profile.h"
+
+struct cfg_mgmt_info;
+
+struct hinic3_hwif;
+struct hinic3_aeqs;
+struct hinic3_ceqs;
+struct hinic3_mbox;
+struct hinic3_msg_pf_to_mgmt;
+struct hinic3_hwdev;
+
+#define HINIC3_CHANNEL_DETECT_PERIOD (5 * 1000)
+
+struct hinic3_page_addr {
+ void *virt_addr;
+ u64 phys_addr;
+};
+
+struct mqm_addr_trans_tbl_info {
+ u32 chunk_num;
+ u32 search_gpa_num;
+ u32 page_size;
+ u32 page_num;
+ struct hinic3_page_addr *brm_srch_page_addr;
+};
+
+struct hinic3_devlink {
+ struct hinic3_hwdev *hwdev;
+ u8 activate_fw; /* 0 ~ 7 */
+ u8 switch_cfg; /* 0 ~ 7 */
+};
+
+enum hinic3_func_mode {
+ /* single host */
+ FUNC_MOD_NORMAL_HOST,
+ /* multi host, bare-metal, sdi side */
+ FUNC_MOD_MULTI_BM_MASTER,
+ /* multi host, bare-metal, host side */
+ FUNC_MOD_MULTI_BM_SLAVE,
+ /* multi host, vm mode, sdi side */
+ FUNC_MOD_MULTI_VM_MASTER,
+ /* multi host, vm mode, host side */
+ FUNC_MOD_MULTI_VM_SLAVE,
+};
+
+#define IS_BMGW_MASTER_HOST(hwdev) \
+ ((hwdev)->func_mode == FUNC_MOD_MULTI_BM_MASTER)
+#define IS_BMGW_SLAVE_HOST(hwdev) \
+ ((hwdev)->func_mode == FUNC_MOD_MULTI_BM_SLAVE)
+#define IS_VM_MASTER_HOST(hwdev) \
+ ((hwdev)->func_mode == FUNC_MOD_MULTI_VM_MASTER)
+#define IS_VM_SLAVE_HOST(hwdev) \
+ ((hwdev)->func_mode == FUNC_MOD_MULTI_VM_SLAVE)
+
+#define IS_MASTER_HOST(hwdev) \
+ (IS_BMGW_MASTER_HOST(hwdev) || IS_VM_MASTER_HOST(hwdev))
+
+#define IS_SLAVE_HOST(hwdev) \
+ (IS_BMGW_SLAVE_HOST(hwdev) || IS_VM_SLAVE_HOST(hwdev))
+
+#define IS_MULTI_HOST(hwdev) \
+ (IS_BMGW_MASTER_HOST(hwdev) || IS_BMGW_SLAVE_HOST(hwdev) || \
+ IS_VM_MASTER_HOST(hwdev) || IS_VM_SLAVE_HOST(hwdev))
+
+#define NEED_MBOX_FORWARD(hwdev) IS_BMGW_SLAVE_HOST(hwdev)
+
+enum hinic3_host_mode_e {
+ HINIC3_MODE_NORMAL = 0,
+ HINIC3_SDI_MODE_VM,
+ HINIC3_SDI_MODE_BM,
+ HINIC3_SDI_MODE_MAX,
+};
+
+struct hinic3_hwdev {
+ void *adapter_hdl; /* pointer to hinic3_pcidev or NDIS_Adapter */
+ void *pcidev_hdl; /* pointer to pcidev or Handler */
+ void *dev_hdl; /* pointer to pcidev->dev or Handler, for
+ * sdk_err() or dma_alloc()
+ */
+
+ void *service_adapter[SERVICE_T_MAX];
+ void *chip_node;
+ void *ppf_hwdev;
+
+ u32 wq_page_size;
+ int chip_present_flag;
+ bool poll; /* use polling mode or int mode */
+ u32 rsvd1;
+
+ struct hinic3_hwif *hwif; /* include void __iomem *bar */
+ struct comm_global_attr glb_attr;
+ u64 features[COMM_MAX_FEATURE_QWORD];
+
+ struct cfg_mgmt_info *cfg_mgmt;
+
+ struct hinic3_cmdqs *cmdqs;
+ struct hinic3_aeqs *aeqs;
+ struct hinic3_ceqs *ceqs;
+ struct hinic3_mbox *func_to_func;
+ struct hinic3_msg_pf_to_mgmt *pf_to_mgmt;
+ struct hinic3_clp_pf_to_mgmt *clp_pf_to_mgmt;
+
+ void *cqm_hdl;
+ struct mqm_addr_trans_tbl_info mqm_att;
+ struct hinic3_page_addr page_pa0;
+ struct hinic3_page_addr page_pa1;
+ u32 stateful_ref_cnt;
+ u32 rsvd2;
+
+ struct mutex stateful_mutex; /* protect cqm init and deinit */
+
+ struct hinic3_hw_stats hw_stats;
+ u8 *chip_fault_stats;
+
+ hinic3_event_handler event_callback;
+ void *event_pri_handle;
+
+ struct hinic3_board_info board_info;
+
+ struct delayed_work sync_time_task;
+ struct delayed_work channel_detect_task;
+ struct hisdk3_prof_attr *prof_attr;
+ struct hinic3_prof_adapter *prof_adap;
+
+ struct workqueue_struct *workq;
+
+ u32 rd_bar_err_cnt;
+ bool pcie_link_down;
+ bool heartbeat_lost;
+ struct timer_list heartbeat_timer;
+ struct work_struct heartbeat_lost_work;
+
+ ulong func_state;
+ spinlock_t channel_lock; /* protect channel init and deinit */
+
+ u16 probe_fault_level;
+
+ struct hinic3_devlink *devlink_dev;
+
+ enum hinic3_func_mode func_mode;
+ u32 rsvd3;
+
+ u64 cur_recv_aeq_cnt;
+ u64 last_recv_aeq_cnt;
+ u16 aeq_busy_cnt;
+ u64 rsvd4[8];
+};
+
+#define HINIC3_DRV_FEATURE_QW0 \
+ (COMM_F_API_CHAIN | COMM_F_CLP | COMM_F_MBOX_SEGMENT | \
+ COMM_F_CMDQ_NUM | COMM_F_VIRTIO_VQ_SIZE)
+
+#define HINIC3_MAX_HOST_NUM(hwdev) ((hwdev)->glb_attr.max_host_num)
+#define HINIC3_MAX_PF_NUM(hwdev) ((hwdev)->glb_attr.max_pf_num)
+#define HINIC3_MGMT_CPU_NODE_ID(hwdev) ((hwdev)->glb_attr.mgmt_host_node_id)
+
+#define COMM_FEATURE_QW0(hwdev, feature) \
+ ((hwdev)->features[0] & COMM_F_##feature)
+#define COMM_SUPPORT_API_CHAIN(hwdev) COMM_FEATURE_QW0(hwdev, API_CHAIN)
+#define COMM_SUPPORT_CLP(hwdev) COMM_FEATURE_QW0(hwdev, CLP)
+#define COMM_SUPPORT_CHANNEL_DETECT(hwdev) COMM_FEATURE_QW0(hwdev, CHANNEL_DETECT)
+#define COMM_SUPPORT_MBOX_SEGMENT(hwdev) (hinic3_pcie_itf_id(hwdev) == SPU_HOST_ID)
+#define COMM_SUPPORT_CMDQ_NUM(hwdev) COMM_FEATURE_QW0(hwdev, CMDQ_NUM)
+#define COMM_SUPPORT_VIRTIO_VQ_SIZE(hwdev) COMM_FEATURE_QW0(hwdev, VIRTIO_VQ_SIZE)
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hwif.c b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hwif.c
new file mode 100644
index 000000000000..9b749135dbed
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hwif.c
@@ -0,0 +1,994 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [COMM]" fmt
+
+#include <linux/types.h>
+#include <linux/pci.h>
+#include <linux/delay.h>
+#include <linux/module.h>
+
+#include "ossl_knl.h"
+#include "hinic3_csr.h"
+#include "hinic3_crm.h"
+#include "hinic3_hw.h"
+#include "hinic3_common.h"
+#include "hinic3_hwdev.h"
+#include "hinic3_hwif.h"
+
+#ifndef CONFIG_MODULE_PROF
+#define WAIT_HWIF_READY_TIMEOUT 10000
+#else
+#define WAIT_HWIF_READY_TIMEOUT 30000
+#endif
+
+#define HINIC3_WAIT_DOORBELL_AND_OUTBOUND_TIMEOUT 60000
+
+#define MAX_MSIX_ENTRY 2048
+
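+/* Map a doorbell virtual address back to its page index within the
+ * doorbell region.
+ */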
+#define DB_IDX(db, db_base) \
+ ((u32)(((ulong)(db) - (ulong)(db_base)) / \
+ HINIC3_DB_PAGE_SIZE))
+
+#define HINIC3_AF0_FUNC_GLOBAL_IDX_SHIFT 0
+#define HINIC3_AF0_P2P_IDX_SHIFT 12
+#define HINIC3_AF0_PCI_INTF_IDX_SHIFT 17
+#define HINIC3_AF0_VF_IN_PF_SHIFT 20
+#define HINIC3_AF0_FUNC_TYPE_SHIFT 28
+
+#define HINIC3_AF0_FUNC_GLOBAL_IDX_MASK 0xFFF
+#define HINIC3_AF0_P2P_IDX_MASK 0x1F
+#define HINIC3_AF0_PCI_INTF_IDX_MASK 0x7
+#define HINIC3_AF0_VF_IN_PF_MASK 0xFF
+#define HINIC3_AF0_FUNC_TYPE_MASK 0x1
+
+#define HINIC3_AF0_GET(val, member) \
+ (((val) >> HINIC3_AF0_##member##_SHIFT) & HINIC3_AF0_##member##_MASK)
+
+#define HINIC3_AF1_PPF_IDX_SHIFT 0
+#define HINIC3_AF1_AEQS_PER_FUNC_SHIFT 8
+#define HINIC3_AF1_MGMT_INIT_STATUS_SHIFT 30
+#define HINIC3_AF1_PF_INIT_STATUS_SHIFT 31
+
+#define HINIC3_AF1_PPF_IDX_MASK 0x3F
+#define HINIC3_AF1_AEQS_PER_FUNC_MASK 0x3
+#define HINIC3_AF1_MGMT_INIT_STATUS_MASK 0x1
+#define HINIC3_AF1_PF_INIT_STATUS_MASK 0x1
+
+#define HINIC3_AF1_GET(val, member) \
+ (((val) >> HINIC3_AF1_##member##_SHIFT) & HINIC3_AF1_##member##_MASK)
+
+#define HINIC3_AF2_CEQS_PER_FUNC_SHIFT 0
+#define HINIC3_AF2_DMA_ATTR_PER_FUNC_SHIFT 9
+#define HINIC3_AF2_IRQS_PER_FUNC_SHIFT 16
+
+#define HINIC3_AF2_CEQS_PER_FUNC_MASK 0x1FF
+#define HINIC3_AF2_DMA_ATTR_PER_FUNC_MASK 0x7
+#define HINIC3_AF2_IRQS_PER_FUNC_MASK 0x7FF
+
+#define HINIC3_AF2_GET(val, member) \
+ (((val) >> HINIC3_AF2_##member##_SHIFT) & HINIC3_AF2_##member##_MASK)
+
+#define HINIC3_AF3_GLOBAL_VF_ID_OF_NXT_PF_SHIFT 0
+#define HINIC3_AF3_GLOBAL_VF_ID_OF_PF_SHIFT 16
+
+#define HINIC3_AF3_GLOBAL_VF_ID_OF_NXT_PF_MASK 0xFFF
+#define HINIC3_AF3_GLOBAL_VF_ID_OF_PF_MASK 0xFFF
+
+#define HINIC3_AF3_GET(val, member) \
+ (((val) >> HINIC3_AF3_##member##_SHIFT) & HINIC3_AF3_##member##_MASK)
+
+#define HINIC3_AF4_DOORBELL_CTRL_SHIFT 0
+#define HINIC3_AF4_DOORBELL_CTRL_MASK 0x1
+
+#define HINIC3_AF4_GET(val, member) \
+ (((val) >> HINIC3_AF4_##member##_SHIFT) & HINIC3_AF4_##member##_MASK)
+
+#define HINIC3_AF4_SET(val, member) \
+ (((val) & HINIC3_AF4_##member##_MASK) << HINIC3_AF4_##member##_SHIFT)
+
+#define HINIC3_AF4_CLEAR(val, member) \
+ ((val) & (~(HINIC3_AF4_##member##_MASK << HINIC3_AF4_##member##_SHIFT)))
+
+#define HINIC3_AF5_OUTBOUND_CTRL_SHIFT 0
+#define HINIC3_AF5_OUTBOUND_CTRL_MASK 0x1
+
+#define HINIC3_AF5_GET(val, member) \
+ (((val) >> HINIC3_AF5_##member##_SHIFT) & HINIC3_AF5_##member##_MASK)
+
+#define HINIC3_AF5_SET(val, member) \
+ (((val) & HINIC3_AF5_##member##_MASK) << HINIC3_AF5_##member##_SHIFT)
+
+#define HINIC3_AF5_CLEAR(val, member) \
+ ((val) & (~(HINIC3_AF5_##member##_MASK << HINIC3_AF5_##member##_SHIFT)))
+
+#define HINIC3_AF6_PF_STATUS_SHIFT 0
+#define HINIC3_AF6_PF_STATUS_MASK 0xFFFF
+
+#define HINIC3_AF6_FUNC_MAX_SQ_SHIFT 23
+#define HINIC3_AF6_FUNC_MAX_SQ_MASK 0x1FF
+
+#define HINIC3_AF6_MSIX_FLEX_EN_SHIFT 22
+#define HINIC3_AF6_MSIX_FLEX_EN_MASK 0x1
+
+#define HINIC3_AF6_SET(val, member) \
+ ((((u32)(val)) & HINIC3_AF6_##member##_MASK) << \
+ HINIC3_AF6_##member##_SHIFT)
+
+#define HINIC3_AF6_GET(val, member) \
+ (((u32)(val) >> HINIC3_AF6_##member##_SHIFT) & HINIC3_AF6_##member##_MASK)
+
+#define HINIC3_AF6_CLEAR(val, member) \
+ ((u32)(val) & (~(HINIC3_AF6_##member##_MASK << \
+ HINIC3_AF6_##member##_SHIFT)))
+
+#define HINIC3_PPF_ELECT_PORT_IDX_SHIFT 0
+
+#define HINIC3_PPF_ELECT_PORT_IDX_MASK 0x3F
+
+#define HINIC3_PPF_ELECT_PORT_GET(val, member) \
+ (((val) >> HINIC3_PPF_ELECT_PORT_##member##_SHIFT) & \
+ HINIC3_PPF_ELECT_PORT_##member##_MASK)
+
+#define HINIC3_PPF_ELECTION_IDX_SHIFT 0
+
+#define HINIC3_PPF_ELECTION_IDX_MASK 0x3F
+
+#define HINIC3_PPF_ELECTION_SET(val, member) \
+ (((val) & HINIC3_PPF_ELECTION_##member##_MASK) << \
+ HINIC3_PPF_ELECTION_##member##_SHIFT)
+
+#define HINIC3_PPF_ELECTION_GET(val, member) \
+ (((val) >> HINIC3_PPF_ELECTION_##member##_SHIFT) & \
+ HINIC3_PPF_ELECTION_##member##_MASK)
+
+#define HINIC3_PPF_ELECTION_CLEAR(val, member) \
+ ((val) & (~(HINIC3_PPF_ELECTION_##member##_MASK << \
+ HINIC3_PPF_ELECTION_##member##_SHIFT)))
+
+#define HINIC3_MPF_ELECTION_IDX_SHIFT 0
+
+#define HINIC3_MPF_ELECTION_IDX_MASK 0x1F
+
+#define HINIC3_MPF_ELECTION_SET(val, member) \
+ (((val) & HINIC3_MPF_ELECTION_##member##_MASK) << \
+ HINIC3_MPF_ELECTION_##member##_SHIFT)
+
+#define HINIC3_MPF_ELECTION_GET(val, member) \
+ (((val) >> HINIC3_MPF_ELECTION_##member##_SHIFT) & \
+ HINIC3_MPF_ELECTION_##member##_MASK)
+
+#define HINIC3_MPF_ELECTION_CLEAR(val, member) \
+ ((val) & (~(HINIC3_MPF_ELECTION_##member##_MASK << \
+ HINIC3_MPF_ELECTION_##member##_SHIFT)))
+
+#define HINIC3_GET_REG_FLAG(reg) ((reg) & (~(HINIC3_REGS_FLAG_MAKS)))
+
+#define HINIC3_GET_REG_ADDR(reg) ((reg) & (HINIC3_REGS_FLAG_MAKS))
+
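+/* The register space is exposed big-endian by the device, hence the
+ * byte swap wrapped around each readl()/writel().
+ */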
+u32 hinic3_hwif_read_reg(struct hinic3_hwif *hwif, u32 reg)
+{
+ if (HINIC3_GET_REG_FLAG(reg) == HINIC3_MGMT_REGS_FLAG)
+ return be32_to_cpu(readl(hwif->mgmt_regs_base +
+ HINIC3_GET_REG_ADDR(reg)));
+ else
+ return be32_to_cpu(readl(hwif->cfg_regs_base +
+ HINIC3_GET_REG_ADDR(reg)));
+}
+
+void hinic3_hwif_write_reg(struct hinic3_hwif *hwif, u32 reg, u32 val)
+{
+ if (HINIC3_GET_REG_FLAG(reg) == HINIC3_MGMT_REGS_FLAG)
+ writel(cpu_to_be32(val),
+ hwif->mgmt_regs_base + HINIC3_GET_REG_ADDR(reg));
+ else
+ writel(cpu_to_be32(val),
+ hwif->cfg_regs_base + HINIC3_GET_REG_ADDR(reg));
+}
+
+bool get_card_present_state(struct hinic3_hwdev *hwdev)
+{
+ u32 attr1;
+
+ attr1 = hinic3_hwif_read_reg(hwdev->hwif, HINIC3_CSR_FUNC_ATTR1_ADDR);
+ if (attr1 == HINIC3_PCIE_LINK_DOWN) {
+ sdk_warn(hwdev->dev_hdl, "Card is not present\n");
+ return false;
+ }
+
+ return true;
+}
+
+/**
+ * hinic3_get_heartbeat_status - get heartbeat status
+ * @hwdev: the pointer to hw device
+ * Return: 0 - normal, 1 - heartbeat lost, 0xFFFFFFFF - PCIe link down
+ **/
+u32 hinic3_get_heartbeat_status(void *hwdev)
+{
+ u32 attr1;
+
+ if (!hwdev)
+ return HINIC3_PCIE_LINK_DOWN;
+
+ attr1 = hinic3_hwif_read_reg(((struct hinic3_hwdev *)hwdev)->hwif,
+ HINIC3_CSR_FUNC_ATTR1_ADDR);
+ if (attr1 == HINIC3_PCIE_LINK_DOWN)
+ return attr1;
+
+ return !HINIC3_AF1_GET(attr1, MGMT_INIT_STATUS);
+}
+EXPORT_SYMBOL(hinic3_get_heartbeat_status);
+
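+/* The migrate-status register keeps one enable bit per host; the macros
+ * below clear, set and test the bit that belongs to host_id.
+ */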
+#define MIGRATE_HOST_STATUS_CLEAR(host_id, val) ((val) & (~(1U << (host_id))))
+#define MIGRATE_HOST_STATUS_SET(host_id, enable) (((u8)(enable) & 1U) << (host_id))
+#define MIGRATE_HOST_STATUS_GET(host_id, val) (!!((val) & (1U << (host_id))))
+
+int hinic3_set_host_migrate_enable(void *hwdev, u8 host_id, bool enable)
+{
+	struct hinic3_hwdev *dev = hwdev;
+	u32 reg_val;
+
+ if (HINIC3_FUNC_TYPE(dev) != TYPE_PPF) {
+ sdk_warn(dev->dev_hdl, "hwdev should be ppf\n");
+ return -EINVAL;
+ }
+
+ reg_val = hinic3_hwif_read_reg(dev->hwif, HINIC3_MULT_MIGRATE_HOST_STATUS_ADDR);
+ reg_val = MIGRATE_HOST_STATUS_CLEAR(host_id, reg_val);
+ reg_val |= MIGRATE_HOST_STATUS_SET(host_id, enable);
+
+ hinic3_hwif_write_reg(dev->hwif, HINIC3_MULT_MIGRATE_HOST_STATUS_ADDR, reg_val);
+
+ sdk_info(dev->dev_hdl, "Set migrate host %d status %d, reg value: 0x%x\n",
+ host_id, enable, reg_val);
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_set_host_migrate_enable);
+
+int hinic3_get_host_migrate_enable(void *hwdev, u8 host_id, u8 *migrate_en)
+{
+	struct hinic3_hwdev *dev = hwdev;
+	u32 reg_val;
+
+ if (HINIC3_FUNC_TYPE(dev) != TYPE_PPF) {
+ sdk_warn(dev->dev_hdl, "hwdev should be ppf\n");
+ return -EINVAL;
+ }
+
+ reg_val = hinic3_hwif_read_reg(dev->hwif, HINIC3_MULT_MIGRATE_HOST_STATUS_ADDR);
+ *migrate_en = MIGRATE_HOST_STATUS_GET(host_id, reg_val);
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_get_host_migrate_enable);
+
+static enum hinic3_wait_return check_hwif_ready_handler(void *priv_data)
+{
+ u32 status;
+
+ status = hinic3_get_heartbeat_status(priv_data);
+ if (status == HINIC3_PCIE_LINK_DOWN)
+ return WAIT_PROCESS_ERR;
+ else if (!status)
+ return WAIT_PROCESS_CPL;
+
+ return WAIT_PROCESS_WAITING;
+}
+
+static int wait_hwif_ready(struct hinic3_hwdev *hwdev)
+{
+ int ret;
+
+ ret = hinic3_wait_for_timeout(hwdev, check_hwif_ready_handler,
+ WAIT_HWIF_READY_TIMEOUT, USEC_PER_MSEC);
+ if (ret == -ETIMEDOUT) {
+ hwdev->probe_fault_level = FAULT_LEVEL_FATAL;
+ sdk_err(hwdev->dev_hdl, "Wait for hwif timeout\n");
+ }
+
+ return ret;
+}
+
+/**
+ * set_hwif_attr - set the attributes as members in hwif
+ * @hwif: the hardware interface of a pci function device
+ * @attr0: the first attribute that was read from the hw
+ * @attr1: the second attribute that was read from the hw
+ * @attr2: the third attribute that was read from the hw
+ * @attr3: the fourth attribute that was read from the hw
+ * @attr6: the sixth attribute that was read from the hw
+ **/
+static void set_hwif_attr(struct hinic3_hwif *hwif, u32 attr0, u32 attr1,
+ u32 attr2, u32 attr3, u32 attr6)
+{
+ hwif->attr.func_global_idx = HINIC3_AF0_GET(attr0, FUNC_GLOBAL_IDX);
+ hwif->attr.port_to_port_idx = HINIC3_AF0_GET(attr0, P2P_IDX);
+ hwif->attr.pci_intf_idx = HINIC3_AF0_GET(attr0, PCI_INTF_IDX);
+ hwif->attr.vf_in_pf = HINIC3_AF0_GET(attr0, VF_IN_PF);
+ hwif->attr.func_type = HINIC3_AF0_GET(attr0, FUNC_TYPE);
+
+ hwif->attr.ppf_idx = HINIC3_AF1_GET(attr1, PPF_IDX);
+ hwif->attr.num_aeqs = BIT(HINIC3_AF1_GET(attr1, AEQS_PER_FUNC));
+ hwif->attr.num_ceqs = (u8)HINIC3_AF2_GET(attr2, CEQS_PER_FUNC);
+ hwif->attr.num_irqs = HINIC3_AF2_GET(attr2, IRQS_PER_FUNC);
+ if (hwif->attr.num_irqs > MAX_MSIX_ENTRY)
+ hwif->attr.num_irqs = MAX_MSIX_ENTRY;
+
+ hwif->attr.num_dma_attr = BIT(HINIC3_AF2_GET(attr2, DMA_ATTR_PER_FUNC));
+
+ hwif->attr.global_vf_id_of_pf = HINIC3_AF3_GET(attr3,
+ GLOBAL_VF_ID_OF_PF);
+
+ hwif->attr.num_sq = HINIC3_AF6_GET(attr6, FUNC_MAX_SQ);
+ hwif->attr.msix_flex_en = HINIC3_AF6_GET(attr6, MSIX_FLEX_EN);
+}
+
+/**
+ * get_hwif_attr - read and set the attributes as members in hwif
+ * @hwif: the hardware interface of a pci function device
+ **/
+static int get_hwif_attr(struct hinic3_hwif *hwif)
+{
+ u32 addr, attr0, attr1, attr2, attr3, attr6;
+
+ addr = HINIC3_CSR_FUNC_ATTR0_ADDR;
+ attr0 = hinic3_hwif_read_reg(hwif, addr);
+ if (attr0 == HINIC3_PCIE_LINK_DOWN)
+ return -EFAULT;
+
+ addr = HINIC3_CSR_FUNC_ATTR1_ADDR;
+ attr1 = hinic3_hwif_read_reg(hwif, addr);
+ if (attr1 == HINIC3_PCIE_LINK_DOWN)
+ return -EFAULT;
+
+ addr = HINIC3_CSR_FUNC_ATTR2_ADDR;
+ attr2 = hinic3_hwif_read_reg(hwif, addr);
+ if (attr2 == HINIC3_PCIE_LINK_DOWN)
+ return -EFAULT;
+
+ addr = HINIC3_CSR_FUNC_ATTR3_ADDR;
+ attr3 = hinic3_hwif_read_reg(hwif, addr);
+ if (attr3 == HINIC3_PCIE_LINK_DOWN)
+ return -EFAULT;
+
+ addr = HINIC3_CSR_FUNC_ATTR6_ADDR;
+ attr6 = hinic3_hwif_read_reg(hwif, addr);
+ if (attr6 == HINIC3_PCIE_LINK_DOWN)
+ return -EFAULT;
+
+ set_hwif_attr(hwif, attr0, attr1, attr2, attr3, attr6);
+
+ return 0;
+}
+
+void hinic3_set_pf_status(struct hinic3_hwif *hwif,
+ enum hinic3_pf_status status)
+{
+	u32 attr6;
+
+	/* VFs have no PF status field to update */
+	if (hwif->attr.func_type == TYPE_VF)
+		return;
+
+	attr6 = hinic3_hwif_read_reg(hwif, HINIC3_CSR_FUNC_ATTR6_ADDR);
+	attr6 = HINIC3_AF6_CLEAR(attr6, PF_STATUS);
+	attr6 |= HINIC3_AF6_SET(status, PF_STATUS);
+
+	hinic3_hwif_write_reg(hwif, HINIC3_CSR_FUNC_ATTR6_ADDR, attr6);
+}
+
+enum hinic3_pf_status hinic3_get_pf_status(struct hinic3_hwif *hwif)
+{
+ u32 attr6 = hinic3_hwif_read_reg(hwif, HINIC3_CSR_FUNC_ATTR6_ADDR);
+
+ return HINIC3_AF6_GET(attr6, PF_STATUS);
+}
+
+static enum hinic3_doorbell_ctrl hinic3_get_doorbell_ctrl_status(struct hinic3_hwif *hwif)
+{
+ u32 attr4 = hinic3_hwif_read_reg(hwif, HINIC3_CSR_FUNC_ATTR4_ADDR);
+
+ return HINIC3_AF4_GET(attr4, DOORBELL_CTRL);
+}
+
+static enum hinic3_outbound_ctrl hinic3_get_outbound_ctrl_status(struct hinic3_hwif *hwif)
+{
+ u32 attr5 = hinic3_hwif_read_reg(hwif, HINIC3_CSR_FUNC_ATTR5_ADDR);
+
+ return HINIC3_AF5_GET(attr5, OUTBOUND_CTRL);
+}
+
+void hinic3_enable_doorbell(struct hinic3_hwif *hwif)
+{
+ u32 addr, attr4;
+
+ addr = HINIC3_CSR_FUNC_ATTR4_ADDR;
+ attr4 = hinic3_hwif_read_reg(hwif, addr);
+
+ attr4 = HINIC3_AF4_CLEAR(attr4, DOORBELL_CTRL);
+ attr4 |= HINIC3_AF4_SET(ENABLE_DOORBELL, DOORBELL_CTRL);
+
+ hinic3_hwif_write_reg(hwif, addr, attr4);
+}
+
+void hinic3_disable_doorbell(struct hinic3_hwif *hwif)
+{
+ u32 addr, attr4;
+
+ addr = HINIC3_CSR_FUNC_ATTR4_ADDR;
+ attr4 = hinic3_hwif_read_reg(hwif, addr);
+
+ attr4 = HINIC3_AF4_CLEAR(attr4, DOORBELL_CTRL);
+ attr4 |= HINIC3_AF4_SET(DISABLE_DOORBELL, DOORBELL_CTRL);
+
+ hinic3_hwif_write_reg(hwif, addr, attr4);
+}
+
+/**
+ * set_ppf - try to set hwif as ppf and set the type of hwif in this case
+ * @hwif: the hardware interface of a pci function device
+ **/
+static void set_ppf(struct hinic3_hwif *hwif)
+{
+ struct hinic3_func_attr *attr = &hwif->attr;
+ u32 addr, val, ppf_election;
+
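+	/* Each candidate PF writes its own index into the election register
+	 * and reads it back; the function whose index is retained by the
+	 * hardware becomes the PPF.
+	 */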
+ /* Read Modify Write */
+ addr = HINIC3_CSR_PPF_ELECTION_ADDR;
+
+ val = hinic3_hwif_read_reg(hwif, addr);
+ val = HINIC3_PPF_ELECTION_CLEAR(val, IDX);
+
+ ppf_election = HINIC3_PPF_ELECTION_SET(attr->func_global_idx, IDX);
+ val |= ppf_election;
+
+ hinic3_hwif_write_reg(hwif, addr, val);
+
+ /* Check PPF */
+ val = hinic3_hwif_read_reg(hwif, addr);
+
+ attr->ppf_idx = HINIC3_PPF_ELECTION_GET(val, IDX);
+ if (attr->ppf_idx == attr->func_global_idx)
+ attr->func_type = TYPE_PPF;
+}
+
+/**
+ * get_mpf - get the mpf index into the hwif
+ * @hwif: the hardware interface of a pci function device
+ **/
+static void get_mpf(struct hinic3_hwif *hwif)
+{
+ struct hinic3_func_attr *attr = &hwif->attr;
+ u32 mpf_election, addr;
+
+ addr = HINIC3_CSR_GLOBAL_MPF_ELECTION_ADDR;
+
+ mpf_election = hinic3_hwif_read_reg(hwif, addr);
+ attr->mpf_idx = HINIC3_MPF_ELECTION_GET(mpf_election, IDX);
+}
+
+/**
+ * set_mpf - try to set hwif as mpf and set the mpf idx in hwif
+ * @hwif: the hardware interface of a pci function device
+ **/
+static void set_mpf(struct hinic3_hwif *hwif)
+{
+ struct hinic3_func_attr *attr = &hwif->attr;
+ u32 addr, val, mpf_election;
+
+ /* Read Modify Write */
+ addr = HINIC3_CSR_GLOBAL_MPF_ELECTION_ADDR;
+
+ val = hinic3_hwif_read_reg(hwif, addr);
+
+ val = HINIC3_MPF_ELECTION_CLEAR(val, IDX);
+ mpf_election = HINIC3_MPF_ELECTION_SET(attr->func_global_idx, IDX);
+
+ val |= mpf_election;
+ hinic3_hwif_write_reg(hwif, addr, val);
+}
+
+static int init_hwif(struct hinic3_hwdev *hwdev, void *cfg_reg_base, void *intr_reg_base,
+ void *mgmt_regs_base)
+{
+ struct hinic3_hwif *hwif = NULL;
+
+ hwif = kzalloc(sizeof(*hwif), GFP_KERNEL);
+ if (!hwif)
+ return -ENOMEM;
+
+ hwdev->hwif = hwif;
+ hwif->pdev = hwdev->pcidev_hdl;
+
+ /* if function is VF, mgmt_regs_base will be NULL */
+ hwif->cfg_regs_base = mgmt_regs_base ? cfg_reg_base :
+ (u8 *)cfg_reg_base + HINIC3_VF_CFG_REG_OFFSET;
+
+ hwif->intr_regs_base = intr_reg_base;
+ hwif->mgmt_regs_base = mgmt_regs_base;
+
+ return 0;
+}
+
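+/* Doorbell pages are handed out by a bitmap allocator: one bit per
+ * HINIC3_DB_PAGE_SIZE page of the doorbell BAR region, capped at
+ * HINIC3_DB_MAX_AREAS entries.
+ */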
+static int init_db_area_idx(struct hinic3_hwif *hwif, u64 db_base_phy, u8 *db_base,
+ u64 db_dwqe_len)
+{
+ struct hinic3_free_db_area *free_db_area = &hwif->free_db_area;
+ u32 db_max_areas;
+
+ hwif->db_base_phy = db_base_phy;
+ hwif->db_base = db_base;
+ hwif->db_dwqe_len = db_dwqe_len;
+
+ db_max_areas = (db_dwqe_len > HINIC3_DB_DWQE_SIZE) ?
+ HINIC3_DB_MAX_AREAS :
+ (u32)(db_dwqe_len / HINIC3_DB_PAGE_SIZE);
+ free_db_area->db_bitmap_array = bitmap_zalloc(db_max_areas, GFP_KERNEL);
+ if (!free_db_area->db_bitmap_array) {
+ pr_err("Failed to allocate db area.\n");
+ return -ENOMEM;
+ }
+ free_db_area->db_max_areas = db_max_areas;
+ spin_lock_init(&free_db_area->idx_lock);
+ return 0;
+}
+
+static void free_db_area(struct hinic3_free_db_area *free_db_area)
+{
+ spin_lock_deinit(&free_db_area->idx_lock);
+ kfree(free_db_area->db_bitmap_array);
+}
+
+static int get_db_idx(struct hinic3_hwif *hwif, u32 *idx)
+{
+ struct hinic3_free_db_area *free_db_area = &hwif->free_db_area;
+ u32 pg_idx;
+
+ spin_lock(&free_db_area->idx_lock);
+ pg_idx = (u32)find_first_zero_bit(free_db_area->db_bitmap_array,
+ free_db_area->db_max_areas);
+ if (pg_idx == free_db_area->db_max_areas) {
+ spin_unlock(&free_db_area->idx_lock);
+ return -ENOMEM;
+ }
+ set_bit(pg_idx, free_db_area->db_bitmap_array);
+ spin_unlock(&free_db_area->idx_lock);
+
+ *idx = pg_idx;
+
+ return 0;
+}
+
+static void free_db_idx(struct hinic3_hwif *hwif, u32 idx)
+{
+ struct hinic3_free_db_area *free_db_area = &hwif->free_db_area;
+
+ if (idx >= free_db_area->db_max_areas)
+ return;
+
+ spin_lock(&free_db_area->idx_lock);
+ clear_bit((int)idx, free_db_area->db_bitmap_array);
+
+ spin_unlock(&free_db_area->idx_lock);
+}
+
+void hinic3_free_db_addr(void *hwdev, const void __iomem *db_base,
+ void __iomem *dwqe_base)
+{
+ struct hinic3_hwif *hwif = NULL;
+ u32 idx;
+
+ if (!hwdev || !db_base)
+ return;
+
+ hwif = ((struct hinic3_hwdev *)hwdev)->hwif;
+ idx = DB_IDX(db_base, hwif->db_base);
+
+ free_db_idx(hwif, idx);
+}
+EXPORT_SYMBOL(hinic3_free_db_addr);
+
+int hinic3_alloc_db_addr(void *hwdev, void __iomem **db_base,
+ void __iomem **dwqe_base)
+{
+ struct hinic3_hwif *hwif = NULL;
+ u32 idx = 0;
+ int err;
+
+ if (!hwdev || !db_base)
+ return -EINVAL;
+
+ hwif = ((struct hinic3_hwdev *)hwdev)->hwif;
+
+ err = get_db_idx(hwif, &idx);
+ if (err)
+ return -EFAULT;
+
+ *db_base = hwif->db_base + idx * HINIC3_DB_PAGE_SIZE;
+
+ if (!dwqe_base)
+ return 0;
+
+ *dwqe_base = (u8 *)*db_base + HINIC3_DWQE_OFFSET;
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_alloc_db_addr);
+
+void hinic3_free_db_phy_addr(void *hwdev, u64 db_base, u64 dwqe_base)
+{
+ struct hinic3_hwif *hwif = NULL;
+ u32 idx;
+
+ if (!hwdev)
+ return;
+
+ hwif = ((struct hinic3_hwdev *)hwdev)->hwif;
+ idx = DB_IDX(db_base, hwif->db_base_phy);
+
+ free_db_idx(hwif, idx);
+}
+EXPORT_SYMBOL(hinic3_free_db_phy_addr);
+
+int hinic3_alloc_db_phy_addr(void *hwdev, u64 *db_base, u64 *dwqe_base)
+{
+ struct hinic3_hwif *hwif = NULL;
+ u32 idx;
+ int err;
+
+ if (!hwdev || !db_base || !dwqe_base)
+ return -EINVAL;
+
+ hwif = ((struct hinic3_hwdev *)hwdev)->hwif;
+
+ err = get_db_idx(hwif, &idx);
+ if (err)
+ return -EFAULT;
+
+ *db_base = hwif->db_base_phy + idx * HINIC3_DB_PAGE_SIZE;
+ *dwqe_base = *db_base + HINIC3_DWQE_OFFSET;
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_alloc_db_phy_addr);
+
+void hinic3_set_msix_auto_mask_state(void *hwdev, u16 msix_idx,
+ enum hinic3_msix_auto_mask flag)
+{
+ struct hinic3_hwif *hwif = NULL;
+ u32 mask_bits;
+ u32 addr;
+
+ if (!hwdev)
+ return;
+
+ hwif = ((struct hinic3_hwdev *)hwdev)->hwif;
+
+ if (flag)
+ mask_bits = HINIC3_MSI_CLR_INDIR_SET(1, AUTO_MSK_SET);
+ else
+ mask_bits = HINIC3_MSI_CLR_INDIR_SET(1, AUTO_MSK_CLR);
+
+ mask_bits = mask_bits |
+ HINIC3_MSI_CLR_INDIR_SET(msix_idx, SIMPLE_INDIR_IDX);
+
+ addr = HINIC3_CSR_FUNC_MSI_CLR_WR_ADDR;
+ hinic3_hwif_write_reg(hwif, addr, mask_bits);
+}
+EXPORT_SYMBOL(hinic3_set_msix_auto_mask_state);
+
+void hinic3_set_msix_state(void *hwdev, u16 msix_idx,
+ enum hinic3_msix_state flag)
+{
+ struct hinic3_hwif *hwif = NULL;
+ u32 mask_bits;
+ u32 addr;
+ u8 int_msk = 1;
+
+ if (!hwdev)
+ return;
+
+ hwif = ((struct hinic3_hwdev *)hwdev)->hwif;
+
+ if (flag)
+ mask_bits = HINIC3_MSI_CLR_INDIR_SET(int_msk, INT_MSK_SET);
+ else
+ mask_bits = HINIC3_MSI_CLR_INDIR_SET(int_msk, INT_MSK_CLR);
+ mask_bits = mask_bits |
+ HINIC3_MSI_CLR_INDIR_SET(msix_idx, SIMPLE_INDIR_IDX);
+
+ addr = HINIC3_CSR_FUNC_MSI_CLR_WR_ADDR;
+ hinic3_hwif_write_reg(hwif, addr, mask_bits);
+}
+EXPORT_SYMBOL(hinic3_set_msix_state);
+
+static void disable_all_msix(struct hinic3_hwdev *hwdev)
+{
+ u16 num_irqs = hwdev->hwif->attr.num_irqs;
+ u16 i;
+
+ for (i = 0; i < num_irqs; i++)
+ hinic3_set_msix_state(hwdev, i, HINIC3_MSIX_DISABLE);
+}
+
+static enum hinic3_wait_return check_db_outbound_enable_handler(void *priv_data)
+{
+ struct hinic3_hwif *hwif = priv_data;
+ enum hinic3_doorbell_ctrl db_ctrl;
+ enum hinic3_outbound_ctrl outbound_ctrl;
+
+ db_ctrl = hinic3_get_doorbell_ctrl_status(hwif);
+ outbound_ctrl = hinic3_get_outbound_ctrl_status(hwif);
+ if (outbound_ctrl == ENABLE_OUTBOUND && db_ctrl == ENABLE_DOORBELL)
+ return WAIT_PROCESS_CPL;
+
+ return WAIT_PROCESS_WAITING;
+}
+
+static int wait_until_doorbell_and_outbound_enabled(struct hinic3_hwif *hwif)
+{
+ return hinic3_wait_for_timeout(hwif, check_db_outbound_enable_handler,
+ HINIC3_WAIT_DOORBELL_AND_OUTBOUND_TIMEOUT, USEC_PER_MSEC);
+}
+
+static void select_ppf_mpf(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_hwif *hwif = hwdev->hwif;
+
+ if (!HINIC3_IS_VF(hwdev)) {
+ set_ppf(hwif);
+
+ if (HINIC3_IS_PPF(hwdev))
+ set_mpf(hwif);
+
+ get_mpf(hwif);
+ }
+}
+
+/**
+ * hinic3_init_hwif - initialize the hw interface
+ * @hwdev: the pointer to hw device
+ * @cfg_reg_base: base address of the config register space
+ * @intr_reg_base: base address of the interrupt register space
+ * @mgmt_regs_base: base address of the mgmt register space (NULL for VF)
+ * @db_base_phy: physical base address of the doorbell space
+ * @db_base: mapped base address of the doorbell space
+ * @db_dwqe_len: length of the doorbell and dwqe space
+ * Return: 0 - success, negative - failure
+ **/
+int hinic3_init_hwif(struct hinic3_hwdev *hwdev, void *cfg_reg_base,
+ void *intr_reg_base, void *mgmt_regs_base, u64 db_base_phy,
+ void *db_base, u64 db_dwqe_len)
+{
+ struct hinic3_hwif *hwif = NULL;
+ u32 attr1, attr4, attr5;
+ int err;
+
+ err = init_hwif(hwdev, cfg_reg_base, intr_reg_base, mgmt_regs_base);
+ if (err)
+ return err;
+
+ hwif = hwdev->hwif;
+
+ err = init_db_area_idx(hwif, db_base_phy, db_base, db_dwqe_len);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to init db area.\n");
+ goto init_db_area_err;
+ }
+
+ err = wait_hwif_ready(hwdev);
+ if (err) {
+ attr1 = hinic3_hwif_read_reg(hwif, HINIC3_CSR_FUNC_ATTR1_ADDR);
+ sdk_err(hwdev->dev_hdl, "Chip status is not ready, attr1:0x%x\n", attr1);
+ goto hwif_ready_err;
+ }
+
+ err = get_hwif_attr(hwif);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Get hwif attr failed\n");
+ goto hwif_ready_err;
+ }
+
+ err = wait_until_doorbell_and_outbound_enabled(hwif);
+ if (err) {
+ attr4 = hinic3_hwif_read_reg(hwif, HINIC3_CSR_FUNC_ATTR4_ADDR);
+ attr5 = hinic3_hwif_read_reg(hwif, HINIC3_CSR_FUNC_ATTR5_ADDR);
+ sdk_err(hwdev->dev_hdl, "Hw doorbell/outbound is disabled, attr4 0x%x attr5 0x%x\n",
+ attr4, attr5);
+ goto hwif_ready_err;
+ }
+
+ select_ppf_mpf(hwdev);
+
+ disable_all_msix(hwdev);
+ /* disable mgmt cpu report any event */
+ hinic3_set_pf_status(hwdev->hwif, HINIC3_PF_STATUS_INIT);
+
+ sdk_info(hwdev->dev_hdl, "global_func_idx: %u, func_type: %d, host_id: %u, ppf: %u, mpf: %u\n",
+ hwif->attr.func_global_idx, hwif->attr.func_type, hwif->attr.pci_intf_idx,
+ hwif->attr.ppf_idx, hwif->attr.mpf_idx);
+
+ return 0;
+
+hwif_ready_err:
+ hinic3_show_chip_err_info(hwdev);
+ free_db_area(&hwif->free_db_area);
+init_db_area_err:
+ kfree(hwif);
+
+ return err;
+}
+
+/**
+ * hinic3_free_hwif - free the hw interface
+ * @hwdev: the pointer to hw device
+ **/
+void hinic3_free_hwif(struct hinic3_hwdev *hwdev)
+{
+	free_db_area(&hwdev->hwif->free_db_area);
+ kfree(hwdev->hwif);
+}
+
+u16 hinic3_global_func_id(void *hwdev)
+{
+ struct hinic3_hwif *hwif = NULL;
+
+ if (!hwdev)
+ return 0;
+
+ hwif = ((struct hinic3_hwdev *)hwdev)->hwif;
+
+ return hwif->attr.func_global_idx;
+}
+EXPORT_SYMBOL(hinic3_global_func_id);
+
+u16 hinic3_intr_num(void *hwdev)
+{
+ struct hinic3_hwif *hwif = NULL;
+
+ if (!hwdev)
+ return 0;
+
+ hwif = ((struct hinic3_hwdev *)hwdev)->hwif;
+
+ return hwif->attr.num_irqs;
+}
+EXPORT_SYMBOL(hinic3_intr_num);
+
+u8 hinic3_pf_id_of_vf(void *hwdev)
+{
+ struct hinic3_hwif *hwif = NULL;
+
+ if (!hwdev)
+ return 0;
+
+ hwif = ((struct hinic3_hwdev *)hwdev)->hwif;
+
+ return hwif->attr.port_to_port_idx;
+}
+EXPORT_SYMBOL(hinic3_pf_id_of_vf);
+
+u8 hinic3_pcie_itf_id(void *hwdev)
+{
+ struct hinic3_hwif *hwif = NULL;
+
+ if (!hwdev)
+ return 0;
+
+ hwif = ((struct hinic3_hwdev *)hwdev)->hwif;
+
+ return hwif->attr.pci_intf_idx;
+}
+EXPORT_SYMBOL(hinic3_pcie_itf_id);
+
+u8 hinic3_vf_in_pf(void *hwdev)
+{
+ struct hinic3_hwif *hwif = NULL;
+
+ if (!hwdev)
+ return 0;
+
+ hwif = ((struct hinic3_hwdev *)hwdev)->hwif;
+
+ return hwif->attr.vf_in_pf;
+}
+EXPORT_SYMBOL(hinic3_vf_in_pf);
+
+enum func_type hinic3_func_type(void *hwdev)
+{
+ struct hinic3_hwif *hwif = NULL;
+
+ if (!hwdev)
+ return 0;
+
+ hwif = ((struct hinic3_hwdev *)hwdev)->hwif;
+
+ return hwif->attr.func_type;
+}
+EXPORT_SYMBOL(hinic3_func_type);
+
+u8 hinic3_ceq_num(void *hwdev)
+{
+ struct hinic3_hwif *hwif = NULL;
+
+ if (!hwdev)
+ return 0;
+
+ hwif = ((struct hinic3_hwdev *)hwdev)->hwif;
+
+ return hwif->attr.num_ceqs;
+}
+EXPORT_SYMBOL(hinic3_ceq_num);
+
+u16 hinic3_glb_pf_vf_offset(void *hwdev)
+{
+ struct hinic3_hwif *hwif = NULL;
+
+ if (!hwdev)
+ return 0;
+
+ hwif = ((struct hinic3_hwdev *)hwdev)->hwif;
+
+ return hwif->attr.global_vf_id_of_pf;
+}
+EXPORT_SYMBOL(hinic3_glb_pf_vf_offset);
+
+u8 hinic3_ppf_idx(void *hwdev)
+{
+ struct hinic3_hwif *hwif = NULL;
+
+ if (!hwdev)
+ return 0;
+
+ hwif = ((struct hinic3_hwdev *)hwdev)->hwif;
+
+ return hwif->attr.ppf_idx;
+}
+EXPORT_SYMBOL(hinic3_ppf_idx);
+
+u8 hinic3_host_ppf_idx(struct hinic3_hwdev *hwdev, u8 host_id)
+{
+ u32 ppf_elect_port_addr;
+ u32 val;
+
+ if (!hwdev)
+ return 0;
+
+ ppf_elect_port_addr = HINIC3_CSR_FUNC_PPF_ELECT(host_id);
+ val = hinic3_hwif_read_reg(hwdev->hwif, ppf_elect_port_addr);
+
+ return HINIC3_PPF_ELECT_PORT_GET(val, IDX);
+}
+
+u32 hinic3_get_self_test_result(void *hwdev)
+{
+ struct hinic3_hwif *hwif = ((struct hinic3_hwdev *)hwdev)->hwif;
+
+ return hinic3_hwif_read_reg(hwif, HINIC3_MGMT_HEALTH_STATUS_ADDR);
+}
+
+void hinic3_show_chip_err_info(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_hwif *hwif = hwdev->hwif;
+ u32 value;
+
+ if (hinic3_func_type(hwdev) == TYPE_VF)
+ return;
+
+ value = hinic3_hwif_read_reg(hwif, HINIC3_CHIP_BASE_INFO_ADDR);
+ sdk_warn(hwdev->dev_hdl, "Chip base info: 0x%08x\n", value);
+
+ value = hinic3_hwif_read_reg(hwif, HINIC3_MGMT_HEALTH_STATUS_ADDR);
+ sdk_warn(hwdev->dev_hdl, "Mgmt CPU health status: 0x%08x\n", value);
+
+ value = hinic3_hwif_read_reg(hwif, HINIC3_CHIP_ERR_STATUS0_ADDR);
+ sdk_warn(hwdev->dev_hdl, "Chip fatal error status0: 0x%08x\n", value);
+ value = hinic3_hwif_read_reg(hwif, HINIC3_CHIP_ERR_STATUS1_ADDR);
+ sdk_warn(hwdev->dev_hdl, "Chip fatal error status1: 0x%08x\n", value);
+
+ value = hinic3_hwif_read_reg(hwif, HINIC3_ERR_INFO0_ADDR);
+ sdk_warn(hwdev->dev_hdl, "Chip exception info0: 0x%08x\n", value);
+ value = hinic3_hwif_read_reg(hwif, HINIC3_ERR_INFO1_ADDR);
+ sdk_warn(hwdev->dev_hdl, "Chip exception info1: 0x%08x\n", value);
+ value = hinic3_hwif_read_reg(hwif, HINIC3_ERR_INFO2_ADDR);
+ sdk_warn(hwdev->dev_hdl, "Chip exception info2: 0x%08x\n", value);
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hwif.h b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hwif.h
new file mode 100644
index 000000000000..b204b213c43f
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_hwif.h
@@ -0,0 +1,113 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_HWIF_H
+#define HINIC3_HWIF_H
+
+#include "hinic3_hwdev.h"
+
+#define HINIC3_PCIE_LINK_DOWN 0xFFFFFFFF
+
+struct hinic3_free_db_area {
+ unsigned long *db_bitmap_array;
+ u32 db_max_areas;
+ /* spinlock for allocating doorbell area */
+ spinlock_t idx_lock;
+};
+
+struct hinic3_func_attr {
+ u16 func_global_idx;
+ u8 port_to_port_idx;
+ u8 pci_intf_idx;
+ u8 vf_in_pf;
+ u8 rsvd1;
+ u16 rsvd2;
+ enum func_type func_type;
+
+ u8 mpf_idx;
+
+ u8 ppf_idx;
+
+ u16 num_irqs; /* max: 2 ^ 15 */
+ u8 num_aeqs; /* max: 2 ^ 3 */
+ u8 num_ceqs; /* max: 2 ^ 7 */
+
+ u16 num_sq; /* max: 2 ^ 8 */
+ u8 num_dma_attr; /* max: 2 ^ 6 */
+ u8 msix_flex_en;
+
+ u16 global_vf_id_of_pf;
+};
+
+struct hinic3_hwif {
+ u8 __iomem *cfg_regs_base;
+ u8 __iomem *intr_regs_base;
+ u8 __iomem *mgmt_regs_base;
+ u64 db_base_phy;
+ u64 db_dwqe_len;
+ u8 __iomem *db_base;
+
+ struct hinic3_free_db_area free_db_area;
+
+ struct hinic3_func_attr attr;
+
+ void *pdev;
+ u64 rsvd;
+};
+
+enum hinic3_outbound_ctrl {
+ ENABLE_OUTBOUND = 0x0,
+ DISABLE_OUTBOUND = 0x1,
+};
+
+enum hinic3_doorbell_ctrl {
+ ENABLE_DOORBELL = 0x0,
+ DISABLE_DOORBELL = 0x1,
+};
+
+enum hinic3_pf_status {
+ HINIC3_PF_STATUS_INIT = 0X0,
+ HINIC3_PF_STATUS_ACTIVE_FLAG = 0x11,
+ HINIC3_PF_STATUS_FLR_START_FLAG = 0x12,
+ HINIC3_PF_STATUS_FLR_FINISH_FLAG = 0x13,
+};
+
+#define HINIC3_HWIF_NUM_AEQS(hwif) ((hwif)->attr.num_aeqs)
+#define HINIC3_HWIF_NUM_CEQS(hwif) ((hwif)->attr.num_ceqs)
+#define HINIC3_HWIF_NUM_IRQS(hwif) ((hwif)->attr.num_irqs)
+#define HINIC3_HWIF_GLOBAL_IDX(hwif) ((hwif)->attr.func_global_idx)
+#define HINIC3_HWIF_GLOBAL_VF_OFFSET(hwif) ((hwif)->attr.global_vf_id_of_pf)
+#define HINIC3_HWIF_PPF_IDX(hwif) ((hwif)->attr.ppf_idx)
+#define HINIC3_PCI_INTF_IDX(hwif) ((hwif)->attr.pci_intf_idx)
+
+#define HINIC3_FUNC_TYPE(dev) ((dev)->hwif->attr.func_type)
+#define HINIC3_IS_PF(dev) (HINIC3_FUNC_TYPE(dev) == TYPE_PF)
+#define HINIC3_IS_VF(dev) (HINIC3_FUNC_TYPE(dev) == TYPE_VF)
+#define HINIC3_IS_PPF(dev) (HINIC3_FUNC_TYPE(dev) == TYPE_PPF)
+
+u32 hinic3_hwif_read_reg(struct hinic3_hwif *hwif, u32 reg);
+
+void hinic3_hwif_write_reg(struct hinic3_hwif *hwif, u32 reg, u32 val);
+
+void hinic3_set_pf_status(struct hinic3_hwif *hwif,
+ enum hinic3_pf_status status);
+
+enum hinic3_pf_status hinic3_get_pf_status(struct hinic3_hwif *hwif);
+
+void hinic3_disable_doorbell(struct hinic3_hwif *hwif);
+
+void hinic3_enable_doorbell(struct hinic3_hwif *hwif);
+
+int hinic3_init_hwif(struct hinic3_hwdev *hwdev, void *cfg_reg_base,
+ void *intr_reg_base, void *mgmt_regs_base, u64 db_base_phy,
+ void *db_base, u64 db_dwqe_len);
+
+void hinic3_free_hwif(struct hinic3_hwdev *hwdev);
+
+void hinic3_show_chip_err_info(struct hinic3_hwdev *hwdev);
+
+u8 hinic3_host_ppf_idx(struct hinic3_hwdev *hwdev, u8 host_id);
+
+bool get_card_present_state(struct hinic3_hwdev *hwdev);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_lld.c b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_lld.c
new file mode 100644
index 000000000000..6c0ddfeaa424
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_lld.c
@@ -0,0 +1,1410 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [COMM]" fmt
+
+#include <net/addrconf.h>
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/io-mapping.h>
+#include <linux/interrupt.h>
+#include <linux/inetdevice.h>
+#include <linux/time.h>
+#include <linux/timex.h>
+#include <linux/rtc.h>
+#include <linux/aer.h>
+#include <linux/debugfs.h>
+
+#include "ossl_knl.h"
+#include "hinic3_mt.h"
+#include "hinic3_common.h"
+#include "hinic3_crm.h"
+#include "hinic3_pci_id_tbl.h"
+#include "hinic3_sriov.h"
+#include "hinic3_dev_mgmt.h"
+#include "hinic3_nictool.h"
+#include "hinic3_hw.h"
+#include "hinic3_lld.h"
+
+#include "hinic3_profile.h"
+#include "hinic3_hwdev.h"
+#include "hinic3_prof_adap.h"
+#include "comm_msg_intf.h"
+
+static bool disable_vf_load;
+module_param(disable_vf_load, bool, 0444);
+MODULE_PARM_DESC(disable_vf_load,
+		 "Disable probing of virtual functions - default is false");
+
+static bool disable_attach;
+module_param(disable_attach, bool, 0444);
+MODULE_PARM_DESC(disable_attach, "Disable attaching ULD drivers at probe - default is false");
+
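+/* SR-IOV config wait timeout in milliseconds (used with msecs_to_jiffies()) */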
+#define HINIC3_WAIT_SRIOV_CFG_TIMEOUT 15000
+
+MODULE_AUTHOR("Huawei Technologies CO., Ltd");
+MODULE_DESCRIPTION(HINIC3_DRV_DESC);
+MODULE_VERSION(HINIC3_DRV_VERSION);
+MODULE_LICENSE("GPL");
+
+#if !(defined(HAVE_SRIOV_CONFIGURE) || defined(HAVE_RHEL6_SRIOV_CONFIGURE))
+static DEVICE_ATTR(sriov_numvfs, 0664,
+ hinic3_sriov_numvfs_show, hinic3_sriov_numvfs_store);
+static DEVICE_ATTR(sriov_totalvfs, 0444,
+ hinic3_sriov_totalvfs_show, NULL);
+#endif /* !(HAVE_SRIOV_CONFIGURE || HAVE_RHEL6_SRIOV_CONFIGURE) */
+
+static struct attribute *hinic3_attributes[] = {
+#if !(defined(HAVE_SRIOV_CONFIGURE) || defined(HAVE_RHEL6_SRIOV_CONFIGURE))
+ &dev_attr_sriov_numvfs.attr,
+ &dev_attr_sriov_totalvfs.attr,
+#endif /* !(HAVE_SRIOV_CONFIGURE || HAVE_RHEL6_SRIOV_CONFIGURE) */
+ NULL
+};
+
+static const struct attribute_group hinic3_attr_group = {
+ .attrs = hinic3_attributes,
+};
+
+struct hinic3_uld_info g_uld_info[SERVICE_T_MAX] = { {0} };
+
+#define HINIC3_EVENT_PROCESS_TIMEOUT 10000
+struct mutex g_uld_mutex;
+
+void hinic3_uld_lock_init(void)
+{
+ mutex_init(&g_uld_mutex);
+}
+
+static const char *s_uld_name[SERVICE_T_MAX] = {
+ "nic", "ovs", "roce", "toe", "ioe",
+ "fc", "vbs", "ipsec", "virtio", "migrate", "ppa", "custom"};
+
+const char **hinic3_get_uld_names(void)
+{
+ return s_uld_name;
+}
+
+static int attach_uld(struct hinic3_pcidev *dev, enum hinic3_service_type type,
+ const struct hinic3_uld_info *uld_info)
+{
+ void *uld_dev = NULL;
+ int err;
+
+ mutex_lock(&dev->pdev_mutex);
+
+ if (dev->uld_dev[type]) {
+ sdk_err(&dev->pcidev->dev,
+			"%s driver is already attached to this PCIe device\n",
+ s_uld_name[type]);
+ err = 0;
+ goto out_unlock;
+ }
+
+ atomic_set(&dev->uld_ref_cnt[type], 0);
+
+ err = uld_info->probe(&dev->lld_dev, &uld_dev, dev->uld_dev_name[type]);
+ if (err) {
+ sdk_err(&dev->pcidev->dev,
+			"Failed to attach %s driver to PCIe device\n",
+ s_uld_name[type]);
+		goto out_unlock;
+ }
+
+ dev->uld_dev[type] = uld_dev;
+ set_bit(type, &dev->uld_state);
+ mutex_unlock(&dev->pdev_mutex);
+
+ sdk_info(&dev->pcidev->dev,
+		 "Attached %s driver to PCIe device\n", s_uld_name[type]);
+ return 0;
+
+out_unlock:
+ mutex_unlock(&dev->pdev_mutex);
+
+ return err;
+}
+
+static void wait_uld_unused(struct hinic3_pcidev *dev, enum hinic3_service_type type)
+{
+ u32 loop_cnt = 0;
+
+ while (atomic_read(&dev->uld_ref_cnt[type])) {
+ loop_cnt++;
+ if (loop_cnt % PRINT_ULD_DETACH_TIMEOUT_INTERVAL == 0)
+			sdk_err(&dev->pcidev->dev, "Waiting for uld to become unused for %lds, reference count: %d\n",
+ loop_cnt / MSEC_PER_SEC, atomic_read(&dev->uld_ref_cnt[type]));
+
+ usleep_range(ULD_LOCK_MIN_USLEEP_TIME, ULD_LOCK_MAX_USLEEP_TIME);
+ }
+}
+
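+/* Detach sequence (as implemented below): take the per-type state bit
+ * with a bounded wait to block new event delivery, clear uld_state so
+ * no new references are taken, wait for the reference count to drain,
+ * then call the uld remove() callback.
+ */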
+static void detach_uld(struct hinic3_pcidev *dev,
+ enum hinic3_service_type type)
+{
+ struct hinic3_uld_info *uld_info = &g_uld_info[type];
+ unsigned long end;
+ bool timeout = true;
+
+ mutex_lock(&dev->pdev_mutex);
+ if (!dev->uld_dev[type]) {
+ mutex_unlock(&dev->pdev_mutex);
+ return;
+ }
+
+ end = jiffies + msecs_to_jiffies(HINIC3_EVENT_PROCESS_TIMEOUT);
+ do {
+ if (!test_and_set_bit(type, &dev->state)) {
+ timeout = false;
+ break;
+ }
+ usleep_range(900, 1000); /* sleep 900 us ~ 1000 us */
+ } while (time_before(jiffies, end));
+
+ if (timeout && !test_and_set_bit(type, &dev->state))
+ timeout = false;
+
+ spin_lock_bh(&dev->uld_lock);
+ clear_bit(type, &dev->uld_state);
+ spin_unlock_bh(&dev->uld_lock);
+
+ wait_uld_unused(dev, type);
+
+ uld_info->remove(&dev->lld_dev, dev->uld_dev[type]);
+
+ dev->uld_dev[type] = NULL;
+ if (!timeout)
+ clear_bit(type, &dev->state);
+
+ sdk_info(&dev->pcidev->dev,
+		 "Detached %s driver from PCIe device\n",
+ s_uld_name[type]);
+ mutex_unlock(&dev->pdev_mutex);
+}
+
+static void attach_ulds(struct hinic3_pcidev *dev)
+{
+ enum hinic3_service_type type;
+ struct pci_dev *pdev = dev->pcidev;
+
+ lld_hold();
+ mutex_lock(&g_uld_mutex);
+
+ for (type = SERVICE_T_NIC; type < SERVICE_T_MAX; type++) {
+ if (g_uld_info[type].probe) {
+ if (pdev->is_virtfn &&
+ (!hinic3_get_vf_service_load(pdev, (u16)type))) {
+				sdk_info(&pdev->dev, "VF device: loading of service_type = %d is disabled in host\n",
+ type);
+ continue;
+ }
+ attach_uld(dev, type, &g_uld_info[type]);
+ }
+ }
+ mutex_unlock(&g_uld_mutex);
+ lld_put();
+}
+
+static void detach_ulds(struct hinic3_pcidev *dev)
+{
+ enum hinic3_service_type type;
+
+ lld_hold();
+ mutex_lock(&g_uld_mutex);
+ for (type = SERVICE_T_MAX - 1; type > SERVICE_T_NIC; type--) {
+ if (g_uld_info[type].probe)
+ detach_uld(dev, type);
+ }
+
+ if (g_uld_info[SERVICE_T_NIC].probe)
+ detach_uld(dev, SERVICE_T_NIC);
+ mutex_unlock(&g_uld_mutex);
+ lld_put();
+}
+
+int hinic3_register_uld(enum hinic3_service_type type,
+ struct hinic3_uld_info *uld_info)
+{
+ struct card_node *chip_node = NULL;
+ struct hinic3_pcidev *dev = NULL;
+ struct list_head *chip_list = NULL;
+
+ if (type >= SERVICE_T_MAX) {
+		pr_err("Unknown type %d of upper layer driver to register\n",
+ type);
+ return -EINVAL;
+ }
+
+ if (!uld_info || !uld_info->probe || !uld_info->remove) {
+ pr_err("Invalid information of %s driver to register\n",
+ s_uld_name[type]);
+ return -EINVAL;
+ }
+
+ lld_hold();
+ mutex_lock(&g_uld_mutex);
+
+ if (g_uld_info[type].probe) {
+		pr_err("%s driver has already been registered\n", s_uld_name[type]);
+ mutex_unlock(&g_uld_mutex);
+ lld_put();
+ return -EINVAL;
+ }
+
+ chip_list = get_hinic3_chip_list();
+ memcpy(&g_uld_info[type], uld_info, sizeof(*uld_info));
+ list_for_each_entry(chip_node, chip_list, node) {
+ list_for_each_entry(dev, &chip_node->func_list, node) {
+ if (attach_uld(dev, type, uld_info)) {
+ sdk_err(&dev->pcidev->dev,
+ "Attach %s driver to pcie device failed\n",
+ s_uld_name[type]);
+#ifdef CONFIG_MODULE_PROF
+ hinic3_probe_fault_process(dev->pcidev, FAULT_LEVEL_HOST);
+ break;
+#else
+ continue;
+#endif
+ }
+ }
+ }
+
+ mutex_unlock(&g_uld_mutex);
+ lld_put();
+
+	pr_info("Registered %s driver successfully\n", s_uld_name[type]);
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_register_uld);
+
+void hinic3_unregister_uld(enum hinic3_service_type type)
+{
+ struct card_node *chip_node = NULL;
+ struct hinic3_pcidev *dev = NULL;
+ struct hinic3_uld_info *uld_info = NULL;
+ struct list_head *chip_list = NULL;
+
+ if (type >= SERVICE_T_MAX) {
+		pr_err("Unknown type %d of upper layer driver to unregister\n",
+ type);
+ return;
+ }
+
+ lld_hold();
+ mutex_lock(&g_uld_mutex);
+ chip_list = get_hinic3_chip_list();
+ list_for_each_entry(chip_node, chip_list, node) {
+ /* detach vf first */
+ list_for_each_entry(dev, &chip_node->func_list, node)
+ if (hinic3_func_type(dev->hwdev) == TYPE_VF)
+ detach_uld(dev, type);
+
+ list_for_each_entry(dev, &chip_node->func_list, node)
+ if (hinic3_func_type(dev->hwdev) == TYPE_PF)
+ detach_uld(dev, type);
+
+ list_for_each_entry(dev, &chip_node->func_list, node)
+ if (hinic3_func_type(dev->hwdev) == TYPE_PPF)
+ detach_uld(dev, type);
+ }
+
+ uld_info = &g_uld_info[type];
+ memset(uld_info, 0, sizeof(*uld_info));
+ mutex_unlock(&g_uld_mutex);
+ lld_put();
+}
+EXPORT_SYMBOL(hinic3_unregister_uld);
+
+int hinic3_attach_nic(struct hinic3_lld_dev *lld_dev)
+{
+ struct hinic3_pcidev *dev = NULL;
+
+ if (!lld_dev)
+ return -EINVAL;
+
+ dev = container_of(lld_dev, struct hinic3_pcidev, lld_dev);
+ return attach_uld(dev, SERVICE_T_NIC, &g_uld_info[SERVICE_T_NIC]);
+}
+EXPORT_SYMBOL(hinic3_attach_nic);
+
+void hinic3_detach_nic(const struct hinic3_lld_dev *lld_dev)
+{
+ struct hinic3_pcidev *dev = NULL;
+
+ if (!lld_dev)
+ return;
+
+ dev = container_of(lld_dev, struct hinic3_pcidev, lld_dev);
+ detach_uld(dev, SERVICE_T_NIC);
+}
+EXPORT_SYMBOL(hinic3_detach_nic);
+
+int hinic3_attach_service(const struct hinic3_lld_dev *lld_dev, enum hinic3_service_type type)
+{
+ struct hinic3_pcidev *dev = NULL;
+
+ if (!lld_dev || type >= SERVICE_T_MAX)
+ return -EINVAL;
+
+ dev = container_of(lld_dev, struct hinic3_pcidev, lld_dev);
+ return attach_uld(dev, type, &g_uld_info[type]);
+}
+EXPORT_SYMBOL(hinic3_attach_service);
+
+void hinic3_detach_service(const struct hinic3_lld_dev *lld_dev, enum hinic3_service_type type)
+{
+ struct hinic3_pcidev *dev = NULL;
+
+ if (!lld_dev || type >= SERVICE_T_MAX)
+ return;
+
+ dev = container_of(lld_dev, struct hinic3_pcidev, lld_dev);
+ detach_uld(dev, type);
+}
+EXPORT_SYMBOL(hinic3_detach_service);
+
+static void hinic3_sync_time_to_fmw(struct hinic3_pcidev *pdev_pri)
+{
+ struct timeval tv = {0};
+ struct rtc_time rt_time = {0};
+ u64 tv_msec;
+ int err;
+
+ do_gettimeofday(&tv);
+
+ tv_msec = (u64)(tv.tv_sec * MSEC_PER_SEC + tv.tv_usec / USEC_PER_MSEC);
+ err = hinic3_sync_time(pdev_pri->hwdev, tv_msec);
+ if (err) {
+		sdk_err(&pdev_pri->pcidev->dev, "Failed to synchronize UTC time to firmware, err: %d\n",
+ err);
+ } else {
+ rtc_time_to_tm((unsigned long)(tv.tv_sec), &rt_time);
+		sdk_info(&pdev_pri->pcidev->dev, "Synchronized UTC time to firmware. UTC time %d-%02d-%02d %02d:%02d:%02d.\n",
+ rt_time.tm_year + HINIC3_SYNC_YEAR_OFFSET,
+ rt_time.tm_mon + HINIC3_SYNC_MONTH_OFFSET,
+ rt_time.tm_mday, rt_time.tm_hour,
+ rt_time.tm_min, rt_time.tm_sec);
+ }
+}
+
+static void send_uld_dev_event(struct hinic3_pcidev *dev,
+ struct hinic3_event_info *event)
+{
+ enum hinic3_service_type type;
+
+ for (type = SERVICE_T_NIC; type < SERVICE_T_MAX; type++) {
+ if (test_and_set_bit(type, &dev->state)) {
+			sdk_warn(&dev->pcidev->dev, "Svc: 0x%x, event: 0x%x cannot be handled, %s is detaching\n",
+ event->service, event->type, s_uld_name[type]);
+ continue;
+ }
+
+ if (g_uld_info[type].event)
+ g_uld_info[type].event(&dev->lld_dev,
+ dev->uld_dev[type], event);
+ clear_bit(type, &dev->state);
+ }
+}
+
+static void send_event_to_dst_pf(struct hinic3_pcidev *dev, u16 func_id,
+ struct hinic3_event_info *event)
+{
+ struct hinic3_pcidev *des_dev = NULL;
+
+ lld_hold();
+ list_for_each_entry(des_dev, &dev->chip_node->func_list, node) {
+ if (dev->lld_state == HINIC3_IN_REMOVE)
+ continue;
+
+ if (hinic3_func_type(des_dev->hwdev) == TYPE_VF)
+ continue;
+
+ if (hinic3_global_func_id(des_dev->hwdev) == func_id) {
+ send_uld_dev_event(des_dev, event);
+ break;
+ }
+ }
+ lld_put();
+}
+
+static void send_event_to_all_pf(struct hinic3_pcidev *dev,
+ struct hinic3_event_info *event)
+{
+ struct hinic3_pcidev *des_dev = NULL;
+
+ lld_hold();
+ list_for_each_entry(des_dev, &dev->chip_node->func_list, node) {
+ if (dev->lld_state == HINIC3_IN_REMOVE)
+ continue;
+
+ if (hinic3_func_type(des_dev->hwdev) == TYPE_VF)
+ continue;
+
+ send_uld_dev_event(des_dev, event);
+ }
+ lld_put();
+}
+
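+/* Event routing: serious-FLR chip faults are delivered only to the PF
+ * that owns the faulting function, mgmt watchdog events fan out to all
+ * PFs, and everything else is handled on the local function.
+ */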
+static void hinic3_event_process(void *adapter, struct hinic3_event_info *event)
+{
+ struct hinic3_pcidev *dev = adapter;
+ struct hinic3_fault_event *fault = (void *)event->event_data;
+ u16 func_id;
+
+ if ((event->service == EVENT_SRV_COMM && event->type == EVENT_COMM_FAULT) &&
+ fault->fault_level == FAULT_LEVEL_SERIOUS_FLR &&
+ fault->event.chip.func_id < hinic3_max_pf_num(dev->hwdev)) {
+ func_id = fault->event.chip.func_id;
+ return send_event_to_dst_pf(adapter, func_id, event);
+ }
+
+ if (event->type == EVENT_COMM_MGMT_WATCHDOG)
+ send_event_to_all_pf(adapter, event);
+ else
+ send_uld_dev_event(adapter, event);
+}
+
+static void uld_def_init(struct hinic3_pcidev *pci_adapter)
+{
+ int type;
+
+ for (type = 0; type < SERVICE_T_MAX; type++) {
+ atomic_set(&pci_adapter->uld_ref_cnt[type], 0);
+ clear_bit(type, &pci_adapter->uld_state);
+ }
+
+ spin_lock_init(&pci_adapter->uld_lock);
+}
+
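+/* BAR layout (as used below): config registers (the BAR index differs
+ * for PF and VF), interrupt registers, management registers (PF only),
+ * and the doorbell/DWQE area.
+ */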
+static int mapping_bar(struct pci_dev *pdev,
+ struct hinic3_pcidev *pci_adapter)
+{
+ int cfg_bar;
+
+ cfg_bar = HINIC3_IS_VF_DEV(pdev) ?
+ HINIC3_VF_PCI_CFG_REG_BAR : HINIC3_PF_PCI_CFG_REG_BAR;
+
+ pci_adapter->cfg_reg_base = pci_ioremap_bar(pdev, cfg_bar);
+ if (!pci_adapter->cfg_reg_base) {
+ sdk_err(&pdev->dev,
+ "Failed to map configuration regs\n");
+ return -ENOMEM;
+ }
+
+ pci_adapter->intr_reg_base = pci_ioremap_bar(pdev,
+ HINIC3_PCI_INTR_REG_BAR);
+ if (!pci_adapter->intr_reg_base) {
+ sdk_err(&pdev->dev,
+ "Failed to map interrupt regs\n");
+ goto map_intr_bar_err;
+ }
+
+ if (!HINIC3_IS_VF_DEV(pdev)) {
+ pci_adapter->mgmt_reg_base =
+ pci_ioremap_bar(pdev, HINIC3_PCI_MGMT_REG_BAR);
+ if (!pci_adapter->mgmt_reg_base) {
+ sdk_err(&pdev->dev,
+ "Failed to map mgmt regs\n");
+ goto map_mgmt_bar_err;
+ }
+ }
+
+ pci_adapter->db_base_phy = pci_resource_start(pdev, HINIC3_PCI_DB_BAR);
+ pci_adapter->db_dwqe_len = pci_resource_len(pdev, HINIC3_PCI_DB_BAR);
+ pci_adapter->db_base = pci_ioremap_bar(pdev, HINIC3_PCI_DB_BAR);
+ if (!pci_adapter->db_base) {
+ sdk_err(&pdev->dev,
+ "Failed to map doorbell regs\n");
+ goto map_db_err;
+ }
+
+ return 0;
+
+map_db_err:
+ if (!HINIC3_IS_VF_DEV(pdev))
+ iounmap(pci_adapter->mgmt_reg_base);
+
+map_mgmt_bar_err:
+ iounmap(pci_adapter->intr_reg_base);
+
+map_intr_bar_err:
+ iounmap(pci_adapter->cfg_reg_base);
+
+ return -ENOMEM;
+}
+
+static void unmapping_bar(struct hinic3_pcidev *pci_adapter)
+{
+ iounmap(pci_adapter->db_base);
+
+ if (!HINIC3_IS_VF_DEV(pci_adapter->pcidev))
+ iounmap(pci_adapter->mgmt_reg_base);
+
+ iounmap(pci_adapter->intr_reg_base);
+ iounmap(pci_adapter->cfg_reg_base);
+}
+
+static int hinic3_pci_init(struct pci_dev *pdev)
+{
+ struct hinic3_pcidev *pci_adapter = NULL;
+ int err;
+
+ pci_adapter = kzalloc(sizeof(*pci_adapter), GFP_KERNEL);
+ if (!pci_adapter) {
+ sdk_err(&pdev->dev,
+ "Failed to alloc pci device adapter\n");
+ return -ENOMEM;
+ }
+ pci_adapter->pcidev = pdev;
+ mutex_init(&pci_adapter->pdev_mutex);
+
+ pci_set_drvdata(pdev, pci_adapter);
+
+ err = pci_enable_device(pdev);
+ if (err) {
+ sdk_err(&pdev->dev, "Failed to enable PCI device\n");
+ goto pci_enable_err;
+ }
+
+ err = pci_request_regions(pdev, HINIC3_DRV_NAME);
+ if (err) {
+ sdk_err(&pdev->dev, "Failed to request regions\n");
+ goto pci_regions_err;
+ }
+
+ pci_enable_pcie_error_reporting(pdev);
+
+ pci_set_master(pdev);
+
+ err = pci_set_dma_mask(pdev, DMA_BIT_MASK(64)); /* 64 bit DMA mask */
+ if (err) {
+ sdk_warn(&pdev->dev, "Couldn't set 64-bit DMA mask\n");
+ err = pci_set_dma_mask(pdev, DMA_BIT_MASK(32)); /* 32 bit DMA mask */
+ if (err) {
+ sdk_err(&pdev->dev, "Failed to set DMA mask\n");
+ goto dma_mask_err;
+ }
+ }
+
+ err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64)); /* 64 bit DMA mask */
+ if (err) {
+ sdk_warn(&pdev->dev,
+ "Couldn't set 64-bit coherent DMA mask\n");
+ err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32)); /* 32 bit DMA mask */
+ if (err) {
+ sdk_err(&pdev->dev,
+ "Failed to set coherent DMA mask\n");
+			goto dma_consistent_mask_err;
+ }
+ }
+
+ return 0;
+
+dma_consistent_mask_err:
+dma_mask_err:
+ pci_clear_master(pdev);
+ pci_disable_pcie_error_reporting(pdev);
+ pci_release_regions(pdev);
+
+pci_regions_err:
+ pci_disable_device(pdev);
+
+pci_enable_err:
+ pci_set_drvdata(pdev, NULL);
+ kfree(pci_adapter);
+
+ return err;
+}
+
+static void hinic3_pci_deinit(struct pci_dev *pdev)
+{
+ struct hinic3_pcidev *pci_adapter = pci_get_drvdata(pdev);
+
+ pci_clear_master(pdev);
+ pci_release_regions(pdev);
+ pci_disable_pcie_error_reporting(pdev);
+ pci_disable_device(pdev);
+ pci_set_drvdata(pdev, NULL);
+ kfree(pci_adapter);
+}
+
+#ifdef CONFIG_X86
+/**
+ * cfg_order_reg - on Haswell or Broadwell CPUs, configure the DMA order
+ * register to zero
+ * @pci_adapter: pci_adapter
+ */
+/*lint -save -e40 */
+static void cfg_order_reg(struct hinic3_pcidev *pci_adapter)
+{
+ u8 cpu_model[] = {0x3c, 0x3f, 0x45, 0x46, 0x3d, 0x47, 0x4f, 0x56};
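+	/* Intel family 6 model IDs, assumed to map to Haswell
+	 * (0x3c/0x3f/0x45/0x46) and Broadwell (0x3d/0x47/0x4f/0x56) parts.
+	 */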
+ struct cpuinfo_x86 *cpuinfo = NULL;
+ u32 i;
+
+ if (hinic3_func_type(pci_adapter->hwdev) == TYPE_VF)
+ return;
+
+ cpuinfo = &cpu_data(0);
+ for (i = 0; i < sizeof(cpu_model); i++) {
+ if (cpu_model[i] == cpuinfo->x86_model)
+ hinic3_set_pcie_order_cfg(pci_adapter->hwdev);
+ }
+}
+
+/*lint -restore*/
+#endif
+
+static int hinic3_func_init(struct pci_dev *pdev, struct hinic3_pcidev *pci_adapter)
+{
+ struct hinic3_init_para init_para = {0};
+ bool cqm_init_en = false;
+ int err;
+
+ init_para.adapter_hdl = pci_adapter;
+ init_para.pcidev_hdl = pdev;
+ init_para.dev_hdl = &pdev->dev;
+ init_para.cfg_reg_base = pci_adapter->cfg_reg_base;
+ init_para.intr_reg_base = pci_adapter->intr_reg_base;
+ init_para.mgmt_reg_base = pci_adapter->mgmt_reg_base;
+ init_para.db_base = pci_adapter->db_base;
+ init_para.db_base_phy = pci_adapter->db_base_phy;
+ init_para.db_dwqe_len = pci_adapter->db_dwqe_len;
+ init_para.hwdev = &pci_adapter->hwdev;
+ init_para.chip_node = pci_adapter->chip_node;
+ init_para.probe_fault_level = pci_adapter->probe_fault_level;
+ err = hinic3_init_hwdev(&init_para);
+ if (err) {
+ pci_adapter->hwdev = NULL;
+ pci_adapter->probe_fault_level = init_para.probe_fault_level;
+ sdk_err(&pdev->dev, "Failed to initialize hardware device\n");
+ return -EFAULT;
+ }
+
+ cqm_init_en = hinic3_need_init_stateful_default(pci_adapter->hwdev);
+ if (cqm_init_en) {
+ err = hinic3_stateful_init(pci_adapter->hwdev);
+ if (err) {
+ sdk_err(&pdev->dev, "Failed to init stateful\n");
+ goto stateful_init_err;
+ }
+ }
+
+ pci_adapter->lld_dev.pdev = pdev;
+
+ pci_adapter->lld_dev.hwdev = pci_adapter->hwdev;
+ if (hinic3_func_type(pci_adapter->hwdev) != TYPE_VF)
+ set_bit(HINIC3_FUNC_PERSENT, &pci_adapter->sriov_info.state);
+
+ hinic3_event_register(pci_adapter->hwdev, pci_adapter,
+ hinic3_event_process);
+
+ if (hinic3_func_type(pci_adapter->hwdev) != TYPE_VF)
+ hinic3_sync_time_to_fmw(pci_adapter);
+
+ /* dbgtool init */
+ lld_lock_chip_node();
+ err = nictool_k_init(pci_adapter->hwdev, pci_adapter->chip_node);
+ if (err) {
+ lld_unlock_chip_node();
+ sdk_err(&pdev->dev, "Failed to initialize dbgtool\n");
+ goto nictool_init_err;
+ }
+ list_add_tail(&pci_adapter->node, &pci_adapter->chip_node->func_list);
+ lld_unlock_chip_node();
+
+ if (!disable_attach) {
+ attach_ulds(pci_adapter);
+
+ if (hinic3_func_type(pci_adapter->hwdev) != TYPE_VF) {
+ err = sysfs_create_group(&pdev->dev.kobj,
+ &hinic3_attr_group);
+ if (err) {
+ sdk_err(&pdev->dev, "Failed to create sysfs group\n");
+ goto create_sysfs_err;
+ }
+ }
+
+#ifdef CONFIG_X86
+ cfg_order_reg(pci_adapter);
+#endif
+ }
+
+ return 0;
+
+create_sysfs_err:
+ detach_ulds(pci_adapter);
+
+ lld_lock_chip_node();
+ list_del(&pci_adapter->node);
+ lld_unlock_chip_node();
+
+ wait_lld_dev_unused(pci_adapter);
+
+ lld_lock_chip_node();
+ nictool_k_uninit(pci_adapter->hwdev, pci_adapter->chip_node);
+ lld_unlock_chip_node();
+
+nictool_init_err:
+ hinic3_event_unregister(pci_adapter->hwdev);
+ if (cqm_init_en)
+ hinic3_stateful_deinit(pci_adapter->hwdev);
+stateful_init_err:
+ hinic3_free_hwdev(pci_adapter->hwdev);
+
+ return err;
+}
+
+static void hinic3_func_deinit(struct pci_dev *pdev)
+{
+ struct hinic3_pcidev *pci_adapter = pci_get_drvdata(pdev);
+
+	/* When the function is deinitialized, first disable mgmt event
+	 * reporting, then flush the mgmt work queue.
+	 */
+ hinic3_disable_mgmt_msg_report(pci_adapter->hwdev);
+
+ hinic3_flush_mgmt_workq(pci_adapter->hwdev);
+
+ lld_lock_chip_node();
+ list_del(&pci_adapter->node);
+ lld_unlock_chip_node();
+
+ detach_ulds(pci_adapter);
+
+ wait_lld_dev_unused(pci_adapter);
+
+ lld_lock_chip_node();
+ nictool_k_uninit(pci_adapter->hwdev, pci_adapter->chip_node);
+ lld_unlock_chip_node();
+
+ hinic3_event_unregister(pci_adapter->hwdev);
+
+ hinic3_free_stateful(pci_adapter->hwdev);
+
+ hinic3_free_hwdev(pci_adapter->hwdev);
+}
+
+static void wait_sriov_cfg_complete(struct hinic3_pcidev *pci_adapter)
+{
+ struct hinic3_sriov_info *sriov_info;
+ unsigned long end;
+
+ sriov_info = &pci_adapter->sriov_info;
+ clear_bit(HINIC3_FUNC_PERSENT, &sriov_info->state);
+ usleep_range(9900, 10000); /* sleep 9900 us ~ 10000 us */
+
+ end = jiffies + msecs_to_jiffies(HINIC3_WAIT_SRIOV_CFG_TIMEOUT);
+ do {
+ if (!test_bit(HINIC3_SRIOV_ENABLE, &sriov_info->state) &&
+ !test_bit(HINIC3_SRIOV_DISABLE, &sriov_info->state))
+ return;
+
+ usleep_range(9900, 10000); /* sleep 9900 us ~ 10000 us */
+ } while (time_before(jiffies, end));
+}
+
+bool hinic3_get_vf_load_state(struct pci_dev *pdev)
+{
+ struct hinic3_pcidev *pci_adapter = NULL;
+ struct pci_dev *pf_pdev = NULL;
+
+ if (!pdev) {
+ pr_err("pdev is null.\n");
+ return false;
+ }
+
+	/* a VF passed through to a VM sits on the root bus */
+ if (pci_is_root_bus(pdev->bus))
+ return false;
+
+ if (pdev->is_virtfn)
+ pf_pdev = pdev->physfn;
+ else
+ pf_pdev = pdev;
+
+ pci_adapter = pci_get_drvdata(pf_pdev);
+ if (!pci_adapter) {
+ sdk_err(&pdev->dev, "pci_adapter is null.\n");
+ return false;
+ }
+
+ return !pci_adapter->disable_vf_load;
+}
+
+int hinic3_set_vf_load_state(struct pci_dev *pdev, bool vf_load_state)
+{
+ struct hinic3_pcidev *pci_adapter = NULL;
+
+ if (!pdev) {
+ pr_err("pdev is null.\n");
+ return -EINVAL;
+ }
+
+ pci_adapter = pci_get_drvdata(pdev);
+ if (!pci_adapter) {
+ sdk_err(&pdev->dev, "pci_adapter is null.\n");
+ return -EINVAL;
+ }
+
+ if (hinic3_func_type(pci_adapter->hwdev) == TYPE_VF)
+ return 0;
+
+ pci_adapter->disable_vf_load = !vf_load_state;
+	sdk_info(&pci_adapter->pcidev->dev, "VF load in host %s for current function\n",
+		 vf_load_state ? "enabled" : "disabled");
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_set_vf_load_state);
+
+bool hinic3_get_vf_service_load(struct pci_dev *pdev, u16 service)
+{
+ struct hinic3_pcidev *pci_adapter = NULL;
+ struct pci_dev *pf_pdev = NULL;
+
+ if (!pdev) {
+ pr_err("pdev is null.\n");
+ return false;
+ }
+
+ if (pdev->is_virtfn)
+ pf_pdev = pdev->physfn;
+ else
+ pf_pdev = pdev;
+
+ pci_adapter = pci_get_drvdata(pf_pdev);
+ if (!pci_adapter) {
+ sdk_err(&pdev->dev, "pci_adapter is null.\n");
+ return false;
+ }
+
+ if (service >= SERVICE_T_MAX) {
+		sdk_err(&pdev->dev, "service_type = %u is invalid\n",
+ service);
+ return false;
+ }
+
+ return !pci_adapter->disable_srv_load[service];
+}
+
+int hinic3_set_vf_service_load(struct pci_dev *pdev, u16 service,
+ bool vf_srv_load)
+{
+ struct hinic3_pcidev *pci_adapter = NULL;
+
+ if (!pdev) {
+ pr_err("pdev is null.\n");
+ return -EINVAL;
+ }
+
+ if (service >= SERVICE_T_MAX) {
+		sdk_err(&pdev->dev, "service_type = %u is invalid\n",
+ service);
+ return -EFAULT;
+ }
+
+ pci_adapter = pci_get_drvdata(pdev);
+ if (!pci_adapter) {
+ sdk_err(&pdev->dev, "pci_adapter is null.\n");
+ return -EINVAL;
+ }
+
+ if (hinic3_func_type(pci_adapter->hwdev) == TYPE_VF)
+ return 0;
+
+ pci_adapter->disable_srv_load[service] = !vf_srv_load;
+	sdk_info(&pci_adapter->pcidev->dev, "VF service load in host %s for current function\n",
+		 vf_srv_load ? "enabled" : "disabled");
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_set_vf_service_load);
+
+static int hinic3_remove_func(struct hinic3_pcidev *pci_adapter)
+{
+ struct pci_dev *pdev = pci_adapter->pcidev;
+
+ mutex_lock(&pci_adapter->pdev_mutex);
+ if (pci_adapter->lld_state != HINIC3_PROBE_OK) {
+		sdk_warn(&pdev->dev, "Current function does not need to be removed\n");
+ mutex_unlock(&pci_adapter->pdev_mutex);
+ return 0;
+ }
+ pci_adapter->lld_state = HINIC3_IN_REMOVE;
+ mutex_unlock(&pci_adapter->pdev_mutex);
+
+ hinic3_detect_hw_present(pci_adapter->hwdev);
+
+ hisdk3_remove_pre_process(pci_adapter->hwdev);
+
+ if (hinic3_func_type(pci_adapter->hwdev) != TYPE_VF) {
+ sysfs_remove_group(&pdev->dev.kobj, &hinic3_attr_group);
+ wait_sriov_cfg_complete(pci_adapter);
+ hinic3_pci_sriov_disable(pdev);
+ }
+
+ hinic3_func_deinit(pdev);
+
+ lld_lock_chip_node();
+ free_chip_node(pci_adapter);
+ lld_unlock_chip_node();
+
+ unmapping_bar(pci_adapter);
+
+ mutex_lock(&pci_adapter->pdev_mutex);
+ pci_adapter->lld_state = HINIC3_NOT_PROBE;
+ mutex_unlock(&pci_adapter->pdev_mutex);
+
+	sdk_info(&pdev->dev, "PCIe device function removed\n");
+
+ return 0;
+}
+
+static void hinic3_remove(struct pci_dev *pdev)
+{
+ struct hinic3_pcidev *pci_adapter = pci_get_drvdata(pdev);
+
+ if (!pci_adapter)
+ return;
+
+	sdk_info(&pdev->dev, "PCIe device remove begin\n");
+
+ hinic3_remove_func(pci_adapter);
+
+ hinic3_pci_deinit(pdev);
+ hinic3_probe_pre_unprocess(pdev);
+
+	sdk_info(&pdev->dev, "PCIe device removed\n");
+}
+
+static int probe_func_param_init(struct hinic3_pcidev *pci_adapter)
+{
+ struct pci_dev *pdev = NULL;
+
+ if (!pci_adapter)
+ return -EFAULT;
+
+ pdev = pci_adapter->pcidev;
+ if (!pdev)
+ return -EFAULT;
+
+ mutex_lock(&pci_adapter->pdev_mutex);
+ if (pci_adapter->lld_state >= HINIC3_PROBE_START) {
+		sdk_warn(&pdev->dev, "Do not probe repeatedly\n");
+ mutex_unlock(&pci_adapter->pdev_mutex);
+ return 0;
+ }
+ pci_adapter->lld_state = HINIC3_PROBE_START;
+ mutex_unlock(&pci_adapter->pdev_mutex);
+
+ return 0;
+}
+
+static int hinic3_probe_func(struct hinic3_pcidev *pci_adapter)
+{
+ struct pci_dev *pdev = pci_adapter->pcidev;
+ int err;
+
+ err = probe_func_param_init(pci_adapter);
+ if (err)
+ return err;
+
+ err = mapping_bar(pdev, pci_adapter);
+ if (err) {
+ sdk_err(&pdev->dev, "Failed to map bar\n");
+ goto map_bar_failed;
+ }
+
+ uld_def_init(pci_adapter);
+
+	/* if chip information for this PCIe function exists, add the function to the chip node */
+ lld_lock_chip_node();
+ err = alloc_chip_node(pci_adapter);
+ if (err) {
+ lld_unlock_chip_node();
+ sdk_err(&pdev->dev, "Failed to add new chip node to global list\n");
+ goto alloc_chip_node_fail;
+ }
+ lld_unlock_chip_node();
+
+ err = hinic3_func_init(pdev, pci_adapter);
+ if (err)
+ goto func_init_err;
+
+ if (hinic3_func_type(pci_adapter->hwdev) != TYPE_VF) {
+ err = hinic3_set_bdf_ctxt(pci_adapter->hwdev, pdev->bus->number,
+ PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
+ if (err) {
+ sdk_err(&pdev->dev, "Failed to set BDF info to MPU\n");
+ goto set_bdf_err;
+ }
+ }
+
+ hinic3_probe_success(pci_adapter->hwdev);
+
+ mutex_lock(&pci_adapter->pdev_mutex);
+ pci_adapter->lld_state = HINIC3_PROBE_OK;
+ mutex_unlock(&pci_adapter->pdev_mutex);
+
+ return 0;
+
+set_bdf_err:
+ hinic3_func_deinit(pdev);
+
+func_init_err:
+ lld_lock_chip_node();
+ free_chip_node(pci_adapter);
+ lld_unlock_chip_node();
+
+alloc_chip_node_fail:
+ unmapping_bar(pci_adapter);
+
+map_bar_failed:
+	sdk_err(&pdev->dev, "PCIe device probe function failed\n");
+ return err;
+}
+
+static int hinic3_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct hinic3_pcidev *pci_adapter = NULL;
+ u16 probe_fault_level = FAULT_LEVEL_SERIOUS_FLR;
+ int err;
+
+	sdk_info(&pdev->dev, "PCIe device probe begin\n");
+
+ err = hinic3_probe_pre_process(pdev);
+ if (err != 0 && err != HINIC3_NOT_PROBE)
+ goto out;
+
+ if (err == HINIC3_NOT_PROBE)
+ return 0;
+
+ err = hinic3_pci_init(pdev);
+ if (err)
+ goto pci_init_err;
+
+ pci_adapter = pci_get_drvdata(pdev);
+ pci_adapter->disable_vf_load = disable_vf_load;
+ pci_adapter->id = *id;
+ pci_adapter->lld_state = HINIC3_NOT_PROBE;
+ pci_adapter->probe_fault_level = probe_fault_level;
+ lld_dev_cnt_init(pci_adapter);
+
+ if (pdev->is_virtfn && (!hinic3_get_vf_load_state(pdev))) {
+		sdk_info(&pdev->dev, "VF device load is disabled in host\n");
+ return 0;
+ }
+
+ err = hinic3_probe_func(pci_adapter);
+ if (err)
+ goto hinic3_probe_func_fail;
+
+	sdk_info(&pdev->dev, "PCIe device probed\n");
+ return 0;
+
+hinic3_probe_func_fail:
+ probe_fault_level = pci_adapter->probe_fault_level;
+ hinic3_pci_deinit(pdev);
+
+pci_init_err:
+ hinic3_probe_pre_unprocess(pdev);
+
+out:
+ hinic3_probe_fault_process(pdev, probe_fault_level);
+	sdk_err(&pdev->dev, "PCIe device probe failed\n");
+ return err;
+}
+
+static int hinic3_get_pf_info(struct pci_dev *pdev, u16 service,
+ struct hinic3_hw_pf_infos **pf_infos)
+{
+ struct hinic3_pcidev *dev = pci_get_drvdata(pdev);
+ int err;
+
+	if (service >= SERVICE_T_MAX) {
+		sdk_err(&pdev->dev, "Current VF does not support setting service_type = %u state in host\n",
+			service);
+		return -EFAULT;
+	}
+
+	*pf_infos = kzalloc(sizeof(struct hinic3_hw_pf_infos), GFP_KERNEL);
+	if (!(*pf_infos))
+		return -ENOMEM;
+
+	err = hinic3_get_hw_pf_infos(dev->hwdev, *pf_infos, HINIC3_CHANNEL_COMM);
+	if (err) {
+		kfree(*pf_infos);
+		sdk_err(&pdev->dev, "Get chip pf info failed, ret %d\n", err);
+		return -EFAULT;
+	}
+
+ return 0;
+}
+
+static int hinic3_set_func_en(struct pci_dev *des_pdev, struct hinic3_pcidev *dst_dev,
+ bool en, u16 vf_func_id)
+{
+ int err;
+
+ /* unload invalid vf func id */
+ if (!en && vf_func_id != hinic3_global_func_id(dst_dev->hwdev) &&
+ !strcmp(des_pdev->driver->name, HINIC3_DRV_NAME)) {
+ pr_err("dst_dev func id:%u, vf_func_id:%u\n",
+ hinic3_global_func_id(dst_dev->hwdev), vf_func_id);
+ mutex_unlock(&dst_dev->pdev_mutex);
+ return -EFAULT;
+ }
+
+ if (!en && dst_dev->lld_state == HINIC3_PROBE_OK) {
+ mutex_unlock(&dst_dev->pdev_mutex);
+ hinic3_remove_func(dst_dev);
+ } else if (en && dst_dev->lld_state == HINIC3_NOT_PROBE) {
+ mutex_unlock(&dst_dev->pdev_mutex);
+ err = hinic3_probe_func(dst_dev);
+ if (err)
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+static int get_vf_service_state_param(struct pci_dev *pdev, struct hinic3_pcidev **dev_ptr,
+ u16 service, struct hinic3_hw_pf_infos **pf_infos)
+{
+ int err;
+
+ if (!pdev)
+ return -EINVAL;
+
+ *dev_ptr = pci_get_drvdata(pdev);
+ if (!(*dev_ptr))
+ return -EINVAL;
+
+ err = hinic3_get_pf_info(pdev, service, pf_infos);
+ if (err)
+ return err;
+
+ return 0;
+}
+
+#define BUS_MAX_DEV_NUM 256
+static int hinic3_dst_pdev_valid(struct hinic3_pcidev *dst_dev, struct pci_dev **des_pdev_ptr,
+ u16 vf_devfn, bool en)
+{
+ u16 bus;
+
+ bus = dst_dev->pcidev->bus->number + vf_devfn / BUS_MAX_DEV_NUM;
+ *des_pdev_ptr = pci_get_domain_bus_and_slot(pci_domain_nr(dst_dev->pcidev->bus),
+ bus, vf_devfn % BUS_MAX_DEV_NUM);
+ if (!(*des_pdev_ptr)) {
+ pr_err("des_pdev is NULL\n");
+ return -EFAULT;
+ }
+
+ if ((*des_pdev_ptr)->driver == NULL) {
+ pr_err("des_pdev_ptr->driver is NULL\n");
+ return -EFAULT;
+ }
+
+	/* OVS SR-IOV hardware scenario: return an error when the VF is bound to vf_io. */
+ if ((!en && strcmp((*des_pdev_ptr)->driver->name, HINIC3_DRV_NAME))) {
+ pr_err("vf bind driver:%s\n", (*des_pdev_ptr)->driver->name);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+static int parameter_is_unexpected(struct hinic3_pcidev *dst_dev, u16 *func_id, u16 *vf_start,
+				   u16 *vf_end, u16 vf_func_id)
+{
+ if (hinic3_func_type(dst_dev->hwdev) == TYPE_VF)
+ return -EPERM;
+
+ *func_id = hinic3_global_func_id(dst_dev->hwdev);
+ *vf_start = hinic3_glb_pf_vf_offset(dst_dev->hwdev) + 1;
+ *vf_end = *vf_start + hinic3_func_max_vf(dst_dev->hwdev);
+ if (vf_func_id < *vf_start || vf_func_id > *vf_end)
+ return -EPERM;
+
+ return 0;
+}
+
+int hinic3_set_vf_service_state(struct pci_dev *pdev, u16 vf_func_id, u16 service, bool en)
+{
+ struct hinic3_hw_pf_infos *pf_infos = NULL;
+ struct hinic3_pcidev *dev = NULL, *dst_dev = NULL;
+ struct pci_dev *des_pdev = NULL;
+ u16 vf_start, vf_end, vf_devfn, func_id;
+ int err;
+ bool find_dst_dev = false;
+
+ err = get_vf_service_state_param(pdev, &dev, service, &pf_infos);
+ if (err)
+ return err;
+
+ lld_hold();
+ list_for_each_entry(dst_dev, &dev->chip_node->func_list, node) {
+		if (parameter_is_unexpected(dst_dev, &func_id, &vf_start, &vf_end, vf_func_id))
+ continue;
+
+ vf_devfn = pf_infos->infos[func_id].vf_offset + (vf_func_id - vf_start) +
+ (u16)dst_dev->pcidev->devfn;
+ err = hinic3_dst_pdev_valid(dst_dev, &des_pdev, vf_devfn, en);
+ if (err) {
+			sdk_err(&pdev->dev, "Cannot get vf func_id %u from pf %u\n",
+ vf_func_id, func_id);
+ lld_put();
+ goto free_pf_info;
+ }
+
+ dst_dev = pci_get_drvdata(des_pdev);
+		/* When enabling the VF, return OK if it is bound to vf-io */
+ if (strcmp(des_pdev->driver->name, HINIC3_DRV_NAME) ||
+ !dst_dev || (!en && dst_dev->lld_state != HINIC3_PROBE_OK) ||
+ (en && dst_dev->lld_state != HINIC3_NOT_PROBE)) {
+ lld_put();
+ goto free_pf_info;
+ }
+
+ if (en)
+ pci_dev_put(des_pdev);
+ mutex_lock(&dst_dev->pdev_mutex);
+ find_dst_dev = true;
+ break;
+ }
+ lld_put();
+
+ if (!find_dst_dev) {
+ err = -EFAULT;
+		sdk_err(&pdev->dev, "Invalid parameter vf_id %u\n", vf_func_id);
+ goto free_pf_info;
+ }
+
+ err = hinic3_set_func_en(des_pdev, dst_dev, en, vf_func_id);
+
+free_pf_info:
+ kfree(pf_infos);
+ return err;
+}
+EXPORT_SYMBOL(hinic3_set_vf_service_state);
+
+/*lint -save -e133 -e10*/
+static const struct pci_device_id hinic3_pci_table[] = {
+ {PCI_VDEVICE(HUAWEI, HINIC3_DEV_ID_SPU), 0},
+ {PCI_VDEVICE(HUAWEI, HINIC3_DEV_ID_STANDARD), 0},
+ {PCI_VDEVICE(HUAWEI, HINIC3_DEV_ID_SDI_5_1_PF), 0},
+ {PCI_VDEVICE(HUAWEI, HINIC3_DEV_ID_SDI_5_0_PF), 0},
+ {PCI_VDEVICE(HUAWEI, HINIC3_DEV_ID_VF), 0},
+	{0, 0}
+};
+
+/*lint -restore*/
+
+MODULE_DEVICE_TABLE(pci, hinic3_pci_table);
+
+/**
+ * hinic3_io_error_detected - called when PCI error is detected
+ * @pdev: Pointer to PCI device
+ * @state: The current pci connection state
+ *
+ * This function is called after a PCI bus error affecting
+ * this device has been detected.
+ *
+ * Since we only need error detection, not error handling, we
+ * always return PCI_ERS_RESULT_CAN_RECOVER to tell the AER
+ * driver that we don't need a reset (error handling).
+ */
+static pci_ers_result_t hinic3_io_error_detected(struct pci_dev *pdev,
+ pci_channel_state_t state)
+{
+ struct hinic3_pcidev *pci_adapter = NULL;
+
+ sdk_err(&pdev->dev,
+		"Uncorrectable error detected, logging and cleaning up error status: 0x%08x\n",
+ state);
+
+ pci_cleanup_aer_uncorrect_error_status(pdev);
+ pci_adapter = pci_get_drvdata(pdev);
+ if (pci_adapter)
+ hinic3_record_pcie_error(pci_adapter->hwdev);
+
+ return PCI_ERS_RESULT_CAN_RECOVER;
+}
+
+static void hinic3_shutdown(struct pci_dev *pdev)
+{
+ struct hinic3_pcidev *pci_adapter = pci_get_drvdata(pdev);
+
+ sdk_info(&pdev->dev, "Shutdown device\n");
+
+ if (pci_adapter)
+ hinic3_shutdown_hwdev(pci_adapter->hwdev);
+
+ pci_disable_device(pdev);
+
+ if (pci_adapter)
+ hinic3_set_api_stop(pci_adapter->hwdev);
+}
+
+#ifdef HAVE_RHEL6_SRIOV_CONFIGURE
+static struct pci_driver_rh hinic3_driver_rh = {
+ .sriov_configure = hinic3_pci_sriov_configure,
+};
+#endif
+
+/* Because we only need error detection, not error handling, the
+ * error_detected callback alone is enough.
+ */
+static struct pci_error_handlers hinic3_err_handler = {
+ .error_detected = hinic3_io_error_detected,
+};
+
+static struct pci_driver hinic3_driver = {
+ .name = HINIC3_DRV_NAME,
+ .id_table = hinic3_pci_table,
+ .probe = hinic3_probe,
+ .remove = hinic3_remove,
+ .shutdown = hinic3_shutdown,
+#if defined(HAVE_SRIOV_CONFIGURE)
+ .sriov_configure = hinic3_pci_sriov_configure,
+#elif defined(HAVE_RHEL6_SRIOV_CONFIGURE)
+ .rh_reserved = &hinic3_driver_rh,
+#endif
+ .err_handler = &hinic3_err_handler
+};
+
+int hinic3_lld_init(void)
+{
+ int err;
+
+ pr_info("%s - version %s\n", HINIC3_DRV_DESC, HINIC3_DRV_VERSION);
+ memset(g_uld_info, 0, sizeof(g_uld_info));
+
+ hinic3_lld_lock_init();
+ hinic3_uld_lock_init();
+
+ err = hinic3_module_pre_init();
+ if (err) {
+		pr_err("Module pre-initialization failed\n");
+ return err;
+ }
+
+ err = pci_register_driver(&hinic3_driver);
+ if (err) {
+ hinic3_module_post_exit();
+ return err;
+ }
+
+ return 0;
+}
+
+void hinic3_lld_exit(void)
+{
+ pci_unregister_driver(&hinic3_driver);
+
+	hinic3_module_post_exit();
+}
+
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_mbox.c b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_mbox.c
new file mode 100644
index 000000000000..f23910d53573
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_mbox.c
@@ -0,0 +1,1841 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [COMM]" fmt
+
+#include <linux/pci.h>
+#include <linux/delay.h>
+#include <linux/types.h>
+#include <linux/semaphore.h>
+#include <linux/spinlock.h>
+#include <linux/workqueue.h>
+
+#include "ossl_knl.h"
+#include "hinic3_hw.h"
+#include "hinic3_hwdev.h"
+#include "hinic3_csr.h"
+#include "hinic3_hwif.h"
+#include "hinic3_eqs.h"
+#include "hinic3_prof_adap.h"
+#include "hinic3_common.h"
+#include "hinic3_mbox.h"
+
+#define HINIC3_MBOX_INT_DST_AEQN_SHIFT 10
+#define HINIC3_MBOX_INT_SRC_RESP_AEQN_SHIFT 12
+#define HINIC3_MBOX_INT_STAT_DMA_SHIFT 14
+/* The size of the data to be sent (in units of 4 bytes) */
+#define HINIC3_MBOX_INT_TX_SIZE_SHIFT 20
+/* SO_RO(strong order, relax order) */
+#define HINIC3_MBOX_INT_STAT_DMA_SO_RO_SHIFT 25
+#define HINIC3_MBOX_INT_WB_EN_SHIFT 28
+
+#define HINIC3_MBOX_INT_DST_AEQN_MASK 0x3
+#define HINIC3_MBOX_INT_SRC_RESP_AEQN_MASK 0x3
+#define HINIC3_MBOX_INT_STAT_DMA_MASK 0x3F
+#define HINIC3_MBOX_INT_TX_SIZE_MASK 0x1F
+#define HINIC3_MBOX_INT_STAT_DMA_SO_RO_MASK 0x3
+#define HINIC3_MBOX_INT_WB_EN_MASK 0x1
+
+#define HINIC3_MBOX_INT_SET(val, field) \
+ (((val) & HINIC3_MBOX_INT_##field##_MASK) << \
+ HINIC3_MBOX_INT_##field##_SHIFT)
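+/* Illustrative composition (not a required programming sequence): for
+ * destination AEQ 0, a 16-dword payload and write-back enabled,
+ *
+ *	val = HINIC3_MBOX_INT_SET(0, DST_AEQN) |
+ *	      HINIC3_MBOX_INT_SET(16, TX_SIZE) |
+ *	      HINIC3_MBOX_INT_SET(WRITE_BACK, WB_EN);
+ *
+ * where HINIC3_MBOX_INT_SET(16, TX_SIZE) expands to (16 & 0x1F) << 20.
+ */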
+
+enum hinic3_mbox_tx_status {
+ TX_NOT_DONE = 1,
+};
+
+#define HINIC3_MBOX_CTRL_TRIGGER_AEQE_SHIFT 0
+/* Indicates the Tx status of the message data:
+ * 0 - Tx request is done;
+ * 1 - Tx request is in progress.
+ */
+#define HINIC3_MBOX_CTRL_TX_STATUS_SHIFT 1
+#define HINIC3_MBOX_CTRL_DST_FUNC_SHIFT 16
+
+#define HINIC3_MBOX_CTRL_TRIGGER_AEQE_MASK 0x1
+#define HINIC3_MBOX_CTRL_TX_STATUS_MASK 0x1
+#define HINIC3_MBOX_CTRL_DST_FUNC_MASK 0x1FFF
+
+#define HINIC3_MBOX_CTRL_SET(val, field) \
+ (((val) & HINIC3_MBOX_CTRL_##field##_MASK) << \
+ HINIC3_MBOX_CTRL_##field##_SHIFT)
+
+#define MBOX_SEGLEN_MASK \
+ HINIC3_MSG_HEADER_SET(HINIC3_MSG_HEADER_SEG_LEN_MASK, SEG_LEN)
+
+#define MBOX_MSG_POLLING_TIMEOUT 8000
+#define HINIC3_MBOX_COMP_TIME 40000U
+
+#define MBOX_MAX_BUF_SZ 2048U
+#define MBOX_HEADER_SZ 8
+#define HINIC3_MBOX_DATA_SIZE (MBOX_MAX_BUF_SZ - MBOX_HEADER_SZ)
+
+/* MBOX size is 64B, 8B for mbox_header, 8B reserved */
+#define MBOX_SEG_LEN 48
+#define MBOX_SEG_LEN_ALIGN 4
+#define MBOX_WB_STATUS_LEN 16UL
+
+#define SEQ_ID_START_VAL 0
+#define SEQ_ID_MAX_VAL 42
+#define MBOX_LAST_SEG_MAX_LEN (MBOX_MAX_BUF_SZ - \
+ SEQ_ID_MAX_VAL * MBOX_SEG_LEN)
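+/* With the values above: segments with seq_id 0..41 carry MBOX_SEG_LEN
+ * (48B) each, i.e. 42 * 48 = 2016B, so the final segment (seq_id 42)
+ * carries at most 2048 - 2016 = 32 bytes of a MBOX_MAX_BUF_SZ message.
+ */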
+
+/* mbox write back status is 16B, only first 4B is used */
+#define MBOX_WB_STATUS_ERRCODE_MASK 0xFFFF
+#define MBOX_WB_STATUS_MASK 0xFF
+#define MBOX_WB_ERROR_CODE_MASK 0xFF00
+#define MBOX_WB_STATUS_FINISHED_SUCCESS 0xFF
+#define MBOX_WB_STATUS_FINISHED_WITH_ERR 0xFE
+#define MBOX_WB_STATUS_NOT_FINISHED 0x00
+
+#define MBOX_STATUS_FINISHED(wb) \
+ (((wb) & MBOX_WB_STATUS_MASK) != MBOX_WB_STATUS_NOT_FINISHED)
+#define MBOX_STATUS_SUCCESS(wb) \
+ (((wb) & MBOX_WB_STATUS_MASK) == MBOX_WB_STATUS_FINISHED_SUCCESS)
+#define MBOX_STATUS_ERRCODE(wb) \
+ ((wb) & MBOX_WB_ERROR_CODE_MASK)
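+/* Example decode (illustrative): for wb = 0xAB12FE,
+ * MBOX_STATUS_FINISHED(wb) is true (low byte 0xFE != 0x00),
+ * MBOX_STATUS_SUCCESS(wb) is false (low byte is not 0xFF) and
+ * MBOX_STATUS_ERRCODE(wb) yields 0x1200 (bits 8..15).
+ */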
+
+#define DST_AEQ_IDX_DEFAULT_VAL 0
+#define SRC_AEQ_IDX_DEFAULT_VAL 0
+#define NO_DMA_ATTRIBUTE_VAL 0
+
+#define MBOX_MSG_NO_DATA_LEN 1
+
+#define MBOX_BODY_FROM_HDR(header) ((u8 *)(header) + MBOX_HEADER_SZ)
+#define MBOX_AREA(hwif) \
+ ((hwif)->cfg_regs_base + HINIC3_FUNC_CSR_MAILBOX_DATA_OFF)
+
+#define MBOX_DMA_MSG_QUEUE_DEPTH 32
+
+#define MBOX_MQ_CI_OFFSET (HINIC3_CFG_REGS_FLAG + HINIC3_FUNC_CSR_MAILBOX_DATA_OFF + \
+ MBOX_HEADER_SZ + MBOX_SEG_LEN)
+
+#define MBOX_MQ_SYNC_CI_SHIFT 0
+#define MBOX_MQ_ASYNC_CI_SHIFT 8
+
+#define MBOX_MQ_SYNC_CI_MASK 0xFF
+#define MBOX_MQ_ASYNC_CI_MASK 0xFF
+
+#define MBOX_MQ_CI_SET(val, field) \
+ (((val) & MBOX_MQ_##field##_CI_MASK) << MBOX_MQ_##field##_CI_SHIFT)
+#define MBOX_MQ_CI_GET(val, field) \
+ (((val) >> MBOX_MQ_##field##_CI_SHIFT) & MBOX_MQ_##field##_CI_MASK)
+#define MBOX_MQ_CI_CLEAR(val, field) \
+ ((val) & (~(MBOX_MQ_##field##_CI_MASK << MBOX_MQ_##field##_CI_SHIFT)))
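+/* Illustrative update of only the sync CI field, which shares one
+ * register word with the async CI:
+ *
+ *	val = MBOX_MQ_CI_CLEAR(val, SYNC);
+ *	val |= MBOX_MQ_CI_SET(new_ci, SYNC);
+ */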
+
+#define IS_PF_OR_PPF_SRC(hwdev, src_func_idx) \
+ ((src_func_idx) < HINIC3_MAX_PF_NUM(hwdev))
+
+#define MBOX_RESPONSE_ERROR 0x1
+#define MBOX_MSG_ID_MASK 0xF
+#define MBOX_MSG_ID(func_to_func) ((func_to_func)->send_msg_id)
+#define MBOX_MSG_ID_INC(func_to_func) \
+ (MBOX_MSG_ID(func_to_func) = \
+ (MBOX_MSG_ID(func_to_func) + 1) & MBOX_MSG_ID_MASK)
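+/* send msg ids advance modulo MBOX_MSG_ID_MASK + 1 = 16,
+ * e.g. ..., 14, 15, 0, 1, ...
+ */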
+
+/* max number of messages waiting to be processed for one function */
+#define HINIC3_MAX_MSG_CNT_TO_PROCESS 10
+
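+/* A message channel counts as stopped only when channel locking is
+ * enabled and the current channel's bit is set in channel_stop.
+ */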
+#define MBOX_MSG_CHANNEL_STOP(func_to_func) \
+ ((((func_to_func)->lock_channel_en) && \
+ test_bit((func_to_func)->cur_msg_channel, \
+ &(func_to_func)->channel_stop)) ? true : false)
+
+enum mbox_ordering_type {
+ STRONG_ORDER,
+};
+
+enum mbox_write_back_type {
+ WRITE_BACK = 1,
+};
+
+enum mbox_aeq_trig_type {
+ NOT_TRIGGER,
+ TRIGGER,
+};
+
+static int send_mbox_msg(struct hinic3_mbox *func_to_func, u8 mod, u16 cmd,
+ void *msg, u16 msg_len, u16 dst_func,
+ enum hinic3_msg_direction_type direction,
+ enum hinic3_msg_ack_type ack_type,
+ struct mbox_msg_info *msg_info);
+
+static struct hinic3_msg_desc *get_mbox_msg_desc(struct hinic3_mbox *func_to_func,
+ u64 dir, u64 src_func_id);
+
+/**
+ * hinic3_register_ppf_mbox_cb - register mbox callback for ppf
+ * @hwdev: the pointer to hw device
+ * @mod: specific mod that the callback will handle
+ * @pri_handle: specific mod's private data that will be used in callback
+ * @callback: callback function
+ * Return: 0 - success, negative - failure
+ */
+int hinic3_register_ppf_mbox_cb(void *hwdev, u8 mod, void *pri_handle,
+ hinic3_ppf_mbox_cb callback)
+{
+ struct hinic3_mbox *func_to_func = NULL;
+
+ if (mod >= HINIC3_MOD_MAX || !hwdev)
+ return -EFAULT;
+
+ func_to_func = ((struct hinic3_hwdev *)hwdev)->func_to_func;
+
+ func_to_func->ppf_mbox_cb[mod] = callback;
+ func_to_func->ppf_mbox_data[mod] = pri_handle;
+
+ set_bit(HINIC3_PPF_MBOX_CB_REG, &func_to_func->ppf_mbox_cb_state[mod]);
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_register_ppf_mbox_cb);
+
+/**
+ * hinic3_register_pf_mbox_cb - register mbox callback for pf
+ * @hwdev: the pointer to hw device
+ * @mod: specific mod that the callback will handle
+ * @pri_handle: specific mod's private data that will be used in callback
+ * @callback: callback function
+ * Return: 0 - success, negative - failure
+ */
+int hinic3_register_pf_mbox_cb(void *hwdev, u8 mod, void *pri_handle,
+ hinic3_pf_mbox_cb callback)
+{
+ struct hinic3_mbox *func_to_func = NULL;
+
+ if (mod >= HINIC3_MOD_MAX || !hwdev)
+ return -EFAULT;
+
+ func_to_func = ((struct hinic3_hwdev *)hwdev)->func_to_func;
+
+ func_to_func->pf_mbox_cb[mod] = callback;
+ func_to_func->pf_mbox_data[mod] = pri_handle;
+
+ set_bit(HINIC3_PF_MBOX_CB_REG, &func_to_func->pf_mbox_cb_state[mod]);
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_register_pf_mbox_cb);
+
+/**
+ * hinic3_register_vf_mbox_cb - register mbox callback for vf
+ * @hwdev: the pointer to hw device
+ * @mod: specific mod that the callback will handle
+ * @pri_handle: specific mod's private data that will be used in callback
+ * @callback: callback function
+ * Return: 0 - success, negative - failure
+ */
+int hinic3_register_vf_mbox_cb(void *hwdev, u8 mod, void *pri_handle,
+ hinic3_vf_mbox_cb callback)
+{
+ struct hinic3_mbox *func_to_func = NULL;
+
+ if (mod >= HINIC3_MOD_MAX || !hwdev)
+ return -EFAULT;
+
+ func_to_func = ((struct hinic3_hwdev *)hwdev)->func_to_func;
+
+ func_to_func->vf_mbox_cb[mod] = callback;
+ func_to_func->vf_mbox_data[mod] = pri_handle;
+
+ set_bit(HINIC3_VF_MBOX_CB_REG, &func_to_func->vf_mbox_cb_state[mod]);
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_register_vf_mbox_cb);
+
+/**
+ * hinic3_unregister_ppf_mbox_cb - unregister the mbox callback for ppf
+ * @hwdev: the pointer to hw device
+ * @mod: specific mod that the callback will handle
+ */
+void hinic3_unregister_ppf_mbox_cb(void *hwdev, u8 mod)
+{
+ struct hinic3_mbox *func_to_func = NULL;
+
+ if (mod >= HINIC3_MOD_MAX || !hwdev)
+ return;
+
+ func_to_func = ((struct hinic3_hwdev *)hwdev)->func_to_func;
+
+ clear_bit(HINIC3_PPF_MBOX_CB_REG,
+ &func_to_func->ppf_mbox_cb_state[mod]);
+
+ while (test_bit(HINIC3_PPF_MBOX_CB_RUNNING,
+ &func_to_func->ppf_mbox_cb_state[mod]))
+ usleep_range(900, 1000); /* sleep 900 us ~ 1000 us */
+
+ func_to_func->ppf_mbox_data[mod] = NULL;
+ func_to_func->ppf_mbox_cb[mod] = NULL;
+}
+EXPORT_SYMBOL(hinic3_unregister_ppf_mbox_cb);
+
+/**
+ * hinic3_unregister_pf_mbox_cb - unregister the mbox callback for pf
+ * @hwdev: the pointer to hw device
+ * @mod: specific mod that the callback will handle
+ */
+void hinic3_unregister_pf_mbox_cb(void *hwdev, u8 mod)
+{
+ struct hinic3_mbox *func_to_func = NULL;
+
+ if (mod >= HINIC3_MOD_MAX || !hwdev)
+ return;
+
+ func_to_func = ((struct hinic3_hwdev *)hwdev)->func_to_func;
+
+ clear_bit(HINIC3_PF_MBOX_CB_REG, &func_to_func->pf_mbox_cb_state[mod]);
+
+ while (test_bit(HINIC3_PF_MBOX_CB_RUNNING, &func_to_func->pf_mbox_cb_state[mod]) != 0)
+ usleep_range(900, 1000); /* sleep 900 us ~ 1000 us */
+
+ func_to_func->pf_mbox_data[mod] = NULL;
+ func_to_func->pf_mbox_cb[mod] = NULL;
+}
+EXPORT_SYMBOL(hinic3_unregister_pf_mbox_cb);
+
+/**
+ * hinic3_unregister_vf_mbox_cb - unregister the mbox callback for vf
+ * @hwdev: the pointer to hw device
+ * @mod: specific mod that the callback will handle
+ */
+void hinic3_unregister_vf_mbox_cb(void *hwdev, u8 mod)
+{
+ struct hinic3_mbox *func_to_func = NULL;
+
+ if (mod >= HINIC3_MOD_MAX || !hwdev)
+ return;
+
+ func_to_func = ((struct hinic3_hwdev *)hwdev)->func_to_func;
+
+ clear_bit(HINIC3_VF_MBOX_CB_REG, &func_to_func->vf_mbox_cb_state[mod]);
+
+ while (test_bit(HINIC3_VF_MBOX_CB_RUNNING, &func_to_func->vf_mbox_cb_state[mod]) != 0)
+ usleep_range(900, 1000); /* sleep 900 us ~ 1000 us */
+
+ func_to_func->vf_mbox_data[mod] = NULL;
+ func_to_func->vf_mbox_cb[mod] = NULL;
+}
+EXPORT_SYMBOL(hinic3_unregister_vf_mbox_cb);
+
+/**
+ * hinic3_unregister_ppf_to_pf_mbox_cb - unregister the mbox callback for pf from ppf
+ * @hwdev: the pointer to hw device
+ * @mod: specific mod that the callback will handle
+ */
+void hinic3_unregister_ppf_to_pf_mbox_cb(void *hwdev, u8 mod)
+{
+ struct hinic3_mbox *func_to_func = NULL;
+
+ if (mod >= HINIC3_MOD_MAX || !hwdev)
+ return;
+
+ func_to_func = ((struct hinic3_hwdev *)hwdev)->func_to_func;
+
+ clear_bit(HINIC3_PPF_TO_PF_MBOX_CB_REG,
+ &func_to_func->ppf_to_pf_mbox_cb_state[mod]);
+
+ while (test_bit(HINIC3_PPF_TO_PF_MBOX_CB_RUNNIG,
+ &func_to_func->ppf_to_pf_mbox_cb_state[mod]))
+ usleep_range(900, 1000); /* sleep 900 us ~ 1000 us */
+
+ func_to_func->pf_recv_ppf_mbox_data[mod] = NULL;
+ func_to_func->pf_recv_ppf_mbox_cb[mod] = NULL;
+}
+
+static int recv_vf_mbox_handler(struct hinic3_mbox *func_to_func,
+ struct hinic3_recv_mbox *recv_mbox,
+ void *buf_out, u16 *out_size)
+{
+ hinic3_vf_mbox_cb cb;
+ int ret;
+
+ if (recv_mbox->mod >= HINIC3_MOD_MAX) {
+		sdk_warn(func_to_func->hwdev->dev_hdl, "Received illegal mbox message, mod = %hhu\n",
+ recv_mbox->mod);
+ return -EINVAL;
+ }
+
+ set_bit(HINIC3_VF_MBOX_CB_RUNNING,
+ &func_to_func->vf_mbox_cb_state[recv_mbox->mod]);
+
+ cb = func_to_func->vf_mbox_cb[recv_mbox->mod];
+ if (cb && test_bit(HINIC3_VF_MBOX_CB_REG,
+ &func_to_func->vf_mbox_cb_state[recv_mbox->mod])) {
+ ret = cb(func_to_func->vf_mbox_data[recv_mbox->mod],
+ recv_mbox->cmd, recv_mbox->msg,
+ recv_mbox->msg_len, buf_out, out_size);
+ } else {
+ sdk_warn(func_to_func->hwdev->dev_hdl, "VF mbox cb is not registered\n");
+ ret = -EINVAL;
+ }
+
+ clear_bit(HINIC3_VF_MBOX_CB_RUNNING,
+ &func_to_func->vf_mbox_cb_state[recv_mbox->mod]);
+
+ return ret;
+}
+
+static int recv_pf_from_ppf_handler(struct hinic3_mbox *func_to_func,
+ struct hinic3_recv_mbox *recv_mbox,
+ void *buf_out, u16 *out_size)
+{
+ hinic3_pf_recv_from_ppf_mbox_cb cb;
+ enum hinic3_mod_type mod = recv_mbox->mod;
+ int ret;
+
+ if (mod >= HINIC3_MOD_MAX) {
+		sdk_warn(func_to_func->hwdev->dev_hdl, "Received illegal mbox message, mod = %d\n",
+ mod);
+ return -EINVAL;
+ }
+
+ set_bit(HINIC3_PPF_TO_PF_MBOX_CB_RUNNIG,
+ &func_to_func->ppf_to_pf_mbox_cb_state[mod]);
+
+ cb = func_to_func->pf_recv_ppf_mbox_cb[mod];
+ if (cb && test_bit(HINIC3_PPF_TO_PF_MBOX_CB_REG,
+ &func_to_func->ppf_to_pf_mbox_cb_state[mod]) != 0) {
+ ret = cb(func_to_func->pf_recv_ppf_mbox_data[mod],
+ recv_mbox->cmd, recv_mbox->msg, recv_mbox->msg_len,
+ buf_out, out_size);
+ } else {
+ sdk_warn(func_to_func->hwdev->dev_hdl, "PF receive ppf mailbox callback is not registered\n");
+ ret = -EINVAL;
+ }
+
+ clear_bit(HINIC3_PPF_TO_PF_MBOX_CB_RUNNIG,
+ &func_to_func->ppf_to_pf_mbox_cb_state[mod]);
+
+ return ret;
+}
+
+static int recv_ppf_mbox_handler(struct hinic3_mbox *func_to_func,
+ struct hinic3_recv_mbox *recv_mbox,
+ u8 pf_id, void *buf_out, u16 *out_size)
+{
+ hinic3_ppf_mbox_cb cb;
+ u16 vf_id = 0;
+ int ret;
+
+ if (recv_mbox->mod >= HINIC3_MOD_MAX) {
+		sdk_warn(func_to_func->hwdev->dev_hdl, "Received illegal mbox message, mod = %hhu\n",
+ recv_mbox->mod);
+ return -EINVAL;
+ }
+
+ set_bit(HINIC3_PPF_MBOX_CB_RUNNING,
+ &func_to_func->ppf_mbox_cb_state[recv_mbox->mod]);
+
+ cb = func_to_func->ppf_mbox_cb[recv_mbox->mod];
+ if (cb && test_bit(HINIC3_PPF_MBOX_CB_REG,
+ &func_to_func->ppf_mbox_cb_state[recv_mbox->mod])) {
+ ret = cb(func_to_func->ppf_mbox_data[recv_mbox->mod],
+ pf_id, vf_id, recv_mbox->cmd, recv_mbox->msg,
+ recv_mbox->msg_len, buf_out, out_size);
+ } else {
+ sdk_warn(func_to_func->hwdev->dev_hdl, "PPF mbox cb is not registered, mod = %hhu\n",
+ recv_mbox->mod);
+ ret = -EINVAL;
+ }
+
+ clear_bit(HINIC3_PPF_MBOX_CB_RUNNING,
+ &func_to_func->ppf_mbox_cb_state[recv_mbox->mod]);
+
+ return ret;
+}
+
+static int recv_pf_from_vf_mbox_handler(struct hinic3_mbox *func_to_func,
+ struct hinic3_recv_mbox *recv_mbox,
+ u16 src_func_idx, void *buf_out,
+ u16 *out_size)
+{
+ hinic3_pf_mbox_cb cb;
+ u16 vf_id = 0;
+ int ret;
+
+ if (recv_mbox->mod >= HINIC3_MOD_MAX) {
+		sdk_warn(func_to_func->hwdev->dev_hdl, "Received illegal mbox message, mod = %hhu\n",
+ recv_mbox->mod);
+ return -EINVAL;
+ }
+
+ set_bit(HINIC3_PF_MBOX_CB_RUNNING,
+ &func_to_func->pf_mbox_cb_state[recv_mbox->mod]);
+
+ cb = func_to_func->pf_mbox_cb[recv_mbox->mod];
+ if (cb && test_bit(HINIC3_PF_MBOX_CB_REG,
+ &func_to_func->pf_mbox_cb_state[recv_mbox->mod]) != 0) {
+ vf_id = src_func_idx -
+ hinic3_glb_pf_vf_offset(func_to_func->hwdev);
+ ret = cb(func_to_func->pf_mbox_data[recv_mbox->mod],
+ vf_id, recv_mbox->cmd, recv_mbox->msg,
+ recv_mbox->msg_len, buf_out, out_size);
+ } else {
+ sdk_warn(func_to_func->hwdev->dev_hdl, "PF mbox mod(0x%x) cb is not registered\n",
+ recv_mbox->mod);
+ ret = -EINVAL;
+ }
+
+ clear_bit(HINIC3_PF_MBOX_CB_RUNNING,
+ &func_to_func->pf_mbox_cb_state[recv_mbox->mod]);
+
+ return ret;
+}
+
+static void response_for_recv_func_mbox(struct hinic3_mbox *func_to_func,
+ struct hinic3_recv_mbox *recv_mbox,
+ int err, u16 out_size, u16 src_func_idx)
+{
+ struct mbox_msg_info msg_info = {0};
+ u16 size = out_size;
+
+ msg_info.msg_id = recv_mbox->msg_id;
+ if (err)
+ msg_info.status = HINIC3_MBOX_PF_SEND_ERR;
+
+	/* if there is no data to respond with, set out_size to 1 */
+ if (!out_size || err)
+ size = MBOX_MSG_NO_DATA_LEN;
+
+ if (size > HINIC3_MBOX_DATA_SIZE) {
+		sdk_err(func_to_func->hwdev->dev_hdl, "Response msg len(%d) exceeds limit(%d)\n",
+ size, HINIC3_MBOX_DATA_SIZE);
+ size = HINIC3_MBOX_DATA_SIZE;
+ }
+
+ send_mbox_msg(func_to_func, recv_mbox->mod, recv_mbox->cmd,
+ recv_mbox->resp_buff, size, src_func_idx,
+ HINIC3_MSG_RESPONSE, HINIC3_MSG_NO_ACK, &msg_info);
+}
+
+static void recv_func_mbox_handler(struct hinic3_mbox *func_to_func,
+ struct hinic3_recv_mbox *recv_mbox)
+{
+ struct hinic3_hwdev *dev = func_to_func->hwdev;
+ void *buf_out = recv_mbox->resp_buff;
+ u16 src_func_idx = recv_mbox->src_func_idx;
+ u16 out_size = HINIC3_MBOX_DATA_SIZE;
+ int err = 0;
+
+ if (HINIC3_IS_VF(dev)) {
+ err = recv_vf_mbox_handler(func_to_func, recv_mbox, buf_out,
+ &out_size);
+ } else { /* pf/ppf process */
+ if (IS_PF_OR_PPF_SRC(dev, src_func_idx)) {
+ if (HINIC3_IS_PPF(dev)) {
+ err = recv_ppf_mbox_handler(func_to_func,
+ recv_mbox,
+ (u8)src_func_idx,
+ buf_out, &out_size);
+ if (err)
+ goto out;
+ } else {
+ err = recv_pf_from_ppf_handler(func_to_func,
+ recv_mbox,
+ buf_out,
+ &out_size);
+ if (err)
+ goto out;
+ }
+		} else {
+			/* The source is neither PF nor PPF, so it is from a VF */
+ err = recv_pf_from_vf_mbox_handler(func_to_func,
+ recv_mbox,
+ src_func_idx,
+ buf_out, &out_size);
+ }
+ }
+
+out:
+ if (recv_mbox->ack_type == HINIC3_MSG_ACK)
+ response_for_recv_func_mbox(func_to_func, recv_mbox, err,
+ out_size, src_func_idx);
+}
+
+static struct hinic3_recv_mbox *alloc_recv_mbox(void)
+{
+ struct hinic3_recv_mbox *recv_msg = NULL;
+
+ recv_msg = kzalloc(sizeof(*recv_msg), GFP_KERNEL);
+ if (!recv_msg)
+ return NULL;
+
+ recv_msg->msg = kzalloc(MBOX_MAX_BUF_SZ, GFP_KERNEL);
+ if (!recv_msg->msg)
+ goto alloc_msg_err;
+
+ recv_msg->resp_buff = kzalloc(MBOX_MAX_BUF_SZ, GFP_KERNEL);
+ if (!recv_msg->resp_buff)
+ goto alloc_resp_bff_err;
+
+ return recv_msg;
+
+alloc_resp_bff_err:
+ kfree(recv_msg->msg);
+
+alloc_msg_err:
+ kfree(recv_msg);
+
+ return NULL;
+}
+
+static void free_recv_mbox(struct hinic3_recv_mbox *recv_msg)
+{
+ kfree(recv_msg->resp_buff);
+ kfree(recv_msg->msg);
+ kfree(recv_msg);
+}
+
+static void recv_func_mbox_work_handler(struct work_struct *work)
+{
+ struct hinic3_mbox_work *mbox_work =
+ container_of(work, struct hinic3_mbox_work, work);
+
+ recv_func_mbox_handler(mbox_work->func_to_func, mbox_work->recv_mbox);
+
+ atomic_dec(&mbox_work->msg_ch->recv_msg_cnt);
+
+ destroy_work(&mbox_work->work);
+
+ free_recv_mbox(mbox_work->recv_mbox);
+ kfree(mbox_work);
+}
+
+static void resp_mbox_handler(struct hinic3_mbox *func_to_func,
+ const struct hinic3_msg_desc *msg_desc)
+{
+ spin_lock(&func_to_func->mbox_lock);
+ if (msg_desc->msg_info.msg_id == func_to_func->send_msg_id &&
+ func_to_func->event_flag == EVENT_START)
+ func_to_func->event_flag = EVENT_SUCCESS;
+ else
+ sdk_err(func_to_func->hwdev->dev_hdl,
+ "Mbox response timeout, current send msg id(0x%x), recv msg id(0x%x), status(0x%x)\n",
+ func_to_func->send_msg_id, msg_desc->msg_info.msg_id,
+ msg_desc->msg_info.status);
+ spin_unlock(&func_to_func->mbox_lock);
+}
+
+static void recv_mbox_msg_handler(struct hinic3_mbox *func_to_func,
+ struct hinic3_msg_desc *msg_desc,
+ u64 mbox_header)
+{
+ struct hinic3_hwdev *hwdev = func_to_func->hwdev;
+ struct hinic3_recv_mbox *recv_msg = NULL;
+ struct hinic3_mbox_work *mbox_work = NULL;
+ struct hinic3_msg_channel *msg_ch =
+ container_of(msg_desc, struct hinic3_msg_channel, recv_msg);
+ u16 src_func_idx = HINIC3_MSG_HEADER_GET(mbox_header, SRC_GLB_FUNC_IDX);
+
+ if (atomic_read(&msg_ch->recv_msg_cnt) >
+ HINIC3_MAX_MSG_CNT_TO_PROCESS) {
+		sdk_warn(hwdev->dev_hdl, "Function(%u) has %d messages waiting to be processed, can't add to work queue\n",
+ src_func_idx, atomic_read(&msg_ch->recv_msg_cnt));
+ return;
+ }
+
+ recv_msg = alloc_recv_mbox();
+ if (!recv_msg) {
+ sdk_err(hwdev->dev_hdl, "Failed to alloc receive mbox message buffer\n");
+ return;
+ }
+ recv_msg->msg_len = msg_desc->msg_len;
+ memcpy(recv_msg->msg, msg_desc->msg, recv_msg->msg_len);
+ recv_msg->msg_id = msg_desc->msg_info.msg_id;
+ recv_msg->mod = HINIC3_MSG_HEADER_GET(mbox_header, MODULE);
+ recv_msg->cmd = HINIC3_MSG_HEADER_GET(mbox_header, CMD);
+ recv_msg->ack_type = HINIC3_MSG_HEADER_GET(mbox_header, NO_ACK);
+ recv_msg->src_func_idx = src_func_idx;
+
+ mbox_work = kzalloc(sizeof(*mbox_work), GFP_KERNEL);
+ if (!mbox_work) {
+		sdk_err(hwdev->dev_hdl, "Failed to allocate mbox work memory\n");
+ free_recv_mbox(recv_msg);
+ return;
+ }
+
+ atomic_inc(&msg_ch->recv_msg_cnt);
+
+ mbox_work->func_to_func = func_to_func;
+ mbox_work->recv_mbox = recv_msg;
+ mbox_work->msg_ch = msg_ch;
+
+ INIT_WORK(&mbox_work->work, recv_func_mbox_work_handler);
+ queue_work_on(hisdk3_get_work_cpu_affinity(hwdev, WORK_TYPE_MBOX),
+ func_to_func->workq, &mbox_work->work);
+}
+
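+/* Validate one incoming segment against the reassembly state: seq_id 0
+ * starts a new message and latches msg_id/mod/cmd; each later segment
+ * must continue the sequence (seq_id + 1) with the same msg_id/mod/cmd,
+ * otherwise the partially assembled message is discarded.
+ */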
+static bool check_mbox_segment(struct hinic3_mbox *func_to_func,
+ struct hinic3_msg_desc *msg_desc,
+ u64 mbox_header, void *mbox_body)
+{
+ u8 seq_id, seg_len, msg_id, mod;
+ u16 src_func_idx, cmd;
+
+ seq_id = HINIC3_MSG_HEADER_GET(mbox_header, SEQID);
+ seg_len = HINIC3_MSG_HEADER_GET(mbox_header, SEG_LEN);
+ msg_id = HINIC3_MSG_HEADER_GET(mbox_header, MSG_ID);
+ mod = HINIC3_MSG_HEADER_GET(mbox_header, MODULE);
+ cmd = HINIC3_MSG_HEADER_GET(mbox_header, CMD);
+ src_func_idx = HINIC3_MSG_HEADER_GET(mbox_header, SRC_GLB_FUNC_IDX);
+
+ if (seq_id > SEQ_ID_MAX_VAL || seg_len > MBOX_SEG_LEN ||
+ (seq_id == SEQ_ID_MAX_VAL && seg_len > MBOX_LAST_SEG_MAX_LEN))
+ goto seg_err;
+
+ if (seq_id == 0) {
+ msg_desc->seq_id = seq_id;
+ msg_desc->msg_info.msg_id = msg_id;
+ msg_desc->mod = mod;
+ msg_desc->cmd = cmd;
+ } else {
+ if (seq_id != msg_desc->seq_id + 1 || msg_id != msg_desc->msg_info.msg_id ||
+ mod != msg_desc->mod || cmd != msg_desc->cmd)
+ goto seg_err;
+
+ msg_desc->seq_id = seq_id;
+ }
+
+ return true;
+
+seg_err:
+ sdk_err(func_to_func->hwdev->dev_hdl,
+ "Mailbox segment check failed, src func id: 0x%x, front seg info: seq id: 0x%x, msg id: 0x%x, mod: 0x%x, cmd: 0x%x\n",
+ src_func_idx, msg_desc->seq_id, msg_desc->msg_info.msg_id,
+ msg_desc->mod, msg_desc->cmd);
+ sdk_err(func_to_func->hwdev->dev_hdl,
+ "Current seg info: seg len: 0x%x, seq id: 0x%x, msg id: 0x%x, mod: 0x%x, cmd: 0x%x\n",
+ seg_len, seq_id, msg_id, mod, cmd);
+
+ return false;
+}
+
+static void recv_mbox_handler(struct hinic3_mbox *func_to_func,
+ u64 *header, struct hinic3_msg_desc *msg_desc)
+{
+ u64 mbox_header = *header;
+ void *mbox_body = MBOX_BODY_FROM_HDR(((void *)header));
+ u8 seq_id, seg_len;
+ int pos;
+
+ if (!check_mbox_segment(func_to_func, msg_desc, mbox_header, mbox_body)) {
+ msg_desc->seq_id = SEQ_ID_MAX_VAL;
+ return;
+ }
+
+ seq_id = HINIC3_MSG_HEADER_GET(mbox_header, SEQID);
+ seg_len = HINIC3_MSG_HEADER_GET(mbox_header, SEG_LEN);
+
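+	/* reassemble: each segment lands at its seq_id * MBOX_SEG_LEN offset */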
+ pos = seq_id * MBOX_SEG_LEN;
+ memcpy((u8 *)msg_desc->msg + pos, mbox_body, seg_len);
+
+ if (!HINIC3_MSG_HEADER_GET(mbox_header, LAST))
+ return;
+
+ msg_desc->msg_len = HINIC3_MSG_HEADER_GET(mbox_header, MSG_LEN);
+ msg_desc->msg_info.status = HINIC3_MSG_HEADER_GET(mbox_header, STATUS);
+
+ if (HINIC3_MSG_HEADER_GET(mbox_header, DIRECTION) ==
+ HINIC3_MSG_RESPONSE) {
+ resp_mbox_handler(func_to_func, msg_desc);
+ return;
+ }
+
+ recv_mbox_msg_handler(func_to_func, msg_desc, mbox_header);
+}
+
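+/* AEQ entry point for mailbox events: look up the per-source message
+ * descriptor and feed this segment into the receive path.
+ */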
+void hinic3_mbox_func_aeqe_handler(void *handle, u8 *header, u8 size)
+{
+ struct hinic3_mbox *func_to_func = NULL;
+ struct hinic3_msg_desc *msg_desc = NULL;
+ u64 mbox_header = *((u64 *)header);
+ u64 src, dir;
+
+ func_to_func = ((struct hinic3_hwdev *)handle)->func_to_func;
+
+ dir = HINIC3_MSG_HEADER_GET(mbox_header, DIRECTION);
+ src = HINIC3_MSG_HEADER_GET(mbox_header, SRC_GLB_FUNC_IDX);
+
+ msg_desc = get_mbox_msg_desc(func_to_func, dir, src);
+ if (!msg_desc) {
+ sdk_err(func_to_func->hwdev->dev_hdl,
+ "Mailbox source function id: %u is invalid for current function\n",
+ (u32)src);
+ return;
+ }
+
+ recv_mbox_handler(func_to_func, (u64 *)header, msg_desc);
+}
+
+static int init_mbox_dma_queue(struct hinic3_hwdev *hwdev, struct mbox_dma_queue *mq)
+{
+ u32 size;
+
+ mq->depth = MBOX_DMA_MSG_QUEUE_DEPTH;
+ mq->prod_idx = 0;
+ mq->cons_idx = 0;
+
+ size = mq->depth * MBOX_MAX_BUF_SZ;
+ mq->dma_buff_vaddr = dma_zalloc_coherent(hwdev->dev_hdl, size, &mq->dma_buff_paddr,
+ GFP_KERNEL);
+ if (!mq->dma_buff_vaddr) {
+ sdk_err(hwdev->dev_hdl, "Failed to alloc dma_buffer\n");
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static void deinit_mbox_dma_queue(struct hinic3_hwdev *hwdev, struct mbox_dma_queue *mq)
+{
+ dma_free_coherent(hwdev->dev_hdl, mq->depth * MBOX_MAX_BUF_SZ,
+ mq->dma_buff_vaddr, mq->dma_buff_paddr);
+}
+
+static int hinic3_init_mbox_dma_queue(struct hinic3_mbox *func_to_func)
+{
+ u32 val;
+ int err;
+
+ err = init_mbox_dma_queue(func_to_func->hwdev, &func_to_func->sync_msg_queue);
+ if (err)
+ return err;
+
+ err = init_mbox_dma_queue(func_to_func->hwdev, &func_to_func->async_msg_queue);
+ if (err) {
+ deinit_mbox_dma_queue(func_to_func->hwdev, &func_to_func->sync_msg_queue);
+ return err;
+ }
+
+ val = hinic3_hwif_read_reg(func_to_func->hwdev->hwif, MBOX_MQ_CI_OFFSET);
+ val = MBOX_MQ_CI_CLEAR(val, SYNC);
+ val = MBOX_MQ_CI_CLEAR(val, ASYNC);
+ hinic3_hwif_write_reg(func_to_func->hwdev->hwif, MBOX_MQ_CI_OFFSET, val);
+
+ return 0;
+}
+
+static void hinic3_deinit_mbox_dma_queue(struct hinic3_mbox *func_to_func)
+{
+ deinit_mbox_dma_queue(func_to_func->hwdev, &func_to_func->sync_msg_queue);
+ deinit_mbox_dma_queue(func_to_func->hwdev, &func_to_func->async_msg_queue);
+}
+
+#define MBOX_DMA_MSG_INIT_XOR_VAL 0x5a5a5a5a
+#define MBOX_XOR_DATA_ALIGN 4
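+/* Fold the message into a 32-bit XOR checksum seeded with a fixed
+ * pattern; the checksum travels in the DMA descriptor and is presumably
+ * recomputed by the firmware to detect corrupted transfers.
+ */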
+static u32 mbox_dma_msg_xor(u32 *data, u16 msg_len)
+{
+ u32 xor = MBOX_DMA_MSG_INIT_XOR_VAL;
+ u16 dw_len = msg_len / sizeof(u32);
+ u16 i;
+
+ for (i = 0; i < dw_len; i++)
+ xor ^= data[i];
+
+ return xor;
+}
+
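+/* The DMA message queue is a ring whose indexes wrap via the depth mask
+ * (this assumes a power-of-two depth); one slot is sacrificed so that
+ * "full" (pi + 1 == ci) can be told apart from "empty" (pi == ci).
+ */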
+#define MQ_ID_MASK(mq, idx) ((idx) & ((mq)->depth - 1))
+#define IS_MSG_QUEUE_FULL(mq) (MQ_ID_MASK(mq, (mq)->prod_idx + 1) == \
+ MQ_ID_MASK(mq, (mq)->cons_idx))
+
+static int mbox_prepare_dma_entry(struct hinic3_mbox *func_to_func, struct mbox_dma_queue *mq,
+ struct mbox_dma_msg *dma_msg, void *msg, u16 msg_len)
+{
+ u64 dma_addr, offset;
+ void *dma_vaddr;
+
+ if (IS_MSG_QUEUE_FULL(mq)) {
+		sdk_err(func_to_func->hwdev->dev_hdl, "Mbox message queue is busy, pi: %u, ci: %u\n",
+			mq->prod_idx, MQ_ID_MASK(mq, mq->cons_idx));
+ return -EBUSY;
+ }
+
+ /* copy data to DMA buffer */
+ offset = mq->prod_idx * MBOX_MAX_BUF_SZ;
+ dma_vaddr = (u8 *)mq->dma_buff_vaddr + offset;
+ memcpy(dma_vaddr, msg, msg_len);
+ dma_addr = mq->dma_buff_paddr + offset;
+ dma_msg->dma_addr_high = upper_32_bits(dma_addr);
+ dma_msg->dma_addr_low = lower_32_bits(dma_addr);
+ dma_msg->msg_len = msg_len;
+	/* The firmware reads the message in 4-byte-aligned units. */
+ dma_msg->xor = mbox_dma_msg_xor(dma_vaddr, ALIGN(msg_len, MBOX_XOR_DATA_ALIGN));
+
+ mq->prod_idx++;
+ mq->prod_idx = MQ_ID_MASK(mq, mq->prod_idx);
+
+ return 0;
+}
+
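+/* Refresh the ring's consumer index from the HW CI register before
+ * queueing, so slots the device has already consumed can be reused.
+ */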
+static int mbox_prepare_dma_msg(struct hinic3_mbox *func_to_func, enum hinic3_msg_ack_type ack_type,
+ struct mbox_dma_msg *dma_msg, void *msg, u16 msg_len)
+{
+ struct mbox_dma_queue *mq = NULL;
+ u32 val;
+
+ val = hinic3_hwif_read_reg(func_to_func->hwdev->hwif, MBOX_MQ_CI_OFFSET);
+ if (ack_type == HINIC3_MSG_ACK) {
+ mq = &func_to_func->sync_msg_queue;
+ mq->cons_idx = MBOX_MQ_CI_GET(val, SYNC);
+ } else {
+ mq = &func_to_func->async_msg_queue;
+ mq->cons_idx = MBOX_MQ_CI_GET(val, ASYNC);
+ }
+
+ return mbox_prepare_dma_entry(func_to_func, mq, dma_msg, msg, msg_len);
+}
+
+static void clear_mbox_status(struct hinic3_send_mbox *mbox)
+{
+ *mbox->wb_status = 0;
+
+ /* clear mailbox write back status */
+ wmb();
+}
+
+static void mbox_copy_header(struct hinic3_hwdev *hwdev,
+ struct hinic3_send_mbox *mbox, u64 *header)
+{
+ u32 *data = (u32 *)header;
+ u32 i, idx_max = MBOX_HEADER_SZ / sizeof(u32);
+
+ for (i = 0; i < idx_max; i++) {
+ __raw_writel(cpu_to_be32(*(data + i)),
+ mbox->data + i * sizeof(u32));
+ }
+}
+
+static void mbox_copy_send_data(struct hinic3_hwdev *hwdev,
+ struct hinic3_send_mbox *mbox, void *seg,
+ u16 seg_len)
+{
+ u32 *data = seg;
+ u32 data_len, chk_sz = sizeof(u32);
+ u32 i, idx_max;
+ u8 mbox_max_buf[MBOX_SEG_LEN] = {0};
+
+ /* The mbox message should be aligned in 4 bytes. */
+ if (seg_len % chk_sz) {
+ memcpy(mbox_max_buf, seg, seg_len);
+ data = (u32 *)mbox_max_buf;
+ }
+
+ data_len = seg_len;
+ idx_max = ALIGN(data_len, chk_sz) / chk_sz;
+
+ for (i = 0; i < idx_max; i++) {
+ __raw_writel(cpu_to_be32(*(data + i)),
+ mbox->data + MBOX_HEADER_SZ + i * sizeof(u32));
+ }
+}
+
+static void write_mbox_msg_attr(struct hinic3_mbox *func_to_func,
+ u16 dst_func, u16 dst_aeqn, u16 seg_len)
+{
+ u32 mbox_int, mbox_ctrl;
+ u16 func = dst_func;
+
+	/* for a VF to PF message, the dest func id is self-learned by HW */
+ if (HINIC3_IS_VF(func_to_func->hwdev) && dst_func != HINIC3_MGMT_SRC_ID)
+ func = 0; /* the destination is the VF's PF */
+
+ mbox_int = HINIC3_MBOX_INT_SET(dst_aeqn, DST_AEQN) |
+ HINIC3_MBOX_INT_SET(0, SRC_RESP_AEQN) |
+ HINIC3_MBOX_INT_SET(NO_DMA_ATTRIBUTE_VAL, STAT_DMA) |
+ HINIC3_MBOX_INT_SET(ALIGN(seg_len + MBOX_HEADER_SZ,
+ MBOX_SEG_LEN_ALIGN) >> 2,
+ TX_SIZE) |
+ HINIC3_MBOX_INT_SET(STRONG_ORDER, STAT_DMA_SO_RO) |
+ HINIC3_MBOX_INT_SET(WRITE_BACK, WB_EN);
+
+ hinic3_hwif_write_reg(func_to_func->hwdev->hwif,
+ HINIC3_FUNC_CSR_MAILBOX_INT_OFFSET_OFF, mbox_int);
+
+ wmb(); /* writing the mbox int attributes */
+ mbox_ctrl = HINIC3_MBOX_CTRL_SET(TX_NOT_DONE, TX_STATUS);
+
+ mbox_ctrl |= HINIC3_MBOX_CTRL_SET(NOT_TRIGGER, TRIGGER_AEQE);
+
+ mbox_ctrl |= HINIC3_MBOX_CTRL_SET(func, DST_FUNC);
+
+ hinic3_hwif_write_reg(func_to_func->hwdev->hwif,
+ HINIC3_FUNC_CSR_MAILBOX_CONTROL_OFF, mbox_ctrl);
+}
+
+static void dump_mbox_reg(struct hinic3_hwdev *hwdev)
+{
+ u32 val;
+
+ val = hinic3_hwif_read_reg(hwdev->hwif,
+ HINIC3_FUNC_CSR_MAILBOX_CONTROL_OFF);
+ sdk_err(hwdev->dev_hdl, "Mailbox control reg: 0x%x\n", val);
+ val = hinic3_hwif_read_reg(hwdev->hwif,
+ HINIC3_FUNC_CSR_MAILBOX_INT_OFFSET_OFF);
+ sdk_err(hwdev->dev_hdl, "Mailbox interrupt offset: 0x%x\n", val);
+}
+
+static u16 get_mbox_status(const struct hinic3_send_mbox *mbox)
+{
+	/* the write back area is 16B, but only the first 4B are used */
+ u64 wb_val = be64_to_cpu(*mbox->wb_status);
+
+	rmb(); /* ensure the status is read before it is checked */
+
+ return (u16)(wb_val & MBOX_WB_STATUS_ERRCODE_MASK);
+}
+
+static enum hinic3_wait_return check_mbox_wb_status(void *priv_data)
+{
+ struct hinic3_mbox *func_to_func = priv_data;
+ u16 wb_status;
+
+ if (MBOX_MSG_CHANNEL_STOP(func_to_func) || !func_to_func->hwdev->chip_present_flag)
+ return WAIT_PROCESS_ERR;
+
+ wb_status = get_mbox_status(&func_to_func->send_mbox);
+
+ return MBOX_STATUS_FINISHED(wb_status) ?
+ WAIT_PROCESS_CPL : WAIT_PROCESS_WAITING;
+}
+
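+/* Write one segment into the mailbox registers, then poll the DMA
+ * write-back area until the hardware reports the send finished.
+ */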
+static int send_mbox_seg(struct hinic3_mbox *func_to_func, u64 header,
+ u16 dst_func, void *seg, u16 seg_len, void *msg_info)
+{
+ struct hinic3_send_mbox *send_mbox = &func_to_func->send_mbox;
+ struct hinic3_hwdev *hwdev = func_to_func->hwdev;
+ u8 num_aeqs = hwdev->hwif->attr.num_aeqs;
+ u16 dst_aeqn, wb_status = 0, errcode;
+ u16 seq_dir = HINIC3_MSG_HEADER_GET(header, DIRECTION);
+ int err;
+
+	/* for mbox to mgmt cpu, hardware doesn't care about the dst aeq id */
+ if (num_aeqs > HINIC3_MBOX_RSP_MSG_AEQ)
+ dst_aeqn = (seq_dir == HINIC3_MSG_DIRECT_SEND) ?
+ HINIC3_ASYNC_MSG_AEQ : HINIC3_MBOX_RSP_MSG_AEQ;
+ else
+ dst_aeqn = 0;
+
+ clear_mbox_status(send_mbox);
+
+ mbox_copy_header(hwdev, send_mbox, &header);
+
+ mbox_copy_send_data(hwdev, send_mbox, seg, seg_len);
+
+ write_mbox_msg_attr(func_to_func, dst_func, dst_aeqn, seg_len);
+
+ wmb(); /* writing the mbox msg attributes */
+
+ err = hinic3_wait_for_timeout(func_to_func, check_mbox_wb_status,
+ MBOX_MSG_POLLING_TIMEOUT, USEC_PER_MSEC);
+ wb_status = get_mbox_status(send_mbox);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Send mailbox segment timeout, wb status: 0x%x\n",
+ wb_status);
+ dump_mbox_reg(hwdev);
+ return -ETIMEDOUT;
+ }
+
+ if (!MBOX_STATUS_SUCCESS(wb_status)) {
+ sdk_err(hwdev->dev_hdl, "Send mailbox segment to function %u error, wb status: 0x%x\n",
+ dst_func, wb_status);
+ errcode = MBOX_STATUS_ERRCODE(wb_status);
+ return errcode ? errcode : -EFAULT;
+ }
+
+ return 0;
+}
+
+static int send_mbox_msg(struct hinic3_mbox *func_to_func, u8 mod, u16 cmd,
+ void *msg, u16 msg_len, u16 dst_func,
+ enum hinic3_msg_direction_type direction,
+ enum hinic3_msg_ack_type ack_type,
+ struct mbox_msg_info *msg_info)
+{
+ struct hinic3_hwdev *hwdev = func_to_func->hwdev;
+ struct mbox_dma_msg dma_msg = {0};
+ enum hinic3_data_type data_type = HINIC3_DATA_INLINE;
+ int err = 0;
+ u32 seq_id = 0;
+ u16 seg_len = MBOX_SEG_LEN;
+ u16 rsp_aeq_id, left;
+ u8 *msg_seg = NULL;
+ u64 header = 0;
+
+ if (hwdev->poll || hwdev->hwif->attr.num_aeqs >= 0x2)
+ rsp_aeq_id = HINIC3_MBOX_RSP_MSG_AEQ;
+ else
+ rsp_aeq_id = 0;
+
+ mutex_lock(&func_to_func->msg_send_lock);
+
+ if (IS_DMA_MBX_MSG(dst_func) && !COMM_SUPPORT_MBOX_SEGMENT(hwdev)) {
+ err = mbox_prepare_dma_msg(func_to_func, ack_type, &dma_msg, msg, msg_len);
+ if (err != 0)
+ goto send_err;
+
+ msg = &dma_msg;
+ msg_len = sizeof(dma_msg);
+ data_type = HINIC3_DATA_DMA;
+ }
+
+ msg_seg = (u8 *)msg;
+ left = msg_len;
+
+ header = HINIC3_MSG_HEADER_SET(msg_len, MSG_LEN) |
+ HINIC3_MSG_HEADER_SET(mod, MODULE) |
+ HINIC3_MSG_HEADER_SET(seg_len, SEG_LEN) |
+ HINIC3_MSG_HEADER_SET(ack_type, NO_ACK) |
+ HINIC3_MSG_HEADER_SET(data_type, DATA_TYPE) |
+ HINIC3_MSG_HEADER_SET(SEQ_ID_START_VAL, SEQID) |
+ HINIC3_MSG_HEADER_SET(NOT_LAST_SEGMENT, LAST) |
+ HINIC3_MSG_HEADER_SET(direction, DIRECTION) |
+ HINIC3_MSG_HEADER_SET(cmd, CMD) |
+		 HINIC3_MSG_HEADER_SET(msg_info->msg_id, MSG_ID) |
+ HINIC3_MSG_HEADER_SET(rsp_aeq_id, AEQ_ID) |
+ HINIC3_MSG_HEADER_SET(HINIC3_MSG_FROM_MBOX, SOURCE) |
+ HINIC3_MSG_HEADER_SET(!!msg_info->status, STATUS) |
+ HINIC3_MSG_HEADER_SET(hinic3_global_func_id(hwdev),
+ SRC_GLB_FUNC_IDX);
+
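+	/* split the message into MBOX_SEG_LEN segments, bumping SEQID and
+	 * setting LAST (with the real remaining length) on the final one
+	 */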
+ while (!(HINIC3_MSG_HEADER_GET(header, LAST))) {
+ if (left <= MBOX_SEG_LEN) {
+ header &= ~MBOX_SEGLEN_MASK;
+ header |= HINIC3_MSG_HEADER_SET(left, SEG_LEN);
+ header |= HINIC3_MSG_HEADER_SET(LAST_SEGMENT, LAST);
+
+ seg_len = left;
+ }
+
+ err = send_mbox_seg(func_to_func, header, dst_func, msg_seg,
+ seg_len, msg_info);
+ if (err != 0) {
+ sdk_err(hwdev->dev_hdl, "Failed to send mbox seg, seq_id=0x%llx\n",
+ HINIC3_MSG_HEADER_GET(header, SEQID));
+ goto send_err;
+ }
+
+ left -= MBOX_SEG_LEN;
+ msg_seg += MBOX_SEG_LEN; /*lint !e662 */
+
+ seq_id++;
+ header &= ~(HINIC3_MSG_HEADER_SET(HINIC3_MSG_HEADER_SEQID_MASK,
+ SEQID));
+ header |= HINIC3_MSG_HEADER_SET(seq_id, SEQID);
+ }
+
+send_err:
+ mutex_unlock(&func_to_func->msg_send_lock);
+
+ return err;
+}
+
+static void set_mbox_to_func_event(struct hinic3_mbox *func_to_func,
+ enum mbox_event_state event_flag)
+{
+ spin_lock(&func_to_func->mbox_lock);
+ func_to_func->event_flag = event_flag;
+ spin_unlock(&func_to_func->mbox_lock);
+}
+
+static enum hinic3_wait_return check_mbox_msg_finish(void *priv_data)
+{
+ struct hinic3_mbox *func_to_func = priv_data;
+
+ if (MBOX_MSG_CHANNEL_STOP(func_to_func) || func_to_func->hwdev->chip_present_flag == 0)
+ return WAIT_PROCESS_ERR;
+
+ return (func_to_func->event_flag == EVENT_SUCCESS) ?
+ WAIT_PROCESS_CPL : WAIT_PROCESS_WAITING;
+}
+
+static int wait_mbox_msg_completion(struct hinic3_mbox *func_to_func,
+ u32 timeout)
+{
+ u32 wait_time;
+ int err;
+
+ wait_time = (timeout != 0) ? timeout : HINIC3_MBOX_COMP_TIME;
+ err = hinic3_wait_for_timeout(func_to_func, check_mbox_msg_finish,
+ wait_time, USEC_PER_MSEC);
+ if (err) {
+ set_mbox_to_func_event(func_to_func, EVENT_TIMEOUT);
+ return -ETIMEDOUT;
+ }
+
+ set_mbox_to_func_event(func_to_func, EVENT_END);
+
+ return 0;
+}
+
+#define TRY_MBOX_LOCK_SLEEP 1000
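+/* When channel locking is enabled, only poll for the send lock while the
+ * channel remains enabled; a stopped channel aborts with -EAGAIN instead
+ * of blocking.
+ */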
+static int send_mbox_msg_lock(struct hinic3_mbox *func_to_func, u16 channel)
+{
+ if (!func_to_func->lock_channel_en) {
+ mutex_lock(&func_to_func->mbox_send_lock);
+ return 0;
+ }
+
+ while (test_bit(channel, &func_to_func->channel_stop) == 0) {
+ if (mutex_trylock(&func_to_func->mbox_send_lock) != 0)
+ return 0;
+
+		usleep_range(TRY_MBOX_LOCK_SLEEP - 1, TRY_MBOX_LOCK_SLEEP);
+ }
+
+ return -EAGAIN;
+}
+
+static void send_mbox_msg_unlock(struct hinic3_mbox *func_to_func)
+{
+ mutex_unlock(&func_to_func->mbox_send_lock);
+}
+
+int hinic3_mbox_to_func(struct hinic3_mbox *func_to_func, u8 mod, u16 cmd,
+ u16 dst_func, void *buf_in, u16 in_size, void *buf_out,
+ u16 *out_size, u32 timeout, u16 channel)
+{
+	/* use the resp msg_desc to hold the data responded by the other function */
+ struct hinic3_msg_desc *msg_desc = NULL;
+ struct mbox_msg_info msg_info = {0};
+ int err;
+
+ if (func_to_func->hwdev->chip_present_flag == 0)
+ return -EPERM;
+
+ /* expect response message */
+ msg_desc = get_mbox_msg_desc(func_to_func, HINIC3_MSG_RESPONSE,
+ dst_func);
+ if (!msg_desc)
+ return -EFAULT;
+
+ err = send_mbox_msg_lock(func_to_func, channel);
+ if (err)
+ return err;
+
+ func_to_func->cur_msg_channel = channel;
+ msg_info.msg_id = MBOX_MSG_ID_INC(func_to_func);
+
+ set_mbox_to_func_event(func_to_func, EVENT_START);
+
+ err = send_mbox_msg(func_to_func, mod, cmd, buf_in, in_size, dst_func,
+ HINIC3_MSG_DIRECT_SEND, HINIC3_MSG_ACK, &msg_info);
+ if (err) {
+ sdk_err(func_to_func->hwdev->dev_hdl, "Send mailbox mod %u, cmd %u failed, msg_id: %u, err: %d\n",
+ mod, cmd, msg_info.msg_id, err);
+ set_mbox_to_func_event(func_to_func, EVENT_FAIL);
+ goto send_err;
+ }
+
+ if (wait_mbox_msg_completion(func_to_func, timeout)) {
+ sdk_err(func_to_func->hwdev->dev_hdl,
+ "Send mbox msg timeout, msg_id: %u\n", msg_info.msg_id);
+ hinic3_dump_aeq_info(func_to_func->hwdev);
+ err = -ETIMEDOUT;
+ goto send_err;
+ }
+
+ if (mod != msg_desc->mod || cmd != msg_desc->cmd) {
+ sdk_err(func_to_func->hwdev->dev_hdl,
+ "Invalid response mbox message, mod: 0x%x, cmd: 0x%x, expect mod: 0x%x, cmd: 0x%x\n",
+ msg_desc->mod, msg_desc->cmd, mod, cmd);
+ err = -EFAULT;
+ goto send_err;
+ }
+
+ if (msg_desc->msg_info.status) {
+ err = msg_desc->msg_info.status;
+ goto send_err;
+ }
+
+ if (buf_out && out_size) {
+ if (*out_size < msg_desc->msg_len) {
+ sdk_err(func_to_func->hwdev->dev_hdl,
+ "Invalid response mbox message length: %u for mod %d cmd %u, should less than: %u\n",
+ msg_desc->msg_len, mod, cmd, *out_size);
+ err = -EFAULT;
+ goto send_err;
+ }
+
+ if (msg_desc->msg_len)
+ memcpy(buf_out, msg_desc->msg, msg_desc->msg_len);
+
+ *out_size = msg_desc->msg_len;
+ }
+
+send_err:
+ send_mbox_msg_unlock(func_to_func);
+
+ return err;
+}
+
+static int mbox_func_params_valid(struct hinic3_mbox *func_to_func,
+ void *buf_in, u16 in_size, u16 channel)
+{
+ if (!buf_in || !in_size)
+ return -EINVAL;
+
+ if (in_size > HINIC3_MBOX_DATA_SIZE) {
+ sdk_err(func_to_func->hwdev->dev_hdl,
+ "Mbox msg len %u exceed limit: [1, %u]\n",
+ in_size, HINIC3_MBOX_DATA_SIZE);
+ return -EINVAL;
+ }
+
+ if (channel >= HINIC3_CHANNEL_MAX) {
+ sdk_err(func_to_func->hwdev->dev_hdl,
+ "Invalid channel id: 0x%x\n", channel);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int hinic3_mbox_to_func_no_ack(struct hinic3_hwdev *hwdev, u16 func_idx,
+ u8 mod, u16 cmd, void *buf_in, u16 in_size,
+ u16 channel)
+{
+ struct mbox_msg_info msg_info = {0};
+ int err = mbox_func_params_valid(hwdev->func_to_func, buf_in, in_size,
+ channel);
+ if (err)
+ return err;
+
+ err = send_mbox_msg_lock(hwdev->func_to_func, channel);
+ if (err)
+ return err;
+
+ err = send_mbox_msg(hwdev->func_to_func, mod, cmd, buf_in, in_size,
+ func_idx, HINIC3_MSG_DIRECT_SEND,
+ HINIC3_MSG_NO_ACK, &msg_info);
+ if (err)
+ sdk_err(hwdev->dev_hdl, "Send mailbox no ack failed\n");
+
+ send_mbox_msg_unlock(hwdev->func_to_func);
+
+ return err;
+}
+
+int hinic3_send_mbox_to_mgmt(struct hinic3_hwdev *hwdev, u8 mod, u16 cmd,
+ void *buf_in, u16 in_size, void *buf_out,
+ u16 *out_size, u32 timeout, u16 channel)
+{
+ struct hinic3_mbox *func_to_func = hwdev->func_to_func;
+ int err = mbox_func_params_valid(func_to_func, buf_in, in_size,
+ channel);
+ if (err)
+ return err;
+
+	/* TODO: MPU has not implemented this cmd yet */
+ if (mod == HINIC3_MOD_COMM && cmd == COMM_MGMT_CMD_SEND_API_ACK_BY_UP)
+ return 0;
+
+ return hinic3_mbox_to_func(func_to_func, mod, cmd, HINIC3_MGMT_SRC_ID,
+ buf_in, in_size, buf_out, out_size, timeout,
+ channel);
+}
+
+void hinic3_response_mbox_to_mgmt(struct hinic3_hwdev *hwdev, u8 mod, u16 cmd,
+ void *buf_in, u16 in_size, u16 msg_id)
+{
+ struct mbox_msg_info msg_info;
+
+ msg_info.msg_id = (u8)msg_id;
+ msg_info.status = 0;
+
+ send_mbox_msg(hwdev->func_to_func, mod, cmd, buf_in, in_size,
+ HINIC3_MGMT_SRC_ID, HINIC3_MSG_RESPONSE,
+ HINIC3_MSG_NO_ACK, &msg_info);
+}
+
+int hinic3_send_mbox_to_mgmt_no_ack(struct hinic3_hwdev *hwdev, u8 mod, u16 cmd,
+ void *buf_in, u16 in_size, u16 channel)
+{
+ struct hinic3_mbox *func_to_func = hwdev->func_to_func;
+ int err = mbox_func_params_valid(func_to_func, buf_in, in_size,
+ channel);
+ if (err)
+ return err;
+
+ return hinic3_mbox_to_func_no_ack(hwdev, HINIC3_MGMT_SRC_ID, mod, cmd,
+ buf_in, in_size, channel);
+}
+
+int hinic3_mbox_ppf_to_host(void *hwdev, u8 mod, u16 cmd, u8 host_id,
+ void *buf_in, u16 in_size, void *buf_out,
+ u16 *out_size, u32 timeout, u16 channel)
+{
+ struct hinic3_hwdev *dev = hwdev;
+ u16 dst_ppf_func;
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ if (!(dev->chip_present_flag))
+ return -EPERM;
+
+ err = mbox_func_params_valid(dev->func_to_func, buf_in, in_size,
+ channel);
+ if (err)
+ return err;
+
+ if (!HINIC3_IS_PPF(dev)) {
+ sdk_err(dev->dev_hdl, "Params error, only ppf support send mbox to ppf. func_type: %d\n",
+ hinic3_func_type(dev));
+ return -EINVAL;
+ }
+
+ if (host_id >= HINIC3_MAX_HOST_NUM(dev) ||
+ host_id == HINIC3_PCI_INTF_IDX(dev->hwif)) {
+ sdk_err(dev->dev_hdl, "Params error, host id: %u\n", host_id);
+ return -EINVAL;
+ }
+
+ dst_ppf_func = hinic3_host_ppf_idx(dev, host_id);
+ if (dst_ppf_func >= HINIC3_MAX_PF_NUM(dev)) {
+ sdk_err(dev->dev_hdl, "Dest host(%u) have not elect ppf(0x%x).\n",
+ host_id, dst_ppf_func);
+ return -EINVAL;
+ }
+
+ return hinic3_mbox_to_func(dev->func_to_func, mod, cmd,
+ dst_ppf_func, buf_in, in_size,
+ buf_out, out_size, timeout, channel);
+}
+EXPORT_SYMBOL(hinic3_mbox_ppf_to_host);
+
+int hinic3_mbox_to_pf(void *hwdev, u8 mod, u16 cmd, void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size,
+ u32 timeout, u16 channel)
+{
+ struct hinic3_hwdev *dev = hwdev;
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ if (!(dev->chip_present_flag))
+ return -EPERM;
+
+ err = mbox_func_params_valid(dev->func_to_func, buf_in, in_size,
+ channel);
+ if (err)
+ return err;
+
+ if (!HINIC3_IS_VF(dev)) {
+ sdk_err(dev->dev_hdl, "Params error, func_type: %d\n",
+ hinic3_func_type(dev));
+ return -EINVAL;
+ }
+
+ return hinic3_mbox_to_func(dev->func_to_func, mod, cmd,
+ hinic3_pf_id_of_vf(dev), buf_in, in_size,
+ buf_out, out_size, timeout, channel);
+}
+EXPORT_SYMBOL(hinic3_mbox_to_pf);
+
+int hinic3_mbox_to_vf(void *hwdev, u16 vf_id, u8 mod, u16 cmd, void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size, u32 timeout,
+ u16 channel)
+{
+ struct hinic3_mbox *func_to_func = NULL;
+ int err = 0;
+ u16 dst_func_idx;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ func_to_func = ((struct hinic3_hwdev *)hwdev)->func_to_func;
+ err = mbox_func_params_valid(func_to_func, buf_in, in_size, channel);
+ if (err != 0)
+ return err;
+
+ if (HINIC3_IS_VF((struct hinic3_hwdev *)hwdev)) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl, "Params error, func_type: %d\n",
+ hinic3_func_type(hwdev));
+ return -EINVAL;
+ }
+
+ if (!vf_id) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "VF id(%u) error!\n", vf_id);
+ return -EINVAL;
+ }
+
+	/* glb_pf_vf_offset + vf_id is the VF's global function id within
+	 * this PF
+	 */
+ dst_func_idx = hinic3_glb_pf_vf_offset(hwdev) + vf_id;
+
+ return hinic3_mbox_to_func(func_to_func, mod, cmd, dst_func_idx, buf_in,
+ in_size, buf_out, out_size, timeout,
+ channel);
+}
+EXPORT_SYMBOL(hinic3_mbox_to_vf);
+
+int hinic3_mbox_set_channel_status(struct hinic3_hwdev *hwdev, u16 channel,
+ bool enable)
+{
+ if (channel >= HINIC3_CHANNEL_MAX) {
+ sdk_err(hwdev->dev_hdl, "Invalid channel id: 0x%x\n", channel);
+ return -EINVAL;
+ }
+
+ if (enable)
+ clear_bit(channel, &hwdev->func_to_func->channel_stop);
+ else
+ set_bit(channel, &hwdev->func_to_func->channel_stop);
+
+ sdk_info(hwdev->dev_hdl, "%s mbox channel 0x%x\n",
+ enable ? "Enable" : "Disable", channel);
+
+ return 0;
+}
+
+void hinic3_mbox_enable_channel_lock(struct hinic3_hwdev *hwdev, bool enable)
+{
+ hwdev->func_to_func->lock_channel_en = enable;
+
+ sdk_info(hwdev->dev_hdl, "%s mbox channel lock\n",
+ enable ? "Enable" : "Disable");
+}
+
+static int alloc_mbox_msg_channel(struct hinic3_msg_channel *msg_ch)
+{
+ msg_ch->resp_msg.msg = kzalloc(MBOX_MAX_BUF_SZ, GFP_KERNEL);
+ if (!msg_ch->resp_msg.msg)
+ return -ENOMEM;
+
+ msg_ch->recv_msg.msg = kzalloc(MBOX_MAX_BUF_SZ, GFP_KERNEL);
+ if (!msg_ch->recv_msg.msg) {
+ kfree(msg_ch->resp_msg.msg);
+ return -ENOMEM;
+ }
+
+ msg_ch->resp_msg.seq_id = SEQ_ID_MAX_VAL;
+ msg_ch->recv_msg.seq_id = SEQ_ID_MAX_VAL;
+ atomic_set(&msg_ch->recv_msg_cnt, 0);
+
+ return 0;
+}
+
+static void free_mbox_msg_channel(struct hinic3_msg_channel *msg_ch)
+{
+ kfree(msg_ch->recv_msg.msg);
+ kfree(msg_ch->resp_msg.msg);
+}
+
+static int init_mgmt_msg_channel(struct hinic3_mbox *func_to_func)
+{
+ int err;
+
+ err = alloc_mbox_msg_channel(&func_to_func->mgmt_msg);
+ if (err != 0) {
+ sdk_err(func_to_func->hwdev->dev_hdl, "Failed to alloc mgmt message channel\n");
+ return err;
+ }
+
+ err = hinic3_init_mbox_dma_queue(func_to_func);
+ if (err != 0) {
+ sdk_err(func_to_func->hwdev->dev_hdl, "Failed to init mbox dma queue\n");
+ free_mbox_msg_channel(&func_to_func->mgmt_msg);
+ }
+
+ return err;
+}
+
+static void deinit_mgmt_msg_channel(struct hinic3_mbox *func_to_func)
+{
+ hinic3_deinit_mbox_dma_queue(func_to_func);
+ free_mbox_msg_channel(&func_to_func->mgmt_msg);
+}
+
+int hinic3_mbox_init_host_msg_channel(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_mbox *func_to_func = hwdev->func_to_func;
+ u8 host_num = HINIC3_MAX_HOST_NUM(hwdev);
+ int i, host_id, err;
+
+ if (host_num == 0)
+ return 0;
+
+ func_to_func->host_msg = kcalloc(host_num,
+ sizeof(*func_to_func->host_msg),
+ GFP_KERNEL);
+ if (!func_to_func->host_msg) {
+ sdk_err(func_to_func->hwdev->dev_hdl, "Failed to alloc host message array\n");
+ return -ENOMEM;
+ }
+
+ for (host_id = 0; host_id < host_num; host_id++) {
+ err = alloc_mbox_msg_channel(&func_to_func->host_msg[host_id]);
+ if (err) {
+ sdk_err(func_to_func->hwdev->dev_hdl,
+ "Failed to alloc host %d message channel\n",
+ host_id);
+ goto alloc_msg_ch_err;
+ }
+ }
+
+ func_to_func->support_h2h_msg = true;
+
+ return 0;
+
+alloc_msg_ch_err:
+ for (i = 0; i < host_id; i++)
+ free_mbox_msg_channel(&func_to_func->host_msg[i]);
+
+ kfree(func_to_func->host_msg);
+ func_to_func->host_msg = NULL;
+
+ return -ENOMEM;
+}
+
+static void deinit_host_msg_channel(struct hinic3_mbox *func_to_func)
+{
+ int i;
+
+ if (!func_to_func->host_msg)
+ return;
+
+ for (i = 0; i < HINIC3_MAX_HOST_NUM(func_to_func->hwdev); i++)
+ free_mbox_msg_channel(&func_to_func->host_msg[i]);
+
+ kfree(func_to_func->host_msg);
+ func_to_func->host_msg = NULL;
+}
+
+int hinic3_init_func_mbox_msg_channel(void *hwdev, u16 num_func)
+{
+ struct hinic3_hwdev *dev = hwdev;
+ struct hinic3_mbox *func_to_func = NULL;
+ u16 func_id, i;
+ int err;
+
+ if (!hwdev || !num_func || num_func > HINIC3_MAX_FUNCTIONS)
+ return -EINVAL;
+
+ func_to_func = dev->func_to_func;
+ if (func_to_func->func_msg)
+ return (func_to_func->num_func_msg == num_func) ? 0 : -EFAULT;
+
+ func_to_func->func_msg =
+ kcalloc(num_func, sizeof(*func_to_func->func_msg), GFP_KERNEL);
+ if (!func_to_func->func_msg) {
+ sdk_err(func_to_func->hwdev->dev_hdl, "Failed to alloc func message array\n");
+ return -ENOMEM;
+ }
+
+ for (func_id = 0; func_id < num_func; func_id++) {
+ err = alloc_mbox_msg_channel(&func_to_func->func_msg[func_id]);
+ if (err != 0) {
+ sdk_err(func_to_func->hwdev->dev_hdl,
+ "Failed to alloc func %hu message channel\n",
+ func_id);
+ goto alloc_msg_ch_err;
+ }
+ }
+
+ func_to_func->num_func_msg = num_func;
+
+ return 0;
+
+alloc_msg_ch_err:
+ for (i = 0; i < func_id; i++)
+ free_mbox_msg_channel(&func_to_func->func_msg[i]);
+
+ kfree(func_to_func->func_msg);
+ func_to_func->func_msg = NULL;
+
+ return -ENOMEM;
+}
+
+static void hinic3_deinit_func_mbox_msg_channel(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_mbox *func_to_func = hwdev->func_to_func;
+ u16 i;
+
+ if (!func_to_func->func_msg)
+ return;
+
+ for (i = 0; i < func_to_func->num_func_msg; i++)
+ free_mbox_msg_channel(&func_to_func->func_msg[i]);
+
+ kfree(func_to_func->func_msg);
+ func_to_func->func_msg = NULL;
+}
+
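+/* Route an incoming message to its per-peer channel: the mgmt cpu, the
+ * VF's own PF, one of this PF's VFs, or another host's PPF.
+ */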
+static struct hinic3_msg_desc *get_mbox_msg_desc(struct hinic3_mbox *func_to_func,
+ u64 dir, u64 src_func_id)
+{
+ struct hinic3_hwdev *hwdev = func_to_func->hwdev;
+ struct hinic3_msg_channel *msg_ch = NULL;
+ u16 id;
+
+ if (src_func_id == HINIC3_MGMT_SRC_ID) {
+ msg_ch = &func_to_func->mgmt_msg;
+ } else if (HINIC3_IS_VF(hwdev)) {
+ /* message from pf */
+ msg_ch = func_to_func->func_msg;
+ if (src_func_id != hinic3_pf_id_of_vf(hwdev) || !msg_ch)
+ return NULL;
+ } else if (src_func_id > hinic3_glb_pf_vf_offset(hwdev)) {
+ /* message from vf */
+ id = (u16)(src_func_id - 1U) - hinic3_glb_pf_vf_offset(hwdev);
+ if (id >= func_to_func->num_func_msg)
+ return NULL;
+
+ msg_ch = &func_to_func->func_msg[id];
+ } else {
+ /* message from other host's ppf */
+ if (!func_to_func->support_h2h_msg)
+ return NULL;
+
+ for (id = 0; id < HINIC3_MAX_HOST_NUM(hwdev); id++) {
+ if (src_func_id == hinic3_host_ppf_idx(hwdev, (u8)id))
+ break;
+ }
+
+ if (id == HINIC3_MAX_HOST_NUM(hwdev) || !func_to_func->host_msg)
+ return NULL;
+
+ msg_ch = &func_to_func->host_msg[id];
+ }
+
+ return (dir == HINIC3_MSG_DIRECT_SEND) ?
+ &msg_ch->recv_msg : &msg_ch->resp_msg;
+}
+
+static void prepare_send_mbox(struct hinic3_mbox *func_to_func)
+{
+ struct hinic3_send_mbox *send_mbox = &func_to_func->send_mbox;
+
+ send_mbox->data = MBOX_AREA(func_to_func->hwdev->hwif);
+}
+
+static int alloc_mbox_wb_status(struct hinic3_mbox *func_to_func)
+{
+ struct hinic3_send_mbox *send_mbox = &func_to_func->send_mbox;
+ struct hinic3_hwdev *hwdev = func_to_func->hwdev;
+ u32 addr_h, addr_l;
+
+ send_mbox->wb_vaddr = dma_zalloc_coherent(hwdev->dev_hdl,
+ MBOX_WB_STATUS_LEN,
+ &send_mbox->wb_paddr,
+ GFP_KERNEL);
+ if (!send_mbox->wb_vaddr)
+ return -ENOMEM;
+
+ send_mbox->wb_status = send_mbox->wb_vaddr;
+
+ addr_h = upper_32_bits(send_mbox->wb_paddr);
+ addr_l = lower_32_bits(send_mbox->wb_paddr);
+
+ hinic3_hwif_write_reg(hwdev->hwif, HINIC3_FUNC_CSR_MAILBOX_RESULT_H_OFF,
+ addr_h);
+ hinic3_hwif_write_reg(hwdev->hwif, HINIC3_FUNC_CSR_MAILBOX_RESULT_L_OFF,
+ addr_l);
+
+ return 0;
+}
+
+static void free_mbox_wb_status(struct hinic3_mbox *func_to_func)
+{
+ struct hinic3_send_mbox *send_mbox = &func_to_func->send_mbox;
+ struct hinic3_hwdev *hwdev = func_to_func->hwdev;
+
+ hinic3_hwif_write_reg(hwdev->hwif, HINIC3_FUNC_CSR_MAILBOX_RESULT_H_OFF,
+ 0);
+ hinic3_hwif_write_reg(hwdev->hwif, HINIC3_FUNC_CSR_MAILBOX_RESULT_L_OFF,
+ 0);
+
+ dma_free_coherent(hwdev->dev_hdl, MBOX_WB_STATUS_LEN,
+ send_mbox->wb_vaddr, send_mbox->wb_paddr);
+}
+
+int hinic3_func_to_func_init(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_mbox *func_to_func;
+ int err = -ENOMEM;
+
+ func_to_func = kzalloc(sizeof(*func_to_func), GFP_KERNEL);
+ if (!func_to_func)
+ return -ENOMEM;
+
+ hwdev->func_to_func = func_to_func;
+ func_to_func->hwdev = hwdev;
+ mutex_init(&func_to_func->mbox_send_lock);
+ mutex_init(&func_to_func->msg_send_lock);
+ spin_lock_init(&func_to_func->mbox_lock);
+ func_to_func->workq = create_singlethread_workqueue(HINIC3_MBOX_WQ_NAME);
+ if (!func_to_func->workq) {
+ sdk_err(hwdev->dev_hdl, "Failed to initialize MBOX workqueue\n");
+ goto create_mbox_workq_err;
+ }
+
+ err = init_mgmt_msg_channel(func_to_func);
+ if (err)
+ goto init_mgmt_msg_ch_err;
+
+ if (HINIC3_IS_VF(hwdev)) {
+ /* VF to PF mbox message channel */
+ err = hinic3_init_func_mbox_msg_channel(hwdev, 1);
+ if (err)
+ goto init_func_msg_ch_err;
+ }
+
+ err = alloc_mbox_wb_status(func_to_func);
+ if (err) {
+ sdk_err(hwdev->dev_hdl, "Failed to alloc mbox write back status\n");
+ goto alloc_wb_status_err;
+ }
+
+ prepare_send_mbox(func_to_func);
+
+ return 0;
+
+alloc_wb_status_err:
+ if (HINIC3_IS_VF(hwdev))
+ hinic3_deinit_func_mbox_msg_channel(hwdev);
+
+init_func_msg_ch_err:
+ deinit_mgmt_msg_channel(func_to_func);
+
+init_mgmt_msg_ch_err:
+ destroy_workqueue(func_to_func->workq);
+
+create_mbox_workq_err:
+ spin_lock_deinit(&func_to_func->mbox_lock);
+ mutex_deinit(&func_to_func->msg_send_lock);
+ mutex_deinit(&func_to_func->mbox_send_lock);
+ kfree(func_to_func);
+
+ return err;
+}
+
+void hinic3_func_to_func_free(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_mbox *func_to_func = hwdev->func_to_func;
+
+ /* destroy workqueue before free related mbox resources in case of
+ * illegal resource access
+ */
+ destroy_workqueue(func_to_func->workq);
+
+ free_mbox_wb_status(func_to_func);
+ if (HINIC3_IS_PPF(hwdev))
+ deinit_host_msg_channel(func_to_func);
+ hinic3_deinit_func_mbox_msg_channel(hwdev);
+ deinit_mgmt_msg_channel(func_to_func);
+ spin_lock_deinit(&func_to_func->mbox_lock);
+ mutex_deinit(&func_to_func->mbox_send_lock);
+ mutex_deinit(&func_to_func->msg_send_lock);
+
+ kfree(func_to_func);
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_mbox.h b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_mbox.h
new file mode 100644
index 000000000000..bf723e8a68fb
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_mbox.h
@@ -0,0 +1,267 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_MBOX_H
+#define HINIC3_MBOX_H
+
+#include "hinic3_crm.h"
+#include "hinic3_hwdev.h"
+
+#define HINIC3_MBOX_PF_SEND_ERR 0x1
+
+#define HINIC3_MGMT_SRC_ID 0x1FFF
+#define HINIC3_MAX_FUNCTIONS 4096
+
+/* message header define */
+#define HINIC3_MSG_HEADER_SRC_GLB_FUNC_IDX_SHIFT 0
+#define HINIC3_MSG_HEADER_STATUS_SHIFT 13
+#define HINIC3_MSG_HEADER_SOURCE_SHIFT 15
+#define HINIC3_MSG_HEADER_AEQ_ID_SHIFT 16
+#define HINIC3_MSG_HEADER_MSG_ID_SHIFT 18
+#define HINIC3_MSG_HEADER_CMD_SHIFT 22
+
+#define HINIC3_MSG_HEADER_MSG_LEN_SHIFT 32
+#define HINIC3_MSG_HEADER_MODULE_SHIFT 43
+#define HINIC3_MSG_HEADER_SEG_LEN_SHIFT 48
+#define HINIC3_MSG_HEADER_NO_ACK_SHIFT 54
+#define HINIC3_MSG_HEADER_DATA_TYPE_SHIFT 55
+#define HINIC3_MSG_HEADER_SEQID_SHIFT 56
+#define HINIC3_MSG_HEADER_LAST_SHIFT 62
+#define HINIC3_MSG_HEADER_DIRECTION_SHIFT 63
+
+#define HINIC3_MSG_HEADER_SRC_GLB_FUNC_IDX_MASK 0x1FFF
+#define HINIC3_MSG_HEADER_STATUS_MASK 0x1
+#define HINIC3_MSG_HEADER_SOURCE_MASK 0x1
+#define HINIC3_MSG_HEADER_AEQ_ID_MASK 0x3
+#define HINIC3_MSG_HEADER_MSG_ID_MASK 0xF
+#define HINIC3_MSG_HEADER_CMD_MASK 0x3FF
+
+#define HINIC3_MSG_HEADER_MSG_LEN_MASK 0x7FF
+#define HINIC3_MSG_HEADER_MODULE_MASK 0x1F
+#define HINIC3_MSG_HEADER_SEG_LEN_MASK 0x3F
+#define HINIC3_MSG_HEADER_NO_ACK_MASK 0x1
+#define HINIC3_MSG_HEADER_DATA_TYPE_MASK 0x1
+#define HINIC3_MSG_HEADER_SEQID_MASK 0x3F
+#define HINIC3_MSG_HEADER_LAST_MASK 0x1
+#define HINIC3_MSG_HEADER_DIRECTION_MASK 0x1
+
+#define HINIC3_MSG_HEADER_GET(val, field) \
+ (((val) >> HINIC3_MSG_HEADER_##field##_SHIFT) & \
+ HINIC3_MSG_HEADER_##field##_MASK)
+#define HINIC3_MSG_HEADER_SET(val, field) \
+ ((u64)(((u64)(val)) & HINIC3_MSG_HEADER_##field##_MASK) << \
+ HINIC3_MSG_HEADER_##field##_SHIFT)
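+/* e.g. HINIC3_MSG_HEADER_GET(header, SEG_LEN) extracts the 6-bit segment
+ * length field from a 64-bit message header
+ */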
+
+#define IS_DMA_MBX_MSG(dst_func) ((dst_func) == HINIC3_MGMT_SRC_ID)
+
+enum hinic3_msg_direction_type {
+ HINIC3_MSG_DIRECT_SEND = 0,
+ HINIC3_MSG_RESPONSE = 1,
+};
+
+enum hinic3_msg_segment_type {
+ NOT_LAST_SEGMENT = 0,
+ LAST_SEGMENT = 1,
+};
+
+enum hinic3_msg_ack_type {
+ HINIC3_MSG_ACK,
+ HINIC3_MSG_NO_ACK,
+};
+
+enum hinic3_data_type {
+ HINIC3_DATA_INLINE = 0,
+ HINIC3_DATA_DMA = 1,
+};
+
+enum hinic3_msg_src_type {
+ HINIC3_MSG_FROM_MGMT = 0,
+ HINIC3_MSG_FROM_MBOX = 1,
+};
+
+enum hinic3_msg_aeq_type {
+ HINIC3_ASYNC_MSG_AEQ = 0,
+	/* indicates which aeq the dest func or mgmt cpu should use to
+	 * respond to mbox messages
+	 */
+ HINIC3_MBOX_RSP_MSG_AEQ = 1,
+	/* indicates which aeq the mgmt cpu should use to respond to api
+	 * cmd messages
+	 */
+ HINIC3_MGMT_RSP_MSG_AEQ = 2,
+};
+
+#define HINIC3_MBOX_WQ_NAME "hinic3_mbox"
+
+struct mbox_msg_info {
+ u8 msg_id;
+ u8 status; /* can only use 1 bit */
+};
+
+struct hinic3_msg_desc {
+ void *msg;
+ u16 msg_len;
+ u8 seq_id;
+ u8 mod;
+ u16 cmd;
+ struct mbox_msg_info msg_info;
+};
+
+struct hinic3_msg_channel {
+ struct hinic3_msg_desc resp_msg;
+ struct hinic3_msg_desc recv_msg;
+
+ atomic_t recv_msg_cnt;
+};
+
+/* Receives mbox messages sent by other functions */
+struct hinic3_recv_mbox {
+ void *msg;
+ u16 msg_len;
+ u8 msg_id;
+ u8 mod;
+ u16 cmd;
+ u16 src_func_idx;
+
+ enum hinic3_msg_ack_type ack_type;
+ u32 rsvd1;
+
+ void *resp_buff;
+};
+
+struct hinic3_send_mbox {
+ u8 *data;
+
+ u64 *wb_status; /* write back status */
+ void *wb_vaddr;
+ dma_addr_t wb_paddr;
+};
+
+enum mbox_event_state {
+ EVENT_START = 0,
+ EVENT_FAIL,
+ EVENT_SUCCESS,
+ EVENT_TIMEOUT,
+ EVENT_END,
+};
+
+enum hinic3_mbox_cb_state {
+ HINIC3_VF_MBOX_CB_REG = 0,
+ HINIC3_VF_MBOX_CB_RUNNING,
+ HINIC3_PF_MBOX_CB_REG,
+ HINIC3_PF_MBOX_CB_RUNNING,
+ HINIC3_PPF_MBOX_CB_REG,
+ HINIC3_PPF_MBOX_CB_RUNNING,
+ HINIC3_PPF_TO_PF_MBOX_CB_REG,
+ HINIC3_PPF_TO_PF_MBOX_CB_RUNNIG,
+};
+
+struct mbox_dma_msg {
+ u32 xor;
+ u32 dma_addr_high;
+ u32 dma_addr_low;
+ u32 msg_len;
+ u64 rsvd;
+};
+
+struct mbox_dma_queue {
+ void *dma_buff_vaddr;
+ dma_addr_t dma_buff_paddr;
+
+ u16 depth;
+ u16 prod_idx;
+ u16 cons_idx;
+};
+
+struct hinic3_mbox {
+ struct hinic3_hwdev *hwdev;
+
+ bool lock_channel_en;
+ unsigned long channel_stop;
+ u16 cur_msg_channel;
+ u32 rsvd1;
+
+	/* serializes a complete send/ack mbox transaction */
+	struct mutex mbox_send_lock;
+	/* serializes sending of mbox message segments */
+	struct mutex msg_send_lock;
+ struct hinic3_send_mbox send_mbox;
+
+ struct mbox_dma_queue sync_msg_queue;
+ struct mbox_dma_queue async_msg_queue;
+
+ struct workqueue_struct *workq;
+
+ struct hinic3_msg_channel mgmt_msg; /* driver and MGMT CPU */
+ struct hinic3_msg_channel *host_msg; /* PPF message between hosts */
+ struct hinic3_msg_channel *func_msg; /* PF to VF or VF to PF */
+ u16 num_func_msg;
+ bool support_h2h_msg; /* host to host */
+
+ /* vf receive pf/ppf callback */
+ hinic3_vf_mbox_cb vf_mbox_cb[HINIC3_MOD_MAX];
+ void *vf_mbox_data[HINIC3_MOD_MAX];
+ /* pf/ppf receive vf callback */
+ hinic3_pf_mbox_cb pf_mbox_cb[HINIC3_MOD_MAX];
+ void *pf_mbox_data[HINIC3_MOD_MAX];
+ /* ppf receive pf/ppf callback */
+ hinic3_ppf_mbox_cb ppf_mbox_cb[HINIC3_MOD_MAX];
+ void *ppf_mbox_data[HINIC3_MOD_MAX];
+ /* pf receive ppf callback */
+ hinic3_pf_recv_from_ppf_mbox_cb pf_recv_ppf_mbox_cb[HINIC3_MOD_MAX];
+ void *pf_recv_ppf_mbox_data[HINIC3_MOD_MAX];
+ unsigned long ppf_to_pf_mbox_cb_state[HINIC3_MOD_MAX];
+ unsigned long ppf_mbox_cb_state[HINIC3_MOD_MAX];
+ unsigned long pf_mbox_cb_state[HINIC3_MOD_MAX];
+ unsigned long vf_mbox_cb_state[HINIC3_MOD_MAX];
+
+ u8 send_msg_id;
+ u16 rsvd2;
+ enum mbox_event_state event_flag;
+ /* lock for mbox event flag */
+ spinlock_t mbox_lock;
+ u64 rsvd3;
+};
+
+struct hinic3_mbox_work {
+ struct work_struct work;
+ struct hinic3_mbox *func_to_func;
+ struct hinic3_recv_mbox *recv_mbox;
+ struct hinic3_msg_channel *msg_ch;
+};
+
+struct vf_cmd_check_handle {
+ u16 cmd;
+ bool (*check_cmd)(struct hinic3_hwdev *hwdev, u16 src_func_idx,
+ void *buf_in, u16 in_size);
+};
+
+void hinic3_mbox_func_aeqe_handler(void *handle, u8 *header, u8 size);
+
+bool hinic3_mbox_check_cmd_valid(struct hinic3_hwdev *hwdev,
+ struct vf_cmd_check_handle *cmd_handle,
+ u16 vf_id, u16 cmd, void *buf_in, u16 in_size,
+ u8 size);
+
+int hinic3_func_to_func_init(struct hinic3_hwdev *hwdev);
+
+void hinic3_func_to_func_free(struct hinic3_hwdev *hwdev);
+
+int hinic3_send_mbox_to_mgmt(struct hinic3_hwdev *hwdev, u8 mod, u16 cmd,
+ void *buf_in, u16 in_size, void *buf_out,
+ u16 *out_size, u32 timeout, u16 channel);
+
+void hinic3_response_mbox_to_mgmt(struct hinic3_hwdev *hwdev, u8 mod, u16 cmd,
+ void *buf_in, u16 in_size, u16 msg_id);
+
+int hinic3_send_mbox_to_mgmt_no_ack(struct hinic3_hwdev *hwdev, u8 mod, u16 cmd,
+ void *buf_in, u16 in_size, u16 channel);
+int hinic3_mbox_to_func(struct hinic3_mbox *func_to_func, u8 mod, u16 cmd,
+ u16 dst_func, void *buf_in, u16 in_size,
+ void *buf_out, u16 *out_size, u32 timeout, u16 channel);
+
+int hinic3_mbox_init_host_msg_channel(struct hinic3_hwdev *hwdev);
+
+int hinic3_mbox_set_channel_status(struct hinic3_hwdev *hwdev, u16 channel,
+ bool enable);
+
+void hinic3_mbox_enable_channel_lock(struct hinic3_hwdev *hwdev, bool enable);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_mgmt.c b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_mgmt.c
new file mode 100644
index 000000000000..f633262e8b71
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_mgmt.c
@@ -0,0 +1,1515 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [COMM]" fmt
+
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/pci.h>
+#include <linux/device.h>
+#include <linux/spinlock.h>
+#include <linux/completion.h>
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/interrupt.h>
+#include <linux/semaphore.h>
+
+#include "ossl_knl.h"
+#include "hinic3_crm.h"
+#include "hinic3_hw.h"
+#include "hinic3_common.h"
+#include "hinic3_comm_cmd.h"
+#include "hinic3_hwdev.h"
+#include "hinic3_eqs.h"
+#include "hinic3_mbox.h"
+#include "hinic3_api_cmd.h"
+#include "hinic3_prof_adap.h"
+#include "hinic3_csr.h"
+#include "hinic3_mgmt.h"
+
+#define HINIC3_MSG_TO_MGMT_MAX_LEN 2016
+
+#define HINIC3_API_CHAIN_AEQ_ID 2
+#define MAX_PF_MGMT_BUF_SIZE 2048UL
+#define SEGMENT_LEN 48
+#define ASYNC_MSG_FLAG 0x8
+#define MGMT_MSG_MAX_SEQ_ID (ALIGN(HINIC3_MSG_TO_MGMT_MAX_LEN, \
+ SEGMENT_LEN) / SEGMENT_LEN)
+
+#define MGMT_MSG_LAST_SEG_MAX_LEN (MAX_PF_MGMT_BUF_SIZE - \
+ SEGMENT_LEN * MGMT_MSG_MAX_SEQ_ID)
+
+#define BUF_OUT_DEFAULT_SIZE 1
+
+#define MGMT_MSG_SIZE_MIN 20
+#define MGMT_MSG_SIZE_STEP 16
+#define MGMT_MSG_RSVD_FOR_DEV 8
+
+#define SYNC_MSG_ID_MASK 0x7
+#define ASYNC_MSG_ID_MASK 0x7
+
+#define SYNC_FLAG 0
+#define ASYNC_FLAG 1
+
+#define MSG_NO_RESP 0xFFFF
+
+#define MGMT_MSG_TIMEOUT 20000 /* millisecond */
+
+#define SYNC_MSG_ID(pf_to_mgmt) ((pf_to_mgmt)->sync_msg_id)
+
+#define SYNC_MSG_ID_INC(pf_to_mgmt) (SYNC_MSG_ID(pf_to_mgmt) = \
+ (SYNC_MSG_ID(pf_to_mgmt) + 1) & SYNC_MSG_ID_MASK)
+#define ASYNC_MSG_ID(pf_to_mgmt) ((pf_to_mgmt)->async_msg_id)
+
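+/* Async message ids keep ASYNC_MSG_FLAG set so that responses to async
+ * messages can be told apart from sync ones (see mgmt_resp_msg_handler).
+ */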
+#define ASYNC_MSG_ID_INC(pf_to_mgmt) (ASYNC_MSG_ID(pf_to_mgmt) = \
+ ((ASYNC_MSG_ID(pf_to_mgmt) + 1) & ASYNC_MSG_ID_MASK) \
+ | ASYNC_MSG_FLAG)
+
+static void pf_to_mgmt_send_event_set(struct hinic3_msg_pf_to_mgmt *pf_to_mgmt,
+ int event_flag)
+{
+ spin_lock(&pf_to_mgmt->sync_event_lock);
+ pf_to_mgmt->event_flag = event_flag;
+ spin_unlock(&pf_to_mgmt->sync_event_lock);
+}
+
+/**
+ * hinic3_register_mgmt_msg_cb - register sync msg handler for a module
+ * @hwdev: the pointer to hw device
+ * @mod: module in the chip that this handler will handle its sync messages
+ * @pri_handle: specific mod's private data that will be used in callback
+ * @callback: the handler that will process the module's sync messages
+ **/
+int hinic3_register_mgmt_msg_cb(void *hwdev, u8 mod, void *pri_handle,
+ hinic3_mgmt_msg_cb callback)
+{
+ struct hinic3_msg_pf_to_mgmt *pf_to_mgmt = NULL;
+
+ if (mod >= HINIC3_MOD_HW_MAX || !hwdev)
+ return -EFAULT;
+
+ pf_to_mgmt = ((struct hinic3_hwdev *)hwdev)->pf_to_mgmt;
+ if (!pf_to_mgmt)
+ return -EINVAL;
+
+ pf_to_mgmt->recv_mgmt_msg_cb[mod] = callback;
+ pf_to_mgmt->recv_mgmt_msg_data[mod] = pri_handle;
+
+ set_bit(HINIC3_MGMT_MSG_CB_REG, &pf_to_mgmt->mgmt_msg_cb_state[mod]);
+
+ return 0;
+}
+EXPORT_SYMBOL(hinic3_register_mgmt_msg_cb);
+
+/**
+ * hinic3_unregister_mgmt_msg_cb - unregister sync msg handler for a module
+ * @hwdev: the pointer to hw device
+ * @mod: module in the chip that this handler will handle its sync messages
+ **/
+void hinic3_unregister_mgmt_msg_cb(void *hwdev, u8 mod)
+{
+ struct hinic3_msg_pf_to_mgmt *pf_to_mgmt = NULL;
+
+ if (!hwdev || mod >= HINIC3_MOD_HW_MAX)
+ return;
+
+ pf_to_mgmt = ((struct hinic3_hwdev *)hwdev)->pf_to_mgmt;
+ if (!pf_to_mgmt)
+ return;
+
+ clear_bit(HINIC3_MGMT_MSG_CB_REG, &pf_to_mgmt->mgmt_msg_cb_state[mod]);
+
+ while (test_bit(HINIC3_MGMT_MSG_CB_RUNNING,
+ &pf_to_mgmt->mgmt_msg_cb_state[mod]))
+ usleep_range(900, 1000); /* sleep 900 us ~ 1000 us */
+
+ pf_to_mgmt->recv_mgmt_msg_cb[mod] = NULL;
+ pf_to_mgmt->recv_mgmt_msg_data[mod] = NULL;
+}
+EXPORT_SYMBOL(hinic3_unregister_mgmt_msg_cb);
+
+/**
+ * mgmt_msg_len - calculate the total message length
+ * @msg_data_len: the length of the message data
+ * Return: the total message length
+ **/
+static u16 mgmt_msg_len(u16 msg_data_len)
+{
+	/* sizeof(u64) accounts for the message header */
+ u16 msg_size;
+
+ msg_size = (u16)(MGMT_MSG_RSVD_FOR_DEV + sizeof(u64) + msg_data_len);
+
+ if (msg_size > MGMT_MSG_SIZE_MIN)
+ msg_size = MGMT_MSG_SIZE_MIN +
+ ALIGN((msg_size - MGMT_MSG_SIZE_MIN),
+ MGMT_MSG_SIZE_STEP);
+ else
+ msg_size = MGMT_MSG_SIZE_MIN;
+
+ return msg_size;
+}
+
+/**
+ * prepare_header - prepare the header of the message
+ * @pf_to_mgmt: PF to MGMT channel
+ * @header: pointer of the header to prepare
+ * @msg_len: the length of the message
+ * @mod: module in the chip that will get the message
+ * @direction: the direction of the original message
+ * @msg_id: message id
+ **/
+static void prepare_header(struct hinic3_msg_pf_to_mgmt *pf_to_mgmt,
+ u64 *header, u16 msg_len, u8 mod,
+ enum hinic3_msg_ack_type ack_type,
+ enum hinic3_msg_direction_type direction,
+ enum hinic3_mgmt_cmd cmd, u32 msg_id)
+{
+ struct hinic3_hwif *hwif = pf_to_mgmt->hwdev->hwif;
+
+ *header = HINIC3_MSG_HEADER_SET(msg_len, MSG_LEN) |
+ HINIC3_MSG_HEADER_SET(mod, MODULE) |
+ HINIC3_MSG_HEADER_SET(msg_len, SEG_LEN) |
+ HINIC3_MSG_HEADER_SET(ack_type, NO_ACK) |
+ HINIC3_MSG_HEADER_SET(HINIC3_DATA_INLINE, DATA_TYPE) |
+ HINIC3_MSG_HEADER_SET(0, SEQID) |
+ HINIC3_MSG_HEADER_SET(HINIC3_API_CHAIN_AEQ_ID, AEQ_ID) |
+ HINIC3_MSG_HEADER_SET(LAST_SEGMENT, LAST) |
+ HINIC3_MSG_HEADER_SET(direction, DIRECTION) |
+ HINIC3_MSG_HEADER_SET(cmd, CMD) |
+ HINIC3_MSG_HEADER_SET(HINIC3_MSG_FROM_MGMT, SOURCE) |
+ HINIC3_MSG_HEADER_SET(hwif->attr.func_global_idx,
+ SRC_GLB_FUNC_IDX) |
+ HINIC3_MSG_HEADER_SET(msg_id, MSG_ID);
+}
+
+static void clp_prepare_header(struct hinic3_hwdev *hwdev, u64 *header,
+ u16 msg_len, u8 mod,
+ enum hinic3_msg_ack_type ack_type,
+ enum hinic3_msg_direction_type direction,
+ enum hinic3_mgmt_cmd cmd, u32 msg_id)
+{
+ struct hinic3_hwif *hwif = hwdev->hwif;
+
+ *header = HINIC3_MSG_HEADER_SET(msg_len, MSG_LEN) |
+ HINIC3_MSG_HEADER_SET(mod, MODULE) |
+ HINIC3_MSG_HEADER_SET(msg_len, SEG_LEN) |
+ HINIC3_MSG_HEADER_SET(ack_type, NO_ACK) |
+ HINIC3_MSG_HEADER_SET(HINIC3_DATA_INLINE, DATA_TYPE) |
+ HINIC3_MSG_HEADER_SET(0, SEQID) |
+ HINIC3_MSG_HEADER_SET(HINIC3_API_CHAIN_AEQ_ID, AEQ_ID) |
+ HINIC3_MSG_HEADER_SET(LAST_SEGMENT, LAST) |
+ HINIC3_MSG_HEADER_SET(direction, DIRECTION) |
+ HINIC3_MSG_HEADER_SET(cmd, CMD) |
+ HINIC3_MSG_HEADER_SET(hwif->attr.func_global_idx,
+ SRC_GLB_FUNC_IDX) |
+ HINIC3_MSG_HEADER_SET(msg_id, MSG_ID);
+}
+
+/**
+ * prepare_mgmt_cmd - prepare the mgmt command
+ * @mgmt_cmd: pointer to the command to prepare
+ * @header: pointer of the header to prepare
+ * @msg: the data of the message
+ * @msg_len: the length of the message
+ **/
+static void prepare_mgmt_cmd(u8 *mgmt_cmd, u64 *header, const void *msg,
+ int msg_len)
+{
+ u8 *mgmt_cmd_new = mgmt_cmd;
+
+ memset(mgmt_cmd_new, 0, MGMT_MSG_RSVD_FOR_DEV);
+
+ mgmt_cmd_new += MGMT_MSG_RSVD_FOR_DEV;
+ memcpy(mgmt_cmd_new, header, sizeof(*header));
+
+ mgmt_cmd_new += sizeof(*header);
+ memcpy(mgmt_cmd_new, msg, (size_t)(u32)msg_len);
+}
+
+/**
+ * send_msg_to_mgmt_sync - send sync message
+ * @pf_to_mgmt: PF to MGMT channel
+ * @mod: module in the chip that will get the message
+ * @cmd: command of the message
+ * @msg: the msg data
+ * @msg_len: the msg data length
+ * @ack_type: ack type of the message
+ * @direction: the direction of the original message
+ * @resp_msg_id: msg id to respond to
+ * Return: 0 - success, negative - failure
+ **/
+static int send_msg_to_mgmt_sync(struct hinic3_msg_pf_to_mgmt *pf_to_mgmt,
+ u8 mod, u16 cmd, const void *msg, u16 msg_len,
+ enum hinic3_msg_ack_type ack_type,
+ enum hinic3_msg_direction_type direction,
+ u16 resp_msg_id)
+{
+ void *mgmt_cmd = pf_to_mgmt->sync_msg_buf;
+ struct hinic3_api_cmd_chain *chain = NULL;
+ u8 node_id = HINIC3_MGMT_CPU_NODE_ID(pf_to_mgmt->hwdev);
+ u64 header;
+ u16 cmd_size = mgmt_msg_len(msg_len);
+
+ if (hinic3_get_chip_present_flag(pf_to_mgmt->hwdev) == 0)
+ return -EFAULT;
+
+ if (cmd_size > HINIC3_MSG_TO_MGMT_MAX_LEN)
+ return -EFAULT;
+
+ if (direction == HINIC3_MSG_RESPONSE)
+ prepare_header(pf_to_mgmt, &header, msg_len, mod, ack_type,
+ direction, cmd, resp_msg_id);
+ else
+ prepare_header(pf_to_mgmt, &header, msg_len, mod, ack_type,
+ direction, cmd, SYNC_MSG_ID_INC(pf_to_mgmt));
+ chain = pf_to_mgmt->cmd_chain[HINIC3_API_CMD_WRITE_TO_MGMT_CPU];
+
+ if (ack_type == HINIC3_MSG_ACK)
+ pf_to_mgmt_send_event_set(pf_to_mgmt, SEND_EVENT_START);
+
+ prepare_mgmt_cmd((u8 *)mgmt_cmd, &header, msg, msg_len);
+
+ return hinic3_api_cmd_write(chain, node_id, mgmt_cmd, cmd_size);
+}
+
+/**
+ * send_msg_to_mgmt_async - send async message
+ * @pf_to_mgmt: PF to MGMT channel
+ * @mod: module in the chip that will get the message
+ * @cmd: command of the message
+ * @msg: the data of the message
+ * @msg_len: the length of the message
+ * @direction: the direction of the original message
+ * Return: 0 - success, negative - failure
+ **/
+static int send_msg_to_mgmt_async(struct hinic3_msg_pf_to_mgmt *pf_to_mgmt,
+ u8 mod, u16 cmd, const void *msg, u16 msg_len,
+ enum hinic3_msg_direction_type direction)
+{
+ void *mgmt_cmd = pf_to_mgmt->async_msg_buf;
+ struct hinic3_api_cmd_chain *chain = NULL;
+ u8 node_id = HINIC3_MGMT_CPU_NODE_ID(pf_to_mgmt->hwdev);
+ u64 header;
+ u16 cmd_size = mgmt_msg_len(msg_len);
+
+ if (hinic3_get_chip_present_flag(pf_to_mgmt->hwdev) == 0)
+ return -EFAULT;
+
+ if (cmd_size > HINIC3_MSG_TO_MGMT_MAX_LEN)
+ return -EFAULT;
+
+ prepare_header(pf_to_mgmt, &header, msg_len, mod, HINIC3_MSG_NO_ACK,
+ direction, cmd, ASYNC_MSG_ID(pf_to_mgmt));
+
+ prepare_mgmt_cmd((u8 *)mgmt_cmd, &header, msg, msg_len);
+
+ chain = pf_to_mgmt->cmd_chain[HINIC3_API_CMD_WRITE_ASYNC_TO_MGMT_CPU];
+
+ return hinic3_api_cmd_write(chain, node_id, mgmt_cmd, cmd_size);
+}
+
+static inline void msg_to_mgmt_pre(u8 mod, void *buf_in)
+{
+ struct hinic3_msg_head *msg_head = NULL;
+
+	/* the aeq number is fixed at 3, so the response aeq id must be < 3 */
+ if (mod == HINIC3_MOD_COMM || mod == HINIC3_MOD_L2NIC) {
+ msg_head = buf_in;
+
+ if (msg_head->resp_aeq_num >= HINIC3_MAX_AEQS)
+ msg_head->resp_aeq_num = 0;
+ }
+}
+
+int hinic3_pf_to_mgmt_sync(void *hwdev, u8 mod, u16 cmd, void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size,
+ u32 timeout)
+{
+ struct hinic3_msg_pf_to_mgmt *pf_to_mgmt = NULL;
+ void *dev = ((struct hinic3_hwdev *)hwdev)->dev_hdl;
+ struct hinic3_recv_msg *recv_msg = NULL;
+ struct completion *recv_done = NULL;
+ ulong timeo;
+ int err;
+ ulong ret;
+
+ if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ msg_to_mgmt_pre(mod, buf_in);
+
+ pf_to_mgmt = ((struct hinic3_hwdev *)hwdev)->pf_to_mgmt;
+
+ /* Lock the sync_msg_buf */
+ down(&pf_to_mgmt->sync_msg_lock);
+ recv_msg = &pf_to_mgmt->recv_resp_msg_from_mgmt;
+ recv_done = &recv_msg->recv_done;
+
+ init_completion(recv_done);
+
+ err = send_msg_to_mgmt_sync(pf_to_mgmt, mod, cmd, buf_in, in_size,
+ HINIC3_MSG_ACK, HINIC3_MSG_DIRECT_SEND,
+ MSG_NO_RESP);
+ if (err) {
+ sdk_err(dev, "Failed to send sync msg to mgmt, sync_msg_id: %u\n",
+ pf_to_mgmt->sync_msg_id);
+ pf_to_mgmt_send_event_set(pf_to_mgmt, SEND_EVENT_FAIL);
+ goto unlock_sync_msg;
+ }
+
+ timeo = msecs_to_jiffies(timeout ? timeout : MGMT_MSG_TIMEOUT);
+
+ ret = wait_for_completion_timeout(recv_done, timeo);
+ if (!ret) {
+ sdk_err(dev, "Mgmt response sync cmd timeout, sync_msg_id: %u\n",
+ pf_to_mgmt->sync_msg_id);
+ hinic3_dump_aeq_info((struct hinic3_hwdev *)hwdev);
+ err = -ETIMEDOUT;
+ pf_to_mgmt_send_event_set(pf_to_mgmt, SEND_EVENT_TIMEOUT);
+ goto unlock_sync_msg;
+ }
+
+ spin_lock(&pf_to_mgmt->sync_event_lock);
+ if (pf_to_mgmt->event_flag == SEND_EVENT_TIMEOUT) {
+ spin_unlock(&pf_to_mgmt->sync_event_lock);
+ err = -ETIMEDOUT;
+ goto unlock_sync_msg;
+ }
+ spin_unlock(&pf_to_mgmt->sync_event_lock);
+
+ pf_to_mgmt_send_event_set(pf_to_mgmt, SEND_EVENT_END);
+
+ if (!(((struct hinic3_hwdev *)hwdev)->chip_present_flag)) {
+ destroy_completion(recv_done);
+ up(&pf_to_mgmt->sync_msg_lock);
+ return -ETIMEDOUT;
+ }
+
+ if (buf_out && out_size) {
+ if (*out_size < recv_msg->msg_len) {
+ sdk_err(dev, "Invalid response message length: %u for mod %d cmd %u from mgmt, should less than: %u\n",
+ recv_msg->msg_len, mod, cmd, *out_size);
+ err = -EFAULT;
+ goto unlock_sync_msg;
+ }
+
+ if (recv_msg->msg_len)
+ memcpy(buf_out, recv_msg->msg, recv_msg->msg_len);
+
+ *out_size = recv_msg->msg_len;
+ }
+
+unlock_sync_msg:
+ destroy_completion(recv_done);
+ up(&pf_to_mgmt->sync_msg_lock);
+
+ return err;
+}
+
+int hinic3_pf_to_mgmt_async(void *hwdev, u8 mod, u16 cmd, const void *buf_in,
+ u16 in_size)
+{
+ struct hinic3_msg_pf_to_mgmt *pf_to_mgmt;
+ void *dev = ((struct hinic3_hwdev *)hwdev)->dev_hdl;
+ int err;
+
+ if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ pf_to_mgmt = ((struct hinic3_hwdev *)hwdev)->pf_to_mgmt;
+
+ /* Lock the async_msg_buf */
+ spin_lock_bh(&pf_to_mgmt->async_msg_lock);
+ ASYNC_MSG_ID_INC(pf_to_mgmt);
+
+ err = send_msg_to_mgmt_async(pf_to_mgmt, mod, cmd, buf_in, in_size,
+ HINIC3_MSG_DIRECT_SEND);
+ spin_unlock_bh(&pf_to_mgmt->async_msg_lock);
+
+ if (err) {
+ sdk_err(dev, "Failed to send async mgmt msg\n");
+ return err;
+ }
+
+ return 0;
+}
+
+int hinic3_pf_msg_to_mgmt_sync(void *hwdev, u8 mod, u16 cmd, void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size,
+ u32 timeout)
+{
+ if (!hwdev)
+ return -EINVAL;
+
+ if (hinic3_get_chip_present_flag(hwdev) == 0)
+ return -EPERM;
+
+ if (in_size > HINIC3_MSG_TO_MGMT_MAX_LEN)
+ return -EINVAL;
+
+ if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ return hinic3_pf_to_mgmt_sync(hwdev, mod, cmd, buf_in, in_size,
+ buf_out, out_size, timeout);
+}
+
+int hinic3_msg_to_mgmt_sync(void *hwdev, u8 mod, u16 cmd, void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size,
+ u32 timeout, u16 channel)
+{
+ if (!hwdev)
+ return -EINVAL;
+
+ if (hinic3_get_chip_present_flag(hwdev) == 0)
+ return -EPERM;
+
+ return hinic3_send_mbox_to_mgmt(hwdev, mod, cmd, buf_in, in_size,
+ buf_out, out_size, timeout, channel);
+}
+EXPORT_SYMBOL(hinic3_msg_to_mgmt_sync);
+
+int hinic3_msg_to_mgmt_no_ack(void *hwdev, u8 mod, u16 cmd, void *buf_in,
+ u16 in_size, u16 channel)
+{
+ if (!hwdev)
+ return -EINVAL;
+
+ if (hinic3_get_chip_present_flag(hwdev) == 0)
+ return -EPERM;
+
+ return hinic3_send_mbox_to_mgmt_no_ack(hwdev, mod, cmd, buf_in,
+ in_size, channel);
+}
+EXPORT_SYMBOL(hinic3_msg_to_mgmt_no_ack);
+
+int hinic3_msg_to_mgmt_async(void *hwdev, u8 mod, u16 cmd, void *buf_in,
+ u16 in_size, u16 channel)
+{
+ return hinic3_msg_to_mgmt_api_chain_async(hwdev, mod, cmd, buf_in,
+ in_size);
+}
+EXPORT_SYMBOL(hinic3_msg_to_mgmt_async);
+
+int hinic3_msg_to_mgmt_api_chain_sync(void *hwdev, u8 mod, u16 cmd,
+ void *buf_in, u16 in_size, void *buf_out,
+ u16 *out_size, u32 timeout)
+{
+ if (!hwdev)
+ return -EINVAL;
+
+ if (hinic3_get_chip_present_flag(hwdev) == 0)
+ return -EPERM;
+
+ if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev)) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "PF don't support api chain\n");
+ return -EPERM;
+ }
+
+ return hinic3_pf_msg_to_mgmt_sync(hwdev, mod, cmd, buf_in, in_size,
+ buf_out, out_size, timeout);
+}
+
+int hinic3_msg_to_mgmt_api_chain_async(void *hwdev, u8 mod, u16 cmd,
+ const void *buf_in, u16 in_size)
+{
+ int err;
+
+ if (!hwdev)
+ return -EINVAL;
+
+ if (hinic3_func_type(hwdev) == TYPE_VF) {
+ err = -EFAULT;
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "VF don't support async cmd\n");
+ } else if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev)) {
+ err = -EPERM;
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "PF don't support api chain\n");
+ } else {
+ err = hinic3_pf_to_mgmt_async(hwdev, mod, cmd, buf_in, in_size);
+ }
+
+ return err;
+}
+EXPORT_SYMBOL(hinic3_msg_to_mgmt_api_chain_async);
+
+static void send_mgmt_ack(struct hinic3_msg_pf_to_mgmt *pf_to_mgmt,
+ u8 mod, u16 cmd, void *buf_in, u16 in_size,
+ u16 msg_id)
+{
+ u16 buf_size;
+
+ if (!in_size)
+ buf_size = BUF_OUT_DEFAULT_SIZE;
+ else
+ buf_size = in_size;
+
+ hinic3_response_mbox_to_mgmt(pf_to_mgmt->hwdev, mod, cmd, buf_in,
+ buf_size, msg_id);
+}
+
+static void mgmt_recv_msg_handler(struct hinic3_msg_pf_to_mgmt *pf_to_mgmt,
+ u8 mod, u16 cmd, void *buf_in, u16 in_size,
+ u16 msg_id, int need_resp)
+{
+ void *dev = pf_to_mgmt->hwdev->dev_hdl;
+ void *buf_out = pf_to_mgmt->mgmt_ack_buf;
+ enum hinic3_mod_type tmp_mod = mod;
+ bool ack_first = false;
+ u16 out_size = 0;
+
+ memset(buf_out, 0, MAX_PF_MGMT_BUF_SIZE);
+
+ if (mod >= HINIC3_MOD_HW_MAX) {
+ sdk_warn(dev, "Receive illegal message from mgmt cpu, mod = %d\n",
+ mod);
+ goto unsupported;
+ }
+
+ set_bit(HINIC3_MGMT_MSG_CB_RUNNING,
+ &pf_to_mgmt->mgmt_msg_cb_state[tmp_mod]);
+
+ if (!pf_to_mgmt->recv_mgmt_msg_cb[mod] ||
+ !test_bit(HINIC3_MGMT_MSG_CB_REG,
+ &pf_to_mgmt->mgmt_msg_cb_state[tmp_mod])) {
+		sdk_warn(dev, "Receive mgmt callback is null, mod = %u, cmd = %u\n", mod, cmd);
+ clear_bit(HINIC3_MGMT_MSG_CB_RUNNING,
+ &pf_to_mgmt->mgmt_msg_cb_state[tmp_mod]);
+ goto unsupported;
+ }
+
+ pf_to_mgmt->recv_mgmt_msg_cb[tmp_mod](pf_to_mgmt->recv_mgmt_msg_data[tmp_mod],
+ cmd, buf_in, in_size,
+ buf_out, &out_size);
+
+ clear_bit(HINIC3_MGMT_MSG_CB_RUNNING,
+ &pf_to_mgmt->mgmt_msg_cb_state[tmp_mod]);
+
+ goto resp;
+
+unsupported:
+ out_size = sizeof(struct mgmt_msg_head);
+ ((struct mgmt_msg_head *)buf_out)->status = HINIC3_MGMT_CMD_UNSUPPORTED;
+
+resp:
+ if (!ack_first && need_resp)
+ send_mgmt_ack(pf_to_mgmt, mod, cmd, buf_out, out_size, msg_id);
+}
+
+/**
+ * mgmt_resp_msg_handler - handler for response message from mgmt cpu
+ * @pf_to_mgmt: PF to MGMT channel
+ * @recv_msg: received message details
+ **/
+static void mgmt_resp_msg_handler(struct hinic3_msg_pf_to_mgmt *pf_to_mgmt,
+ struct hinic3_recv_msg *recv_msg)
+{
+ void *dev = pf_to_mgmt->hwdev->dev_hdl;
+
+	/* responses to async messages have no waiter; ignore them */
+ if (recv_msg->msg_id & ASYNC_MSG_FLAG)
+ return;
+
+ spin_lock(&pf_to_mgmt->sync_event_lock);
+ if (recv_msg->msg_id == pf_to_mgmt->sync_msg_id &&
+ pf_to_mgmt->event_flag == SEND_EVENT_START) {
+ pf_to_mgmt->event_flag = SEND_EVENT_SUCCESS;
+ complete(&recv_msg->recv_done);
+ } else if (recv_msg->msg_id != pf_to_mgmt->sync_msg_id) {
+		sdk_err(dev, "Send msg id(0x%x) recv msg id(0x%x) mismatch, event state=%d\n",
+ pf_to_mgmt->sync_msg_id, recv_msg->msg_id,
+ pf_to_mgmt->event_flag);
+ } else {
+ sdk_err(dev, "Wait timeout, send msg id(0x%x) recv msg id(0x%x), event state=%d!\n",
+ pf_to_mgmt->sync_msg_id, recv_msg->msg_id,
+ pf_to_mgmt->event_flag);
+ }
+ spin_unlock(&pf_to_mgmt->sync_event_lock);
+}
+
+static void recv_mgmt_msg_work_handler(struct work_struct *work)
+{
+ struct hinic3_mgmt_msg_handle_work *mgmt_work =
+ container_of(work, struct hinic3_mgmt_msg_handle_work, work);
+
+ mgmt_recv_msg_handler(mgmt_work->pf_to_mgmt, mgmt_work->mod,
+ mgmt_work->cmd, mgmt_work->msg,
+ mgmt_work->msg_len, mgmt_work->msg_id,
+ !mgmt_work->async_mgmt_to_pf);
+
+ destroy_work(&mgmt_work->work);
+
+ kfree(mgmt_work->msg);
+ kfree(mgmt_work);
+}
+
+static bool check_mgmt_head_info(struct hinic3_recv_msg *recv_msg,
+ u8 seq_id, u8 seg_len, u16 msg_id)
+{
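+	/*
+	 * Messages arrive in segments: seq_id 0 starts a new message; each
+	 * later segment must be consecutive and carry the same msg_id.
+	 */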
+ if (seq_id > MGMT_MSG_MAX_SEQ_ID || seg_len > SEGMENT_LEN ||
+ (seq_id == MGMT_MSG_MAX_SEQ_ID && seg_len > MGMT_MSG_LAST_SEG_MAX_LEN))
+ return false;
+
+ if (seq_id == 0) {
+ recv_msg->seq_id = seq_id;
+ recv_msg->msg_id = msg_id;
+ } else {
+ if (seq_id != recv_msg->seq_id + 1 || msg_id != recv_msg->msg_id)
+ return false;
+
+ recv_msg->seq_id = seq_id;
+ }
+
+ return true;
+}
+
+static void init_mgmt_msg_work(struct hinic3_msg_pf_to_mgmt *pf_to_mgmt,
+ struct hinic3_recv_msg *recv_msg)
+{
+ struct hinic3_mgmt_msg_handle_work *mgmt_work = NULL;
+ struct hinic3_hwdev *hwdev = pf_to_mgmt->hwdev;
+
+ mgmt_work = kzalloc(sizeof(*mgmt_work), GFP_KERNEL);
+ if (!mgmt_work) {
+ sdk_err(hwdev->dev_hdl, "Allocate mgmt work memory failed\n");
+ return;
+ }
+
+ if (recv_msg->msg_len) {
+ mgmt_work->msg = kzalloc(recv_msg->msg_len, GFP_KERNEL);
+ if (!mgmt_work->msg) {
+ sdk_err(hwdev->dev_hdl, "Allocate mgmt msg memory failed\n");
+ kfree(mgmt_work);
+ return;
+ }
+ }
+
+ mgmt_work->pf_to_mgmt = pf_to_mgmt;
+ mgmt_work->msg_len = recv_msg->msg_len;
+ memcpy(mgmt_work->msg, recv_msg->msg, recv_msg->msg_len);
+ mgmt_work->msg_id = recv_msg->msg_id;
+ mgmt_work->mod = recv_msg->mod;
+ mgmt_work->cmd = recv_msg->cmd;
+ mgmt_work->async_mgmt_to_pf = recv_msg->async_mgmt_to_pf;
+
+ INIT_WORK(&mgmt_work->work, recv_mgmt_msg_work_handler);
+ queue_work_on(hisdk3_get_work_cpu_affinity(hwdev, WORK_TYPE_MGMT_MSG),
+ pf_to_mgmt->workq, &mgmt_work->work);
+}
+
+/**
+ * recv_mgmt_msg_handler - handle a message from mgmt cpu
+ * @pf_to_mgmt: PF to MGMT channel
+ * @header: the header of the message
+ * @recv_msg: received message details
+ **/
+static void recv_mgmt_msg_handler(struct hinic3_msg_pf_to_mgmt *pf_to_mgmt,
+ u8 *header, struct hinic3_recv_msg *recv_msg)
+{
+ struct hinic3_hwdev *hwdev = pf_to_mgmt->hwdev;
+ u64 mbox_header = *((u64 *)header);
+ void *msg_body = header + sizeof(mbox_header);
+ u8 seq_id, seq_len;
+ u16 msg_id;
+ u32 offset;
+ u64 dir;
+
+ /* Don't need to get anything from hw when cmd is async */
+ dir = HINIC3_MSG_HEADER_GET(mbox_header, DIRECTION);
+ if (dir == HINIC3_MSG_RESPONSE &&
+ (HINIC3_MSG_HEADER_GET(mbox_header, MSG_ID) & ASYNC_MSG_FLAG))
+ return;
+
+ seq_len = HINIC3_MSG_HEADER_GET(mbox_header, SEG_LEN);
+ seq_id = HINIC3_MSG_HEADER_GET(mbox_header, SEQID);
+ msg_id = HINIC3_MSG_HEADER_GET(mbox_header, MSG_ID);
+ if (!check_mgmt_head_info(recv_msg, seq_id, seq_len, msg_id)) {
+ sdk_err(hwdev->dev_hdl, "Mgmt msg sequence id and segment length check failed\n");
+ sdk_err(hwdev->dev_hdl,
+			"Front seq_id: 0x%x, current seq_id: 0x%x, seg len: 0x%x, front msg_id: %d, cur: %d\n",
+ recv_msg->seq_id, seq_id, seq_len, recv_msg->msg_id, msg_id);
+ /* set seq_id to invalid seq_id */
+ recv_msg->seq_id = MGMT_MSG_MAX_SEQ_ID;
+ return;
+ }
+
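+	/* Copy this segment into the reassembly buffer at its fixed offset */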
+ offset = seq_id * SEGMENT_LEN;
+ memcpy((u8 *)recv_msg->msg + offset, msg_body, seq_len);
+
+ if (!HINIC3_MSG_HEADER_GET(mbox_header, LAST))
+ return;
+
+ recv_msg->cmd = HINIC3_MSG_HEADER_GET(mbox_header, CMD);
+ recv_msg->mod = HINIC3_MSG_HEADER_GET(mbox_header, MODULE);
+ recv_msg->async_mgmt_to_pf = HINIC3_MSG_HEADER_GET(mbox_header,
+ NO_ACK);
+ recv_msg->msg_len = HINIC3_MSG_HEADER_GET(mbox_header, MSG_LEN);
+ recv_msg->msg_id = msg_id;
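+	/* Reset seq_id so the next inbound message starts a fresh reassembly */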
+ recv_msg->seq_id = MGMT_MSG_MAX_SEQ_ID;
+
+	if (dir == HINIC3_MSG_RESPONSE) {
+ mgmt_resp_msg_handler(pf_to_mgmt, recv_msg);
+ return;
+ }
+
+ init_mgmt_msg_work(pf_to_mgmt, recv_msg);
+}
+
+/**
+ * hinic3_mgmt_msg_aeqe_handler - handler for a mgmt message event
+ * @hwdev: the pointer to hw device
+ * @header: the header of the message
+ * @size: size of the event data, forwarded to the mbox handler
+ **/
+void hinic3_mgmt_msg_aeqe_handler(void *hwdev, u8 *header, u8 size)
+{
+ struct hinic3_hwdev *dev = (struct hinic3_hwdev *)hwdev;
+ struct hinic3_msg_pf_to_mgmt *pf_to_mgmt = NULL;
+ struct hinic3_recv_msg *recv_msg = NULL;
+ bool is_send_dir = false;
+
+ if ((HINIC3_MSG_HEADER_GET(*(u64 *)header, SOURCE) ==
+ HINIC3_MSG_FROM_MBOX)) {
+ hinic3_mbox_func_aeqe_handler(hwdev, header, size);
+ return;
+ }
+
+ pf_to_mgmt = dev->pf_to_mgmt;
+ if (!pf_to_mgmt)
+ return;
+
+	is_send_dir = (HINIC3_MSG_HEADER_GET(*(u64 *)header, DIRECTION) ==
+		       HINIC3_MSG_DIRECT_SEND);
+
+ recv_msg = is_send_dir ? &pf_to_mgmt->recv_msg_from_mgmt :
+ &pf_to_mgmt->recv_resp_msg_from_mgmt;
+
+ recv_mgmt_msg_handler(pf_to_mgmt, header, recv_msg);
+}
+
+/**
+ * alloc_recv_msg - allocate received message memory
+ * @recv_msg: pointer that will hold the allocated data
+ * Return: 0 - success, negative - failure
+ **/
+static int alloc_recv_msg(struct hinic3_recv_msg *recv_msg)
+{
+ recv_msg->seq_id = MGMT_MSG_MAX_SEQ_ID;
+
+ recv_msg->msg = kzalloc(MAX_PF_MGMT_BUF_SIZE, GFP_KERNEL);
+ if (!recv_msg->msg)
+ return -ENOMEM;
+
+ return 0;
+}
+
+/**
+ * free_recv_msg - free received message memory
+ * @recv_msg: pointer that holds the allocated data
+ **/
+static void free_recv_msg(struct hinic3_recv_msg *recv_msg)
+{
+ kfree(recv_msg->msg);
+}
+
+/**
+ * alloc_msg_buf - allocate all the message buffers of PF to MGMT channel
+ * @pf_to_mgmt: PF to MGMT channel
+ * Return: 0 - success, negative - failure
+ **/
+static int alloc_msg_buf(struct hinic3_msg_pf_to_mgmt *pf_to_mgmt)
+{
+ int err;
+ void *dev = pf_to_mgmt->hwdev->dev_hdl;
+
+ err = alloc_recv_msg(&pf_to_mgmt->recv_msg_from_mgmt);
+ if (err) {
+ sdk_err(dev, "Failed to allocate recv msg\n");
+ return err;
+ }
+
+ err = alloc_recv_msg(&pf_to_mgmt->recv_resp_msg_from_mgmt);
+ if (err) {
+ sdk_err(dev, "Failed to allocate resp recv msg\n");
+ goto alloc_msg_for_resp_err;
+ }
+
+ pf_to_mgmt->async_msg_buf = kzalloc(MAX_PF_MGMT_BUF_SIZE, GFP_KERNEL);
+ if (!pf_to_mgmt->async_msg_buf) {
+ err = -ENOMEM;
+ goto async_msg_buf_err;
+ }
+
+ pf_to_mgmt->sync_msg_buf = kzalloc(MAX_PF_MGMT_BUF_SIZE, GFP_KERNEL);
+ if (!pf_to_mgmt->sync_msg_buf) {
+ err = -ENOMEM;
+ goto sync_msg_buf_err;
+ }
+
+ pf_to_mgmt->mgmt_ack_buf = kzalloc(MAX_PF_MGMT_BUF_SIZE, GFP_KERNEL);
+ if (!pf_to_mgmt->mgmt_ack_buf) {
+ err = -ENOMEM;
+ goto ack_msg_buf_err;
+ }
+
+ return 0;
+
+ack_msg_buf_err:
+ kfree(pf_to_mgmt->sync_msg_buf);
+
+sync_msg_buf_err:
+ kfree(pf_to_mgmt->async_msg_buf);
+
+async_msg_buf_err:
+ free_recv_msg(&pf_to_mgmt->recv_resp_msg_from_mgmt);
+
+alloc_msg_for_resp_err:
+ free_recv_msg(&pf_to_mgmt->recv_msg_from_mgmt);
+ return err;
+}
+
+/**
+ * free_msg_buf - free all the message buffers of PF to MGMT channel
+ * @pf_to_mgmt: PF to MGMT channel
+ **/
+static void free_msg_buf(struct hinic3_msg_pf_to_mgmt *pf_to_mgmt)
+{
+ kfree(pf_to_mgmt->mgmt_ack_buf);
+ kfree(pf_to_mgmt->sync_msg_buf);
+ kfree(pf_to_mgmt->async_msg_buf);
+
+ free_recv_msg(&pf_to_mgmt->recv_resp_msg_from_mgmt);
+ free_recv_msg(&pf_to_mgmt->recv_msg_from_mgmt);
+}
+
+/**
+ * hinic3_pf_to_mgmt_init - initialize PF to MGMT channel
+ * @hwdev: the pointer to hw device
+ * Return: 0 - success, negative - failure
+ **/
+int hinic3_pf_to_mgmt_init(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_msg_pf_to_mgmt *pf_to_mgmt;
+ void *dev = hwdev->dev_hdl;
+ int err;
+
+ pf_to_mgmt = kzalloc(sizeof(*pf_to_mgmt), GFP_KERNEL);
+ if (!pf_to_mgmt)
+ return -ENOMEM;
+
+ hwdev->pf_to_mgmt = pf_to_mgmt;
+ pf_to_mgmt->hwdev = hwdev;
+ spin_lock_init(&pf_to_mgmt->async_msg_lock);
+ spin_lock_init(&pf_to_mgmt->sync_event_lock);
+ sema_init(&pf_to_mgmt->sync_msg_lock, 1);
+ pf_to_mgmt->workq = create_singlethread_workqueue(HINIC3_MGMT_WQ_NAME);
+ if (!pf_to_mgmt->workq) {
+ sdk_err(dev, "Failed to initialize MGMT workqueue\n");
+ err = -ENOMEM;
+ goto create_mgmt_workq_err;
+ }
+
+ err = alloc_msg_buf(pf_to_mgmt);
+ if (err) {
+ sdk_err(dev, "Failed to allocate msg buffers\n");
+ goto alloc_msg_buf_err;
+ }
+
+ err = hinic3_api_cmd_init(hwdev, pf_to_mgmt->cmd_chain);
+ if (err) {
+ sdk_err(dev, "Failed to init the api cmd chains\n");
+ goto api_cmd_init_err;
+ }
+
+ return 0;
+
+api_cmd_init_err:
+ free_msg_buf(pf_to_mgmt);
+
+alloc_msg_buf_err:
+ destroy_workqueue(pf_to_mgmt->workq);
+
+create_mgmt_workq_err:
+ spin_lock_deinit(&pf_to_mgmt->sync_event_lock);
+ spin_lock_deinit(&pf_to_mgmt->async_msg_lock);
+ sema_deinit(&pf_to_mgmt->sync_msg_lock);
+ kfree(pf_to_mgmt);
+
+ return err;
+}
+
+/**
+ * hinic3_pf_to_mgmt_free - free PF to MGMT channel
+ * @hwdev: the pointer to hw device
+ **/
+void hinic3_pf_to_mgmt_free(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_msg_pf_to_mgmt *pf_to_mgmt = hwdev->pf_to_mgmt;
+
+ /* destroy workqueue before free related pf_to_mgmt resources in case of
+ * illegal resource access
+ */
+ destroy_workqueue(pf_to_mgmt->workq);
+ hinic3_api_cmd_free(hwdev, pf_to_mgmt->cmd_chain);
+
+ free_msg_buf(pf_to_mgmt);
+ spin_lock_deinit(&pf_to_mgmt->sync_event_lock);
+ spin_lock_deinit(&pf_to_mgmt->async_msg_lock);
+ sema_deinit(&pf_to_mgmt->sync_msg_lock);
+ kfree(pf_to_mgmt);
+}
+
+void hinic3_flush_mgmt_workq(void *hwdev)
+{
+ struct hinic3_hwdev *dev = (struct hinic3_hwdev *)hwdev;
+
+ flush_workqueue(dev->aeqs->workq);
+
+ if (hinic3_func_type(dev) != TYPE_VF)
+ flush_workqueue(dev->pf_to_mgmt->workq);
+}
+
+int hinic3_api_cmd_read_ack(void *hwdev, u8 dest, const void *cmd,
+ u16 size, void *ack, u16 ack_size)
+{
+ struct hinic3_msg_pf_to_mgmt *pf_to_mgmt = NULL;
+ struct hinic3_api_cmd_chain *chain = NULL;
+
+ if (!hwdev || !cmd || (ack_size && !ack) || size > MAX_PF_MGMT_BUF_SIZE)
+ return -EINVAL;
+
+ if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ pf_to_mgmt = ((struct hinic3_hwdev *)hwdev)->pf_to_mgmt;
+ chain = pf_to_mgmt->cmd_chain[HINIC3_API_CMD_POLL_READ];
+
+ if (!(((struct hinic3_hwdev *)hwdev)->chip_present_flag))
+ return -EPERM;
+
+ return hinic3_api_cmd_read(chain, dest, cmd, size, ack, ack_size);
+}
+
+/**
+ * API command write/read bypass uses polling by default; to use the AEQ
+ * interrupt instead, set wb_trigger_aeqe to 1.
+ **/
+int hinic3_api_cmd_write_nack(void *hwdev, u8 dest, const void *cmd, u16 size)
+{
+ struct hinic3_msg_pf_to_mgmt *pf_to_mgmt = NULL;
+ struct hinic3_api_cmd_chain *chain = NULL;
+
+ if (!hwdev || !size || !cmd || size > MAX_PF_MGMT_BUF_SIZE)
+ return -EINVAL;
+
+ if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ pf_to_mgmt = ((struct hinic3_hwdev *)hwdev)->pf_to_mgmt;
+ chain = pf_to_mgmt->cmd_chain[HINIC3_API_CMD_POLL_WRITE];
+
+ if (!(((struct hinic3_hwdev *)hwdev)->chip_present_flag))
+ return -EPERM;
+
+ return hinic3_api_cmd_write(chain, dest, cmd, size);
+}
+
+static int get_clp_reg(void *hwdev, enum clp_data_type data_type,
+ enum clp_reg_type reg_type, u32 *reg_addr)
+{
+ switch (reg_type) {
+ case HINIC3_CLP_BA_HOST:
+ *reg_addr = (data_type == HINIC3_CLP_REQ_HOST) ?
+ HINIC3_CLP_REG(REQBASE) :
+ HINIC3_CLP_REG(RSPBASE);
+ break;
+
+ case HINIC3_CLP_SIZE_HOST:
+ *reg_addr = HINIC3_CLP_REG(SIZE);
+ break;
+
+ case HINIC3_CLP_LEN_HOST:
+ *reg_addr = (data_type == HINIC3_CLP_REQ_HOST) ?
+ HINIC3_CLP_REG(REQ) : HINIC3_CLP_REG(RSP);
+ break;
+
+ case HINIC3_CLP_START_REQ_HOST:
+ *reg_addr = HINIC3_CLP_REG(REQ);
+ break;
+
+ case HINIC3_CLP_READY_RSP_HOST:
+ *reg_addr = HINIC3_CLP_REG(RSP);
+ break;
+
+ default:
+ *reg_addr = 0;
+ break;
+ }
+ if (*reg_addr == 0)
+ return -EINVAL;
+
+ return 0;
+}
+
+static inline int clp_param_valid(struct hinic3_hwdev *hwdev,
+ enum clp_data_type data_type,
+ enum clp_reg_type reg_type)
+{
+ if (data_type == HINIC3_CLP_REQ_HOST &&
+ reg_type == HINIC3_CLP_READY_RSP_HOST)
+ return -EINVAL;
+
+ if (data_type == HINIC3_CLP_RSP_HOST &&
+ reg_type == HINIC3_CLP_START_REQ_HOST)
+ return -EINVAL;
+
+ return 0;
+}
+
+static u32 get_clp_reg_value(struct hinic3_hwdev *hwdev,
+ enum clp_data_type data_type,
+ enum clp_reg_type reg_type, u32 reg_addr)
+{
+ u32 value;
+
+ value = hinic3_hwif_read_reg(hwdev->hwif, reg_addr);
+
+ switch (reg_type) {
+ case HINIC3_CLP_BA_HOST:
+ value = ((value >> HINIC3_CLP_OFFSET(BASE)) &
+ HINIC3_CLP_MASK(BASE));
+ break;
+
+ case HINIC3_CLP_SIZE_HOST:
+ if (data_type == HINIC3_CLP_REQ_HOST)
+ value = ((value >> HINIC3_CLP_OFFSET(REQ_SIZE)) &
+ HINIC3_CLP_MASK(SIZE));
+ else
+ value = ((value >> HINIC3_CLP_OFFSET(RSP_SIZE)) &
+ HINIC3_CLP_MASK(SIZE));
+ break;
+
+ case HINIC3_CLP_LEN_HOST:
+ value = ((value >> HINIC3_CLP_OFFSET(LEN)) &
+ HINIC3_CLP_MASK(LEN));
+ break;
+
+ case HINIC3_CLP_START_REQ_HOST:
+ value = ((value >> HINIC3_CLP_OFFSET(START)) &
+ HINIC3_CLP_MASK(START));
+ break;
+
+ case HINIC3_CLP_READY_RSP_HOST:
+ value = ((value >> HINIC3_CLP_OFFSET(READY)) &
+ HINIC3_CLP_MASK(READY));
+ break;
+
+ default:
+ break;
+ }
+
+ return value;
+}
+
+static int hinic3_read_clp_reg(struct hinic3_hwdev *hwdev,
+ enum clp_data_type data_type,
+ enum clp_reg_type reg_type, u32 *read_value)
+{
+ u32 reg_addr;
+ int err;
+
+ err = clp_param_valid(hwdev, data_type, reg_type);
+ if (err)
+ return err;
+
+ err = get_clp_reg(hwdev, data_type, reg_type, ®_addr);
+ if (err)
+ return err;
+
+ *read_value = get_clp_reg_value(hwdev, data_type, reg_type, reg_addr);
+
+ return 0;
+}
+
+static int check_data_type(enum clp_data_type data_type,
+ enum clp_reg_type reg_type)
+{
+ if (data_type == HINIC3_CLP_REQ_HOST &&
+ reg_type == HINIC3_CLP_READY_RSP_HOST)
+ return -EINVAL;
+ if (data_type == HINIC3_CLP_RSP_HOST &&
+ reg_type == HINIC3_CLP_START_REQ_HOST)
+ return -EINVAL;
+
+ return 0;
+}
+
+static int check_reg_value(enum clp_reg_type reg_type, u32 value)
+{
+ if (reg_type == HINIC3_CLP_BA_HOST &&
+ value > HINIC3_CLP_SRAM_BASE_REG_MAX)
+ return -EINVAL;
+
+ if (reg_type == HINIC3_CLP_SIZE_HOST &&
+ value > HINIC3_CLP_SRAM_SIZE_REG_MAX)
+ return -EINVAL;
+
+ if (reg_type == HINIC3_CLP_LEN_HOST &&
+ value > HINIC3_CLP_LEN_REG_MAX)
+ return -EINVAL;
+
+ if ((reg_type == HINIC3_CLP_START_REQ_HOST ||
+ reg_type == HINIC3_CLP_READY_RSP_HOST) &&
+ value > HINIC3_CLP_START_OR_READY_REG_MAX)
+ return -EINVAL;
+
+ return 0;
+}
+
+static int hinic3_check_clp_init_status(struct hinic3_hwdev *hwdev)
+{
+ int err;
+ u32 reg_value = 0;
+
+ err = hinic3_read_clp_reg(hwdev, HINIC3_CLP_REQ_HOST,
+ HINIC3_CLP_BA_HOST, ®_value);
+ if (err || !reg_value) {
+ sdk_err(hwdev->dev_hdl, "Wrong req ba value: 0x%x\n",
+ reg_value);
+ return -EINVAL;
+ }
+
+ err = hinic3_read_clp_reg(hwdev, HINIC3_CLP_RSP_HOST,
+ HINIC3_CLP_BA_HOST, ®_value);
+ if (err || !reg_value) {
+ sdk_err(hwdev->dev_hdl, "Wrong rsp ba value: 0x%x\n",
+ reg_value);
+ return -EINVAL;
+ }
+
+ err = hinic3_read_clp_reg(hwdev, HINIC3_CLP_REQ_HOST,
+ HINIC3_CLP_SIZE_HOST, ®_value);
+ if (err || !reg_value) {
+ sdk_err(hwdev->dev_hdl, "Wrong req size\n");
+ return -EINVAL;
+ }
+
+ err = hinic3_read_clp_reg(hwdev, HINIC3_CLP_RSP_HOST,
+ HINIC3_CLP_SIZE_HOST, ®_value);
+ if (err || !reg_value) {
+ sdk_err(hwdev->dev_hdl, "Wrong rsp size\n");
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static void hinic3_write_clp_reg(struct hinic3_hwdev *hwdev,
+ enum clp_data_type data_type,
+ enum clp_reg_type reg_type, u32 value)
+{
+ u32 reg_addr, reg_value;
+
+ if (check_data_type(data_type, reg_type))
+ return;
+
+ if (check_reg_value(reg_type, value))
+ return;
+
+ if (get_clp_reg(hwdev, data_type, reg_type, ®_addr))
+ return;
+
+ reg_value = hinic3_hwif_read_reg(hwdev->hwif, reg_addr);
+
+ switch (reg_type) {
+ case HINIC3_CLP_LEN_HOST:
+ reg_value = reg_value &
+ (~(HINIC3_CLP_MASK(LEN) << HINIC3_CLP_OFFSET(LEN)));
+ reg_value = reg_value | (value << HINIC3_CLP_OFFSET(LEN));
+ break;
+
+ case HINIC3_CLP_START_REQ_HOST:
+ reg_value = reg_value &
+ (~(HINIC3_CLP_MASK(START) <<
+ HINIC3_CLP_OFFSET(START)));
+ reg_value = reg_value | (value << HINIC3_CLP_OFFSET(START));
+ break;
+
+ case HINIC3_CLP_READY_RSP_HOST:
+ reg_value = reg_value &
+ (~(HINIC3_CLP_MASK(READY) <<
+ HINIC3_CLP_OFFSET(READY)));
+ reg_value = reg_value | (value << HINIC3_CLP_OFFSET(READY));
+ break;
+
+ default:
+ return;
+ }
+
+ hinic3_hwif_write_reg(hwdev->hwif, reg_addr, reg_value);
+}
+
+static int hinic3_read_clp_data(struct hinic3_hwdev *hwdev,
+ void *buf_out, u16 *out_size)
+{
+ int err;
+ u32 reg = HINIC3_CLP_DATA(RSP);
+ u32 ready, delay_cnt;
+ u32 *ptr = (u32 *)buf_out;
+ u32 temp_out_size = 0;
+
+ err = hinic3_read_clp_reg(hwdev, HINIC3_CLP_RSP_HOST,
+ HINIC3_CLP_READY_RSP_HOST, &ready);
+ if (err)
+ return err;
+
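+	/* Poll until the firmware marks the response ready, bounded by HINIC3_CLP_DELAY_CNT_MAX */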
+ delay_cnt = 0;
+ while (ready == 0) {
+ usleep_range(9000, 10000); /* sleep 9000 us ~ 10000 us */
+ delay_cnt++;
+ err = hinic3_read_clp_reg(hwdev, HINIC3_CLP_RSP_HOST,
+ HINIC3_CLP_READY_RSP_HOST, &ready);
+ if (err || delay_cnt > HINIC3_CLP_DELAY_CNT_MAX) {
+ sdk_err(hwdev->dev_hdl, "Timeout with delay_cnt: %u\n",
+ delay_cnt);
+ return -EINVAL;
+ }
+ }
+
+ err = hinic3_read_clp_reg(hwdev, HINIC3_CLP_RSP_HOST,
+ HINIC3_CLP_LEN_HOST, &temp_out_size);
+ if (err)
+ return err;
+
+ if (temp_out_size > HINIC3_CLP_SRAM_SIZE_REG_MAX || !temp_out_size) {
+ sdk_err(hwdev->dev_hdl, "Invalid temp_out_size: %u\n",
+ temp_out_size);
+ return -EINVAL;
+ }
+
+ *out_size = (u16)temp_out_size;
+ for (; temp_out_size > 0; temp_out_size--) {
+ *ptr = hinic3_hwif_read_reg(hwdev->hwif, reg);
+ ptr++;
+ /* read 4 bytes every time */
+ reg = reg + 4;
+ }
+
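+	/* Clear the ready flag and length so the firmware can post the next response */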
+ hinic3_write_clp_reg(hwdev, HINIC3_CLP_RSP_HOST,
+ HINIC3_CLP_READY_RSP_HOST, (u32)0x0);
+ hinic3_write_clp_reg(hwdev, HINIC3_CLP_RSP_HOST, HINIC3_CLP_LEN_HOST,
+ (u32)0x0);
+
+ return 0;
+}
+
+static int hinic3_write_clp_data(struct hinic3_hwdev *hwdev,
+ void *buf_in, u16 in_size)
+{
+ int err;
+ u32 reg = HINIC3_CLP_DATA(REQ);
+ u32 start = 1;
+ u32 delay_cnt = 0;
+ u32 *ptr = (u32 *)buf_in;
+ u16 size_in = in_size;
+
+ err = hinic3_read_clp_reg(hwdev, HINIC3_CLP_REQ_HOST,
+ HINIC3_CLP_START_REQ_HOST, &start);
+ if (err != 0)
+ return err;
+
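+	/* Wait for the firmware to clear the start bit of the previous request */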
+ while (start == 1) {
+ usleep_range(9000, 10000); /* sleep 9000 us ~ 10000 us */
+ delay_cnt++;
+ err = hinic3_read_clp_reg(hwdev, HINIC3_CLP_REQ_HOST,
+ HINIC3_CLP_START_REQ_HOST, &start);
+ if (err || delay_cnt > HINIC3_CLP_DELAY_CNT_MAX)
+ return -EINVAL;
+ }
+
+ hinic3_write_clp_reg(hwdev, HINIC3_CLP_REQ_HOST,
+ HINIC3_CLP_LEN_HOST, size_in);
+ hinic3_write_clp_reg(hwdev, HINIC3_CLP_REQ_HOST,
+ HINIC3_CLP_START_REQ_HOST, (u32)0x1);
+
+ for (; size_in > 0; size_in--) {
+ hinic3_hwif_write_reg(hwdev->hwif, reg, *ptr);
+ ptr++;
+ reg = reg + sizeof(u32);
+ }
+
+ return 0;
+}
+
+static void hinic3_clear_clp_data(struct hinic3_hwdev *hwdev,
+ enum clp_data_type data_type)
+{
+ u32 reg = (data_type == HINIC3_CLP_REQ_HOST) ?
+ HINIC3_CLP_DATA(REQ) : HINIC3_CLP_DATA(RSP);
+ u32 count = HINIC3_CLP_INPUT_BUF_LEN_HOST / HINIC3_CLP_DATA_UNIT_HOST;
+
+ for (; count > 0; count--) {
+ hinic3_hwif_write_reg(hwdev->hwif, reg, 0x0);
+ reg = reg + sizeof(u32);
+ }
+}
+
+int hinic3_pf_clp_to_mgmt(void *hwdev, u8 mod, u16 cmd, const void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size)
+{
+ struct hinic3_clp_pf_to_mgmt *clp_pf_to_mgmt;
+ struct hinic3_hwdev *dev = hwdev;
+ u64 header;
+ u16 real_size;
+ u8 *clp_msg_buf;
+ int err;
+
+ if (!COMM_SUPPORT_CLP(dev))
+ return -EPERM;
+
+ clp_pf_to_mgmt = ((struct hinic3_hwdev *)hwdev)->clp_pf_to_mgmt;
+ if (!clp_pf_to_mgmt)
+ return -EPERM;
+
+ clp_msg_buf = clp_pf_to_mgmt->clp_msg_buf;
+
+	/* Convert header + payload to whole 4-byte CLP data units, rounding up */
+ if (in_size % HINIC3_CLP_DATA_UNIT_HOST)
+ real_size = (in_size + (u16)sizeof(header) +
+ HINIC3_CLP_DATA_UNIT_HOST);
+ else
+ real_size = in_size + (u16)sizeof(header);
+ real_size = real_size / HINIC3_CLP_DATA_UNIT_HOST;
+
+ if (real_size >
+ (HINIC3_CLP_INPUT_BUF_LEN_HOST / HINIC3_CLP_DATA_UNIT_HOST)) {
+ sdk_err(dev->dev_hdl, "Invalid real_size: %u\n", real_size);
+ return -EINVAL;
+ }
+ down(&clp_pf_to_mgmt->clp_msg_lock);
+
+ err = hinic3_check_clp_init_status(dev);
+ if (err) {
+ sdk_err(dev->dev_hdl, "Check clp init status failed\n");
+ up(&clp_pf_to_mgmt->clp_msg_lock);
+ return err;
+ }
+
+ hinic3_clear_clp_data(dev, HINIC3_CLP_RSP_HOST);
+ hinic3_write_clp_reg(dev, HINIC3_CLP_RSP_HOST,
+ HINIC3_CLP_READY_RSP_HOST, 0x0);
+
+ /* Send request */
+ memset(clp_msg_buf, 0x0, HINIC3_CLP_INPUT_BUF_LEN_HOST);
+ clp_prepare_header(dev, &header, in_size, mod, 0, 0, cmd, 0);
+
+ memcpy(clp_msg_buf, &header, sizeof(header));
+ clp_msg_buf += sizeof(header);
+ memcpy(clp_msg_buf, buf_in, in_size);
+
+ clp_msg_buf = clp_pf_to_mgmt->clp_msg_buf;
+
+ hinic3_clear_clp_data(dev, HINIC3_CLP_REQ_HOST);
+ err = hinic3_write_clp_data(hwdev,
+ clp_pf_to_mgmt->clp_msg_buf, real_size);
+ if (err) {
+ sdk_err(dev->dev_hdl, "Send clp request failed\n");
+ up(&clp_pf_to_mgmt->clp_msg_lock);
+ return -EINVAL;
+ }
+
+ /* Get response */
+ clp_msg_buf = clp_pf_to_mgmt->clp_msg_buf;
+ memset(clp_msg_buf, 0x0, HINIC3_CLP_INPUT_BUF_LEN_HOST);
+ err = hinic3_read_clp_data(hwdev, clp_msg_buf, &real_size);
+ hinic3_clear_clp_data(dev, HINIC3_CLP_RSP_HOST);
+ if (err) {
+ sdk_err(dev->dev_hdl, "Read clp response failed\n");
+ up(&clp_pf_to_mgmt->clp_msg_lock);
+ return -EINVAL;
+ }
+
+ real_size = (u16)((real_size * HINIC3_CLP_DATA_UNIT_HOST) & 0xffff);
+ if (real_size <= sizeof(header) || real_size > HINIC3_CLP_INPUT_BUF_LEN_HOST) {
+		sdk_err(dev->dev_hdl, "Invalid response size: %u\n", real_size);
+ up(&clp_pf_to_mgmt->clp_msg_lock);
+ return -EINVAL;
+ }
+ real_size = real_size - sizeof(header);
+ if (real_size != *out_size) {
+		sdk_err(dev->dev_hdl, "Invalid real_size: %u, out_size: %u\n",
+ real_size, *out_size);
+ up(&clp_pf_to_mgmt->clp_msg_lock);
+ return -EINVAL;
+ }
+
+ memcpy(buf_out, (clp_msg_buf + sizeof(header)), real_size);
+ up(&clp_pf_to_mgmt->clp_msg_lock);
+
+ return 0;
+}
+
+int hinic3_clp_to_mgmt(void *hwdev, u8 mod, u16 cmd, const void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size)
+{
+ struct hinic3_hwdev *dev = hwdev;
+ int err;
+
+ if (!dev)
+ return -EINVAL;
+
+ if (!dev->chip_present_flag)
+ return -EPERM;
+
+ if (hinic3_func_type(hwdev) == TYPE_VF)
+ return -EINVAL;
+
+ if (!COMM_SUPPORT_CLP(dev))
+ return -EPERM;
+
+ err = hinic3_pf_clp_to_mgmt(dev, mod, cmd, buf_in, in_size, buf_out,
+ out_size);
+
+ return err;
+}
+
+int hinic3_clp_pf_to_mgmt_init(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_clp_pf_to_mgmt *clp_pf_to_mgmt;
+
+ if (!COMM_SUPPORT_CLP(hwdev))
+ return 0;
+
+ clp_pf_to_mgmt = kzalloc(sizeof(*clp_pf_to_mgmt), GFP_KERNEL);
+ if (!clp_pf_to_mgmt)
+ return -ENOMEM;
+
+ clp_pf_to_mgmt->clp_msg_buf = kzalloc(HINIC3_CLP_INPUT_BUF_LEN_HOST,
+ GFP_KERNEL);
+ if (!clp_pf_to_mgmt->clp_msg_buf) {
+ kfree(clp_pf_to_mgmt);
+ return -ENOMEM;
+ }
+ sema_init(&clp_pf_to_mgmt->clp_msg_lock, 1);
+
+ hwdev->clp_pf_to_mgmt = clp_pf_to_mgmt;
+
+ return 0;
+}
+
+void hinic3_clp_pf_to_mgmt_free(struct hinic3_hwdev *hwdev)
+{
+ struct hinic3_clp_pf_to_mgmt *clp_pf_to_mgmt = hwdev->clp_pf_to_mgmt;
+
+ if (!COMM_SUPPORT_CLP(hwdev))
+ return;
+
+ sema_deinit(&clp_pf_to_mgmt->clp_msg_lock);
+ kfree(clp_pf_to_mgmt->clp_msg_buf);
+ kfree(clp_pf_to_mgmt);
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_mgmt.h b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_mgmt.h
new file mode 100644
index 000000000000..ad86a82e7040
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_mgmt.h
@@ -0,0 +1,179 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_MGMT_H
+#define HINIC3_MGMT_H
+
+#include <linux/types.h>
+#include <linux/completion.h>
+#include <linux/semaphore.h>
+#include <linux/spinlock.h>
+#include <linux/workqueue.h>
+
+#include "comm_defs.h"
+#include "hinic3_hw.h"
+#include "hinic3_api_cmd.h"
+#include "hinic3_hwdev.h"
+
+#define HINIC3_MGMT_WQ_NAME "hinic3_mgmt"
+
+#define HINIC3_CLP_REG_GAP 0x20
+#define HINIC3_CLP_INPUT_BUF_LEN_HOST 4096UL
+#define HINIC3_CLP_DATA_UNIT_HOST 4UL
+
+enum clp_data_type {
+ HINIC3_CLP_REQ_HOST = 0,
+ HINIC3_CLP_RSP_HOST = 1
+};
+
+enum clp_reg_type {
+ HINIC3_CLP_BA_HOST = 0,
+ HINIC3_CLP_SIZE_HOST = 1,
+ HINIC3_CLP_LEN_HOST = 2,
+ HINIC3_CLP_START_REQ_HOST = 3,
+ HINIC3_CLP_READY_RSP_HOST = 4
+};
+
+#define HINIC3_CLP_REQ_SIZE_OFFSET 0
+#define HINIC3_CLP_RSP_SIZE_OFFSET 16
+#define HINIC3_CLP_BASE_OFFSET 0
+#define HINIC3_CLP_LEN_OFFSET 0
+#define HINIC3_CLP_START_OFFSET 31
+#define HINIC3_CLP_READY_OFFSET 31
+#define HINIC3_CLP_OFFSET(member) (HINIC3_CLP_##member##_OFFSET)
+
+#define HINIC3_CLP_SIZE_MASK 0x7ffUL
+#define HINIC3_CLP_BASE_MASK 0x7ffffffUL
+#define HINIC3_CLP_LEN_MASK 0x7ffUL
+#define HINIC3_CLP_START_MASK 0x1UL
+#define HINIC3_CLP_READY_MASK 0x1UL
+#define HINIC3_CLP_MASK(member) (HINIC3_CLP_##member##_MASK)
+
+#define HINIC3_CLP_DELAY_CNT_MAX 200UL
+#define HINIC3_CLP_SRAM_SIZE_REG_MAX 0x3ff
+#define HINIC3_CLP_SRAM_BASE_REG_MAX 0x7ffffff
+#define HINIC3_CLP_LEN_REG_MAX 0x3ff
+#define HINIC3_CLP_START_OR_READY_REG_MAX 0x1
+
+struct hinic3_recv_msg {
+ void *msg;
+
+ u16 msg_len;
+ u16 rsvd1;
+ enum hinic3_mod_type mod;
+
+ u16 cmd;
+ u8 seq_id;
+ u8 rsvd2;
+ u16 msg_id;
+ u16 rsvd3;
+
+ int async_mgmt_to_pf;
+ u32 rsvd4;
+
+ struct completion recv_done;
+};
+
+struct hinic3_msg_head {
+ u8 status;
+ u8 version;
+ u8 resp_aeq_num;
+ u8 rsvd0[5];
+};
+
+enum comm_pf_to_mgmt_event_state {
+ SEND_EVENT_UNINIT = 0,
+ SEND_EVENT_START,
+ SEND_EVENT_SUCCESS,
+ SEND_EVENT_FAIL,
+ SEND_EVENT_TIMEOUT,
+ SEND_EVENT_END,
+};
+
+enum hinic3_mgmt_msg_cb_state {
+ HINIC3_MGMT_MSG_CB_REG = 0,
+ HINIC3_MGMT_MSG_CB_RUNNING,
+};
+
+struct hinic3_clp_pf_to_mgmt {
+ struct semaphore clp_msg_lock;
+ void *clp_msg_buf;
+};
+
+struct hinic3_msg_pf_to_mgmt {
+ struct hinic3_hwdev *hwdev;
+
+	/* Async cmds cannot be scheduled concurrently; this lock serializes them */
+ spinlock_t async_msg_lock;
+ struct semaphore sync_msg_lock;
+
+ struct workqueue_struct *workq;
+
+ void *async_msg_buf;
+ void *sync_msg_buf;
+ void *mgmt_ack_buf;
+
+ struct hinic3_recv_msg recv_msg_from_mgmt;
+ struct hinic3_recv_msg recv_resp_msg_from_mgmt;
+
+ u16 async_msg_id;
+ u16 sync_msg_id;
+ u32 rsvd1;
+ struct hinic3_api_cmd_chain *cmd_chain[HINIC3_API_CMD_MAX];
+
+ hinic3_mgmt_msg_cb recv_mgmt_msg_cb[HINIC3_MOD_HW_MAX];
+ void *recv_mgmt_msg_data[HINIC3_MOD_HW_MAX];
+ unsigned long mgmt_msg_cb_state[HINIC3_MOD_HW_MAX];
+
+ void *async_msg_cb_data[HINIC3_MOD_HW_MAX];
+
+ /* lock when sending msg */
+ spinlock_t sync_event_lock;
+ enum comm_pf_to_mgmt_event_state event_flag;
+ u64 rsvd2;
+};
+
+struct hinic3_mgmt_msg_handle_work {
+ struct work_struct work;
+ struct hinic3_msg_pf_to_mgmt *pf_to_mgmt;
+
+ void *msg;
+ u16 msg_len;
+ u16 rsvd1;
+
+ enum hinic3_mod_type mod;
+ u16 cmd;
+ u16 msg_id;
+
+ int async_mgmt_to_pf;
+};
+
+void hinic3_mgmt_msg_aeqe_handler(void *hwdev, u8 *header, u8 size);
+
+int hinic3_pf_to_mgmt_init(struct hinic3_hwdev *hwdev);
+
+void hinic3_pf_to_mgmt_free(struct hinic3_hwdev *hwdev);
+
+int hinic3_pf_to_mgmt_sync(void *hwdev, u8 mod, u16 cmd, void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size,
+ u32 timeout);
+int hinic3_pf_to_mgmt_async(void *hwdev, u8 mod, u16 cmd, const void *buf_in,
+ u16 in_size);
+
+int hinic3_pf_msg_to_mgmt_sync(void *hwdev, u8 mod, u16 cmd, void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size,
+ u32 timeout);
+
+int hinic3_api_cmd_read_ack(void *hwdev, u8 dest, const void *cmd, u16 size,
+ void *ack, u16 ack_size);
+
+int hinic3_api_cmd_write_nack(void *hwdev, u8 dest, const void *cmd, u16 size);
+
+int hinic3_pf_clp_to_mgmt(void *hwdev, u8 mod, u16 cmd, const void *buf_in,
+ u16 in_size, void *buf_out, u16 *out_size);
+
+int hinic3_clp_pf_to_mgmt_init(struct hinic3_hwdev *hwdev);
+
+void hinic3_clp_pf_to_mgmt_free(struct hinic3_hwdev *hwdev);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_nictool.c b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_nictool.c
new file mode 100644
index 000000000000..bd39ee7d54a7
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_nictool.c
@@ -0,0 +1,974 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [COMM]" fmt
+
+#include <net/sock.h>
+#include <linux/cdev.h>
+#include <linux/device.h>
+#include <linux/interrupt.h>
+#include <linux/pci.h>
+
+#include "ossl_knl.h"
+#include "hinic3_mt.h"
+#include "hinic3_crm.h"
+#include "hinic3_hw.h"
+#include "hinic3_hw_cfg.h"
+#include "hinic3_hwdev.h"
+#include "hinic3_lld.h"
+#include "hinic3_hw_mt.h"
+#include "hinic3_nictool.h"
+
+static int g_nictool_ref_cnt;
+
+static dev_t g_dev_id = {0};
+/*lint -save -e104 -e808*/
+static struct class *g_nictool_class;
+/*lint -restore*/
+static struct cdev g_nictool_cdev;
+
+#define HINIC3_MAX_BUF_SIZE (2048 * 1024)
+
+void *g_card_node_array[MAX_CARD_NUM] = {0};
+void *g_card_vir_addr[MAX_CARD_NUM] = {0};
+u64 g_card_phy_addr[MAX_CARD_NUM] = {0};
+int card_id;
+
+#define HIADM3_DEV_PATH "/dev/hinic3_nictool_dev"
+#define HIADM3_DEV_CLASS "hinic3_nictool_class"
+#define HIADM3_DEV_NAME "hinic3_nictool_dev"
+
+typedef int (*hw_driv_module)(struct hinic3_lld_dev *lld_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size);
+struct hw_drv_module_handle {
+ enum driver_cmd_type driv_cmd_name;
+ hw_driv_module driv_func;
+};
+
+static int get_single_card_info(struct hinic3_lld_dev *lld_dev, const void *buf_in,
+ u32 in_size, void *buf_out, u32 *out_size)
+{
+ if (!buf_out || *out_size != sizeof(struct card_info)) {
+ pr_err("buf_out is NULL, or out_size != %lu\n", sizeof(struct card_info));
+ return -EINVAL;
+ }
+
+ hinic3_get_card_info(hinic3_get_sdk_hwdev_by_lld(lld_dev), buf_out);
+
+ return 0;
+}
+
+static int is_driver_in_vm(struct hinic3_lld_dev *lld_dev, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size)
+{
+ bool in_host = false;
+
+ if (!buf_out || (*out_size != sizeof(u8))) {
+ pr_err("buf_out is NULL, or out_size != %lu\n", sizeof(u8));
+ return -EINVAL;
+ }
+
+ in_host = hinic3_is_in_host();
+ if (in_host)
+ *((u8 *)buf_out) = 0;
+ else
+ *((u8 *)buf_out) = 1;
+
+ return 0;
+}
+
+static int get_all_chip_id_cmd(struct hinic3_lld_dev *lld_dev, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size)
+{
+ if (*out_size != sizeof(struct nic_card_id) || !buf_out) {
+ pr_err("Invalid parameter: out_buf_size %u, expect %lu\n",
+ *out_size, sizeof(struct nic_card_id));
+ return -EFAULT;
+ }
+
+ hinic3_get_all_chip_id(buf_out);
+
+ return 0;
+}
+
+static int get_card_usr_api_chain_mem(int card_idx)
+{
+ unsigned char *tmp = NULL;
+ int i;
+
+ card_id = card_idx;
+ if (!g_card_vir_addr[card_idx]) {
+ g_card_vir_addr[card_idx] =
+ (void *)__get_free_pages(GFP_KERNEL,
+ DBGTOOL_PAGE_ORDER);
+ if (!g_card_vir_addr[card_idx]) {
+			pr_err("Failed to alloc api chain memory for card %d\n", card_idx);
+ return -EFAULT;
+ }
+
+ memset(g_card_vir_addr[card_idx], 0,
+ PAGE_SIZE * (1 << DBGTOOL_PAGE_ORDER));
+
+ g_card_phy_addr[card_idx] =
+ virt_to_phys(g_card_vir_addr[card_idx]);
+ if (!g_card_phy_addr[card_idx]) {
+ pr_err("phy addr for card %d is 0\n", card_idx);
+ free_pages((unsigned long)g_card_vir_addr[card_idx], DBGTOOL_PAGE_ORDER);
+ g_card_vir_addr[card_idx] = NULL;
+ return -EFAULT;
+ }
+
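+		/* Reserve each page so the region can later be mmapped to userspace */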
+ tmp = g_card_vir_addr[card_idx];
+ for (i = 0; i < (1 << DBGTOOL_PAGE_ORDER); i++) {
+ SetPageReserved(virt_to_page(tmp));
+ tmp += PAGE_SIZE;
+ }
+ }
+
+ return 0;
+}
+
+static void chipif_get_all_pf_dev_info(struct pf_dev_info *dev_info, int card_idx,
+ void **g_func_handle_array)
+{
+ u32 func_idx;
+ void *hwdev = NULL;
+ struct pci_dev *pdev = NULL;
+
+ for (func_idx = 0; func_idx < PF_DEV_INFO_NUM; func_idx++) {
+ hwdev = (void *)g_func_handle_array[func_idx];
+
+ dev_info[func_idx].phy_addr = g_card_phy_addr[card_idx];
+
+ if (!hwdev) {
+ dev_info[func_idx].bar0_size = 0;
+ dev_info[func_idx].bus = 0;
+ dev_info[func_idx].slot = 0;
+ dev_info[func_idx].func = 0;
+ } else {
+ pdev = (struct pci_dev *)hinic3_get_pcidev_hdl(hwdev);
+ dev_info[func_idx].bar0_size =
+ pci_resource_len(pdev, 0);
+ dev_info[func_idx].bus = pdev->bus->number;
+ dev_info[func_idx].slot = PCI_SLOT(pdev->devfn);
+ dev_info[func_idx].func = PCI_FUNC(pdev->devfn);
+ }
+ }
+}
+
+static int get_pf_dev_info(struct hinic3_lld_dev *lld_dev, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size)
+{
+ struct pf_dev_info *dev_info = buf_out;
+ struct card_node *card_info = hinic3_get_chip_node_by_lld(lld_dev);
+ int id, err;
+
+ if (!buf_out || *out_size != sizeof(struct pf_dev_info) * PF_DEV_INFO_NUM) {
+		pr_err("Invalid parameter: out_buf_size %u, expect %lu\n",
+		       *out_size, sizeof(*dev_info) * PF_DEV_INFO_NUM);
+ return -EFAULT;
+ }
+
+ err = sscanf(card_info->chip_name, HINIC3_CHIP_NAME "%d", &id);
+ if (err < 0) {
+ pr_err("Failed to get card id\n");
+ return err;
+ }
+
+ if (id >= MAX_CARD_NUM || id < 0) {
+ pr_err("chip id %d exceed limit[0-%d]\n", id, MAX_CARD_NUM - 1);
+ return -EINVAL;
+ }
+
+ chipif_get_all_pf_dev_info(dev_info, id, card_info->func_handle_array);
+
+ err = get_card_usr_api_chain_mem(id);
+ if (err) {
+		pr_err("Failed to get api chain memory for userspace %s\n",
+ card_info->chip_name);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+static long dbgtool_knl_free_mem(int id)
+{
+ unsigned char *tmp = NULL;
+ int i;
+
+ if (!g_card_vir_addr[id])
+ return 0;
+
+ tmp = g_card_vir_addr[id];
+ for (i = 0; i < (1 << DBGTOOL_PAGE_ORDER); i++) {
+ ClearPageReserved(virt_to_page(tmp));
+ tmp += PAGE_SIZE;
+ }
+
+ free_pages((unsigned long)g_card_vir_addr[id], DBGTOOL_PAGE_ORDER);
+ g_card_vir_addr[id] = NULL;
+ g_card_phy_addr[id] = 0;
+
+ return 0;
+}
+
+static int free_knl_mem(struct hinic3_lld_dev *lld_dev, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size)
+{
+ struct card_node *card_info = hinic3_get_chip_node_by_lld(lld_dev);
+ int id, err;
+
+ err = sscanf(card_info->chip_name, HINIC3_CHIP_NAME "%d", &id);
+ if (err < 0) {
+ pr_err("Failed to get card id\n");
+ return err;
+ }
+
+ if (id >= MAX_CARD_NUM || id < 0) {
+ pr_err("chip id %d exceed limit[0-%d]\n", id, MAX_CARD_NUM - 1);
+ return -EINVAL;
+ }
+
+ dbgtool_knl_free_mem(id);
+
+ return 0;
+}
+
+static int card_info_param_valid(char *dev_name, const void *buf_out, u32 buf_out_size, int *id)
+{
+ int err;
+
+ if (!buf_out || buf_out_size != sizeof(struct hinic3_card_func_info)) {
+ pr_err("Invalid parameter: out_buf_size %u, expect %lu\n",
+ buf_out_size, sizeof(struct hinic3_card_func_info));
+ return -EINVAL;
+ }
+
+ err = memcmp(dev_name, HINIC3_CHIP_NAME, strlen(HINIC3_CHIP_NAME));
+ if (err) {
+ pr_err("Invalid chip name %s\n", dev_name);
+ return err;
+ }
+
+ err = sscanf(dev_name, HINIC3_CHIP_NAME "%d", id);
+ if (err < 0) {
+ pr_err("Failed to get card id\n");
+ return err;
+ }
+
+ if (*id >= MAX_CARD_NUM || *id < 0) {
+ pr_err("chip id %d exceed limit[0-%d]\n",
+ *id, MAX_CARD_NUM - 1);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int get_card_func_info(struct hinic3_lld_dev *lld_dev, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size)
+{
+ struct hinic3_card_func_info *card_func_info = buf_out;
+ struct card_node *card_info = hinic3_get_chip_node_by_lld(lld_dev);
+ int err, id = 0;
+
+ err = card_info_param_valid(card_info->chip_name, buf_out, *out_size, &id);
+ if (err)
+ return err;
+
+ hinic3_get_card_func_info_by_card_name(card_info->chip_name, card_func_info);
+
+ if (!card_func_info->num_pf) {
+		pr_err("No function found for %s\n", card_info->chip_name);
+ return -EFAULT;
+ }
+
+ err = get_card_usr_api_chain_mem(id);
+ if (err) {
+		pr_err("Failed to get api chain memory for userspace %s\n",
+ card_info->chip_name);
+ return -EFAULT;
+ }
+
+ card_func_info->usr_api_phy_addr = g_card_phy_addr[id];
+
+ return 0;
+}
+
+static int get_pf_cap_info(struct hinic3_lld_dev *lld_dev, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size)
+{
+ struct service_cap *func_cap = NULL;
+ struct hinic3_hwdev *hwdev = NULL;
+ struct card_node *card_info = hinic3_get_chip_node_by_lld(lld_dev);
+ struct svc_cap_info *svc_cap_info_in = (struct svc_cap_info *)buf_in;
+ struct svc_cap_info *svc_cap_info_out = (struct svc_cap_info *)buf_out;
+
+ if (*out_size != sizeof(struct svc_cap_info) || in_size != sizeof(struct svc_cap_info) ||
+ !buf_in || !buf_out) {
+ pr_err("Invalid parameter: out_buf_size %u, in_size: %u, expect %lu\n",
+ *out_size, in_size, sizeof(struct svc_cap_info));
+ return -EINVAL;
+ }
+
+ if (svc_cap_info_in->func_idx >= MAX_FUNCTION_NUM) {
+ pr_err("func_idx is illegal. func_idx: %u, max_num: %u\n",
+ svc_cap_info_in->func_idx, MAX_FUNCTION_NUM);
+ return -EINVAL;
+ }
+
+ lld_hold();
+ hwdev = (struct hinic3_hwdev *)(card_info->func_handle_array)[svc_cap_info_in->func_idx];
+ if (!hwdev) {
+ lld_put();
+ return -EINVAL;
+ }
+
+ func_cap = &hwdev->cfg_mgmt->svc_cap;
+ memcpy(&svc_cap_info_out->cap, func_cap, sizeof(struct service_cap));
+ lld_put();
+
+ return 0;
+}
+
+static int get_hw_drv_version(struct hinic3_lld_dev *lld_dev, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size)
+{
+ struct drv_version_info *ver_info = buf_out;
+ int err;
+
+ if (!buf_out) {
+ pr_err("Buf_out is NULL.\n");
+ return -EINVAL;
+ }
+
+ if (*out_size != sizeof(*ver_info)) {
+		pr_err("Unexpected out buf size from user: %u, expect: %lu\n",
+ *out_size, sizeof(*ver_info));
+ return -EINVAL;
+ }
+
+ err = snprintf(ver_info->ver, sizeof(ver_info->ver), "%s %s", HINIC3_DRV_VERSION,
+ "2023-05-17_19:56:38");
+ if (err < 0)
+ return -EINVAL;
+
+ return 0;
+}
+
+static int get_pf_id(struct hinic3_lld_dev *lld_dev, const void *buf_in, u32 in_size,
+ void *buf_out, u32 *out_size)
+{
+ struct hinic3_pf_info *pf_info = NULL;
+ struct card_node *chip_node = hinic3_get_chip_node_by_lld(lld_dev);
+ u32 port_id;
+ int err;
+
+ if (!chip_node)
+ return -ENODEV;
+
+ if (!buf_out || (*out_size != sizeof(*pf_info)) || !buf_in || in_size != sizeof(u32)) {
+		pr_err("Unexpected out buf size from user: %u, expect: %lu, in size: %u\n",
+ *out_size, sizeof(*pf_info), in_size);
+ return -EINVAL;
+ }
+
+ port_id = *((u32 *)buf_in);
+ pf_info = (struct hinic3_pf_info *)buf_out;
+ err = hinic3_get_pf_id(chip_node, port_id, &pf_info->pf_id, &pf_info->isvalid);
+ if (err)
+ return err;
+
+ *out_size = sizeof(*pf_info);
+
+ return 0;
+}
+
+struct hw_drv_module_handle hw_driv_module_cmd_handle[] = {
+ {FUNC_TYPE, get_func_type},
+ {GET_FUNC_IDX, get_func_id},
+ {GET_HW_STATS, (hw_driv_module)get_hw_driver_stats},
+ {CLEAR_HW_STATS, clear_hw_driver_stats},
+ {GET_SELF_TEST_RES, get_self_test_result},
+ {GET_CHIP_FAULT_STATS, (hw_driv_module)get_chip_faults_stats},
+ {GET_SINGLE_CARD_INFO, (hw_driv_module)get_single_card_info},
+ {IS_DRV_IN_VM, is_driver_in_vm},
+ {GET_CHIP_ID, get_all_chip_id_cmd},
+ {GET_PF_DEV_INFO, get_pf_dev_info},
+ {CMD_FREE_MEM, free_knl_mem},
+ {GET_CHIP_INFO, get_card_func_info},
+ {GET_FUNC_CAP, get_pf_cap_info},
+ {GET_DRV_VERSION, get_hw_drv_version},
+ {GET_PF_ID, get_pf_id},
+};
+
+static int alloc_tmp_buf(void *hwdev, struct msg_module *nt_msg, u32 in_size,
+ void **buf_in, u32 out_size, void **buf_out)
+{
+ int ret;
+
+ ret = alloc_buff_in(hwdev, nt_msg, in_size, buf_in);
+ if (ret) {
+ pr_err("Alloc tool cmd buff in failed\n");
+ return ret;
+ }
+
+ ret = alloc_buff_out(hwdev, nt_msg, out_size, buf_out);
+ if (ret) {
+ pr_err("Alloc tool cmd buff out failed\n");
+ goto out_free_buf_in;
+ }
+
+ return 0;
+
+out_free_buf_in:
+ free_buff_in(hwdev, nt_msg, *buf_in);
+
+ return ret;
+}
+
+static void free_tmp_buf(void *hwdev, struct msg_module *nt_msg,
+ void *buf_in, void *buf_out)
+{
+ free_buff_out(hwdev, nt_msg, buf_out);
+ free_buff_in(hwdev, nt_msg, buf_in);
+}
+
+static int send_to_hw_driver(struct hinic3_lld_dev *lld_dev, struct msg_module *nt_msg,
+ const void *buf_in, u32 in_size, void *buf_out, u32 *out_size)
+{
+ int index, num_cmds = sizeof(hw_driv_module_cmd_handle) /
+ sizeof(hw_driv_module_cmd_handle[0]);
+ enum driver_cmd_type cmd_type =
+ (enum driver_cmd_type)(nt_msg->msg_formate);
+ int err = 0;
+
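+	/* Linearly scan the dispatch table for the requested command handler */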
+ for (index = 0; index < num_cmds; index++) {
+ if (cmd_type ==
+ hw_driv_module_cmd_handle[index].driv_cmd_name) {
+ err = hw_driv_module_cmd_handle[index].driv_func
+ (lld_dev, buf_in, in_size, buf_out, out_size);
+ break;
+ }
+ }
+
+ if (index == num_cmds) {
+ pr_err("Can't find callback for %d\n", cmd_type);
+ return -EINVAL;
+ }
+
+ return err;
+}
+
+static int send_to_service_driver(struct hinic3_lld_dev *lld_dev, struct msg_module *nt_msg,
+ const void *buf_in, u32 in_size, void *buf_out, u32 *out_size)
+{
+ const char **service_name = NULL;
+ enum hinic3_service_type type;
+ void *uld_dev = NULL;
+ int ret = -EINVAL;
+
+ service_name = hinic3_get_uld_names();
+ type = nt_msg->module - SEND_TO_SRV_DRV_BASE;
+ if (type >= SERVICE_T_MAX) {
+		pr_err("Ioctl input module id: %u is incorrect\n", nt_msg->module);
+ return -EINVAL;
+ }
+
+ uld_dev = hinic3_get_uld_dev(lld_dev, type);
+ if (!uld_dev) {
+ if (nt_msg->msg_formate == GET_DRV_VERSION)
+ return 0;
+
+		pr_err("Can not get the uld dev: %s, %s driver may not be registered\n",
+ nt_msg->device_name, service_name[type]);
+ return -EINVAL;
+ }
+
+ if (g_uld_info[type].ioctl)
+ ret = g_uld_info[type].ioctl(uld_dev, nt_msg->msg_formate,
+ buf_in, in_size, buf_out, out_size);
+ uld_dev_put(lld_dev, type);
+
+ return ret;
+}
+
+static int nictool_exec_cmd(struct hinic3_lld_dev *lld_dev, struct msg_module *nt_msg,
+ void *buf_in, u32 in_size, void *buf_out, u32 *out_size)
+{
+ int ret = 0;
+
+ switch (nt_msg->module) {
+ case SEND_TO_HW_DRIVER:
+ ret = send_to_hw_driver(lld_dev, nt_msg, buf_in, in_size, buf_out, out_size);
+ break;
+ case SEND_TO_MPU:
+ ret = send_to_mpu(hinic3_get_sdk_hwdev_by_lld(lld_dev),
+ nt_msg, buf_in, in_size, buf_out, out_size);
+ break;
+ case SEND_TO_SM:
+ ret = send_to_sm(hinic3_get_sdk_hwdev_by_lld(lld_dev),
+ nt_msg, buf_in, in_size, buf_out, out_size);
+ break;
+ case SEND_TO_NPU:
+ ret = send_to_npu(hinic3_get_sdk_hwdev_by_lld(lld_dev),
+ nt_msg, buf_in, in_size, buf_out, out_size);
+ break;
+ default:
+ ret = send_to_service_driver(lld_dev, nt_msg, buf_in, in_size, buf_out, out_size);
+ break;
+ }
+
+ return ret;
+}
+
+static int cmd_parameter_valid(struct msg_module *nt_msg, unsigned long arg,
+ u32 *out_size_expect, u32 *in_size)
+{
+ if (copy_from_user(nt_msg, (void *)arg, sizeof(*nt_msg))) {
+ pr_err("Copy information from user failed\n");
+ return -EFAULT;
+ }
+
+ *out_size_expect = nt_msg->buf_out_size;
+ *in_size = nt_msg->buf_in_size;
+ if (*out_size_expect > HINIC3_MAX_BUF_SIZE ||
+ *in_size > HINIC3_MAX_BUF_SIZE) {
+ pr_err("Invalid in size: %u or out size: %u\n",
+ *in_size, *out_size_expect);
+ return -EFAULT;
+ }
+
+ nt_msg->device_name[IFNAMSIZ - 1] = '\0';
+
+ return 0;
+}
+
+static struct hinic3_lld_dev *get_lld_dev_by_nt_msg(struct msg_module *nt_msg)
+{
+ struct hinic3_lld_dev *lld_dev = NULL;
+
+ if (nt_msg->module >= SEND_TO_SRV_DRV_BASE && nt_msg->module < SEND_TO_DRIVER_MAX &&
+ nt_msg->module != SEND_TO_HW_DRIVER && nt_msg->msg_formate != GET_DRV_VERSION) {
+ lld_dev = hinic3_get_lld_dev_by_dev_name(nt_msg->device_name,
+ nt_msg->module - SEND_TO_SRV_DRV_BASE);
+ } else {
+ lld_dev = hinic3_get_lld_dev_by_chip_name(nt_msg->device_name);
+ if (!lld_dev)
+ lld_dev = hinic3_get_lld_dev_by_dev_name(nt_msg->device_name,
+ SERVICE_T_MAX);
+ }
+
+ if (nt_msg->module == SEND_TO_NIC_DRIVER && (nt_msg->msg_formate == GET_XSFP_INFO ||
+ nt_msg->msg_formate == GET_XSFP_PRESENT))
+ lld_dev = hinic3_get_lld_dev_by_chip_and_port(nt_msg->device_name,
+ nt_msg->port_id);
+
+ if (nt_msg->module == SEND_TO_CUSTOM_DRIVER &&
+ nt_msg->msg_formate == CMD_CUSTOM_BOND_GET_CHIP_NAME)
+ lld_dev = hinic3_get_lld_dev_by_dev_name(nt_msg->device_name, SERVICE_T_MAX);
+
+ return lld_dev;
+}
+
+static long hinicadm_k_unlocked_ioctl(struct file *pfile, unsigned long arg)
+{
+ struct hinic3_lld_dev *lld_dev = NULL;
+ struct msg_module nt_msg;
+ void *buf_out = NULL;
+ void *buf_in = NULL;
+ u32 out_size_expect = 0;
+ u32 out_size = 0;
+ u32 in_size = 0;
+ int ret = 0;
+
+ memset(&nt_msg, 0, sizeof(nt_msg));
+ if (cmd_parameter_valid(&nt_msg, arg, &out_size_expect, &in_size))
+ return -EFAULT;
+
+ lld_dev = get_lld_dev_by_nt_msg(&nt_msg);
+ if (!lld_dev) {
+ if (nt_msg.msg_formate != DEV_NAME_TEST)
+ pr_err("Can not find device %s for module %d\n",
+ nt_msg.device_name, nt_msg.module);
+
+ return -ENODEV;
+ }
+
+ if (nt_msg.msg_formate == DEV_NAME_TEST)
+ return 0;
+
+ ret = alloc_tmp_buf(hinic3_get_sdk_hwdev_by_lld(lld_dev), &nt_msg,
+ in_size, &buf_in, out_size_expect, &buf_out);
+ if (ret) {
+ pr_err("Alloc tmp buff failed\n");
+ goto out_free_lock;
+ }
+
+ out_size = out_size_expect;
+
+ ret = nictool_exec_cmd(lld_dev, &nt_msg, buf_in, in_size, buf_out, &out_size);
+ if (ret) {
+ pr_err("nictool_exec_cmd failed, module: %u, ret: %d.\n", nt_msg.module, ret);
+ goto out_free_buf;
+ }
+
+ if (out_size > out_size_expect) {
+ ret = -EFAULT;
+ pr_err("Out size is greater than expected out size from user: %u, out size: %u\n",
+ out_size_expect, out_size);
+ goto out_free_buf;
+ }
+
+ ret = copy_buf_out_to_user(&nt_msg, out_size, buf_out);
+ if (ret)
+ pr_err("Copy information to user failed\n");
+
+out_free_buf:
+ free_tmp_buf(hinic3_get_sdk_hwdev_by_lld(lld_dev), &nt_msg, buf_in, buf_out);
+
+out_free_lock:
+ lld_dev_put(lld_dev);
+ return (long)ret;
+}
+
+/**
+ * dbgtool_knl_ffm_info_rd - Read ffm information
+ * @para: the dbgtool parameter
+ * @dbgtool_info: the dbgtool info
+ **/
+static long dbgtool_knl_ffm_info_rd(struct dbgtool_param *para,
+ struct dbgtool_k_glb_info *dbgtool_info)
+{
+ /* Copy the ffm_info to user mode */
+ if (copy_to_user(para->param.ffm_rd, dbgtool_info->ffm,
+ (unsigned int)sizeof(struct ffm_record_info))) {
+ pr_err("Copy ffm_info to user fail\n");
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+static long dbgtool_k_unlocked_ioctl(struct file *pfile,
+ unsigned int real_cmd,
+ unsigned long arg)
+{
+ long ret;
+ struct dbgtool_param param;
+ struct dbgtool_k_glb_info *dbgtool_info = NULL;
+ struct card_node *card_info = NULL;
+ int i;
+
+ (void)memset(¶m, 0, sizeof(param));
+
+ if (copy_from_user(¶m, (void *)arg, sizeof(param))) {
+ pr_err("Copy param from user fail\n");
+ return -EFAULT;
+ }
+
+ lld_hold();
+ for (i = 0; i < MAX_CARD_NUM; i++) {
+ card_info = (struct card_node *)g_card_node_array[i];
+ if (!card_info)
+ continue;
+ if (!strncmp(param.chip_name, card_info->chip_name, IFNAMSIZ))
+ break;
+ }
+
+ if (i == MAX_CARD_NUM || !card_info) {
+ lld_put();
+ pr_err("Can't find this card %s\n", param.chip_name);
+ return -EFAULT;
+ }
+
+ card_id = i;
+ dbgtool_info = (struct dbgtool_k_glb_info *)card_info->dbgtool_info;
+
+ down(&dbgtool_info->dbgtool_sem);
+
+ switch (real_cmd) {
+ case DBGTOOL_CMD_FFM_RD:
+ ret = dbgtool_knl_ffm_info_rd(¶m, dbgtool_info);
+ break;
+ case DBGTOOL_CMD_MSG_2_UP:
+		pr_err("Not supposed to use this cmd(0x%x)\n", real_cmd);
+ ret = 0;
+ break;
+
+ default:
+ pr_err("Dbgtool cmd(0x%x) not support now\n", real_cmd);
+ ret = -EFAULT;
+ }
+
+ up(&dbgtool_info->dbgtool_sem);
+
+ lld_put();
+
+ return ret;
+}
+
+static int nictool_k_release(struct inode *pnode, struct file *pfile)
+{
+ return 0;
+}
+
+static int nictool_k_open(struct inode *pnode, struct file *pfile)
+{
+ return 0;
+}
+
+static ssize_t nictool_k_read(struct file *pfile, char __user *ubuf,
+ size_t size, loff_t *ppos)
+{
+ return 0;
+}
+
+static ssize_t nictool_k_write(struct file *pfile, const char __user *ubuf,
+ size_t size, loff_t *ppos)
+{
+ return 0;
+}
+
+static long nictool_k_unlocked_ioctl(struct file *pfile,
+ unsigned int cmd, unsigned long arg)
+{
+ unsigned int real_cmd;
+
+ real_cmd = _IOC_NR(cmd);
+
+ return (real_cmd == NICTOOL_CMD_TYPE) ?
+ hinicadm_k_unlocked_ioctl(pfile, arg) :
+ dbgtool_k_unlocked_ioctl(pfile, real_cmd, arg);
+}
+
+static int hinic3_mem_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+ unsigned long vmsize = vma->vm_end - vma->vm_start;
+ phys_addr_t offset = (phys_addr_t)vma->vm_pgoff << PAGE_SHIFT;
+ phys_addr_t phy_addr;
+
+ if (vmsize > (PAGE_SIZE * (1 << DBGTOOL_PAGE_ORDER))) {
+ pr_err("Map size = %lu is bigger than alloc\n", vmsize);
+ return -EAGAIN;
+ }
+
+	/* older versions of the tool set vma->vm_pgoff to 0 */
+ phy_addr = offset ? offset : g_card_phy_addr[card_id];
+
+ if (!phy_addr) {
+ pr_err("Card_id = %d physical address is 0\n", card_id);
+ return -EAGAIN;
+ }
+
+ /* Disable cache and write buffer in the mapping area */
+ vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+ if (remap_pfn_range(vma, vma->vm_start, (phy_addr >> PAGE_SHIFT),
+ vmsize, vma->vm_page_prot)) {
+ pr_err("Remap pfn range failed.\n");
+ return -EAGAIN;
+ }
+
+ return 0;
+}
+
+static const struct file_operations fifo_operations = {
+ .owner = THIS_MODULE,
+ .release = nictool_k_release,
+ .open = nictool_k_open,
+ .read = nictool_k_read,
+ .write = nictool_k_write,
+ .unlocked_ioctl = nictool_k_unlocked_ioctl,
+ .mmap = hinic3_mem_mmap,
+};
+
+static void free_dbgtool_info(void *hwdev, struct card_node *chip_info)
+{
+ struct dbgtool_k_glb_info *dbgtool_info = NULL;
+ int err, id;
+
+ if (hinic3_func_type(hwdev) != TYPE_VF)
+ chip_info->func_handle_array[hinic3_global_func_id(hwdev)] = NULL;
+
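+	/* Tear down shared per-card state only when the last function is removed */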
+ if (--chip_info->func_num)
+ return;
+
+ err = sscanf(chip_info->chip_name, HINIC3_CHIP_NAME "%d", &id);
+ if (err < 0)
+ pr_err("Failed to get card id\n");
+
+ if (id < MAX_CARD_NUM)
+ g_card_node_array[id] = NULL;
+
+ dbgtool_info = chip_info->dbgtool_info;
+ /* FFM deinit */
+ kfree(dbgtool_info->ffm);
+ dbgtool_info->ffm = NULL;
+
+ kfree(dbgtool_info);
+ chip_info->dbgtool_info = NULL;
+
+ if (id < MAX_CARD_NUM)
+ (void)dbgtool_knl_free_mem(id);
+}
+
+static int alloc_dbgtool_info(void *hwdev, struct card_node *chip_info)
+{
+ struct dbgtool_k_glb_info *dbgtool_info = NULL;
+ int err, id = 0;
+
+ if (hinic3_func_type(hwdev) != TYPE_VF)
+ chip_info->func_handle_array[hinic3_global_func_id(hwdev)] = hwdev;
+
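+	/* Only the first function on a card allocates the shared dbgtool state */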
+ if (chip_info->func_num++)
+ return 0;
+
+ dbgtool_info = (struct dbgtool_k_glb_info *)
+ kzalloc(sizeof(struct dbgtool_k_glb_info), GFP_KERNEL);
+ if (!dbgtool_info) {
+ pr_err("Failed to allocate dbgtool_info\n");
+ goto dbgtool_info_fail;
+ }
+
+ chip_info->dbgtool_info = dbgtool_info;
+
+ /* FFM init */
+ dbgtool_info->ffm = (struct ffm_record_info *)
+ kzalloc(sizeof(struct ffm_record_info), GFP_KERNEL);
+ if (!dbgtool_info->ffm) {
+ pr_err("Failed to allocate cell contexts for a chain\n");
+ goto dbgtool_info_ffm_fail;
+ }
+
+ sema_init(&dbgtool_info->dbgtool_sem, 1);
+
+ err = sscanf(chip_info->chip_name, HINIC3_CHIP_NAME "%d", &id);
+ if (err < 0) {
+ pr_err("Failed to get card id\n");
+ goto sscanf_chdev_fail;
+ }
+
+ g_card_node_array[id] = chip_info;
+
+ return 0;
+
+sscanf_chdev_fail:
+ kfree(dbgtool_info->ffm);
+
+dbgtool_info_ffm_fail:
+ kfree(dbgtool_info);
+ chip_info->dbgtool_info = NULL;
+
+dbgtool_info_fail:
+ if (hinic3_func_type(hwdev) != TYPE_VF)
+ chip_info->func_handle_array[hinic3_global_func_id(hwdev)] = NULL;
+ chip_info->func_num--;
+ return -ENOMEM;
+}
+
+/**
+ * nictool_k_init - initialize the hw interface
+ **/
+/* temp for dbgtool_info */
+/*lint -e438*/
+int nictool_k_init(void *hwdev, void *chip_node)
+{
+ struct card_node *chip_info = (struct card_node *)chip_node;
+ struct device *pdevice = NULL;
+ int err;
+
+ err = alloc_dbgtool_info(hwdev, chip_info);
+ if (err)
+ return err;
+
+ if (g_nictool_ref_cnt++) {
+ /* already initialized */
+ return 0;
+ }
+
+ err = alloc_chrdev_region(&g_dev_id, 0, 1, HIADM3_DEV_NAME);
+ if (err) {
+ pr_err("Register nictool_dev failed(0x%x)\n", err);
+ goto alloc_chdev_fail;
+ }
+
+	/* Create the device class */
+ /*lint -save -e160*/
+ g_nictool_class = class_create(THIS_MODULE, HIADM3_DEV_CLASS);
+ /*lint -restore*/
+ if (IS_ERR(g_nictool_class)) {
+ pr_err("Create nictool_class fail\n");
+ err = -EFAULT;
+ goto class_create_err;
+ }
+
+ /* Initializing the character device */
+ cdev_init(&g_nictool_cdev, &fifo_operations);
+
+	/* Register the character device with the system */
+ err = cdev_add(&g_nictool_cdev, g_dev_id, 1);
+ if (err < 0) {
+ pr_err("Add nictool_dev to operating system fail(0x%x)\n", err);
+ goto cdev_add_err;
+ }
+
+ /* Export device information to user space
+ * (/sys/class/class name/device name)
+ */
+ pdevice = device_create(g_nictool_class, NULL,
+ g_dev_id, NULL, HIADM3_DEV_NAME);
+ if (IS_ERR(pdevice)) {
+ pr_err("Export nictool device information to user space fail\n");
+ err = -EFAULT;
+ goto device_create_err;
+ }
+
+ pr_info("Register nictool_dev to system succeed\n");
+
+ return 0;
+
+device_create_err:
+ cdev_del(&g_nictool_cdev);
+
+cdev_add_err:
+ class_destroy(g_nictool_class);
+
+class_create_err:
+ g_nictool_class = NULL;
+ unregister_chrdev_region(g_dev_id, 1);
+
+alloc_chdev_fail:
+ g_nictool_ref_cnt--;
+ free_dbgtool_info(hwdev, chip_info);
+
+ return err;
+} /*lint +e438*/
+
+void nictool_k_uninit(void *hwdev, void *chip_node)
+{
+ struct card_node *chip_info = (struct card_node *)chip_node;
+
+ free_dbgtool_info(hwdev, chip_info);
+
+ if (!g_nictool_ref_cnt)
+ return;
+
+ if (--g_nictool_ref_cnt)
+ return;
+
+ if (!g_nictool_class || IS_ERR(g_nictool_class)) {
+ pr_err("Nictool class is NULL.\n");
+ return;
+ }
+
+ device_destroy(g_nictool_class, g_dev_id);
+ cdev_del(&g_nictool_cdev);
+ class_destroy(g_nictool_class);
+ g_nictool_class = NULL;
+
+ unregister_chrdev_region(g_dev_id, 1);
+
+ pr_info("Unregister nictool_dev succeed\n");
+}
+
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_nictool.h b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_nictool.h
new file mode 100644
index 000000000000..f368133e341e
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_nictool.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_NICTOOL_H
+#define HINIC3_NICTOOL_H
+
+#include "hinic3_mt.h"
+#include "hinic3_crm.h"
+
+#ifndef MAX_SIZE
+#define MAX_SIZE (16)
+#endif
+
+#define DBGTOOL_PAGE_ORDER (10)
+
+#define MAX_CARD_NUM (64)
+
+int nictool_k_init(void *hwdev, void *chip_node);
+void nictool_k_uninit(void *hwdev, void *chip_node);
+
+void hinic3_get_all_chip_id(void *id_info);
+
+void hinic3_get_card_func_info_by_card_name
+ (const char *chip_name, struct hinic3_card_func_info *card_func);
+
+void hinic3_get_card_info(const void *hwdev, void *bufin);
+
+bool hinic3_is_in_host(void);
+
+int hinic3_get_pf_id(struct card_node *chip_node, u32 port_id, u32 *pf_id, u32 *isvalid);
+
+extern struct hinic3_uld_info g_uld_info[SERVICE_T_MAX];
+
+#endif
+
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_pci_id_tbl.h b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_pci_id_tbl.h
new file mode 100644
index 000000000000..d028ca62fab3
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_pci_id_tbl.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_PCI_ID_TBL_H
+#define HINIC3_PCI_ID_TBL_H
+
+#define PCI_VENDOR_ID_HUAWEI 0x19e5
+#define HINIC3_DEV_ID_STANDARD 0x0222
+#define HINIC3_DEV_ID_SDI_5_1_PF 0x0226
+#define HINIC3_DEV_ID_SDI_5_0_PF 0x0225
+#define HINIC3_DEV_ID_VF 0x375F
+#define HINIC3_DEV_ID_VF_HV 0x379F
+#define HINIC3_DEV_ID_SPU 0xAC00
+#endif
+
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_prof_adap.c b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_prof_adap.c
new file mode 100644
index 000000000000..fbb6198a30f6
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_prof_adap.c
@@ -0,0 +1,44 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [COMM]" fmt
+
+#include <linux/kernel.h>
+#include <linux/semaphore.h>
+#include <linux/workqueue.h>
+
+#include "ossl_knl.h"
+#include "hinic3_hwdev.h"
+#include "hinic3_profile.h"
+#include "hinic3_prof_adap.h"
+
+static bool is_match_prof_default_adapter(void *device)
+{
+ /* always match the default profile adapter in the standard scenario */
+ return true;
+}
+
+struct hinic3_prof_adapter prof_adap_objs[] = {
+ /* Add prof adapter before default profile */
+ {
+ .type = PROF_ADAP_TYPE_DEFAULT,
+ .match = is_match_prof_default_adapter,
+ .init = NULL,
+ .deinit = NULL,
+ },
+};
+
+void hisdk3_init_profile_adapter(struct hinic3_hwdev *hwdev)
+{
+ u16 num_adap = ARRAY_SIZE(prof_adap_objs);
+
+ hwdev->prof_adap = hinic3_prof_init(hwdev, prof_adap_objs, num_adap,
+ (void *)&hwdev->prof_attr);
+ if (hwdev->prof_adap)
+ sdk_info(hwdev->dev_hdl, "Find profile adapter type: %d\n", hwdev->prof_adap->type);
+}
+
+void hisdk3_deinit_profile_adapter(struct hinic3_hwdev *hwdev)
+{
+ hinic3_prof_deinit(hwdev->prof_adap, hwdev->prof_attr);
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_prof_adap.h b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_prof_adap.h
new file mode 100644
index 000000000000..e244d1197d42
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_prof_adap.h
@@ -0,0 +1,109 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_PROF_ADAP_H
+#define HINIC3_PROF_ADAP_H
+
+#include <linux/workqueue.h>
+
+#include "hinic3_profile.h"
+#include "hinic3_hwdev.h"
+
+enum cpu_affinity_work_type {
+ WORK_TYPE_AEQ,
+ WORK_TYPE_MBOX,
+ WORK_TYPE_MGMT_MSG,
+ WORK_TYPE_COMM,
+};
+
+enum hisdk3_sw_features {
+ HISDK3_SW_F_CHANNEL_LOCK = BIT(0),
+};
+
+struct hisdk3_prof_ops {
+ void (*fault_recover)(void *data, u16 src, u16 level);
+ int (*get_work_cpu_affinity)(void *data, u32 work_type);
+ void (*probe_success)(void *data);
+ void (*remove_pre_handle)(struct hinic3_hwdev *hwdev);
+};
+
+struct hisdk3_prof_attr {
+ void *priv_data;
+ u64 hw_feature_cap;
+ u64 sw_feature_cap;
+ u64 dft_hw_feature;
+ u64 dft_sw_feature;
+
+ struct hisdk3_prof_ops *ops;
+};
+
+#define GET_PROF_ATTR_OPS(hwdev) \
+ ((hwdev)->prof_attr ? (hwdev)->prof_attr->ops : NULL)
+
+#ifdef static
+#undef static
+#define LLT_STATIC_DEF_SAVED
+#endif
+
+static inline int hisdk3_get_work_cpu_affinity(struct hinic3_hwdev *hwdev,
+ enum cpu_affinity_work_type type)
+{
+ struct hisdk3_prof_ops *ops = GET_PROF_ATTR_OPS(hwdev);
+
+ if (ops && ops->get_work_cpu_affinity)
+ return ops->get_work_cpu_affinity(hwdev->prof_attr->priv_data, type);
+
+ return WORK_CPU_UNBOUND;
+}
+
+static inline void hisdk3_fault_post_process(struct hinic3_hwdev *hwdev,
+ u16 src, u16 level)
+{
+ struct hisdk3_prof_ops *ops = GET_PROF_ATTR_OPS(hwdev);
+
+ if (ops && ops->fault_recover)
+ ops->fault_recover(hwdev->prof_attr->priv_data, src, level);
+}
+
+static inline void hisdk3_probe_success(struct hinic3_hwdev *hwdev)
+{
+ struct hisdk3_prof_ops *ops = GET_PROF_ATTR_OPS(hwdev);
+
+ if (ops && ops->probe_success)
+ ops->probe_success(hwdev->prof_attr->priv_data);
+}
+
+static inline bool hisdk3_sw_feature_en(const struct hinic3_hwdev *hwdev,
+ u64 feature_bit)
+{
+ if (!hwdev->prof_attr)
+ return false;
+
+ return (hwdev->prof_attr->sw_feature_cap & feature_bit) &&
+ (hwdev->prof_attr->dft_sw_feature & feature_bit);
+}
+
+#ifdef CONFIG_MODULE_PROF
+static inline void hisdk3_remove_pre_process(struct hinic3_hwdev *hwdev)
+{
+ struct hisdk3_prof_ops *ops = NULL;
+
+ if (!hwdev)
+ return;
+
+ ops = GET_PROF_ATTR_OPS(hwdev);
+
+ if (ops && ops->remove_pre_handle)
+ ops->remove_pre_handle(hwdev);
+}
+#else
+static inline void hisdk3_remove_pre_process(struct hinic3_hwdev *hwdev) {}
+#endif
+#define SW_FEATURE_EN(hwdev, f_bit) \
+ hisdk3_sw_feature_en(hwdev, HISDK3_SW_F_##f_bit)
+#define HISDK3_F_CHANNEL_LOCK_EN(hwdev) SW_FEATURE_EN(hwdev, CHANNEL_LOCK)
+
+void hisdk3_init_profile_adapter(struct hinic3_hwdev *hwdev);
+void hisdk3_deinit_profile_adapter(struct hinic3_hwdev *hwdev);
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_sm_lt.h b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_sm_lt.h
new file mode 100644
index 000000000000..e204a9815ea8
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_sm_lt.h
@@ -0,0 +1,160 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef CHIPIF_SM_LT_H
+#define CHIPIF_SM_LT_H
+
+#include <linux/types.h>
+
+#define SM_LT_LOAD (0x12)
+#define SM_LT_STORE (0x14)
+
+#define SM_LT_NUM_OFFSET 13
+#define SM_LT_ABUF_FLG_OFFSET 12
+#define SM_LT_BC_OFFSET 11
+
+#define SM_LT_ENTRY_16B 16
+#define SM_LT_ENTRY_32B 32
+#define SM_LT_ENTRY_48B 48
+#define SM_LT_ENTRY_64B 64
+
+#define TBL_LT_OFFSET_DEFAULT 0
+
+#define SM_CACHE_LINE_SHFT 4 /* log2(16) */
+#define SM_CACHE_LINE_SIZE 16 /* the size of cache line */
+
+#define MAX_SM_LT_READ_LINE_NUM 4
+#define MAX_SM_LT_WRITE_LINE_NUM 3
+
+#define SM_LT_FULL_BYTEENB 0xFFFF
+
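+/* split a 48-bit byte-enable bitmap into three 16-bit masks, one per 16B line */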
+#define TBL_GET_ENB3_MASK(bitmask) ((u16)(((bitmask) >> 32) & 0xFFFF))
+#define TBL_GET_ENB2_MASK(bitmask) ((u16)(((bitmask) >> 16) & 0xFFFF))
+#define TBL_GET_ENB1_MASK(bitmask) ((u16)((bitmask) & 0xFFFF))
+
+enum {
+ SM_LT_NUM_0 = 0, /* lt num = 0, load/store 16B */
+ SM_LT_NUM_1, /* lt num = 1, load/store 32B */
+ SM_LT_NUM_2, /* lt num = 2, load/store 48B */
+ SM_LT_NUM_3 /* lt num = 3, load 64B */
+};
+
+/* lt load request */
+union sml_lt_req_head {
+ struct {
+ u32 offset:8;
+ u32 pad:3;
+ u32 bc:1;
+ u32 abuf_flg:1;
+ u32 num:2;
+ u32 ack:1;
+ u32 op_id:5;
+ u32 instance:6;
+ u32 src:5;
+ } bs;
+
+ u32 value;
+};
+
+struct sml_lt_load_req {
+ u32 extra;
+ union sml_lt_req_head head;
+ u32 index;
+ u32 pad0;
+ u32 pad1;
+};
+
+struct sml_lt_store_req {
+ u32 extra;
+ union sml_lt_req_head head;
+ u32 index;
+ u32 byte_enb[2];
+ u8 write_data[48];
+};
+
+enum {
+ SM_LT_OFFSET_1 = 1,
+ SM_LT_OFFSET_2,
+ SM_LT_OFFSET_3,
+ SM_LT_OFFSET_4,
+ SM_LT_OFFSET_5,
+ SM_LT_OFFSET_6,
+ SM_LT_OFFSET_7,
+ SM_LT_OFFSET_8,
+ SM_LT_OFFSET_9,
+ SM_LT_OFFSET_10,
+ SM_LT_OFFSET_11,
+ SM_LT_OFFSET_12,
+ SM_LT_OFFSET_13,
+ SM_LT_OFFSET_14,
+ SM_LT_OFFSET_15
+};
+
+enum HINIC_CSR_API_DATA_OPERATION_ID {
+ HINIC_CSR_OPERATION_WRITE_CSR = 0x1E,
+ HINIC_CSR_OPERATION_READ_CSR = 0x1F
+};
+
+enum HINIC_CSR_API_DATA_NEED_RESPONSE_DATA {
+ HINIC_CSR_NO_RESP_DATA = 0,
+ HINIC_CSR_NEED_RESP_DATA = 1
+};
+
+enum HINIC_CSR_API_DATA_DATA_SIZE {
+ HINIC_CSR_DATA_SZ_32 = 0,
+ HINIC_CSR_DATA_SZ_64 = 1
+};
+
+struct hinic_csr_request_api_data {
+ u32 dw0;
+
+ union {
+ struct {
+ u32 reserved1:13;
+ /* this field indicates the write/read data size:
+ * 2'b00: 32 bits
+ * 2'b01: 64 bits
+ * 2'b10~2'b11:reserved
+ */
+ u32 data_size:2;
+ /* this field indicates whether the requestor expects to
+ * receive response data.
+ * 1'b0: no response data expected.
+ * 1'b1: response data expected.
+ */
+ u32 need_response:1;
+ /* this field indicates the operation that the requestor
+ * expects.
+ * 5'b1_1110: write value to csr space.
+ * 5'b1_1111: read register from csr space.
+ */
+ u32 operation_id:5;
+ u32 reserved2:6;
+ /* this field specifies the Src node ID for this API
+ * request message.
+ */
+ u32 src_node_id:5;
+ } bits;
+
+ u32 val32;
+ } dw1;
+
+ union {
+ struct {
+ /* it specifies the CSR address. */
+ u32 csr_addr:26;
+ u32 reserved3:6;
+ } bits;
+
+ u32 val32;
+ } dw2;
+
+ /* if data_size is 2'b01, this holds the high 32 bits of the write
+ * data; otherwise it is 32'hFFFF_FFFF.
+ */
+ u32 csr_write_data_h;
+ /* the low 32 bits of write data. */
+ u32 csr_write_data_l;
+};
+#endif
+
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_sml_lt.c b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_sml_lt.c
new file mode 100644
index 000000000000..b802104153c5
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_sml_lt.c
@@ -0,0 +1,160 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/pci.h>
+#include <linux/device.h>
+#include <linux/spinlock.h>
+#include <linux/slab.h>
+#include <linux/module.h>
+
+#include "ossl_knl.h"
+#include "hinic3_common.h"
+#include "hinic3_sm_lt.h"
+#include "hinic3_hw.h"
+#include "hinic3_hwdev.h"
+#include "hinic3_api_cmd.h"
+#include "hinic3_mgmt.h"
+
+#define ACK 1
+#define NOACK 0
+
+#define LT_LOAD16_API_SIZE (16 + 4)
+#define LT_STORE16_API_SIZE (32 + 4)
+
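+/* unconditional 32-bit byte swap; equivalent to htonl() on little-endian hosts */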
+#ifndef HTONL
+#define HTONL(x) \
+ ((((x) & 0x000000ff) << 24) \
+ | (((x) & 0x0000ff00) << 8) \
+ | (((x) & 0x00ff0000) >> 8) \
+ | (((x) & 0xff000000) >> 24))
+#endif
+
+static inline void sm_lt_build_head(union sml_lt_req_head *head,
+ u8 instance_id,
+ u8 op_id, u8 ack,
+ u8 offset, u8 num)
+{
+ head->value = 0;
+ head->bs.instance = instance_id;
+ head->bs.op_id = op_id;
+ head->bs.ack = ack;
+ head->bs.num = num;
+ head->bs.abuf_flg = 0;
+ head->bs.bc = 1;
+ head->bs.offset = offset;
+ head->value = HTONL((head->value));
+}
+
+static inline void sm_lt_load_build_req(struct sml_lt_load_req *req,
+ u8 instance_id,
+ u8 op_id, u8 ack,
+ u32 lt_index,
+ u8 offset, u8 num)
+{
+ sm_lt_build_head(&req->head, instance_id, op_id, ack, offset, num);
+ req->extra = 0;
+ req->index = lt_index;
+ req->index = HTONL(req->index);
+ req->pad0 = 0;
+ req->pad1 = 0;
+}
+
+static void sml_lt_store_data(u32 *dst, const u32 *src, u8 num)
+{
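+	/* copy (num + 1) 16-byte cache lines; cases deliberately fall through */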
+ switch (num) {
+ case SM_LT_NUM_2:
+ *(dst + SM_LT_OFFSET_11) = *(src + SM_LT_OFFSET_11);
+ *(dst + SM_LT_OFFSET_10) = *(src + SM_LT_OFFSET_10);
+ *(dst + SM_LT_OFFSET_9) = *(src + SM_LT_OFFSET_9);
+ *(dst + SM_LT_OFFSET_8) = *(src + SM_LT_OFFSET_8);
+ /*lint -fallthrough */
+ case SM_LT_NUM_1:
+ *(dst + SM_LT_OFFSET_7) = *(src + SM_LT_OFFSET_7);
+ *(dst + SM_LT_OFFSET_6) = *(src + SM_LT_OFFSET_6);
+ *(dst + SM_LT_OFFSET_5) = *(src + SM_LT_OFFSET_5);
+ *(dst + SM_LT_OFFSET_4) = *(src + SM_LT_OFFSET_4);
+ /*lint -fallthrough */
+ case SM_LT_NUM_0:
+ *(dst + SM_LT_OFFSET_3) = *(src + SM_LT_OFFSET_3);
+ *(dst + SM_LT_OFFSET_2) = *(src + SM_LT_OFFSET_2);
+ *(dst + SM_LT_OFFSET_1) = *(src + SM_LT_OFFSET_1);
+ *dst = *src;
+ break;
+ default:
+ break;
+ }
+}
+
+static inline void sm_lt_store_build_req(struct sml_lt_store_req *req,
+ u8 instance_id,
+ u8 op_id, u8 ack,
+ u32 lt_index,
+ u8 offset,
+ u8 num,
+ u16 byte_enb3,
+ u16 byte_enb2,
+ u16 byte_enb1,
+ u8 *data)
+{
+ sm_lt_build_head(&req->head, instance_id, op_id, ack, offset, num);
+ req->index = lt_index;
+ req->index = HTONL(req->index);
+ req->extra = 0;
+ req->byte_enb[0] = (u32)(byte_enb3);
+ req->byte_enb[0] = HTONL(req->byte_enb[0]);
+ req->byte_enb[1] = HTONL((((u32)byte_enb2) << 16) | byte_enb1);
+ sml_lt_store_data((u32 *)req->write_data, (u32 *)(void *)data, num);
+}
+
+int hinic3_dbg_lt_rd_16byte(void *hwdev, u8 dest, u8 instance,
+ u32 lt_index, u8 *data)
+{
+ struct sml_lt_load_req req;
+ int ret;
+
+ if (!hwdev)
+ return -EFAULT;
+
+ if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ sm_lt_load_build_req(&req, instance, SM_LT_LOAD, ACK, lt_index, 0, 0);
+
+ ret = hinic3_api_cmd_read_ack(hwdev, dest, (u8 *)(&req),
+ LT_LOAD16_API_SIZE, (void *)data, 0x10);
+ if (ret) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Read linear table 16byte fail, err: %d\n", ret);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+int hinic3_dbg_lt_wr_16byte_mask(void *hwdev, u8 dest, u8 instance,
+ u32 lt_index, u8 *data, u16 mask)
+{
+ struct sml_lt_store_req req;
+ int ret;
+
+ if (!hwdev || !data)
+ return -EFAULT;
+
+ if (!COMM_SUPPORT_API_CHAIN((struct hinic3_hwdev *)hwdev))
+ return -EPERM;
+
+ sm_lt_store_build_req(&req, instance, SM_LT_STORE, NOACK, lt_index,
+ 0, 0, 0, 0, mask, data);
+
+ ret = hinic3_api_cmd_write_nack(hwdev, dest, &req, LT_STORE16_API_SIZE);
+ if (ret) {
+ sdk_err(((struct hinic3_hwdev *)hwdev)->dev_hdl,
+ "Write linear table 16byte fail, err: %d\n", ret);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_sriov.c b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_sriov.c
new file mode 100644
index 000000000000..b23b69f3dbe7
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_sriov.c
@@ -0,0 +1,267 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [NIC]" fmt
+
+#include <linux/pci.h>
+#include <linux/interrupt.h>
+
+#include "ossl_knl.h"
+#include "hinic3_crm.h"
+#include "hinic3_hw.h"
+#include "hinic3_lld.h"
+#include "hinic3_dev_mgmt.h"
+#include "hinic3_sriov.h"
+
+static int hinic3_init_vf_hw(void *hwdev, u16 start_vf_id, u16 end_vf_id)
+{
+ u16 i, func_idx;
+ int err;
+
+ /* mbox msg channel resources will be freed during remove process */
+ err = hinic3_init_func_mbox_msg_channel(hwdev,
+ hinic3_func_max_vf(hwdev));
+ if (err != 0)
+ return err;
+
+ /* VFs use 256K as the default WQ page size and cannot change it */
+ for (i = start_vf_id; i <= end_vf_id; i++) {
+ func_idx = hinic3_glb_pf_vf_offset(hwdev) + i;
+ err = hinic3_set_wq_page_size(hwdev, func_idx,
+ HINIC3_DEFAULT_WQ_PAGE_SIZE,
+ HINIC3_CHANNEL_COMM);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+static int hinic3_deinit_vf_hw(void *hwdev, u16 start_vf_id, u16 end_vf_id)
+{
+ u16 func_idx, idx;
+
+ for (idx = start_vf_id; idx <= end_vf_id; idx++) {
+ func_idx = hinic3_glb_pf_vf_offset(hwdev) + idx;
+ hinic3_set_wq_page_size(hwdev, func_idx,
+ HINIC3_HW_WQ_PAGE_SIZE,
+ HINIC3_CHANNEL_COMM);
+ }
+
+ return 0;
+}
+
+#if !(defined(HAVE_SRIOV_CONFIGURE) || defined(HAVE_RHEL6_SRIOV_CONFIGURE))
+ssize_t hinic3_sriov_totalvfs_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct pci_dev *pdev = to_pci_dev(dev);
+
+ return sprintf(buf, "%d\n", pci_sriov_get_totalvfs(pdev));
+}
+
+ssize_t hinic3_sriov_numvfs_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct pci_dev *pdev = to_pci_dev(dev);
+
+ return sprintf(buf, "%d\n", pci_num_vf(pdev));
+}
+
+/*lint -save -e713*/
+ssize_t hinic3_sriov_numvfs_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct pci_dev *pdev = to_pci_dev(dev);
+ int ret;
+ u16 num_vfs;
+ int cur_vfs, total_vfs;
+
+ ret = kstrtou16(buf, 0, &num_vfs);
+ if (ret < 0)
+ return ret;
+
+ cur_vfs = pci_num_vf(pdev);
+ total_vfs = pci_sriov_get_totalvfs(pdev);
+ if (num_vfs > total_vfs)
+ return -ERANGE;
+
+ if (num_vfs == cur_vfs)
+ return count; /* no change */
+
+ if (num_vfs == 0) {
+ /* disable VFs */
+ ret = hinic3_pci_sriov_configure(pdev, 0);
+ if (ret < 0)
+ return ret;
+ return count;
+ }
+
+ /* enable VFs */
+ if (cur_vfs) {
+ nic_warn(&pdev->dev, "%d VFs already enabled. Disable before enabling %d VFs\n",
+ cur_vfs, num_vfs);
+ return -EBUSY;
+ }
+
+ ret = hinic3_pci_sriov_configure(pdev, num_vfs);
+ if (ret < 0)
+ return ret;
+
+ if (ret != num_vfs)
+ nic_warn(&pdev->dev, "%d VFs requested; only %d enabled\n",
+ num_vfs, ret);
+
+ return count;
+}
+
+/*lint -restore*/
+#endif /* !(HAVE_SRIOV_CONFIGURE || HAVE_RHEL6_SRIOV_CONFIGURE) */
+
+int hinic3_pci_sriov_disable(struct pci_dev *dev)
+{
+#ifdef CONFIG_PCI_IOV
+ struct hinic3_sriov_info *sriov_info = NULL;
+ struct hinic3_event_info event = {0};
+ void *hwdev = NULL;
+ u16 tmp_vfs;
+
+ sriov_info = hinic3_get_sriov_info_by_pcidev(dev);
+ hwdev = hinic3_get_hwdev_by_pcidev(dev);
+ if (!hwdev) {
+ sdk_err(&dev->dev, "SR-IOV disable is not permitted, please wait...\n");
+ return -EPERM;
+ }
+
+ /* if SR-IOV is already disabled then there is nothing to do */
+ if (!sriov_info->sriov_enabled)
+ return 0;
+
+ if (test_and_set_bit(HINIC3_SRIOV_DISABLE, &sriov_info->state)) {
+ sdk_err(&dev->dev, "SR-IOV disable in process, please wait");
+ return -EPERM;
+ }
+
+ /* If our VFs are assigned we cannot shut down SR-IOV
+ * without causing issues, so just leave the hardware
+ * available but disabled
+ */
+ if (pci_vfs_assigned(dev)) {
+ clear_bit(HINIC3_SRIOV_DISABLE, &sriov_info->state);
+ sdk_warn(&dev->dev, "Unloading driver while VFs are assigned - VFs will not be deallocated\n");
+ return -EPERM;
+ }
+
+ event.service = EVENT_SRV_COMM;
+ event.type = EVENT_COMM_SRIOV_STATE_CHANGE;
+ ((struct hinic3_sriov_state_info *)(void *)event.event_data)->enable = 0;
+ hinic3_event_callback(hwdev, &event);
+
+ sriov_info->sriov_enabled = false;
+
+ /* disable iov and allow time for transactions to clear */
+ pci_disable_sriov(dev);
+
+ tmp_vfs = (u16)sriov_info->num_vfs;
+ sriov_info->num_vfs = 0;
+ hinic3_deinit_vf_hw(hwdev, 1, tmp_vfs);
+
+ clear_bit(HINIC3_SRIOV_DISABLE, &sriov_info->state);
+
+#endif
+
+ return 0;
+}
+
+int hinic3_pci_sriov_enable(struct pci_dev *dev, int num_vfs)
+{
+#ifdef CONFIG_PCI_IOV
+ struct hinic3_sriov_info *sriov_info = NULL;
+ struct hinic3_event_info event = {0};
+ void *hwdev = NULL;
+ int pre_existing_vfs = 0;
+ int err = 0;
+
+ sriov_info = hinic3_get_sriov_info_by_pcidev(dev);
+ hwdev = hinic3_get_hwdev_by_pcidev(dev);
+ if (!hwdev) {
+ sdk_err(&dev->dev, "SR-IOV enable is not permitted, please wait...\n");
+ return -EPERM;
+ }
+
+ if (test_and_set_bit(HINIC3_SRIOV_ENABLE, &sriov_info->state)) {
+ sdk_err(&dev->dev, "SR-IOV enable in process, please wait, num_vfs %d\n",
+ num_vfs);
+ return -EPERM;
+ }
+
+ pre_existing_vfs = pci_num_vf(dev);
+
+ if (num_vfs > pci_sriov_get_totalvfs(dev)) {
+ clear_bit(HINIC3_SRIOV_ENABLE, &sriov_info->state);
+ return -ERANGE;
+ }
+ if (pre_existing_vfs && pre_existing_vfs != num_vfs) {
+ err = hinic3_pci_sriov_disable(dev);
+ if (err) {
+ clear_bit(HINIC3_SRIOV_ENABLE, &sriov_info->state);
+ return err;
+ }
+ } else if (pre_existing_vfs == num_vfs) {
+ clear_bit(HINIC3_SRIOV_ENABLE, &sriov_info->state);
+ return num_vfs;
+ }
+
+ err = hinic3_init_vf_hw(hwdev, 1, (u16)num_vfs);
+ if (err) {
+ sdk_err(&dev->dev, "Failed to init vf in hardware before enable sriov, error %d\n",
+ err);
+ clear_bit(HINIC3_SRIOV_ENABLE, &sriov_info->state);
+ return err;
+ }
+
+ err = pci_enable_sriov(dev, num_vfs);
+ if (err) {
+ sdk_err(&dev->dev, "Failed to enable SR-IOV, error %d\n", err);
+ clear_bit(HINIC3_SRIOV_ENABLE, &sriov_info->state);
+ return err;
+ }
+
+ sriov_info->sriov_enabled = true;
+ sriov_info->num_vfs = num_vfs;
+
+ event.service = EVENT_SRV_COMM;
+ event.type = EVENT_COMM_SRIOV_STATE_CHANGE;
+ ((struct hinic3_sriov_state_info *)(void *)event.event_data)->enable = 1;
+ ((struct hinic3_sriov_state_info *)(void *)event.event_data)->num_vfs = (u16)num_vfs;
+ hinic3_event_callback(hwdev, &event);
+
+ clear_bit(HINIC3_SRIOV_ENABLE, &sriov_info->state);
+
+ return num_vfs;
+#else
+
+ return 0;
+#endif
+}
+
+int hinic3_pci_sriov_configure(struct pci_dev *dev, int num_vfs)
+{
+ struct hinic3_sriov_info *sriov_info = NULL;
+
+ sriov_info = hinic3_get_sriov_info_by_pcidev(dev);
+ if (!sriov_info)
+ return -EFAULT;
+
+ if (!test_bit(HINIC3_FUNC_PERSENT, &sriov_info->state))
+ return -EFAULT;
+
+ if (num_vfs == 0)
+ return hinic3_pci_sriov_disable(dev);
+ else
+ return hinic3_pci_sriov_enable(dev, num_vfs);
+}
+
+/*lint -restore*/
+
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_sriov.h b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_sriov.h
new file mode 100644
index 000000000000..4a640adf15b4
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_sriov.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef HINIC3_SRIOV_H
+#define HINIC3_SRIOV_H
+#include <linux/types.h>
+#include <linux/pci.h>
+
+#if !(defined(HAVE_SRIOV_CONFIGURE) || defined(HAVE_RHEL6_SRIOV_CONFIGURE))
+ssize_t hinic3_sriov_totalvfs_show(struct device *dev,
+ struct device_attribute *attr, char *buf);
+ssize_t hinic3_sriov_numvfs_show(struct device *dev,
+ struct device_attribute *attr, char *buf);
+ssize_t hinic3_sriov_numvfs_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count);
+#endif /* !(HAVE_SRIOV_CONFIGURE || HAVE_RHEL6_SRIOV_CONFIGURE) */
+
+enum hinic3_sriov_state {
+ HINIC3_SRIOV_DISABLE,
+ HINIC3_SRIOV_ENABLE,
+ HINIC3_FUNC_PERSENT,
+};
+
+struct hinic3_sriov_info {
+ bool sriov_enabled;
+ unsigned int num_vfs;
+ unsigned long state;
+};
+
+struct hinic3_sriov_info *hinic3_get_sriov_info_by_pcidev(struct pci_dev *pdev);
+int hinic3_pci_sriov_disable(struct pci_dev *dev);
+int hinic3_pci_sriov_enable(struct pci_dev *dev, int num_vfs);
+int hinic3_pci_sriov_configure(struct pci_dev *dev, int num_vfs);
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/hinic3_wq.c b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_wq.c
new file mode 100644
index 000000000000..2f5e0984e429
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/hinic3_wq.c
@@ -0,0 +1,159 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": [COMM]" fmt
+
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/dma-mapping.h>
+#include <linux/device.h>
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+#include "ossl_knl.h"
+#include "hinic3_common.h"
+#include "hinic3_hwdev.h"
+#include "hinic3_wq.h"
+
+#define WQ_MIN_DEPTH 64
+#define WQ_MAX_DEPTH 65536
+#define WQ_MAX_NUM_PAGES (PAGE_SIZE / sizeof(u64))
+
+static int wq_init_wq_block(struct hinic3_wq *wq)
+{
+ int i;
+
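+	/* level-0 CLA: the single WQ page doubles as the WQ block itself */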
+ if (WQ_IS_0_LEVEL_CLA(wq)) {
+ wq->wq_block_paddr = wq->wq_pages[0].align_paddr;
+ wq->wq_block_vaddr = wq->wq_pages[0].align_vaddr;
+
+ return 0;
+ }
+
+ if (wq->num_wq_pages > WQ_MAX_NUM_PAGES) {
+ sdk_err(wq->dev_hdl, "num_wq_pages exceed limit: %lu\n",
+ WQ_MAX_NUM_PAGES);
+ return -EFAULT;
+ }
+
+ wq->wq_block_vaddr = dma_zalloc_coherent(wq->dev_hdl, PAGE_SIZE,
+ &wq->wq_block_paddr,
+ GFP_KERNEL);
+ if (!wq->wq_block_vaddr) {
+ sdk_err(wq->dev_hdl, "Failed to alloc wq block\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < wq->num_wq_pages; i++)
+ wq->wq_block_vaddr[i] =
+ cpu_to_be64(wq->wq_pages[i].align_paddr);
+
+ return 0;
+}
+
+static int wq_alloc_pages(struct hinic3_wq *wq)
+{
+ int i, page_idx, err;
+
+ wq->wq_pages = kcalloc(wq->num_wq_pages, sizeof(*wq->wq_pages),
+ GFP_KERNEL);
+ if (!wq->wq_pages) {
+ sdk_err(wq->dev_hdl, "Failed to alloc wq pages handle\n");
+ return -ENOMEM;
+ }
+
+ for (page_idx = 0; page_idx < wq->num_wq_pages; page_idx++) {
+ err = hinic3_dma_zalloc_coherent_align(wq->dev_hdl,
+ wq->wq_page_size,
+ wq->wq_page_size,
+ GFP_KERNEL,
+ &wq->wq_pages[page_idx]);
+ if (err) {
+ sdk_err(wq->dev_hdl, "Failed to alloc wq page\n");
+ goto free_wq_pages;
+ }
+ }
+
+ err = wq_init_wq_block(wq);
+ if (err)
+ goto free_wq_pages;
+
+ return 0;
+
+free_wq_pages:
+ for (i = 0; i < page_idx; i++)
+ hinic3_dma_free_coherent_align(wq->dev_hdl, &wq->wq_pages[i]);
+
+ kfree(wq->wq_pages);
+ wq->wq_pages = NULL;
+
+ return -ENOMEM;
+}
+
+static void wq_free_pages(struct hinic3_wq *wq)
+{
+ int i;
+
+ if (!WQ_IS_0_LEVEL_CLA(wq))
+ dma_free_coherent(wq->dev_hdl, PAGE_SIZE, wq->wq_block_vaddr,
+ wq->wq_block_paddr);
+
+ for (i = 0; i < wq->num_wq_pages; i++)
+ hinic3_dma_free_coherent_align(wq->dev_hdl, &wq->wq_pages[i]);
+
+ kfree(wq->wq_pages);
+ wq->wq_pages = NULL;
+}
+
+int hinic3_wq_create(void *hwdev, struct hinic3_wq *wq, u32 q_depth,
+ u16 wqebb_size)
+{
+ struct hinic3_hwdev *dev = hwdev;
+ u32 wq_page_size;
+
+ if (!wq || !dev) {
+ pr_err("Invalid wq or dev_hdl\n");
+ return -EINVAL;
+ }
+
+ if (q_depth < WQ_MIN_DEPTH || q_depth > WQ_MAX_DEPTH ||
+ (q_depth & (q_depth - 1)) || !wqebb_size ||
+ (wqebb_size & (wqebb_size - 1))) {
+ sdk_err(dev->dev_hdl, "Wq q_depth(%u) or wqebb_size(%u) is invalid\n",
+ q_depth, wqebb_size);
+ return -EINVAL;
+ }
+
+ wq_page_size = ALIGN(dev->wq_page_size, PAGE_SIZE);
+
+ memset(wq, 0, sizeof(*wq));
+ wq->dev_hdl = dev->dev_hdl;
+ wq->q_depth = q_depth;
+ wq->idx_mask = (u16)(q_depth - 1);
+ wq->wqebb_size = wqebb_size;
+ wq->wqebb_size_shift = (u16)ilog2(wq->wqebb_size);
+ wq->wq_page_size = wq_page_size;
+
+ wq->wqebbs_per_page = wq_page_size / wqebb_size;
+ /* In case of wq_page_size is larger than q_depth * wqebb_size */
+ if (wq->wqebbs_per_page > q_depth)
+ wq->wqebbs_per_page = q_depth;
+ wq->wqebbs_per_page_shift = (u16)ilog2(wq->wqebbs_per_page);
+ wq->wqebbs_per_page_mask = (u16)(wq->wqebbs_per_page - 1);
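+	/* number of WQ pages needed to hold q_depth WQEBBs, rounded up to a page */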
+ wq->num_wq_pages = (u16)(ALIGN(((u32)q_depth * wqebb_size),
+ wq_page_size) / wq_page_size);
+
+ return wq_alloc_pages(wq);
+}
+EXPORT_SYMBOL(hinic3_wq_create);
+
+void hinic3_wq_destroy(struct hinic3_wq *wq)
+{
+ if (!wq)
+ return;
+
+ wq_free_pages(wq);
+}
+EXPORT_SYMBOL(hinic3_wq_destroy);
diff --git a/drivers/net/ethernet/huawei/hinic3/hw/ossl_knl_linux.c b/drivers/net/ethernet/huawei/hinic3/hw/ossl_knl_linux.c
new file mode 100644
index 000000000000..fafbc2f23067
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/hw/ossl_knl_linux.c
@@ -0,0 +1,121 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#include <linux/vmalloc.h>
+#include "ossl_knl_linux.h"
+
+#define OSSL_MINUTE_BASE (60)
+
+struct file *file_creat(const char *file_name)
+{
+ return filp_open(file_name, O_CREAT | O_RDWR | O_APPEND, 0);
+}
+
+struct file *file_open(const char *file_name)
+{
+ return filp_open(file_name, O_RDONLY, 0);
+}
+
+void file_close(struct file *file_handle)
+{
+ (void)filp_close(file_handle, NULL);
+}
+
+u32 get_file_size(struct file *file_handle)
+{
+ struct inode *file_inode;
+
+ file_inode = file_handle->f_inode;
+
+ return (u32)(file_inode->i_size);
+}
+
+void set_file_position(struct file *file_handle, u32 position)
+{
+ file_handle->f_pos = position;
+}
+
+int file_read(struct file *file_handle, char *log_buffer, u32 rd_length,
+ u32 *file_pos)
+{
+ return (int)kernel_read(file_handle, log_buffer, rd_length,
+ &file_handle->f_pos);
+}
+
+u32 file_write(struct file *file_handle, const char *log_buffer, u32 wr_length)
+{
+ return (u32)kernel_write(file_handle, log_buffer, wr_length,
+ &file_handle->f_pos);
+}
+
+static int _linux_thread_func(void *thread)
+{
+ struct sdk_thread_info *info = (struct sdk_thread_info *)thread;
+
+ while (!kthread_should_stop())
+ info->thread_fn(info->data);
+
+ return 0;
+}
+
+int creat_thread(struct sdk_thread_info *thread_info)
+{
+ thread_info->thread_obj = kthread_run(_linux_thread_func, thread_info,
+ thread_info->name);
+ if (!thread_info->thread_obj)
+ return -EFAULT;
+
+ return 0;
+}
+
+void stop_thread(struct sdk_thread_info *thread_info)
+{
+ if (thread_info->thread_obj)
+ (void)kthread_stop(thread_info->thread_obj);
+}
+
+void utctime_to_localtime(u64 utctime, u64 *localtime)
+{
+ *localtime = utctime - sys_tz.tz_minuteswest *
+ OSSL_MINUTE_BASE; /*lint !e647*/
+}
+
+#ifndef HAVE_TIMER_SETUP
+void initialize_timer(const void *adapter_hdl, struct timer_list *timer)
+{
+ if (!adapter_hdl || !timer)
+ return;
+
+ init_timer(timer);
+}
+#endif
+
+void add_to_timer(struct timer_list *timer, long period)
+{
+ if (!timer)
+ return;
+
+ add_timer(timer);
+}
+
+void stop_timer(struct timer_list *timer) {}
+
+void delete_timer(struct timer_list *timer)
+{
+ if (!timer)
+ return;
+
+ del_timer_sync(timer);
+}
+
+u64 ossl_get_real_time(void)
+{
+ struct timeval tv = {0};
+ u64 tv_msec;
+
+ do_gettimeofday(&tv);
+
+ tv_msec = (u64)tv.tv_sec * MSEC_PER_SEC + (u64)tv.tv_usec / USEC_PER_MSEC;
+ return tv_msec;
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/mag_cmd.h b/drivers/net/ethernet/huawei/hinic3/mag_cmd.h
new file mode 100644
index 000000000000..3af40511c7b9
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/mag_cmd.h
@@ -0,0 +1,886 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) Huawei Technologies Co., Ltd. 2021-2021. All rights reserved.
+ * Description: serdes/mag cmd definition between driver and mpu
+ * Author: ETH group
+ * Create: 2021-07-30
+ */
+
+#ifndef MAG_CMD_H
+#define MAG_CMD_H
+
+#include "mgmt_msg_base.h"
+
+/* serdes/mag message command definitions */
+enum mag_cmd {
+ /* serdes command word; all serdes commands are wrapped under it */
+ SERDES_CMD_PROCESS = 0,
+
+ /* mag commands, grouped by function */
+ /* port configuration, 0-29 */
+ MAG_CMD_SET_PORT_CFG = 1,
+ MAG_CMD_SET_PORT_ADAPT = 2,
+ MAG_CMD_CFG_LOOPBACK_MODE = 3,
+
+ MAG_CMD_GET_PORT_ENABLE = 5,
+ MAG_CMD_SET_PORT_ENABLE = 6,
+ MAG_CMD_GET_LINK_STATUS = 7,
+ MAG_CMD_SET_LINK_FOLLOW = 8,
+ MAG_CMD_SET_PMA_ENABLE = 9,
+ MAG_CMD_CFG_FEC_MODE = 10,
+
+ MAG_CMD_CFG_AN_TYPE = 12, /* reserved for future use */
+ MAG_CMD_CFG_LINK_TIME = 13,
+
+ MAG_CMD_SET_PANGEA_ADAPT = 15,
+
+ /* BIOS link configuration, 30-49 */
+ MAG_CMD_CFG_BIOS_LINK_CFG = 31,
+ MAG_CMD_RESTORE_LINK_CFG = 32,
+ MAG_CMD_ACTIVATE_BIOS_LINK_CFG = 33,
+
+ /* optical module, LED, PHY and other peripheral management, 50-99 */
+ /* LED */
+ MAG_CMD_SET_LED_CFG = 50,
+
+ /* PHY */
+ MAG_CMD_GET_PHY_INIT_STATUS = 55, /* reserved for future use */
+
+ /* optical module */
+ MAG_CMD_GET_XSFP_INFO = 60,
+ MAG_CMD_SET_XSFP_ENABLE = 61,
+ MAG_CMD_GET_XSFP_PRESENT = 62,
+ MAG_CMD_SET_XSFP_RW = 63, /* sfp/qsfp single byte read/write, for equipment test */
+ MAG_CMD_CFG_XSFP_TEMPERATURE = 64,
+
+ /* event reporting, 100-149 */
+ MAG_CMD_WIRE_EVENT = 100,
+ MAG_CMD_LINK_ERR_EVENT = 101,
+
+ /* DFX and counter related */
+ MAG_CMD_EVENT_PORT_INFO = 150,
+ MAG_CMD_GET_PORT_STAT = 151,
+ MAG_CMD_CLR_PORT_STAT = 152,
+ MAG_CMD_GET_PORT_INFO = 153,
+ MAG_CMD_GET_PCS_ERR_CNT = 154,
+ MAG_CMD_GET_MAG_CNT = 155,
+ MAG_CMD_DUMP_ANTRAIN_INFO = 156,
+
+ MAG_CMD_MAX = 0xFF
+};
+
+/* serdes cmd struct define */
+#define CMD_ARRAY_BUF_SIZE 64
+#define SERDES_CMD_DATA_BUF_SIZE 512
+struct serdes_in_info {
+ u32 chip_id : 16;
+ u32 macro_id : 16;
+ u32 start_sds_id : 16;
+ u32 sds_num : 16;
+
+ u32 cmd_type : 8; /* reserved for iotype */
+ u32 sub_cmd : 8;
+ u32 rw : 1; /* 0: read, 1: write */
+ u32 rsvd : 15;
+
+ u32 val;
+ union {
+ char field[CMD_ARRAY_BUF_SIZE];
+ u32 addr;
+ u8 *ex_param;
+ };
+};
+
+struct serdes_out_info {
+ u32 str_len; /* out_str length */
+ u32 result_offset;
+ u32 type; /* 0:data; 1:string */
+ char out_str[SERDES_CMD_DATA_BUF_SIZE];
+};
+
+struct serdes_cmd_in {
+ struct mgmt_msg_head head;
+
+ struct serdes_in_info serdes_in;
+};
+
+struct serdes_cmd_out {
+ struct mgmt_msg_head head;
+
+ struct serdes_out_info serdes_out;
+};
+
+enum mag_cmd_port_speed {
+ PORT_SPEED_NOT_SET = 0,
+ PORT_SPEED_10MB = 1,
+ PORT_SPEED_100MB = 2,
+ PORT_SPEED_1GB = 3,
+ PORT_SPEED_10GB = 4,
+ PORT_SPEED_25GB = 5,
+ PORT_SPEED_40GB = 6,
+ PORT_SPEED_50GB = 7,
+ PORT_SPEED_100GB = 8,
+ PORT_SPEED_200GB = 9,
+ PORT_SPEED_UNKNOWN
+};
+
+enum mag_cmd_port_an {
+ PORT_AN_NOT_SET = 0,
+ PORT_CFG_AN_ON = 1,
+ PORT_CFG_AN_OFF = 2
+};
+
+enum mag_cmd_port_adapt {
+ PORT_ADAPT_NOT_SET = 0,
+ PORT_CFG_ADAPT_ON = 1,
+ PORT_CFG_ADAPT_OFF = 2
+};
+
+enum mag_cmd_port_sriov {
+ PORT_SRIOV_NOT_SET = 0,
+ PORT_CFG_SRIOV_ON = 1,
+ PORT_CFG_SRIOV_OFF = 2
+};
+
+enum mag_cmd_port_fec {
+ PORT_FEC_NOT_SET = 0,
+ PORT_FEC_RSFEC = 1,
+ PORT_FEC_BASEFEC = 2,
+ PORT_FEC_NOFEC = 3,
+ PORT_FEC_LLRSFEC = 4,
+ PORT_FEC_AUTO = 5
+};
+
+enum mag_cmd_port_lanes {
+ PORT_LANES_NOT_SET = 0,
+ PORT_LANES_X1 = 1,
+ PORT_LANES_X2 = 2,
+ PORT_LANES_X4 = 4,
+ PORT_LANES_X8 = 8 /* reserved for future use */
+};
+
+enum mag_cmd_port_duplex {
+ PORT_DUPLEX_HALF = 0,
+ PORT_DUPLEX_FULL = 1
+};
+
+enum mag_cmd_wire_node {
+ WIRE_NODE_UNDEF = 0,
+ CABLE_10G = 1,
+ FIBER_10G = 2,
+ CABLE_25G = 3,
+ FIBER_25G = 4,
+ CABLE_40G = 5,
+ FIBER_40G = 6,
+ CABLE_50G = 7,
+ FIBER_50G = 8,
+ CABLE_100G = 9,
+ FIBER_100G = 10,
+ CABLE_200G = 11,
+ FIBER_200G = 12,
+ WIRE_NODE_NUM
+};
+
+enum mag_cmd_cnt_type {
+ MAG_RX_RSFEC_DEC_CW_CNT = 0,
+ MAG_RX_RSFEC_CORR_CW_CNT = 1,
+ MAG_RX_RSFEC_UNCORR_CW_CNT = 2,
+ MAG_RX_PCS_BER_CNT = 3,
+ MAG_RX_PCS_ERR_BLOCK_CNT = 4,
+ MAG_RX_PCS_E_BLK_CNT = 5,
+ MAG_RX_PCS_DEC_ERR_BLK_CNT = 6,
+ MAG_RX_PCS_LANE_BIP_ERR_CNT = 7,
+ MAG_CNT_NUM
+};
+
+/* mag_cmd_set_port_cfg config bitmap */
+#define MAG_CMD_SET_SPEED 0x1
+#define MAG_CMD_SET_AUTONEG 0x2
+#define MAG_CMD_SET_FEC 0x4
+#define MAG_CMD_SET_LANES 0x8
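+/* e.g. config_bitmap = MAG_CMD_SET_SPEED | MAG_CMD_SET_FEC updates speed and fec only */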
+struct mag_cmd_set_port_cfg {
+ struct mgmt_msg_head head;
+
+ u8 port_id;
+ u8 rsvd0[3];
+
+ u32 config_bitmap;
+ u8 speed;
+ u8 autoneg;
+ u8 fec;
+ u8 lanes;
+ u8 rsvd1[20];
+};
+
+/* mag supported/advertised link mode bitmap */
+enum mag_cmd_link_mode {
+ LINK_MODE_GE = 0,
+ LINK_MODE_10GE_BASE_R = 1,
+ LINK_MODE_25GE_BASE_R = 2,
+ LINK_MODE_40GE_BASE_R4 = 3,
+ LINK_MODE_50GE_BASE_R = 4,
+ LINK_MODE_50GE_BASE_R2 = 5,
+ LINK_MODE_100GE_BASE_R = 6,
+ LINK_MODE_100GE_BASE_R2 = 7,
+ LINK_MODE_100GE_BASE_R4 = 8,
+ LINK_MODE_200GE_BASE_R2 = 9,
+ LINK_MODE_200GE_BASE_R4 = 10,
+ LINK_MODE_MAX_NUMBERS,
+
+ LINK_MODE_UNKNOWN = 0xFFFF
+};
+
+#define LINK_MODE_GE_BIT 0x1u
+#define LINK_MODE_10GE_BASE_R_BIT 0x2u
+#define LINK_MODE_25GE_BASE_R_BIT 0x4u
+#define LINK_MODE_40GE_BASE_R4_BIT 0x8u
+#define LINK_MODE_50GE_BASE_R_BIT 0x10u
+#define LINK_MODE_50GE_BASE_R2_BIT 0x20u
+#define LINK_MODE_100GE_BASE_R_BIT 0x40u
+#define LINK_MODE_100GE_BASE_R2_BIT 0x80u
+#define LINK_MODE_100GE_BASE_R4_BIT 0x100u
+#define LINK_MODE_200GE_BASE_R2_BIT 0x200u
+#define LINK_MODE_200GE_BASE_R4_BIT 0x400u
+
+#define CABLE_10GE_BASE_R_BIT LINK_MODE_10GE_BASE_R_BIT
+#define CABLE_25GE_BASE_R_BIT (LINK_MODE_25GE_BASE_R_BIT | LINK_MODE_10GE_BASE_R_BIT)
+#define CABLE_40GE_BASE_R4_BIT LINK_MODE_40GE_BASE_R4_BIT
+#define CABLE_50GE_BASE_R_BIT (LINK_MODE_50GE_BASE_R_BIT | LINK_MODE_25GE_BASE_R_BIT | \
+ LINK_MODE_10GE_BASE_R_BIT)
+#define CABLE_50GE_BASE_R2_BIT LINK_MODE_50GE_BASE_R2_BIT
+#define CABLE_100GE_BASE_R2_BIT (LINK_MODE_100GE_BASE_R2_BIT | LINK_MODE_50GE_BASE_R2_BIT)
+#define CABLE_100GE_BASE_R4_BIT (LINK_MODE_100GE_BASE_R4_BIT | LINK_MODE_40GE_BASE_R4_BIT)
+#define CABLE_200GE_BASE_R4_BIT (LINK_MODE_200GE_BASE_R4_BIT | LINK_MODE_100GE_BASE_R4_BIT | \
+ LINK_MODE_40GE_BASE_R4_BIT)
+
+struct mag_cmd_get_port_info {
+ struct mgmt_msg_head head;
+
+ u8 port_id;
+ u8 rsvd0[3];
+
+ u8 wire_type;
+ u8 an_support;
+ u8 an_en;
+ u8 duplex;
+
+ u8 speed;
+ u8 fec;
+ u8 lanes;
+ u8 rsvd1;
+
+ u32 supported_mode;
+ u32 advertised_mode;
+ u8 rsvd2[8];
+};
+
+#define MAG_CMD_OPCODE_GET 0
+#define MAG_CMD_OPCODE_SET 1
+struct mag_cmd_set_port_adapt {
+ struct mgmt_msg_head head;
+
+ u8 port_id;
+ u8 opcode; /* 0:get adapt info 1:set adapt */
+ u8 enable;
+ u8 rsvd0;
+ u32 speed_mode;
+ u32 rsvd1[3];
+};
+
+#define MAG_CMD_LP_MODE_SDS_S_TX2RX 1
+#define MAG_CMD_LP_MODE_SDS_P_RX2TX 2
+#define MAG_CMD_LP_MODE_SDS_P_TX2RX 3
+#define MAG_CMD_LP_MODE_MAC_RX2TX 4
+#define MAG_CMD_LP_MODE_MAC_TX2RX 5
+#define MAG_CMD_LP_MODE_TXDP2RXDP 6
+struct mag_cmd_cfg_loopback_mode {
+ struct mgmt_msg_head head;
+
+ u8 port_id;
+ u8 opcode; /* 0:get loopback mode 1:set loopback mode */
+ u8 lp_mode;
+ u8 lp_en; /* 0:disable 1:enable */
+
+ u32 rsvd0[2];
+};
+
+#define MAG_CMD_PORT_DISABLE 0x0
+#define MAG_CMD_TX_ENABLE 0x1
+#define MAG_CMD_RX_ENABLE 0x2
+/* the physical port is disabled only when all PFs of the port are set to
+ * down; if any PF is enabled, the port is enabled
+ */
+struct mag_cmd_set_port_enable {
+ struct mgmt_msg_head head;
+
+ u16 function_id; /* function_id must not exceed the max supported pf_id (32) */
+ u16 rsvd0;
+
+ u8 state; /* bitmap bit0:tx_en bit1:rx_en */
+ u8 rsvd1[3];
+};
+
+struct mag_cmd_get_port_enable {
+ struct mgmt_msg_head head;
+
+ u8 port;
+ u8 state; /* bitmap bit0:tx_en bit1:rx_en */
+ u8 rsvd0[2];
+};
+
+#define PMA_FOLLOW_DEFAULT 0x0
+#define PMA_FOLLOW_ENABLE 0x1
+#define PMA_FOLLOW_DISABLE 0x2
+#define PMA_FOLLOW_GET 0x4
+/* the physical port disables link follow only when all PFs of the port are set to follow disable */
+struct mag_cmd_set_link_follow {
+ struct mgmt_msg_head head;
+
+ u16 function_id; /* function_id must not exceed the max supported pf_id (32) */
+ u16 rsvd0;
+
+ u8 follow;
+ u8 rsvd1[3];
+};
+
+/* firmware also use this cmd report link event to driver */
+struct mag_cmd_get_link_status {
+ struct mgmt_msg_head head;
+
+ u8 port_id;
+ u8 status; /* 0:link down 1:link up */
+ u8 rsvd0[2];
+};
+
+struct mag_cmd_set_pma_enable {
+ struct mgmt_msg_head head;
+
+ u16 function_id; /* function_id must not exceed the max supported pf_id (32) */
+ u16 enable;
+};
+
+struct mag_cmd_cfg_an_type {
+ struct mgmt_msg_head head;
+
+ u8 port_id;
+ u8 opcode; /* 0:get an type 1:set an type */
+ u8 rsvd0[2];
+
+ u32 an_type; /* 0:IEEE 1:25G/50G Ethernet Consortium */
+};
+
+struct mag_cmd_get_link_time {
+ struct mgmt_msg_head head;
+ u8 port_id;
+ u8 rsvd0[3];
+
+ u32 link_up_begin;
+ u32 link_up_end;
+ u32 link_down_begin;
+ u32 link_down_end;
+};
+
+struct mag_cmd_cfg_fec_mode {
+ struct mgmt_msg_head head;
+
+ u8 port_id;
+ u8 opcode; /* 0:get fec mode 1:set fec mode */
+ u8 fec;
+ u8 rsvd0;
+};
+
+/* speed */
+#define PANGEA_ADAPT_10G_BITMAP 0xd
+#define PANGEA_ADAPT_25G_BITMAP 0x72
+#define PANGEA_ADAPT_40G_BITMAP 0x680
+#define PANGEA_ADAPT_100G_BITMAP 0x1900
+
+/* speed and fec */
+#define PANGEA_10G_NO_BITMAP 0x8
+#define PANGEA_10G_BASE_BITMAP 0x4
+#define PANGEA_25G_NO_BITMAP 0x10
+#define PANGEA_25G_BASE_BITMAP 0x20
+#define PANGEA_25G_RS_BITMAP 0x40
+#define PANGEA_40G_NO_BITMAP 0x400
+#define PANGEA_40G_BASE_BITMAP 0x200
+#define PANGEA_100G_NO_BITMAP 0x800
+#define PANGEA_100G_RS_BITMAP 0x1000
+
+/* adapt or fec */
+#define PANGEA_ADAPT_ADAPT_BITMAP 0x183
+#define PANGEA_ADAPT_NO_BITMAP 0xc18
+#define PANGEA_ADAPT_BASE_BITMAP 0x224
+#define PANGEA_ADAPT_RS_BITMAP 0x1040
+
+/* default cfg */
+#define PANGEA_ADAPT_CFG_10G_CR 0x200d
+#define PANGEA_ADAPT_CFG_10G_SRLR 0xd
+#define PANGEA_ADAPT_CFG_25G_CR 0x207f
+#define PANGEA_ADAPT_CFG_25G_SRLR 0x72
+#define PANGEA_ADAPT_CFG_40G_CR4 0x2680
+#define PANGEA_ADAPT_CFG_40G_SRLR4 0x680
+#define PANGEA_ADAPT_CFG_100G_CR4 0x3f80
+#define PANGEA_ADAPT_CFG_100G_SRLR4 0x1900
+typedef union {
+ struct {
+ u32 adapt_10g : 1; /* [0] adapt_10g */
+ u32 adapt_25g : 1; /* [1] adapt_25g */
+ u32 base_10g : 1; /* [2] base_10g */
+ u32 no_10g : 1; /* [3] no_10g */
+ u32 no_25g : 1; /* [4] no_25g */
+ u32 base_25g : 1; /* [5] base_25g */
+ u32 rs_25g : 1; /* [6] rs_25g */
+ u32 adapt_40g : 1; /* [7] adapt_40g */
+ u32 adapt_100g : 1; /* [8] adapt_100g */
+ u32 base_40g : 1; /* [9] base_40g */
+ u32 no_40g : 1; /* [10] no_40g */
+ u32 no_100g : 1; /* [11] no_100g */
+ u32 rs_100g : 1; /* [12] rs_100g */
+ u32 auto_neg : 1; /* [13] auto_neg */
+ u32 rsvd0 : 18; /* [31:14] reserved */
+ } bits;
+
+ u32 value;
+} pangea_adapt_bitmap_u;
+
+#define PANGEA_ADAPT_GET 0x0
+#define PANGEA_ADAPT_SET 0x1
+struct mag_cmd_set_pangea_adapt {
+ struct mgmt_msg_head head;
+
+ u16 port_id;
+ u8 opcode; /* 0:get adapt info 1:cfg adapt info */
+ u8 wire_type;
+
+ pangea_adapt_bitmap_u cfg_bitmap;
+ pangea_adapt_bitmap_u cur_bitmap;
+ u32 rsvd1[3];
+};
+
+struct mag_cmd_cfg_bios_link_cfg {
+ struct mgmt_msg_head head;
+
+ u8 port_id;
+ u8 opcode; /* 0:get bios link info 1:set bios link cfg */
+ u8 clear;
+ u8 rsvd0;
+
+ u32 wire_type;
+ u8 an_en;
+ u8 speed;
+ u8 fec;
+ u8 rsvd1;
+ u32 speed_mode;
+ u32 rsvd2[3];
+};
+
+struct mag_cmd_restore_link_cfg {
+ struct mgmt_msg_head head;
+
+ u8 port_id;
+ u8 rsvd[7];
+};
+
+struct mag_cmd_activate_bios_link_cfg {
+ struct mgmt_msg_head head;
+
+ u32 rsvd[8];
+};
+
+/* led type */
+enum mag_led_type {
+ MAG_CMD_LED_TYPE_ALARM = 0x0,
+ MAG_CMD_LED_TYPE_LOW_SPEED = 0x1,
+ MAG_CMD_LED_TYPE_HIGH_SPEED = 0x2
+};
+
+/* led mode */
+enum mag_led_mode {
+ MAG_CMD_LED_MODE_DEFAULT = 0x0,
+ MAG_CMD_LED_MODE_FORCE_ON = 0x1,
+ MAG_CMD_LED_MODE_FORCE_OFF = 0x2,
+ MAG_CMD_LED_MODE_FORCE_BLINK_1HZ = 0x3,
+ MAG_CMD_LED_MODE_FORCE_BLINK_2HZ = 0x4,
+ MAG_CMD_LED_MODE_FORCE_BLINK_4HZ = 0x5,
+ MAG_CMD_LED_MODE_1HZ = 0x6,
+ MAG_CMD_LED_MODE_2HZ = 0x7,
+ MAG_CMD_LED_MODE_4HZ = 0x8
+};
+
+/* the LED reports an alarm when any PF of the port is in the alarm state */
+struct mag_cmd_set_led_cfg {
+ struct mgmt_msg_head head;
+
+ u16 function_id;
+ u8 type;
+ u8 mode;
+};
+
+#define XSFP_INFO_MAX_SIZE 640
+/* xsfp wire type, refer to cmis protocol definition */
+enum mag_wire_type {
+ MAG_CMD_WIRE_TYPE_UNKNOWN = 0x0,
+ MAG_CMD_WIRE_TYPE_MM = 0x1,
+ MAG_CMD_WIRE_TYPE_SM = 0x2,
+ MAG_CMD_WIRE_TYPE_COPPER = 0x3,
+ MAG_CMD_WIRE_TYPE_ACC = 0x4,
+ MAG_CMD_WIRE_TYPE_BASET = 0x5,
+ MAG_CMD_WIRE_TYPE_AOC = 0x40,
+ MAG_CMD_WIRE_TYPE_ELECTRIC = 0x41,
+ MAG_CMD_WIRE_TYPE_BACKPLANE = 0x42
+};
+
+struct mag_cmd_get_xsfp_info {
+ struct mgmt_msg_head head;
+
+ u8 port_id;
+ u8 wire_type;
+ u16 out_len;
+ u32 rsvd;
+ u8 sfp_info[XSFP_INFO_MAX_SIZE];
+};
+
+#define MAG_CMD_XSFP_DISABLE 0x0
+#define MAG_CMD_XSFP_ENABLE 0x1
+/* the sfp is disabled only when all PFs of the port set the sfp down;
+ * if any PF is enabled, the sfp is enabled
+ */
+struct mag_cmd_set_xsfp_enable {
+ struct mgmt_msg_head head;
+
+ u32 port_id;
+ u32 status; /* 0:on 1:off */
+};
+
+#define MAG_CMD_XSFP_PRESENT 0x0
+#define MAG_CMD_XSFP_ABSENT 0x1
+struct mag_cmd_get_xsfp_present {
+ struct mgmt_msg_head head;
+
+ u8 port_id;
+ u8 abs_status; /* 0:present, 1:absent */
+ u8 rsvd[2];
+};
+
+#define MAG_CMD_XSFP_READ 0x0
+#define MAG_CMD_XSFP_WRITE 0x1
+struct mag_cmd_set_xsfp_rw {
+ struct mgmt_msg_head head;
+
+ u8 port_id;
+ u8 operation; /* 0: read; 1: write */
+ u8 value;
+ u8 rsvd0;
+ u32 devaddr;
+ u32 offset;
+ u32 rsvd1;
+};
+
+struct mag_cmd_cfg_xsfp_temperature {
+ struct mgmt_msg_head head;
+
+ u8 opcode; /* 0:read 1:write */
+ u8 rsvd0[3];
+ s32 max_temp;
+ s32 min_temp;
+};
+
+struct mag_cmd_get_xsfp_temperature {
+ struct mgmt_msg_head head;
+
+ s16 sfp_temp[8];
+ u8 rsvd[32];
+ s32 max_temp;
+ s32 min_temp;
+};
+
+/* xsfp plug event */
+struct mag_cmd_wire_event {
+ struct mgmt_msg_head head;
+
+ u8 port_id;
+ u8 status; /* 0:present, 1:absent */
+ u8 rsvd[2];
+};
+
+/* link err type definition */
+#define MAG_CMD_ERR_XSFP_UNKNOWN 0x0
+struct mag_cmd_link_err_event {
+ struct mgmt_msg_head head;
+
+ u8 port_id;
+ u8 link_err_type;
+ u8 rsvd[2];
+};
+
+#define MAG_PARAM_TYPE_DEFAULT_CFG 0x0
+#define MAG_PARAM_TYPE_BIOS_CFG 0x1
+#define MAG_PARAM_TYPE_TOOL_CFG 0x2
+#define MAG_PARAM_TYPE_FINAL_CFG 0x3
+#define MAG_PARAM_TYPE_WIRE_INFO 0x4
+#define MAG_PARAM_TYPE_ADAPT_INFO 0x5
+#define MAG_PARAM_TYPE_MAX_CNT 0x6
+struct param_head {
+ u8 valid_len;
+ u8 info_type;
+ u8 rsvd[2];
+};
+
+struct mag_port_link_param {
+ struct param_head head;
+
+ u8 an;
+ u8 fec;
+ u8 speed;
+ u8 rsvd0;
+
+ u32 used;
+ u32 an_fec_ability;
+ u32 an_speed_ability;
+ u32 an_pause_ability;
+};
+
+struct mag_port_wire_info {
+ struct param_head head;
+
+ u8 status;
+ u8 rsvd0[3];
+
+ u8 wire_type;
+ u8 default_fec;
+ u8 speed;
+ u8 rsvd1;
+ u32 speed_ability;
+};
+
+struct mag_port_adapt_info {
+ struct param_head head;
+
+ u32 adapt_en;
+ u32 flash_adapt;
+ u32 rsvd0[2];
+
+ u32 wire_node;
+ u32 an_en;
+ u32 speed;
+ u32 fec;
+};
+
+struct mag_port_param_info {
+ u8 parameter_cnt;
+ u8 lane_id;
+ u8 lane_num;
+ u8 rsvd0;
+
+ struct mag_port_link_param default_cfg;
+ struct mag_port_link_param bios_cfg;
+ struct mag_port_link_param tool_cfg;
+ struct mag_port_link_param final_cfg;
+
+ struct mag_port_wire_info wire_info;
+ struct mag_port_adapt_info adapt_info;
+};
+
+#define XSFP_VENDOR_NAME_LEN 16
+struct mag_cmd_event_port_info {
+ struct mgmt_msg_head head;
+
+ u8 port_id;
+ u8 event_type;
+ u8 rsvd0[2];
+
+ // optical module information
+ u8 vendor_name[XSFP_VENDOR_NAME_LEN];
+ u32 port_type; /* fiber / copper */
+ u32 port_sub_type; /* sr / lr */
+ u32 cable_length; /* 1/3/5m */
+ u8 cable_temp; /* temperature */
+ u8 max_speed; /* max speed of the optical module */
+ u8 sfp_type; /* sfp/qsfp */
+ u8 rsvd1;
+ u32 power[4]; /* optical power */
+
+ u8 an_state;
+ u8 fec;
+ u16 speed;
+
+ u8 gpio_insert; /* 0:present 1:absent */
+ u8 alos;
+ u8 rx_los;
+ u8 pma_ctrl;
+
+ u32 pma_fifo_reg;
+ u32 pma_signal_ok_reg;
+ u32 pcs_64_66b_reg;
+ u32 rf_lf;
+ u8 pcs_link;
+ u8 pcs_mac_link;
+ u8 tx_enable;
+ u8 rx_enable;
+ u32 pcs_err_cnt;
+
+ u8 eq_data[38];
+ u8 rsvd2[2];
+
+ u32 his_link_machine_state;
+ u32 cur_link_machine_state;
+ u8 his_machine_state_data[128];
+ u8 cur_machine_state_data[128];
+ u8 his_machine_state_length;
+ u8 cur_machine_state_length;
+
+ struct mag_port_param_info param_info;
+ u8 rsvd3[360];
+};
+
+struct mag_cmd_port_stats {
+ u64 mac_tx_fragment_pkt_num;
+ u64 mac_tx_undersize_pkt_num;
+ u64 mac_tx_undermin_pkt_num;
+ u64 mac_tx_64_oct_pkt_num;
+ u64 mac_tx_65_127_oct_pkt_num;
+ u64 mac_tx_128_255_oct_pkt_num;
+ u64 mac_tx_256_511_oct_pkt_num;
+ u64 mac_tx_512_1023_oct_pkt_num;
+ u64 mac_tx_1024_1518_oct_pkt_num;
+ u64 mac_tx_1519_2047_oct_pkt_num;
+ u64 mac_tx_2048_4095_oct_pkt_num;
+ u64 mac_tx_4096_8191_oct_pkt_num;
+ u64 mac_tx_8192_9216_oct_pkt_num;
+ u64 mac_tx_9217_12287_oct_pkt_num;
+ u64 mac_tx_12288_16383_oct_pkt_num;
+ u64 mac_tx_1519_max_bad_pkt_num;
+ u64 mac_tx_1519_max_good_pkt_num;
+ u64 mac_tx_oversize_pkt_num;
+ u64 mac_tx_jabber_pkt_num;
+ u64 mac_tx_bad_pkt_num;
+ u64 mac_tx_bad_oct_num;
+ u64 mac_tx_good_pkt_num;
+ u64 mac_tx_good_oct_num;
+ u64 mac_tx_total_pkt_num;
+ u64 mac_tx_total_oct_num;
+ u64 mac_tx_uni_pkt_num;
+ u64 mac_tx_multi_pkt_num;
+ u64 mac_tx_broad_pkt_num;
+ u64 mac_tx_pause_num;
+ u64 mac_tx_pfc_pkt_num;
+ u64 mac_tx_pfc_pri0_pkt_num;
+ u64 mac_tx_pfc_pri1_pkt_num;
+ u64 mac_tx_pfc_pri2_pkt_num;
+ u64 mac_tx_pfc_pri3_pkt_num;
+ u64 mac_tx_pfc_pri4_pkt_num;
+ u64 mac_tx_pfc_pri5_pkt_num;
+ u64 mac_tx_pfc_pri6_pkt_num;
+ u64 mac_tx_pfc_pri7_pkt_num;
+ u64 mac_tx_control_pkt_num;
+ u64 mac_tx_err_all_pkt_num;
+ u64 mac_tx_from_app_good_pkt_num;
+ u64 mac_tx_from_app_bad_pkt_num;
+
+ u64 mac_rx_fragment_pkt_num;
+ u64 mac_rx_undersize_pkt_num;
+ u64 mac_rx_undermin_pkt_num;
+ u64 mac_rx_64_oct_pkt_num;
+ u64 mac_rx_65_127_oct_pkt_num;
+ u64 mac_rx_128_255_oct_pkt_num;
+ u64 mac_rx_256_511_oct_pkt_num;
+ u64 mac_rx_512_1023_oct_pkt_num;
+ u64 mac_rx_1024_1518_oct_pkt_num;
+ u64 mac_rx_1519_2047_oct_pkt_num;
+ u64 mac_rx_2048_4095_oct_pkt_num;
+ u64 mac_rx_4096_8191_oct_pkt_num;
+ u64 mac_rx_8192_9216_oct_pkt_num;
+ u64 mac_rx_9217_12287_oct_pkt_num;
+ u64 mac_rx_12288_16383_oct_pkt_num;
+ u64 mac_rx_1519_max_bad_pkt_num;
+ u64 mac_rx_1519_max_good_pkt_num;
+ u64 mac_rx_oversize_pkt_num;
+ u64 mac_rx_jabber_pkt_num;
+ u64 mac_rx_bad_pkt_num;
+ u64 mac_rx_bad_oct_num;
+ u64 mac_rx_good_pkt_num;
+ u64 mac_rx_good_oct_num;
+ u64 mac_rx_total_pkt_num;
+ u64 mac_rx_total_oct_num;
+ u64 mac_rx_uni_pkt_num;
+ u64 mac_rx_multi_pkt_num;
+ u64 mac_rx_broad_pkt_num;
+ u64 mac_rx_pause_num;
+ u64 mac_rx_pfc_pkt_num;
+ u64 mac_rx_pfc_pri0_pkt_num;
+ u64 mac_rx_pfc_pri1_pkt_num;
+ u64 mac_rx_pfc_pri2_pkt_num;
+ u64 mac_rx_pfc_pri3_pkt_num;
+ u64 mac_rx_pfc_pri4_pkt_num;
+ u64 mac_rx_pfc_pri5_pkt_num;
+ u64 mac_rx_pfc_pri6_pkt_num;
+ u64 mac_rx_pfc_pri7_pkt_num;
+ u64 mac_rx_control_pkt_num;
+ u64 mac_rx_sym_err_pkt_num;
+ u64 mac_rx_fcs_err_pkt_num;
+ u64 mac_rx_send_app_good_pkt_num;
+ u64 mac_rx_send_app_bad_pkt_num;
+ u64 mac_rx_unfilter_pkt_num;
+};
+
+struct mag_cmd_port_stats_info {
+ struct mgmt_msg_head head;
+
+ u8 port_id;
+ u8 rsvd0[3];
+};
+
+struct mag_cmd_get_port_stat {
+ struct mgmt_msg_head head;
+
+ struct mag_cmd_port_stats counter;
+ u64 rsvd1[15];
+};
+
+struct mag_cmd_get_pcs_err_cnt {
+ struct mgmt_msg_head head;
+
+ u8 port_id;
+ u8 rsvd0[3];
+
+ u32 pcs_err_cnt;
+};
+
+struct mag_cmd_get_mag_cnt {
+ struct mgmt_msg_head head;
+
+ u8 port_id;
+ u8 len;
+ u8 rsvd0[2];
+
+ u32 mag_csr[128];
+};
+
+struct mag_cmd_dump_antrain_info {
+ struct mgmt_msg_head head;
+
+ u8 port_id;
+ u8 len;
+ u8 rsvd0[2];
+
+ u32 antrain_csr[256];
+};
+
+#define MAG_SFP_PORT_NUM 24
+/* chip optical module temperature structure definitions */
+struct mag_cmd_sfp_temp_in_info {
+ struct mgmt_msg_head head; /* 8B */
+ u8 opt_type; /* 0:read operation 1:cfg operation */
+ u8 rsv[3];
+ s32 max_temp; /* optical module temperature threshold */
+ s32 min_temp; /* optical module temperature threshold */
+};
+
+struct mag_cmd_sfp_temp_out_info {
+ struct mgmt_msg_head head; /* 8B */
+ s16 sfp_temp_data[MAG_SFP_PORT_NUM]; /* temperatures read back */
+ s32 max_temp; /* optical module temperature threshold */
+ s32 min_temp; /* optical module temperature threshold */
+};
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/mgmt_msg_base.h b/drivers/net/ethernet/huawei/hinic3/mgmt_msg_base.h
new file mode 100644
index 000000000000..257bf6761df0
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/mgmt_msg_base.h
@@ -0,0 +1,27 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) Huawei Technologies Co., Ltd. 2021-2022. All rights reserved.
+ * File Name : mgmt_msg_base.h
+ * Version : Initial Draft
+ * Created : 2021/6/28
+ * Last Modified :
+ * Description : COMM Command interfaces between Driver and MPU
+ * Function List :
+ */
+
+#ifndef MGMT_MSG_BASE_H
+#define MGMT_MSG_BASE_H
+
+#define MGMT_MSG_CMD_OP_SET 1
+#define MGMT_MSG_CMD_OP_GET 0
+
+#define MGMT_MSG_CMD_OP_START 1
+#define MGMT_MSG_CMD_OP_STOP 0
+
+struct mgmt_msg_head {
+ u8 status;
+ u8 version;
+ u8 rsvd0[6];
+};
+
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/nic_cfg_comm.h b/drivers/net/ethernet/huawei/hinic3/nic_cfg_comm.h
new file mode 100644
index 000000000000..9fb4232716da
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/nic_cfg_comm.h
@@ -0,0 +1,63 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C), 2001-2021, Huawei Tech. Co., Ltd.
+ * File Name : nic_cfg_comm.h
+ * Version : Initial Draft
+ * Description : nic config common header file
+ * Function List :
+ * History :
+ * Modification: Created file
+ */
+
+#ifndef NIC_CFG_COMM_H
+#define NIC_CFG_COMM_H
+
+/* rss */
+#define HINIC3_RSS_TYPE_VALID_SHIFT 23
+#define HINIC3_RSS_TYPE_TCP_IPV6_EXT_SHIFT 24
+#define HINIC3_RSS_TYPE_IPV6_EXT_SHIFT 25
+#define HINIC3_RSS_TYPE_TCP_IPV6_SHIFT 26
+#define HINIC3_RSS_TYPE_IPV6_SHIFT 27
+#define HINIC3_RSS_TYPE_TCP_IPV4_SHIFT 28
+#define HINIC3_RSS_TYPE_IPV4_SHIFT 29
+#define HINIC3_RSS_TYPE_UDP_IPV6_SHIFT 30
+#define HINIC3_RSS_TYPE_UDP_IPV4_SHIFT 31
+
+#define HINIC3_RSS_TYPE_SET(val, member) (((u32)(val) & 0x1) << HINIC3_RSS_TYPE_##member##_SHIFT)
+#define HINIC3_RSS_TYPE_GET(val, member) (((u32)(val) >> HINIC3_RSS_TYPE_##member##_SHIFT) & 0x1)
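+/* e.g. HINIC3_RSS_TYPE_SET(1, TCP_IPV4) yields bit 28 set in the rss type bitmap */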
+
+enum nic_rss_hash_type {
+ NIC_RSS_HASH_TYPE_XOR = 0,
+ NIC_RSS_HASH_TYPE_TOEP,
+
+ NIC_RSS_HASH_TYPE_MAX /* MUST BE THE LAST ONE */
+};
+
+#define NIC_RSS_INDIR_SIZE 256
+#define NIC_RSS_KEY_SIZE 40
+
+/*
+ * Definition of the NIC receiving mode
+ */
+#define NIC_RX_MODE_UC 0x01
+#define NIC_RX_MODE_MC 0x02
+#define NIC_RX_MODE_BC 0x04
+#define NIC_RX_MODE_MC_ALL 0x08
+#define NIC_RX_MODE_PROMISC 0x10
+
+/* IEEE 802.1Qaz std */
+#define NIC_DCB_COS_MAX 0x8
+#define NIC_DCB_UP_MAX 0x8
+#define NIC_DCB_TC_MAX 0x8
+#define NIC_DCB_PG_MAX 0x8
+#define NIC_DCB_TSA_SP 0x0
+#define NIC_DCB_TSA_CBS 0x1 /* hi1822 do NOT support */
+#define NIC_DCB_TSA_ETS 0x2
+#define NIC_DCB_DSCP_NUM 0x8
+#define NIC_DCB_IP_PRI_MAX 0x40
+
+#define NIC_DCB_PRIO_DWRR 0x0
+#define NIC_DCB_PRIO_STRICT 0x1
+
+#define NIC_DCB_MAX_PFC_NUM 0x4
+#endif
diff --git a/drivers/net/ethernet/huawei/hinic3/ossl_knl.h b/drivers/net/ethernet/huawei/hinic3/ossl_knl.h
new file mode 100644
index 000000000000..fe14371c8f35
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/ossl_knl.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef OSSL_KNL_H
+#define OSSL_KNL_H
+
+#include "ossl_knl_linux.h"
+
+#define sdk_err(dev, format, ...) dev_err(dev, "[COMM]" format, ##__VA_ARGS__)
+#define sdk_warn(dev, format, ...) dev_warn(dev, "[COMM]" format, ##__VA_ARGS__)
+#define sdk_notice(dev, format, ...) dev_notice(dev, "[COMM]" format, ##__VA_ARGS__)
+#define sdk_info(dev, format, ...) dev_info(dev, "[COMM]" format, ##__VA_ARGS__)
+
+#define nic_err(dev, format, ...) dev_err(dev, "[NIC]" format, ##__VA_ARGS__)
+#define nic_warn(dev, format, ...) dev_warn(dev, "[NIC]" format, ##__VA_ARGS__)
+#define nic_notice(dev, format, ...) dev_notice(dev, "[NIC]" format, ##__VA_ARGS__)
+#define nic_info(dev, format, ...) dev_info(dev, "[NIC]" format, ##__VA_ARGS__)
+
+#ifndef BIG_ENDIAN
+#define BIG_ENDIAN 0x4321
+#endif
+
+#ifndef LITTLE_ENDIAN
+#define LITTLE_ENDIAN 0x1234
+#endif
+
+#ifdef BYTE_ORDER
+#undef BYTE_ORDER
+#endif
+/* X86 */
+#define BYTE_ORDER LITTLE_ENDIAN
+#define USEC_PER_MSEC 1000L
+#define MSEC_PER_SEC 1000L
+
+#endif /* OSSL_KNL_H */
diff --git a/drivers/net/ethernet/huawei/hinic3/ossl_knl_linux.h b/drivers/net/ethernet/huawei/hinic3/ossl_knl_linux.h
new file mode 100644
index 000000000000..1bda9e99355c
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic3/ossl_knl_linux.h
@@ -0,0 +1,284 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2021 Huawei Technologies Co., Ltd */
+
+#ifndef OSSL_KNL_LINUX_H_
+#define OSSL_KNL_LINUX_H_
+
+#include <net/checksum.h>
+#include <net/ipv6.h>
+#include <linux/string.h>
+#include <linux/pci.h>
+#include <linux/device.h>
+#include <linux/version.h>
+#include <linux/ethtool.h>
+#include <linux/fs.h>
+#include <linux/kthread.h>
+#include <linux/if_vlan.h>
+#include <linux/udp.h>
+#include <linux/highmem.h>
+#include <linux/list.h>
+#include <linux/bitmap.h>
+#include <linux/slab.h>
+
+#ifndef NETIF_F_SCTP_CSUM
+#define NETIF_F_SCTP_CSUM 0
+#endif
+
+#ifndef __GFP_COLD
+#define __GFP_COLD 0
+#endif
+
+#ifndef __GFP_COMP
+#define __GFP_COMP 0
+#endif
+
+#undef __always_unused
+#define __always_unused __attribute__((__unused__))
+
+#define ETH_TYPE_TRANS_SETS_DEV
+#define HAVE_NETDEV_STATS_IN_NETDEV
+
+#ifndef HAVE_SET_RX_MODE
+#define HAVE_SET_RX_MODE
+#endif
+
+#define HAVE_INET6_IFADDR_LIST
+
+#define HAVE_NDO_GET_STATS64
+
+#ifndef HAVE_MQPRIO
+#define HAVE_MQPRIO
+#endif
+
+#ifndef HAVE_SETUP_TC
+#define HAVE_SETUP_TC
+#endif
+
+#ifndef HAVE_NDO_SET_FEATURES
+#define HAVE_NDO_SET_FEATURES
+#endif
+
+#define HAVE_IRQ_AFFINITY_NOTIFY
+
+#define HAVE_ETHTOOL_SET_PHYS_ID
+
+#define HAVE_NETDEV_WANTED_FEAUTES
+
+#ifndef HAVE_PCI_DEV_FLAGS_ASSIGNED
+#define HAVE_PCI_DEV_FLAGS_ASSIGNED
+#define HAVE_VF_SPOOFCHK_CONFIGURE
+#endif
+
+#ifndef HAVE_SKB_L4_RXHASH
+#define HAVE_SKB_L4_RXHASH
+#endif
+
+#define HAVE_ETHTOOL_GRXFHINDIR_SIZE
+#define HAVE_INT_NDO_VLAN_RX_ADD_VID
+
+#ifdef ETHTOOL_SRXNTUPLE
+#undef ETHTOOL_SRXNTUPLE
+#endif
+
+#define _kc_kmap_atomic(page) kmap_atomic(page)
+#define _kc_kunmap_atomic(addr) kunmap_atomic(addr)
+
+#include <linux/of_net.h>
+
+#define HAVE_FDB_OPS
+#define HAVE_ETHTOOL_GET_TS_INFO
+#define HAVE_NAPI_GRO_FLUSH_OLD
+
+#ifndef HAVE_SRIOV_CONFIGURE
+#define HAVE_SRIOV_CONFIGURE
+#endif
+
+#define HAVE_ENCAP_TSO_OFFLOAD
+#define HAVE_SKB_INNER_NETWORK_HEADER
+
+#define HAVE_NDO_SET_VF_LINK_STATE
+#define HAVE_SKB_INNER_PROTOCOL
+#define HAVE_MPLS_FEATURES
+
+#define HAVE_NDO_GET_PHYS_PORT_ID
+#define HAVE_NETIF_SET_XPS_QUEUE_CONST_MASK
+
+#define HAVE_VXLAN_CHECKS
+#define HAVE_NDO_SELECT_QUEUE_ACCEL
+#define HAVE_NET_GET_RANDOM_ONCE
+#define HAVE_HWMON_DEVICE_REGISTER_WITH_GROUPS
+
+#define HAVE_NDO_SELECT_QUEUE_ACCEL_FALLBACK
+
+#define HAVE_NDO_SET_VF_MIN_MAX_TX_RATE
+#define HAVE_VLAN_FIND_DEV_DEEP_RCU
+
+#define HAVE_SKBUFF_CSUM_LEVEL
+#define HAVE_MULTI_VLAN_OFFLOAD_EN
+#define HAVE_ETH_GET_HEADLEN_FUNC
+
+#define HAVE_RXFH_HASHFUNC
+
+#define HAVE_NDO_SET_VF_TRUST
+
+#include <net/devlink.h>
+
+#define HAVE_IO_MAP_WC_SIZE
+
+#define HAVE_NETDEVICE_MIN_MAX_MTU
+
+#define HAVE_VOID_NDO_GET_STATS64
+#define HAVE_VM_OPS_FAULT_NO_VMA
+
+#define HAVE_HWTSTAMP_FILTER_NTP_ALL
+#define HAVE_NDO_SETUP_TC_CHAIN_INDEX
+#define HAVE_PCI_ERROR_HANDLER_RESET_PREPARE
+#define HAVE_PTP_CLOCK_DO_AUX_WORK
+
+#define HAVE_NDO_SETUP_TC_REMOVE_TC_TO_NETDEV
+
+#define HAVE_XDP_SUPPORT
+
+#define HAVE_NDO_BPF_NETDEV_BPF
+#define HAVE_TIMER_SETUP
+#define HAVE_XDP_DATA_META
+
+#define HAVE_MACRO_VM_FAULT_T
+
+#define HAVE_NDO_SELECT_QUEUE_SB_DEV
+
+#define dev_open(x) dev_open(x, NULL)
+#define HAVE_NEW_ETHTOOL_LINK_SETTINGS_ONLY
+
+#ifndef get_ds
+#define get_ds() (KERNEL_DS)
+#endif
+
+#ifndef dma_zalloc_coherent
+#define dma_zalloc_coherent(d, s, h, f) _hinic3_dma_zalloc_coherent(d, s, h, f)
+static inline void *_hinic3_dma_zalloc_coherent(struct device *dev,
+ size_t size, dma_addr_t *dma_handle,
+ gfp_t gfp)
+{
+ /* Since kernel 5.0, dma_alloc_coherent() zeroes the memory on all
+ * architectures, and dma_zalloc_coherent() became a no-op wrapper
+ * around dma_alloc_coherent().
+ */
+ return dma_alloc_coherent(dev, size, dma_handle, gfp);
+}
+#endif
+
+struct timeval {
+ __kernel_old_time_t tv_sec; /* seconds */
+ __kernel_suseconds_t tv_usec; /* microseconds */
+};
+
+#ifndef do_gettimeofday
+#define do_gettimeofday(time) _kc_do_gettimeofday(time)
+static inline void _kc_do_gettimeofday(struct timeval *tv)
+{
+ struct timespec64 ts;
+
+ ktime_get_real_ts64(&ts);
+ tv->tv_sec = ts.tv_sec;
+ tv->tv_usec = ts.tv_nsec / NSEC_PER_USEC;
+}
+#endif
+
+#define HAVE_NDO_SELECT_QUEUE_SB_DEV_ONLY
+#define ETH_GET_HEADLEN_NEED_DEV
+#define HAVE_GENL_OPS_FIELD_VALIDATE
+
+#ifndef FIELD_SIZEOF
+#define FIELD_SIZEOF(t, f) (sizeof(((t *)0)->f))
+#endif
+
+#define HAVE_DEVLINK_FLASH_UPDATE_PARAMS
+
+#ifndef rtc_time_to_tm
+#define rtc_time_to_tm rtc_time64_to_tm
+#endif
+#define HAVE_NDO_TX_TIMEOUT_TXQ
+#define HAVE_PROC_OPS
+
+#define SUPPORTED_COALESCE_PARAMS
+
+#ifndef pci_cleanup_aer_uncorrect_error_status
+#define pci_cleanup_aer_uncorrect_error_status pci_aer_clear_nonfatal_status
+#endif
+
+#define HAVE_XDP_FRAME_SZ
+
+#define HAVE_DEVLINK_FW_FILE_NAME_MEMBER
+
+#define HAVE_ENCAPSULATION_TSO
+
+#define HAVE_ENCAPSULATION_CSUM
+
+#ifndef netdev_hw_addr_list_for_each
+#define netdev_hw_addr_list_for_each(ha, l) \
+ list_for_each_entry(ha, &(l)->list, list)
+#endif
+
+#define spin_lock_deinit(lock)
+
+struct file *file_creat(const char *file_name);
+
+struct file *file_open(const char *file_name);
+
+void file_close(struct file *file_handle);
+
+u32 get_file_size(struct file *file_handle);
+
+void set_file_position(struct file *file_handle, u32 position);
+
+int file_read(struct file *file_handle, char *log_buffer, u32 rd_length,
+ u32 *file_pos);
+
+u32 file_write(struct file *file_handle, const char *log_buffer, u32 wr_length);
+
+struct sdk_thread_info {
+ struct task_struct *thread_obj;
+ char *name;
+ void (*thread_fn)(void *x);
+ void *thread_event;
+ void *data;
+};
+
+int creat_thread(struct sdk_thread_info *thread_info);
+
+void stop_thread(struct sdk_thread_info *thread_info);
+
+#define destroy_work(work)
+
+void utctime_to_localtime(u64 utctime, u64 *localtime);
+
+#ifndef HAVE_TIMER_SETUP
+void initialize_timer(const void *adapter_hdl, struct timer_list *timer);
+#endif
+
+void add_to_timer(struct timer_list *timer, long period);
+void stop_timer(struct timer_list *timer);
+void delete_timer(struct timer_list *timer);
+u64 ossl_get_real_time(void);
+
+#define nicif_err(priv, type, dev, fmt, args...) \
+ netif_level(err, priv, type, dev, "[NIC]" fmt, ##args)
+#define nicif_warn(priv, type, dev, fmt, args...) \
+ netif_level(warn, priv, type, dev, "[NIC]" fmt, ##args)
+#define nicif_notice(priv, type, dev, fmt, args...) \
+ netif_level(notice, priv, type, dev, "[NIC]" fmt, ##args)
+#define nicif_info(priv, type, dev, fmt, args...) \
+ netif_level(info, priv, type, dev, "[NIC]" fmt, ##args)
+#define nicif_dbg(priv, type, dev, fmt, args...) \
+ netif_level(dbg, priv, type, dev, "[NIC]" fmt, ##args)
+
+#define destroy_completion(completion)
+#define sema_deinit(lock)
+#define mutex_deinit(lock)
+#define rwlock_deinit(lock)
+
+#define tasklet_state(tasklet) ((tasklet)->state)
+
+#endif
--
2.24.0
1
0
18 May '23
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6WAZX
--------------------------------
When the number of cores is greater than the number of ECMDQs, each NUMA
node is assigned fewer ECMDQs than it has cores. Iterating over only the
first smmu->nr_ecmdq CPU indices therefore does not visit every ECMDQ.
For example:
	-----------------------------------------
	|       Node0       |       Node1       |
	|-------------------+-------------------|
	|  0    1    2    3 |  4    5    6    7 | CPU ID
	|-------------------+-------------------|
	|    0       1      |    2       3      | ECMDQ ID
	-----------------------------------------
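As a rough illustration, here is a small self-contained userspace sketch
(not driver code; the array below merely encodes the example topology
above) showing that walking per-CPU slots 0..nr_ecmdq-1 only reaches the
ECMDQs owned by the first nr_ecmdq CPUs:
	#include <stdio.h>

	/* Example topology from above: 8 CPUs, 4 ECMDQs, two NUMA nodes;
	 * two CPUs share each ECMDQ. ecmdq_of_cpu[cpu] is the ECMDQ that
	 * the layout code assigned to each CPU.
	 */
	static const int nr_ecmdq = 4;
	static const int ecmdq_of_cpu[8] = { 0, 0, 1, 1, 2, 2, 3, 3 };

	int main(void)
	{
		int seen[4] = { 0 };
		int cpu, q, covered = 0;

		/* The old (buggy) walk: per-CPU slots 0..nr_ecmdq-1 only */
		for (cpu = 0; cpu < nr_ecmdq; cpu++)
			seen[ecmdq_of_cpu[cpu]] = 1;

		for (q = 0; q < nr_ecmdq; q++)
			covered += seen[q];

		/* Prints "covered 2 of 4": ECMDQs 2 and 3 on Node1 are never
		 * visited, which is what this patch fixes by walking all
		 * possible CPUs and skipping CPUs that do not own their ECMDQ.
		 */
		printf("covered %d of %d\n", covered, nr_ecmdq);
		return 0;
	}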
Fixes: 3965519baff5 ("iommu/arm-smmu-v3: Add support for less than one ECMDQ per core")
Signed-off-by: Zhen Lei <thunder.leizhen(a)huawei.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 114 ++++++++++++--------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +-
2 files changed, 73 insertions(+), 44 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 8064c5da79612f8..5793b51d44750cb 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -387,7 +387,7 @@ static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu)
if (smmu->ecmdq_enabled) {
struct arm_smmu_ecmdq *ecmdq;
- ecmdq = *this_cpu_ptr(smmu->ecmdq);
+ ecmdq = *this_cpu_ptr(smmu->ecmdqs);
return &ecmdq->cmdq;
}
@@ -486,7 +486,7 @@ static void arm_smmu_ecmdq_skip_err(struct arm_smmu_device *smmu)
for (i = 0; i < smmu->nr_ecmdq; i++) {
unsigned long flags;
- ecmdq = *per_cpu_ptr(smmu->ecmdq, i);
+ ecmdq = *per_cpu_ptr(smmu->ecmdqs, i);
q = &ecmdq->cmdq.q;
prod = readl_relaxed(q->prod_reg);
@@ -4925,9 +4925,50 @@ static int arm_smmu_device_disable(struct arm_smmu_device *smmu)
return ret;
}
+static int arm_smmu_ecmdq_reset(struct arm_smmu_device *smmu)
+{
+ int i, cpu, ret = 0;
+ u32 reg;
+
+ if (!smmu->nr_ecmdq)
+ return 0;
+
+ i = 0;
+ for_each_possible_cpu(cpu) {
+ struct arm_smmu_ecmdq *ecmdq;
+ struct arm_smmu_queue *q;
+
+ ecmdq = *per_cpu_ptr(smmu->ecmdqs, cpu);
+ if (ecmdq != per_cpu_ptr(smmu->ecmdq, cpu))
+ continue;
+
+ q = &ecmdq->cmdq.q;
+ i++;
+
+ if (WARN_ON(q->llq.prod != q->llq.cons)) {
+ q->llq.prod = 0;
+ q->llq.cons = 0;
+ }
+ writeq_relaxed(q->q_base, ecmdq->base + ARM_SMMU_ECMDQ_BASE);
+ writel_relaxed(q->llq.prod, ecmdq->base + ARM_SMMU_ECMDQ_PROD);
+ writel_relaxed(q->llq.cons, ecmdq->base + ARM_SMMU_ECMDQ_CONS);
+
+ /* enable ecmdq */
+ writel(ECMDQ_PROD_EN | q->llq.prod, q->prod_reg);
+ ret = readl_relaxed_poll_timeout(q->cons_reg, reg, reg & ECMDQ_CONS_ENACK,
+ 1, ARM_SMMU_POLL_TIMEOUT_US);
+ if (ret) {
+ dev_err(smmu->dev, "ecmdq[%d] enable failed\n", i);
+ smmu->ecmdq_enabled = 0;
+ break;
+ }
+ }
+
+ return ret;
+}
+
static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool resume)
{
- int i;
int ret;
u32 reg, enables;
struct arm_smmu_cmdq_ent cmd;
@@ -4975,31 +5016,7 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool resume)
writel_relaxed(smmu->cmdq.q.llq.prod, smmu->base + ARM_SMMU_CMDQ_PROD);
writel_relaxed(smmu->cmdq.q.llq.cons, smmu->base + ARM_SMMU_CMDQ_CONS);
- for (i = 0; i < smmu->nr_ecmdq; i++) {
- struct arm_smmu_ecmdq *ecmdq;
- struct arm_smmu_queue *q;
-
- ecmdq = *per_cpu_ptr(smmu->ecmdq, i);
- q = &ecmdq->cmdq.q;
-
- if (WARN_ON(q->llq.prod != q->llq.cons)) {
- q->llq.prod = 0;
- q->llq.cons = 0;
- }
- writeq_relaxed(q->q_base, ecmdq->base + ARM_SMMU_ECMDQ_BASE);
- writel_relaxed(q->llq.prod, ecmdq->base + ARM_SMMU_ECMDQ_PROD);
- writel_relaxed(q->llq.cons, ecmdq->base + ARM_SMMU_ECMDQ_CONS);
-
- /* enable ecmdq */
- writel(ECMDQ_PROD_EN | q->llq.prod, q->prod_reg);
- ret = readl_relaxed_poll_timeout(q->cons_reg, reg, reg & ECMDQ_CONS_ENACK,
- 1, ARM_SMMU_POLL_TIMEOUT_US);
- if (ret) {
- dev_err(smmu->dev, "ecmdq[%d] enable failed\n", i);
- smmu->ecmdq_enabled = 0;
- break;
- }
- }
+ arm_smmu_ecmdq_reset(smmu);
enables = CR0_CMDQEN;
ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
@@ -5099,10 +5116,11 @@ static int arm_smmu_ecmdq_layout(struct arm_smmu_device *smmu)
ecmdq = devm_alloc_percpu(smmu->dev, *ecmdq);
if (!ecmdq)
return -ENOMEM;
+ smmu->ecmdq = ecmdq;
if (num_possible_cpus() <= smmu->nr_ecmdq) {
for_each_possible_cpu(cpu)
- *per_cpu_ptr(smmu->ecmdq, cpu) = per_cpu_ptr(ecmdq, cpu);
+ *per_cpu_ptr(smmu->ecmdqs, cpu) = per_cpu_ptr(ecmdq, cpu);
/* A core requires at most one ECMDQ */
smmu->nr_ecmdq = num_possible_cpus();
@@ -5139,7 +5157,16 @@ static int arm_smmu_ecmdq_layout(struct arm_smmu_device *smmu)
* may be left due to truncation rounding.
*/
nr_ecmdqs[node] = nr_cpus_node(node) * nr_remain / num_possible_cpus();
+ }
+
+ for_each_node(node) {
+ if (!nr_cpus_node(node))
+ continue;
+
nr_remain -= nr_ecmdqs[node];
+
+ /* An ECMDQ has been reserved for each node at above [1] */
+ nr_ecmdqs[node]++;
}
/* Divide the remaining ECMDQs */
@@ -5157,25 +5184,23 @@ static int arm_smmu_ecmdq_layout(struct arm_smmu_device *smmu)
}
for_each_node(node) {
- int i, round, shared = 0;
+ int i, round, shared;
if (!nr_cpus_node(node))
continue;
- /* An ECMDQ has been reserved for each node at above [1] */
- nr_ecmdqs[node]++;
-
+ shared = 0;
if (nr_ecmdqs[node] < nr_cpus_node(node))
shared = 1;
i = 0;
for_each_cpu(cpu, cpumask_of_node(node)) {
round = i % nr_ecmdqs[node];
- if (i++ < nr_ecmdqs[node]) {
+ if (i++ < nr_ecmdqs[node])
ecmdqs[round] = per_cpu_ptr(ecmdq, cpu);
+ else
ecmdqs[round]->cmdq.shared = shared;
- }
- *per_cpu_ptr(smmu->ecmdq, cpu) = ecmdqs[round];
+ *per_cpu_ptr(smmu->ecmdqs, cpu) = ecmdqs[round];
}
}
@@ -5199,6 +5224,8 @@ static int arm_smmu_ecmdq_probe(struct arm_smmu_device *smmu)
numq = 1 << FIELD_GET(IDR6_LOG2NUMQ, reg);
smmu->nr_ecmdq = nump * numq;
gap = ECMDQ_CP_RRESET_SIZE >> FIELD_GET(IDR6_LOG2NUMQ, reg);
+ if (!smmu->nr_ecmdq)
+ return -EOPNOTSUPP;
smmu_dma_base = (vmalloc_to_pfn(smmu->base) << PAGE_SHIFT);
cp_regs = ioremap(smmu_dma_base + ARM_SMMU_ECMDQ_CP_BASE, PAGE_SIZE);
@@ -5231,8 +5258,8 @@ static int arm_smmu_ecmdq_probe(struct arm_smmu_device *smmu)
if (!cp_base)
return -ENOMEM;
- smmu->ecmdq = devm_alloc_percpu(smmu->dev, struct arm_smmu_ecmdq *);
- if (!smmu->ecmdq)
+ smmu->ecmdqs = devm_alloc_percpu(smmu->dev, struct arm_smmu_ecmdq *);
+ if (!smmu->ecmdqs)
return -ENOMEM;
ret = arm_smmu_ecmdq_layout(smmu);
@@ -5246,7 +5273,7 @@ static int arm_smmu_ecmdq_probe(struct arm_smmu_device *smmu)
struct arm_smmu_ecmdq *ecmdq;
struct arm_smmu_queue *q;
- ecmdq = *per_cpu_ptr(smmu->ecmdq, cpu);
+ ecmdq = *per_cpu_ptr(smmu->ecmdqs, cpu);
q = &ecmdq->cmdq.q;
/*
@@ -5254,10 +5281,11 @@ static int arm_smmu_ecmdq_probe(struct arm_smmu_device *smmu)
* CPUs. The CPUs that are not selected are not shown in
* cpumask_of_node(node), their 'ecmdq' may be NULL.
*
- * (q->ecmdq_prod & ECMDQ_PROD_EN) indicates that the ECMDQ is
- * shared by multiple cores and has been initialized.
+ * (ecmdq != per_cpu_ptr(smmu->ecmdq, cpu)) indicates that the
+ * ECMDQ is shared by multiple cores and should be initialized
+ * only by the first owner.
*/
- if (!ecmdq || (q->ecmdq_prod & ECMDQ_PROD_EN))
+ if (!ecmdq || (ecmdq != per_cpu_ptr(smmu->ecmdq, cpu)))
continue;
ecmdq->base = cp_base + addr;
@@ -5700,7 +5728,7 @@ static int arm_smmu_ecmdq_disable(struct device *dev)
struct arm_smmu_device *smmu = dev_get_drvdata(dev);
for (i = 0; i < smmu->nr_ecmdq; i++) {
- ecmdq = *per_cpu_ptr(smmu->ecmdq, i);
+ ecmdq = *per_cpu_ptr(smmu->ecmdqs, i);
q = &ecmdq->cmdq.q;
prod = readl_relaxed(q->prod_reg);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 1dd49bed58df305..3820452bf30210e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -728,7 +728,8 @@ struct arm_smmu_device {
u32 nr_ecmdq;
u32 ecmdq_enabled;
};
- struct arm_smmu_ecmdq *__percpu *ecmdq;
+ struct arm_smmu_ecmdq *__percpu *ecmdqs;
+ struct arm_smmu_ecmdq __percpu *ecmdq;
struct arm_smmu_cmdq cmdq;
struct arm_smmu_evtq evtq;
--
2.25.1
1
0
Hello!
The openEuler Kernel SIG invites you to a Zoom meeting (auto-recorded) to be held at 2023-05-19 14:00.
Subject: openEuler Kernel SIG biweekly meeting
Agenda:
1. Progress update
2. Topic collection in progress
Everyone is welcome to propose topics (reply to this email directly, or add them to the meeting board).
Meeting link: https://us06web.zoom.us/j/87319358677?pwd=dVFTTkYxNTR0TStZUEVsWFZaVmtOUT09
Meeting minutes: https://etherpad.openeuler.org/p/Kernel-meetings
Note: You are advised to change your participant name after joining the meeting, or use your ID at gitee.com.
More information: https://openeuler.org/zh/ and https://openeuler.org/en/
1
0
[PATCH openEuler-22.03-LTS 01/14] uaccess: Add speculation barrier to copy_from_user()
by Jialin Zhang 16 May '23
From: Dave Hansen <dave.hansen(a)linux.intel.com>
stable inclusion
from stable-v5.10.170
commit 3b6ce54cfa2c04f0636fd0c985913af8703b408d
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I71N8L
CVE: CVE-2023-0459
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 74e19ef0ff8061ef55957c3abd71614ef0f42f47 upstream.
The results of "access_ok()" can be mis-speculated. The result is that
you can end up speculatively executing:
	if (access_ok(from, size))
		// Right here
even for bad from/size combinations. On first glance, it would be ideal
to just add a speculation barrier to "access_ok()" so that its results
can never be mis-speculated.
But there are lots of system calls just doing access_ok() via
"copy_to_user()" and friends (example: fstat() and friends). Those are
generally not problematic because they do not _consume_ data from
userspace other than the pointer. They are also very quick and common
system calls that should not be needlessly slowed down.
"copy_from_user()" on the other hand uses a user-controller pointer and
is frequently followed up with code that might affect caches. Take
something like this:
	if (!copy_from_user(&kernelvar, uptr, size))
		do_something_with(kernelvar);
If userspace passes in an evil 'uptr' that *actually* points to a kernel
address, and then do_something_with() has cache (or other)
side-effects, it could allow userspace to infer kernel data values.
Add a barrier to the common copy_from_user() code to prevent
mis-speculated values which happen after the copy.
Also add a stub for architectures that do not define barrier_nospec().
This makes the macro usable in generic code.
Since the barrier is now usable in generic code, the x86 #ifdef in the
BPF code can also go away.
Reported-by: Jordy Zomer <jordyzomer(a)google.com>
Suggested-by: Linus Torvalds <torvalds(a)linuxfoundation.org>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Daniel Borkmann <daniel(a)iogearbox.net> # BPF bits
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Conflicts:
lib/usercopy.c
Signed-off-by: Ma Wupeng <mawupeng1(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Reviewed-by: Nanyong Sun <sunnanyong(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
include/linux/nospec.h | 4 ++++
kernel/bpf/core.c | 2 --
lib/usercopy.c | 7 +++++++
3 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/include/linux/nospec.h b/include/linux/nospec.h
index c1e79f72cd89..9f0af4f116d9 100644
--- a/include/linux/nospec.h
+++ b/include/linux/nospec.h
@@ -11,6 +11,10 @@
struct task_struct;
+#ifndef barrier_nospec
+# define barrier_nospec() do { } while (0)
+#endif
+
/**
* array_index_mask_nospec() - generate a ~0 mask when index < size, 0 otherwise
* @index: array element index
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index fd2aa6b9909e..c18aed60ce40 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1642,9 +1642,7 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
* reuse preexisting logic from Spectre v1 mitigation that
* happens to produce the required code on x86 for v4 as well.
*/
-#ifdef CONFIG_X86
barrier_nospec();
-#endif
CONT;
#define LDST(SIZEOP, SIZE) \
STX_MEM_##SIZEOP: \
diff --git a/lib/usercopy.c b/lib/usercopy.c
index 7413dd300516..7ee63df042d7 100644
--- a/lib/usercopy.c
+++ b/lib/usercopy.c
@@ -3,6 +3,7 @@
#include <linux/fault-inject-usercopy.h>
#include <linux/instrumented.h>
#include <linux/uaccess.h>
+#include <linux/nospec.h>
/* out-of-line parts */
@@ -12,6 +13,12 @@ unsigned long _copy_from_user(void *to, const void __user *from, unsigned long n
unsigned long res = n;
might_fault();
if (!should_fail_usercopy() && likely(access_ok(from, n))) {
+ /*
+ * Ensure that bad access_ok() speculation will not
+ * lead to nasty side effects *after* the copy is
+ * finished:
+ */
+ barrier_nospec();
instrument_copy_from_user(to, from, n);
res = raw_copy_from_user(to, from, n);
}
--
2.25.1
1
13
[PATCH OLK-5.10 01/14] uaccess: Add speculation barrier to copy_from_user()
by Jialin Zhang 16 May '23
From: Dave Hansen <dave.hansen(a)linux.intel.com>
stable inclusion
from stable-v5.10.170
commit 3b6ce54cfa2c04f0636fd0c985913af8703b408d
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I71N8L
CVE: CVE-2023-0459
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 74e19ef0ff8061ef55957c3abd71614ef0f42f47 upstream.
The results of "access_ok()" can be mis-speculated. The result is that
you can end up speculatively executing:
	if (access_ok(from, size))
		// Right here
even for bad from/size combinations. On first glance, it would be ideal
to just add a speculation barrier to "access_ok()" so that its results
can never be mis-speculated.
But there are lots of system calls just doing access_ok() via
"copy_to_user()" and friends (example: fstat() and friends). Those are
generally not problematic because they do not _consume_ data from
userspace other than the pointer. They are also very quick and common
system calls that should not be needlessly slowed down.
"copy_from_user()" on the other hand uses a user-controller pointer and
is frequently followed up with code that might affect caches. Take
something like this:
	if (!copy_from_user(&kernelvar, uptr, size))
		do_something_with(kernelvar);
If userspace passes in an evil 'uptr' that *actually* points to a kernel
address, and then do_something_with() has cache (or other)
side-effects, it could allow userspace to infer kernel data values.
Add a barrier to the common copy_from_user() code to prevent
mis-speculated values which happen after the copy.
Also add a stub for architectures that do not define barrier_nospec().
This makes the macro usable in generic code.
Since the barrier is now usable in generic code, the x86 #ifdef in the
BPF code can also go away.
Reported-by: Jordy Zomer <jordyzomer(a)google.com>
Suggested-by: Linus Torvalds <torvalds(a)linuxfoundation.org>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Daniel Borkmann <daniel(a)iogearbox.net> # BPF bits
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Conflicts:
lib/usercopy.c
Signed-off-by: Ma Wupeng <mawupeng1(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Reviewed-by: Nanyong Sun <sunnanyong(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
include/linux/nospec.h | 4 ++++
kernel/bpf/core.c | 2 --
lib/usercopy.c | 7 +++++++
3 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/include/linux/nospec.h b/include/linux/nospec.h
index c1e79f72cd89..9f0af4f116d9 100644
--- a/include/linux/nospec.h
+++ b/include/linux/nospec.h
@@ -11,6 +11,10 @@
struct task_struct;
+#ifndef barrier_nospec
+# define barrier_nospec() do { } while (0)
+#endif
+
/**
* array_index_mask_nospec() - generate a ~0 mask when index < size, 0 otherwise
* @index: array element index
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index fd2aa6b9909e..c18aed60ce40 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1642,9 +1642,7 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
* reuse preexisting logic from Spectre v1 mitigation that
* happens to produce the required code on x86 for v4 as well.
*/
-#ifdef CONFIG_X86
barrier_nospec();
-#endif
CONT;
#define LDST(SIZEOP, SIZE) \
STX_MEM_##SIZEOP: \
diff --git a/lib/usercopy.c b/lib/usercopy.c
index 7413dd300516..7ee63df042d7 100644
--- a/lib/usercopy.c
+++ b/lib/usercopy.c
@@ -3,6 +3,7 @@
#include <linux/fault-inject-usercopy.h>
#include <linux/instrumented.h>
#include <linux/uaccess.h>
+#include <linux/nospec.h>
/* out-of-line parts */
@@ -12,6 +13,12 @@ unsigned long _copy_from_user(void *to, const void __user *from, unsigned long n
unsigned long res = n;
might_fault();
if (!should_fail_usercopy() && likely(access_ok(from, n))) {
+ /*
+ * Ensure that bad access_ok() speculation will not
+ * lead to nasty side effects *after* the copy is
+ * finished:
+ */
+ barrier_nospec();
instrument_copy_from_user(to, from, n);
res = raw_copy_from_user(to, from, n);
}
--
2.25.1
1
13
Currently, only the x86 architecture supports the CLOCKSOURCE_VALIDATE_LAST_CYCLE
option. This option ensures that the timestamps returned by the clocksource are
monotonically increasing, and it helps avoid issues caused by hardware failures.
This series makes CLOCKSOURCE_VALIDATE_LAST_CYCLE configurable on the arm64
architecture as well, which helps increase system stability and reliability.
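For reference, here is a minimal sketch of the check this option enables,
modeled on the mainline clocksource_delta() helper (an illustration of the
idea, not the arm64 implementation):
	#include <stdint.h>

	/* If the hardware counter appears to have moved backwards, the
	 * masked delta lands in the upper half of the mask range; clamp
	 * it to 0 so time never seems to jump backwards (or far ahead).
	 */
	static inline uint64_t clocksource_delta(uint64_t now, uint64_t last,
						 uint64_t mask)
	{
		uint64_t ret = (now - last) & mask;

		return ret & ~(mask >> 1) ? 0 : ret;
	}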
Yu Liao (1):
timekeeping: Make CLOCKSOURCE_VALIDATE_LAST_CYCLE configurable
kernel/time/Kconfig | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
--
2.25.1
1
1
[PATCH openEuler-1.0-LTS] net: sctp: update stream->incnt after successful allocation of stream_in
by Zhang Changzhong 12 May '23
From: Dong Chenchen <dongchenchen2(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 188766
CVE: NA
----------------------------------------
stream->incnt records the number of input streams.
sctp_stream_alloc_in() allocates an array of incnt sctp_stream_in entries.
If the array is allocated successfully in sctp_stream_init(), stream->incnt
should be updated to incnt.
Fixes: 703397c74f8f5 ("sctp: leave the err path free in sctp_stream_init to sctp_stream_free")
Signed-off-by: Dong Chenchen <dongchenchen2(a)huawei.com>
Reviewed-by: Liu Jian <liujian56(a)huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong(a)huawei.com>
---
net/sctp/stream.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/net/sctp/stream.c b/net/sctp/stream.c
index 435cbf4549e7..c500d4e22cda 100644
--- a/net/sctp/stream.c
+++ b/net/sctp/stream.c
@@ -242,7 +242,11 @@ int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt,
if (!incnt)
return 0;
- return sctp_stream_alloc_in(stream, incnt, gfp);
+ ret = sctp_stream_alloc_in(stream, incnt, gfp);
+ if (!ret)
+ stream->incnt = incnt;
+
+ return ret;
}
int sctp_stream_init_ext(struct sctp_stream *stream, __u16 sid)
--
2.31.1
1
0
[PATCH openEuler-1.0-LTS 1/2] netrom: Fix use-after-free caused by accept on already connected socket
by Yongqiang Liu 11 May '23
From: Hyunwoo Kim <v4bel(a)theori.io>
stable inclusion
from stable-v4.19.273
commit 2c1984d101978e979783bdb2376eb6eca9f8f627
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I70OFF
CVE: CVE-2023-32269
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
[ Upstream commit 611792920925fb088ddccbe2783c7f92fdfb6b64 ]
If you call listen() and accept() on an already connect()ed
AF_NETROM socket, accept() can still succeed.
This is because when the peer socket sends data via sendmsg(),
an skb carrying the peer's own sk is queued on the connected
socket's sk->sk_receive_queue, and nr_accept() dequeues that
skb from sk->sk_receive_queue.
As a result, nr_accept() allocates and returns a sock with
the sk of the parent AF_NETROM socket.
And here use-after-free can happen through complex race conditions:
```
cpu0 cpu1
1. socket_2 = socket(AF_NETROM)
.
.
listen(socket_2)
accepted_socket = accept(socket_2)
2. socket_1 = socket(AF_NETROM)
nr_create() // sk refcount : 1
connect(socket_1)
3. write(accepted_socket)
nr_sendmsg()
nr_output()
nr_kick()
nr_send_iframe()
nr_transmit_buffer()
nr_route_frame()
nr_loopback_queue()
nr_loopback_timer()
nr_rx_frame()
nr_process_rx_frame(sk, skb); // sk : socket_1's sk
nr_state3_machine()
nr_queue_rx_frame()
sock_queue_rcv_skb()
sock_queue_rcv_skb_reason()
__sock_queue_rcv_skb()
__skb_queue_tail(list, skb); // list : socket_1's sk->sk_receive_queue
4. listen(socket_1)
nr_listen()
uaf_socket = accept(socket_1)
nr_accept()
skb_dequeue(&sk->sk_receive_queue);
5. close(accepted_socket)
nr_release()
nr_write_internal(sk, NR_DISCREQ)
nr_transmit_buffer() // NR_DISCREQ
nr_route_frame()
nr_loopback_queue()
nr_loopback_timer()
nr_rx_frame() // sk : socket_1's sk
nr_process_rx_frame() // NR_STATE_3
nr_state3_machine() // NR_DISCREQ
nr_disconnect()
nr_sk(sk)->state = NR_STATE_0;
6. close(socket_1) // sk refcount : 3
nr_release() // NR_STATE_0
sock_put(sk); // sk refcount : 0
sk_free(sk);
close(uaf_socket)
nr_release()
sock_hold(sk); // UAF
```
KASAN report by syzbot:
```
BUG: KASAN: use-after-free in nr_release+0x66/0x460 net/netrom/af_netrom.c:520
Write of size 4 at addr ffff8880235d8080 by task syz-executor564/5128
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:306 [inline]
print_report+0x15e/0x461 mm/kasan/report.c:417
kasan_report+0xbf/0x1f0 mm/kasan/report.c:517
check_region_inline mm/kasan/generic.c:183 [inline]
kasan_check_range+0x141/0x190 mm/kasan/generic.c:189
instrument_atomic_read_write include/linux/instrumented.h:102 [inline]
atomic_fetch_add_relaxed include/linux/atomic/atomic-instrumented.h:116 [inline]
__refcount_add include/linux/refcount.h:193 [inline]
__refcount_inc include/linux/refcount.h:250 [inline]
refcount_inc include/linux/refcount.h:267 [inline]
sock_hold include/net/sock.h:775 [inline]
nr_release+0x66/0x460 net/netrom/af_netrom.c:520
__sock_release+0xcd/0x280 net/socket.c:650
sock_close+0x1c/0x20 net/socket.c:1365
__fput+0x27c/0xa90 fs/file_table.c:320
task_work_run+0x16f/0x270 kernel/task_work.c:179
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0xaa8/0x2950 kernel/exit.c:867
do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
get_signal+0x21c3/0x2450 kernel/signal.c:2859
arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
__syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f6c19e3c9b9
Code: Unable to access opcode bytes at 0x7f6c19e3c98f.
RSP: 002b:00007fffd4ba2ce8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
RAX: 0000000000000116 RBX: 0000000000000003 RCX: 00007f6c19e3c9b9
RDX: 0000000000000318 RSI: 00000000200bd000 RDI: 0000000000000006
RBP: 0000000000000003 R08: 000000000000000d R09: 000000000000000d
R10: 0000000000000000 R11: 0000000000000246 R12: 000055555566a2c0
R13: 0000000000000011 R14: 0000000000000000 R15: 0000000000000000
</TASK>
Allocated by task 5128:
kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
kasan_set_track+0x25/0x30 mm/kasan/common.c:52
____kasan_kmalloc mm/kasan/common.c:371 [inline]
____kasan_kmalloc mm/kasan/common.c:330 [inline]
__kasan_kmalloc+0xa3/0xb0 mm/kasan/common.c:380
kasan_kmalloc include/linux/kasan.h:211 [inline]
__do_kmalloc_node mm/slab_common.c:968 [inline]
__kmalloc+0x5a/0xd0 mm/slab_common.c:981
kmalloc include/linux/slab.h:584 [inline]
sk_prot_alloc+0x140/0x290 net/core/sock.c:2038
sk_alloc+0x3a/0x7a0 net/core/sock.c:2091
nr_create+0xb6/0x5f0 net/netrom/af_netrom.c:433
__sock_create+0x359/0x790 net/socket.c:1515
sock_create net/socket.c:1566 [inline]
__sys_socket_create net/socket.c:1603 [inline]
__sys_socket_create net/socket.c:1588 [inline]
__sys_socket+0x133/0x250 net/socket.c:1636
__do_sys_socket net/socket.c:1649 [inline]
__se_sys_socket net/socket.c:1647 [inline]
__x64_sys_socket+0x73/0xb0 net/socket.c:1647
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 5128:
kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
kasan_set_track+0x25/0x30 mm/kasan/common.c:52
kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:518
____kasan_slab_free mm/kasan/common.c:236 [inline]
____kasan_slab_free+0x13b/0x1a0 mm/kasan/common.c:200
kasan_slab_free include/linux/kasan.h:177 [inline]
__cache_free mm/slab.c:3394 [inline]
__do_kmem_cache_free mm/slab.c:3580 [inline]
__kmem_cache_free+0xcd/0x3b0 mm/slab.c:3587
sk_prot_free net/core/sock.c:2074 [inline]
__sk_destruct+0x5df/0x750 net/core/sock.c:2166
sk_destruct net/core/sock.c:2181 [inline]
__sk_free+0x175/0x460 net/core/sock.c:2192
sk_free+0x7c/0xa0 net/core/sock.c:2203
sock_put include/net/sock.h:1991 [inline]
nr_release+0x39e/0x460 net/netrom/af_netrom.c:554
__sock_release+0xcd/0x280 net/socket.c:650
sock_close+0x1c/0x20 net/socket.c:1365
__fput+0x27c/0xa90 fs/file_table.c:320
task_work_run+0x16f/0x270 kernel/task_work.c:179
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0xaa8/0x2950 kernel/exit.c:867
do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
get_signal+0x21c3/0x2450 kernel/signal.c:2859
arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
__syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd
```
To fix this issue, make nr_listen() return -EINVAL for sockets that
have already been connected via nr_connect().
Reported-by: syzbot+caa188bdfc1eeafeb418(a)syzkaller.appspotmail.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Hyunwoo Kim <v4bel(a)theori.io>
Reviewed-by: Kuniyuki Iwashima <kuniyu(a)amazon.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Ziyang Xuan <william.xuanziyang(a)huawei.com>
Reviewed-by: Liu Jian <liujian56(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
net/netrom/af_netrom.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
index 43910e50752c..a5d819fa7c89 100644
--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -403,6 +403,11 @@ static int nr_listen(struct socket *sock, int backlog)
struct sock *sk = sock->sk;
lock_sock(sk);
+ if (sock->state != SS_UNCONNECTED) {
+ release_sock(sk);
+ return -EINVAL;
+ }
+
if (sk->sk_state != TCP_LISTEN) {
memset(&nr_sk(sk)->user_addr, 0, AX25_ADDR_LEN);
sk->sk_max_ack_backlog = backlog;
--
2.25.1
1
1
11 May '23
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I71UGQ
CVE: NA
Reference: N/A
----------------------------------------------------------------
MPAM interrupts are used to report error information; they are not functional interrupts.
The interrupt number currently defaults to 0, so the device startup log contains an
error message saying that MPAM interrupt registration failed, which is misleading.
Therefore, lower the message from the error level to the warning level.
Signed-off-by: Tiancheng Lu <lutiancheng5(a)huawei.com>
---
arch/arm64/kernel/mpam/mpam_device.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c
index 2aa9a3ab59f2..50bdc56ca005 100644
--- a/arch/arm64/kernel/mpam/mpam_device.c
+++ b/arch/arm64/kernel/mpam/mpam_device.c
@@ -499,7 +499,7 @@ static void mpam_enable_irqs(void)
rc = request_irq(irq, mpam_handle_error_irq, request_flags,
"MPAM ERR IRQ", dev);
if (rc) {
- pr_err_ratelimited("Failed to register irq %u\n", irq);
+ pr_warn_ratelimited("Not support to register irq %u\n", irq);
continue;
}
--
2.17.1
1
0
11 May '23
From: l00797255 <l00797255(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I71UGQ
CVE: NA
Reference: N/A
----------------------------------------------------------------
MPAM interrupts are used to report error information; they are not functional interrupts.
The interrupt number currently defaults to 0, so the device startup log contains an
error message saying that MPAM interrupt registration failed, which is misleading.
Therefore, lower the message from the error level to the warning level.
Signed-off-by: Tiancheng Lu <lutiancheng5(a)huawei.com>
---
arch/arm64/kernel/mpam/mpam_device.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c
index 6455c69f132f..adf4bc034a51 100644
--- a/arch/arm64/kernel/mpam/mpam_device.c
+++ b/arch/arm64/kernel/mpam/mpam_device.c
@@ -499,7 +499,7 @@ static void mpam_enable_irqs(void)
rc = request_irq(irq, mpam_handle_error_irq, request_flags,
"MPAM ERR IRQ", dev);
if (rc) {
- pr_err_ratelimited("Failed to register irq %u\n", irq);
+ pr_warn_ratelimited("Not support to register irq %u\n", irq);
continue;
}
--
2.17.1
1
0
[PATCH openEuler-1.0-LTS] mm: memcontrol: switch to rcu protection in drain_all_stock()
by Yongqiang Liu 10 May '23
From: Roman Gushchin <guro(a)fb.com>
mainline inclusion
from mainline-v5.4-rc1
commit e1a366be5cb4f849ec4de170d50eebc08bb0af20
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I70T8Z
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Commit 72f0184c8a00 ("mm, memcg: remove hotplug locking from try_charge")
introduced css_tryget()/css_put() calls in drain_all_stock(), which are
supposed to protect the target memory cgroup from being released during
the mem_cgroup_is_descendant() call.
However, it's not completely safe. In theory, memcg can go away between
reading stock->cached pointer and calling css_tryget().
This can happen if drain_all_stock() races with drain_local_stock()
performed on the remote cpu as a result of a work, scheduled by the
previous invocation of drain_all_stock().
The race is a bit theoretical and there are few chances to trigger it, but
the current code looks a bit confusing, so it makes sense to fix it
anyway. The code looks like as if css_tryget() and css_put() are used to
protect stocks drainage. It's not necessary because stocked pages are
holding references to the cached cgroup. And it obviously won't work for
works, scheduled on other cpus.
So, let's read the stock->cached pointer and evaluate the memory cgroup
inside a rcu read section, and get rid of css_tryget()/css_put() calls.
Link: http://lkml.kernel.org/r/20190802192241.3253165-1-guro@fb.com
Signed-off-by: Roman Gushchin <guro(a)fb.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Hillf Danton <hdanton(a)sina.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Cai Xinchen <caixinchen1(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
mm/memcontrol.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8b073a584e9f..aff8c05b2c72 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2225,21 +2225,22 @@ static void drain_all_stock(struct mem_cgroup *root_memcg)
for_each_online_cpu(cpu) {
struct memcg_stock_pcp *stock = &per_cpu(memcg_stock, cpu);
struct mem_cgroup *memcg;
+ bool flush = false;
+ rcu_read_lock();
memcg = stock->cached;
- if (!memcg || !stock->nr_pages || !css_tryget(&memcg->css))
- continue;
- if (!mem_cgroup_is_descendant(memcg, root_memcg)) {
- css_put(&memcg->css);
- continue;
- }
- if (!test_and_set_bit(FLUSHING_CACHED_CHARGE, &stock->flags)) {
+ if (memcg && stock->nr_pages &&
+ mem_cgroup_is_descendant(memcg, root_memcg))
+ flush = true;
+ rcu_read_unlock();
+
+ if (flush &&
+ !test_and_set_bit(FLUSHING_CACHED_CHARGE, &stock->flags)) {
if (cpu == curcpu)
drain_local_stock(&stock->work);
else
schedule_work_on(cpu, &stock->work);
}
- css_put(&memcg->css);
}
put_cpu();
mutex_unlock(&percpu_charge_mutex);
--
2.25.1
1
0
[PATCH openEuler-22.03-LTS-SP1] USB: gadgetfs: Fix race between mounting and unmounting
by Jialin Zhang 10 May '23
From: Alan Stern <stern(a)rowland.harvard.edu>
mainline inclusion
from mainline-v6.2-rc5
commit d18dcfe9860e842f394e37ba01ca9440ab2178f4
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I66IZK
CVE: CVE-2022-4382
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
----------------------------------------------------------------------
The syzbot fuzzer and Gerald Lee have identified a use-after-free bug
in the gadgetfs driver, involving processes concurrently mounting and
unmounting the gadgetfs filesystem. In particular, gadgetfs_fill_super()
can race with gadgetfs_kill_sb(), causing the latter to deallocate
the_device while the former is using it. The output from KASAN says,
in part:
BUG: KASAN: use-after-free in instrument_atomic_read_write include/linux/instrumented.h:102 [inline]
BUG: KASAN: use-after-free in atomic_fetch_sub_release include/linux/atomic/atomic-instrumented.h:176 [inline]
BUG: KASAN: use-after-free in __refcount_sub_and_test include/linux/refcount.h:272 [inline]
BUG: KASAN: use-after-free in __refcount_dec_and_test include/linux/refcount.h:315 [inline]
BUG: KASAN: use-after-free in refcount_dec_and_test include/linux/refcount.h:333 [inline]
BUG: KASAN: use-after-free in put_dev drivers/usb/gadget/legacy/inode.c:159 [inline]
BUG: KASAN: use-after-free in gadgetfs_kill_sb+0x33/0x100 drivers/usb/gadget/legacy/inode.c:2086
Write of size 4 at addr ffff8880276d7840 by task syz-executor126/18689
CPU: 0 PID: 18689 Comm: syz-executor126 Not tainted 6.1.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
<TASK>
...
atomic_fetch_sub_release include/linux/atomic/atomic-instrumented.h:176 [inline]
__refcount_sub_and_test include/linux/refcount.h:272 [inline]
__refcount_dec_and_test include/linux/refcount.h:315 [inline]
refcount_dec_and_test include/linux/refcount.h:333 [inline]
put_dev drivers/usb/gadget/legacy/inode.c:159 [inline]
gadgetfs_kill_sb+0x33/0x100 drivers/usb/gadget/legacy/inode.c:2086
deactivate_locked_super+0xa7/0xf0 fs/super.c:332
vfs_get_super fs/super.c:1190 [inline]
get_tree_single+0xd0/0x160 fs/super.c:1207
vfs_get_tree+0x88/0x270 fs/super.c:1531
vfs_fsconfig_locked fs/fsopen.c:232 [inline]
The simplest solution is to ensure that gadgetfs_fill_super() and
gadgetfs_kill_sb() are serialized by making them both acquire a new
mutex.
Signed-off-by: Alan Stern <stern(a)rowland.harvard.edu>
Reported-and-tested-by: syzbot+33d7ad66d65044b93f16(a)syzkaller.appspotmail.com
Reported-and-tested-by: Gerald Lee <sundaywind2004(a)gmail.com>
Link: https://lore.kernel.org/linux-usb/CAO3qeMVzXDP-JU6v1u5Ags6Q-bb35kg3=C6d04Dj…
Fixes: e5d82a7360d1 ("vfs: Convert gadgetfs to use the new mount API")
CC: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/Y6XCPXBpn3tmjdCC@rowland.harvard.edu
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: ZhangPeng <zhangpeng362(a)huawei.com>
Reviewed-by: tong tiangen <tongtiangen(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
drivers/usb/gadget/legacy/inode.c | 28 +++++++++++++++++++++-------
1 file changed, 21 insertions(+), 7 deletions(-)
diff --git a/drivers/usb/gadget/legacy/inode.c b/drivers/usb/gadget/legacy/inode.c
index cd097474b6c3..cbe801640916 100644
--- a/drivers/usb/gadget/legacy/inode.c
+++ b/drivers/usb/gadget/legacy/inode.c
@@ -229,6 +229,7 @@ static void put_ep (struct ep_data *data)
*/
static const char *CHIP;
+static DEFINE_MUTEX(sb_mutex); /* Serialize superblock operations */
/*----------------------------------------------------------------------*/
@@ -2012,13 +2013,20 @@ gadgetfs_fill_super (struct super_block *sb, struct fs_context *fc)
{
struct inode *inode;
struct dev_data *dev;
+ int rc;
- if (the_device)
- return -ESRCH;
+ mutex_lock(&sb_mutex);
+
+ if (the_device) {
+ rc = -ESRCH;
+ goto Done;
+ }
CHIP = usb_get_gadget_udc_name();
- if (!CHIP)
- return -ENODEV;
+ if (!CHIP) {
+ rc = -ENODEV;
+ goto Done;
+ }
/* superblock */
sb->s_blocksize = PAGE_SIZE;
@@ -2055,13 +2063,17 @@ gadgetfs_fill_super (struct super_block *sb, struct fs_context *fc)
* from binding to a controller.
*/
the_device = dev;
- return 0;
+ rc = 0;
+ goto Done;
-Enomem:
+ Enomem:
kfree(CHIP);
CHIP = NULL;
+ rc = -ENOMEM;
- return -ENOMEM;
+ Done:
+ mutex_unlock(&sb_mutex);
+ return rc;
}
/* "mount -t gadgetfs path /dev/gadget" ends up here */
@@ -2083,6 +2095,7 @@ static int gadgetfs_init_fs_context(struct fs_context *fc)
static void
gadgetfs_kill_sb (struct super_block *sb)
{
+ mutex_lock(&sb_mutex);
kill_litter_super (sb);
if (the_device) {
put_dev (the_device);
@@ -2090,6 +2103,7 @@ gadgetfs_kill_sb (struct super_block *sb)
}
kfree(CHIP);
CHIP = NULL;
+ mutex_unlock(&sb_mutex);
}
/*----------------------------------------------------------------------*/
--
2.25.1
1
0
[PATCH openEuler-22.03-LTS] USB: gadgetfs: Fix race between mounting and unmounting
by Jialin Zhang 10 May '23
From: Alan Stern <stern(a)rowland.harvard.edu>
mainline inclusion
from mainline-v6.2-rc5
commit d18dcfe9860e842f394e37ba01ca9440ab2178f4
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I66IZK
CVE: CVE-2022-4382
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
----------------------------------------------------------------------
The syzbot fuzzer and Gerald Lee have identified a use-after-free bug
in the gadgetfs driver, involving processes concurrently mounting and
unmounting the gadgetfs filesystem. In particular, gadgetfs_fill_super()
can race with gadgetfs_kill_sb(), causing the latter to deallocate
the_device while the former is using it. The output from KASAN says,
in part:
BUG: KASAN: use-after-free in instrument_atomic_read_write include/linux/instrumented.h:102 [inline]
BUG: KASAN: use-after-free in atomic_fetch_sub_release include/linux/atomic/atomic-instrumented.h:176 [inline]
BUG: KASAN: use-after-free in __refcount_sub_and_test include/linux/refcount.h:272 [inline]
BUG: KASAN: use-after-free in __refcount_dec_and_test include/linux/refcount.h:315 [inline]
BUG: KASAN: use-after-free in refcount_dec_and_test include/linux/refcount.h:333 [inline]
BUG: KASAN: use-after-free in put_dev drivers/usb/gadget/legacy/inode.c:159 [inline]
BUG: KASAN: use-after-free in gadgetfs_kill_sb+0x33/0x100 drivers/usb/gadget/legacy/inode.c:2086
Write of size 4 at addr ffff8880276d7840 by task syz-executor126/18689
CPU: 0 PID: 18689 Comm: syz-executor126 Not tainted 6.1.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
<TASK>
...
atomic_fetch_sub_release include/linux/atomic/atomic-instrumented.h:176 [inline]
__refcount_sub_and_test include/linux/refcount.h:272 [inline]
__refcount_dec_and_test include/linux/refcount.h:315 [inline]
refcount_dec_and_test include/linux/refcount.h:333 [inline]
put_dev drivers/usb/gadget/legacy/inode.c:159 [inline]
gadgetfs_kill_sb+0x33/0x100 drivers/usb/gadget/legacy/inode.c:2086
deactivate_locked_super+0xa7/0xf0 fs/super.c:332
vfs_get_super fs/super.c:1190 [inline]
get_tree_single+0xd0/0x160 fs/super.c:1207
vfs_get_tree+0x88/0x270 fs/super.c:1531
vfs_fsconfig_locked fs/fsopen.c:232 [inline]
The simplest solution is to ensure that gadgetfs_fill_super() and
gadgetfs_kill_sb() are serialized by making them both acquire a new
mutex.
Signed-off-by: Alan Stern <stern(a)rowland.harvard.edu>
Reported-and-tested-by: syzbot+33d7ad66d65044b93f16(a)syzkaller.appspotmail.com
Reported-and-tested-by: Gerald Lee <sundaywind2004(a)gmail.com>
Link: https://lore.kernel.org/linux-usb/CAO3qeMVzXDP-JU6v1u5Ags6Q-bb35kg3=C6d04Dj…
Fixes: e5d82a7360d1 ("vfs: Convert gadgetfs to use the new mount API")
CC: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/Y6XCPXBpn3tmjdCC@rowland.harvard.edu
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: ZhangPeng <zhangpeng362(a)huawei.com>
Reviewed-by: tong tiangen <tongtiangen(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
drivers/usb/gadget/legacy/inode.c | 28 +++++++++++++++++++++-------
1 file changed, 21 insertions(+), 7 deletions(-)
diff --git a/drivers/usb/gadget/legacy/inode.c b/drivers/usb/gadget/legacy/inode.c
index cd097474b6c3..cbe801640916 100644
--- a/drivers/usb/gadget/legacy/inode.c
+++ b/drivers/usb/gadget/legacy/inode.c
@@ -229,6 +229,7 @@ static void put_ep (struct ep_data *data)
*/
static const char *CHIP;
+static DEFINE_MUTEX(sb_mutex); /* Serialize superblock operations */
/*----------------------------------------------------------------------*/
@@ -2012,13 +2013,20 @@ gadgetfs_fill_super (struct super_block *sb, struct fs_context *fc)
{
struct inode *inode;
struct dev_data *dev;
+ int rc;
- if (the_device)
- return -ESRCH;
+ mutex_lock(&sb_mutex);
+
+ if (the_device) {
+ rc = -ESRCH;
+ goto Done;
+ }
CHIP = usb_get_gadget_udc_name();
- if (!CHIP)
- return -ENODEV;
+ if (!CHIP) {
+ rc = -ENODEV;
+ goto Done;
+ }
/* superblock */
sb->s_blocksize = PAGE_SIZE;
@@ -2055,13 +2063,17 @@ gadgetfs_fill_super (struct super_block *sb, struct fs_context *fc)
* from binding to a controller.
*/
the_device = dev;
- return 0;
+ rc = 0;
+ goto Done;
-Enomem:
+ Enomem:
kfree(CHIP);
CHIP = NULL;
+ rc = -ENOMEM;
- return -ENOMEM;
+ Done:
+ mutex_unlock(&sb_mutex);
+ return rc;
}
/* "mount -t gadgetfs path /dev/gadget" ends up here */
@@ -2083,6 +2095,7 @@ static int gadgetfs_init_fs_context(struct fs_context *fc)
static void
gadgetfs_kill_sb (struct super_block *sb)
{
+ mutex_lock(&sb_mutex);
kill_litter_super (sb);
if (the_device) {
put_dev (the_device);
@@ -2090,6 +2103,7 @@ gadgetfs_kill_sb (struct super_block *sb)
}
kfree(CHIP);
CHIP = NULL;
+ mutex_unlock(&sb_mutex);
}
/*----------------------------------------------------------------------*/
--
2.25.1
1
0
Pull new CVEs:
CVE-2023-0458
CVE-2023-2269
CVE-2023-2483
CVE-2023-31436
CVE-2023-2194
CVE-2023-2166
CVE-2023-2176
CVE-2023-2007
fs bugfixes from Baokun Li
bpf bugfixes from Liu Jian
Arnd Bergmann (1):
scsi: dpt_i2o: Remove obsolete driver
Baokun Li (3):
writeback, cgroup: fix null-ptr-deref write in bdi_split_work_to_wbs
ext4: only update i_reserved_data_blocks on successful block
allocation
ext4: check iomap type only if ext4_iomap_begin() does not fail
Gwangun Jung (1):
net: sched: sch_qfq: prevent slab-out-of-bounds in qfq_activate_agg
Jason Gunthorpe (1):
RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more
requests
John Fastabend (5):
bpf, sockmap: Fix race in ingress receive verdict with redirect to
self
bpf, sockmap: Fix return codes from tcp_bpf_recvmsg_parser()
bpf, sockmap: Attach map progs to psock early for feature probes
bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap
bpf, sockmap: Fix double bpf_prog_put on error case in map_link
Liu Jian (1):
bpf, sockmap: Fix an infinite loop error when len is 0 in
tcp_bpf_recvmsg_parser()
Mike Snitzer (1):
dm ioctl: fix nested locking in table_clear() to remove deadlock
concern
Oliver Hartkopp (1):
can: af_can: fix NULL pointer dereference in can_rcv_filter
Patrisious Haddad (1):
RDMA/core: Refactor rdma_bind_addr
Wei Chen (1):
i2c: xgene-slimpro: Fix out-of-bounds bug in xgene_slimpro_i2c_xfer()
Xia Fukun (1):
prlimit: do_prlimit needs to have a speculation check
Zheng Wang (1):
net: qcom/emac: Fix use after free bug in emac_remove due to race
condition
.../userspace-api/ioctl/ioctl-number.rst | 2 +-
MAINTAINERS | 8 -
drivers/i2c/busses/i2c-xgene-slimpro.c | 3 +
drivers/infiniband/core/cma.c | 246 +-
drivers/infiniband/core/cma_priv.h | 1 +
drivers/md/dm-ioctl.c | 7 +-
drivers/net/ethernet/qualcomm/emac/emac.c | 6 +
drivers/scsi/Kconfig | 11 -
drivers/scsi/Makefile | 1 -
drivers/scsi/dpt/dpti_i2o.h | 441 --
drivers/scsi/dpt/dpti_ioctl.h | 136 -
drivers/scsi/dpt/dptsig.h | 336 --
drivers/scsi/dpt/osd_defs.h | 79 -
drivers/scsi/dpt/osd_util.h | 358 --
drivers/scsi/dpt/sys_info.h | 417 --
drivers/scsi/dpt_i2o.c | 3549 -----------------
drivers/scsi/dpti.h | 331 --
fs/ext4/indirect.c | 8 +
fs/ext4/inode.c | 12 +-
fs/fs-writeback.c | 17 +-
kernel/sys.c | 2 +
mm/backing-dev.c | 12 +-
net/can/af_can.c | 4 +-
net/core/skmsg.c | 4 +
net/core/sock_map.c | 34 +-
net/ipv4/tcp_bpf.c | 77 +
net/sched/sch_qfq.c | 13 +-
27 files changed, 296 insertions(+), 5819 deletions(-)
delete mode 100644 drivers/scsi/dpt/dpti_i2o.h
delete mode 100644 drivers/scsi/dpt/dpti_ioctl.h
delete mode 100644 drivers/scsi/dpt/dptsig.h
delete mode 100644 drivers/scsi/dpt/osd_defs.h
delete mode 100644 drivers/scsi/dpt/osd_util.h
delete mode 100644 drivers/scsi/dpt/sys_info.h
delete mode 100644 drivers/scsi/dpt_i2o.c
delete mode 100644 drivers/scsi/dpti.h
--
2.25.1
2
19
Pull new CVEs:
CVE-2023-0458
CVE-2023-2269
CVE-2023-2483
CVE-2023-31436
CVE-2023-2194
CVE-2023-2166
CVE-2023-2176
CVE-2023-2007
fs bugfixes from Baokun Li
bpf bugfixes from Liu Jian
Arnd Bergmann (1):
scsi: dpt_i2o: Remove obsolete driver
Baokun Li (3):
writeback, cgroup: fix null-ptr-deref write in bdi_split_work_to_wbs
ext4: only update i_reserved_data_blocks on successful block
allocation
ext4: check iomap type only if ext4_iomap_begin() does not fail
Gwangun Jung (1):
net: sched: sch_qfq: prevent slab-out-of-bounds in qfq_activate_agg
Jason Gunthorpe (1):
RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more
requests
John Fastabend (5):
bpf, sockmap: Fix race in ingress receive verdict with redirect to
self
bpf, sockmap: Fix return codes from tcp_bpf_recvmsg_parser()
bpf, sockmap: Attach map progs to psock early for feature probes
bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap
bpf, sockmap: Fix double bpf_prog_put on error case in map_link
Liu Jian (1):
bpf, sockmap: Fix an infinite loop error when len is 0 in
tcp_bpf_recvmsg_parser()
Mike Snitzer (1):
dm ioctl: fix nested locking in table_clear() to remove deadlock
concern
Oliver Hartkopp (1):
can: af_can: fix NULL pointer dereference in can_rcv_filter
Patrisious Haddad (1):
RDMA/core: Refactor rdma_bind_addr
Wei Chen (1):
i2c: xgene-slimpro: Fix out-of-bounds bug in xgene_slimpro_i2c_xfer()
Xia Fukun (1):
prlimit: do_prlimit needs to have a speculation check
Zheng Wang (1):
net: qcom/emac: Fix use after free bug in emac_remove due to race
condition
.../userspace-api/ioctl/ioctl-number.rst | 2 +-
MAINTAINERS | 8 -
drivers/i2c/busses/i2c-xgene-slimpro.c | 3 +
drivers/infiniband/core/cma.c | 246 +-
drivers/infiniband/core/cma_priv.h | 1 +
drivers/md/dm-ioctl.c | 7 +-
drivers/net/ethernet/qualcomm/emac/emac.c | 6 +
drivers/scsi/Kconfig | 11 -
drivers/scsi/Makefile | 1 -
drivers/scsi/dpt/dpti_i2o.h | 441 --
drivers/scsi/dpt/dpti_ioctl.h | 136 -
drivers/scsi/dpt/dptsig.h | 336 --
drivers/scsi/dpt/osd_defs.h | 79 -
drivers/scsi/dpt/osd_util.h | 358 --
drivers/scsi/dpt/sys_info.h | 417 --
drivers/scsi/dpt_i2o.c | 3549 -----------------
drivers/scsi/dpti.h | 331 --
fs/ext4/indirect.c | 8 +
fs/ext4/inode.c | 12 +-
fs/fs-writeback.c | 17 +-
kernel/sys.c | 2 +
mm/backing-dev.c | 12 +-
net/can/af_can.c | 4 +-
net/core/skmsg.c | 4 +
net/core/sock_map.c | 34 +-
net/ipv4/tcp_bpf.c | 77 +
net/sched/sch_qfq.c | 13 +-
27 files changed, 296 insertions(+), 5819 deletions(-)
delete mode 100644 drivers/scsi/dpt/dpti_i2o.h
delete mode 100644 drivers/scsi/dpt/dpti_ioctl.h
delete mode 100644 drivers/scsi/dpt/dptsig.h
delete mode 100644 drivers/scsi/dpt/osd_defs.h
delete mode 100644 drivers/scsi/dpt/osd_util.h
delete mode 100644 drivers/scsi/dpt/sys_info.h
delete mode 100644 drivers/scsi/dpt_i2o.c
delete mode 100644 drivers/scsi/dpti.h
--
2.25.1
2
19

09 May '23
Pull new CVEs:
CVE-2023-0458
CVE-2023-2269
CVE-2023-2483
CVE-2023-31436
CVE-2023-2194
CVE-2023-2166
CVE-2023-2176
CVE-2023-2007
fs bugfixes from Baokun Li
bpf bugfixes from Liu Jian
Arnd Bergmann (1):
scsi: dpt_i2o: Remove obsolete driver
Baokun Li (3):
writeback, cgroup: fix null-ptr-deref write in bdi_split_work_to_wbs
ext4: only update i_reserved_data_blocks on successful block
allocation
ext4: check iomap type only if ext4_iomap_begin() does not fail
Gwangun Jung (1):
net: sched: sch_qfq: prevent slab-out-of-bounds in qfq_activate_agg
Jason Gunthorpe (1):
RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more
requests
John Fastabend (5):
bpf, sockmap: Fix race in ingress receive verdict with redirect to
self
bpf, sockmap: Fix return codes from tcp_bpf_recvmsg_parser()
bpf, sockmap: Attach map progs to psock early for feature probes
bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap
bpf, sockmap: Fix double bpf_prog_put on error case in map_link
Liu Jian (1):
bpf, sockmap: Fix an infinite loop error when len is 0 in
tcp_bpf_recvmsg_parser()
Mike Snitzer (1):
dm ioctl: fix nested locking in table_clear() to remove deadlock
concern
Oliver Hartkopp (1):
can: af_can: fix NULL pointer dereference in can_rcv_filter
Patrisious Haddad (1):
RDMA/core: Refactor rdma_bind_addr
Wei Chen (1):
i2c: xgene-slimpro: Fix out-of-bounds bug in xgene_slimpro_i2c_xfer()
Xia Fukun (1):
prlimit: do_prlimit needs to have a speculation check
Zheng Wang (1):
net: qcom/emac: Fix use after free bug in emac_remove due to race
condition
.../userspace-api/ioctl/ioctl-number.rst | 2 +-
MAINTAINERS | 8 -
drivers/i2c/busses/i2c-xgene-slimpro.c | 3 +
drivers/infiniband/core/cma.c | 246 +-
drivers/infiniband/core/cma_priv.h | 1 +
drivers/md/dm-ioctl.c | 7 +-
drivers/net/ethernet/qualcomm/emac/emac.c | 6 +
drivers/scsi/Kconfig | 11 -
drivers/scsi/Makefile | 1 -
drivers/scsi/dpt/dpti_i2o.h | 441 --
drivers/scsi/dpt/dpti_ioctl.h | 136 -
drivers/scsi/dpt/dptsig.h | 336 --
drivers/scsi/dpt/osd_defs.h | 79 -
drivers/scsi/dpt/osd_util.h | 358 --
drivers/scsi/dpt/sys_info.h | 417 --
drivers/scsi/dpt_i2o.c | 3549 -----------------
drivers/scsi/dpti.h | 331 --
fs/ext4/indirect.c | 8 +
fs/ext4/inode.c | 12 +-
fs/fs-writeback.c | 17 +-
kernel/sys.c | 2 +
mm/backing-dev.c | 12 +-
net/can/af_can.c | 4 +-
net/core/skmsg.c | 4 +
net/core/sock_map.c | 34 +-
net/ipv4/tcp_bpf.c | 77 +
net/sched/sch_qfq.c | 13 +-
27 files changed, 296 insertions(+), 5819 deletions(-)
delete mode 100644 drivers/scsi/dpt/dpti_i2o.h
delete mode 100644 drivers/scsi/dpt/dpti_ioctl.h
delete mode 100644 drivers/scsi/dpt/dptsig.h
delete mode 100644 drivers/scsi/dpt/osd_defs.h
delete mode 100644 drivers/scsi/dpt/osd_util.h
delete mode 100644 drivers/scsi/dpt/sys_info.h
delete mode 100644 drivers/scsi/dpt_i2o.c
delete mode 100644 drivers/scsi/dpti.h
--
2.25.1
2
19
Pull new CVEs:
CVE-2023-0458
CVE-2023-2269
CVE-2023-2483
CVE-2023-31436
CVE-2023-2194
CVE-2023-2166
CVE-2023-2176
CVE-2023-2007
fs bugfixes from Baokun Li
bpf bugfixes from Liu Jian
Arnd Bergmann (1):
scsi: dpt_i2o: Remove obsolete driver
Baokun Li (3):
writeback, cgroup: fix null-ptr-deref write in bdi_split_work_to_wbs
ext4: only update i_reserved_data_blocks on successful block
allocation
ext4: check iomap type only if ext4_iomap_begin() does not fail
Gwangun Jung (1):
net: sched: sch_qfq: prevent slab-out-of-bounds in qfq_activate_agg
Jason Gunthorpe (1):
RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more
requests
John Fastabend (5):
bpf, sockmap: Fix race in ingress receive verdict with redirect to
self
bpf, sockmap: Fix return codes from tcp_bpf_recvmsg_parser()
bpf, sockmap: Attach map progs to psock early for feature probes
bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap
bpf, sockmap: Fix double bpf_prog_put on error case in map_link
Liu Jian (1):
bpf, sockmap: Fix an infinite loop error when len is 0 in
tcp_bpf_recvmsg_parser()
Mike Snitzer (1):
dm ioctl: fix nested locking in table_clear() to remove deadlock
concern
Oliver Hartkopp (1):
can: af_can: fix NULL pointer dereference in can_rcv_filter
Patrisious Haddad (1):
RDMA/core: Refactor rdma_bind_addr
Wei Chen (1):
i2c: xgene-slimpro: Fix out-of-bounds bug in xgene_slimpro_i2c_xfer()
Xia Fukun (1):
prlimit: do_prlimit needs to have a speculation check
Zheng Wang (1):
net: qcom/emac: Fix use after free bug in emac_remove due to race
condition
.../userspace-api/ioctl/ioctl-number.rst | 2 +-
MAINTAINERS | 8 -
drivers/i2c/busses/i2c-xgene-slimpro.c | 3 +
drivers/infiniband/core/cma.c | 246 +-
drivers/infiniband/core/cma_priv.h | 1 +
drivers/md/dm-ioctl.c | 7 +-
drivers/net/ethernet/qualcomm/emac/emac.c | 6 +
drivers/scsi/Kconfig | 11 -
drivers/scsi/Makefile | 1 -
drivers/scsi/dpt/dpti_i2o.h | 441 --
drivers/scsi/dpt/dpti_ioctl.h | 136 -
drivers/scsi/dpt/dptsig.h | 336 --
drivers/scsi/dpt/osd_defs.h | 79 -
drivers/scsi/dpt/osd_util.h | 358 --
drivers/scsi/dpt/sys_info.h | 417 --
drivers/scsi/dpt_i2o.c | 3549 -----------------
drivers/scsi/dpti.h | 331 --
fs/ext4/indirect.c | 8 +
fs/ext4/inode.c | 12 +-
fs/fs-writeback.c | 17 +-
kernel/sys.c | 2 +
mm/backing-dev.c | 12 +-
net/can/af_can.c | 4 +-
net/core/skmsg.c | 4 +
net/core/sock_map.c | 34 +-
net/ipv4/tcp_bpf.c | 77 +
net/sched/sch_qfq.c | 13 +-
27 files changed, 296 insertions(+), 5819 deletions(-)
delete mode 100644 drivers/scsi/dpt/dpti_i2o.h
delete mode 100644 drivers/scsi/dpt/dpti_ioctl.h
delete mode 100644 drivers/scsi/dpt/dptsig.h
delete mode 100644 drivers/scsi/dpt/osd_defs.h
delete mode 100644 drivers/scsi/dpt/osd_util.h
delete mode 100644 drivers/scsi/dpt/sys_info.h
delete mode 100644 drivers/scsi/dpt_i2o.c
delete mode 100644 drivers/scsi/dpti.h
--
2.25.1

[PATCH openEuler-1.0-LTS 01/12] dm thin: fix deadlock when swapping to thin device
by Yongqiang Liu 08 May '23
From: Coly Li <colyli(a)suse.de>
stable inclusion
from stable-v4.19.280
commit 84e13235e08941ce37aa9ee238b6dd007170f0fc
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I715PM
CVE: NA
--------------------------------
commit 9bbf5feecc7eab2c370496c1c161bbfe62084028 upstream.
It is a known issue that a dm-thin volume cannot be used as
swap, because a deadlock may happen when dm-thin's internal memory
demand triggers swap I/O on the dm-thin volume itself.
But thanks to commit a666e5c05e7c ("dm: fix deadlock when swapping to
encrypted device"), the limit_swap_bios target flag can also be used
for dm-thin to avoid the recursive I/O when it is used as swap.
The fix is simply to set ti->limit_swap_bios to true in both pool_ctr()
and thin_ctr().
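For context, the throttling behind the limit_swap_bios flag works
roughly as follows (a simplified sketch of the mechanism added by
commit a666e5c05e7c; identifiers and control flow are illustrative,
not verbatim upstream code):
/*
 * Illustrative sketch only -- not verbatim upstream code. When a target
 * sets ti->limit_swap_bios, dm core bounds the number of swap bios in
 * flight against that target with a per-device semaphore, so memory
 * allocated while servicing swap I/O cannot recurse into unbounded
 * further swap-out.
 */
static void sketch_map_bio(struct mapped_device *md,
			   struct dm_target *ti, struct bio *bio)
{
	bool throttled = ti->limit_swap_bios && (bio->bi_opf & REQ_SWAP);

	if (throttled)
		down(&md->swap_bios_semaphore);	/* may sleep until a slot frees */

	ti->type->map(ti, bio);			/* mapping may allocate memory */

	if (throttled)
		up(&md->swap_bios_semaphore);	/* upstream releases this from the
						 * bio completion path instead */
}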
In my test, I create a dm-thin volume /dev/vg/swap and use it as the
swap device. Then I run fio on another dm-thin volume /dev/vg/main with a
large --blocksize to trigger swap I/O onto /dev/vg/swap.
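That setup can be reproduced with something like the following sketch
(not taken from the original report: it assumes an existing volume group
named vg, and the pool and volume sizes are illustrative):
# Create a thin pool and two thin volumes in an existing VG "vg".
lvcreate --type thin-pool -L 20G -n pool vg
lvcreate --type thin -n swap -V 4G --thinpool pool vg
lvcreate --type thin -n main -V 40G --thinpool pool vg
# Use the thin volume as swap, as in the test description.
mkswap /dev/vg/swap
swapon /dev/vg/swap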
The following fio command line is used in my test:
fio --name recursive-swap-io --lockmem 1 --iodepth 128 \
--ioengine libaio --filename /dev/vg/main --rw randrw \
--blocksize 1M --numjobs 32 --time_based --runtime=12h
Without this fix, the whole system can lock up within 15 seconds.
With this fix, no deadlock or hung task is observed after 2 hours of
running fio.
Furthermore, if blocksize is changed from 1M to 128M, fio shows no
visible I/O after around 30 seconds, and out-of-memory killer messages
appear in the kernel log. After around 20 minutes all fio processes
are killed and the system becomes responsive again.
This is exactly what is expected when recursive I/O happens on a dm-thin
volume that is used as swap.
Depends-on: a666e5c05e7c ("dm: fix deadlock when swapping to encrypted device")
Cc: stable(a)vger.kernel.org
Signed-off-by: Coly Li <colyli(a)suse.de>
Acked-by: Mikulas Patocka <mpatocka(a)redhat.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/md/dm-thin.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index fd8c89e3fc54..d884bb9cef94 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -3361,6 +3361,7 @@ static int pool_ctr(struct dm_target *ti, unsigned argc, char **argv)
pt->low_water_blocks = low_water_blocks;
pt->adjusted_pf = pt->requested_pf = pf;
ti->num_flush_bios = 1;
+ ti->limit_swap_bios = true;
/*
* Only need to enable discards if the pool should pass
@@ -4244,6 +4245,7 @@ static int thin_ctr(struct dm_target *ti, unsigned argc, char **argv)
goto bad;
ti->num_flush_bios = 1;
+ ti->limit_swap_bios = true;
ti->flush_supported = true;
ti->per_io_data_size = sizeof(struct dm_thin_endio_hook);
--
2.25.1

08 May '23
From: Arnd Bergmann <arnd(a)arndb.de>
mainline inclusion
from mainline-v6.0-rc1~14
commit b04e75a4a8a81887386a0d2dbf605a48e779d2a0
category: bugfix
bugzilla: 188707, https://gitee.com/src-openeuler/kernel/issues/I6VK2F
CVE: CVE-2023-2007
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
----------------------------------------
The dpt_i2o driver was fixed to stop using virt_to_bus() in 2008, but it
still has a stale reference in an error handling code path that could never
work. I submitted a patch to fix this reference earlier, but Hannes
Reinecke suggested that removing the driver may be just as good here.
The i2o driver layer was removed in 2015 with commit 4a72a7af462d
("staging: remove i2o subsystem"), but the even older dpt_i2o scsi driver
stayed around.
The last non-cleanup patches I could find were from Miquel van Smoorenburg
and Mark Salyzyn back in 2008; they might know whether there is any chance of
the hardware still being used anywhere.
Link: https://lore.kernel.org/linux-scsi/CAK8P3a1XfwkTOV7qOs1fTxf4vthNBRXKNu8A5V7…
Link: https://lore.kernel.org/r/20220624155226.2889613-3-arnd@kernel.org
Cc: Miquel van Smoorenburg <mikevs(a)xs4all.net>
Cc: Mark Salyzyn <salyzyn(a)android.com>
Cc: Hannes Reinecke <hare(a)suse.de>
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Zhong Jinghua <zhongjinghua(a)huawei.com>
Reviewed-by: Hou Tao <houtao1(a)huawei.com>
Reviewed-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
Documentation/ioctl/ioctl-number.txt | 2 +-
MAINTAINERS | 8 -
drivers/scsi/Kconfig | 11 -
drivers/scsi/Makefile | 1 -
drivers/scsi/dpt/dpti_i2o.h | 446 ----
drivers/scsi/dpt/dpti_ioctl.h | 139 -
drivers/scsi/dpt/dptsig.h | 336 ---
drivers/scsi/dpt/osd_defs.h | 79 -
drivers/scsi/dpt/osd_util.h | 358 ---
drivers/scsi/dpt/sys_info.h | 417 ---
drivers/scsi/dpt_i2o.c | 3616 --------------------------
drivers/scsi/dpti.h | 335 ---
12 files changed, 1 insertion(+), 5747 deletions(-)
delete mode 100644 drivers/scsi/dpt/dpti_i2o.h
delete mode 100644 drivers/scsi/dpt/dpti_ioctl.h
delete mode 100644 drivers/scsi/dpt/dptsig.h
delete mode 100644 drivers/scsi/dpt/osd_defs.h
delete mode 100644 drivers/scsi/dpt/osd_util.h
delete mode 100644 drivers/scsi/dpt/sys_info.h
delete mode 100644 drivers/scsi/dpt_i2o.c
delete mode 100644 drivers/scsi/dpti.h
diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt
index 516e6e201186..7e9f8cfda2ac 100644
--- a/Documentation/ioctl/ioctl-number.txt
+++ b/Documentation/ioctl/ioctl-number.txt
@@ -108,7 +108,7 @@ Code Seq#(hex) Include File Comments
'C' 01-2F linux/capi.h conflict!
'C' F0-FF drivers/net/wan/cosa.h conflict!
'D' all arch/s390/include/asm/dasd.h
-'D' 40-5F drivers/scsi/dpt/dtpi_ioctl.h
+'D' 40-5F drivers/scsi/dpt/dtpi_ioctl.h Dead since 2022
'D' 05 drivers/scsi/pmcraid.h
'E' all linux/input.h conflict!
'E' 00-0F xen/evtchn.h conflict!
diff --git a/MAINTAINERS b/MAINTAINERS
index d95d71e0ba4e..eb0cc9888618 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4566,14 +4566,6 @@ L: linux-kernel(a)vger.kernel.org
S: Maintained
F: drivers/staging/fsl-dpaa2/rtc
-DPT_I2O SCSI RAID DRIVER
-M: Adaptec OEM Raid Solutions <aacraid(a)microsemi.com>
-L: linux-scsi(a)vger.kernel.org
-W: http://www.adaptec.com/
-S: Maintained
-F: drivers/scsi/dpt*
-F: drivers/scsi/dpt/
-
DRBD DRIVER
M: Philipp Reisner <philipp.reisner(a)linbit.com>
M: Lars Ellenberg <lars.ellenberg(a)linbit.com>
diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index 63d2aaa22834..fd0e56baac0a 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -480,17 +480,6 @@ config SCSI_MVUMI
To compile this driver as a module, choose M here: the
module will be called mvumi.
-config SCSI_DPT_I2O
- tristate "Adaptec I2O RAID support "
- depends on SCSI && PCI && VIRT_TO_BUS
- help
- This driver supports all of Adaptec's I2O based RAID controllers as
- well as the DPT SmartRaid V cards. This is an Adaptec maintained
- driver by Deanna Bonds. See <file:Documentation/scsi/dpti.txt>.
-
- To compile this driver as a module, choose M here: the
- module will be called dpt_i2o.
-
config SCSI_ADVANSYS
tristate "AdvanSys SCSI support"
depends on SCSI
diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
index 4056cf26e09e..c77535918fe6 100644
--- a/drivers/scsi/Makefile
+++ b/drivers/scsi/Makefile
@@ -64,7 +64,6 @@ obj-$(CONFIG_BVME6000_SCSI) += 53c700.o bvme6000_scsi.o
obj-$(CONFIG_SCSI_SIM710) += 53c700.o sim710.o
obj-$(CONFIG_SCSI_ADVANSYS) += advansys.o
obj-$(CONFIG_SCSI_BUSLOGIC) += BusLogic.o
-obj-$(CONFIG_SCSI_DPT_I2O) += dpt_i2o.o
obj-$(CONFIG_SCSI_ARCMSR) += arcmsr/
obj-$(CONFIG_SCSI_AHA152X) += aha152x.o
obj-$(CONFIG_SCSI_AHA1542) += aha1542.o
diff --git a/drivers/scsi/dpt/dpti_i2o.h b/drivers/scsi/dpt/dpti_i2o.h
deleted file mode 100644
index 16fc380b5512..000000000000
--- a/drivers/scsi/dpt/dpti_i2o.h
+++ /dev/null
@@ -1,446 +0,0 @@
-#ifndef _SCSI_I2O_H
-#define _SCSI_I2O_H
-
-/* I2O kernel space accessible structures/APIs
- *
- * (c) Copyright 1999, 2000 Red Hat Software
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version
- * 2 of the License, or (at your option) any later version.
- *
- *************************************************************************
- *
- * This header file defined the I2O APIs/structures for use by
- * the I2O kernel modules.
- *
- */
-
-#ifdef __KERNEL__ /* This file to be included by kernel only */
-
-#include <linux/i2o-dev.h>
-
-#include <linux/notifier.h>
-#include <linux/atomic.h>
-
-
-/*
- * Tunable parameters first
- */
-
-/* How many different OSM's are we allowing */
-#define MAX_I2O_MODULES 64
-
-#define I2O_EVT_CAPABILITY_OTHER 0x01
-#define I2O_EVT_CAPABILITY_CHANGED 0x02
-
-#define I2O_EVT_SENSOR_STATE_CHANGED 0x01
-
-//#ifdef __KERNEL__ /* ioctl stuff only thing exported to users */
-
-#define I2O_MAX_MANAGERS 4
-
-/*
- * I2O Interface Objects
- */
-
-#include <linux/wait.h>
-typedef wait_queue_head_t adpt_wait_queue_head_t;
-#define ADPT_DECLARE_WAIT_QUEUE_HEAD(wait) DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wait)
-typedef wait_queue_entry_t adpt_wait_queue_entry_t;
-
-/*
- * message structures
- */
-
-struct i2o_message
-{
- u8 version_offset;
- u8 flags;
- u16 size;
- u32 target_tid:12;
- u32 init_tid:12;
- u32 function:8;
- u32 initiator_context;
- /* List follows */
-};
-
-struct adpt_device;
-struct _adpt_hba;
-struct i2o_device
-{
- struct i2o_device *next; /* Chain */
- struct i2o_device *prev;
-
- char dev_name[8]; /* linux /dev name if available */
- i2o_lct_entry lct_data;/* Device LCT information */
- u32 flags;
- struct proc_dir_entry* proc_entry; /* /proc dir */
- struct adpt_device *owner;
- struct _adpt_hba *controller; /* Controlling IOP */
-};
-
-/*
- * Each I2O controller has one of these objects
- */
-
-struct i2o_controller
-{
- char name[16];
- int unit;
- int type;
- int enabled;
-
- struct notifier_block *event_notifer; /* Events */
- atomic_t users;
- struct i2o_device *devices; /* I2O device chain */
- struct i2o_controller *next; /* Controller chain */
-
-};
-
-/*
- * I2O System table entry
- */
-struct i2o_sys_tbl_entry
-{
- u16 org_id;
- u16 reserved1;
- u32 iop_id:12;
- u32 reserved2:20;
- u16 seg_num:12;
- u16 i2o_version:4;
- u8 iop_state;
- u8 msg_type;
- u16 frame_size;
- u16 reserved3;
- u32 last_changed;
- u32 iop_capabilities;
- u32 inbound_low;
- u32 inbound_high;
-};
-
-struct i2o_sys_tbl
-{
- u8 num_entries;
- u8 version;
- u16 reserved1;
- u32 change_ind;
- u32 reserved2;
- u32 reserved3;
- struct i2o_sys_tbl_entry iops[0];
-};
-
-/*
- * I2O classes / subclasses
- */
-
-/* Class ID and Code Assignments
- * (LCT.ClassID.Version field)
- */
-#define I2O_CLASS_VERSION_10 0x00
-#define I2O_CLASS_VERSION_11 0x01
-
-/* Class code names
- * (from v1.5 Table 6-1 Class Code Assignments.)
- */
-
-#define I2O_CLASS_EXECUTIVE 0x000
-#define I2O_CLASS_DDM 0x001
-#define I2O_CLASS_RANDOM_BLOCK_STORAGE 0x010
-#define I2O_CLASS_SEQUENTIAL_STORAGE 0x011
-#define I2O_CLASS_LAN 0x020
-#define I2O_CLASS_WAN 0x030
-#define I2O_CLASS_FIBRE_CHANNEL_PORT 0x040
-#define I2O_CLASS_FIBRE_CHANNEL_PERIPHERAL 0x041
-#define I2O_CLASS_SCSI_PERIPHERAL 0x051
-#define I2O_CLASS_ATE_PORT 0x060
-#define I2O_CLASS_ATE_PERIPHERAL 0x061
-#define I2O_CLASS_FLOPPY_CONTROLLER 0x070
-#define I2O_CLASS_FLOPPY_DEVICE 0x071
-#define I2O_CLASS_BUS_ADAPTER_PORT 0x080
-#define I2O_CLASS_PEER_TRANSPORT_AGENT 0x090
-#define I2O_CLASS_PEER_TRANSPORT 0x091
-
-/* Rest of 0x092 - 0x09f reserved for peer-to-peer classes
- */
-
-#define I2O_CLASS_MATCH_ANYCLASS 0xffffffff
-
-/* Subclasses
- */
-
-#define I2O_SUBCLASS_i960 0x001
-#define I2O_SUBCLASS_HDM 0x020
-#define I2O_SUBCLASS_ISM 0x021
-
-/* Operation functions */
-
-#define I2O_PARAMS_FIELD_GET 0x0001
-#define I2O_PARAMS_LIST_GET 0x0002
-#define I2O_PARAMS_MORE_GET 0x0003
-#define I2O_PARAMS_SIZE_GET 0x0004
-#define I2O_PARAMS_TABLE_GET 0x0005
-#define I2O_PARAMS_FIELD_SET 0x0006
-#define I2O_PARAMS_LIST_SET 0x0007
-#define I2O_PARAMS_ROW_ADD 0x0008
-#define I2O_PARAMS_ROW_DELETE 0x0009
-#define I2O_PARAMS_TABLE_CLEAR 0x000A
-
-/*
- * I2O serial number conventions / formats
- * (circa v1.5)
- */
-
-#define I2O_SNFORMAT_UNKNOWN 0
-#define I2O_SNFORMAT_BINARY 1
-#define I2O_SNFORMAT_ASCII 2
-#define I2O_SNFORMAT_UNICODE 3
-#define I2O_SNFORMAT_LAN48_MAC 4
-#define I2O_SNFORMAT_WAN 5
-
-/* Plus new in v2.0 (Yellowstone pdf doc)
- */
-
-#define I2O_SNFORMAT_LAN64_MAC 6
-#define I2O_SNFORMAT_DDM 7
-#define I2O_SNFORMAT_IEEE_REG64 8
-#define I2O_SNFORMAT_IEEE_REG128 9
-#define I2O_SNFORMAT_UNKNOWN2 0xff
-
-/* Transaction Reply Lists (TRL) Control Word structure */
-
-#define TRL_SINGLE_FIXED_LENGTH 0x00
-#define TRL_SINGLE_VARIABLE_LENGTH 0x40
-#define TRL_MULTIPLE_FIXED_LENGTH 0x80
-
-/*
- * Messaging API values
- */
-
-#define I2O_CMD_ADAPTER_ASSIGN 0xB3
-#define I2O_CMD_ADAPTER_READ 0xB2
-#define I2O_CMD_ADAPTER_RELEASE 0xB5
-#define I2O_CMD_BIOS_INFO_SET 0xA5
-#define I2O_CMD_BOOT_DEVICE_SET 0xA7
-#define I2O_CMD_CONFIG_VALIDATE 0xBB
-#define I2O_CMD_CONN_SETUP 0xCA
-#define I2O_CMD_DDM_DESTROY 0xB1
-#define I2O_CMD_DDM_ENABLE 0xD5
-#define I2O_CMD_DDM_QUIESCE 0xC7
-#define I2O_CMD_DDM_RESET 0xD9
-#define I2O_CMD_DDM_SUSPEND 0xAF
-#define I2O_CMD_DEVICE_ASSIGN 0xB7
-#define I2O_CMD_DEVICE_RELEASE 0xB9
-#define I2O_CMD_HRT_GET 0xA8
-#define I2O_CMD_ADAPTER_CLEAR 0xBE
-#define I2O_CMD_ADAPTER_CONNECT 0xC9
-#define I2O_CMD_ADAPTER_RESET 0xBD
-#define I2O_CMD_LCT_NOTIFY 0xA2
-#define I2O_CMD_OUTBOUND_INIT 0xA1
-#define I2O_CMD_PATH_ENABLE 0xD3
-#define I2O_CMD_PATH_QUIESCE 0xC5
-#define I2O_CMD_PATH_RESET 0xD7
-#define I2O_CMD_STATIC_MF_CREATE 0xDD
-#define I2O_CMD_STATIC_MF_RELEASE 0xDF
-#define I2O_CMD_STATUS_GET 0xA0
-#define I2O_CMD_SW_DOWNLOAD 0xA9
-#define I2O_CMD_SW_UPLOAD 0xAB
-#define I2O_CMD_SW_REMOVE 0xAD
-#define I2O_CMD_SYS_ENABLE 0xD1
-#define I2O_CMD_SYS_MODIFY 0xC1
-#define I2O_CMD_SYS_QUIESCE 0xC3
-#define I2O_CMD_SYS_TAB_SET 0xA3
-
-#define I2O_CMD_UTIL_NOP 0x00
-#define I2O_CMD_UTIL_ABORT 0x01
-#define I2O_CMD_UTIL_CLAIM 0x09
-#define I2O_CMD_UTIL_RELEASE 0x0B
-#define I2O_CMD_UTIL_PARAMS_GET 0x06
-#define I2O_CMD_UTIL_PARAMS_SET 0x05
-#define I2O_CMD_UTIL_EVT_REGISTER 0x13
-#define I2O_CMD_UTIL_EVT_ACK 0x14
-#define I2O_CMD_UTIL_CONFIG_DIALOG 0x10
-#define I2O_CMD_UTIL_DEVICE_RESERVE 0x0D
-#define I2O_CMD_UTIL_DEVICE_RELEASE 0x0F
-#define I2O_CMD_UTIL_LOCK 0x17
-#define I2O_CMD_UTIL_LOCK_RELEASE 0x19
-#define I2O_CMD_UTIL_REPLY_FAULT_NOTIFY 0x15
-
-#define I2O_CMD_SCSI_EXEC 0x81
-#define I2O_CMD_SCSI_ABORT 0x83
-#define I2O_CMD_SCSI_BUSRESET 0x27
-
-#define I2O_CMD_BLOCK_READ 0x30
-#define I2O_CMD_BLOCK_WRITE 0x31
-#define I2O_CMD_BLOCK_CFLUSH 0x37
-#define I2O_CMD_BLOCK_MLOCK 0x49
-#define I2O_CMD_BLOCK_MUNLOCK 0x4B
-#define I2O_CMD_BLOCK_MMOUNT 0x41
-#define I2O_CMD_BLOCK_MEJECT 0x43
-
-#define I2O_PRIVATE_MSG 0xFF
-
-/*
- * Init Outbound Q status
- */
-
-#define I2O_CMD_OUTBOUND_INIT_IN_PROGRESS 0x01
-#define I2O_CMD_OUTBOUND_INIT_REJECTED 0x02
-#define I2O_CMD_OUTBOUND_INIT_FAILED 0x03
-#define I2O_CMD_OUTBOUND_INIT_COMPLETE 0x04
-
-/*
- * I2O Get Status State values
- */
-
-#define ADAPTER_STATE_INITIALIZING 0x01
-#define ADAPTER_STATE_RESET 0x02
-#define ADAPTER_STATE_HOLD 0x04
-#define ADAPTER_STATE_READY 0x05
-#define ADAPTER_STATE_OPERATIONAL 0x08
-#define ADAPTER_STATE_FAILED 0x10
-#define ADAPTER_STATE_FAULTED 0x11
-
-/* I2O API function return values */
-
-#define I2O_RTN_NO_ERROR 0
-#define I2O_RTN_NOT_INIT 1
-#define I2O_RTN_FREE_Q_EMPTY 2
-#define I2O_RTN_TCB_ERROR 3
-#define I2O_RTN_TRANSACTION_ERROR 4
-#define I2O_RTN_ADAPTER_ALREADY_INIT 5
-#define I2O_RTN_MALLOC_ERROR 6
-#define I2O_RTN_ADPTR_NOT_REGISTERED 7
-#define I2O_RTN_MSG_REPLY_TIMEOUT 8
-#define I2O_RTN_NO_STATUS 9
-#define I2O_RTN_NO_FIRM_VER 10
-#define I2O_RTN_NO_LINK_SPEED 11
-
-/* Reply message status defines for all messages */
-
-#define I2O_REPLY_STATUS_SUCCESS 0x00
-#define I2O_REPLY_STATUS_ABORT_DIRTY 0x01
-#define I2O_REPLY_STATUS_ABORT_NO_DATA_TRANSFER 0x02
-#define I2O_REPLY_STATUS_ABORT_PARTIAL_TRANSFER 0x03
-#define I2O_REPLY_STATUS_ERROR_DIRTY 0x04
-#define I2O_REPLY_STATUS_ERROR_NO_DATA_TRANSFER 0x05
-#define I2O_REPLY_STATUS_ERROR_PARTIAL_TRANSFER 0x06
-#define I2O_REPLY_STATUS_PROCESS_ABORT_DIRTY 0x08
-#define I2O_REPLY_STATUS_PROCESS_ABORT_NO_DATA_TRANSFER 0x09
-#define I2O_REPLY_STATUS_PROCESS_ABORT_PARTIAL_TRANSFER 0x0A
-#define I2O_REPLY_STATUS_TRANSACTION_ERROR 0x0B
-#define I2O_REPLY_STATUS_PROGRESS_REPORT 0x80
-
-/* Status codes and Error Information for Parameter functions */
-
-#define I2O_PARAMS_STATUS_SUCCESS 0x00
-#define I2O_PARAMS_STATUS_BAD_KEY_ABORT 0x01
-#define I2O_PARAMS_STATUS_BAD_KEY_CONTINUE 0x02
-#define I2O_PARAMS_STATUS_BUFFER_FULL 0x03
-#define I2O_PARAMS_STATUS_BUFFER_TOO_SMALL 0x04
-#define I2O_PARAMS_STATUS_FIELD_UNREADABLE 0x05
-#define I2O_PARAMS_STATUS_FIELD_UNWRITEABLE 0x06
-#define I2O_PARAMS_STATUS_INSUFFICIENT_FIELDS 0x07
-#define I2O_PARAMS_STATUS_INVALID_GROUP_ID 0x08
-#define I2O_PARAMS_STATUS_INVALID_OPERATION 0x09
-#define I2O_PARAMS_STATUS_NO_KEY_FIELD 0x0A
-#define I2O_PARAMS_STATUS_NO_SUCH_FIELD 0x0B
-#define I2O_PARAMS_STATUS_NON_DYNAMIC_GROUP 0x0C
-#define I2O_PARAMS_STATUS_OPERATION_ERROR 0x0D
-#define I2O_PARAMS_STATUS_SCALAR_ERROR 0x0E
-#define I2O_PARAMS_STATUS_TABLE_ERROR 0x0F
-#define I2O_PARAMS_STATUS_WRONG_GROUP_TYPE 0x10
-
-/* DetailedStatusCode defines for Executive, DDM, Util and Transaction error
- * messages: Table 3-2 Detailed Status Codes.*/
-
-#define I2O_DSC_SUCCESS 0x0000
-#define I2O_DSC_BAD_KEY 0x0002
-#define I2O_DSC_TCL_ERROR 0x0003
-#define I2O_DSC_REPLY_BUFFER_FULL 0x0004
-#define I2O_DSC_NO_SUCH_PAGE 0x0005
-#define I2O_DSC_INSUFFICIENT_RESOURCE_SOFT 0x0006
-#define I2O_DSC_INSUFFICIENT_RESOURCE_HARD 0x0007
-#define I2O_DSC_CHAIN_BUFFER_TOO_LARGE 0x0009
-#define I2O_DSC_UNSUPPORTED_FUNCTION 0x000A
-#define I2O_DSC_DEVICE_LOCKED 0x000B
-#define I2O_DSC_DEVICE_RESET 0x000C
-#define I2O_DSC_INAPPROPRIATE_FUNCTION 0x000D
-#define I2O_DSC_INVALID_INITIATOR_ADDRESS 0x000E
-#define I2O_DSC_INVALID_MESSAGE_FLAGS 0x000F
-#define I2O_DSC_INVALID_OFFSET 0x0010
-#define I2O_DSC_INVALID_PARAMETER 0x0011
-#define I2O_DSC_INVALID_REQUEST 0x0012
-#define I2O_DSC_INVALID_TARGET_ADDRESS 0x0013
-#define I2O_DSC_MESSAGE_TOO_LARGE 0x0014
-#define I2O_DSC_MESSAGE_TOO_SMALL 0x0015
-#define I2O_DSC_MISSING_PARAMETER 0x0016
-#define I2O_DSC_TIMEOUT 0x0017
-#define I2O_DSC_UNKNOWN_ERROR 0x0018
-#define I2O_DSC_UNKNOWN_FUNCTION 0x0019
-#define I2O_DSC_UNSUPPORTED_VERSION 0x001A
-#define I2O_DSC_DEVICE_BUSY 0x001B
-#define I2O_DSC_DEVICE_NOT_AVAILABLE 0x001C
-
-/* Device Claim Types */
-#define I2O_CLAIM_PRIMARY 0x01000000
-#define I2O_CLAIM_MANAGEMENT 0x02000000
-#define I2O_CLAIM_AUTHORIZED 0x03000000
-#define I2O_CLAIM_SECONDARY 0x04000000
-
-/* Message header defines for VersionOffset */
-#define I2OVER15 0x0001
-#define I2OVER20 0x0002
-/* Default is 1.5, FIXME: Need support for both 1.5 and 2.0 */
-#define I2OVERSION I2OVER15
-#define SGL_OFFSET_0 I2OVERSION
-#define SGL_OFFSET_4 (0x0040 | I2OVERSION)
-#define SGL_OFFSET_5 (0x0050 | I2OVERSION)
-#define SGL_OFFSET_6 (0x0060 | I2OVERSION)
-#define SGL_OFFSET_7 (0x0070 | I2OVERSION)
-#define SGL_OFFSET_8 (0x0080 | I2OVERSION)
-#define SGL_OFFSET_9 (0x0090 | I2OVERSION)
-#define SGL_OFFSET_10 (0x00A0 | I2OVERSION)
-#define SGL_OFFSET_12 (0x00C0 | I2OVERSION)
-
-#define TRL_OFFSET_5 (0x0050 | I2OVERSION)
-#define TRL_OFFSET_6 (0x0060 | I2OVERSION)
-
- /* msg header defines for MsgFlags */
-#define MSG_STATIC 0x0100
-#define MSG_64BIT_CNTXT 0x0200
-#define MSG_MULTI_TRANS 0x1000
-#define MSG_FAIL 0x2000
-#define MSG_LAST 0x4000
-#define MSG_REPLY 0x8000
-
- /* minimum size msg */
-#define THREE_WORD_MSG_SIZE 0x00030000
-#define FOUR_WORD_MSG_SIZE 0x00040000
-#define FIVE_WORD_MSG_SIZE 0x00050000
-#define SIX_WORD_MSG_SIZE 0x00060000
-#define SEVEN_WORD_MSG_SIZE 0x00070000
-#define EIGHT_WORD_MSG_SIZE 0x00080000
-#define NINE_WORD_MSG_SIZE 0x00090000
-#define TEN_WORD_MSG_SIZE 0x000A0000
-#define I2O_MESSAGE_SIZE(x) ((x)<<16)
-
-
-/* Special TID Assignments */
-
-#define ADAPTER_TID 0
-#define HOST_TID 1
-
-#define MSG_FRAME_SIZE 128
-#define NMBR_MSG_FRAMES 128
-
-#define MSG_POOL_SIZE 16384
-
-#define I2O_POST_WAIT_OK 0
-#define I2O_POST_WAIT_TIMEOUT -ETIMEDOUT
-
-
-#endif /* __KERNEL__ */
-
-#endif /* _SCSI_I2O_H */
diff --git a/drivers/scsi/dpt/dpti_ioctl.h b/drivers/scsi/dpt/dpti_ioctl.h
deleted file mode 100644
index f60236721e0d..000000000000
--- a/drivers/scsi/dpt/dpti_ioctl.h
+++ /dev/null
@@ -1,139 +0,0 @@
-/***************************************************************************
- dpti_ioctl.h - description
- -------------------
- begin : Thu Sep 7 2000
- copyright : (C) 2001 by Adaptec
-
- See Documentation/scsi/dpti.txt for history, notes, license info
- and credits
- ***************************************************************************/
-
-/***************************************************************************
- * *
- * This program is free software; you can redistribute it and/or modify *
- * it under the terms of the GNU General Public License as published by *
- * the Free Software Foundation; either version 2 of the License, or *
- * (at your option) any later version. *
- * *
- ***************************************************************************/
-
-/***************************************************************************
- * This file is generated from osd_unix.h *
- * *************************************************************************/
-
-#ifndef _dpti_ioctl_h
-#define _dpti_ioctl_h
-
-// IOCTL interface commands
-
-#ifndef _IOWR
-# define _IOWR(x,y,z) (((x)<<8)|y)
-#endif
-#ifndef _IOW
-# define _IOW(x,y,z) (((x)<<8)|y)
-#endif
-#ifndef _IOR
-# define _IOR(x,y,z) (((x)<<8)|y)
-#endif
-#ifndef _IO
-# define _IO(x,y) (((x)<<8)|y)
-#endif
-/* EATA PassThrough Command */
-#define EATAUSRCMD _IOWR('D',65,EATA_CP)
-/* Set Debug Level If Enabled */
-#define DPT_DEBUG _IOW('D',66,int)
-/* Get Signature Structure */
-#define DPT_SIGNATURE _IOR('D',67,dpt_sig_S)
-#if defined __bsdi__
-#define DPT_SIGNATURE_PACKED _IOR('D',67,dpt_sig_S_Packed)
-#endif
-/* Get Number Of DPT Adapters */
-#define DPT_NUMCTRLS _IOR('D',68,int)
-/* Get Adapter Info Structure */
-#define DPT_CTRLINFO _IOR('D',69,CtrlInfo)
-/* Get Statistics If Enabled */
-#define DPT_STATINFO _IO('D',70)
-/* Clear Stats If Enabled */
-#define DPT_CLRSTAT _IO('D',71)
-/* Get System Info Structure */
-#define DPT_SYSINFO _IOR('D',72,sysInfo_S)
-/* Set Timeout Value */
-#define DPT_TIMEOUT _IO('D',73)
-/* Get config Data */
-#define DPT_CONFIG _IO('D',74)
-/* Get Blink LED Code */
-#define DPT_BLINKLED _IOR('D',75,int)
-/* Get Statistical information (if available) */
-#define DPT_STATS_INFO _IOR('D',80,STATS_DATA)
-/* Clear the statistical information */
-#define DPT_STATS_CLEAR _IO('D',81)
-/* Get Performance metrics */
-#define DPT_PERF_INFO _IOR('D',82,dpt_perf_t)
-/* Send an I2O command */
-#define I2OUSRCMD _IO('D',76)
-/* Inform driver to re-acquire LCT information */
-#define I2ORESCANCMD _IO('D',77)
-/* Inform driver to reset adapter */
-#define I2ORESETCMD _IO('D',78)
-/* See if the target is mounted */
-#define DPT_TARGET_BUSY _IOR('D',79, TARGET_BUSY_T)
-
-
- /* Structure Returned From Get Controller Info */
-
-typedef struct {
- uCHAR state; /* Operational state */
- uCHAR id; /* Host adapter SCSI id */
- int vect; /* Interrupt vector number */
- int base; /* Base I/O address */
- int njobs; /* # of jobs sent to HA */
- int qdepth; /* Controller queue depth. */
- int wakebase; /* mpx wakeup base index. */
- uINT SGsize; /* Scatter/Gather list size. */
- unsigned heads; /* heads for drives on cntlr. */
- unsigned sectors; /* sectors for drives on cntlr. */
- uCHAR do_drive32; /* Flag for Above 16 MB Ability */
- uCHAR BusQuiet; /* SCSI Bus Quiet Flag */
- char idPAL[4]; /* 4 Bytes Of The ID Pal */
- uCHAR primary; /* 1 For Primary, 0 For Secondary */
- uCHAR eataVersion; /* EATA Version */
- uINT cpLength; /* EATA Command Packet Length */
- uINT spLength; /* EATA Status Packet Length */
- uCHAR drqNum; /* DRQ Index (0,5,6,7) */
- uCHAR flag1; /* EATA Flags 1 (Byte 9) */
- uCHAR flag2; /* EATA Flags 2 (Byte 30) */
-} CtrlInfo;
-
-typedef struct {
- uSHORT length; // Remaining length of this
- uSHORT drvrHBAnum; // Relative HBA # used by the driver
- uINT baseAddr; // Base I/O address
- uSHORT blinkState; // Blink LED state (0=Not in blink LED)
- uCHAR pciBusNum; // PCI Bus # (Optional)
- uCHAR pciDeviceNum; // PCI Device # (Optional)
- uSHORT hbaFlags; // Miscellaneous HBA flags
- uSHORT Interrupt; // Interrupt set for this device.
-# if (defined(_DPT_ARC))
- uINT baseLength;
- ADAPTER_OBJECT *AdapterObject;
- LARGE_INTEGER DmaLogicalAddress;
- PVOID DmaVirtualAddress;
- LARGE_INTEGER ReplyLogicalAddress;
- PVOID ReplyVirtualAddress;
-# else
- uINT reserved1; // Reserved for future expansion
- uINT reserved2; // Reserved for future expansion
- uINT reserved3; // Reserved for future expansion
-# endif
-} drvrHBAinfo_S;
-
-typedef struct TARGET_BUSY
-{
- uLONG channel;
- uLONG id;
- uLONG lun;
- uLONG isBusy;
-} TARGET_BUSY_T;
-
-#endif
-
diff --git a/drivers/scsi/dpt/dptsig.h b/drivers/scsi/dpt/dptsig.h
deleted file mode 100644
index a6644b332b53..000000000000
--- a/drivers/scsi/dpt/dptsig.h
+++ /dev/null
@@ -1,336 +0,0 @@
-/* BSDI dptsig.h,v 1.7 1998/06/03 19:15:00 karels Exp */
-
-/*
- * Copyright (c) 1996-1999 Distributed Processing Technology Corporation
- * All rights reserved.
- *
- * Redistribution and use in source form, with or without modification, are
- * permitted provided that redistributions of source code must retain the
- * above copyright notice, this list of conditions and the following disclaimer.
- *
- * This software is provided `as is' by Distributed Processing Technology and
- * any express or implied warranties, including, but not limited to, the
- * implied warranties of merchantability and fitness for a particular purpose,
- * are disclaimed. In no event shall Distributed Processing Technology be
- * liable for any direct, indirect, incidental, special, exemplary or
- * consequential damages (including, but not limited to, procurement of
- * substitute goods or services; loss of use, data, or profits; or business
- * interruptions) however caused and on any theory of liability, whether in
- * contract, strict liability, or tort (including negligence or otherwise)
- * arising in any way out of the use of this driver software, even if advised
- * of the possibility of such damage.
- *
- */
-
-#ifndef __DPTSIG_H_
-#define __DPTSIG_H_
-#ifdef _SINIX_ADDON
-#include "dpt.h"
-#endif
-/* DPT SIGNATURE SPEC AND HEADER FILE */
-/* Signature Version 1 (sorry no 'A') */
-
-/* to make sure we are talking the same size under all OS's */
-typedef unsigned char sigBYTE;
-typedef unsigned short sigWORD;
-typedef unsigned int sigINT;
-
-/*
- * use sigWORDLittleEndian for:
- * dsCapabilities
- * dsDeviceSupp
- * dsAdapterSupp
- * dsApplication
- * use sigLONGLittleEndian for:
- * dsOS
- * so that the sig can be standardised to Little Endian
- */
-#if (defined(_DPT_BIG_ENDIAN))
-# define sigWORDLittleEndian(x) ((((x)&0xFF)<<8)|(((x)>>8)&0xFF))
-# define sigLONGLittleEndian(x) \
- ((((x)&0xFF)<<24) | \
- (((x)&0xFF00)<<8) | \
- (((x)&0xFF0000L)>>8) | \
- (((x)&0xFF000000L)>>24))
-#else
-# define sigWORDLittleEndian(x) (x)
-# define sigLONGLittleEndian(x) (x)
-#endif
-
-/* must make sure the structure is not word or double-word aligned */
-/* --------------------------------------------------------------- */
-/* Borland will ignore the following pragma: */
-/* Word alignment is OFF by default. If in the, IDE make */
-/* sure that Options | Compiler | Code Generation | Word Alignment */
-/* is not checked. If using BCC, do not use the -a option. */
-
-#ifndef NO_PACK
-#if defined (_DPT_AIX)
-#pragma options align=packed
-#else
-#pragma pack(1)
-#endif /* aix */
-#endif
-/* For the Macintosh */
-#ifdef STRUCTALIGNMENTSUPPORTED
-#pragma options align=mac68k
-#endif
-
-
-/* Current Signature Version - sigBYTE dsSigVersion; */
-/* ------------------------------------------------------------------ */
-#define SIG_VERSION 1
-
-/* Processor Family - sigBYTE dsProcessorFamily; DISTINCT VALUES */
-/* ------------------------------------------------------------------ */
-/* What type of processor the file is meant to run on. */
-/* This will let us know whether to read sigWORDs as high/low or low/high. */
-#define PROC_INTEL 0x00 /* Intel 80x86/ia64 */
-#define PROC_MOTOROLA 0x01 /* Motorola 68K */
-#define PROC_MIPS4000 0x02 /* MIPS RISC 4000 */
-#define PROC_ALPHA 0x03 /* DEC Alpha */
-#define PROC_POWERPC 0x04 /* IBM Power PC */
-#define PROC_i960 0x05 /* Intel i960 */
-#define PROC_ULTRASPARC 0x06 /* SPARC processor */
-
-/* Specific Minimim Processor - sigBYTE dsProcessor; FLAG BITS */
-/* ------------------------------------------------------------------ */
-/* Different bit definitions dependent on processor_family */
-
-/* PROC_INTEL: */
-#define PROC_8086 0x01 /* Intel 8086 */
-#define PROC_286 0x02 /* Intel 80286 */
-#define PROC_386 0x04 /* Intel 80386 */
-#define PROC_486 0x08 /* Intel 80486 */
-#define PROC_PENTIUM 0x10 /* Intel 586 aka P5 aka Pentium */
-#define PROC_SEXIUM 0x20 /* Intel 686 aka P6 aka Pentium Pro or MMX */
-#define PROC_IA64 0x40 /* Intel IA64 processor */
-
-/* PROC_i960: */
-#define PROC_960RX 0x01 /* Intel 80960RC/RD */
-#define PROC_960HX 0x02 /* Intel 80960HA/HD/HT */
-
-/* PROC_MOTOROLA: */
-#define PROC_68000 0x01 /* Motorola 68000 */
-#define PROC_68010 0x02 /* Motorola 68010 */
-#define PROC_68020 0x04 /* Motorola 68020 */
-#define PROC_68030 0x08 /* Motorola 68030 */
-#define PROC_68040 0x10 /* Motorola 68040 */
-
-/* PROC_POWERPC */
-#define PROC_PPC601 0x01 /* PowerPC 601 */
-#define PROC_PPC603 0x02 /* PowerPC 603 */
-#define PROC_PPC604 0x04 /* PowerPC 604 */
-
-/* PROC_MIPS4000: */
-#define PROC_R4000 0x01 /* MIPS R4000 */
-
-/* Filetype - sigBYTE dsFiletype; DISTINCT VALUES */
-/* ------------------------------------------------------------------ */
-#define FT_EXECUTABLE 0 /* Executable Program */
-#define FT_SCRIPT 1 /* Script/Batch File??? */
-#define FT_HBADRVR 2 /* HBA Driver */
-#define FT_OTHERDRVR 3 /* Other Driver */
-#define FT_IFS 4 /* Installable Filesystem Driver */
-#define FT_ENGINE 5 /* DPT Engine */
-#define FT_COMPDRVR 6 /* Compressed Driver Disk */
-#define FT_LANGUAGE 7 /* Foreign Language file */
-#define FT_FIRMWARE 8 /* Downloadable or actual Firmware */
-#define FT_COMMMODL 9 /* Communications Module */
-#define FT_INT13 10 /* INT 13 style HBA Driver */
-#define FT_HELPFILE 11 /* Help file */
-#define FT_LOGGER 12 /* Event Logger */
-#define FT_INSTALL 13 /* An Install Program */
-#define FT_LIBRARY 14 /* Storage Manager Real-Mode Calls */
-#define FT_RESOURCE 15 /* Storage Manager Resource File */
-#define FT_MODEM_DB 16 /* Storage Manager Modem Database */
-
-/* Filetype flags - sigBYTE dsFiletypeFlags; FLAG BITS */
-/* ------------------------------------------------------------------ */
-#define FTF_DLL 0x01 /* Dynamic Link Library */
-#define FTF_NLM 0x02 /* Netware Loadable Module */
-#define FTF_OVERLAYS 0x04 /* Uses overlays */
-#define FTF_DEBUG 0x08 /* Debug version */
-#define FTF_TSR 0x10 /* TSR */
-#define FTF_SYS 0x20 /* DOS Loadable driver */
-#define FTF_PROTECTED 0x40 /* Runs in protected mode */
-#define FTF_APP_SPEC 0x80 /* Application Specific */
-#define FTF_ROM (FTF_SYS|FTF_TSR) /* Special Case */
-
-/* OEM - sigBYTE dsOEM; DISTINCT VALUES */
-/* ------------------------------------------------------------------ */
-#define OEM_DPT 0 /* DPT */
-#define OEM_ATT 1 /* ATT */
-#define OEM_NEC 2 /* NEC */
-#define OEM_ALPHA 3 /* Alphatronix */
-#define OEM_AST 4 /* AST */
-#define OEM_OLIVETTI 5 /* Olivetti */
-#define OEM_SNI 6 /* Siemens/Nixdorf */
-#define OEM_SUN 7 /* SUN Microsystems */
-
-/* Operating System - sigLONG dsOS; FLAG BITS */
-/* ------------------------------------------------------------------ */
-#define OS_DOS 0x00000001 /* PC/MS-DOS */
-#define OS_WINDOWS 0x00000002 /* Microsoft Windows 3.x */
-#define OS_WINDOWS_NT 0x00000004 /* Microsoft Windows NT */
-#define OS_OS2M 0x00000008 /* OS/2 1.2.x,MS 1.3.0,IBM 1.3.x - Monolithic */
-#define OS_OS2L 0x00000010 /* Microsoft OS/2 1.301 - LADDR */
-#define OS_OS22x 0x00000020 /* IBM OS/2 2.x */
-#define OS_NW286 0x00000040 /* Novell NetWare 286 */
-#define OS_NW386 0x00000080 /* Novell NetWare 386 */
-#define OS_GEN_UNIX 0x00000100 /* Generic Unix */
-#define OS_SCO_UNIX 0x00000200 /* SCO Unix */
-#define OS_ATT_UNIX 0x00000400 /* ATT Unix */
-#define OS_UNIXWARE 0x00000800 /* USL Unix */
-#define OS_INT_UNIX 0x00001000 /* Interactive Unix */
-#define OS_SOLARIS 0x00002000 /* SunSoft Solaris */
-#define OS_QNX 0x00004000 /* QNX for Tom Moch */
-#define OS_NEXTSTEP 0x00008000 /* NeXTSTEP/OPENSTEP/MACH */
-#define OS_BANYAN 0x00010000 /* Banyan Vines */
-#define OS_OLIVETTI_UNIX 0x00020000/* Olivetti Unix */
-#define OS_MAC_OS 0x00040000 /* Mac OS */
-#define OS_WINDOWS_95 0x00080000 /* Microsoft Windows '95 */
-#define OS_NW4x 0x00100000 /* Novell Netware 4.x */
-#define OS_BSDI_UNIX 0x00200000 /* BSDi Unix BSD/OS 2.0 and up */
-#define OS_AIX_UNIX 0x00400000 /* AIX Unix */
-#define OS_FREE_BSD 0x00800000 /* FreeBSD Unix */
-#define OS_LINUX 0x01000000 /* Linux */
-#define OS_DGUX_UNIX 0x02000000 /* Data General Unix */
-#define OS_SINIX_N 0x04000000 /* SNI SINIX-N */
-#define OS_PLAN9 0x08000000 /* ATT Plan 9 */
-#define OS_TSX 0x10000000 /* SNH TSX-32 */
-
-#define OS_OTHER 0x80000000 /* Other */
-
-/* Capabilities - sigWORD dsCapabilities; FLAG BITS */
-/* ------------------------------------------------------------------ */
-#define CAP_RAID0 0x0001 /* RAID-0 */
-#define CAP_RAID1 0x0002 /* RAID-1 */
-#define CAP_RAID3 0x0004 /* RAID-3 */
-#define CAP_RAID5 0x0008 /* RAID-5 */
-#define CAP_SPAN 0x0010 /* Spanning */
-#define CAP_PASS 0x0020 /* Provides passthrough */
-#define CAP_OVERLAP 0x0040 /* Passthrough supports overlapped commands */
-#define CAP_ASPI 0x0080 /* Supports ASPI Command Requests */
-#define CAP_ABOVE16MB 0x0100 /* ISA Driver supports greater than 16MB */
-#define CAP_EXTEND 0x8000 /* Extended info appears after description */
-#ifdef SNI_MIPS
-#define CAP_CACHEMODE 0x1000 /* dpt_force_cache is set in driver */
-#endif
-
-/* Devices Supported - sigWORD dsDeviceSupp; FLAG BITS */
-/* ------------------------------------------------------------------ */
-#define DEV_DASD 0x0001 /* DASD (hard drives) */
-#define DEV_TAPE 0x0002 /* Tape drives */
-#define DEV_PRINTER 0x0004 /* Printers */
-#define DEV_PROC 0x0008 /* Processors */
-#define DEV_WORM 0x0010 /* WORM drives */
-#define DEV_CDROM 0x0020 /* CD-ROM drives */
-#define DEV_SCANNER 0x0040 /* Scanners */
-#define DEV_OPTICAL 0x0080 /* Optical Drives */
-#define DEV_JUKEBOX 0x0100 /* Jukebox */
-#define DEV_COMM 0x0200 /* Communications Devices */
-#define DEV_OTHER 0x0400 /* Other Devices */
-#define DEV_ALL 0xFFFF /* All SCSI Devices */
-
-/* Adapters Families Supported - sigWORD dsAdapterSupp; FLAG BITS */
-/* ------------------------------------------------------------------ */
-#define ADF_2001 0x0001 /* PM2001 */
-#define ADF_2012A 0x0002 /* PM2012A */
-#define ADF_PLUS_ISA 0x0004 /* PM2011,PM2021 */
-#define ADF_PLUS_EISA 0x0008 /* PM2012B,PM2022 */
-#define ADF_SC3_ISA 0x0010 /* PM2021 */
-#define ADF_SC3_EISA 0x0020 /* PM2022,PM2122, etc */
-#define ADF_SC3_PCI 0x0040 /* SmartCache III PCI */
-#define ADF_SC4_ISA 0x0080 /* SmartCache IV ISA */
-#define ADF_SC4_EISA 0x0100 /* SmartCache IV EISA */
-#define ADF_SC4_PCI 0x0200 /* SmartCache IV PCI */
-#define ADF_SC5_PCI 0x0400 /* Fifth Generation I2O products */
-/*
- * Combinations of products
- */
-#define ADF_ALL_2000 (ADF_2001|ADF_2012A)
-#define ADF_ALL_PLUS (ADF_PLUS_ISA|ADF_PLUS_EISA)
-#define ADF_ALL_SC3 (ADF_SC3_ISA|ADF_SC3_EISA|ADF_SC3_PCI)
-#define ADF_ALL_SC4 (ADF_SC4_ISA|ADF_SC4_EISA|ADF_SC4_PCI)
-#define ADF_ALL_SC5 (ADF_SC5_PCI)
-/* All EATA Cacheing Products */
-#define ADF_ALL_CACHE (ADF_ALL_PLUS|ADF_ALL_SC3|ADF_ALL_SC4)
-/* All EATA Bus Mastering Products */
-#define ADF_ALL_MASTER (ADF_2012A|ADF_ALL_CACHE)
-/* All EATA Adapter Products */
-#define ADF_ALL_EATA (ADF_2001|ADF_ALL_MASTER)
-#define ADF_ALL ADF_ALL_EATA
-
-/* Application - sigWORD dsApplication; FLAG BITS */
-/* ------------------------------------------------------------------ */
-#define APP_DPTMGR 0x0001 /* DPT Storage Manager */
-#define APP_ENGINE 0x0002 /* DPT Engine */
-#define APP_SYTOS 0x0004 /* Sytron Sytos Plus */
-#define APP_CHEYENNE 0x0008 /* Cheyenne ARCServe + ARCSolo */
-#define APP_MSCDEX 0x0010 /* Microsoft CD-ROM extensions */
-#define APP_NOVABACK 0x0020 /* NovaStor Novaback */
-#define APP_AIM 0x0040 /* Archive Information Manager */
-
-/* Requirements - sigBYTE dsRequirements; FLAG BITS */
-/* ------------------------------------------------------------------ */
-#define REQ_SMARTROM 0x01 /* Requires SmartROM to be present */
-#define REQ_DPTDDL 0x02 /* Requires DPTDDL.SYS to be loaded */
-#define REQ_HBA_DRIVER 0x04 /* Requires an HBA driver to be loaded */
-#define REQ_ASPI_TRAN 0x08 /* Requires an ASPI Transport Modules */
-#define REQ_ENGINE 0x10 /* Requires a DPT Engine to be loaded */
-#define REQ_COMM_ENG 0x20 /* Requires a DPT Communications Engine */
-
-/*
- * You may adjust dsDescription_size with an override to a value less than
- * 50 so that the structure allocates less real space.
- */
-#if (!defined(dsDescription_size))
-# define dsDescription_size 50
-#endif
-
-typedef struct dpt_sig {
- char dsSignature[6]; /* ALWAYS "dPtSiG" */
- sigBYTE dsSigVersion; /* signature version (currently 1) */
- sigBYTE dsProcessorFamily; /* what type of processor */
- sigBYTE dsProcessor; /* precise processor */
- sigBYTE dsFiletype; /* type of file */
- sigBYTE dsFiletypeFlags; /* flags to specify load type, etc. */
- sigBYTE dsOEM; /* OEM file was created for */
- sigINT dsOS; /* which Operating systems */
- sigWORD dsCapabilities; /* RAID levels, etc. */
- sigWORD dsDeviceSupp; /* Types of SCSI devices supported */
- sigWORD dsAdapterSupp; /* DPT adapter families supported */
- sigWORD dsApplication; /* applications file is for */
- sigBYTE dsRequirements; /* Other driver dependencies */
- sigBYTE dsVersion; /* 1 */
- sigBYTE dsRevision; /* 'J' */
- sigBYTE dsSubRevision; /* '9' ' ' if N/A */
- sigBYTE dsMonth; /* creation month */
- sigBYTE dsDay; /* creation day */
- sigBYTE dsYear; /* creation year since 1980 (1993=13) */
- /* description (NULL terminated) */
- char dsDescription[dsDescription_size];
-} dpt_sig_S;
-/* 32 bytes minimum - with no description. Put NULL at description[0] */
-/* 81 bytes maximum - with 49 character description plus NULL. */
-
-/* This line added at Roycroft's request */
-/* Microsoft's NT compiler gets confused if you do a pack and don't */
-/* restore it. */
-
-#ifndef NO_UNPACK
-#if defined (_DPT_AIX)
-#pragma options align=reset
-#elif defined (UNPACK_FOUR)
-#pragma pack(4)
-#else
-#pragma pack()
-#endif /* aix */
-#endif
-/* For the Macintosh */
-#ifdef STRUCTALIGNMENTSUPPORTED
-#pragma options align=reset
-#endif
-
-#endif
diff --git a/drivers/scsi/dpt/osd_defs.h b/drivers/scsi/dpt/osd_defs.h
deleted file mode 100644
index de3ae5722982..000000000000
--- a/drivers/scsi/dpt/osd_defs.h
+++ /dev/null
@@ -1,79 +0,0 @@
-/* BSDI osd_defs.h,v 1.4 1998/06/03 19:14:58 karels Exp */
-/*
- * Copyright (c) 1996-1999 Distributed Processing Technology Corporation
- * All rights reserved.
- *
- * Redistribution and use in source form, with or without modification, are
- * permitted provided that redistributions of source code must retain the
- * above copyright notice, this list of conditions and the following disclaimer.
- *
- * This software is provided `as is' by Distributed Processing Technology and
- * any express or implied warranties, including, but not limited to, the
- * implied warranties of merchantability and fitness for a particular purpose,
- * are disclaimed. In no event shall Distributed Processing Technology be
- * liable for any direct, indirect, incidental, special, exemplary or
- * consequential damages (including, but not limited to, procurement of
- * substitute goods or services; loss of use, data, or profits; or business
- * interruptions) however caused and on any theory of liability, whether in
- * contract, strict liability, or tort (including negligence or otherwise)
- * arising in any way out of the use of this driver software, even if advised
- * of the possibility of such damage.
- *
- */
-
-#ifndef _OSD_DEFS_H
-#define _OSD_DEFS_H
-
-/*File - OSD_DEFS.H
- ****************************************************************************
- *
- *Description:
- *
- * This file contains the OS dependent defines. This file is included
- *in osd_util.h and provides the OS specific defines for that file.
- *
- *Copyright Distributed Processing Technology, Corp.
- * 140 Candace Dr.
- * Maitland, Fl. 32751 USA
- * Phone: (407) 830-5522 Fax: (407) 260-5366
- * All Rights Reserved
- *
- *Author: Doug Anderson
- *Date: 1/31/94
- *
- *Editors:
- *
- *Remarks:
- *
- *
- *****************************************************************************/
-
-
-/*Definitions - Defines & Constants ----------------------------------------- */
-
- /* Define the operating system */
-#if (defined(__linux__))
-# define _DPT_LINUX
-#elif (defined(__bsdi__))
-# define _DPT_BSDI
-#elif (defined(__FreeBSD__))
-# define _DPT_FREE_BSD
-#else
-# define _DPT_SCO
-#endif
-
-#if defined (ZIL_CURSES)
-#define _DPT_CURSES
-#else
-#define _DPT_MOTIF
-#endif
-
- /* Redefine 'far' to nothing - no far pointer type required in UNIX */
-#define far
-
- /* Define the mutually exclusive semaphore type */
-#define SEMAPHORE_T unsigned int *
- /* Define a handle to a DLL */
-#define DLL_HANDLE_T unsigned int *
-
-#endif
diff --git a/drivers/scsi/dpt/osd_util.h b/drivers/scsi/dpt/osd_util.h
deleted file mode 100644
index b2613c2eaac7..000000000000
--- a/drivers/scsi/dpt/osd_util.h
+++ /dev/null
@@ -1,358 +0,0 @@
-/* BSDI osd_util.h,v 1.8 1998/06/03 19:14:58 karels Exp */
-
-/*
- * Copyright (c) 1996-1999 Distributed Processing Technology Corporation
- * All rights reserved.
- *
- * Redistribution and use in source form, with or without modification, are
- * permitted provided that redistributions of source code must retain the
- * above copyright notice, this list of conditions and the following disclaimer.
- *
- * This software is provided `as is' by Distributed Processing Technology and
- * any express or implied warranties, including, but not limited to, the
- * implied warranties of merchantability and fitness for a particular purpose,
- * are disclaimed. In no event shall Distributed Processing Technology be
- * liable for any direct, indirect, incidental, special, exemplary or
- * consequential damages (including, but not limited to, procurement of
- * substitute goods or services; loss of use, data, or profits; or business
- * interruptions) however caused and on any theory of liability, whether in
- * contract, strict liability, or tort (including negligence or otherwise)
- * arising in any way out of the use of this driver software, even if advised
- * of the possibility of such damage.
- *
- */
-
-#ifndef __OSD_UTIL_H
-#define __OSD_UTIL_H
-
-/*File - OSD_UTIL.H
- ****************************************************************************
- *
- *Description:
- *
- * This file contains defines and function prototypes that are
- *operating system dependent. The resources defined in this file
- *are not specific to any particular application.
- *
- *Copyright Distributed Processing Technology, Corp.
- * 140 Candace Dr.
- * Maitland, Fl. 32751 USA
- * Phone: (407) 830-5522 Fax: (407) 260-5366
- * All Rights Reserved
- *
- *Author: Doug Anderson
- *Date: 1/7/94
- *
- *Editors:
- *
- *Remarks:
- *
- *
- *****************************************************************************/
-
-
-/*Definitions - Defines & Constants ----------------------------------------- */
-
-/*----------------------------- */
-/* Operating system selections: */
-/*----------------------------- */
-
-/*#define _DPT_MSDOS */
-/*#define _DPT_WIN_3X */
-/*#define _DPT_WIN_4X */
-/*#define _DPT_WIN_NT */
-/*#define _DPT_NETWARE */
-/*#define _DPT_OS2 */
-/*#define _DPT_SCO */
-/*#define _DPT_UNIXWARE */
-/*#define _DPT_SOLARIS */
-/*#define _DPT_NEXTSTEP */
-/*#define _DPT_BANYAN */
-
-/*-------------------------------- */
-/* Include the OS specific defines */
-/*-------------------------------- */
-
-/*#define OS_SELECTION From Above List */
-/*#define SEMAPHORE_T ??? */
-/*#define DLL_HANDLE_T ??? */
-
-#if (defined(KERNEL) && (defined(__FreeBSD__) || defined(__bsdi__)))
-# include "i386/isa/dpt_osd_defs.h"
-#else
-# include "osd_defs.h"
-#endif
-
-#ifndef DPT_UNALIGNED
- #define DPT_UNALIGNED
-#endif
-
-#ifndef DPT_EXPORT
- #define DPT_EXPORT
-#endif
-
-#ifndef DPT_IMPORT
- #define DPT_IMPORT
-#endif
-
-#ifndef DPT_RUNTIME_IMPORT
- #define DPT_RUNTIME_IMPORT DPT_IMPORT
-#endif
-
-/*--------------------- */
-/* OS dependent defines */
-/*--------------------- */
-
-#if defined (_DPT_MSDOS) || defined (_DPT_WIN_3X)
- #define _DPT_16_BIT
-#else
- #define _DPT_32_BIT
-#endif
-
-#if defined (_DPT_SCO) || defined (_DPT_UNIXWARE) || defined (_DPT_SOLARIS) || defined (_DPT_AIX) || defined (SNI_MIPS) || defined (_DPT_BSDI) || defined (_DPT_FREE_BSD) || defined(_DPT_LINUX)
- #define _DPT_UNIX
-#endif
-
-#if defined (_DPT_WIN_3x) || defined (_DPT_WIN_4X) || defined (_DPT_WIN_NT) \
- || defined (_DPT_OS2)
- #define _DPT_DLL_SUPPORT
-#endif
-
-#if !defined (_DPT_MSDOS) && !defined (_DPT_WIN_3X) && !defined (_DPT_NETWARE)
- #define _DPT_PREEMPTIVE
-#endif
-
-#if !defined (_DPT_MSDOS) && !defined (_DPT_WIN_3X)
- #define _DPT_MULTI_THREADED
-#endif
-
-#if !defined (_DPT_MSDOS)
- #define _DPT_MULTI_TASKING
-#endif
-
- /* These exist for platforms that */
- /* chunk when accessing mis-aligned */
- /* data */
-#if defined (SNI_MIPS) || defined (_DPT_SOLARIS)
- #if defined (_DPT_BIG_ENDIAN)
- #if !defined (_DPT_STRICT_ALIGN)
- #define _DPT_STRICT_ALIGN
- #endif
- #endif
-#endif
-
- /* Determine if in C or C++ mode */
-#ifdef __cplusplus
- #define _DPT_CPP
-#else
- #define _DPT_C
-#endif
-
-/*-------------------------------------------------------------------*/
-/* Under Solaris the compiler refuses to accept code like: */
-/* { {"DPT"}, 0, NULL .... }, */
-/* and complains about the {"DPT"} part by saying "cannot use { } */
-/* to initialize char*". */
-/* */
-/* By defining these ugly macros we can get around this and also */
-/* not have to copy and #ifdef large sections of code. I know that */
-/* these macros are *really* ugly, but they should help reduce */
-/* maintenance in the long run. */
-/* */
-/*-------------------------------------------------------------------*/
-#if !defined (DPTSQO)
- #if defined (_DPT_SOLARIS)
- #define DPTSQO
- #define DPTSQC
- #else
- #define DPTSQO {
- #define DPTSQC }
- #endif /* solaris */
-#endif /* DPTSQO */
-
-
-/*---------------------- */
-/* OS dependent typedefs */
-/*---------------------- */
-
-#if defined (_DPT_MSDOS) || defined (_DPT_SCO)
- #define BYTE unsigned char
- #define WORD unsigned short
-#endif
-
-#ifndef _DPT_TYPEDEFS
- #define _DPT_TYPEDEFS
- typedef unsigned char uCHAR;
- typedef unsigned short uSHORT;
- typedef unsigned int uINT;
- typedef unsigned long uLONG;
-
- typedef union {
- uCHAR u8[4];
- uSHORT u16[2];
- uLONG u32;
- } access_U;
-#endif
-
-#if !defined (NULL)
- #define NULL 0
-#endif
-
-
-/*Prototypes - function ----------------------------------------------------- */
-
-#ifdef __cplusplus
- extern "C" { /* Declare all these functions as "C" functions */
-#endif
-
-/*------------------------ */
-/* Byte reversal functions */
-/*------------------------ */
-
- /* Reverses the byte ordering of a 2 byte variable */
-#if (!defined(osdSwap2))
- uSHORT osdSwap2(DPT_UNALIGNED uSHORT *);
-#endif // !osdSwap2
-
- /* Reverses the byte ordering of a 4 byte variable and shifts left 8 bits */
-#if (!defined(osdSwap3))
- uLONG osdSwap3(DPT_UNALIGNED uLONG *);
-#endif // !osdSwap3
-
-
-#ifdef _DPT_NETWARE
- #include "novpass.h" /* For DPT_Bswapl() prototype */
- /* Inline the byte swap */
- #ifdef __cplusplus
- inline uLONG osdSwap4(uLONG *inLong) {
- return *inLong = DPT_Bswapl(*inLong);
- }
- #else
- #define osdSwap4(inLong) DPT_Bswapl(inLong)
- #endif // cplusplus
-#else
- /* Reverses the byte ordering of a 4 byte variable */
-# if (!defined(osdSwap4))
- uLONG osdSwap4(DPT_UNALIGNED uLONG *);
-# endif // !osdSwap4
-
- /* The following functions ALWAYS swap regardless of the *
- * presence of DPT_BIG_ENDIAN */
-
- uSHORT trueSwap2(DPT_UNALIGNED uSHORT *);
- uLONG trueSwap4(DPT_UNALIGNED uLONG *);
-
-#endif // netware
-
-
-/*-------------------------------------*
- * Network order swap functions *
- * *
- * These functions/macros will be used *
- * by the structure insert()/extract() *
- * functions. *
- *
- * We will enclose all structure *
- * portability modifications inside *
- * #ifdefs. When we are ready, we *
- * will #define DPT_PORTABLE to begin *
- * using the modifications. *
- *-------------------------------------*/
-uLONG netSwap4(uLONG val);
-
-#if defined (_DPT_BIG_ENDIAN)
-
-// for big-endian we need to swap
-
-#ifndef NET_SWAP_2
-#define NET_SWAP_2(x) (((x) >> 8) | ((x) << 8))
-#endif // NET_SWAP_2
-
-#ifndef NET_SWAP_4
-#define NET_SWAP_4(x) netSwap4((x))
-#endif // NET_SWAP_4
-
-#else
-
-// for little-endian we don't need to do anything
-
-#ifndef NET_SWAP_2
-#define NET_SWAP_2(x) (x)
-#endif // NET_SWAP_2
-
-#ifndef NET_SWAP_4
-#define NET_SWAP_4(x) (x)
-#endif // NET_SWAP_4
-
-#endif // big endian
-
-
-
-/*----------------------------------- */
-/* Run-time loadable module functions */
-/*----------------------------------- */
-
- /* Loads the specified run-time loadable DLL */
-DLL_HANDLE_T osdLoadModule(uCHAR *);
- /* Unloads the specified run-time loadable DLL */
-uSHORT osdUnloadModule(DLL_HANDLE_T);
- /* Returns a pointer to a function inside a run-time loadable DLL */
-void * osdGetFnAddr(DLL_HANDLE_T,uCHAR *);
-
-/*--------------------------------------- */
-/* Mutually exclusive semaphore functions */
-/*--------------------------------------- */
-
- /* Create a named semaphore */
-SEMAPHORE_T osdCreateNamedSemaphore(char *);
- /* Create a mutually exlusive semaphore */
-SEMAPHORE_T osdCreateSemaphore(void);
- /* create an event semaphore */
-SEMAPHORE_T osdCreateEventSemaphore(void);
- /* create a named event semaphore */
-SEMAPHORE_T osdCreateNamedEventSemaphore(char *);
-
- /* Destroy the specified mutually exclusive semaphore object */
-uSHORT osdDestroySemaphore(SEMAPHORE_T);
- /* Request access to the specified mutually exclusive semaphore */
-uLONG osdRequestSemaphore(SEMAPHORE_T,uLONG);
- /* Release access to the specified mutually exclusive semaphore */
-uSHORT osdReleaseSemaphore(SEMAPHORE_T);
- /* wait for a event to happen */
-uLONG osdWaitForEventSemaphore(SEMAPHORE_T, uLONG);
- /* signal an event */
-uLONG osdSignalEventSemaphore(SEMAPHORE_T);
- /* reset the event */
-uLONG osdResetEventSemaphore(SEMAPHORE_T);
-
-/*----------------- */
-/* Thread functions */
-/*----------------- */
-
- /* Releases control to the task switcher in non-preemptive */
- /* multitasking operating systems. */
-void osdSwitchThreads(void);
-
- /* Starts a thread function */
-uLONG osdStartThread(void *,void *);
-
-/* what is my thread id */
-uLONG osdGetThreadID(void);
-
-/* wakes up the specifed thread */
-void osdWakeThread(uLONG);
-
-/* osd sleep for x milliseconds */
-void osdSleep(uLONG);
-
-#define DPT_THREAD_PRIORITY_LOWEST 0x00
-#define DPT_THREAD_PRIORITY_NORMAL 0x01
-#define DPT_THREAD_PRIORITY_HIGHEST 0x02
-
-uCHAR osdSetThreadPriority(uLONG tid, uCHAR priority);
-
-#ifdef __cplusplus
- } /* end the xtern "C" declaration */
-#endif
-
-#endif /* osd_util_h */
diff --git a/drivers/scsi/dpt/sys_info.h b/drivers/scsi/dpt/sys_info.h
deleted file mode 100644
index a4aa1c31ff72..000000000000
--- a/drivers/scsi/dpt/sys_info.h
+++ /dev/null
@@ -1,417 +0,0 @@
-/* BSDI sys_info.h,v 1.6 1998/06/03 19:14:59 karels Exp */
-
-/*
- * Copyright (c) 1996-1999 Distributed Processing Technology Corporation
- * All rights reserved.
- *
- * Redistribution and use in source form, with or without modification, are
- * permitted provided that redistributions of source code must retain the
- * above copyright notice, this list of conditions and the following disclaimer.
- *
- * This software is provided `as is' by Distributed Processing Technology and
- * any express or implied warranties, including, but not limited to, the
- * implied warranties of merchantability and fitness for a particular purpose,
- * are disclaimed. In no event shall Distributed Processing Technology be
- * liable for any direct, indirect, incidental, special, exemplary or
- * consequential damages (including, but not limited to, procurement of
- * substitute goods or services; loss of use, data, or profits; or business
- * interruptions) however caused and on any theory of liability, whether in
- * contract, strict liability, or tort (including negligence or otherwise)
- * arising in any way out of the use of this driver software, even if advised
- * of the possibility of such damage.
- *
- */
-
-#ifndef __SYS_INFO_H
-#define __SYS_INFO_H
-
-/*File - SYS_INFO.H
- ****************************************************************************
- *
- *Description:
- *
- * This file contains structure definitions for the OS dependent
- *layer system information buffers.
- *
- *Copyright Distributed Processing Technology, Corp.
- * 140 Candace Dr.
- * Maitland, Fl. 32751 USA
- * Phone: (407) 830-5522 Fax: (407) 260-5366
- * All Rights Reserved
- *
- *Author: Don Kemper
- *Date: 5/10/94
- *
- *Editors:
- *
- *Remarks:
- *
- *
- *****************************************************************************/
-
-
-/*Include Files ------------------------------------------------------------- */
-
-#include "osd_util.h"
-
-#ifndef NO_PACK
-#if defined (_DPT_AIX)
-#pragma options align=packed
-#else
-#pragma pack(1)
-#endif /* aix */
-#endif // no unpack
-
-
-/*struct - driveParam_S - start
- *===========================================================================
- *
- *Description:
- *
- * This structure defines the drive parameters seen during
- *booting.
- *
- *---------------------------------------------------------------------------*/
-
-#ifdef __cplusplus
- struct driveParam_S {
-#else
- typedef struct {
-#endif
-
- uSHORT cylinders; /* Up to 1024 */
- uCHAR heads; /* Up to 255 */
- uCHAR sectors; /* Up to 63 */
-
-#ifdef __cplusplus
-
-//---------- Portability Additions ----------- in sp_sinfo.cpp
-#ifdef DPT_PORTABLE
- uSHORT netInsert(dptBuffer_S *buffer);
- uSHORT netExtract(dptBuffer_S *buffer);
-#endif // DPT PORTABLE
-//--------------------------------------------
-
- };
-#else
- } driveParam_S;
-#endif
-/*driveParam_S - end */
-
-
-/*struct - sysInfo_S - start
- *===========================================================================
- *
- *Description:
- *
- * This structure defines the command system information that
- *should be returned by every OS dependent layer.
- *
- *---------------------------------------------------------------------------*/
-
-/*flags - bit definitions */
-#define SI_CMOS_Valid 0x0001
-#define SI_NumDrivesValid 0x0002
-#define SI_ProcessorValid 0x0004
-#define SI_MemorySizeValid 0x0008
-#define SI_DriveParamsValid 0x0010
-#define SI_SmartROMverValid 0x0020
-#define SI_OSversionValid 0x0040
-#define SI_OSspecificValid 0x0080 /* 1 if OS structure returned */
-#define SI_BusTypeValid 0x0100
-
-#define SI_ALL_VALID 0x0FFF /* All Std SysInfo is valid */
-#define SI_NO_SmartROM 0x8000
-
-/*busType - definitions */
-#define SI_ISA_BUS 0x00
-#define SI_MCA_BUS 0x01
-#define SI_EISA_BUS 0x02
-#define SI_PCI_BUS 0x04
-
-#ifdef __cplusplus
- struct sysInfo_S {
-#else
- typedef struct {
-#endif
-
- uCHAR drive0CMOS; /* CMOS Drive 0 Type */
- uCHAR drive1CMOS; /* CMOS Drive 1 Type */
- uCHAR numDrives; /* 0040:0075 contents */
- uCHAR processorFamily; /* Same as DPTSIG's definition */
- uCHAR processorType; /* Same as DPTSIG's definition */
- uCHAR smartROMMajorVersion;
- uCHAR smartROMMinorVersion; /* SmartROM version */
- uCHAR smartROMRevision;
- uSHORT flags; /* See bit definitions above */
- uSHORT conventionalMemSize; /* in KB */
- uINT extendedMemSize; /* in KB */
- uINT osType; /* Same as DPTSIG's definition */
- uCHAR osMajorVersion;
- uCHAR osMinorVersion; /* The OS version */
- uCHAR osRevision;
-#ifdef _SINIX_ADDON
-   uCHAR        busType;               /* See definitions above */
- uSHORT osSubRevision;
- uCHAR pad[2]; /* For alignment */
-#else
- uCHAR osSubRevision;
-   uCHAR        busType;               /* See definitions above */
- uCHAR pad[3]; /* For alignment */
-#endif
- driveParam_S drives[16]; /* SmartROM Logical Drives */
-
-#ifdef __cplusplus
-
-//---------- Portability Additions ----------- in sp_sinfo.cpp
-#ifdef DPT_PORTABLE
- uSHORT netInsert(dptBuffer_S *buffer);
- uSHORT netExtract(dptBuffer_S *buffer);
-#endif // DPT PORTABLE
-//--------------------------------------------
-
- };
-#else
- } sysInfo_S;
-#endif
-/*sysInfo_S - end */
-
-
-/*struct - DOS_Info_S - start
- *===========================================================================
- *
- *Description:
- *
- * This structure defines the system information specific to a
- *DOS workstation.
- *
- *---------------------------------------------------------------------------*/
-
-/*flags - bit definitions */
-#define DI_DOS_HIGH 0x01 /* DOS is loaded high */
-#define DI_DPMI_VALID 0x02 /* DPMI version is valid */
-
-#ifdef __cplusplus
- struct DOS_Info_S {
-#else
- typedef struct {
-#endif
-
- uCHAR flags; /* See bit definitions above */
- uSHORT driverLocation; /* SmartROM BIOS address */
- uSHORT DOS_version;
- uSHORT DPMI_version;
-
-#ifdef __cplusplus
-
-//---------- Portability Additions ----------- in sp_sinfo.cpp
-#ifdef DPT_PORTABLE
- uSHORT netInsert(dptBuffer_S *buffer);
- uSHORT netExtract(dptBuffer_S *buffer);
-#endif // DPT PORTABLE
-//--------------------------------------------
-
- };
-#else
- } DOS_Info_S;
-#endif
-/*DOS_Info_S - end */
-
-
-/*struct - Netware_Info_S - start
- *===========================================================================
- *
- *Description:
- *
- * This structure defines the system information specific to a
- *Netware machine.
- *
- *---------------------------------------------------------------------------*/
-
-#ifdef __cplusplus
- struct Netware_Info_S {
-#else
- typedef struct {
-#endif
-
-   uCHAR     driverName[13];           /* e.g. PM12NW31.DSK */
- uCHAR serverName[48];
- uCHAR netwareVersion; /* The Netware OS version */
- uCHAR netwareSubVersion;
- uCHAR netwareRevision;
- uSHORT maxConnections; /* Probably 250 or 1000 */
- uSHORT connectionsInUse;
- uSHORT maxVolumes;
- uCHAR unused;
- uCHAR SFTlevel;
- uCHAR TTSlevel;
-
- uCHAR clibMajorVersion; /* The CLIB.NLM version */
- uCHAR clibMinorVersion;
- uCHAR clibRevision;
-
-#ifdef __cplusplus
-
-//---------- Portability Additions ----------- in sp_sinfo.cpp
-#ifdef DPT_PORTABLE
- uSHORT netInsert(dptBuffer_S *buffer);
- uSHORT netExtract(dptBuffer_S *buffer);
-#endif // DPT PORTABLE
-//--------------------------------------------
-
- };
-#else
- } Netware_Info_S;
-#endif
-/*Netware_Info_S - end */
-
-
-/*struct - OS2_Info_S - start
- *===========================================================================
- *
- *Description:
- *
- * This structure defines the system information specific to an
- *OS/2 machine.
- *
- *---------------------------------------------------------------------------*/
-
-#ifdef __cplusplus
- struct OS2_Info_S {
-#else
- typedef struct {
-#endif
-
- uCHAR something;
-
-#ifdef __cplusplus
-
-//---------- Portability Additions ----------- in sp_sinfo.cpp
-#ifdef DPT_PORTABLE
- uSHORT netInsert(dptBuffer_S *buffer);
- uSHORT netExtract(dptBuffer_S *buffer);
-#endif // DPT PORTABLE
-//--------------------------------------------
-
- };
-#else
- } OS2_Info_S;
-#endif
-/*OS2_Info_S - end */
-
-
-/*struct - WinNT_Info_S - start
- *===========================================================================
- *
- *Description:
- *
- * This structure defines the system information specific to a
- *Windows NT machine.
- *
- *---------------------------------------------------------------------------*/
-
-#ifdef __cplusplus
- struct WinNT_Info_S {
-#else
- typedef struct {
-#endif
-
- uCHAR something;
-
-#ifdef __cplusplus
-
-//---------- Portability Additions ----------- in sp_sinfo.cpp
-#ifdef DPT_PORTABLE
- uSHORT netInsert(dptBuffer_S *buffer);
- uSHORT netExtract(dptBuffer_S *buffer);
-#endif // DPT PORTABLE
-//--------------------------------------------
-
- };
-#else
- } WinNT_Info_S;
-#endif
-/*WinNT_Info_S - end */
-
-
-/*struct - SCO_Info_S - start
- *===========================================================================
- *
- *Description:
- *
- * This structure defines the system information specific to an
- *SCO UNIX machine.
- *
- *---------------------------------------------------------------------------*/
-
-#ifdef __cplusplus
- struct SCO_Info_S {
-#else
- typedef struct {
-#endif
-
- uCHAR something;
-
-#ifdef __cplusplus
-
-//---------- Portability Additions ----------- in sp_sinfo.cpp
-#ifdef DPT_PORTABLE
- uSHORT netInsert(dptBuffer_S *buffer);
- uSHORT netExtract(dptBuffer_S *buffer);
-#endif // DPT PORTABLE
-//--------------------------------------------
-
- };
-#else
- } SCO_Info_S;
-#endif
-/*SCO_Info_S - end */
-
-
-/*struct - USL_Info_S - start
- *===========================================================================
- *
- *Description:
- *
- * This structure defines the system information specific to a
- *USL UNIX machine.
- *
- *---------------------------------------------------------------------------*/
-
-#ifdef __cplusplus
- struct USL_Info_S {
-#else
- typedef struct {
-#endif
-
- uCHAR something;
-
-#ifdef __cplusplus
-
-//---------- Portability Additions ----------- in sp_sinfo.cpp
-#ifdef DPT_PORTABLE
- uSHORT netInsert(dptBuffer_S *buffer);
- uSHORT netExtract(dptBuffer_S *buffer);
-#endif // DPT PORTABLE
-//--------------------------------------------
-
- };
-#else
- } USL_Info_S;
-#endif
-/*USL_Info_S - end */
-
-
- /* Restore default structure packing */
-#ifndef NO_UNPACK
-#if defined (_DPT_AIX)
-#pragma options align=reset
-#elif defined (UNPACK_FOUR)
-#pragma pack(4)
-#else
-#pragma pack()
-#endif /* aix */
-#endif // no unpack
-
-#endif // __SYS_INFO_H
-
diff --git a/drivers/scsi/dpt_i2o.c b/drivers/scsi/dpt_i2o.c
deleted file mode 100644
index 37de8fb186d7..000000000000
--- a/drivers/scsi/dpt_i2o.c
+++ /dev/null
@@ -1,3616 +0,0 @@
-/***************************************************************************
- dpti.c - description
- -------------------
- begin : Thu Sep 7 2000
- copyright : (C) 2000 by Adaptec
-
- July 30, 2001 First version being submitted
- for inclusion in the kernel. V2.4
-
- See Documentation/scsi/dpti.txt for history, notes, license info
- and credits
- ***************************************************************************/
-
-/***************************************************************************
- * *
- * This program is free software; you can redistribute it and/or modify *
- * it under the terms of the GNU General Public License as published by *
- * the Free Software Foundation; either version 2 of the License, or *
- * (at your option) any later version. *
- * *
- ***************************************************************************/
-/***************************************************************************
- * Sat Dec 20 2003 Go Taniguchi <go(a)turbolinux.co.jp>
- - Support 2.6 kernel and DMA-mapping
- - ioctl fix for raid tools
- - use schedule_timeout in long long loop
- **************************************************************************/
-
-/*#define DEBUG 1 */
-/*#define UARTDELAY 1 */
-
-#include <linux/module.h>
-
-MODULE_AUTHOR("Deanna Bonds, with _lots_ of help from Mark Salyzyn");
-MODULE_DESCRIPTION("Adaptec I2O RAID Driver");
-
-////////////////////////////////////////////////////////////////
-
-#include <linux/ioctl.h> /* For SCSI-Passthrough */
-#include <linux/uaccess.h>
-
-#include <linux/stat.h>
-#include <linux/slab.h> /* for kmalloc() */
-#include <linux/pci.h> /* for PCI support */
-#include <linux/proc_fs.h>
-#include <linux/blkdev.h>
-#include <linux/delay.h> /* for udelay */
-#include <linux/interrupt.h>
-#include <linux/kernel.h> /* for printk */
-#include <linux/sched.h>
-#include <linux/reboot.h>
-#include <linux/spinlock.h>
-#include <linux/dma-mapping.h>
-
-#include <linux/timer.h>
-#include <linux/string.h>
-#include <linux/ioport.h>
-#include <linux/mutex.h>
-
-#include <asm/processor.h> /* for boot_cpu_data */
-#include <asm/pgtable.h>
-#include <asm/io.h> /* for virt_to_bus, etc. */
-
-#include <scsi/scsi.h>
-#include <scsi/scsi_cmnd.h>
-#include <scsi/scsi_device.h>
-#include <scsi/scsi_host.h>
-#include <scsi/scsi_tcq.h>
-
-#include "dpt/dptsig.h"
-#include "dpti.h"
-
-/*============================================================================
- * Create a binary signature - this is read by dptsig
- * Needed for our management apps
- *============================================================================
- */
-static DEFINE_MUTEX(adpt_mutex);
-static dpt_sig_S DPTI_sig = {
- {'d', 'P', 't', 'S', 'i', 'G'}, SIG_VERSION,
-#ifdef __i386__
- PROC_INTEL, PROC_386 | PROC_486 | PROC_PENTIUM | PROC_SEXIUM,
-#elif defined(__ia64__)
- PROC_INTEL, PROC_IA64,
-#elif defined(__sparc__)
- PROC_ULTRASPARC, PROC_ULTRASPARC,
-#elif defined(__alpha__)
- PROC_ALPHA, PROC_ALPHA,
-#else
- (-1),(-1),
-#endif
- FT_HBADRVR, 0, OEM_DPT, OS_LINUX, CAP_OVERLAP, DEV_ALL,
- ADF_ALL_SC5, 0, 0, DPT_VERSION, DPT_REVISION, DPT_SUBREVISION,
- DPT_MONTH, DPT_DAY, DPT_YEAR, "Adaptec Linux I2O RAID Driver"
-};
-
-
-
-
-/*============================================================================
- * Globals
- *============================================================================
- */
-
-static DEFINE_MUTEX(adpt_configuration_lock);
-
-static struct i2o_sys_tbl *sys_tbl;
-static dma_addr_t sys_tbl_pa;
-static int sys_tbl_ind;
-static int sys_tbl_len;
-
-static adpt_hba* hba_chain = NULL;
-static int hba_count = 0;
-
-static struct class *adpt_sysfs_class;
-
-static long adpt_unlocked_ioctl(struct file *, unsigned int, unsigned long);
-#ifdef CONFIG_COMPAT
-static long compat_adpt_ioctl(struct file *, unsigned int, unsigned long);
-#endif
-
-static const struct file_operations adpt_fops = {
- .unlocked_ioctl = adpt_unlocked_ioctl,
- .open = adpt_open,
- .release = adpt_close,
-#ifdef CONFIG_COMPAT
- .compat_ioctl = compat_adpt_ioctl,
-#endif
- .llseek = noop_llseek,
-};
-
-/* Structures and definitions for synchronous message posting.
- * See adpt_i2o_post_wait() for description
- */
-struct adpt_i2o_post_wait_data
-{
- int status;
- u32 id;
- adpt_wait_queue_head_t *wq;
- struct adpt_i2o_post_wait_data *next;
-};
-
-static struct adpt_i2o_post_wait_data *adpt_post_wait_queue = NULL;
-static u32 adpt_post_wait_id = 0;
-static DEFINE_SPINLOCK(adpt_post_wait_lock);
-
-
-/*============================================================================
- * Functions
- *============================================================================
- */
-
-static inline int dpt_dma64(adpt_hba *pHba)
-{
- return (sizeof(dma_addr_t) > 4 && (pHba)->dma64);
-}
-
-static inline u32 dma_high(dma_addr_t addr)
-{
- return upper_32_bits(addr);
-}
-
-static inline u32 dma_low(dma_addr_t addr)
-{
- return (u32)addr;
-}
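
dma_high() and dma_low() above split a dma_addr_t into the two 32-bit
halves that the I2O message format carries as separate words. A hosted
sketch of the same split, assuming a 64-bit address (names here are
illustrative, not driver API):

#include <stdint.h>
#include <stdio.h>
#include <inttypes.h>

int main(void)
{
	uint64_t addr = 0x0000000123456789ULL;	/* example bus address */
	uint32_t lo = (uint32_t)addr;		/* low 32 bits, as dma_low() */
	uint32_t hi = (uint32_t)(addr >> 32);	/* high 32 bits, as upper_32_bits() */

	printf("hi=0x%08" PRIx32 " lo=0x%08" PRIx32 "\n", hi, lo);
	return 0;
}
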
-
-static u8 adpt_read_blink_led(adpt_hba* host)
-{
- if (host->FwDebugBLEDflag_P) {
- if( readb(host->FwDebugBLEDflag_P) == 0xbc ){
- return readb(host->FwDebugBLEDvalue_P);
- }
- }
- return 0;
-}
-
-/*============================================================================
- * Scsi host template interface functions
- *============================================================================
- */
-
-#ifdef MODULE
-static struct pci_device_id dptids[] = {
- { PCI_DPT_VENDOR_ID, PCI_DPT_DEVICE_ID, PCI_ANY_ID, PCI_ANY_ID,},
- { PCI_DPT_VENDOR_ID, PCI_DPT_RAPTOR_DEVICE_ID, PCI_ANY_ID, PCI_ANY_ID,},
- { 0, }
-};
-#endif
-
-MODULE_DEVICE_TABLE(pci,dptids);
-
-static int adpt_detect(struct scsi_host_template* sht)
-{
- struct pci_dev *pDev = NULL;
- adpt_hba *pHba;
- adpt_hba *next;
-
- PINFO("Detecting Adaptec I2O RAID controllers...\n");
-
-	/* search for all Adaptec I2O RAID cards */
- while ((pDev = pci_get_device( PCI_DPT_VENDOR_ID, PCI_ANY_ID, pDev))) {
- if(pDev->device == PCI_DPT_DEVICE_ID ||
- pDev->device == PCI_DPT_RAPTOR_DEVICE_ID){
- if(adpt_install_hba(sht, pDev) ){
- PERROR("Could not Init an I2O RAID device\n");
- PERROR("Will not try to detect others.\n");
- return hba_count-1;
- }
- pci_dev_get(pDev);
- }
- }
-
- /* In INIT state, Activate IOPs */
- for (pHba = hba_chain; pHba; pHba = next) {
- next = pHba->next;
-		// Activate does get status, init outbound, and get hrt
- if (adpt_i2o_activate_hba(pHba) < 0) {
- adpt_i2o_delete_hba(pHba);
- }
- }
-
-
- /* Active IOPs in HOLD state */
-
-rebuild_sys_tab:
- if (hba_chain == NULL)
- return 0;
-
- /*
- * If build_sys_table fails, we kill everything and bail
- * as we can't init the IOPs w/o a system table
- */
- if (adpt_i2o_build_sys_table() < 0) {
- adpt_i2o_sys_shutdown();
- return 0;
- }
-
-	PDEBUG("HBAs in HOLD state\n");
-
-	/* If an IOP doesn't come online, we need to rebuild the system table */
- for (pHba = hba_chain; pHba; pHba = pHba->next) {
- if (adpt_i2o_online_hba(pHba) < 0) {
- adpt_i2o_delete_hba(pHba);
- goto rebuild_sys_tab;
- }
- }
-
- /* Active IOPs now in OPERATIONAL state */
-	PDEBUG("HBAs in OPERATIONAL state\n");
-
- printk("dpti: If you have a lot of devices this could take a few minutes.\n");
- for (pHba = hba_chain; pHba; pHba = next) {
- next = pHba->next;
- printk(KERN_INFO"%s: Reading the hardware resource table.\n", pHba->name);
- if (adpt_i2o_lct_get(pHba) < 0){
- adpt_i2o_delete_hba(pHba);
- continue;
- }
-
- if (adpt_i2o_parse_lct(pHba) < 0){
- adpt_i2o_delete_hba(pHba);
- continue;
- }
- adpt_inquiry(pHba);
- }
-
- adpt_sysfs_class = class_create(THIS_MODULE, "dpt_i2o");
- if (IS_ERR(adpt_sysfs_class)) {
- printk(KERN_WARNING"dpti: unable to create dpt_i2o class\n");
- adpt_sysfs_class = NULL;
- }
-
- for (pHba = hba_chain; pHba; pHba = next) {
- next = pHba->next;
- if (adpt_scsi_host_alloc(pHba, sht) < 0){
- adpt_i2o_delete_hba(pHba);
- continue;
- }
- pHba->initialized = TRUE;
- pHba->state &= ~DPTI_STATE_RESET;
- if (adpt_sysfs_class) {
- struct device *dev = device_create(adpt_sysfs_class,
- NULL, MKDEV(DPTI_I2O_MAJOR, pHba->unit), NULL,
- "dpti%d", pHba->unit);
- if (IS_ERR(dev)) {
- printk(KERN_WARNING"dpti%d: unable to "
- "create device in dpt_i2o class\n",
- pHba->unit);
- }
- }
- }
-
- // Register our control device node
- // nodes will need to be created in /dev to access this
-	// the nodes cannot be created from within the driver
- if (hba_count && register_chrdev(DPTI_I2O_MAJOR, DPT_DRIVER, &adpt_fops)) {
- adpt_i2o_sys_shutdown();
- return 0;
- }
- return hba_count;
-}
-
-
-static void adpt_release(adpt_hba *pHba)
-{
- struct Scsi_Host *shost = pHba->host;
-
- scsi_remove_host(shost);
-// adpt_i2o_quiesce_hba(pHba);
- adpt_i2o_delete_hba(pHba);
- scsi_host_put(shost);
-}
-
-
-static void adpt_inquiry(adpt_hba* pHba)
-{
- u32 msg[17];
- u32 *mptr;
- u32 *lenptr;
- int direction;
- int scsidir;
- u32 len;
- u32 reqlen;
- u8* buf;
- dma_addr_t addr;
- u8 scb[16];
- s32 rcode;
-
- memset(msg, 0, sizeof(msg));
- buf = dma_alloc_coherent(&pHba->pDev->dev, 80, &addr, GFP_KERNEL);
- if(!buf){
- printk(KERN_ERR"%s: Could not allocate buffer\n",pHba->name);
- return;
- }
- memset((void*)buf, 0, 36);
-
- len = 36;
- direction = 0x00000000;
- scsidir =0x40000000; // DATA IN (iop<--dev)
-
- if (dpt_dma64(pHba))
- reqlen = 17; // SINGLE SGE, 64 bit
- else
- reqlen = 14; // SINGLE SGE, 32 bit
- /* Stick the headers on */
- msg[0] = reqlen<<16 | SGL_OFFSET_12;
- msg[1] = (0xff<<24|HOST_TID<<12|ADAPTER_TID);
- msg[2] = 0;
- msg[3] = 0;
- // Adaptec/DPT Private stuff
- msg[4] = I2O_CMD_SCSI_EXEC|DPT_ORGANIZATION_ID<<16;
- msg[5] = ADAPTER_TID | 1<<16 /* Interpret*/;
- /* Direction, disconnect ok | sense data | simple queue , CDBLen */
- // I2O_SCB_FLAG_ENABLE_DISCONNECT |
- // I2O_SCB_FLAG_SIMPLE_QUEUE_TAG |
- // I2O_SCB_FLAG_SENSE_DATA_IN_MESSAGE;
- msg[6] = scsidir|0x20a00000| 6 /* cmd len*/;
-
- mptr=msg+7;
-
- memset(scb, 0, sizeof(scb));
- // Write SCSI command into the message - always 16 byte block
- scb[0] = INQUIRY;
- scb[1] = 0;
- scb[2] = 0;
- scb[3] = 0;
- scb[4] = 36;
- scb[5] = 0;
- // Don't care about the rest of scb
-
- memcpy(mptr, scb, sizeof(scb));
- mptr+=4;
- lenptr=mptr++; /* Remember me - fill in when we know */
-
- /* Now fill in the SGList and command */
- *lenptr = len;
- if (dpt_dma64(pHba)) {
- *mptr++ = (0x7C<<24)+(2<<16)+0x02; /* Enable 64 bit */
- *mptr++ = 1 << PAGE_SHIFT;
- *mptr++ = 0xD0000000|direction|len;
- *mptr++ = dma_low(addr);
- *mptr++ = dma_high(addr);
- } else {
- *mptr++ = 0xD0000000|direction|len;
- *mptr++ = addr;
- }
-
-	// Send it on its way
- rcode = adpt_i2o_post_wait(pHba, msg, reqlen<<2, 120);
- if (rcode != 0) {
- sprintf(pHba->detail, "Adaptec I2O RAID");
- printk(KERN_INFO "%s: Inquiry Error (%d)\n",pHba->name,rcode);
- if (rcode != -ETIME && rcode != -EINTR)
- dma_free_coherent(&pHba->pDev->dev, 80, buf, addr);
- } else {
- memset(pHba->detail, 0, sizeof(pHba->detail));
- memcpy(&(pHba->detail), "Vendor: Adaptec ", 16);
- memcpy(&(pHba->detail[16]), " Model: ", 8);
- memcpy(&(pHba->detail[24]), (u8*) &buf[16], 16);
- memcpy(&(pHba->detail[40]), " FW: ", 4);
- memcpy(&(pHba->detail[44]), (u8*) &buf[32], 4);
- pHba->detail[48] = '\0'; /* precautionary */
- dma_free_coherent(&pHba->pDev->dev, 80, buf, addr);
- }
- adpt_i2o_status_get(pHba);
- return ;
-}
-
-
-static int adpt_slave_configure(struct scsi_device * device)
-{
- struct Scsi_Host *host = device->host;
- adpt_hba* pHba;
-
- pHba = (adpt_hba *) host->hostdata[0];
-
- if (host->can_queue && device->tagged_supported) {
- scsi_change_queue_depth(device,
- host->can_queue - 1);
- }
- return 0;
-}
-
-static int adpt_queue_lck(struct scsi_cmnd * cmd, void (*done) (struct scsi_cmnd *))
-{
- adpt_hba* pHba = NULL;
- struct adpt_device* pDev = NULL; /* dpt per device information */
-
- cmd->scsi_done = done;
- /*
- * SCSI REQUEST_SENSE commands will be executed automatically by the
- * Host Adapter for any errors, so they should not be executed
- * explicitly unless the Sense Data is zero indicating that no error
- * occurred.
- */
-
- if ((cmd->cmnd[0] == REQUEST_SENSE) && (cmd->sense_buffer[0] != 0)) {
- cmd->result = (DID_OK << 16);
- cmd->scsi_done(cmd);
- return 0;
- }
-
- pHba = (adpt_hba*)cmd->device->host->hostdata[0];
- if (!pHba) {
- return FAILED;
- }
-
- rmb();
- if ((pHba->state) & DPTI_STATE_RESET)
- return SCSI_MLQUEUE_HOST_BUSY;
-
-	// TODO if the cmd->device is offline then I may need to issue a bus rescan
- // followed by a get_lct to see if the device is there anymore
- if((pDev = (struct adpt_device*) (cmd->device->hostdata)) == NULL) {
- /*
- * First command request for this device. Set up a pointer
- * to the device structure. This should be a TEST_UNIT_READY
- * command from scan_scsis_single.
- */
- if ((pDev = adpt_find_device(pHba, (u32)cmd->device->channel, (u32)cmd->device->id, cmd->device->lun)) == NULL) {
- // TODO: if any luns are at this bus, scsi id then fake a TEST_UNIT_READY and INQUIRY response
- // with type 7F (for all luns less than the max for this bus,id) so the lun scan will continue.
- cmd->result = (DID_NO_CONNECT << 16);
- cmd->scsi_done(cmd);
- return 0;
- }
- cmd->device->hostdata = pDev;
- }
- pDev->pScsi_dev = cmd->device;
-
- /*
- * If we are being called from when the device is being reset,
- * delay processing of the command until later.
- */
- if (pDev->state & DPTI_DEV_RESET ) {
- return FAILED;
- }
- return adpt_scsi_to_i2o(pHba, cmd, pDev);
-}
-
-static DEF_SCSI_QCMD(adpt_queue)
-
-static int adpt_bios_param(struct scsi_device *sdev, struct block_device *dev,
- sector_t capacity, int geom[])
-{
- int heads=-1;
- int sectors=-1;
- int cylinders=-1;
-
-	// *** First let's set the default geometry ****
-
-	// If the capacity is less than 0x2000
- if (capacity < 0x2000 ) { // floppy
- heads = 18;
- sectors = 2;
- }
- // else if between 0x2000 and 0x20000
- else if (capacity < 0x20000) {
- heads = 64;
- sectors = 32;
- }
- // else if between 0x20000 and 0x40000
- else if (capacity < 0x40000) {
- heads = 65;
- sectors = 63;
- }
-	// else if between 0x40000 and 0x80000
- else if (capacity < 0x80000) {
- heads = 128;
- sectors = 63;
- }
- // else if greater than 0x80000
- else {
- heads = 255;
- sectors = 63;
- }
- cylinders = sector_div(capacity, heads * sectors);
-
- // Special case if CDROM
- if(sdev->type == 5) { // CDROM
- heads = 252;
- sectors = 63;
- cylinders = 1111;
- }
-
- geom[0] = heads;
- geom[1] = sectors;
- geom[2] = cylinders;
-
- PDEBUG("adpt_bios_param: exit\n");
- return 0;
-}
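
The ladder of ranges in adpt_bios_param() is a capacity-to-CHS lookup.
A hosted sketch of the same heuristic; note that in the kernel
sector_div() divides its first argument in place and returns the
remainder, so a plain 64-bit division stands in for the quotient here
(illustrative only):

#include <stdint.h>
#include <stdio.h>

/* Mirror of the thresholds used in adpt_bios_param() above. */
static void capacity_to_chs(uint64_t capacity, int *heads, int *sectors,
			    uint64_t *cylinders)
{
	if (capacity < 0x2000)        { *heads = 18;  *sectors = 2;  }
	else if (capacity < 0x20000)  { *heads = 64;  *sectors = 32; }
	else if (capacity < 0x40000)  { *heads = 65;  *sectors = 63; }
	else if (capacity < 0x80000)  { *heads = 128; *sectors = 63; }
	else                          { *heads = 255; *sectors = 63; }
	*cylinders = capacity / ((uint64_t)*heads * *sectors);
}

int main(void)
{
	int h, s;
	uint64_t c;

	capacity_to_chs(0x100000, &h, &s, &c);	/* 512 MiB of 512-byte sectors */
	printf("H=%d S=%d C=%llu\n", h, s, (unsigned long long)c);
	return 0;
}
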
-
-
-static const char *adpt_info(struct Scsi_Host *host)
-{
- adpt_hba* pHba;
-
- pHba = (adpt_hba *) host->hostdata[0];
- return (char *) (pHba->detail);
-}
-
-static int adpt_show_info(struct seq_file *m, struct Scsi_Host *host)
-{
- struct adpt_device* d;
- int id;
- int chan;
- adpt_hba* pHba;
- int unit;
-
- // Find HBA (host bus adapter) we are looking for
- mutex_lock(&adpt_configuration_lock);
- for (pHba = hba_chain; pHba; pHba = pHba->next) {
- if (pHba->host == host) {
- break; /* found adapter */
- }
- }
- mutex_unlock(&adpt_configuration_lock);
- if (pHba == NULL) {
- return 0;
- }
- host = pHba->host;
-
- seq_printf(m, "Adaptec I2O RAID Driver Version: %s\n\n", DPT_I2O_VERSION);
- seq_printf(m, "%s\n", pHba->detail);
- seq_printf(m, "SCSI Host=scsi%d Control Node=/dev/%s irq=%d\n",
- pHba->host->host_no, pHba->name, host->irq);
- seq_printf(m, "\tpost fifo size = %d\n\treply fifo size = %d\n\tsg table size = %d\n\n",
- host->can_queue, (int) pHba->reply_fifo_size , host->sg_tablesize);
-
- seq_puts(m, "Devices:\n");
- for(chan = 0; chan < MAX_CHANNEL; chan++) {
- for(id = 0; id < MAX_ID; id++) {
- d = pHba->channel[chan].device[id];
- while(d) {
- seq_printf(m,"\t%-24.24s", d->pScsi_dev->vendor);
- seq_printf(m," Rev: %-8.8s\n", d->pScsi_dev->rev);
-
- unit = d->pI2o_dev->lct_data.tid;
- seq_printf(m, "\tTID=%d, (Channel=%d, Target=%d, Lun=%llu) (%s)\n\n",
- unit, (int)d->scsi_channel, (int)d->scsi_id, d->scsi_lun,
- scsi_device_online(d->pScsi_dev)? "online":"offline");
- d = d->next_lun;
- }
- }
- }
- return 0;
-}
-
-/*
- * Turn a struct scsi_cmnd * into a unique 32 bit 'context'.
- */
-static u32 adpt_cmd_to_context(struct scsi_cmnd *cmd)
-{
- return (u32)cmd->serial_number;
-}
-
-/*
- * Go from a u32 'context' to a struct scsi_cmnd *.
- * This could probably be made more efficient.
- */
-static struct scsi_cmnd *
- adpt_cmd_from_context(adpt_hba * pHba, u32 context)
-{
- struct scsi_cmnd * cmd;
- struct scsi_device * d;
-
- if (context == 0)
- return NULL;
-
- spin_unlock(pHba->host->host_lock);
- shost_for_each_device(d, pHba->host) {
- unsigned long flags;
- spin_lock_irqsave(&d->list_lock, flags);
- list_for_each_entry(cmd, &d->cmd_list, list) {
- if (((u32)cmd->serial_number == context)) {
- spin_unlock_irqrestore(&d->list_lock, flags);
- scsi_device_put(d);
- spin_lock(pHba->host->host_lock);
- return cmd;
- }
- }
- spin_unlock_irqrestore(&d->list_lock, flags);
- }
- spin_lock(pHba->host->host_lock);
-
- return NULL;
-}
-
-/*
- * Turn a pointer to ioctl reply data into a u32 'context'
- */
-static u32 adpt_ioctl_to_context(adpt_hba * pHba, void *reply)
-{
-#if BITS_PER_LONG == 32
- return (u32)(unsigned long)reply;
-#else
- ulong flags = 0;
- u32 nr, i;
-
- spin_lock_irqsave(pHba->host->host_lock, flags);
- nr = ARRAY_SIZE(pHba->ioctl_reply_context);
- for (i = 0; i < nr; i++) {
- if (pHba->ioctl_reply_context[i] == NULL) {
- pHba->ioctl_reply_context[i] = reply;
- break;
- }
- }
- spin_unlock_irqrestore(pHba->host->host_lock, flags);
- if (i >= nr) {
- printk(KERN_WARNING"%s: Too many outstanding "
- "ioctl commands\n", pHba->name);
- return (u32)-1;
- }
-
- return i;
-#endif
-}
-
-/*
- * Go from a u32 'context' to a pointer to ioctl reply data.
- */
-static void *adpt_ioctl_from_context(adpt_hba *pHba, u32 context)
-{
-#if BITS_PER_LONG == 32
- return (void *)(unsigned long)context;
-#else
- void *p = pHba->ioctl_reply_context[context];
- pHba->ioctl_reply_context[context] = NULL;
-
- return p;
-#endif
-}
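
On 64-bit kernels a kernel pointer no longer fits in the 32-bit I2O
context field, so the two helpers above park the reply pointer in a
small per-HBA table and use the table index as the context. A
standalone sketch of that slot-table idea (illustrative names, no
locking shown; the driver holds host_lock around the search):

#include <stdint.h>
#include <stdio.h>

#define NSLOTS 32	/* stand-in for ARRAY_SIZE(ioctl_reply_context) */

static void *slots[NSLOTS];

/* Store a pointer, return its index as a 32-bit context (or UINT32_MAX). */
static uint32_t ptr_to_context(void *p)
{
	for (uint32_t i = 0; i < NSLOTS; i++) {
		if (!slots[i]) {
			slots[i] = p;
			return i;
		}
	}
	return UINT32_MAX;	/* table full: too many outstanding ioctls */
}

/* Look up and release a slot, as adpt_ioctl_from_context() does. */
static void *context_to_ptr(uint32_t ctx)
{
	void *p = slots[ctx];
	slots[ctx] = NULL;
	return p;
}

int main(void)
{
	int reply = 42;
	uint32_t ctx = ptr_to_context(&reply);

	if (ctx != UINT32_MAX)
		printf("got back %d\n", *(int *)context_to_ptr(ctx));
	return 0;
}
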
-
-/*===========================================================================
- * Error Handling routines
- *===========================================================================
- */
-
-static int adpt_abort(struct scsi_cmnd * cmd)
-{
- adpt_hba* pHba = NULL; /* host bus adapter structure */
- struct adpt_device* dptdevice; /* dpt per device information */
- u32 msg[5];
- int rcode;
-
- if(cmd->serial_number == 0){
- return FAILED;
- }
- pHba = (adpt_hba*) cmd->device->host->hostdata[0];
- printk(KERN_INFO"%s: Trying to Abort\n",pHba->name);
- if ((dptdevice = (void*) (cmd->device->hostdata)) == NULL) {
- printk(KERN_ERR "%s: Unable to abort: No device in cmnd\n",pHba->name);
- return FAILED;
- }
-
- memset(msg, 0, sizeof(msg));
- msg[0] = FIVE_WORD_MSG_SIZE|SGL_OFFSET_0;
- msg[1] = I2O_CMD_SCSI_ABORT<<24|HOST_TID<<12|dptdevice->tid;
- msg[2] = 0;
- msg[3]= 0;
- msg[4] = adpt_cmd_to_context(cmd);
- if (pHba->host)
- spin_lock_irq(pHba->host->host_lock);
- rcode = adpt_i2o_post_wait(pHba, msg, sizeof(msg), FOREVER);
- if (pHba->host)
- spin_unlock_irq(pHba->host->host_lock);
- if (rcode != 0) {
- if(rcode == -EOPNOTSUPP ){
- printk(KERN_INFO"%s: Abort cmd not supported\n",pHba->name);
- return FAILED;
- }
- printk(KERN_INFO"%s: Abort failed.\n",pHba->name);
- return FAILED;
- }
- printk(KERN_INFO"%s: Abort complete.\n",pHba->name);
- return SUCCESS;
-}
-
-
-#define I2O_DEVICE_RESET 0x27
-// This is the same for BLK and SCSI devices
-// NOTE this is wrong in the i2o.h definitions
-// This is not currently supported by our adapter but we issue it anyway
-static int adpt_device_reset(struct scsi_cmnd* cmd)
-{
- adpt_hba* pHba;
- u32 msg[4];
- u32 rcode;
- int old_state;
- struct adpt_device* d = cmd->device->hostdata;
-
- pHba = (void*) cmd->device->host->hostdata[0];
- printk(KERN_INFO"%s: Trying to reset device\n",pHba->name);
- if (!d) {
- printk(KERN_INFO"%s: Reset Device: Device Not found\n",pHba->name);
- return FAILED;
- }
- memset(msg, 0, sizeof(msg));
- msg[0] = FOUR_WORD_MSG_SIZE|SGL_OFFSET_0;
- msg[1] = (I2O_DEVICE_RESET<<24|HOST_TID<<12|d->tid);
- msg[2] = 0;
- msg[3] = 0;
-
- if (pHba->host)
- spin_lock_irq(pHba->host->host_lock);
- old_state = d->state;
- d->state |= DPTI_DEV_RESET;
- rcode = adpt_i2o_post_wait(pHba, msg,sizeof(msg), FOREVER);
- d->state = old_state;
- if (pHba->host)
- spin_unlock_irq(pHba->host->host_lock);
- if (rcode != 0) {
- if(rcode == -EOPNOTSUPP ){
- printk(KERN_INFO"%s: Device reset not supported\n",pHba->name);
- return FAILED;
- }
- printk(KERN_INFO"%s: Device reset failed\n",pHba->name);
- return FAILED;
- } else {
- printk(KERN_INFO"%s: Device reset successful\n",pHba->name);
- return SUCCESS;
- }
-}
-
-
-#define I2O_HBA_BUS_RESET 0x87
-// This version of bus reset is called by the eh_error handler
-static int adpt_bus_reset(struct scsi_cmnd* cmd)
-{
- adpt_hba* pHba;
- u32 msg[4];
- u32 rcode;
-
- pHba = (adpt_hba*)cmd->device->host->hostdata[0];
- memset(msg, 0, sizeof(msg));
- printk(KERN_WARNING"%s: Bus reset: SCSI Bus %d: tid: %d\n",pHba->name, cmd->device->channel,pHba->channel[cmd->device->channel].tid );
- msg[0] = FOUR_WORD_MSG_SIZE|SGL_OFFSET_0;
- msg[1] = (I2O_HBA_BUS_RESET<<24|HOST_TID<<12|pHba->channel[cmd->device->channel].tid);
- msg[2] = 0;
- msg[3] = 0;
- if (pHba->host)
- spin_lock_irq(pHba->host->host_lock);
- rcode = adpt_i2o_post_wait(pHba, msg,sizeof(msg), FOREVER);
- if (pHba->host)
- spin_unlock_irq(pHba->host->host_lock);
- if (rcode != 0) {
- printk(KERN_WARNING"%s: Bus reset failed.\n",pHba->name);
- return FAILED;
- } else {
- printk(KERN_WARNING"%s: Bus reset success.\n",pHba->name);
- return SUCCESS;
- }
-}
-
-// This version of reset is called by the eh_error_handler
-static int __adpt_reset(struct scsi_cmnd* cmd)
-{
- adpt_hba* pHba;
- int rcode;
- char name[32];
-
- pHba = (adpt_hba*)cmd->device->host->hostdata[0];
- strncpy(name, pHba->name, sizeof(name));
- printk(KERN_WARNING"%s: Hba Reset: scsi id %d: tid: %d\n", name, cmd->device->channel, pHba->channel[cmd->device->channel].tid);
- rcode = adpt_hba_reset(pHba);
- if(rcode == 0){
- printk(KERN_WARNING"%s: HBA reset complete\n", name);
- return SUCCESS;
- } else {
- printk(KERN_WARNING"%s: HBA reset failed (%x)\n", name, rcode);
- return FAILED;
- }
-}
-
-static int adpt_reset(struct scsi_cmnd* cmd)
-{
- int rc;
-
- spin_lock_irq(cmd->device->host->host_lock);
- rc = __adpt_reset(cmd);
- spin_unlock_irq(cmd->device->host->host_lock);
-
- return rc;
-}
-
-// This version of reset is called by the ioctls and indirectly from eh_error_handler via adpt_reset
-static int adpt_hba_reset(adpt_hba* pHba)
-{
- int rcode;
-
- pHba->state |= DPTI_STATE_RESET;
-
-	// Activate does get status, init outbound, and get hrt
- if ((rcode=adpt_i2o_activate_hba(pHba)) < 0) {
- printk(KERN_ERR "%s: Could not activate\n", pHba->name);
- adpt_i2o_delete_hba(pHba);
- return rcode;
- }
-
- if ((rcode=adpt_i2o_build_sys_table()) < 0) {
- adpt_i2o_delete_hba(pHba);
- return rcode;
- }
- PDEBUG("%s: in HOLD state\n",pHba->name);
-
- if ((rcode=adpt_i2o_online_hba(pHba)) < 0) {
- adpt_i2o_delete_hba(pHba);
- return rcode;
- }
- PDEBUG("%s: in OPERATIONAL state\n",pHba->name);
-
- if ((rcode=adpt_i2o_lct_get(pHba)) < 0){
- adpt_i2o_delete_hba(pHba);
- return rcode;
- }
-
- if ((rcode=adpt_i2o_reparse_lct(pHba)) < 0){
- adpt_i2o_delete_hba(pHba);
- return rcode;
- }
- pHba->state &= ~DPTI_STATE_RESET;
-
- adpt_fail_posted_scbs(pHba);
- return 0; /* return success */
-}
-
-/*===========================================================================
- *
- *===========================================================================
- */
-
-
-static void adpt_i2o_sys_shutdown(void)
-{
- adpt_hba *pHba, *pNext;
- struct adpt_i2o_post_wait_data *p1, *old;
-
- printk(KERN_INFO"Shutting down Adaptec I2O controllers.\n");
- printk(KERN_INFO" This could take a few minutes if there are many devices attached\n");
- /* Delete all IOPs from the controller chain */
- /* They should have already been released by the
- * scsi-core
- */
- for (pHba = hba_chain; pHba; pHba = pNext) {
- pNext = pHba->next;
- adpt_i2o_delete_hba(pHba);
- }
-
-	/* Remove any timed-out entries from the wait queue. */
-// spin_lock_irqsave(&adpt_post_wait_lock, flags);
- /* Nothing should be outstanding at this point so just
- * free them
- */
- for(p1 = adpt_post_wait_queue; p1;) {
- old = p1;
- p1 = p1->next;
- kfree(old);
- }
-// spin_unlock_irqrestore(&adpt_post_wait_lock, flags);
- adpt_post_wait_queue = NULL;
-
- printk(KERN_INFO "Adaptec I2O controllers down.\n");
-}
-
-static int adpt_install_hba(struct scsi_host_template* sht, struct pci_dev* pDev)
-{
-
- adpt_hba* pHba = NULL;
- adpt_hba* p = NULL;
- ulong base_addr0_phys = 0;
- ulong base_addr1_phys = 0;
- u32 hba_map0_area_size = 0;
- u32 hba_map1_area_size = 0;
- void __iomem *base_addr_virt = NULL;
- void __iomem *msg_addr_virt = NULL;
- int dma64 = 0;
-
- int raptorFlag = FALSE;
-
- if(pci_enable_device(pDev)) {
- return -EINVAL;
- }
-
- if (pci_request_regions(pDev, "dpt_i2o")) {
- PERROR("dpti: adpt_config_hba: pci request region failed\n");
- return -EINVAL;
- }
-
- pci_set_master(pDev);
-
- /*
- * See if we should enable dma64 mode.
- */
- if (sizeof(dma_addr_t) > 4 &&
- pci_set_dma_mask(pDev, DMA_BIT_MASK(64)) == 0) {
- if (dma_get_required_mask(&pDev->dev) > DMA_BIT_MASK(32))
- dma64 = 1;
- }
- if (!dma64 && pci_set_dma_mask(pDev, DMA_BIT_MASK(32)) != 0)
- return -EINVAL;
-
- /* adapter only supports message blocks below 4GB */
- pci_set_consistent_dma_mask(pDev, DMA_BIT_MASK(32));
-
- base_addr0_phys = pci_resource_start(pDev,0);
- hba_map0_area_size = pci_resource_len(pDev,0);
-
- // Check if standard PCI card or single BAR Raptor
- if(pDev->device == PCI_DPT_DEVICE_ID){
- if(pDev->subsystem_device >=0xc032 && pDev->subsystem_device <= 0xc03b){
- // Raptor card with this device id needs 4M
- hba_map0_area_size = 0x400000;
- } else { // Not Raptor - it is a PCI card
- if(hba_map0_area_size > 0x100000 ){
- hba_map0_area_size = 0x100000;
- }
- }
- } else {// Raptor split BAR config
- // Use BAR1 in this configuration
- base_addr1_phys = pci_resource_start(pDev,1);
- hba_map1_area_size = pci_resource_len(pDev,1);
- raptorFlag = TRUE;
- }
-
-#if BITS_PER_LONG == 64
- /*
- * The original Adaptec 64 bit driver has this comment here:
- * "x86_64 machines need more optimal mappings"
- *
- * I assume some HBAs report ridiculously large mappings
- * and we need to limit them on platforms with IOMMUs.
- */
- if (raptorFlag == TRUE) {
- if (hba_map0_area_size > 128)
- hba_map0_area_size = 128;
- if (hba_map1_area_size > 524288)
- hba_map1_area_size = 524288;
- } else {
- if (hba_map0_area_size > 524288)
- hba_map0_area_size = 524288;
- }
-#endif
-
- base_addr_virt = ioremap(base_addr0_phys,hba_map0_area_size);
- if (!base_addr_virt) {
- pci_release_regions(pDev);
- PERROR("dpti: adpt_config_hba: io remap failed\n");
- return -EINVAL;
- }
-
- if(raptorFlag == TRUE) {
- msg_addr_virt = ioremap(base_addr1_phys, hba_map1_area_size );
- if (!msg_addr_virt) {
- PERROR("dpti: adpt_config_hba: io remap failed on BAR1\n");
- iounmap(base_addr_virt);
- pci_release_regions(pDev);
- return -EINVAL;
- }
- } else {
- msg_addr_virt = base_addr_virt;
- }
-
- // Allocate and zero the data structure
- pHba = kzalloc(sizeof(adpt_hba), GFP_KERNEL);
- if (!pHba) {
- if (msg_addr_virt != base_addr_virt)
- iounmap(msg_addr_virt);
- iounmap(base_addr_virt);
- pci_release_regions(pDev);
- return -ENOMEM;
- }
-
- mutex_lock(&adpt_configuration_lock);
-
- if(hba_chain != NULL){
- for(p = hba_chain; p->next; p = p->next);
- p->next = pHba;
- } else {
- hba_chain = pHba;
- }
- pHba->next = NULL;
- pHba->unit = hba_count;
- sprintf(pHba->name, "dpti%d", hba_count);
- hba_count++;
-
- mutex_unlock(&adpt_configuration_lock);
-
- pHba->pDev = pDev;
- pHba->base_addr_phys = base_addr0_phys;
-
- // Set up the Virtual Base Address of the I2O Device
- pHba->base_addr_virt = base_addr_virt;
- pHba->msg_addr_virt = msg_addr_virt;
- pHba->irq_mask = base_addr_virt+0x30;
- pHba->post_port = base_addr_virt+0x40;
- pHba->reply_port = base_addr_virt+0x44;
-
- pHba->hrt = NULL;
- pHba->lct = NULL;
- pHba->lct_size = 0;
- pHba->status_block = NULL;
- pHba->post_count = 0;
- pHba->state = DPTI_STATE_RESET;
- pHba->pDev = pDev;
- pHba->devices = NULL;
- pHba->dma64 = dma64;
-
- // Initializing the spinlocks
- spin_lock_init(&pHba->state_lock);
- spin_lock_init(&adpt_post_wait_lock);
-
- if(raptorFlag == 0){
- printk(KERN_INFO "Adaptec I2O RAID controller"
- " %d at %p size=%x irq=%d%s\n",
- hba_count-1, base_addr_virt,
- hba_map0_area_size, pDev->irq,
- dma64 ? " (64-bit DMA)" : "");
- } else {
- printk(KERN_INFO"Adaptec I2O RAID controller %d irq=%d%s\n",
- hba_count-1, pDev->irq,
- dma64 ? " (64-bit DMA)" : "");
- printk(KERN_INFO" BAR0 %p - size= %x\n",base_addr_virt,hba_map0_area_size);
- printk(KERN_INFO" BAR1 %p - size= %x\n",msg_addr_virt,hba_map1_area_size);
- }
-
- if (request_irq (pDev->irq, adpt_isr, IRQF_SHARED, pHba->name, pHba)) {
- printk(KERN_ERR"%s: Couldn't register IRQ %d\n", pHba->name, pDev->irq);
- adpt_i2o_delete_hba(pHba);
- return -EINVAL;
- }
-
- return 0;
-}
-
-
-static void adpt_i2o_delete_hba(adpt_hba* pHba)
-{
- adpt_hba* p1;
- adpt_hba* p2;
- struct i2o_device* d;
- struct i2o_device* next;
- int i;
- int j;
- struct adpt_device* pDev;
- struct adpt_device* pNext;
-
-
- mutex_lock(&adpt_configuration_lock);
- if(pHba->host){
- free_irq(pHba->host->irq, pHba);
- }
- p2 = NULL;
- for( p1 = hba_chain; p1; p2 = p1,p1=p1->next){
- if(p1 == pHba) {
- if(p2) {
- p2->next = p1->next;
- } else {
- hba_chain = p1->next;
- }
- break;
- }
- }
-
- hba_count--;
- mutex_unlock(&adpt_configuration_lock);
-
- iounmap(pHba->base_addr_virt);
- pci_release_regions(pHba->pDev);
- if(pHba->msg_addr_virt != pHba->base_addr_virt){
- iounmap(pHba->msg_addr_virt);
- }
- if(pHba->FwDebugBuffer_P)
- iounmap(pHba->FwDebugBuffer_P);
- if(pHba->hrt) {
- dma_free_coherent(&pHba->pDev->dev,
- pHba->hrt->num_entries * pHba->hrt->entry_len << 2,
- pHba->hrt, pHba->hrt_pa);
- }
- if(pHba->lct) {
- dma_free_coherent(&pHba->pDev->dev, pHba->lct_size,
- pHba->lct, pHba->lct_pa);
- }
- if(pHba->status_block) {
- dma_free_coherent(&pHba->pDev->dev, sizeof(i2o_status_block),
- pHba->status_block, pHba->status_block_pa);
- }
- if(pHba->reply_pool) {
- dma_free_coherent(&pHba->pDev->dev,
- pHba->reply_fifo_size * REPLY_FRAME_SIZE * 4,
- pHba->reply_pool, pHba->reply_pool_pa);
- }
-
- for(d = pHba->devices; d ; d = next){
- next = d->next;
- kfree(d);
- }
- for(i = 0 ; i < pHba->top_scsi_channel ; i++){
- for(j = 0; j < MAX_ID; j++){
- if(pHba->channel[i].device[j] != NULL){
- for(pDev = pHba->channel[i].device[j]; pDev; pDev = pNext){
- pNext = pDev->next_lun;
- kfree(pDev);
- }
- }
- }
- }
- pci_dev_put(pHba->pDev);
- if (adpt_sysfs_class)
- device_destroy(adpt_sysfs_class,
- MKDEV(DPTI_I2O_MAJOR, pHba->unit));
- kfree(pHba);
-
- if(hba_count <= 0){
- unregister_chrdev(DPTI_I2O_MAJOR, DPT_DRIVER);
- if (adpt_sysfs_class) {
- class_destroy(adpt_sysfs_class);
- adpt_sysfs_class = NULL;
- }
- }
-}
-
-static struct adpt_device* adpt_find_device(adpt_hba* pHba, u32 chan, u32 id, u64 lun)
-{
- struct adpt_device* d;
-
- if(chan < 0 || chan >= MAX_CHANNEL)
- return NULL;
-
- d = pHba->channel[chan].device[id];
- if(!d || d->tid == 0) {
- return NULL;
- }
-
- /* If it is the only lun at that address then this should match*/
- if(d->scsi_lun == lun){
- return d;
- }
-
- /* else we need to look through all the luns */
- for(d=d->next_lun ; d ; d = d->next_lun){
- if(d->scsi_lun == lun){
- return d;
- }
- }
- return NULL;
-}
-
-
-static int adpt_i2o_post_wait(adpt_hba* pHba, u32* msg, int len, int timeout)
-{
- // I used my own version of the WAIT_QUEUE_HEAD
- // to handle some version differences
- // When embedded in the kernel this could go back to the vanilla one
- ADPT_DECLARE_WAIT_QUEUE_HEAD(adpt_wq_i2o_post);
- int status = 0;
- ulong flags = 0;
- struct adpt_i2o_post_wait_data *p1, *p2;
- struct adpt_i2o_post_wait_data *wait_data =
- kmalloc(sizeof(struct adpt_i2o_post_wait_data), GFP_ATOMIC);
- DECLARE_WAITQUEUE(wait, current);
-
- if (!wait_data)
- return -ENOMEM;
-
- /*
- * The spin locking is needed to keep anyone from playing
- * with the queue pointers and id while we do the same
- */
- spin_lock_irqsave(&adpt_post_wait_lock, flags);
- // TODO we need a MORE unique way of getting ids
- // to support async LCT get
- wait_data->next = adpt_post_wait_queue;
- adpt_post_wait_queue = wait_data;
- adpt_post_wait_id++;
- adpt_post_wait_id &= 0x7fff;
- wait_data->id = adpt_post_wait_id;
- spin_unlock_irqrestore(&adpt_post_wait_lock, flags);
-
- wait_data->wq = &adpt_wq_i2o_post;
- wait_data->status = -ETIMEDOUT;
-
- add_wait_queue(&adpt_wq_i2o_post, &wait);
-
- msg[2] |= 0x80000000 | ((u32)wait_data->id);
- timeout *= HZ;
- if((status = adpt_i2o_post_this(pHba, msg, len)) == 0){
- set_current_state(TASK_INTERRUPTIBLE);
- if(pHba->host)
- spin_unlock_irq(pHba->host->host_lock);
- if (!timeout)
- schedule();
- else{
- timeout = schedule_timeout(timeout);
- if (timeout == 0) {
- // I/O issued, but cannot get result in
-				// specified time. Freeing resources is
- // dangerous.
- status = -ETIME;
- }
- }
- if(pHba->host)
- spin_lock_irq(pHba->host->host_lock);
- }
- remove_wait_queue(&adpt_wq_i2o_post, &wait);
-
- if(status == -ETIMEDOUT){
- printk(KERN_INFO"dpti%d: POST WAIT TIMEOUT\n",pHba->unit);
- // We will have to free the wait_data memory during shutdown
- return status;
- }
-
- /* Remove the entry from the queue. */
- p2 = NULL;
- spin_lock_irqsave(&adpt_post_wait_lock, flags);
- for(p1 = adpt_post_wait_queue; p1; p2 = p1, p1 = p1->next) {
- if(p1 == wait_data) {
- if(p1->status == I2O_DETAIL_STATUS_UNSUPPORTED_FUNCTION ) {
- status = -EOPNOTSUPP;
- }
- if(p2) {
- p2->next = p1->next;
- } else {
- adpt_post_wait_queue = p1->next;
- }
- break;
- }
- }
- spin_unlock_irqrestore(&adpt_post_wait_lock, flags);
-
- kfree(wait_data);
-
- return status;
-}
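
adpt_i2o_post_wait() is the classic post-wait pattern: tag the request
with a small id, queue a waiter, sleep, and let the completion path
look the id up and wake the sleeper. A hosted analogue using pthreads,
with a condition variable standing in for the wait queue (illustrative
only, not the driver's kernel primitives):

#include <pthread.h>
#include <stdio.h>

struct post_wait {
	int id;
	int status;		/* -1 = still pending */
	struct post_wait *next;
};

static struct post_wait *queue;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wake = PTHREAD_COND_INITIALIZER;

/* Completion side: find the id, record the status, wake sleepers. */
static void complete(int id, int status)
{
	pthread_mutex_lock(&lock);
	for (struct post_wait *p = queue; p; p = p->next)
		if (p->id == id)
			p->status = status;
	pthread_cond_broadcast(&wake);
	pthread_mutex_unlock(&lock);
}

static void *irq_thread(void *arg)
{
	complete(*(int *)arg, 0);	/* pretend the IOP replied with success */
	return NULL;
}

int main(void)
{
	struct post_wait w = { .id = 1, .status = -1, .next = NULL };
	pthread_t t;

	pthread_mutex_lock(&lock);
	w.next = queue;
	queue = &w;			/* enqueue, as adpt_i2o_post_wait() does */
	pthread_mutex_unlock(&lock);

	pthread_create(&t, NULL, irq_thread, &w.id);	/* "post" the message */

	pthread_mutex_lock(&lock);
	while (w.status == -1)		/* sleep until the completion finds our id */
		pthread_cond_wait(&wake, &lock);
	queue = NULL;			/* dequeue our entry */
	pthread_mutex_unlock(&lock);

	printf("status %d\n", w.status);
	pthread_join(t, NULL);
	return 0;
}
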
-
-
-static s32 adpt_i2o_post_this(adpt_hba* pHba, u32* data, int len)
-{
-
- u32 m = EMPTY_QUEUE;
- u32 __iomem *msg;
- ulong timeout = jiffies + 30*HZ;
- do {
- rmb();
- m = readl(pHba->post_port);
- if (m != EMPTY_QUEUE) {
- break;
- }
- if(time_after(jiffies,timeout)){
- printk(KERN_WARNING"dpti%d: Timeout waiting for message frame!\n", pHba->unit);
- return -ETIMEDOUT;
- }
- schedule_timeout_uninterruptible(1);
- } while(m == EMPTY_QUEUE);
-
- msg = pHba->msg_addr_virt + m;
- memcpy_toio(msg, data, len);
- wmb();
-
- //post message
- writel(m, pHba->post_port);
- wmb();
-
- return 0;
-}
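
adpt_i2o_post_this() shows the inbound-queue handshake: reading the
post port yields the offset of a free message frame (or EMPTY_QUEUE),
the message is copied into that frame through the mapped window, and
writing the offset back hands the frame to the IOP. A toy simulation
of that sequence against fake storage (hypothetical values, no real
MMIO or barriers):

#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <inttypes.h>

#define EMPTY_QUEUE 0xffffffffu

static uint32_t fake_post_port = 0x100;	/* offset of one free frame */
static uint8_t fake_window[0x200];	/* stand-in for msg_addr_virt */

int main(void)
{
	uint32_t msg[4] = { 0x11, 0x22, 0x33, 0x44 };
	uint32_t m = fake_post_port;	/* "readl(post_port)": claim a frame */

	if (m == EMPTY_QUEUE)
		return 1;		/* the driver polls with a timeout here */

	memcpy(fake_window + m, msg, sizeof(msg));	/* memcpy_toio() */
	fake_post_port = m;		/* "writel(m, post_port)": post it */

	printf("posted %zu bytes at offset 0x%" PRIx32 "\n", sizeof(msg), m);
	return 0;
}
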
-
-
-static void adpt_i2o_post_wait_complete(u32 context, int status)
-{
- struct adpt_i2o_post_wait_data *p1 = NULL;
- /*
- * We need to search through the adpt_post_wait
- * queue to see if the given message is still
- * outstanding. If not, it means that the IOP
- * took longer to respond to the message than we
- * had allowed and the timer has already expired.
- * Not much we can do about that except log
- * it for debug purposes, increase timeout, and recompile
- *
- * Lock needed to keep anyone from moving queue pointers
- * around while we're looking through them.
- */
-
- context &= 0x7fff;
-
- spin_lock(&adpt_post_wait_lock);
- for(p1 = adpt_post_wait_queue; p1; p1 = p1->next) {
- if(p1->id == context) {
- p1->status = status;
- spin_unlock(&adpt_post_wait_lock);
- wake_up_interruptible(p1->wq);
- return;
- }
- }
- spin_unlock(&adpt_post_wait_lock);
- // If this happens we lose commands that probably really completed
- printk(KERN_DEBUG"dpti: Could Not find task %d in wait queue\n",context);
- printk(KERN_DEBUG" Tasks in wait queue:\n");
- for(p1 = adpt_post_wait_queue; p1; p1 = p1->next) {
- printk(KERN_DEBUG" %d\n",p1->id);
- }
- return;
-}
-
-static s32 adpt_i2o_reset_hba(adpt_hba* pHba)
-{
- u32 msg[8];
- u8* status;
- dma_addr_t addr;
- u32 m = EMPTY_QUEUE ;
- ulong timeout = jiffies + (TMOUT_IOPRESET*HZ);
-
- if(pHba->initialized == FALSE) { // First time reset should be quick
- timeout = jiffies + (25*HZ);
- } else {
- adpt_i2o_quiesce_hba(pHba);
- }
-
- do {
- rmb();
- m = readl(pHba->post_port);
- if (m != EMPTY_QUEUE) {
- break;
- }
- if(time_after(jiffies,timeout)){
- printk(KERN_WARNING"Timeout waiting for message!\n");
- return -ETIMEDOUT;
- }
- schedule_timeout_uninterruptible(1);
- } while (m == EMPTY_QUEUE);
-
- status = dma_alloc_coherent(&pHba->pDev->dev, 4, &addr, GFP_KERNEL);
- if(status == NULL) {
- adpt_send_nop(pHba, m);
- printk(KERN_ERR"IOP reset failed - no free memory.\n");
- return -ENOMEM;
- }
- memset(status,0,4);
-
- msg[0]=EIGHT_WORD_MSG_SIZE|SGL_OFFSET_0;
- msg[1]=I2O_CMD_ADAPTER_RESET<<24|HOST_TID<<12|ADAPTER_TID;
- msg[2]=0;
- msg[3]=0;
- msg[4]=0;
- msg[5]=0;
- msg[6]=dma_low(addr);
- msg[7]=dma_high(addr);
-
- memcpy_toio(pHba->msg_addr_virt+m, msg, sizeof(msg));
- wmb();
- writel(m, pHba->post_port);
- wmb();
-
- while(*status == 0){
- if(time_after(jiffies,timeout)){
- printk(KERN_WARNING"%s: IOP Reset Timeout\n",pHba->name);
-			/* We lose 4 bytes of "status" here, but we cannot
-			   free them because the controller may wake up and
-			   corrupt those bytes at any time */
- /* dma_free_coherent(&pHba->pDev->dev, 4, buf, addr); */
- return -ETIMEDOUT;
- }
- rmb();
- schedule_timeout_uninterruptible(1);
- }
-
- if(*status == 0x01 /*I2O_EXEC_IOP_RESET_IN_PROGRESS*/) {
- PDEBUG("%s: Reset in progress...\n", pHba->name);
-		// Here we wait for a message frame to become available,
-		// indicating that the reset has finished
- do {
- rmb();
- m = readl(pHba->post_port);
- if (m != EMPTY_QUEUE) {
- break;
- }
- if(time_after(jiffies,timeout)){
- printk(KERN_ERR "%s:Timeout waiting for IOP Reset.\n",pHba->name);
-				/* We lose 4 bytes of "status" here, but we
-				   cannot free them because the controller may
-				   wake up and corrupt those bytes at any time */
- /* dma_free_coherent(&pHba->pDev->dev, 4, buf, addr); */
- return -ETIMEDOUT;
- }
- schedule_timeout_uninterruptible(1);
- } while (m == EMPTY_QUEUE);
- // Flush the offset
- adpt_send_nop(pHba, m);
- }
- adpt_i2o_status_get(pHba);
- if(*status == 0x02 ||
- pHba->status_block->iop_state != ADAPTER_STATE_RESET) {
- printk(KERN_WARNING"%s: Reset reject, trying to clear\n",
- pHba->name);
- } else {
- PDEBUG("%s: Reset completed.\n", pHba->name);
- }
-
- dma_free_coherent(&pHba->pDev->dev, 4, status, addr);
-#ifdef UARTDELAY
- // This delay is to allow someone attached to the card through the debug UART to
- // set up the dump levels that they want before the rest of the initialization sequence
- adpt_delay(20000);
-#endif
- return 0;
-}
-
-
-static int adpt_i2o_parse_lct(adpt_hba* pHba)
-{
- int i;
- int max;
- int tid;
- struct i2o_device *d;
- i2o_lct *lct = pHba->lct;
- u8 bus_no = 0;
- s16 scsi_id;
- u64 scsi_lun;
- u32 buf[10]; // larger than 7, or 8 ...
- struct adpt_device* pDev;
-
- if (lct == NULL) {
- printk(KERN_ERR "%s: LCT is empty???\n",pHba->name);
- return -1;
- }
-
- max = lct->table_size;
- max -= 3;
- max /= 9;
-
- for(i=0;i<max;i++) {
- if( lct->lct_entry[i].user_tid != 0xfff){
- /*
- * If we have hidden devices, we need to inform the upper layers about
- * the possible maximum id reference to handle device access when
- * an array is disassembled. This code has no other purpose but to
- * allow us future access to devices that are currently hidden
- * behind arrays, hotspares or have not been configured (JBOD mode).
- */
- if( lct->lct_entry[i].class_id != I2O_CLASS_RANDOM_BLOCK_STORAGE &&
- lct->lct_entry[i].class_id != I2O_CLASS_SCSI_PERIPHERAL &&
- lct->lct_entry[i].class_id != I2O_CLASS_FIBRE_CHANNEL_PERIPHERAL ){
- continue;
- }
- tid = lct->lct_entry[i].tid;
- // I2O_DPT_DEVICE_INFO_GROUP_NO;
- if(adpt_i2o_query_scalar(pHba, tid, 0x8000, -1, buf, 32)<0) {
- continue;
- }
- bus_no = buf[0]>>16;
- scsi_id = buf[1];
- scsi_lun = scsilun_to_int((struct scsi_lun *)&buf[2]);
- if(bus_no >= MAX_CHANNEL) { // Something wrong skip it
- printk(KERN_WARNING"%s: Channel number %d out of range \n", pHba->name, bus_no);
- continue;
- }
- if (scsi_id >= MAX_ID){
-				printk(KERN_WARNING"%s: SCSI ID %d out of range\n", pHba->name, scsi_id);
- continue;
- }
- if(bus_no > pHba->top_scsi_channel){
- pHba->top_scsi_channel = bus_no;
- }
- if(scsi_id > pHba->top_scsi_id){
- pHba->top_scsi_id = scsi_id;
- }
- if(scsi_lun > pHba->top_scsi_lun){
- pHba->top_scsi_lun = scsi_lun;
- }
- continue;
- }
- d = kmalloc(sizeof(struct i2o_device), GFP_KERNEL);
- if(d==NULL)
- {
- printk(KERN_CRIT"%s: Out of memory for I2O device data.\n",pHba->name);
- return -ENOMEM;
- }
-
- d->controller = pHba;
- d->next = NULL;
-
- memcpy(&d->lct_data, &lct->lct_entry[i], sizeof(i2o_lct_entry));
-
- d->flags = 0;
- tid = d->lct_data.tid;
- adpt_i2o_report_hba_unit(pHba, d);
- adpt_i2o_install_device(pHba, d);
- }
- bus_no = 0;
- for(d = pHba->devices; d ; d = d->next) {
- if(d->lct_data.class_id == I2O_CLASS_BUS_ADAPTER_PORT ||
- d->lct_data.class_id == I2O_CLASS_FIBRE_CHANNEL_PORT){
- tid = d->lct_data.tid;
- // TODO get the bus_no from hrt-but for now they are in order
- //bus_no =
- if(bus_no > pHba->top_scsi_channel){
- pHba->top_scsi_channel = bus_no;
- }
- pHba->channel[bus_no].type = d->lct_data.class_id;
- pHba->channel[bus_no].tid = tid;
- if(adpt_i2o_query_scalar(pHba, tid, 0x0200, -1, buf, 28)>=0)
- {
- pHba->channel[bus_no].scsi_id = buf[1];
- PDEBUG("Bus %d - SCSI ID %d.\n", bus_no, buf[1]);
- }
- // TODO remove - this is just until we get from hrt
- bus_no++;
- if(bus_no >= MAX_CHANNEL) { // Something wrong skip it
- printk(KERN_WARNING"%s: Channel number %d out of range - LCT\n", pHba->name, bus_no);
- break;
- }
- }
- }
-
- // Setup adpt_device table
- for(d = pHba->devices; d ; d = d->next) {
- if(d->lct_data.class_id == I2O_CLASS_RANDOM_BLOCK_STORAGE ||
- d->lct_data.class_id == I2O_CLASS_SCSI_PERIPHERAL ||
- d->lct_data.class_id == I2O_CLASS_FIBRE_CHANNEL_PERIPHERAL ){
-
- tid = d->lct_data.tid;
- scsi_id = -1;
- // I2O_DPT_DEVICE_INFO_GROUP_NO;
- if(adpt_i2o_query_scalar(pHba, tid, 0x8000, -1, buf, 32)>=0) {
- bus_no = buf[0]>>16;
- scsi_id = buf[1];
- scsi_lun = scsilun_to_int((struct scsi_lun *)&buf[2]);
- if(bus_no >= MAX_CHANNEL) { // Something wrong skip it
- continue;
- }
- if (scsi_id >= MAX_ID) {
- continue;
- }
- if( pHba->channel[bus_no].device[scsi_id] == NULL){
- pDev = kzalloc(sizeof(struct adpt_device),GFP_KERNEL);
- if(pDev == NULL) {
- return -ENOMEM;
- }
- pHba->channel[bus_no].device[scsi_id] = pDev;
- } else {
- for( pDev = pHba->channel[bus_no].device[scsi_id];
- pDev->next_lun; pDev = pDev->next_lun){
- }
- pDev->next_lun = kzalloc(sizeof(struct adpt_device),GFP_KERNEL);
- if(pDev->next_lun == NULL) {
- return -ENOMEM;
- }
- pDev = pDev->next_lun;
- }
- pDev->tid = tid;
- pDev->scsi_channel = bus_no;
- pDev->scsi_id = scsi_id;
- pDev->scsi_lun = scsi_lun;
- pDev->pI2o_dev = d;
- d->owner = pDev;
- pDev->type = (buf[0])&0xff;
- pDev->flags = (buf[0]>>8)&0xff;
- if(scsi_id > pHba->top_scsi_id){
- pHba->top_scsi_id = scsi_id;
- }
- if(scsi_lun > pHba->top_scsi_lun){
- pHba->top_scsi_lun = scsi_lun;
- }
- }
- if(scsi_id == -1){
- printk(KERN_WARNING"Could not find SCSI ID for %s\n",
- d->lct_data.identity_tag);
- }
- }
- }
- return 0;
-}
-
-
-/*
- * Each I2O controller has a chain of devices on it - these match
- * the useful parts of the LCT of the board.
- */
-
-static int adpt_i2o_install_device(adpt_hba* pHba, struct i2o_device *d)
-{
- mutex_lock(&adpt_configuration_lock);
- d->controller=pHba;
- d->owner=NULL;
- d->next=pHba->devices;
- d->prev=NULL;
- if (pHba->devices != NULL){
- pHba->devices->prev=d;
- }
- pHba->devices=d;
- *d->dev_name = 0;
-
- mutex_unlock(&adpt_configuration_lock);
- return 0;
-}
-
-static int adpt_open(struct inode *inode, struct file *file)
-{
- int minor;
- adpt_hba* pHba;
-
- mutex_lock(&adpt_mutex);
- //TODO check for root access
- //
- minor = iminor(inode);
- if (minor >= hba_count) {
- mutex_unlock(&adpt_mutex);
- return -ENXIO;
- }
- mutex_lock(&adpt_configuration_lock);
- for (pHba = hba_chain; pHba; pHba = pHba->next) {
- if (pHba->unit == minor) {
- break; /* found adapter */
- }
- }
- if (pHba == NULL) {
- mutex_unlock(&adpt_configuration_lock);
- mutex_unlock(&adpt_mutex);
- return -ENXIO;
- }
-
-// if(pHba->in_use){
- // mutex_unlock(&adpt_configuration_lock);
-// return -EBUSY;
-// }
-
- pHba->in_use = 1;
- mutex_unlock(&adpt_configuration_lock);
- mutex_unlock(&adpt_mutex);
-
- return 0;
-}
-
-static int adpt_close(struct inode *inode, struct file *file)
-{
- int minor;
- adpt_hba* pHba;
-
- minor = iminor(inode);
- if (minor >= hba_count) {
- return -ENXIO;
- }
- mutex_lock(&adpt_configuration_lock);
- for (pHba = hba_chain; pHba; pHba = pHba->next) {
- if (pHba->unit == minor) {
- break; /* found adapter */
- }
- }
- mutex_unlock(&adpt_configuration_lock);
- if (pHba == NULL) {
- return -ENXIO;
- }
-
- pHba->in_use = 0;
-
- return 0;
-}
-
-
-static int adpt_i2o_passthru(adpt_hba* pHba, u32 __user *arg)
-{
- u32 msg[MAX_MESSAGE_SIZE];
- u32* reply = NULL;
- u32 size = 0;
- u32 reply_size = 0;
- u32 __user *user_msg = arg;
- u32 __user * user_reply = NULL;
- void **sg_list = NULL;
- u32 sg_offset = 0;
- u32 sg_count = 0;
- int sg_index = 0;
- u32 i = 0;
- u32 rcode = 0;
- void *p = NULL;
- dma_addr_t addr;
- ulong flags = 0;
-
- memset(&msg, 0, MAX_MESSAGE_SIZE*4);
- // get user msg size in u32s
- if(get_user(size, &user_msg[0])){
- return -EFAULT;
- }
- size = size>>16;
-
- user_reply = &user_msg[size];
- if(size > MAX_MESSAGE_SIZE){
- return -EFAULT;
- }
- size *= 4; // Convert to bytes
-
- /* Copy in the user's I2O command */
- if(copy_from_user(msg, user_msg, size)) {
- return -EFAULT;
- }
- get_user(reply_size, &user_reply[0]);
- reply_size = reply_size>>16;
- if(reply_size > REPLY_FRAME_SIZE){
- reply_size = REPLY_FRAME_SIZE;
- }
- reply_size *= 4;
- reply = kzalloc(REPLY_FRAME_SIZE*4, GFP_KERNEL);
- if(reply == NULL) {
- printk(KERN_WARNING"%s: Could not allocate reply buffer\n",pHba->name);
- return -ENOMEM;
- }
- sg_offset = (msg[0]>>4)&0xf;
- msg[2] = 0x40000000; // IOCTL context
- msg[3] = adpt_ioctl_to_context(pHba, reply);
- if (msg[3] == (u32)-1) {
- rcode = -EBUSY;
- goto free;
- }
-
- sg_list = kcalloc(pHba->sg_tablesize, sizeof(*sg_list), GFP_KERNEL);
- if (!sg_list) {
- rcode = -ENOMEM;
- goto free;
- }
- if(sg_offset) {
- // TODO add 64 bit API
- struct sg_simple_element *sg = (struct sg_simple_element*) (msg+sg_offset);
- sg_count = (size - sg_offset*4) / sizeof(struct sg_simple_element);
- if (sg_count > pHba->sg_tablesize){
- printk(KERN_DEBUG"%s:IOCTL SG List too large (%u)\n", pHba->name,sg_count);
- rcode = -EINVAL;
- goto free;
- }
-
- for(i = 0; i < sg_count; i++) {
- int sg_size;
-
- if (!(sg[i].flag_count & 0x10000000 /*I2O_SGL_FLAGS_SIMPLE_ADDRESS_ELEMENT*/)) {
- printk(KERN_DEBUG"%s:Bad SG element %d - not simple (%x)\n",pHba->name,i, sg[i].flag_count);
- rcode = -EINVAL;
- goto cleanup;
- }
- sg_size = sg[i].flag_count & 0xffffff;
- /* Allocate memory for the transfer */
- p = dma_alloc_coherent(&pHba->pDev->dev, sg_size, &addr, GFP_KERNEL);
- if(!p) {
- printk(KERN_DEBUG"%s: Could not allocate SG buffer - size = %d buffer number %d of %d\n",
- pHba->name,sg_size,i,sg_count);
- rcode = -ENOMEM;
- goto cleanup;
- }
- sg_list[sg_index++] = p; // sglist indexed with input frame, not our internal frame.
- /* Copy in the user's SG buffer if necessary */
- if(sg[i].flag_count & 0x04000000 /*I2O_SGL_FLAGS_DIR*/) {
- // sg_simple_element API is 32 bit
- if (copy_from_user(p,(void __user *)(ulong)sg[i].addr_bus, sg_size)) {
- printk(KERN_DEBUG"%s: Could not copy SG buf %d FROM user\n",pHba->name,i);
- rcode = -EFAULT;
- goto cleanup;
- }
- }
- /* sg_simple_element API is 32 bit, but addr < 4GB */
- sg[i].addr_bus = addr;
- }
- }
-
- do {
- /*
-		 * Stop any new commands from entering the
- * controller while processing the ioctl
- */
- if (pHba->host) {
- scsi_block_requests(pHba->host);
- spin_lock_irqsave(pHba->host->host_lock, flags);
- }
- rcode = adpt_i2o_post_wait(pHba, msg, size, FOREVER);
- if (rcode != 0)
- printk("adpt_i2o_passthru: post wait failed %d %p\n",
- rcode, reply);
- if (pHba->host) {
- spin_unlock_irqrestore(pHba->host->host_lock, flags);
- scsi_unblock_requests(pHba->host);
- }
- } while (rcode == -ETIMEDOUT);
-
- if(rcode){
- goto cleanup;
- }
-
- if(sg_offset) {
-		/* Copy the scatter/gather buffers back to user space */
- u32 j;
- // TODO add 64 bit API
- struct sg_simple_element* sg;
- int sg_size;
-
-		// re-acquire the original message to handle the sg copy operation correctly
- memset(&msg, 0, MAX_MESSAGE_SIZE*4);
- // get user msg size in u32s
- if(get_user(size, &user_msg[0])){
- rcode = -EFAULT;
- goto cleanup;
- }
- size = size>>16;
- size *= 4;
- if (size > MAX_MESSAGE_SIZE) {
- rcode = -EINVAL;
- goto cleanup;
- }
- /* Copy in the user's I2O command */
- if (copy_from_user (msg, user_msg, size)) {
- rcode = -EFAULT;
- goto cleanup;
- }
- sg_count = (size - sg_offset*4) / sizeof(struct sg_simple_element);
-
- // TODO add 64 bit API
- sg = (struct sg_simple_element*)(msg + sg_offset);
- for (j = 0; j < sg_count; j++) {
- /* Copy out the SG list to user's buffer if necessary */
- if(! (sg[j].flag_count & 0x4000000 /*I2O_SGL_FLAGS_DIR*/)) {
- sg_size = sg[j].flag_count & 0xffffff;
- // sg_simple_element API is 32 bit
- if (copy_to_user((void __user *)(ulong)sg[j].addr_bus,sg_list[j], sg_size)) {
- printk(KERN_WARNING"%s: Could not copy %p TO user %x\n",pHba->name, sg_list[j], sg[j].addr_bus);
- rcode = -EFAULT;
- goto cleanup;
- }
- }
- }
- }
-
- /* Copy back the reply to user space */
- if (reply_size) {
- // we wrote our own values for context - now restore the user supplied ones
- if(copy_from_user(reply+2, user_msg+2, sizeof(u32)*2)) {
- printk(KERN_WARNING"%s: Could not copy message context FROM user\n",pHba->name);
- rcode = -EFAULT;
- }
- if(copy_to_user(user_reply, reply, reply_size)) {
- printk(KERN_WARNING"%s: Could not copy reply TO user\n",pHba->name);
- rcode = -EFAULT;
- }
- }
-
-
-cleanup:
- if (rcode != -ETIME && rcode != -EINTR) {
- struct sg_simple_element *sg =
- (struct sg_simple_element*) (msg +sg_offset);
- while(sg_index) {
- if(sg_list[--sg_index]) {
- dma_free_coherent(&pHba->pDev->dev,
- sg[sg_index].flag_count & 0xffffff,
- sg_list[sg_index],
- sg[sg_index].addr_bus);
- }
- }
- }
-
-free:
- kfree(sg_list);
- kfree(reply);
- return rcode;
-}
-
-#if defined __ia64__
-static void adpt_ia64_info(sysInfo_S* si)
-{
- // This is all the info we need for now
- // We will add more info as our new
-	// management utility requires it
- si->processorType = PROC_IA64;
-}
-#endif
-
-#if defined __sparc__
-static void adpt_sparc_info(sysInfo_S* si)
-{
- // This is all the info we need for now
- // We will add more info as our new
-	// management utility requires it
- si->processorType = PROC_ULTRASPARC;
-}
-#endif
-#if defined __alpha__
-static void adpt_alpha_info(sysInfo_S* si)
-{
- // This is all the info we need for now
- // We will add more info as our new
-	// management utility requires it
- si->processorType = PROC_ALPHA;
-}
-#endif
-
-#if defined __i386__
-
-#include <uapi/asm/vm86.h>
-
-static void adpt_i386_info(sysInfo_S* si)
-{
- // This is all the info we need for now
- // We will add more info as our new
-	// management utility requires it
- switch (boot_cpu_data.x86) {
- case CPU_386:
- si->processorType = PROC_386;
- break;
- case CPU_486:
- si->processorType = PROC_486;
- break;
- case CPU_586:
- si->processorType = PROC_PENTIUM;
- break;
- default: // Just in case
- si->processorType = PROC_PENTIUM;
- break;
- }
-}
-#endif
-
-/*
- * This routine returns information about the system. This does not affect
- * any logic and if the info is wrong - it doesn't matter.
- */
-
-/* Get all the info we can not get from kernel services */
-static int adpt_system_info(void __user *buffer)
-{
- sysInfo_S si;
-
- memset(&si, 0, sizeof(si));
-
- si.osType = OS_LINUX;
- si.osMajorVersion = 0;
- si.osMinorVersion = 0;
- si.osRevision = 0;
- si.busType = SI_PCI_BUS;
- si.processorFamily = DPTI_sig.dsProcessorFamily;
-
-#if defined __i386__
- adpt_i386_info(&si);
-#elif defined (__ia64__)
- adpt_ia64_info(&si);
-#elif defined(__sparc__)
- adpt_sparc_info(&si);
-#elif defined (__alpha__)
- adpt_alpha_info(&si);
-#else
- si.processorType = 0xff ;
-#endif
- if (copy_to_user(buffer, &si, sizeof(si))){
- printk(KERN_WARNING"dpti: Could not copy buffer TO user\n");
- return -EFAULT;
- }
-
- return 0;
-}
-
-static int adpt_ioctl(struct inode *inode, struct file *file, uint cmd, ulong arg)
-{
- int minor;
- int error = 0;
- adpt_hba* pHba;
- ulong flags = 0;
- void __user *argp = (void __user *)arg;
-
- minor = iminor(inode);
- if (minor >= DPTI_MAX_HBA){
- return -ENXIO;
- }
- mutex_lock(&adpt_configuration_lock);
- for (pHba = hba_chain; pHba; pHba = pHba->next) {
- if (pHba->unit == minor) {
- break; /* found adapter */
- }
- }
- mutex_unlock(&adpt_configuration_lock);
- if(pHba == NULL){
- return -ENXIO;
- }
-
- while((volatile u32) pHba->state & DPTI_STATE_RESET )
- schedule_timeout_uninterruptible(2);
-
- switch (cmd) {
- // TODO: handle 3 cases
- case DPT_SIGNATURE:
- if (copy_to_user(argp, &DPTI_sig, sizeof(DPTI_sig))) {
- return -EFAULT;
- }
- break;
- case I2OUSRCMD:
- return adpt_i2o_passthru(pHba, argp);
-
- case DPT_CTRLINFO:{
- drvrHBAinfo_S HbaInfo;
-
-#define FLG_OSD_PCI_VALID 0x0001
-#define FLG_OSD_DMA 0x0002
-#define FLG_OSD_I2O 0x0004
- memset(&HbaInfo, 0, sizeof(HbaInfo));
- HbaInfo.drvrHBAnum = pHba->unit;
- HbaInfo.baseAddr = (ulong) pHba->base_addr_phys;
- HbaInfo.blinkState = adpt_read_blink_led(pHba);
- HbaInfo.pciBusNum = pHba->pDev->bus->number;
- HbaInfo.pciDeviceNum=PCI_SLOT(pHba->pDev->devfn);
- HbaInfo.Interrupt = pHba->pDev->irq;
- HbaInfo.hbaFlags = FLG_OSD_PCI_VALID | FLG_OSD_DMA | FLG_OSD_I2O;
- if(copy_to_user(argp, &HbaInfo, sizeof(HbaInfo))){
- printk(KERN_WARNING"%s: Could not copy HbaInfo TO user\n",pHba->name);
- return -EFAULT;
- }
- break;
- }
- case DPT_SYSINFO:
- return adpt_system_info(argp);
- case DPT_BLINKLED:{
- u32 value;
- value = (u32)adpt_read_blink_led(pHba);
- if (copy_to_user(argp, &value, sizeof(value))) {
- return -EFAULT;
- }
- break;
- }
- case I2ORESETCMD: {
- struct Scsi_Host *shost = pHba->host;
-
- if (shost)
- spin_lock_irqsave(shost->host_lock, flags);
- adpt_hba_reset(pHba);
- if (shost)
- spin_unlock_irqrestore(shost->host_lock, flags);
- break;
- }
- case I2ORESCANCMD:
- adpt_rescan(pHba);
- break;
- default:
- return -EINVAL;
- }
-
- return error;
-}
-
-static long adpt_unlocked_ioctl(struct file *file, uint cmd, ulong arg)
-{
- struct inode *inode;
- long ret;
-
- inode = file_inode(file);
-
- mutex_lock(&adpt_mutex);
- ret = adpt_ioctl(inode, file, cmd, arg);
- mutex_unlock(&adpt_mutex);
-
- return ret;
-}
-
-#ifdef CONFIG_COMPAT
-static long compat_adpt_ioctl(struct file *file,
- unsigned int cmd, unsigned long arg)
-{
- struct inode *inode;
- long ret;
-
- inode = file_inode(file);
-
- mutex_lock(&adpt_mutex);
-
- switch(cmd) {
- case DPT_SIGNATURE:
- case I2OUSRCMD:
- case DPT_CTRLINFO:
- case DPT_SYSINFO:
- case DPT_BLINKLED:
- case I2ORESETCMD:
- case I2ORESCANCMD:
- case (DPT_TARGET_BUSY & 0xFFFF):
- case DPT_TARGET_BUSY:
- ret = adpt_ioctl(inode, file, cmd, arg);
- break;
- default:
- ret = -ENOIOCTLCMD;
- }
-
- mutex_unlock(&adpt_mutex);
-
- return ret;
-}
-#endif
-
-static irqreturn_t adpt_isr(int irq, void *dev_id)
-{
- struct scsi_cmnd* cmd;
- adpt_hba* pHba = dev_id;
- u32 m;
- void __iomem *reply;
- u32 status=0;
- u32 context;
- ulong flags = 0;
- int handled = 0;
-
- if (pHba == NULL){
- printk(KERN_WARNING"adpt_isr: NULL dev_id\n");
- return IRQ_NONE;
- }
- if(pHba->host)
- spin_lock_irqsave(pHba->host->host_lock, flags);
-
- while( readl(pHba->irq_mask) & I2O_INTERRUPT_PENDING_B) {
- m = readl(pHba->reply_port);
- if(m == EMPTY_QUEUE){
- // Try twice then give up
- rmb();
- m = readl(pHba->reply_port);
- if(m == EMPTY_QUEUE){
- // This really should not happen
- printk(KERN_ERR"dpti: Could not get reply frame\n");
- goto out;
- }
- }
- if (pHba->reply_pool_pa <= m &&
- m < pHba->reply_pool_pa +
- (pHba->reply_fifo_size * REPLY_FRAME_SIZE * 4)) {
- reply = (u8 *)pHba->reply_pool +
- (m - pHba->reply_pool_pa);
- } else {
- /* Ick, we should *never* be here */
- printk(KERN_ERR "dpti: reply frame not from pool\n");
- reply = (u8 *)bus_to_virt(m);
- }
-
- if (readl(reply) & MSG_FAIL) {
- u32 old_m = readl(reply+28);
- void __iomem *msg;
- u32 old_context;
- PDEBUG("%s: Failed message\n",pHba->name);
- if(old_m >= 0x100000){
- printk(KERN_ERR"%s: Bad preserved MFA (%x)- dropping frame\n",pHba->name,old_m);
- writel(m,pHba->reply_port);
- continue;
- }
- // Transaction context is 0 in failed reply frame
- msg = pHba->msg_addr_virt + old_m;
- old_context = readl(msg+12);
- writel(old_context, reply+12);
- adpt_send_nop(pHba, old_m);
- }
- context = readl(reply+8);
- if(context & 0x40000000){ // IOCTL
- void *p = adpt_ioctl_from_context(pHba, readl(reply+12));
- if( p != NULL) {
- memcpy_fromio(p, reply, REPLY_FRAME_SIZE * 4);
- }
- // All IOCTLs will also be post wait
- }
- if(context & 0x80000000){ // Post wait message
- status = readl(reply+16);
- if(status >> 24){
- status &= 0xffff; /* Get detail status */
- } else {
- status = I2O_POST_WAIT_OK;
- }
- if(!(context & 0x40000000)) {
- cmd = adpt_cmd_from_context(pHba,
- readl(reply+12));
- if(cmd != NULL) {
- printk(KERN_WARNING"%s: Apparent SCSI cmd in Post Wait Context - cmd=%p context=%x\n", pHba->name, cmd, context);
- }
- }
- adpt_i2o_post_wait_complete(context, status);
- } else { // SCSI message
- cmd = adpt_cmd_from_context (pHba, readl(reply+12));
- if(cmd != NULL){
- scsi_dma_unmap(cmd);
-			if(cmd->serial_number != 0) { // If not timed out
- adpt_i2o_to_scsi(reply, cmd);
- }
- }
- }
- writel(m, pHba->reply_port);
- wmb();
- rmb();
- }
- handled = 1;
-out: if(pHba->host)
- spin_unlock_irqrestore(pHba->host->host_lock, flags);
- return IRQ_RETVAL(handled);
-}
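/*
 * A note on the context convention used in adpt_isr() above: the driver
 * routes replies with the two high bits of the transaction context,
 * using bare literals. A minimal sketch with illustrative names (these
 * #defines are not in the driver itself):
 */
#define ADPT_CONTEXT_IOCTL	0x40000000 /* reply is copied back for adpt_i2o_passthru() */
#define ADPT_CONTEXT_POST_WAIT	0x80000000 /* reply wakes a sleeping adpt_i2o_post_wait() caller */
/*
 * A frame may carry both bits -- every ioctl passthru is also posted as
 * a post-wait message -- which is why adpt_isr() handles the 0x40000000
 * copy first and then falls through to the 0x80000000 completion.
 */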
-
-static s32 adpt_scsi_to_i2o(adpt_hba* pHba, struct scsi_cmnd* cmd, struct adpt_device* d)
-{
- int i;
- u32 msg[MAX_MESSAGE_SIZE];
- u32* mptr;
- u32* lptr;
- u32 *lenptr;
- int direction;
- int scsidir;
- int nseg;
- u32 len;
- u32 reqlen;
- s32 rcode;
- dma_addr_t addr;
-
- memset(msg, 0 , sizeof(msg));
- len = scsi_bufflen(cmd);
- direction = 0x00000000;
-
- scsidir = 0x00000000; // DATA NO XFER
- if(len) {
- /*
- * Set SCBFlags to indicate if data is being transferred
- * in or out, or no data transfer
- * Note: Do not have to verify index is less than 0 since
- * cmd->cmnd[0] is an unsigned char
- */
- switch(cmd->sc_data_direction){
- case DMA_FROM_DEVICE:
- scsidir =0x40000000; // DATA IN (iop<--dev)
- break;
- case DMA_TO_DEVICE:
- direction=0x04000000; // SGL OUT
- scsidir =0x80000000; // DATA OUT (iop-->dev)
- break;
- case DMA_NONE:
- break;
- case DMA_BIDIRECTIONAL:
- scsidir =0x40000000; // DATA IN (iop<--dev)
- // Assume In - and continue;
- break;
- default:
- printk(KERN_WARNING"%s: scsi opcode 0x%x not supported.\n",
- pHba->name, cmd->cmnd[0]);
- cmd->result = (DID_OK <<16) | (INITIATOR_ERROR << 8);
- cmd->scsi_done(cmd);
- return 0;
- }
- }
- // msg[0] is set later
- // I2O_CMD_SCSI_EXEC
- msg[1] = ((0xff<<24)|(HOST_TID<<12)|d->tid);
- msg[2] = 0;
- msg[3] = adpt_cmd_to_context(cmd); /* Want SCSI control block back */
- // Our cards use the transaction context as the tag for queueing
- // Adaptec/DPT Private stuff
- msg[4] = I2O_CMD_SCSI_EXEC|(DPT_ORGANIZATION_ID<<16);
- msg[5] = d->tid;
- /* Direction, disconnect ok | sense data | simple queue , CDBLen */
- // I2O_SCB_FLAG_ENABLE_DISCONNECT |
- // I2O_SCB_FLAG_SIMPLE_QUEUE_TAG |
- // I2O_SCB_FLAG_SENSE_DATA_IN_MESSAGE;
- msg[6] = scsidir|0x20a00000|cmd->cmd_len;
-
- mptr=msg+7;
-
- // Write SCSI command into the message - always 16 byte block
- memset(mptr, 0, 16);
- memcpy(mptr, cmd->cmnd, cmd->cmd_len);
- mptr+=4;
- lenptr=mptr++; /* Remember me - fill in when we know */
- if (dpt_dma64(pHba)) {
- reqlen = 16; // SINGLE SGE
- *mptr++ = (0x7C<<24)+(2<<16)+0x02; /* Enable 64 bit */
- *mptr++ = 1 << PAGE_SHIFT;
- } else {
- reqlen = 14; // SINGLE SGE
- }
- /* Now fill in the SGList and command */
-
- nseg = scsi_dma_map(cmd);
- BUG_ON(nseg < 0);
- if (nseg) {
- struct scatterlist *sg;
-
- len = 0;
- scsi_for_each_sg(cmd, sg, nseg, i) {
- lptr = mptr;
- *mptr++ = direction|0x10000000|sg_dma_len(sg);
- len+=sg_dma_len(sg);
- addr = sg_dma_address(sg);
- *mptr++ = dma_low(addr);
- if (dpt_dma64(pHba))
- *mptr++ = dma_high(addr);
- /* Make this an end of list */
- if (i == nseg - 1)
- *lptr = direction|0xD0000000|sg_dma_len(sg);
- }
- reqlen = mptr - msg;
- *lenptr = len;
-
- if(cmd->underflow && len != cmd->underflow){
- printk(KERN_WARNING"Cmd len %08X Cmd underflow %08X\n",
- len, cmd->underflow);
- }
- } else {
- *lenptr = len = 0;
- reqlen = 12;
- }
-
- /* Stick the headers on */
- msg[0] = reqlen<<16 | ((reqlen > 12) ? SGL_OFFSET_12 : SGL_OFFSET_0);
-
-	// Send it on its way
- rcode = adpt_i2o_post_this(pHba, msg, reqlen<<2);
- if (rcode == 0) {
- return 0;
- }
- return rcode;
-}
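/*
 * Worked example for msg[6] above: a READ(10) (cmd_len == 10) with
 * sc_data_direction == DMA_FROM_DEVICE yields
 *
 *	scsidir (0x40000000) | 0x20a00000 | cmd_len (10) == 0x60a0000a
 *
 * i.e. DATA IN, disconnect enabled, simple queue tag, and sense data
 * returned inline in the reply frame.
 */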
-
-
-static s32 adpt_scsi_host_alloc(adpt_hba* pHba, struct scsi_host_template *sht)
-{
- struct Scsi_Host *host;
-
- host = scsi_host_alloc(sht, sizeof(adpt_hba*));
- if (host == NULL) {
- printk("%s: scsi_host_alloc returned NULL\n", pHba->name);
- return -1;
- }
- host->hostdata[0] = (unsigned long)pHba;
- pHba->host = host;
-
- host->irq = pHba->pDev->irq;
- /* no IO ports, so don't have to set host->io_port and
- * host->n_io_port
- */
- host->io_port = 0;
- host->n_io_port = 0;
- /* see comments in scsi_host.h */
- host->max_id = 16;
- host->max_lun = 256;
- host->max_channel = pHba->top_scsi_channel + 1;
- host->cmd_per_lun = 1;
- host->unique_id = (u32)sys_tbl_pa + pHba->unit;
- host->sg_tablesize = pHba->sg_tablesize;
- host->can_queue = pHba->post_fifo_size;
- host->use_cmd_list = 1;
-
- return 0;
-}
-
-
-static s32 adpt_i2o_to_scsi(void __iomem *reply, struct scsi_cmnd* cmd)
-{
- adpt_hba* pHba;
- u32 hba_status;
- u32 dev_status;
- u32 reply_flags = readl(reply) & 0xff00; // Leave it shifted up 8 bits
- // I know this would look cleaner if I just read bytes
- // but the model I have been using for all the rest of the
- // io is in 4 byte words - so I keep that model
- u16 detailed_status = readl(reply+16) &0xffff;
- dev_status = (detailed_status & 0xff);
- hba_status = detailed_status >> 8;
-
- // calculate resid for sg
- scsi_set_resid(cmd, scsi_bufflen(cmd) - readl(reply+20));
-
- pHba = (adpt_hba*) cmd->device->host->hostdata[0];
-
- cmd->sense_buffer[0] = '\0'; // initialize sense valid flag to false
-
- if(!(reply_flags & MSG_FAIL)) {
- switch(detailed_status & I2O_SCSI_DSC_MASK) {
- case I2O_SCSI_DSC_SUCCESS:
- cmd->result = (DID_OK << 16);
- // handle underflow
- if (readl(reply+20) < cmd->underflow) {
- cmd->result = (DID_ERROR <<16);
- printk(KERN_WARNING"%s: SCSI CMD underflow\n",pHba->name);
- }
- break;
- case I2O_SCSI_DSC_REQUEST_ABORTED:
- cmd->result = (DID_ABORT << 16);
- break;
- case I2O_SCSI_DSC_PATH_INVALID:
- case I2O_SCSI_DSC_DEVICE_NOT_PRESENT:
- case I2O_SCSI_DSC_SELECTION_TIMEOUT:
- case I2O_SCSI_DSC_COMMAND_TIMEOUT:
- case I2O_SCSI_DSC_NO_ADAPTER:
- case I2O_SCSI_DSC_RESOURCE_UNAVAILABLE:
- printk(KERN_WARNING"%s: SCSI Timeout-Device (%d,%d,%llu) hba status=0x%x, dev status=0x%x, cmd=0x%x\n",
- pHba->name, (u32)cmd->device->channel, (u32)cmd->device->id, cmd->device->lun, hba_status, dev_status, cmd->cmnd[0]);
- cmd->result = (DID_TIME_OUT << 16);
- break;
- case I2O_SCSI_DSC_ADAPTER_BUSY:
- case I2O_SCSI_DSC_BUS_BUSY:
- cmd->result = (DID_BUS_BUSY << 16);
- break;
- case I2O_SCSI_DSC_SCSI_BUS_RESET:
- case I2O_SCSI_DSC_BDR_MESSAGE_SENT:
- cmd->result = (DID_RESET << 16);
- break;
- case I2O_SCSI_DSC_PARITY_ERROR_FAILURE:
- printk(KERN_WARNING"%s: SCSI CMD parity error\n",pHba->name);
- cmd->result = (DID_PARITY << 16);
- break;
- case I2O_SCSI_DSC_UNABLE_TO_ABORT:
- case I2O_SCSI_DSC_COMPLETE_WITH_ERROR:
- case I2O_SCSI_DSC_UNABLE_TO_TERMINATE:
- case I2O_SCSI_DSC_MR_MESSAGE_RECEIVED:
- case I2O_SCSI_DSC_AUTOSENSE_FAILED:
- case I2O_SCSI_DSC_DATA_OVERRUN:
- case I2O_SCSI_DSC_UNEXPECTED_BUS_FREE:
- case I2O_SCSI_DSC_SEQUENCE_FAILURE:
- case I2O_SCSI_DSC_REQUEST_LENGTH_ERROR:
- case I2O_SCSI_DSC_PROVIDE_FAILURE:
- case I2O_SCSI_DSC_REQUEST_TERMINATED:
- case I2O_SCSI_DSC_IDE_MESSAGE_SENT:
- case I2O_SCSI_DSC_UNACKNOWLEDGED_EVENT:
- case I2O_SCSI_DSC_MESSAGE_RECEIVED:
- case I2O_SCSI_DSC_INVALID_CDB:
- case I2O_SCSI_DSC_LUN_INVALID:
- case I2O_SCSI_DSC_SCSI_TID_INVALID:
- case I2O_SCSI_DSC_FUNCTION_UNAVAILABLE:
- case I2O_SCSI_DSC_NO_NEXUS:
- case I2O_SCSI_DSC_CDB_RECEIVED:
- case I2O_SCSI_DSC_LUN_ALREADY_ENABLED:
- case I2O_SCSI_DSC_QUEUE_FROZEN:
- case I2O_SCSI_DSC_REQUEST_INVALID:
- default:
- printk(KERN_WARNING"%s: SCSI error %0x-Device(%d,%d,%llu) hba_status=0x%x, dev_status=0x%x, cmd=0x%x\n",
- pHba->name, detailed_status & I2O_SCSI_DSC_MASK, (u32)cmd->device->channel, (u32)cmd->device->id, cmd->device->lun,
- hba_status, dev_status, cmd->cmnd[0]);
- cmd->result = (DID_ERROR << 16);
- break;
- }
-
- // copy over the request sense data if it was a check
- // condition status
- if (dev_status == SAM_STAT_CHECK_CONDITION) {
- u32 len = min(SCSI_SENSE_BUFFERSIZE, 40);
- // Copy over the sense data
- memcpy_fromio(cmd->sense_buffer, (reply+28) , len);
- if(cmd->sense_buffer[0] == 0x70 /* class 7 */ &&
- cmd->sense_buffer[2] == DATA_PROTECT ){
- /* This is to handle an array failed */
- cmd->result = (DID_TIME_OUT << 16);
- printk(KERN_WARNING"%s: SCSI Data Protect-Device (%d,%d,%llu) hba_status=0x%x, dev_status=0x%x, cmd=0x%x\n",
- pHba->name, (u32)cmd->device->channel, (u32)cmd->device->id, cmd->device->lun,
- hba_status, dev_status, cmd->cmnd[0]);
-
- }
- }
- } else {
-		/* In this condition we could not talk to the tid;
-		 * the card rejected it. We should signal a retry
-		 * for a limited number of retries.
- */
- cmd->result = (DID_TIME_OUT << 16);
- printk(KERN_WARNING"%s: I2O MSG_FAIL - Device (%d,%d,%llu) tid=%d, cmd=0x%x\n",
- pHba->name, (u32)cmd->device->channel, (u32)cmd->device->id, cmd->device->lun,
- ((struct adpt_device*)(cmd->device->hostdata))->tid, cmd->cmnd[0]);
- }
-
- cmd->result |= (dev_status);
-
- if(cmd->scsi_done != NULL){
- cmd->scsi_done(cmd);
- }
- return cmd->result;
-}
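/*
 * Worked example for the decode above: detailed_status == 0x0a02 splits
 * into hba_status == 0x0a (I2O_SCSI_DSC_SELECTION_TIMEOUT >> 8) and
 * dev_status == 0x02 (SAM_STAT_CHECK_CONDITION). The switch then sets
 * DID_TIME_OUT, the sense bytes are copied from reply+28, and the final
 * result is (DID_TIME_OUT << 16) | 0x02.
 */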
-
-
-static s32 adpt_rescan(adpt_hba* pHba)
-{
- s32 rcode;
- ulong flags = 0;
-
- if(pHba->host)
- spin_lock_irqsave(pHba->host->host_lock, flags);
- if ((rcode=adpt_i2o_lct_get(pHba)) < 0)
- goto out;
- if ((rcode=adpt_i2o_reparse_lct(pHba)) < 0)
- goto out;
- rcode = 0;
-out: if(pHba->host)
- spin_unlock_irqrestore(pHba->host->host_lock, flags);
- return rcode;
-}
-
-
-static s32 adpt_i2o_reparse_lct(adpt_hba* pHba)
-{
- int i;
- int max;
- int tid;
- struct i2o_device *d;
- i2o_lct *lct = pHba->lct;
- u8 bus_no = 0;
- s16 scsi_id;
- u64 scsi_lun;
- u32 buf[10]; // at least 8 u32's
- struct adpt_device* pDev = NULL;
- struct i2o_device* pI2o_dev = NULL;
-
- if (lct == NULL) {
- printk(KERN_ERR "%s: LCT is empty???\n",pHba->name);
- return -1;
- }
-
- max = lct->table_size;
- max -= 3;
- max /= 9;
-
- // Mark each drive as unscanned
- for (d = pHba->devices; d; d = d->next) {
- pDev =(struct adpt_device*) d->owner;
- if(!pDev){
- continue;
- }
- pDev->state |= DPTI_DEV_UNSCANNED;
- }
-
- printk(KERN_INFO "%s: LCT has %d entries.\n", pHba->name,max);
-
- for(i=0;i<max;i++) {
- if( lct->lct_entry[i].user_tid != 0xfff){
- continue;
- }
-
- if( lct->lct_entry[i].class_id == I2O_CLASS_RANDOM_BLOCK_STORAGE ||
- lct->lct_entry[i].class_id == I2O_CLASS_SCSI_PERIPHERAL ||
- lct->lct_entry[i].class_id == I2O_CLASS_FIBRE_CHANNEL_PERIPHERAL ){
- tid = lct->lct_entry[i].tid;
- if(adpt_i2o_query_scalar(pHba, tid, 0x8000, -1, buf, 32)<0) {
- printk(KERN_ERR"%s: Could not query device\n",pHba->name);
- continue;
- }
- bus_no = buf[0]>>16;
- if (bus_no >= MAX_CHANNEL) { /* Something wrong skip it */
- printk(KERN_WARNING
- "%s: Channel number %d out of range\n",
- pHba->name, bus_no);
- continue;
- }
-
- scsi_id = buf[1];
- scsi_lun = scsilun_to_int((struct scsi_lun *)&buf[2]);
- pDev = pHba->channel[bus_no].device[scsi_id];
- /* da lun */
- while(pDev) {
- if(pDev->scsi_lun == scsi_lun) {
- break;
- }
- pDev = pDev->next_lun;
- }
- if(!pDev ) { // Something new add it
- d = kmalloc(sizeof(struct i2o_device),
- GFP_ATOMIC);
- if(d==NULL)
- {
- printk(KERN_CRIT "Out of memory for I2O device data.\n");
- return -ENOMEM;
- }
-
- d->controller = pHba;
- d->next = NULL;
-
- memcpy(&d->lct_data, &lct->lct_entry[i], sizeof(i2o_lct_entry));
-
- d->flags = 0;
- adpt_i2o_report_hba_unit(pHba, d);
- adpt_i2o_install_device(pHba, d);
-
- pDev = pHba->channel[bus_no].device[scsi_id];
- if( pDev == NULL){
- pDev =
- kzalloc(sizeof(struct adpt_device),
- GFP_ATOMIC);
- if(pDev == NULL) {
- return -ENOMEM;
- }
- pHba->channel[bus_no].device[scsi_id] = pDev;
- } else {
- while (pDev->next_lun) {
- pDev = pDev->next_lun;
- }
- pDev = pDev->next_lun =
- kzalloc(sizeof(struct adpt_device),
- GFP_ATOMIC);
- if(pDev == NULL) {
- return -ENOMEM;
- }
- }
- pDev->tid = d->lct_data.tid;
- pDev->scsi_channel = bus_no;
- pDev->scsi_id = scsi_id;
- pDev->scsi_lun = scsi_lun;
- pDev->pI2o_dev = d;
- d->owner = pDev;
- pDev->type = (buf[0])&0xff;
- pDev->flags = (buf[0]>>8)&0xff;
-				// Too late, SCSI system has made up its mind, but what the hey ...
- if(scsi_id > pHba->top_scsi_id){
- pHba->top_scsi_id = scsi_id;
- }
- if(scsi_lun > pHba->top_scsi_lun){
- pHba->top_scsi_lun = scsi_lun;
- }
- continue;
- } // end of new i2o device
-
- // We found an old device - check it
- while(pDev) {
- if(pDev->scsi_lun == scsi_lun) {
- if(!scsi_device_online(pDev->pScsi_dev)) {
- printk(KERN_WARNING"%s: Setting device (%d,%d,%llu) back online\n",
- pHba->name,bus_no,scsi_id,scsi_lun);
- if (pDev->pScsi_dev) {
- scsi_device_set_state(pDev->pScsi_dev, SDEV_RUNNING);
- }
- }
- d = pDev->pI2o_dev;
- if(d->lct_data.tid != tid) { // something changed
- pDev->tid = tid;
- memcpy(&d->lct_data, &lct->lct_entry[i], sizeof(i2o_lct_entry));
- if (pDev->pScsi_dev) {
- pDev->pScsi_dev->changed = TRUE;
- pDev->pScsi_dev->removable = TRUE;
- }
- }
- // Found it - mark it scanned
- pDev->state = DPTI_DEV_ONLINE;
- break;
- }
- pDev = pDev->next_lun;
- }
- }
- }
- for (pI2o_dev = pHba->devices; pI2o_dev; pI2o_dev = pI2o_dev->next) {
- pDev =(struct adpt_device*) pI2o_dev->owner;
- if(!pDev){
- continue;
- }
- // Drive offline drives that previously existed but could not be found
- // in the LCT table
- if (pDev->state & DPTI_DEV_UNSCANNED){
- pDev->state = DPTI_DEV_OFFLINE;
- printk(KERN_WARNING"%s: Device (%d,%d,%llu) offline\n",pHba->name,pDev->scsi_channel,pDev->scsi_id,pDev->scsi_lun);
- if (pDev->pScsi_dev) {
- scsi_device_set_state(pDev->pScsi_dev, SDEV_OFFLINE);
- }
- }
- }
- return 0;
-}
-
-static void adpt_fail_posted_scbs(adpt_hba* pHba)
-{
- struct scsi_cmnd* cmd = NULL;
- struct scsi_device* d = NULL;
-
- shost_for_each_device(d, pHba->host) {
- unsigned long flags;
- spin_lock_irqsave(&d->list_lock, flags);
- list_for_each_entry(cmd, &d->cmd_list, list) {
- if(cmd->serial_number == 0){
- continue;
- }
- cmd->result = (DID_OK << 16) | (QUEUE_FULL <<1);
- cmd->scsi_done(cmd);
- }
- spin_unlock_irqrestore(&d->list_lock, flags);
- }
-}
-
-
-/*============================================================================
- * Routines from i2o subsystem
- *============================================================================
- */
-
-
-
-/*
- * Bring an I2O controller into HOLD state. See the spec.
- */
-static int adpt_i2o_activate_hba(adpt_hba* pHba)
-{
- int rcode;
-
- if(pHba->initialized ) {
- if (adpt_i2o_status_get(pHba) < 0) {
- if((rcode = adpt_i2o_reset_hba(pHba)) != 0){
- printk(KERN_WARNING"%s: Could NOT reset.\n", pHba->name);
- return rcode;
- }
- if (adpt_i2o_status_get(pHba) < 0) {
- printk(KERN_INFO "HBA not responding.\n");
- return -1;
- }
- }
-
- if(pHba->status_block->iop_state == ADAPTER_STATE_FAULTED) {
- printk(KERN_CRIT "%s: hardware fault\n", pHba->name);
- return -1;
- }
-
- if (pHba->status_block->iop_state == ADAPTER_STATE_READY ||
- pHba->status_block->iop_state == ADAPTER_STATE_OPERATIONAL ||
- pHba->status_block->iop_state == ADAPTER_STATE_HOLD ||
- pHba->status_block->iop_state == ADAPTER_STATE_FAILED) {
- adpt_i2o_reset_hba(pHba);
- if (adpt_i2o_status_get(pHba) < 0 || pHba->status_block->iop_state != ADAPTER_STATE_RESET) {
- printk(KERN_ERR "%s: Failed to initialize.\n", pHba->name);
- return -1;
- }
- }
- } else {
- if((rcode = adpt_i2o_reset_hba(pHba)) != 0){
- printk(KERN_WARNING"%s: Could NOT reset.\n", pHba->name);
- return rcode;
- }
-
- }
-
- if (adpt_i2o_init_outbound_q(pHba) < 0) {
- return -1;
- }
-
- /* In HOLD state */
-
- if (adpt_i2o_hrt_get(pHba) < 0) {
- return -1;
- }
-
- return 0;
-}
-
-/*
- * Bring a controller online into OPERATIONAL state.
- */
-
-static int adpt_i2o_online_hba(adpt_hba* pHba)
-{
- if (adpt_i2o_systab_send(pHba) < 0)
- return -1;
- /* In READY state */
-
- if (adpt_i2o_enable_hba(pHba) < 0)
- return -1;
-
- /* In OPERATIONAL state */
- return 0;
-}
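/*
 * The IOP bring-up sequence implied by the two routines above, sketched
 * as the I2O state machine this driver walks:
 *
 *	RESET --adpt_i2o_init_outbound_q()--> HOLD
 *	HOLD  --adpt_i2o_systab_send()------> READY
 *	READY --adpt_i2o_enable_hba()-------> OPERATIONAL
 *
 * adpt_i2o_activate_hba() ends with the controller in HOLD;
 * adpt_i2o_online_hba() performs the last two transitions.
 */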
-
-static s32 adpt_send_nop(adpt_hba*pHba,u32 m)
-{
- u32 __iomem *msg;
- ulong timeout = jiffies + 5*HZ;
-
- while(m == EMPTY_QUEUE){
- rmb();
- m = readl(pHba->post_port);
- if(m != EMPTY_QUEUE){
- break;
- }
- if(time_after(jiffies,timeout)){
- printk(KERN_ERR "%s: Timeout waiting for message frame!\n",pHba->name);
- return 2;
- }
- schedule_timeout_uninterruptible(1);
- }
- msg = (u32 __iomem *)(pHba->msg_addr_virt + m);
- writel( THREE_WORD_MSG_SIZE | SGL_OFFSET_0,&msg[0]);
- writel( I2O_CMD_UTIL_NOP << 24 | HOST_TID << 12 | 0,&msg[1]);
- writel( 0,&msg[2]);
- wmb();
-
- writel(m, pHba->post_port);
- wmb();
- return 0;
-}
-
-static s32 adpt_i2o_init_outbound_q(adpt_hba* pHba)
-{
- u8 *status;
- dma_addr_t addr;
- u32 __iomem *msg = NULL;
- int i;
- ulong timeout = jiffies + TMOUT_INITOUTBOUND*HZ;
- u32 m;
-
- do {
- rmb();
- m = readl(pHba->post_port);
- if (m != EMPTY_QUEUE) {
- break;
- }
-
- if(time_after(jiffies,timeout)){
- printk(KERN_WARNING"%s: Timeout waiting for message frame\n",pHba->name);
- return -ETIMEDOUT;
- }
- schedule_timeout_uninterruptible(1);
- } while(m == EMPTY_QUEUE);
-
- msg=(u32 __iomem *)(pHba->msg_addr_virt+m);
-
- status = dma_alloc_coherent(&pHba->pDev->dev, 4, &addr, GFP_KERNEL);
- if (!status) {
- adpt_send_nop(pHba, m);
- printk(KERN_WARNING"%s: IOP reset failed - no free memory.\n",
- pHba->name);
- return -ENOMEM;
- }
- memset(status, 0, 4);
-
- writel(EIGHT_WORD_MSG_SIZE| SGL_OFFSET_6, &msg[0]);
- writel(I2O_CMD_OUTBOUND_INIT<<24 | HOST_TID<<12 | ADAPTER_TID, &msg[1]);
- writel(0, &msg[2]);
- writel(0x0106, &msg[3]); /* Transaction context */
- writel(4096, &msg[4]); /* Host page frame size */
- writel((REPLY_FRAME_SIZE)<<16|0x80, &msg[5]); /* Outbound msg frame size and Initcode */
- writel(0xD0000004, &msg[6]); /* Simple SG LE, EOB */
- writel((u32)addr, &msg[7]);
-
- writel(m, pHba->post_port);
- wmb();
-
- // Wait for the reply status to come back
- do {
- if (*status) {
- if (*status != 0x01 /*I2O_EXEC_OUTBOUND_INIT_IN_PROGRESS*/) {
- break;
- }
- }
- rmb();
- if(time_after(jiffies,timeout)){
- printk(KERN_WARNING"%s: Timeout Initializing\n",pHba->name);
- /* We lose 4 bytes of "status" here, but we
- cannot free these because controller may
- awake and corrupt those bytes at any time */
- /* dma_free_coherent(&pHba->pDev->dev, 4, status, addr); */
- return -ETIMEDOUT;
- }
- schedule_timeout_uninterruptible(1);
- } while (1);
-
- // If the command was successful, fill the fifo with our reply
- // message packets
- if(*status != 0x04 /*I2O_EXEC_OUTBOUND_INIT_COMPLETE*/) {
- dma_free_coherent(&pHba->pDev->dev, 4, status, addr);
- return -2;
- }
- dma_free_coherent(&pHba->pDev->dev, 4, status, addr);
-
- if(pHba->reply_pool != NULL) {
- dma_free_coherent(&pHba->pDev->dev,
- pHba->reply_fifo_size * REPLY_FRAME_SIZE * 4,
- pHba->reply_pool, pHba->reply_pool_pa);
- }
-
- pHba->reply_pool = dma_alloc_coherent(&pHba->pDev->dev,
- pHba->reply_fifo_size * REPLY_FRAME_SIZE * 4,
- &pHba->reply_pool_pa, GFP_KERNEL);
- if (!pHba->reply_pool) {
- printk(KERN_ERR "%s: Could not allocate reply pool\n", pHba->name);
- return -ENOMEM;
- }
- memset(pHba->reply_pool, 0 , pHba->reply_fifo_size * REPLY_FRAME_SIZE * 4);
-
- for(i = 0; i < pHba->reply_fifo_size; i++) {
- writel(pHba->reply_pool_pa + (i * REPLY_FRAME_SIZE * 4),
- pHba->reply_port);
- wmb();
- }
- adpt_i2o_status_get(pHba);
- return 0;
-}
-
-
-/*
- * I2O System Table. Contains information about
- * all the IOPs in the system. Used to inform IOPs
- * about each other's existence.
- *
- * sys_tbl_ver is the CurrentChangeIndicator that is
- * used by IOPs to track changes.
- */
-
-
-
-static s32 adpt_i2o_status_get(adpt_hba* pHba)
-{
- ulong timeout;
- u32 m;
- u32 __iomem *msg;
- u8 *status_block=NULL;
-
- if(pHba->status_block == NULL) {
- pHba->status_block = dma_alloc_coherent(&pHba->pDev->dev,
- sizeof(i2o_status_block),
- &pHba->status_block_pa, GFP_KERNEL);
- if(pHba->status_block == NULL) {
- printk(KERN_ERR
- "dpti%d: Get Status Block failed; Out of memory. \n",
- pHba->unit);
- return -ENOMEM;
- }
- }
- memset(pHba->status_block, 0, sizeof(i2o_status_block));
- status_block = (u8*)(pHba->status_block);
- timeout = jiffies+TMOUT_GETSTATUS*HZ;
- do {
- rmb();
- m = readl(pHba->post_port);
- if (m != EMPTY_QUEUE) {
- break;
- }
- if(time_after(jiffies,timeout)){
- printk(KERN_ERR "%s: Timeout waiting for message !\n",
- pHba->name);
- return -ETIMEDOUT;
- }
- schedule_timeout_uninterruptible(1);
- } while(m==EMPTY_QUEUE);
-
-
- msg=(u32 __iomem *)(pHba->msg_addr_virt+m);
-
- writel(NINE_WORD_MSG_SIZE|SGL_OFFSET_0, &msg[0]);
- writel(I2O_CMD_STATUS_GET<<24|HOST_TID<<12|ADAPTER_TID, &msg[1]);
- writel(1, &msg[2]);
- writel(0, &msg[3]);
- writel(0, &msg[4]);
- writel(0, &msg[5]);
- writel( dma_low(pHba->status_block_pa), &msg[6]);
- writel( dma_high(pHba->status_block_pa), &msg[7]);
- writel(sizeof(i2o_status_block), &msg[8]); // 88 bytes
-
- //post message
- writel(m, pHba->post_port);
- wmb();
-
- while(status_block[87]!=0xff){
- if(time_after(jiffies,timeout)){
- printk(KERN_ERR"dpti%d: Get status timeout.\n",
- pHba->unit);
- return -ETIMEDOUT;
- }
- rmb();
- schedule_timeout_uninterruptible(1);
- }
-
- // Set up our number of outbound and inbound messages
- pHba->post_fifo_size = pHba->status_block->max_inbound_frames;
- if (pHba->post_fifo_size > MAX_TO_IOP_MESSAGES) {
- pHba->post_fifo_size = MAX_TO_IOP_MESSAGES;
- }
-
- pHba->reply_fifo_size = pHba->status_block->max_outbound_frames;
- if (pHba->reply_fifo_size > MAX_FROM_IOP_MESSAGES) {
- pHba->reply_fifo_size = MAX_FROM_IOP_MESSAGES;
- }
-
- // Calculate the Scatter Gather list size
- if (dpt_dma64(pHba)) {
- pHba->sg_tablesize
- = ((pHba->status_block->inbound_frame_size * 4
- - 14 * sizeof(u32))
- / (sizeof(struct sg_simple_element) + sizeof(u32)));
- } else {
- pHba->sg_tablesize
- = ((pHba->status_block->inbound_frame_size * 4
- - 12 * sizeof(u32))
- / sizeof(struct sg_simple_element));
- }
- if (pHba->sg_tablesize > SG_LIST_ELEMENTS) {
- pHba->sg_tablesize = SG_LIST_ELEMENTS;
- }
-
-
-#ifdef DEBUG
- printk("dpti%d: State = ",pHba->unit);
- switch(pHba->status_block->iop_state) {
- case 0x01:
- printk("INIT\n");
- break;
- case 0x02:
- printk("RESET\n");
- break;
- case 0x04:
- printk("HOLD\n");
- break;
- case 0x05:
- printk("READY\n");
- break;
- case 0x08:
- printk("OPERATIONAL\n");
- break;
- case 0x10:
- printk("FAILED\n");
- break;
- case 0x11:
- printk("FAULTED\n");
- break;
- default:
- printk("%x (unknown!!)\n",pHba->status_block->iop_state);
- }
-#endif
- return 0;
-}
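/*
 * Worked example of the sg_tablesize arithmetic above, assuming the
 * controller reports a 64-word (256-byte) inbound frame:
 *
 *	64-bit DMA: (256 - 14 * 4) / (sizeof(struct sg_simple_element) + 4)
 *	          = (256 - 56) / 12 = 16 elements
 *	32-bit DMA: (256 - 12 * 4) / sizeof(struct sg_simple_element)
 *	          = (256 - 48) / 8  = 26 elements
 *
 * both comfortably under the SG_LIST_ELEMENTS (56) clamp.
 */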
-
-/*
- * Get the IOP's Logical Configuration Table
- */
-static int adpt_i2o_lct_get(adpt_hba* pHba)
-{
- u32 msg[8];
- int ret;
- u32 buf[16];
-
- if ((pHba->lct_size == 0) || (pHba->lct == NULL)){
- pHba->lct_size = pHba->status_block->expected_lct_size;
- }
- do {
- if (pHba->lct == NULL) {
- pHba->lct = dma_alloc_coherent(&pHba->pDev->dev,
- pHba->lct_size, &pHba->lct_pa,
- GFP_ATOMIC);
- if(pHba->lct == NULL) {
- printk(KERN_CRIT "%s: Lct Get failed. Out of memory.\n",
- pHba->name);
- return -ENOMEM;
- }
- }
- memset(pHba->lct, 0, pHba->lct_size);
-
- msg[0] = EIGHT_WORD_MSG_SIZE|SGL_OFFSET_6;
- msg[1] = I2O_CMD_LCT_NOTIFY<<24 | HOST_TID<<12 | ADAPTER_TID;
- msg[2] = 0;
- msg[3] = 0;
- msg[4] = 0xFFFFFFFF; /* All devices */
- msg[5] = 0x00000000; /* Report now */
- msg[6] = 0xD0000000|pHba->lct_size;
- msg[7] = (u32)pHba->lct_pa;
-
- if ((ret=adpt_i2o_post_wait(pHba, msg, sizeof(msg), 360))) {
-			printk(KERN_ERR "%s: LCT Get failed (status=%#10x).\n",
- pHba->name, ret);
- printk(KERN_ERR"Adaptec: Error Reading Hardware.\n");
- return ret;
- }
-
- if ((pHba->lct->table_size << 2) > pHba->lct_size) {
- pHba->lct_size = pHba->lct->table_size << 2;
- dma_free_coherent(&pHba->pDev->dev, pHba->lct_size,
- pHba->lct, pHba->lct_pa);
- pHba->lct = NULL;
- }
- } while (pHba->lct == NULL);
-
- PDEBUG("%s: Hardware resource table read.\n", pHba->name);
-
-
- // I2O_DPT_EXEC_IOP_BUFFERS_GROUP_NO;
- if(adpt_i2o_query_scalar(pHba, 0 , 0x8000, -1, buf, sizeof(buf))>=0) {
- pHba->FwDebugBufferSize = buf[1];
- pHba->FwDebugBuffer_P = ioremap(pHba->base_addr_phys + buf[0],
- pHba->FwDebugBufferSize);
- if (pHba->FwDebugBuffer_P) {
- pHba->FwDebugFlags_P = pHba->FwDebugBuffer_P +
- FW_DEBUG_FLAGS_OFFSET;
- pHba->FwDebugBLEDvalue_P = pHba->FwDebugBuffer_P +
- FW_DEBUG_BLED_OFFSET;
- pHba->FwDebugBLEDflag_P = pHba->FwDebugBLEDvalue_P + 1;
- pHba->FwDebugStrLength_P = pHba->FwDebugBuffer_P +
- FW_DEBUG_STR_LENGTH_OFFSET;
- pHba->FwDebugBuffer_P += buf[2];
- pHba->FwDebugFlags = 0;
- }
- }
-
- return 0;
-}
-
-static int adpt_i2o_build_sys_table(void)
-{
- adpt_hba* pHba = hba_chain;
- int count = 0;
-
- if (sys_tbl)
- dma_free_coherent(&pHba->pDev->dev, sys_tbl_len,
- sys_tbl, sys_tbl_pa);
-
- sys_tbl_len = sizeof(struct i2o_sys_tbl) + // Header + IOPs
- (hba_count) * sizeof(struct i2o_sys_tbl_entry);
-
- sys_tbl = dma_alloc_coherent(&pHba->pDev->dev,
- sys_tbl_len, &sys_tbl_pa, GFP_KERNEL);
- if (!sys_tbl) {
- printk(KERN_WARNING "SysTab Set failed. Out of memory.\n");
- return -ENOMEM;
- }
- memset(sys_tbl, 0, sys_tbl_len);
-
- sys_tbl->num_entries = hba_count;
- sys_tbl->version = I2OVERSION;
- sys_tbl->change_ind = sys_tbl_ind++;
-
- for(pHba = hba_chain; pHba; pHba = pHba->next) {
- u64 addr;
- // Get updated Status Block so we have the latest information
- if (adpt_i2o_status_get(pHba)) {
- sys_tbl->num_entries--;
- continue; // try next one
- }
-
- sys_tbl->iops[count].org_id = pHba->status_block->org_id;
- sys_tbl->iops[count].iop_id = pHba->unit + 2;
- sys_tbl->iops[count].seg_num = 0;
- sys_tbl->iops[count].i2o_version = pHba->status_block->i2o_version;
- sys_tbl->iops[count].iop_state = pHba->status_block->iop_state;
- sys_tbl->iops[count].msg_type = pHba->status_block->msg_type;
- sys_tbl->iops[count].frame_size = pHba->status_block->inbound_frame_size;
- sys_tbl->iops[count].last_changed = sys_tbl_ind - 1; // ??
- sys_tbl->iops[count].iop_capabilities = pHba->status_block->iop_capabilities;
- addr = pHba->base_addr_phys + 0x40;
- sys_tbl->iops[count].inbound_low = dma_low(addr);
- sys_tbl->iops[count].inbound_high = dma_high(addr);
-
- count++;
- }
-
-#ifdef DEBUG
-{
- u32 *table = (u32*)sys_tbl;
- printk(KERN_DEBUG"sys_tbl_len=%d in 32bit words\n",(sys_tbl_len >>2));
- for(count = 0; count < (sys_tbl_len >>2); count++) {
- printk(KERN_INFO "sys_tbl[%d] = %0#10x\n",
- count, table[count]);
- }
-}
-#endif
-
- return 0;
-}
-
-
-/*
- * Dump the information block associated with a given unit (TID)
- */
-
-static void adpt_i2o_report_hba_unit(adpt_hba* pHba, struct i2o_device *d)
-{
- char buf[64];
- int unit = d->lct_data.tid;
-
- printk(KERN_INFO "TID %3.3d ", unit);
-
- if(adpt_i2o_query_scalar(pHba, unit, 0xF100, 3, buf, 16)>=0)
- {
- buf[16]=0;
- printk(" Vendor: %-12.12s", buf);
- }
- if(adpt_i2o_query_scalar(pHba, unit, 0xF100, 4, buf, 16)>=0)
- {
- buf[16]=0;
- printk(" Device: %-12.12s", buf);
- }
- if(adpt_i2o_query_scalar(pHba, unit, 0xF100, 6, buf, 8)>=0)
- {
- buf[8]=0;
- printk(" Rev: %-12.12s\n", buf);
- }
-#ifdef DEBUG
- printk(KERN_INFO "\tClass: %.21s\n", adpt_i2o_get_class_name(d->lct_data.class_id));
- printk(KERN_INFO "\tSubclass: 0x%04X\n", d->lct_data.sub_class);
- printk(KERN_INFO "\tFlags: ");
-
- if(d->lct_data.device_flags&(1<<0))
- printk("C"); // ConfigDialog requested
- if(d->lct_data.device_flags&(1<<1))
- printk("U"); // Multi-user capable
- if(!(d->lct_data.device_flags&(1<<4)))
- printk("P"); // Peer service enabled!
- if(!(d->lct_data.device_flags&(1<<5)))
- printk("M"); // Mgmt service enabled!
- printk("\n");
-#endif
-}
-
-#ifdef DEBUG
-/*
- * Do i2o class name lookup
- */
-static const char *adpt_i2o_get_class_name(int class)
-{
- int idx = 16;
- static char *i2o_class_name[] = {
- "Executive",
- "Device Driver Module",
- "Block Device",
- "Tape Device",
- "LAN Interface",
- "WAN Interface",
- "Fibre Channel Port",
- "Fibre Channel Device",
- "SCSI Device",
- "ATE Port",
- "ATE Device",
- "Floppy Controller",
- "Floppy Device",
- "Secondary Bus Port",
- "Peer Transport Agent",
- "Peer Transport",
- "Unknown"
- };
-
- switch(class&0xFFF) {
- case I2O_CLASS_EXECUTIVE:
- idx = 0; break;
- case I2O_CLASS_DDM:
- idx = 1; break;
- case I2O_CLASS_RANDOM_BLOCK_STORAGE:
- idx = 2; break;
- case I2O_CLASS_SEQUENTIAL_STORAGE:
- idx = 3; break;
- case I2O_CLASS_LAN:
- idx = 4; break;
- case I2O_CLASS_WAN:
- idx = 5; break;
- case I2O_CLASS_FIBRE_CHANNEL_PORT:
- idx = 6; break;
- case I2O_CLASS_FIBRE_CHANNEL_PERIPHERAL:
- idx = 7; break;
- case I2O_CLASS_SCSI_PERIPHERAL:
- idx = 8; break;
- case I2O_CLASS_ATE_PORT:
- idx = 9; break;
- case I2O_CLASS_ATE_PERIPHERAL:
- idx = 10; break;
- case I2O_CLASS_FLOPPY_CONTROLLER:
- idx = 11; break;
- case I2O_CLASS_FLOPPY_DEVICE:
- idx = 12; break;
- case I2O_CLASS_BUS_ADAPTER_PORT:
- idx = 13; break;
- case I2O_CLASS_PEER_TRANSPORT_AGENT:
- idx = 14; break;
- case I2O_CLASS_PEER_TRANSPORT:
- idx = 15; break;
- }
- return i2o_class_name[idx];
-}
-#endif
-
-
-static s32 adpt_i2o_hrt_get(adpt_hba* pHba)
-{
- u32 msg[6];
- int ret, size = sizeof(i2o_hrt);
-
- do {
- if (pHba->hrt == NULL) {
- pHba->hrt = dma_alloc_coherent(&pHba->pDev->dev,
- size, &pHba->hrt_pa, GFP_KERNEL);
- if (pHba->hrt == NULL) {
- printk(KERN_CRIT "%s: Hrt Get failed; Out of memory.\n", pHba->name);
- return -ENOMEM;
- }
- }
-
- msg[0]= SIX_WORD_MSG_SIZE| SGL_OFFSET_4;
- msg[1]= I2O_CMD_HRT_GET<<24 | HOST_TID<<12 | ADAPTER_TID;
- msg[2]= 0;
- msg[3]= 0;
- msg[4]= (0xD0000000 | size); /* Simple transaction */
- msg[5]= (u32)pHba->hrt_pa; /* Dump it here */
-
- if ((ret = adpt_i2o_post_wait(pHba, msg, sizeof(msg),20))) {
- printk(KERN_ERR "%s: Unable to get HRT (status=%#10x)\n", pHba->name, ret);
- return ret;
- }
-
- if (pHba->hrt->num_entries * pHba->hrt->entry_len << 2 > size) {
- int newsize = pHba->hrt->num_entries * pHba->hrt->entry_len << 2;
- dma_free_coherent(&pHba->pDev->dev, size,
- pHba->hrt, pHba->hrt_pa);
- size = newsize;
- pHba->hrt = NULL;
- }
- } while(pHba->hrt == NULL);
- return 0;
-}
-
-/*
- * Query one scalar group value or a whole scalar group.
- */
-static int adpt_i2o_query_scalar(adpt_hba* pHba, int tid,
- int group, int field, void *buf, int buflen)
-{
- u16 opblk[] = { 1, 0, I2O_PARAMS_FIELD_GET, group, 1, field };
- u8 *opblk_va;
- dma_addr_t opblk_pa;
- u8 *resblk_va;
- dma_addr_t resblk_pa;
-
- int size;
-
- /* 8 bytes for header */
- resblk_va = dma_alloc_coherent(&pHba->pDev->dev,
- sizeof(u8) * (8 + buflen), &resblk_pa, GFP_KERNEL);
- if (resblk_va == NULL) {
- printk(KERN_CRIT "%s: query scalar failed; Out of memory.\n", pHba->name);
- return -ENOMEM;
- }
-
- opblk_va = dma_alloc_coherent(&pHba->pDev->dev,
- sizeof(opblk), &opblk_pa, GFP_KERNEL);
- if (opblk_va == NULL) {
- dma_free_coherent(&pHba->pDev->dev, sizeof(u8) * (8+buflen),
- resblk_va, resblk_pa);
- printk(KERN_CRIT "%s: query operation failed; Out of memory.\n",
- pHba->name);
- return -ENOMEM;
- }
- if (field == -1) /* whole group */
- opblk[4] = -1;
-
- memcpy(opblk_va, opblk, sizeof(opblk));
- size = adpt_i2o_issue_params(I2O_CMD_UTIL_PARAMS_GET, pHba, tid,
- opblk_va, opblk_pa, sizeof(opblk),
- resblk_va, resblk_pa, sizeof(u8)*(8+buflen));
- dma_free_coherent(&pHba->pDev->dev, sizeof(opblk), opblk_va, opblk_pa);
- if (size == -ETIME) {
- dma_free_coherent(&pHba->pDev->dev, sizeof(u8) * (8+buflen),
- resblk_va, resblk_pa);
- printk(KERN_WARNING "%s: issue params failed; Timed out.\n", pHba->name);
- return -ETIME;
- } else if (size == -EINTR) {
- dma_free_coherent(&pHba->pDev->dev, sizeof(u8) * (8+buflen),
- resblk_va, resblk_pa);
- printk(KERN_WARNING "%s: issue params failed; Interrupted.\n", pHba->name);
- return -EINTR;
- }
-
- memcpy(buf, resblk_va+8, buflen); /* cut off header */
-
- dma_free_coherent(&pHba->pDev->dev, sizeof(u8) * (8+buflen),
- resblk_va, resblk_pa);
- if (size < 0)
- return size;
-
- return buflen;
-}
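/*
 * Typical call, as in adpt_i2o_report_hba_unit() above: fetch a single
 * field -- the 16-byte vendor string, group 0xF100, field 3 -- for a
 * unit's TID (field == -1 would return the whole group instead):
 *
 *	char buf[64];
 *
 *	if (adpt_i2o_query_scalar(pHba, tid, 0xF100, 3, buf, 16) >= 0) {
 *		buf[16] = 0;	// NUL-terminate before printing
 *		printk(" Vendor: %-12.12s", buf);
 *	}
 */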
-
-
-/* Issue UTIL_PARAMS_GET or UTIL_PARAMS_SET
- *
- * This function can be used for all UtilParamsGet/Set operations.
- * The OperationBlock is given in opblk-buffer,
- * and results are returned in resblk-buffer.
- * Note that the minimum sized resblk is 8 bytes and contains
- * ResultCount, ErrorInfoSize, BlockStatus and BlockSize.
- */
-static int adpt_i2o_issue_params(int cmd, adpt_hba* pHba, int tid,
- void *opblk_va, dma_addr_t opblk_pa, int oplen,
- void *resblk_va, dma_addr_t resblk_pa, int reslen)
-{
- u32 msg[9];
- u32 *res = (u32 *)resblk_va;
- int wait_status;
-
- msg[0] = NINE_WORD_MSG_SIZE | SGL_OFFSET_5;
- msg[1] = cmd << 24 | HOST_TID << 12 | tid;
- msg[2] = 0;
- msg[3] = 0;
- msg[4] = 0;
- msg[5] = 0x54000000 | oplen; /* OperationBlock */
- msg[6] = (u32)opblk_pa;
- msg[7] = 0xD0000000 | reslen; /* ResultBlock */
- msg[8] = (u32)resblk_pa;
-
- if ((wait_status = adpt_i2o_post_wait(pHba, msg, sizeof(msg), 20))) {
- printk("adpt_i2o_issue_params: post_wait failed (%p)\n", resblk_va);
- return wait_status; /* -DetailedStatus */
- }
-
- if (res[1]&0x00FF0000) { /* BlockStatus != SUCCESS */
- printk(KERN_WARNING "%s: %s - Error:\n ErrorInfoSize = 0x%02x, "
- "BlockStatus = 0x%02x, BlockSize = 0x%04x\n",
- pHba->name,
- (cmd == I2O_CMD_UTIL_PARAMS_SET) ? "PARAMS_SET"
- : "PARAMS_GET",
- res[1]>>24, (res[1]>>16)&0xFF, res[1]&0xFFFF);
- return -((res[1] >> 16) & 0xFF); /* -BlockStatus */
- }
-
- return 4 + ((res[1] & 0x0000FFFF) << 2); /* bytes used in resblk */
-}
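/*
 * Worked example of the res[1] decode above: a successful GET might
 * return res[1] == 0x00000004 -- ErrorInfoSize 0x00, BlockStatus 0x00
 * (success), BlockSize 4 words -- so the function returns
 * 4 + (4 << 2) == 20 bytes used in resblk. A failure such as
 * res[1] == 0x00020000 (BlockStatus 0x02) returns -0x02 instead.
 */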
-
-
-static s32 adpt_i2o_quiesce_hba(adpt_hba* pHba)
-{
- u32 msg[4];
- int ret;
-
- adpt_i2o_status_get(pHba);
-
- /* SysQuiesce discarded if IOP not in READY or OPERATIONAL state */
-
- if((pHba->status_block->iop_state != ADAPTER_STATE_READY) &&
- (pHba->status_block->iop_state != ADAPTER_STATE_OPERATIONAL)){
- return 0;
- }
-
- msg[0] = FOUR_WORD_MSG_SIZE|SGL_OFFSET_0;
- msg[1] = I2O_CMD_SYS_QUIESCE<<24|HOST_TID<<12|ADAPTER_TID;
- msg[2] = 0;
- msg[3] = 0;
-
- if((ret = adpt_i2o_post_wait(pHba, msg, sizeof(msg), 240))) {
- printk(KERN_INFO"dpti%d: Unable to quiesce (status=%#x).\n",
- pHba->unit, -ret);
- } else {
- printk(KERN_INFO"dpti%d: Quiesced.\n",pHba->unit);
- }
-
- adpt_i2o_status_get(pHba);
- return ret;
-}
-
-
-/*
- * Enable IOP. Allows the IOP to resume external operations.
- */
-static int adpt_i2o_enable_hba(adpt_hba* pHba)
-{
- u32 msg[4];
- int ret;
-
- adpt_i2o_status_get(pHba);
- if(!pHba->status_block){
- return -ENOMEM;
- }
- /* Enable only allowed on READY state */
- if(pHba->status_block->iop_state == ADAPTER_STATE_OPERATIONAL)
- return 0;
-
- if(pHba->status_block->iop_state != ADAPTER_STATE_READY)
- return -EINVAL;
-
- msg[0]=FOUR_WORD_MSG_SIZE|SGL_OFFSET_0;
- msg[1]=I2O_CMD_SYS_ENABLE<<24|HOST_TID<<12|ADAPTER_TID;
- msg[2]= 0;
- msg[3]= 0;
-
- if ((ret = adpt_i2o_post_wait(pHba, msg, sizeof(msg), 240))) {
- printk(KERN_WARNING"%s: Could not enable (status=%#10x).\n",
- pHba->name, ret);
- } else {
- PDEBUG("%s: Enabled.\n", pHba->name);
- }
-
- adpt_i2o_status_get(pHba);
- return ret;
-}
-
-
-static int adpt_i2o_systab_send(adpt_hba* pHba)
-{
- u32 msg[12];
- int ret;
-
- msg[0] = I2O_MESSAGE_SIZE(12) | SGL_OFFSET_6;
- msg[1] = I2O_CMD_SYS_TAB_SET<<24 | HOST_TID<<12 | ADAPTER_TID;
- msg[2] = 0;
- msg[3] = 0;
- msg[4] = (0<<16) | ((pHba->unit+2) << 12); /* Host 0 IOP ID (unit + 2) */
- msg[5] = 0; /* Segment 0 */
-
- /*
- * Provide three SGL-elements:
- * System table (SysTab), Private memory space declaration and
- * Private i/o space declaration
- */
- msg[6] = 0x54000000 | sys_tbl_len;
- msg[7] = (u32)sys_tbl_pa;
- msg[8] = 0x54000000 | 0;
- msg[9] = 0;
- msg[10] = 0xD4000000 | 0;
- msg[11] = 0;
-
- if ((ret=adpt_i2o_post_wait(pHba, msg, sizeof(msg), 120))) {
- printk(KERN_INFO "%s: Unable to set SysTab (status=%#10x).\n",
- pHba->name, ret);
- }
-#ifdef DEBUG
- else {
- PINFO("%s: SysTab set.\n", pHba->name);
- }
-#endif
-
- return ret;
-}
-
-
-/*============================================================================
- *
- *============================================================================
- */
-
-
-#ifdef UARTDELAY
-
-static void adpt_delay(int millisec)
-{
- int i;
- for (i = 0; i < millisec; i++) {
- udelay(1000); /* delay for one millisecond */
- }
-}
-
-#endif
-
-static struct scsi_host_template driver_template = {
- .module = THIS_MODULE,
- .name = "dpt_i2o",
- .proc_name = "dpt_i2o",
- .show_info = adpt_show_info,
- .info = adpt_info,
- .queuecommand = adpt_queue,
- .eh_abort_handler = adpt_abort,
- .eh_device_reset_handler = adpt_device_reset,
- .eh_bus_reset_handler = adpt_bus_reset,
- .eh_host_reset_handler = adpt_reset,
- .bios_param = adpt_bios_param,
- .slave_configure = adpt_slave_configure,
- .can_queue = MAX_TO_IOP_MESSAGES,
- .this_id = 7,
- .use_clustering = ENABLE_CLUSTERING,
-};
-
-static int __init adpt_init(void)
-{
- int error;
- adpt_hba *pHba, *next;
-
- printk("Loading Adaptec I2O RAID: Version " DPT_I2O_VERSION "\n");
-
- error = adpt_detect(&driver_template);
- if (error < 0)
- return error;
- if (hba_chain == NULL)
- return -ENODEV;
-
- for (pHba = hba_chain; pHba; pHba = pHba->next) {
- error = scsi_add_host(pHba->host, &pHba->pDev->dev);
- if (error)
- goto fail;
- scsi_scan_host(pHba->host);
- }
- return 0;
-fail:
- for (pHba = hba_chain; pHba; pHba = next) {
- next = pHba->next;
- scsi_remove_host(pHba->host);
- }
- return error;
-}
-
-static void __exit adpt_exit(void)
-{
- adpt_hba *pHba, *next;
-
- for (pHba = hba_chain; pHba; pHba = next) {
- next = pHba->next;
- adpt_release(pHba);
- }
-}
-
-module_init(adpt_init);
-module_exit(adpt_exit);
-
-MODULE_LICENSE("GPL");
diff --git a/drivers/scsi/dpti.h b/drivers/scsi/dpti.h
deleted file mode 100644
index dfc8d2eaa09e..000000000000
--- a/drivers/scsi/dpti.h
+++ /dev/null
@@ -1,335 +0,0 @@
-/***************************************************************************
- dpti.h - description
- -------------------
- begin : Thu Sep 7 2000
- copyright : (C) 2001 by Adaptec
-
- See Documentation/scsi/dpti.txt for history, notes, license info
- and credits
- ***************************************************************************/
-
-/***************************************************************************
- * *
- * This program is free software; you can redistribute it and/or modify *
- * it under the terms of the GNU General Public License as published by *
- * the Free Software Foundation; either version 2 of the License, or *
- * (at your option) any later version. *
- * *
- ***************************************************************************/
-
-#ifndef _DPT_H
-#define _DPT_H
-
-#define MAX_TO_IOP_MESSAGES (255)
-#define MAX_FROM_IOP_MESSAGES (255)
-
-
-/*
- * SCSI interface function Prototypes
- */
-
-static int adpt_detect(struct scsi_host_template * sht);
-static int adpt_queue(struct Scsi_Host *h, struct scsi_cmnd * cmd);
-static int adpt_abort(struct scsi_cmnd * cmd);
-static int adpt_reset(struct scsi_cmnd* cmd);
-static int adpt_slave_configure(struct scsi_device *);
-
-static const char *adpt_info(struct Scsi_Host *pSHost);
-static int adpt_bios_param(struct scsi_device * sdev, struct block_device *dev,
- sector_t, int geom[]);
-
-static int adpt_bus_reset(struct scsi_cmnd* cmd);
-static int adpt_device_reset(struct scsi_cmnd* cmd);
-
-
-/*
- * struct scsi_host_template (see scsi/scsi_host.h)
- */
-
-#define DPT_DRIVER_NAME "Adaptec I2O RAID"
-
-#ifndef HOSTS_C
-
-#include "dpt/sys_info.h"
-#include <linux/wait.h>
-#include "dpt/dpti_i2o.h"
-#include "dpt/dpti_ioctl.h"
-
-#define DPT_I2O_VERSION "2.4 Build 5go"
-#define DPT_VERSION 2
-#define DPT_REVISION '4'
-#define DPT_SUBREVISION '5'
-#define DPT_BETA ""
-#define DPT_MONTH 8
-#define DPT_DAY 7
-#define DPT_YEAR (2001-1980)
-
-#define DPT_DRIVER "dpt_i2o"
-#define DPTI_I2O_MAJOR (151)
-#define DPT_ORGANIZATION_ID (0x1B) /* For Private Messages */
-#define DPTI_MAX_HBA (16)
-#define MAX_CHANNEL (5) // Maximum Channel # Supported
-#define MAX_ID (128) // Maximum Target ID Supported
-
-/* Sizes in 4 byte words */
-#define REPLY_FRAME_SIZE (17)
-#define MAX_MESSAGE_SIZE (128)
-#define SG_LIST_ELEMENTS (56)
-
-#define EMPTY_QUEUE 0xffffffff
-#define I2O_INTERRUPT_PENDING_B (0x08)
-
-#define PCI_DPT_VENDOR_ID (0x1044) // DPT PCI Vendor ID
-#define PCI_DPT_DEVICE_ID (0xA501) // DPT PCI I2O Device ID
-#define PCI_DPT_RAPTOR_DEVICE_ID (0xA511)
-
-/* Debugging macro from Linux Device Drivers - Rubini */
-#undef PDEBUG
-#ifdef DEBUG
-//TODO add debug level switch
-# define PDEBUG(fmt, args...) printk(KERN_DEBUG "dpti: " fmt, ##args)
-# define PDEBUGV(fmt, args...) printk(KERN_DEBUG "dpti: " fmt, ##args)
-#else
-# define PDEBUG(fmt, args...) /* not debugging: nothing */
-# define PDEBUGV(fmt, args...) /* not debugging: nothing */
-#endif
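/*
 * Example: with -DDEBUG, PDEBUG("%s: Enabled.\n", pHba->name) expands to
 * printk(KERN_DEBUG "dpti: %s: Enabled.\n", pHba->name); without DEBUG
 * the call compiles away to nothing.
 */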
-
-#define PERROR(fmt, args...) printk(KERN_ERR fmt, ##args)
-#define PWARN(fmt, args...) printk(KERN_WARNING fmt, ##args)
-#define PINFO(fmt, args...) printk(KERN_INFO fmt, ##args)
-#define PCRIT(fmt, args...) printk(KERN_CRIT fmt, ##args)
-
-#define SHUTDOWN_SIGS (sigmask(SIGKILL)|sigmask(SIGINT)|sigmask(SIGTERM))
-
-// Command timeouts
-#define FOREVER (0)
-#define TMOUT_INQUIRY (20)
-#define TMOUT_FLUSH (360/45)
-#define TMOUT_ABORT (30)
-#define TMOUT_SCSI (300)
-#define TMOUT_IOPRESET (360)
-#define TMOUT_GETSTATUS (15)
-#define TMOUT_INITOUTBOUND (15)
-#define TMOUT_LCT (360)
-
-
-#define I2O_SCSI_DEVICE_DSC_MASK 0x00FF
-
-#define I2O_DETAIL_STATUS_UNSUPPORTED_FUNCTION 0x000A
-
-#define I2O_SCSI_DSC_MASK 0xFF00
-#define I2O_SCSI_DSC_SUCCESS 0x0000
-#define I2O_SCSI_DSC_REQUEST_ABORTED 0x0200
-#define I2O_SCSI_DSC_UNABLE_TO_ABORT 0x0300
-#define I2O_SCSI_DSC_COMPLETE_WITH_ERROR 0x0400
-#define I2O_SCSI_DSC_ADAPTER_BUSY 0x0500
-#define I2O_SCSI_DSC_REQUEST_INVALID 0x0600
-#define I2O_SCSI_DSC_PATH_INVALID 0x0700
-#define I2O_SCSI_DSC_DEVICE_NOT_PRESENT 0x0800
-#define I2O_SCSI_DSC_UNABLE_TO_TERMINATE 0x0900
-#define I2O_SCSI_DSC_SELECTION_TIMEOUT 0x0A00
-#define I2O_SCSI_DSC_COMMAND_TIMEOUT 0x0B00
-#define I2O_SCSI_DSC_MR_MESSAGE_RECEIVED 0x0D00
-#define I2O_SCSI_DSC_SCSI_BUS_RESET 0x0E00
-#define I2O_SCSI_DSC_PARITY_ERROR_FAILURE 0x0F00
-#define I2O_SCSI_DSC_AUTOSENSE_FAILED 0x1000
-#define I2O_SCSI_DSC_NO_ADAPTER 0x1100
-#define I2O_SCSI_DSC_DATA_OVERRUN 0x1200
-#define I2O_SCSI_DSC_UNEXPECTED_BUS_FREE 0x1300
-#define I2O_SCSI_DSC_SEQUENCE_FAILURE 0x1400
-#define I2O_SCSI_DSC_REQUEST_LENGTH_ERROR 0x1500
-#define I2O_SCSI_DSC_PROVIDE_FAILURE 0x1600
-#define I2O_SCSI_DSC_BDR_MESSAGE_SENT 0x1700
-#define I2O_SCSI_DSC_REQUEST_TERMINATED 0x1800
-#define I2O_SCSI_DSC_IDE_MESSAGE_SENT 0x3300
-#define I2O_SCSI_DSC_RESOURCE_UNAVAILABLE 0x3400
-#define I2O_SCSI_DSC_UNACKNOWLEDGED_EVENT 0x3500
-#define I2O_SCSI_DSC_MESSAGE_RECEIVED 0x3600
-#define I2O_SCSI_DSC_INVALID_CDB 0x3700
-#define I2O_SCSI_DSC_LUN_INVALID 0x3800
-#define I2O_SCSI_DSC_SCSI_TID_INVALID 0x3900
-#define I2O_SCSI_DSC_FUNCTION_UNAVAILABLE 0x3A00
-#define I2O_SCSI_DSC_NO_NEXUS 0x3B00
-#define I2O_SCSI_DSC_SCSI_IID_INVALID 0x3C00
-#define I2O_SCSI_DSC_CDB_RECEIVED 0x3D00
-#define I2O_SCSI_DSC_LUN_ALREADY_ENABLED 0x3E00
-#define I2O_SCSI_DSC_BUS_BUSY 0x3F00
-#define I2O_SCSI_DSC_QUEUE_FROZEN 0x4000
-
-
-#ifndef TRUE
-#define TRUE 1
-#define FALSE 0
-#endif
-
-#define HBA_FLAGS_INSTALLED_B 0x00000001 // Adapter Was Installed
-#define HBA_FLAGS_BLINKLED_B 0x00000002 // Adapter In Blink LED State
-#define HBA_FLAGS_IN_RESET 0x00000040 /* in reset */
-#define HBA_HOSTRESET_FAILED 0x00000080 /* adpt_resethost failed */
-
-
-// Device state flags
-#define DPTI_DEV_ONLINE 0x00
-#define DPTI_DEV_UNSCANNED 0x01
-#define DPTI_DEV_RESET 0x02
-#define DPTI_DEV_OFFLINE 0x04
-
-
-struct adpt_device {
- struct adpt_device* next_lun;
- u32 flags;
- u32 type;
- u32 capacity;
- u32 block_size;
- u8 scsi_channel;
- u8 scsi_id;
- u64 scsi_lun;
- u8 state;
- u16 tid;
- struct i2o_device* pI2o_dev;
- struct scsi_device *pScsi_dev;
-};
-
-struct adpt_channel {
- struct adpt_device* device[MAX_ID]; /* used as an array of 128 scsi ids */
- u8 scsi_id;
- u8 type;
- u16 tid;
- u32 state;
- struct i2o_device* pI2o_dev;
-};
-
-// HBA state flags
-#define DPTI_STATE_RESET (0x01)
-
-typedef struct _adpt_hba {
- struct _adpt_hba *next;
- struct pci_dev *pDev;
- struct Scsi_Host *host;
- u32 state;
- spinlock_t state_lock;
- int unit;
- int host_no; /* SCSI host number */
- u8 initialized;
- u8 in_use; /* is the management node open*/
-
- char name[32];
- char detail[55];
-
- void __iomem *base_addr_virt;
- void __iomem *msg_addr_virt;
- ulong base_addr_phys;
- void __iomem *post_port;
- void __iomem *reply_port;
- void __iomem *irq_mask;
- u16 post_count;
- u32 post_fifo_size;
- u32 reply_fifo_size;
- u32* reply_pool;
- dma_addr_t reply_pool_pa;
- u32 sg_tablesize; // Scatter/Gather List Size.
- u8 top_scsi_channel;
- u8 top_scsi_id;
- u64 top_scsi_lun;
- u8 dma64;
-
- i2o_status_block* status_block;
- dma_addr_t status_block_pa;
- i2o_hrt* hrt;
- dma_addr_t hrt_pa;
- i2o_lct* lct;
- dma_addr_t lct_pa;
- uint lct_size;
- struct i2o_device* devices;
- struct adpt_channel channel[MAX_CHANNEL];
- struct proc_dir_entry* proc_entry; /* /proc dir */
-
- void __iomem *FwDebugBuffer_P; // Virtual Address Of FW Debug Buffer
- u32 FwDebugBufferSize; // FW Debug Buffer Size In Bytes
- void __iomem *FwDebugStrLength_P;// Virtual Addr Of FW Debug String Len
- void __iomem *FwDebugFlags_P; // Virtual Address Of FW Debug Flags
- void __iomem *FwDebugBLEDflag_P;// Virtual Addr Of FW Debug BLED
- void __iomem *FwDebugBLEDvalue_P;// Virtual Addr Of FW Debug BLED
- u32 FwDebugFlags;
- u32 *ioctl_reply_context[4];
-} adpt_hba;
-
-struct sg_simple_element {
- u32 flag_count;
- u32 addr_bus;
-};
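/*
 * flag_count packs the SG flags into the top byte and the byte count
 * into the low 24 bits, matching the literals used in dpt_i2o.c:
 *
 *	0x10000000	simple address element
 *	0x04000000	direction bit (host to controller)
 *	0xD0000000	simple element, last in list, end of buffer
 *
 * e.g. a final 4 KiB outbound element reads 0xD4000000 | 0x1000.
 */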
-
-/*
- * Function Prototypes
- */
-
-static void adpt_i2o_sys_shutdown(void);
-static int adpt_init(void);
-static int adpt_i2o_build_sys_table(void);
-static irqreturn_t adpt_isr(int irq, void *dev_id);
-
-static void adpt_i2o_report_hba_unit(adpt_hba* pHba, struct i2o_device *d);
-static int adpt_i2o_query_scalar(adpt_hba* pHba, int tid,
- int group, int field, void *buf, int buflen);
-#ifdef DEBUG
-static const char *adpt_i2o_get_class_name(int class);
-#endif
-static int adpt_i2o_issue_params(int cmd, adpt_hba* pHba, int tid,
- void *opblk, dma_addr_t opblk_pa, int oplen,
- void *resblk, dma_addr_t resblk_pa, int reslen);
-static int adpt_i2o_post_wait(adpt_hba* pHba, u32* msg, int len, int timeout);
-static int adpt_i2o_lct_get(adpt_hba* pHba);
-static int adpt_i2o_parse_lct(adpt_hba* pHba);
-static int adpt_i2o_activate_hba(adpt_hba* pHba);
-static int adpt_i2o_enable_hba(adpt_hba* pHba);
-static int adpt_i2o_install_device(adpt_hba* pHba, struct i2o_device *d);
-static s32 adpt_i2o_post_this(adpt_hba* pHba, u32* data, int len);
-static s32 adpt_i2o_quiesce_hba(adpt_hba* pHba);
-static s32 adpt_i2o_status_get(adpt_hba* pHba);
-static s32 adpt_i2o_init_outbound_q(adpt_hba* pHba);
-static s32 adpt_i2o_hrt_get(adpt_hba* pHba);
-static s32 adpt_scsi_to_i2o(adpt_hba* pHba, struct scsi_cmnd* cmd, struct adpt_device* dptdevice);
-static s32 adpt_i2o_to_scsi(void __iomem *reply, struct scsi_cmnd* cmd);
-static s32 adpt_scsi_host_alloc(adpt_hba* pHba,struct scsi_host_template * sht);
-static s32 adpt_hba_reset(adpt_hba* pHba);
-static s32 adpt_i2o_reset_hba(adpt_hba* pHba);
-static s32 adpt_rescan(adpt_hba* pHba);
-static s32 adpt_i2o_reparse_lct(adpt_hba* pHba);
-static s32 adpt_send_nop(adpt_hba*pHba,u32 m);
-static void adpt_i2o_delete_hba(adpt_hba* pHba);
-static void adpt_inquiry(adpt_hba* pHba);
-static void adpt_fail_posted_scbs(adpt_hba* pHba);
-static struct adpt_device* adpt_find_device(adpt_hba* pHba, u32 chan, u32 id, u64 lun);
-static int adpt_install_hba(struct scsi_host_template* sht, struct pci_dev* pDev) ;
-static int adpt_i2o_online_hba(adpt_hba* pHba);
-static void adpt_i2o_post_wait_complete(u32, int);
-static int adpt_i2o_systab_send(adpt_hba* pHba);
-
-static int adpt_ioctl(struct inode *inode, struct file *file, uint cmd, ulong arg);
-static int adpt_open(struct inode *inode, struct file *file);
-static int adpt_close(struct inode *inode, struct file *file);
-
-
-#ifdef UARTDELAY
-static void adpt_delay(int millisec);
-#endif
-
-#define PRINT_BUFFER_SIZE 512
-
-#define HBA_FLAGS_DBG_FLAGS_MASK 0xffff0000 // Mask for debug flags
-#define HBA_FLAGS_DBG_KERNEL_PRINT_B 0x00010000 // Kernel Debugger Print
-#define HBA_FLAGS_DBG_FW_PRINT_B 0x00020000 // Firmware Debugger Print
-#define HBA_FLAGS_DBG_FUNCTION_ENTRY_B 0x00040000 // Function Entry Point
-#define HBA_FLAGS_DBG_FUNCTION_EXIT_B 0x00080000 // Function Exit
-#define HBA_FLAGS_DBG_ERROR_B 0x00100000 // Error Conditions
-#define HBA_FLAGS_DBG_INIT_B 0x00200000 // Init Prints
-#define HBA_FLAGS_DBG_OS_COMMANDS_B 0x00400000 // OS Command Info
-#define HBA_FLAGS_DBG_SCAN_B 0x00800000 // Device Scan
-
-#define FW_DEBUG_STR_LENGTH_OFFSET 0
-#define FW_DEBUG_FLAGS_OFFSET 4
-#define FW_DEBUG_BLED_OFFSET 8
-
-#define FW_DEBUG_FLAGS_NO_HEADERS_B 0x01
-#endif /* !HOSTS_C */
-#endif /* _DPT_H */
--
2.25.1

[PATCH openEuler-1.0-LTS 1/9] bonding: Fix memory leak when changing bond type to Ethernet
by Yongqiang Liu 08 May '23
From: Ido Schimmel <idosch(a)nvidia.com>
mainline inclusion
from mainline-v6.3-rc6
commit c484fcc058bada604d7e4e5228d4affb646ddbc2
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I718K0
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
---------------------------
When a net device is put administratively up, its 'IFF_UP' flag is set
(if not set already) and a 'NETDEV_UP' notification is emitted, which
causes the 8021q driver to add VLAN ID 0 on the device. The reverse
happens when a net device is put administratively down.
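For context, a minimal sketch of the 8021q notifier behavior described above
(simplified from net/8021q/vlan.c; error handling and the other events are
omitted, so this is not the literal upstream code):

	static int vlan_device_event(struct notifier_block *nb, unsigned long event,
				     void *ptr)
	{
		struct net_device *dev = netdev_notifier_info_to_dev(ptr);

		switch (event) {
		case NETDEV_UP:
			/* allocates the vlan_vid_info for VID 0 */
			vlan_vid_add(dev, htons(ETH_P_8021Q), 0);
			break;
		case NETDEV_DOWN:
			/* frees it again */
			vlan_vid_del(dev, htons(ETH_P_8021Q), 0);
			break;
		}
		return NOTIFY_DONE;
	}

If IFF_UP is cleared behind the notifier's back, the NETDEV_DOWN half never
runs and the entry added on NETDEV_UP leaks.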
When changing the type of a bond to Ethernet, its 'IFF_UP' flag is
incorrectly cleared, resulting in the kernel skipping the above process
and VLAN ID 0 being leaked [1].
Fix by restoring the flag when changing the type to Ethernet, in a
similar fashion to the restoration of the 'IFF_SLAVE' flag.
The issue can be reproduced using the script in [2], with example output
before and after the fix in [3].
[1]
unreferenced object 0xffff888103479900 (size 256):
comm "ip", pid 329, jiffies 4294775225 (age 28.561s)
hex dump (first 32 bytes):
00 a0 0c 15 81 88 ff ff 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffff81a6051a>] kmalloc_trace+0x2a/0xe0
[<ffffffff8406426c>] vlan_vid_add+0x30c/0x790
[<ffffffff84068e21>] vlan_device_event+0x1491/0x21a0
[<ffffffff81440c8e>] notifier_call_chain+0xbe/0x1f0
[<ffffffff8372383a>] call_netdevice_notifiers_info+0xba/0x150
[<ffffffff837590f2>] __dev_notify_flags+0x132/0x2e0
[<ffffffff8375ad9f>] dev_change_flags+0x11f/0x180
[<ffffffff8379af36>] do_setlink+0xb96/0x4060
[<ffffffff837adf6a>] __rtnl_newlink+0xc0a/0x18a0
[<ffffffff837aec6c>] rtnl_newlink+0x6c/0xa0
[<ffffffff837ac64e>] rtnetlink_rcv_msg+0x43e/0xe00
[<ffffffff839a99e0>] netlink_rcv_skb+0x170/0x440
[<ffffffff839a738f>] netlink_unicast+0x53f/0x810
[<ffffffff839a7fcb>] netlink_sendmsg+0x96b/0xe90
[<ffffffff8369d12f>] ____sys_sendmsg+0x30f/0xa70
[<ffffffff836a6d7a>] ___sys_sendmsg+0x13a/0x1e0
unreferenced object 0xffff88810f6a83e0 (size 32):
comm "ip", pid 329, jiffies 4294775225 (age 28.561s)
hex dump (first 32 bytes):
a0 99 47 03 81 88 ff ff a0 99 47 03 81 88 ff ff ..G.......G.....
81 00 00 00 01 00 00 00 cc cc cc cc cc cc cc cc ................
backtrace:
[<ffffffff81a6051a>] kmalloc_trace+0x2a/0xe0
[<ffffffff84064369>] vlan_vid_add+0x409/0x790
[<ffffffff84068e21>] vlan_device_event+0x1491/0x21a0
[<ffffffff81440c8e>] notifier_call_chain+0xbe/0x1f0
[<ffffffff8372383a>] call_netdevice_notifiers_info+0xba/0x150
[<ffffffff837590f2>] __dev_notify_flags+0x132/0x2e0
[<ffffffff8375ad9f>] dev_change_flags+0x11f/0x180
[<ffffffff8379af36>] do_setlink+0xb96/0x4060
[<ffffffff837adf6a>] __rtnl_newlink+0xc0a/0x18a0
[<ffffffff837aec6c>] rtnl_newlink+0x6c/0xa0
[<ffffffff837ac64e>] rtnetlink_rcv_msg+0x43e/0xe00
[<ffffffff839a99e0>] netlink_rcv_skb+0x170/0x440
[<ffffffff839a738f>] netlink_unicast+0x53f/0x810
[<ffffffff839a7fcb>] netlink_sendmsg+0x96b/0xe90
[<ffffffff8369d12f>] ____sys_sendmsg+0x30f/0xa70
[<ffffffff836a6d7a>] ___sys_sendmsg+0x13a/0x1e0
[2]
ip link add name t-nlmon type nlmon
ip link add name t-dummy type dummy
ip link add name t-bond type bond mode active-backup
ip link set dev t-bond up
ip link set dev t-nlmon master t-bond
ip link set dev t-nlmon nomaster
ip link show dev t-bond
ip link set dev t-dummy master t-bond
ip link show dev t-bond
ip link del dev t-bond
ip link del dev t-dummy
ip link del dev t-nlmon
[3]
Before:
12: t-bond: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
link/netlink
12: t-bond: <BROADCAST,MULTICAST,MASTER,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 46:57:39:a4:46:a2 brd ff:ff:ff:ff:ff:ff
After:
12: t-bond: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
link/netlink
12: t-bond: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 66:48:7b:74:b6:8a brd ff:ff:ff:ff:ff:ff
Fixes: e36b9d16c6a6 ("bonding: clean muticast addresses when device changes type")
Fixes: 75c78500ddad ("bonding: remap muticast addresses without using dev_close() and dev_open()")
Fixes: 9ec7eb60dcbc ("bonding: restore IFF_MASTER/SLAVE flags on bond enslave ether type change")
Reported-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Link: https://lore.kernel.org/netdev/78a8a03b-6070-3e6b-5042-f848dab16fb8@alu.uni…
Tested-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Signed-off-by: Ido Schimmel <idosch(a)nvidia.com>
Acked-by: Jay Vosburgh <jay.vosburgh(a)canonical.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Dong Chenchen <dongchenchen2(a)huawei.com>
Reviewed-by: Liu Jian <liujian56(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/net/bonding/bond_main.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 08939a35a187..fe28cbec4a4c 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1393,14 +1393,15 @@ void bond_lower_state_changed(struct slave *slave)
/* The bonding driver uses ether_setup() to convert a master bond device
* to ARPHRD_ETHER, that resets the target netdevice's flags so we always
- * have to restore the IFF_MASTER flag, and only restore IFF_SLAVE if it was set
+ * have to restore the IFF_MASTER flag, and only restore IFF_SLAVE and IFF_UP
+ * if they were set
*/
static void bond_ether_setup(struct net_device *bond_dev)
{
- unsigned int slave_flag = bond_dev->flags & IFF_SLAVE;
+ unsigned int flags = bond_dev->flags & (IFF_SLAVE | IFF_UP);
ether_setup(bond_dev);
- bond_dev->flags |= IFF_MASTER | slave_flag;
+ bond_dev->flags |= IFF_MASTER | flags;
bond_dev->priv_flags &= ~IFF_TX_SKB_SHARING;
}
--
2.25.1
1
8

[PATCH openEuler-1.0-LTS] dm ioctl: fix nested locking in table_clear() to remove deadlock concern
by Yongqiang Liu 06 May '23
by Yongqiang Liu 06 May '23
06 May '23
From: Mike Snitzer <snitzer(a)kernel.org>
mainline inclusion
from mainline-v6.4-rc1
commit 3d32aaa7e66d5c1479a3c31d6c2c5d45dd0d3b89
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6YQZS
CVE: CVE-2023-2269
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?i…
----------------------------------------
syzkaller found the following problematic rwsem locking (with write
lock already held):
down_read+0x9d/0x450 kernel/locking/rwsem.c:1509
dm_get_inactive_table+0x2b/0xc0 drivers/md/dm-ioctl.c:773
__dev_status+0x4fd/0x7c0 drivers/md/dm-ioctl.c:844
table_clear+0x197/0x280 drivers/md/dm-ioctl.c:1537
table_clear() first acquires a write lock
https://elixir.bootlin.com/linux/v6.2/source/drivers/md/dm-ioctl.c#L1520
down_write(&_hash_lock);
Then before the lock is released at L1539, there is a path shown above:
table_clear -> __dev_status -> dm_get_inactive_table -> down_read
https://elixir.bootlin.com/linux/v6.2/source/drivers/md/dm-ioctl.c#L773
down_read(&_hash_lock);
This then tries to take the same rwsem for reading while the write lock is
still held by the same task, resulting in deadlock.
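Reduced to its essence, this is the classic non-recursive rwsem
self-deadlock; a minimal sketch (not the actual dm-ioctl code):

	/* kernel rwsems are not recursive: a task holding the write side
	 * that calls down_read() on the same rwsem blocks forever.
	 */
	static DECLARE_RWSEM(_hash_lock);

	static void table_clear_like_path(void)
	{
		down_write(&_hash_lock);
		/* ... table_clear() -> __dev_status() -> dm_get_inactive_table() ... */
		down_read(&_hash_lock);	/* deadlock: write lock still held */
		up_read(&_hash_lock);
		up_write(&_hash_lock);
	}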
Fix this by moving table_clear()'s __dev_status() call to after its
up_write(&_hash_lock);
Cc: stable(a)vger.kernel.org
Reported-by: Zheng Zhang <zheng.zhang(a)email.ucr.edu>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
Conflicts:
drivers/md/dm-ioctl.c
Signed-off-by: Li Lingfeng <lilingfeng3(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/md/dm-ioctl.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
index 7ec02676bcad..c8c27d23bb45 100644
--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1410,11 +1410,12 @@ static int table_clear(struct file *filp, struct dm_ioctl *param, size_t param_s
hc->new_map = NULL;
}
- param->flags &= ~DM_INACTIVE_PRESENT_FLAG;
-
- __dev_status(hc->md, param);
md = hc->md;
up_write(&_hash_lock);
+
+ param->flags &= ~DM_INACTIVE_PRESENT_FLAG;
+ __dev_status(md, param);
+
if (old_map) {
dm_sync_table(md);
dm_table_destroy(old_map);
--
2.25.1
1
0

[PATCH openEuler-1.0-LTS 1/3] bonding: restore IFF_MASTER/SLAVE flags on bond enslave ether type change
by Yongqiang Liu 06 May '23
by Yongqiang Liu 06 May '23
06 May '23
From: Nikolay Aleksandrov <razor(a)blackwall.org>
mainline inclusion
from mainline-v6.3-rc3
commit 9ec7eb60dcbcb6c41076defbc5df7bbd95ceaba5
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6WNGK
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
---------------------------
Add a bond_ether_setup() helper which is used to fix the ether_setup() calls
in the bonding driver. It takes care of both the IFF_MASTER and IFF_SLAVE
flags: the former is always restored, the latter only if it was set.
If the bond enslaves a non-ARPHRD_ETHER device (changing its type), then
releases it and enslaves an ARPHRD_ETHER device (changing back), we use
ether_setup() to restore the bond device type, but ether_setup() also resets
the device's flags and removes IFF_MASTER and IFF_SLAVE [1]. Use the
bond_ether_setup() helper to restore both after such a transition.
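For reference, the reason the flags vanish is that ether_setup() assigns
dev->flags outright; a condensed sketch of net/ethernet/eth.c (most defaults
elided, not the full function):

	void ether_setup(struct net_device *dev)
	{
		dev->type     = ARPHRD_ETHER;
		dev->mtu      = ETH_DATA_LEN;
		dev->addr_len = ETH_ALEN;
		/* plain assignment: any IFF_MASTER/IFF_SLAVE bits are wiped */
		dev->flags    = IFF_BROADCAST | IFF_MULTICAST;
		dev->priv_flags |= IFF_TX_SKB_SHARING;
		/* remaining defaults omitted in this sketch */
	}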
[1] reproduce (nlmon is non-ARPHRD_ETHER):
$ ip l add nlmon0 type nlmon
$ ip l add bond2 type bond mode active-backup
$ ip l set nlmon0 master bond2
$ ip l set nlmon0 nomaster
$ ip l add bond1 type bond
(we use bond1 as ARPHRD_ETHER device to restore bond2's mode)
$ ip l set bond1 master bond2
$ ip l sh dev bond2
37: bond2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether be:d7:c5:40:5b:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 1500
(notice bond2's IFF_MASTER is missing)
Fixes: e36b9d16c6a6 ("bonding: clean muticast addresses when device changes type")
Signed-off-by: Nikolay Aleksandrov <razor(a)blackwall.org>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Conflicts:
drivers/net/bonding/bond_main.c
Signed-off-by: Ziyang Xuan <william.xuanziyang(a)huawei.com>
Reviewed-by: Liu Jian <liujian56(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/net/bonding/bond_main.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 02965c5c1ed2..b0ecbb293bbf 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1391,6 +1391,19 @@ void bond_lower_state_changed(struct slave *slave)
netdev_lower_state_changed(slave->dev, &info);
}
+/* The bonding driver uses ether_setup() to convert a master bond device
+ * to ARPHRD_ETHER, that resets the target netdevice's flags so we always
+ * have to restore the IFF_MASTER flag, and only restore IFF_SLAVE if it was set
+ */
+static void bond_ether_setup(struct net_device *bond_dev)
+{
+ unsigned int slave_flag = bond_dev->flags & IFF_SLAVE;
+
+ ether_setup(bond_dev);
+ bond_dev->flags |= IFF_MASTER | slave_flag;
+ bond_dev->priv_flags &= ~IFF_TX_SKB_SHARING;
+}
+
/* enslave device <slave> to bond device <master> */
int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
struct netlink_ext_ack *extack)
@@ -1481,10 +1494,8 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
if (slave_dev->type != ARPHRD_ETHER)
bond_setup_by_slave(bond_dev, slave_dev);
- else {
- ether_setup(bond_dev);
- bond_dev->priv_flags &= ~IFF_TX_SKB_SHARING;
- }
+ else
+ bond_ether_setup(bond_dev);
call_netdevice_notifiers(NETDEV_POST_TYPE_CHANGE,
bond_dev);
--
2.25.1
1
2

[PATCH openEuler-1.0-LTS 1/2] ovl: get_acl: Fix null pointer dereference at realinode in rcu-walk mode
by Yongqiang Liu 06 May '23
by Yongqiang Liu 06 May '23
06 May '23
From: Zhihao Cheng <chengzhihao1(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I70MZX
CVE: NA
--------------------------------
The following process:
P1                                      P2
path_openat
 link_path_walk
  may_lookup
   inode_permission(rcu)
    ovl_permission
     acl_permission_check
      check_acl
       get_cached_acl_rcu
        ovl_get_inode_acl
         realinode = ovl_inode_real(ovl_inode)
                                        drop_cache
                                         __dentry_kill(ovl_dentry)
                                          iput(ovl_inode)
                                           ovl_destroy_inode(ovl_inode)
                                            dput(oi->__upperdentry)
                                             dentry_kill(upperdentry)
                                              dentry_unlink_inode
                                               upperdentry->d_inode = NULL
          ovl_inode_upper
           upperdentry = ovl_i_dentry_upper(ovl_inode)
           d_inode(upperdentry) // returns NULL
         IS_POSIXACL(realinode) // NULL pointer dereference
will trigger a NULL pointer dereference at realinode:
[ 205.472797] BUG: kernel NULL pointer dereference, address:
0000000000000028
[ 205.476701] CPU: 2 PID: 2713 Comm: ls Not tainted
6.3.0-12064-g2edfa098e750-dirty #1216
[ 205.478754] RIP: 0010:do_ovl_get_acl+0x5d/0x300
[ 205.489584] Call Trace:
[ 205.489812] <TASK>
[ 205.490014] ovl_get_inode_acl+0x26/0x30
[ 205.490466] get_cached_acl_rcu+0x61/0xa0
[ 205.490908] generic_permission+0x1bf/0x4e0
[ 205.491447] ovl_permission+0x79/0x1b0
[ 205.491917] inode_permission+0x15e/0x2c0
[ 205.492425] link_path_walk+0x115/0x550
[ 205.493311] path_lookupat.isra.0+0xb2/0x200
[ 205.493803] filename_lookup+0xda/0x240
[ 205.495747] vfs_fstatat+0x7b/0xb0
A reproducer can be found at [Link].
Fix it by checking whether realinode is NULL before accessing it.
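The shape of the fix, mirroring the diff below: under rcu-walk a vanished
upper inode is reported as -ECHILD, which asks the VFS to retry the lookup
in ref-walk mode with proper references held:

	if (!IS_ENABLED(CONFIG_FS_POSIX_ACL))
		return NULL;

	/* realinode can only be NULL during rcu-walk, when the upper
	 * dentry may lose its inode underneath us
	 */
	if (!realinode) {
		WARN_ON(!rcu);
		return ERR_PTR(-ECHILD);	/* retry in ref-walk mode */
	}

	if (!IS_POSIXACL(realinode))
		return NULL;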
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217404
Fixes: 332f606b32b6 ("ovl: enable RCU'd ->get_acl()")
Signed-off-by: Zhihao Cheng <chengzhihao1(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
fs/overlayfs/inode.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 5c82c95e57d2..3a35623b86e1 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -428,7 +428,15 @@ struct posix_acl *ovl_get_acl(struct inode *inode, int type, bool rcu)
const struct cred *old_cred;
struct posix_acl *acl;
- if (!IS_ENABLED(CONFIG_FS_POSIX_ACL) || !IS_POSIXACL(realinode))
+ if (!IS_ENABLED(CONFIG_FS_POSIX_ACL))
+ return NULL;
+
+ if (!realinode) {
+ WARN_ON(!rcu);
+ return ERR_PTR(-ECHILD);
+ }
+
+ if (!IS_POSIXACL(realinode))
return NULL;
if (rcu)
--
2.25.1
1
1

[PATCH openEuler-1.0-LTS] net: sched: sch_qfq: prevent slab-out-of-bounds in qfq_activate_agg
by Yongqiang Liu 05 May '23
by Yongqiang Liu 05 May '23
05 May '23
From: Gwangun Jung <exsociety(a)gmail.com>
stable inclusion
from stable-v4.19.282
commit 6ef8120262dfa63d9ec517d724e6f15591473a78
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6ZISA
CVE: CVE-2023-31436
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
[ Upstream commit 3037933448f60f9acb705997eae62013ecb81e0d ]
If the TCA_QFQ_LMAX value is not offered through nlattr, lmax is determined by the MTU value of the network device.
The MTU of the loopback device can be set up to 2^31-1.
As a result, it is possible to end up with an lmax value outside the range enforced for netlink-supplied values (QFQ_MIN_LMAX up to 1UL << QFQ_MTU_SHIFT), because the MTU-derived fallback was never validated.
Due to the invalid lmax value, an index is generated that exceeds the QFQ_MAX_INDEX(=24) value, causing out-of-bounds read/write errors.
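The buggy shape, sketched for clarity: the range check lived inside the
nlattr branch only, so the MTU-derived fallback escaped validation (the fix
below hoists the check out of the branch):

	/* before the fix (sketch): */
	if (tb[TCA_QFQ_LMAX]) {
		lmax = nla_get_u32(tb[TCA_QFQ_LMAX]);
		if (lmax < QFQ_MIN_LMAX || lmax > (1UL << QFQ_MTU_SHIFT))
			return -EINVAL;	/* only the netlink value was checked */
	} else {
		lmax = psched_mtu(qdisc_dev(sch));	/* loopback MTU can be ~2^31-1 */
	}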
The following KASAN report shows the out-of-bounds access:
[ 84.582666] BUG: KASAN: slab-out-of-bounds in qfq_activate_agg.constprop.0 (net/sched/sch_qfq.c:1027 net/sched/sch_qfq.c:1060 net/sched/sch_qfq.c:1313)
[ 84.583267] Read of size 4 at addr ffff88810f676948 by task ping/301
[ 84.583686]
[ 84.583797] CPU: 3 PID: 301 Comm: ping Not tainted 6.3.0-rc5 #1
[ 84.584164] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 84.584644] Call Trace:
[ 84.584787] <TASK>
[ 84.584906] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1))
[ 84.585108] print_report (mm/kasan/report.c:320 mm/kasan/report.c:430)
[ 84.585570] kasan_report (mm/kasan/report.c:538)
[ 84.585988] qfq_activate_agg.constprop.0 (net/sched/sch_qfq.c:1027 net/sched/sch_qfq.c:1060 net/sched/sch_qfq.c:1313)
[ 84.586599] qfq_enqueue (net/sched/sch_qfq.c:1255)
[ 84.587607] dev_qdisc_enqueue (net/core/dev.c:3776)
[ 84.587749] __dev_queue_xmit (./include/net/sch_generic.h:186 net/core/dev.c:3865 net/core/dev.c:4212)
[ 84.588763] ip_finish_output2 (./include/net/neighbour.h:546 net/ipv4/ip_output.c:228)
[ 84.589460] ip_output (net/ipv4/ip_output.c:430)
[ 84.590132] ip_push_pending_frames (./include/net/dst.h:444 net/ipv4/ip_output.c:126 net/ipv4/ip_output.c:1586 net/ipv4/ip_output.c:1606)
[ 84.590285] raw_sendmsg (net/ipv4/raw.c:649)
[ 84.591960] sock_sendmsg (net/socket.c:724 net/socket.c:747)
[ 84.592084] __sys_sendto (net/socket.c:2142)
[ 84.593306] __x64_sys_sendto (net/socket.c:2150)
[ 84.593779] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
[ 84.593902] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
[ 84.594070] RIP: 0033:0x7fe568032066
[ 84.594192] Code: 0e 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 41 89 ca 64 8b 04 25 18 00 00 00 85 c0
[ 84.594796] RSP: 002b:00007ffce388b4e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
Code starting with the faulting instruction
===========================================
[ 84.595047] RAX: ffffffffffffffda RBX: 00007ffce388cc70 RCX: 00007fe568032066
[ 84.595281] RDX: 0000000000000040 RSI: 00005605fdad6d10 RDI: 0000000000000003
[ 84.595515] RBP: 00005605fdad6d10 R08: 00007ffce388eeec R09: 0000000000000010
[ 84.595749] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040
[ 84.595984] R13: 00007ffce388cc30 R14: 00007ffce388b4f0 R15: 0000001d00000001
[ 84.596218] </TASK>
[ 84.596295]
[ 84.596351] Allocated by task 291:
[ 84.596467] kasan_save_stack (mm/kasan/common.c:46)
[ 84.596597] kasan_set_track (mm/kasan/common.c:52)
[ 84.596725] __kasan_kmalloc (mm/kasan/common.c:384)
[ 84.596852] __kmalloc_node (./include/linux/kasan.h:196 mm/slab_common.c:967 mm/slab_common.c:974)
[ 84.596979] qdisc_alloc (./include/linux/slab.h:610 ./include/linux/slab.h:731 net/sched/sch_generic.c:938)
[ 84.597100] qdisc_create (net/sched/sch_api.c:1244)
[ 84.597222] tc_modify_qdisc (net/sched/sch_api.c:1680)
[ 84.597357] rtnetlink_rcv_msg (net/core/rtnetlink.c:6174)
[ 84.597495] netlink_rcv_skb (net/netlink/af_netlink.c:2574)
[ 84.597627] netlink_unicast (net/netlink/af_netlink.c:1340 net/netlink/af_netlink.c:1365)
[ 84.597759] netlink_sendmsg (net/netlink/af_netlink.c:1942)
[ 84.597891] sock_sendmsg (net/socket.c:724 net/socket.c:747)
[ 84.598016] ____sys_sendmsg (net/socket.c:2501)
[ 84.598147] ___sys_sendmsg (net/socket.c:2557)
[ 84.598275] __sys_sendmsg (./include/linux/file.h:31 net/socket.c:2586)
[ 84.598399] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
[ 84.598520] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
[ 84.598688]
[ 84.598744] The buggy address belongs to the object at ffff88810f674000
[ 84.598744] which belongs to the cache kmalloc-8k of size 8192
[ 84.599135] The buggy address is located 2664 bytes to the right of
[ 84.599135] allocated 7904-byte region [ffff88810f674000, ffff88810f675ee0)
[ 84.599544]
[ 84.599598] The buggy address belongs to the physical page:
[ 84.599777] page:00000000e638567f refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10f670
[ 84.600074] head:00000000e638567f order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[ 84.600330] flags: 0x200000000010200(slab|head|node=0|zone=2)
[ 84.600517] raw: 0200000000010200 ffff888100043180 dead000000000122 0000000000000000
[ 84.600764] raw: 0000000000000000 0000000080020002 00000001ffffffff 0000000000000000
[ 84.601009] page dumped because: kasan: bad access detected
[ 84.601187]
[ 84.601241] Memory state around the buggy address:
[ 84.601396] ffff88810f676800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 84.601620] ffff88810f676880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 84.601845] >ffff88810f676900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 84.602069] ^
[ 84.602243] ffff88810f676980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 84.602468] ffff88810f676a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 84.602693] ==================================================================
[ 84.602924] Disabling lock debugging due to kernel taint
Fixes: 3015f3d2a3cd ("pkt_sched: enable QFQ to support TSO/GSO")
Reported-by: Gwangun Jung <exsociety(a)gmail.com>
Signed-off-by: Gwangun Jung <exsociety(a)gmail.com>
Acked-by: Jamal Hadi Salim<jhs(a)mojatatu.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Zhengchao Shao <shaozhengchao(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
net/sched/sch_qfq.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c
index 73ac3c65c747..0326793b700b 100644
--- a/net/sched/sch_qfq.c
+++ b/net/sched/sch_qfq.c
@@ -433,15 +433,16 @@ static int qfq_change_class(struct Qdisc *sch, u32 classid, u32 parentid,
} else
weight = 1;
- if (tb[TCA_QFQ_LMAX]) {
+ if (tb[TCA_QFQ_LMAX])
lmax = nla_get_u32(tb[TCA_QFQ_LMAX]);
- if (lmax < QFQ_MIN_LMAX || lmax > (1UL << QFQ_MTU_SHIFT)) {
- pr_notice("qfq: invalid max length %u\n", lmax);
- return -EINVAL;
- }
- } else
+ else
lmax = psched_mtu(qdisc_dev(sch));
+ if (lmax < QFQ_MIN_LMAX || lmax > (1UL << QFQ_MTU_SHIFT)) {
+ pr_notice("qfq: invalid max length %u\n", lmax);
+ return -EINVAL;
+ }
+
inv_w = ONE_FP / weight;
weight = ONE_FP / inv_w;
--
2.25.1
1
0

[PATCH openEuler-1.0-LTS] ext4: only update i_reserved_data_blocks on successful block allocation
by Yongqiang Liu 05 May '23
by Yongqiang Liu 05 May '23
05 May '23
From: Baokun Li <libaokun1(a)huawei.com>
maillist inclusion
category: bugfix
bugzilla: 188499, https://gitee.com/openeuler/kernel/issues/I6TNVT
CVE: NA
Reference: https://patchwork.ozlabs.org/project/linux-ext4/patch/20230412124126.228671…
----------------------------------------
In our fault injection test, we create an ext4 file, migrate it to
non-extent based file, then punch a hole and finally trigger a WARN_ON
in the ext4_da_update_reserve_space():
EXT4-fs warning (device sda): ext4_da_update_reserve_space:369:
ino 14, used 11 with only 10 reserved data blocks
When writing back a non-extent based file, if we enable delalloc, the
number of reserved blocks will be subtracted from the number of blocks
mapped by ext4_ind_map_blocks(), and the extent status tree will be
updated. We update the extent status tree by first removing the old
extent_status and then inserting the new extent_status. If the block range
we remove happens to be in an extent, then we need to allocate another
extent_status with ext4_es_alloc_extent().
   use old      to remove     to add new
|-----------|--------------|--------------|
              old extent_status
The problem is that the allocation of a new extent_status failed due to a
fault injection, and __es_shrink() did not get free memory, resulting in
a return of -ENOMEM. Then do_writepages() retries after receiving -ENOMEM,
we map to the same extent again, and the number of reserved blocks is again
subtracted from the number of blocks in that extent. Since the blocks in
the same extent are subtracted twice, we end up triggering WARN_ON at
ext4_da_update_reserve_space() because used > ei->i_reserved_data_blocks.
For a non-extent based file, the number of reserved blocks was updated after
ext4_ind_map_blocks() returned. This is problematic because
ext4_ind_map_blocks() does not always succeed in allocating new blocks, yet
the reserved count was reduced unconditionally. So move the update logic
into ext4_ind_map_blocks() itself to ensure that the number of reserved
blocks is updated only after new blocks have actually been allocated.
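A sketch of where the update ends up, matching the diff below: the
reservation is only touched on the success path of the indirect allocation
(context lines elided):

	/* in ext4_ind_map_blocks(), after blocks were really allocated: */
	count = ar.len;
	if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)
		ext4_da_update_reserve_space(inode, count, 1);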
Fixes: 5f634d064c70 ("ext4: Fix quota accounting error with fallocate")
Reviewed-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Reviewed-by: Yang Erkun <yangerkun(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
fs/ext4/indirect.c | 8 ++++++++
fs/ext4/inode.c | 10 ----------
2 files changed, 8 insertions(+), 10 deletions(-)
diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
index cd3b4ebbbab3..f99f366feeaf 100644
--- a/fs/ext4/indirect.c
+++ b/fs/ext4/indirect.c
@@ -645,6 +645,14 @@ int ext4_ind_map_blocks(handle_t *handle, struct inode *inode,
ext4_update_inode_fsync_trans(handle, inode, 1);
count = ar.len;
+
+ /*
+ * Update reserved blocks/metadata blocks after successful block
+ * allocation which had been deferred till now.
+ */
+ if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)
+ ext4_da_update_reserve_space(inode, count, 1);
+
got_it:
map->m_flags |= EXT4_MAP_MAPPED;
map->m_pblk = le32_to_cpu(chain[depth-1].key);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 1dbaebce54c6..c40f4442c5c3 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -655,16 +655,6 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
*/
ext4_clear_inode_state(inode, EXT4_STATE_EXT_MIGRATE);
}
-
- /*
- * Update reserved blocks/metadata blocks after successful
- * block allocation which had been deferred till now. We don't
- * support fallocate for non extent files. So we can update
- * reserve space here.
- */
- if ((retval > 0) &&
- (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE))
- ext4_da_update_reserve_space(inode, retval, 1);
}
if (retval > 0) {
--
2.25.1
1
0
Hello!
The openEuler Kernel SIG invites you to a Zoom meeting (auto-recorded) to be held at 2023-05-05 14:00.
Meeting subject: openEuler Kernel SIG biweekly meeting
Agenda:
1. Progress updates
2. Call for topics: everyone is welcome to propose topics (reply to this email or add them to the meeting board)
Meeting link: https://us06web.zoom.us/j/84858762199?pwd=WVE1eGJXcEp1K1FYcDM1U0k3L2Z3QT09
Meeting minutes: https://etherpad.openeuler.org/p/Kernel-meetings
Note: You are advised to change your participant name after joining the meeting; you may also use your gitee.com ID.
More information: https://openeuler.org/zh/ (Chinese) or https://openeuler.org/en/ (English)
1
0

[PATCH openEuler-1.0-LTS] mm: mem_reliable: Use zone_page_state to count free reliable pages
by Yongqiang Liu 28 Apr '23
by Yongqiang Liu 28 Apr '23
28 Apr '23
From: Ma Wupeng <mawupeng1(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6RKHX
CVE: NA
--------------------------------
commit ee97b61005df ("mm: Count mirrored pages in buddy system") used
vmstat events (PGALLOC/PGFREE) to count free reliable pages. However, the
same figure can be obtained by counting NR_FREE_PAGES for all non-movable
zones, which has better compatibility. Therefore, replace it.
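The replacement counter, as it appears in the diff below: instead of
maintaining a per-CPU counter at every alloc/free site, the number of free
reliable pages is derived on demand from the zone statistics:

	static inline long free_reliable_pages(void)
	{
		struct zone *zone;
		unsigned long cnt = 0;

		/* reliable memory lives in the non-movable zones */
		for_each_populated_zone(zone)
			if (zone_idx(zone) < ZONE_MOVABLE)
				cnt += zone_page_state(zone, NR_FREE_PAGES);

		return cnt;
	}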
Signed-off-by: Ma Wupeng <mawupeng1(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
include/linux/mem_reliable.h | 7 -------
mm/mem_reliable.c | 27 +++++++++++++--------------
mm/page_alloc.c | 4 ----
3 files changed, 13 insertions(+), 25 deletions(-)
diff --git a/include/linux/mem_reliable.h b/include/linux/mem_reliable.h
index aa3fe77c8a72..c0eff851bbe7 100644
--- a/include/linux/mem_reliable.h
+++ b/include/linux/mem_reliable.h
@@ -111,12 +111,6 @@ static inline void shmem_reliable_page_counter(struct page *page, int nr_page)
percpu_counter_add(&reliable_shmem_used_nr_page, nr_page);
}
-static inline void mem_reliable_buddy_counter(struct page *page, int nr_page)
-{
- if (page_reliable(page))
- this_cpu_add(nr_reliable_buddy_pages, nr_page);
-}
-
static inline bool mem_reliable_shmem_limit_check(void)
{
return percpu_counter_read_positive(&reliable_shmem_used_nr_page) <
@@ -168,7 +162,6 @@ static inline void shmem_reliable_page_counter(struct page *page, int nr_page)
static inline bool pagecache_reliable_is_enabled(void) { return false; }
static inline bool mem_reliable_status(void) { return false; }
-static inline void mem_reliable_buddy_counter(struct page *page, int nr_page) {}
static inline bool mem_reliable_shmem_limit_check(void) { return true; }
static inline void reliable_lru_add(enum lru_list lru,
struct page *page,
diff --git a/mm/mem_reliable.c b/mm/mem_reliable.c
index f8b8543d7933..524d276db72c 100644
--- a/mm/mem_reliable.c
+++ b/mm/mem_reliable.c
@@ -28,7 +28,6 @@ unsigned long task_reliable_limit = ULONG_MAX;
bool reliable_allow_fallback __read_mostly = true;
bool shmem_reliable __read_mostly = true;
struct percpu_counter reliable_shmem_used_nr_page __read_mostly;
-DEFINE_PER_CPU(long, nr_reliable_buddy_pages);
long shmem_reliable_nr_page = LONG_MAX;
bool pagecache_use_reliable_mem __read_mostly = true;
@@ -176,16 +175,21 @@ static unsigned long total_reliable_mem_sz(void)
return atomic_long_read(&total_reliable_mem);
}
-static unsigned long used_reliable_mem_sz(void)
+static inline long free_reliable_pages(void)
{
- unsigned long nr_page = 0;
- struct zone *z;
+ struct zone *zone;
+ unsigned long cnt = 0;
- for_each_populated_zone(z)
- if (zone_idx(z) < ZONE_MOVABLE)
- nr_page += zone_page_state(z, NR_FREE_PAGES);
+ for_each_populated_zone(zone)
+ if (zone_idx(zone) < ZONE_MOVABLE)
+ cnt += zone_page_state(zone, NR_FREE_PAGES);
- return total_reliable_mem_sz() - nr_page * PAGE_SIZE;
+ return cnt;
+}
+
+static unsigned long used_reliable_mem_sz(void)
+{
+ return total_reliable_mem_sz() - (free_reliable_pages() << PAGE_SHIFT);
}
static void show_val_kb(struct seq_file *m, const char *s, unsigned long num)
@@ -198,15 +202,10 @@ void reliable_report_meminfo(struct seq_file *m)
{
s64 nr_pagecache_pages = 0;
s64 nr_anon_pages = 0;
- long nr_buddy_pages = 0;
- int cpu;
if (!mem_reliable_is_enabled())
return;
- for_each_possible_cpu(cpu)
- nr_buddy_pages += per_cpu(nr_reliable_buddy_pages, cpu);
-
nr_anon_pages = percpu_counter_sum_positive(&anon_reliable_pages);
nr_pagecache_pages = percpu_counter_sum_positive(&pagecache_reliable_pages);
@@ -215,7 +214,7 @@ void reliable_report_meminfo(struct seq_file *m)
show_val_kb(m, "ReliableUsed: ",
used_reliable_mem_sz() >> PAGE_SHIFT);
show_val_kb(m, "ReliableTaskUsed: ", nr_anon_pages + nr_pagecache_pages);
- show_val_kb(m, "ReliableBuddyMem: ", nr_buddy_pages);
+ show_val_kb(m, "ReliableBuddyMem: ", free_reliable_pages());
if (shmem_reliable_is_enabled()) {
show_val_kb(m, "ReliableShmem: ",
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e722d73a3724..2084b912efa8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1340,7 +1340,6 @@ static void __free_pages_ok(struct page *page, unsigned int order,
migratetype = get_pfnblock_migratetype(page, pfn);
local_irq_save(flags);
__count_vm_events(PGFREE, 1 << order);
- mem_reliable_buddy_counter(page, 1 << order);
free_one_page(page_zone(page), page, pfn, order, migratetype,
fpi_flags);
local_irq_restore(flags);
@@ -2920,7 +2919,6 @@ static void free_unref_page_commit(struct page *page, unsigned long pfn)
migratetype = get_pcppage_migratetype(page);
__count_vm_event(PGFREE);
- mem_reliable_buddy_counter(page, 1);
/*
* We only track unmovable, reclaimable and movable on pcp lists.
@@ -3174,7 +3172,6 @@ static struct page *rmqueue_pcplist(struct zone *preferred_zone,
page = __rmqueue_pcplist(zone, migratetype, pcp, list);
if (page) {
__count_zid_vm_events(PGALLOC, page_zonenum(page), 1 << order);
- mem_reliable_buddy_counter(page, -(1 << order));
zone_statistics(preferred_zone, zone);
}
local_irq_restore(flags);
@@ -3223,7 +3220,6 @@ struct page *rmqueue(struct zone *preferred_zone,
get_pcppage_migratetype(page));
__count_zid_vm_events(PGALLOC, page_zonenum(page), 1 << order);
- mem_reliable_buddy_counter(page, -(1 << order));
zone_statistics(preferred_zone, zone);
local_irq_restore(flags);
--
2.25.1
1
0

[PATCH openEuler-1.0-LTS] writeback, cgroup: fix null-ptr-deref write in bdi_split_work_to_wbs
by Yongqiang Liu 28 Apr '23
by Yongqiang Liu 28 Apr '23
28 Apr '23
From: Baokun Li <libaokun1(a)huawei.com>
mainline inclusion
from mainline-v6.3-rc8
commit 1ba1199ec5747f475538c0d25a32804e5ba1dfde
category: bugfix
bugzilla: 188601, https://gitee.com/openeuler/kernel/issues/I6TNTC
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
KASAN report null-ptr-deref:
==================================================================
BUG: KASAN: null-ptr-deref in bdi_split_work_to_wbs+0x5c5/0x7b0
Write of size 8 at addr 0000000000000000 by task sync/943
CPU: 5 PID: 943 Comm: sync Tainted: 6.3.0-rc5-next-20230406-dirty #461
Call Trace:
<TASK>
dump_stack_lvl+0x7f/0xc0
print_report+0x2ba/0x340
kasan_report+0xc4/0x120
kasan_check_range+0x1b7/0x2e0
__kasan_check_write+0x24/0x40
bdi_split_work_to_wbs+0x5c5/0x7b0
sync_inodes_sb+0x195/0x630
sync_inodes_one_sb+0x3a/0x50
iterate_supers+0x106/0x1b0
ksys_sync+0x98/0x160
[...]
==================================================================
The race that causes the above issue is as follows:
cpu1                                   cpu2
-------------------------|-------------------------
inode_switch_wbs
 INIT_WORK(&isw->work, inode_switch_wbs_work_fn)
 queue_rcu_work(isw_wq, &isw->work)
 // queue_work async
 inode_switch_wbs_work_fn
  wb_put_many(old_wb, nr_switched)
   percpu_ref_put_many
    ref->data->release(ref)
    cgwb_release
     queue_work(cgwb_release_wq, &wb->release_work)
     // queue_work async
     &wb->release_work
     cgwb_release_workfn
                                       ksys_sync
                                        iterate_supers
                                         sync_inodes_one_sb
                                          sync_inodes_sb
                                           bdi_split_work_to_wbs
                                            kmalloc(sizeof(*work), GFP_ATOMIC)
                                            // alloc memory failed
      percpu_ref_exit
       ref->data = NULL
       kfree(data)
                                            wb_get(wb)
                                             percpu_ref_get(&wb->refcnt)
                                              percpu_ref_get_many(ref, 1)
                                               atomic_long_add(nr, &ref->data->count)
                                                atomic64_add(i, v)
                                               // trigger null-ptr-deref
bdi_split_work_to_wbs() traverses &bdi->wb_list to split work into all
wbs. If the allocation of new work fails, the on-stack fallback will be
used and the reference count of the current wb is increased afterwards.
If cgroup writeback membership switches occur before getting the reference
count and the current wb is released as old_wb, then calling wb_get() or
wb_put() will trigger the null pointer dereference above.
This issue was introduced in v4.3-rc7 (see fix tag1). Both
sync_inodes_sb() and __writeback_inodes_sb_nr() calls to
bdi_split_work_to_wbs() can trigger this issue. For scenarios called via
sync_inodes_sb(), originally commit 7fc5854f8c6e ("writeback: synchronize
sync(2) against cgroup writeback membership switches") reduced the
possibility of the issue by adding wb_switch_rwsem, but in v5.14-rc1 (see
fix tag2) removed the "inode_io_list_del_locked(inode, old_wb)" from
inode_switch_wbs_work_fn() so that wb->state contains WB_has_dirty_io,
thus old_wb is not skipped when traversing wbs in bdi_split_work_to_wbs(),
and the issue becomes easily reproducible again.
To solve this problem, percpu_ref_exit() is called under RCU protection to
avoid race between cgwb_release_workfn() and bdi_split_work_to_wbs().
Moreover, replace wb_get() with wb_tryget() in bdi_split_work_to_wbs(),
and skip the current wb if wb_tryget() fails because the wb has already
been shutdown.
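The ordering the fix enforces, sketched (mirrors the diff below):
percpu_ref_exit() is deferred past an RCU grace period so that a traverser
holding rcu_read_lock() can never observe ref->data == NULL:

	static void cgwb_free_rcu(struct rcu_head *rcu_head)
	{
		struct bdi_writeback *wb = container_of(rcu_head,
						struct bdi_writeback, rcu);

		/* safe now: every rcu_read_lock() section that could
		 * still see this wb has finished
		 */
		percpu_ref_exit(&wb->refcnt);
		kfree(wb);
	}

cgwb_release_workfn() then ends with call_rcu(&wb->rcu, cgwb_free_rcu)
instead of running percpu_ref_exit() synchronously and freeing the wb with
kfree_rcu(wb, rcu).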
Link: https://lkml.kernel.org/r/20230410130826.1492525-1-libaokun1@huawei.com
Fixes: b817525a4a80 ("writeback: bdi_writeback iteration must not skip dying ones")
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Acked-by: Tejun Heo <tj(a)kernel.org>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: Andreas Dilger <adilger.kernel(a)dilger.ca>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Dennis Zhou <dennis(a)kernel.org>
Cc: Hou Tao <houtao1(a)huawei.com>
Cc: yangerkun <yangerkun(a)huawei.com>
Cc: Zhang Yi <yi.zhang(a)huawei.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Conflicts:
mm/backing-dev.c
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Yang Erkun <yangerkun(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
fs/fs-writeback.c | 17 ++++++++++-------
mm/backing-dev.c | 12 ++++++++++--
2 files changed, 20 insertions(+), 9 deletions(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 23a632f02839..dd53d00a2583 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -884,6 +884,16 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
continue;
}
+ /*
+ * If wb_tryget fails, the wb has been shutdown, skip it.
+ *
+ * Pin @wb so that it stays on @bdi->wb_list. This allows
+ * continuing iteration from @wb after dropping and
+ * regrabbing rcu read lock.
+ */
+ if (!wb_tryget(wb))
+ continue;
+
/* alloc failed, execute synchronously using on-stack fallback */
work = &fallback_work;
*work = *base_work;
@@ -892,13 +902,6 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
work->done = &fallback_work_done;
wb_queue_work(wb, work);
-
- /*
- * Pin @wb so that it stays on @bdi->wb_list. This allows
- * continuing iteration from @wb after dropping and
- * regrabbing rcu read lock.
- */
- wb_get(wb);
last_wb = wb;
rcu_read_unlock();
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index f80d5c2ff420..c5caa7e91935 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -487,6 +487,15 @@ void wb_congested_put(struct bdi_writeback_congested *congested)
kfree(congested);
}
+static void cgwb_free_rcu(struct rcu_head *rcu_head)
+{
+ struct bdi_writeback *wb = container_of(rcu_head,
+ struct bdi_writeback, rcu);
+
+ percpu_ref_exit(&wb->refcnt);
+ kfree(wb);
+}
+
static void cgwb_release_workfn(struct work_struct *work)
{
struct bdi_writeback *wb = container_of(work, struct bdi_writeback,
@@ -504,9 +513,8 @@ static void cgwb_release_workfn(struct work_struct *work)
blkcg_cgwb_put(blkcg);
fprop_local_destroy_percpu(&wb->memcg_completions);
- percpu_ref_exit(&wb->refcnt);
wb_exit(wb);
- kfree_rcu(wb, rcu);
+ call_rcu(&wb->rcu, cgwb_free_rcu);
}
static void cgwb_release(struct percpu_ref *refcnt)
--
2.25.1
1
0

[PATCH openEuler-1.0-LTS 1/6] vfs: add rcu argument to ->get_acl() callback
by Yongqiang Liu 28 Apr '23
by Yongqiang Liu 28 Apr '23
28 Apr '23
From: Miklos Szeredi <mszeredi(a)redhat.com>
mainline inclusion
from mainline-v5.15-rc1
commit 0cad6246621b5887d5b33fea84219d2a71f2f99a
category: perf
bugzilla: https://gitee.com/openeuler/kernel/issues/I6ZCW0
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Add an rcu argument to the ->get_acl() callback to allow
get_cached_acl_rcu() to call the ->get_acl() method in the next patch.
Signed-off-by: Miklos Szeredi <mszeredi(a)redhat.com>
[chengzhihao: rename get_acl to get_acl2 to prevent KABI changes, and
only backport(realize) overlayfs]
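A sketch of the resulting dispatch in get_acl() (mirroring the
fs/posix_acl.c hunk below): the legacy ->get_acl keeps priority, so only
filesystems that opt into the new hook ever see the rcu flag:

	/* KABI-preserving dispatch: prefer the old hook if present */
	if (inode->i_op->get_acl)
		acl = inode->i_op->get_acl(inode, type);
	else
		acl = inode->i_op->get_acl2(inode, type, false);	/* non-rcu path */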
Conflicts:
fs/overlayfs/dir.c
fs/overlayfs/inode.c
fs/overlayfs/overlayfs.h
fs/posix_acl.c
include/linux/fs.h
Signed-off-by: Zhihao Cheng <chengzhihao1(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
fs/overlayfs/dir.c | 2 +-
fs/overlayfs/inode.c | 9 ++++++---
fs/overlayfs/overlayfs.h | 2 +-
fs/posix_acl.c | 7 +++++--
include/linux/fs.h | 1 +
5 files changed, 14 insertions(+), 7 deletions(-)
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 8570e755a392..cdbb916aa636 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -1292,6 +1292,6 @@ const struct inode_operations ovl_dir_inode_operations = {
.permission = ovl_permission,
.getattr = ovl_getattr,
.listxattr = ovl_listxattr,
- .get_acl = ovl_get_acl,
+ .get_acl2 = ovl_get_acl,
.update_time = ovl_update_time,
};
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 3eabe457a5d5..3aa33c18cb80 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -421,12 +421,15 @@ ssize_t ovl_listxattr(struct dentry *dentry, char *list, size_t size)
return res;
}
-struct posix_acl *ovl_get_acl(struct inode *inode, int type)
+struct posix_acl *ovl_get_acl(struct inode *inode, int type, bool rcu)
{
struct inode *realinode = ovl_inode_real(inode);
const struct cred *old_cred;
struct posix_acl *acl;
+ if (rcu)
+ return ERR_PTR(-ECHILD);
+
if (!IS_ENABLED(CONFIG_FS_POSIX_ACL) || !IS_POSIXACL(realinode))
return NULL;
@@ -480,7 +483,7 @@ static const struct inode_operations ovl_file_inode_operations = {
.permission = ovl_permission,
.getattr = ovl_getattr,
.listxattr = ovl_listxattr,
- .get_acl = ovl_get_acl,
+ .get_acl2 = ovl_get_acl,
.update_time = ovl_update_time,
.fiemap = ovl_fiemap,
};
@@ -498,7 +501,7 @@ static const struct inode_operations ovl_special_inode_operations = {
.permission = ovl_permission,
.getattr = ovl_getattr,
.listxattr = ovl_listxattr,
- .get_acl = ovl_get_acl,
+ .get_acl2 = ovl_get_acl,
.update_time = ovl_update_time,
};
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 76e096ad2ef9..38fada5fa552 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -358,7 +358,7 @@ int ovl_xattr_set(struct dentry *dentry, struct inode *inode, const char *name,
int ovl_xattr_get(struct dentry *dentry, struct inode *inode, const char *name,
void *value, size_t size);
ssize_t ovl_listxattr(struct dentry *dentry, char *list, size_t size);
-struct posix_acl *ovl_get_acl(struct inode *inode, int type);
+struct posix_acl *ovl_get_acl(struct inode *inode, int type, bool rcu);
int ovl_update_time(struct inode *inode, struct timespec64 *ts, int flags);
bool ovl_is_private_xattr(const char *name);
diff --git a/fs/posix_acl.c b/fs/posix_acl.c
index 2fd0fde16fe1..8e3fe9cbb28c 100644
--- a/fs/posix_acl.c
+++ b/fs/posix_acl.c
@@ -133,11 +133,14 @@ struct posix_acl *get_acl(struct inode *inode, int type)
* If the filesystem doesn't have a get_acl() function at all, we'll
* just create the negative cache entry.
*/
- if (!inode->i_op->get_acl) {
+ if (!inode->i_op->get_acl && !inode->i_op->get_acl2) {
set_cached_acl(inode, type, NULL);
return NULL;
}
- acl = inode->i_op->get_acl(inode, type);
+ if (inode->i_op->get_acl)
+ acl = inode->i_op->get_acl(inode, type);
+ else
+ acl = inode->i_op->get_acl2(inode, type, false);
if (IS_ERR(acl)) {
/*
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6363c0a67af5..bd3e09a3c9a9 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1838,6 +1838,7 @@ struct inode_operations {
const char * (*get_link) (struct dentry *, struct inode *, struct delayed_call *);
int (*permission) (struct inode *, int);
struct posix_acl * (*get_acl)(struct inode *, int);
+ struct posix_acl * (*get_acl2)(struct inode *, int, bool);
int (*readlink) (struct dentry *, char __user *,int);
--
2.25.1
1
6
From: xiabing <xiabing12(a)h-partners.com>
The registers are modified to fix unstable links and link bit errors that
occur when the screen is refreshed.
Yihang Li (1):
scsi: hisi_sas: Configure the initialization registers according to
HBA model
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
--
2.30.0
1
1

[PATCH OLK-5.10 1/1] scsi: hisi_sas: Configure the initialization registers according to HBA model
by Yihang Li 28 Apr '23
by Yihang Li 28 Apr '23
28 Apr '23
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6ZGK8
CVE: NA
----------------------------------------------------------------------
We use init_reg_v3_hw() to set some registers. For the latest HBA devices,
some of these HW registers are set through firmware instead, so the
different HBA models are distinguished by pci_dev->revision.
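The gating pattern this patch applies throughout init_reg_v3_hw() (a sketch
with one representative register from the diff below):

	struct pci_dev *pdev = hisi_hba->pci_dev;

	/* Older HBAs (revision < 0x30) still need the driver to program
	 * these registers; on newer revisions the firmware sets them.
	 */
	if (pdev->revision < 0x30)
		hisi_sas_write32(hisi_hba, SAS_AXI_USER3, 0);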
Signed-off-by: Yihang Li <liyihang9(a)huawei.com>
Reviewed-by: Xiang Chen <chenxiang66(a)hisilicon.com>
Signed-off-by: xiabing <xiabing12(a)h-partners.com>
---
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index eeb40ad079c8..6e40c4c06650 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -617,12 +617,12 @@ static void interrupt_enable_v3_hw(struct hisi_hba *hisi_hba)
static void init_reg_v3_hw(struct hisi_hba *hisi_hba)
{
+ struct pci_dev *pdev = hisi_hba->pci_dev;
int i, j;
/* Global registers init */
hisi_sas_write32(hisi_hba, DLVRY_QUEUE_ENABLE,
(u32)((1ULL << hisi_hba->queue_count) - 1));
- hisi_sas_write32(hisi_hba, SAS_AXI_USER3, 0);
hisi_sas_write32(hisi_hba, CFG_MAX_TAG, 0xfff0400);
hisi_sas_write32(hisi_hba, HGC_SAS_TXFAIL_RETRY_CTRL, 0x108);
hisi_sas_write32(hisi_hba, CFG_AGING_TIME, 0x1);
@@ -642,6 +642,9 @@ static void init_reg_v3_hw(struct hisi_hba *hisi_hba)
hisi_sas_write32(hisi_hba, ARQOS_ARCACHE_CFG, 0xf0f0);
hisi_sas_write32(hisi_hba, HYPER_STREAM_ID_EN_CFG, 1);
+ if (pdev->revision < 0x30)
+ hisi_sas_write32(hisi_hba, SAS_AXI_USER3, 0);
+
interrupt_enable_v3_hw(hisi_hba);
for (i = 0; i < hisi_hba->n_phy; i++) {
enum sas_linkrate max;
@@ -659,7 +662,6 @@ static void init_reg_v3_hw(struct hisi_hba *hisi_hba)
prog_phy_link_rate |= hisi_sas_get_prog_phy_linkrate_mask(max);
hisi_sas_phy_write32(hisi_hba, i, PROG_PHY_LINK_RATE,
prog_phy_link_rate);
- hisi_sas_phy_write32(hisi_hba, i, SERDES_CFG, 0xffc00);
hisi_sas_phy_write32(hisi_hba, i, SAS_RX_TRAIN_TIMER, 0x13e80);
hisi_sas_phy_write32(hisi_hba, i, CHL_INT0, 0xffffffff);
hisi_sas_phy_write32(hisi_hba, i, CHL_INT1, 0xffffffff);
@@ -670,13 +672,18 @@ static void init_reg_v3_hw(struct hisi_hba *hisi_hba)
hisi_sas_phy_write32(hisi_hba, i, PHYCTRL_OOB_RESTART_MSK, 0x1);
hisi_sas_phy_write32(hisi_hba, i, STP_LINK_TIMER, 0x7f7a120);
hisi_sas_phy_write32(hisi_hba, i, CON_CFG_DRIVER, 0x2a0a01);
- hisi_sas_phy_write32(hisi_hba, i, SAS_SSP_CON_TIMER_CFG, 0x32);
hisi_sas_phy_write32(hisi_hba, i, SAS_EC_INT_COAL_TIME,
0x30f4240);
- /* used for 12G negotiate */
- hisi_sas_phy_write32(hisi_hba, i, COARSETUNE_TIME, 0x1e);
hisi_sas_phy_write32(hisi_hba, i, AIP_LIMIT, 0x2ffff);
+ /* set value through firmware for the latest HBA */
+ if (pdev->revision < 0x30) {
+ hisi_sas_phy_write32(hisi_hba, i, SAS_SSP_CON_TIMER_CFG, 0x32);
+ hisi_sas_phy_write32(hisi_hba, i, SERDES_CFG, 0xffc00);
+ /* used for 12G negotiate */
+ hisi_sas_phy_write32(hisi_hba, i, COARSETUNE_TIME, 0x1e);
+ }
+
/* get default FFE configuration for BIST */
for (j = 0; j < FFE_CFG_MAX; j++) {
u32 val = hisi_sas_phy_read32(hisi_hba, i,
--
2.30.0
1
0

28 Apr '23
From: Guan Jing <guanjing6(a)huawei.com>
Guan Jing (3):
sched/fair: Start tracking qos_offline tasks count in cfs_rq
sched/fair: Introduce QOS_SMT_EXPELL priority reversion mechanism
sched/fair: Add cmdline nosmtexpell
kernel/sched/fair.c | 150 +++++++++++++++++++++++++++++++++++++------
kernel/sched/sched.h | 20 ++++++
2 files changed, 152 insertions(+), 18 deletions(-)
--
2.17.1
1
3
From: Juan Zhou <zhoujuan51(a)h-partners.com>
This group of patches fix bonding-related errors.
Junxian Huang (16):
RDMA/hns: Move bond_work from hns_roce_dev to hns_roce_bond_group
RDMA/hns: Apply XArray for Bond ID allocation
RDMA/hns: Delete a useless assignment to bond_state
RDMA/hns: Initial value assignment cleanup for RoCE Bonding variables
RDMA/hns: Remove the struct member 'bond_grp' from hns_roce_dev
RDMA/hns: Simplify the slave uninit logic of RoCE bonding operations
RDMA/hns: Fix the driver uninit order during bond setting
RDMA/hns: Fix the counting error of slave number
RDMA/hns: Support reset recovery for RoCE bonding
RDMA/hns: Rename hns_roce_bond_info_record() to make sense
RDMA/hns: Fix the repetitive workqueue mission in RoCE Bonding
RDMA/hns: Fix the counting error of bonding with more than 2 slaves
RDMA/hns: Get real-time port state of bonding slave
RDMA/hns: Set IB port state depending on upper device for RoCE bonding
RDMA/hns: Support dispatching IB event for RoCE bonding
RDMA/hns: Fix a missing constraint for slave num in RoCE Bonding
drivers/infiniband/hw/hns/hns_roce_bond.c | 540 +++++++++++---------
drivers/infiniband/hw/hns/hns_roce_bond.h | 15 +-
drivers/infiniband/hw/hns/hns_roce_device.h | 5 +-
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 79 ++-
drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 10 +-
drivers/infiniband/hw/hns/hns_roce_main.c | 58 ++-
6 files changed, 421 insertions(+), 286 deletions(-)
--
2.30.0
1
17
From: Juan Zhou <zhoujuan51(a)h-partners.com>
Luoyouming (2):
RDMA/hns: Fix the inconsistency between the rq inline bit and the
community
RDMA/hns: Fix the compatibility flag problem
drivers/infiniband/hw/hns/hns_roce_device.h | 4 +---
drivers/infiniband/hw/hns/hns_roce_main.c | 6 ++++--
2 files changed, 5 insertions(+), 5 deletions(-)
--
2.30.0
1
2

27 Apr '23
driver inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I6BSMN
-----------------------------------------------------------------------
In the ROH distributed scenario, the EID is allocated via DHCP. The driver
needs to convert the original MAC address to EID format and update the
destination MAC, chaddr, and client ID (if present) when transmitting DHCP
packets. Meanwhile, the chaddr field should follow the source MAC address
so that the DHCP server replies to the right client. Because the DHCP
packet payload changes, the L4 checksum must be recalculated as well.
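Since the payload is rewritten in software, the UDP checksum has to be
redone by hand; a sketch of the recalculation used by
hns3_dhcp_cal_l4_csum() in the diff below:

	/* zero the old checksum, sum the UDP header + payload, then fold
	 * in the IPv4 pseudo-header (addresses, length, protocol)
	 */
	l4.udp->check = 0;
	csum = csum_partial(l4.udp, ntohs(l4.udp->len), 0);
	l4.udp->check = csum_tcpudp_magic(l3.v4->saddr, l3.v4->daddr,
					  skb->len - offset, IPPROTO_UDP, csum);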
Signed-off-by: Jian Shen <shenjian15(a)huawei.com>
Signed-off-by: Ke Chen <chenke54(a)huawei.com>
---
.../net/ethernet/hisilicon/hns3/hns3_enet.c | 174 +++++++++++++++++-
.../net/ethernet/hisilicon/hns3/hns3_enet.h | 50 +++++
.../hisilicon/hns3/hns3pf/hclge_main.c | 16 ++
3 files changed, 235 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index cf79cd69c..95cd61b1f 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -1165,6 +1165,142 @@ static void hns3_tx_spare_reclaim_cb(struct hns3_enet_ring *ring,
}
}
+static struct hns3_dhcp_packet *hns3_get_dhcp_packet(struct sk_buff *skb,
+ int *dhcp_len)
+{
+ struct hns3_dhcp_packet *dhcp;
+ union l4_hdr_info l4;
+ int l4_payload_len;
+
+ l4.hdr = skb_transport_header(skb);
+ if (l4.udp->dest != htons(HNS3_DHCP_CLIENT_PORT) ||
+ l4.udp->source != htons(HNS3_DHCP_SERVER_PORT))
+ return NULL;
+
+ dhcp = (struct hns3_dhcp_packet *)(l4.hdr + sizeof(struct udphdr));
+ l4_payload_len = ntohs(l4.udp->len) - sizeof(struct udphdr);
+ if (l4_payload_len < offsetof(struct hns3_dhcp_packet, options) ||
+ dhcp->hlen != ETH_ALEN ||
+ dhcp->cookie != htonl(HNS3_DHCP_MAGIC))
+ return NULL;
+
+ *dhcp_len = l4_payload_len;
+ return dhcp;
+}
+
+static u8 *hns3_dhcp_option_scan(struct hns3_dhcp_packet *packet,
+ struct hns3_dhcp_opt_state *opt_state)
+{
+ int opt_len;
+ u8 *cur_opt;
+
+ /* option bytes: [code][len][data0~data[len-1]] */
+ while (opt_state->rem > 0) {
+ switch (opt_state->opt_ptr[DHCP_OPT_CODE]) {
+ /* option padding and end have no len and data byte. */
+ case DHCP_OPT_PADDING:
+ opt_state->rem--;
+ opt_state->opt_ptr++;
+ break;
+ case DHCP_OPT_END:
+ if (DHCP_OVERLOAD_USE_FILE(opt_state->overload_flag)) {
+ opt_state->overload_flag |=
+ DHCP_OVERLOAD_FILE_USED;
+ opt_state->opt_ptr = packet->file;
+ opt_state->rem = sizeof(packet->file);
+ break;
+ }
+ if (DHCP_OVERLOAD_USE_SNAME(opt_state->overload_flag)) {
+ opt_state->overload_flag |=
+ DHCP_OVERLOAD_SNAME_USED;
+ opt_state->opt_ptr = packet->sname;
+ opt_state->rem = sizeof(packet->sname);
+ break;
+ }
+ return NULL;
+ default:
+ if (opt_state->rem <= DHCP_OPT_LEN)
+ return NULL;
+ /* opt_len includes code, len and data bytes */
+ opt_len = opt_state->opt_ptr[DHCP_OPT_LEN] +
+ DHCP_OPT_DATA;
+ cur_opt = opt_state->opt_ptr;
+ if (opt_state->rem < opt_len)
+ return NULL;
+
+ opt_state->opt_ptr += opt_len;
+ opt_state->rem -= opt_len;
+ if (cur_opt[DHCP_OPT_CODE] == DHCP_OPT_OVERLOAD) {
+ opt_state->overload_flag |=
+ cur_opt[DHCP_OPT_DATA];
+ break;
+ }
+ return cur_opt;
+ }
+ }
+
+ return NULL;
+}
+
+static void hns3_dhcp_update_option61(struct hns3_nic_priv *priv,
+ struct hns3_dhcp_packet *packet,
+ int dhcp_len)
+{
+ struct hns3_dhcp_opt_state opt_state;
+ u8 *cur_opt;
+
+ opt_state.opt_ptr = packet->options;
+ opt_state.rem = dhcp_len - offsetof(struct hns3_dhcp_packet, options);
+ opt_state.overload_flag = 0;
+
+ cur_opt = hns3_dhcp_option_scan(packet, &opt_state);
+ while (cur_opt) {
+ if (cur_opt[DHCP_OPT_CODE] != DHCP_OPT_CLIENT_ID) {
+ cur_opt = hns3_dhcp_option_scan(packet, &opt_state);
+ continue;
+ }
+ if (cur_opt[DHCP_OPT_LEN] > ETH_ALEN)
+ ether_addr_copy(&cur_opt[DHCP_CLIENT_ID_MAC_OFT],
+ priv->roh_perm_mac);
+ break;
+ }
+}
+
+static void hns3_dhcp_cal_l4_csum(struct sk_buff *skb)
+{
+ union l3_hdr_info l3;
+ union l4_hdr_info l4;
+ __wsum csum = 0;
+ int offset;
+
+ if (skb->ip_summed == CHECKSUM_PARTIAL)
+ return;
+
+ l3.hdr = skb_network_header(skb);
+ l4.hdr = skb_transport_header(skb);
+ offset = skb_transport_offset(skb);
+ l4.udp->check = 0;
+ csum = csum_partial(l4.udp, ntohs(l4.udp->len), 0);
+ l4.udp->check = csum_tcpudp_magic(l3.v4->saddr, l3.v4->daddr,
+ skb->len - offset, IPPROTO_UDP, csum);
+}
+
+static void hns3_dhcp_packet_convert(struct hns3_nic_priv *priv,
+ struct sk_buff *skb,
+ struct hns3_dhcp_packet *dhcp,
+ int dhcp_len)
+{
+ struct ethhdr *l2hdr = eth_hdr(skb);
+
+ if (!dhcp)
+ return;
+
+ ether_addr_copy(dhcp->chaddr, l2hdr->h_source);
+ hns3_dhcp_update_option61(priv, dhcp, dhcp_len);
+ /* for l4 payload changed, need to re-calculate the csum */
+ hns3_dhcp_cal_l4_csum(skb);
+}
+
static int hns3_set_tso(struct sk_buff *skb, u32 *paylen_fdop_ol4cs,
u16 *mss, u32 *type_cs_vlan_tso, u32 *send_bytes)
{
@@ -1716,7 +1852,20 @@ static int hns3_handle_csum_partial(struct hns3_enet_ring *ring,
return 0;
}
-static int hns3_fill_skb_desc(struct hns3_enet_ring *ring,
+static bool hns3_roh_check_udpv4(struct sk_buff *skb)
+{
+ union l3_hdr_info l3;
+
+ l3.hdr = skb_network_header(skb);
+ if (skb->protocol != htons(ETH_P_IP) ||
+ l3.v4->version != IP_VERSION_IPV4)
+ return false;
+
+ return l3.v4->protocol == IPPROTO_UDP;
+}
+
+static int hns3_fill_skb_desc(struct hns3_nic_priv *priv,
+ struct hns3_enet_ring *ring,
struct sk_buff *skb, struct hns3_desc *desc,
struct hns3_desc_cb *desc_cb)
{
@@ -1741,6 +1890,15 @@ static int hns3_fill_skb_desc(struct hns3_enet_ring *ring,
hnae3_set_field(param.paylen_fdop_ol4cs, HNS3_TXD_FD_OP_M,
HNS3_TXD_FD_OP_S, fd_op);
+ if (hnae3_check_roh_mac_type(priv->ae_handle) &&
+ hns3_roh_check_udpv4(skb)) {
+ struct hns3_dhcp_packet *dhcp;
+ int dhcp_len;
+
+ dhcp = hns3_get_dhcp_packet(skb, &dhcp_len);
+ hns3_dhcp_packet_convert(priv, skb, dhcp, dhcp_len);
+ }
+
/* Set txbd */
desc->tx.ol_type_vlan_len_msec =
cpu_to_le32(param.ol_type_vlan_len_msec);
@@ -2338,15 +2496,16 @@ static int hns3_handle_desc_filling(struct hns3_enet_ring *ring,
return hns3_fill_skb_to_desc(ring, skb, DESC_TYPE_SKB);
}
-static int hns3_handle_skb_desc(struct hns3_enet_ring *ring,
+static int hns3_handle_skb_desc(struct hns3_nic_priv *priv,
+ struct hns3_enet_ring *ring,
struct sk_buff *skb,
struct hns3_desc_cb *desc_cb,
int next_to_use_head)
{
int ret;
- ret = hns3_fill_skb_desc(ring, skb, &ring->desc[ring->next_to_use],
- desc_cb);
+ ret = hns3_fill_skb_desc(priv, ring, skb,
+ &ring->desc[ring->next_to_use], desc_cb);
if (unlikely(ret < 0))
goto fill_err;
@@ -2395,7 +2554,7 @@ netdev_tx_t hns3_nic_net_xmit(struct sk_buff *skb, struct net_device *netdev)
goto out_err_tx_ok;
}
- ret = hns3_handle_skb_desc(ring, skb, desc_cb, ring->next_to_use);
+ ret = hns3_handle_skb_desc(priv, ring, skb, desc_cb, ring->next_to_use);
if (unlikely(ret <= 0))
goto out_err_tx_ok;
@@ -5226,6 +5385,10 @@ static int hns3_init_mac_addr(struct net_device *netdev)
return 0;
}
+ if (hnae3_check_roh_mac_type(h) &&
+ is_zero_ether_addr(priv->roh_perm_mac))
+ ether_addr_copy(priv->roh_perm_mac, netdev->dev_addr);
+
if (h->ae_algo->ops->set_mac_addr)
ret = h->ae_algo->ops->set_mac_addr(h, netdev->dev_addr, true);
@@ -5377,6 +5540,7 @@ static int hns3_client_init(struct hnae3_handle *handle)
priv->tx_timeout_count = 0;
priv->max_non_tso_bd_num = ae_dev->dev_specs.max_non_tso_bd_num;
set_bit(HNS3_NIC_STATE_DOWN, &priv->state);
+ eth_zero_addr(priv->roh_perm_mac);
handle->msg_enable = netif_msg_init(debug, DEFAULT_MSG_LEVEL);
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
index ccfd38b00..85c352fff 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
@@ -604,6 +604,56 @@ struct hns3_nic_priv {
struct hns3_enet_coalesce rx_coal;
u32 tx_copybreak;
u32 rx_copybreak;
+ u8 roh_perm_mac[ETH_ALEN];
+};
+
+#define HNS3_DHCP_SERVER_PORT 68
+#define HNS3_DHCP_CLIENT_PORT 67
+#define HNS3_DHCP_MAGIC 0x63825363
+#define DHCP_OPT_CODE 0
+#define DHCP_OPT_LEN 1
+#define DHCP_OPT_DATA 2
+#define DHCP_CLIENT_ID_LEN 7
+#define DHCP_CLIENT_ID_MAC_OFT 3
+#define DHCP_OVERLOAD_FILE 0x1
+#define DHCP_OVERLOAD_SNAME 0x2
+#define DHCP_OVERLOAD_FILE_USED 0x101
+#define DHCP_OVERLOAD_SNAME_USED 0x202
+#define DHCP_OVERLOAD_USE_FILE(x) \
+ (((x) & DHCP_OVERLOAD_FILE_USED) == DHCP_OVERLOAD_FILE)
+#define DHCP_OVERLOAD_USE_SNAME(x) \
+ (((x) & DHCP_OVERLOAD_SNAME_USED) == DHCP_OVERLOAD_SNAME)
+
+enum DHCP_OPTION_CODES {
+ DHCP_OPT_PADDING = 0,
+ DHCP_OPT_OVERLOAD = 52,
+ DHCP_OPT_CLIENT_ID = 61,
+ DHCP_OPT_END = 255
+};
+
+struct hns3_dhcp_packet {
+ u8 op;
+ u8 htype;
+ u8 hlen;
+ u8 hops;
+ u32 xid;
+ u16 secs;
+ u16 flags;
+ u32 ciaddr;
+ u32 yiaddr;
+ u32 siaddr_nip;
+ u32 gateway_nip;
+ u8 chaddr[16]; /* link-layer client hardware address (MAC) */
+ u8 sname[64];
+ u8 file[128];
+ u32 cookie; /* DHCP magic bytes: 0x63825363 */
+ u8 options[312];
+};
+
+struct hns3_dhcp_opt_state {
+ u8 *opt_ptr; /* refer to current option item */
+ int rem; /* remain bytes in options */
+ u32 overload_flag; /* whether use file and sname field as options */
};
union l3_hdr_info {
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index eea175484..947ffde02 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -2866,12 +2866,28 @@ static void hclge_get_fec(struct hnae3_handle *handle, u8 *fec_ability,
if (fec_mode)
*fec_mode = mac->fec_mode;
}
+
+static void hclge_roh_convert_mac_addr(struct hclge_dev *hdev)
+{
+#define HCLGE_ROH_EID_MASK_BYTE 3
+
+ struct hclge_vport *vport = &hdev->vport[0];
+ struct hnae3_handle *handle = &vport->nic;
+
+ if (hnae3_check_roh_mac_type(handle)) {
+ if (!is_valid_ether_addr(hdev->hw.mac.mac_addr))
+ random_ether_addr(hdev->hw.mac.mac_addr);
+ memset(hdev->hw.mac.mac_addr, 0, HCLGE_ROH_EID_MASK_BYTE);
+ }
+}
+
static int hclge_mac_init(struct hclge_dev *hdev)
{
struct hclge_mac *mac = &hdev->hw.mac;
int ret;
hclge_mac_type_init(hdev);
+ hclge_roh_convert_mac_addr(hdev);
hdev->support_sfp_query = true;
hdev->hw.mac.duplex = HCLGE_MAC_FULL;
--
2.30.0

[PATCH openEuler-5.10-LTS 01/20] io_uring: add missing lock in io_get_file_fixed
by Jialin Zhang 26 Apr '23
From: Bing-Jhong Billy Jheng <billy(a)starlabs.sg>
stable inclusion
from stable-v5.10.171
commit 08681391b84da27133deefaaddefd0acfa90c2be
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6V7V1
CVE: CVE-2023-1872
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
io_get_file_fixed will access io_uring's context. Lock it if it is
invoked unlocked (e.g. via io-wq) to avoid a race condition with fixed
files getting unregistered.
No single upstream patch exists for this issue, it was fixed as part
of the file assignment changes that went into the 5.18 cycle.
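The fix relies on io_uring's conditional ring-lock helpers: the ring
mutex is taken only when the caller does not already hold it (the
inline submission path, flagged by IO_URING_F_NONBLOCK, already holds
it). A rough user-space analogue of that idiom, with pthreads standing
in for the kernel primitives and all names illustrative:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t ring_lock = PTHREAD_MUTEX_INITIALIZER;

/* Rough analogue of io_ring_submit_lock()/unlock(): take the ring
 * lock only when the caller does not already hold it. */
static void submit_lock(bool needs_lock)
{
    if (needs_lock)
        pthread_mutex_lock(&ring_lock);
}

static void submit_unlock(bool needs_lock)
{
    if (needs_lock)
        pthread_mutex_unlock(&ring_lock);
}

/* The fixed-file table can be unregistered concurrently, so the
 * lookup must run with the ring lock held -- the race this patch
 * closes. */
static void *file_get_fixed(void *table[], int nr, int fd, bool needs_lock)
{
    void *file = NULL;

    submit_lock(needs_lock);
    if (fd >= 0 && fd < nr)
        file = table[fd];
    submit_unlock(needs_lock);
    return file;
}

int main(void)
{
    void *table[4] = { "f0", "f1", NULL, NULL };

    /* e.g. an io-wq worker, which does not hold the ring lock */
    printf("fd 1 -> %s\n", (char *)file_get_fixed(table, 4, 1, true));
    return 0;
}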
Signed-off-by: Jheng, Bing-Jhong Billy <billy(a)starlabs.sg>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: ZhaoLong Wang <wangzhaolong1(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
io_uring/io_uring.c | 25 ++++++++++++++++---------
1 file changed, 16 insertions(+), 9 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index a200750b536c..98cf1c413691 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1090,7 +1090,8 @@ static int __io_register_rsrc_update(struct io_ring_ctx *ctx, unsigned type,
unsigned nr_args);
static void io_clean_op(struct io_kiocb *req);
static struct file *io_file_get(struct io_ring_ctx *ctx,
- struct io_kiocb *req, int fd, bool fixed);
+ struct io_kiocb *req, int fd, bool fixed,
+ unsigned int issue_flags);
static void __io_queue_sqe(struct io_kiocb *req);
static void io_rsrc_put_work(struct work_struct *work);
@@ -3914,7 +3915,7 @@ static int io_tee(struct io_kiocb *req, unsigned int issue_flags)
return -EAGAIN;
in = io_file_get(req->ctx, req, sp->splice_fd_in,
- (sp->flags & SPLICE_F_FD_IN_FIXED));
+ (sp->flags & SPLICE_F_FD_IN_FIXED), issue_flags);
if (!in) {
ret = -EBADF;
goto done;
@@ -3954,7 +3955,7 @@ static int io_splice(struct io_kiocb *req, unsigned int issue_flags)
return -EAGAIN;
in = io_file_get(req->ctx, req, sp->splice_fd_in,
- (sp->flags & SPLICE_F_FD_IN_FIXED));
+ (sp->flags & SPLICE_F_FD_IN_FIXED), issue_flags);
if (!in) {
ret = -EBADF;
goto done;
@@ -6742,13 +6743,16 @@ static void io_fixed_file_set(struct io_fixed_file *file_slot, struct file *file
}
static inline struct file *io_file_get_fixed(struct io_ring_ctx *ctx,
- struct io_kiocb *req, int fd)
+ struct io_kiocb *req, int fd,
+ unsigned int issue_flags)
{
- struct file *file;
+ struct file *file = NULL;
unsigned long file_ptr;
+ io_ring_submit_lock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
+
if (unlikely((unsigned int)fd >= ctx->nr_user_files))
- return NULL;
+ goto out;
fd = array_index_nospec(fd, ctx->nr_user_files);
file_ptr = io_fixed_file_slot(&ctx->file_table, fd)->file_ptr;
file = (struct file *) (file_ptr & FFS_MASK);
@@ -6756,6 +6760,8 @@ static inline struct file *io_file_get_fixed(struct io_ring_ctx *ctx,
/* mask in overlapping REQ_F and FFS bits */
req->flags |= (file_ptr << REQ_F_NOWAIT_READ_BIT);
io_req_set_rsrc_node(req);
+out:
+ io_ring_submit_unlock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
return file;
}
@@ -6773,10 +6779,11 @@ static struct file *io_file_get_normal(struct io_ring_ctx *ctx,
}
static inline struct file *io_file_get(struct io_ring_ctx *ctx,
- struct io_kiocb *req, int fd, bool fixed)
+ struct io_kiocb *req, int fd, bool fixed,
+ unsigned int issue_flags)
{
if (fixed)
- return io_file_get_fixed(ctx, req, fd);
+ return io_file_get_fixed(ctx, req, fd, issue_flags);
else
return io_file_get_normal(ctx, req, fd);
}
@@ -6998,7 +7005,7 @@ static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req,
if (io_op_defs[req->opcode].needs_file) {
req->file = io_file_get(ctx, req, READ_ONCE(sqe->fd),
- (sqe_flags & IOSQE_FIXED_FILE));
+ (sqe_flags & IOSQE_FIXED_FILE), 0);
if (unlikely(!req->file))
ret = -EBADF;
}
--
2.25.1

[PATCH openEuler-5.10-LTS-SP1 01/21] io_uring: add missing lock in io_get_file_fixed
by Jialin Zhang 26 Apr '23
From: Bing-Jhong Billy Jheng <billy(a)starlabs.sg>
stable inclusion
from stable-v5.10.171
commit 08681391b84da27133deefaaddefd0acfa90c2be
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6V7V1
CVE: CVE-2023-1872
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
io_get_file_fixed will access io_uring's context. Lock it if it is
invoked unlocked (e.g. via io-wq) to avoid a race condition with fixed
files getting unregistered.
No single upstream patch exists for this issue, it was fixed as part
of the file assignment changes that went into the 5.18 cycle.
Signed-off-by: Jheng, Bing-Jhong Billy <billy(a)starlabs.sg>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: ZhaoLong Wang <wangzhaolong1(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
io_uring/io_uring.c | 25 ++++++++++++++++---------
1 file changed, 16 insertions(+), 9 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 315031bb38e2..ded2faa66f84 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1090,7 +1090,8 @@ static int __io_register_rsrc_update(struct io_ring_ctx *ctx, unsigned type,
unsigned nr_args);
static void io_clean_op(struct io_kiocb *req);
static struct file *io_file_get(struct io_ring_ctx *ctx,
- struct io_kiocb *req, int fd, bool fixed);
+ struct io_kiocb *req, int fd, bool fixed,
+ unsigned int issue_flags);
static void __io_queue_sqe(struct io_kiocb *req);
static void io_rsrc_put_work(struct work_struct *work);
@@ -3914,7 +3915,7 @@ static int io_tee(struct io_kiocb *req, unsigned int issue_flags)
return -EAGAIN;
in = io_file_get(req->ctx, req, sp->splice_fd_in,
- (sp->flags & SPLICE_F_FD_IN_FIXED));
+ (sp->flags & SPLICE_F_FD_IN_FIXED), issue_flags);
if (!in) {
ret = -EBADF;
goto done;
@@ -3954,7 +3955,7 @@ static int io_splice(struct io_kiocb *req, unsigned int issue_flags)
return -EAGAIN;
in = io_file_get(req->ctx, req, sp->splice_fd_in,
- (sp->flags & SPLICE_F_FD_IN_FIXED));
+ (sp->flags & SPLICE_F_FD_IN_FIXED), issue_flags);
if (!in) {
ret = -EBADF;
goto done;
@@ -6742,13 +6743,16 @@ static void io_fixed_file_set(struct io_fixed_file *file_slot, struct file *file
}
static inline struct file *io_file_get_fixed(struct io_ring_ctx *ctx,
- struct io_kiocb *req, int fd)
+ struct io_kiocb *req, int fd,
+ unsigned int issue_flags)
{
- struct file *file;
+ struct file *file = NULL;
unsigned long file_ptr;
+ io_ring_submit_lock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
+
if (unlikely((unsigned int)fd >= ctx->nr_user_files))
- return NULL;
+ goto out;
fd = array_index_nospec(fd, ctx->nr_user_files);
file_ptr = io_fixed_file_slot(&ctx->file_table, fd)->file_ptr;
file = (struct file *) (file_ptr & FFS_MASK);
@@ -6756,6 +6760,8 @@ static inline struct file *io_file_get_fixed(struct io_ring_ctx *ctx,
/* mask in overlapping REQ_F and FFS bits */
req->flags |= (file_ptr << REQ_F_NOWAIT_READ_BIT);
io_req_set_rsrc_node(req);
+out:
+ io_ring_submit_unlock(ctx, !(issue_flags & IO_URING_F_NONBLOCK));
return file;
}
@@ -6773,10 +6779,11 @@ static struct file *io_file_get_normal(struct io_ring_ctx *ctx,
}
static inline struct file *io_file_get(struct io_ring_ctx *ctx,
- struct io_kiocb *req, int fd, bool fixed)
+ struct io_kiocb *req, int fd, bool fixed,
+ unsigned int issue_flags)
{
if (fixed)
- return io_file_get_fixed(ctx, req, fd);
+ return io_file_get_fixed(ctx, req, fd, issue_flags);
else
return io_file_get_normal(ctx, req, fd);
}
@@ -6998,7 +7005,7 @@ static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req,
if (io_op_defs[req->opcode].needs_file) {
req->file = io_file_get(ctx, req, READ_ONCE(sqe->fd),
- (sqe_flags & IOSQE_FIXED_FILE));
+ (sqe_flags & IOSQE_FIXED_FILE), 0);
if (unlikely(!req->file))
ret = -EBADF;
}
--
2.25.1

[PATCH openEuler-1.0-LTS] RDMA/hns: Add check for user-configured max_inline_data value
by Zhang Changzhong 26 Apr '23
From: zhaoweibo <zhaoweibo3(a)huawei.com>
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I51UWX
CVE: NA
-------------------------------------------------------------
When the user-configured max_inline_data value exceeds the hardware
specification, an error needs to be returned.
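A simplified stand-alone sketch of the added check (the UD-QP zeroing
from the patch is omitted, and roundup_pow_of_two is open-coded here
purely for illustration):

#include <stdio.h>
#include <errno.h>

/* Round a 32-bit value up to the next power of two. */
static unsigned int roundup_pow_of_two32(unsigned int v)
{
    v--;
    v |= v >> 1;  v |= v >> 2;  v |= v >> 4;
    v |= v >> 8;  v |= v >> 16;
    return v + 1;
}

/* Reject requests beyond the device cap, then normalise the rest,
 * as set_max_inline_data() in the patch does. */
static int set_max_inline_data(unsigned int *req, unsigned int cap)
{
    if (*req > cap)
        return -EINVAL;
    if (*req)
        *req = roundup_pow_of_two32(*req);
    return 0;
}

int main(void)
{
    unsigned int req = 100;

    if (!set_max_inline_data(&req, 128))
        printf("max_inline_data rounded to %u\n", req); /* 128 */
    return 0;
}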
Fixes: 87d3b87b92b9 ("RDMA/hns: Optimize qp param setup flow")
Signed-off-by: zhaoweibo <zhaoweibo3(a)huawei.com>
Reviewed-by: Chunzhi Hu <huchunzhi(a)huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong(a)huawei.com>
---
drivers/infiniband/hw/hns/hns_roce_qp.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c
index 6f26693..df14a69 100644
--- a/drivers/infiniband/hw/hns/hns_roce_qp.c
+++ b/drivers/infiniband/hw/hns/hns_roce_qp.c
@@ -756,6 +756,22 @@ static void free_qp_buf(struct hns_roce_qp *hr_qp, struct ib_pd *ib_pd)
hns_roce_free_recv_inline_buffer(hr_qp);
}
+static int set_max_inline_data(struct hns_roce_dev *hr_dev,
+ struct ib_qp_init_attr *init_attr)
+{
+ if (init_attr->cap.max_inline_data > hr_dev->caps.max_sq_inline)
+ return -EINVAL;
+
+ if (init_attr->qp_type == IB_QPT_UD)
+ init_attr->cap.max_inline_data = 0;
+
+ if (init_attr->cap.max_inline_data)
+ init_attr->cap.max_inline_data = roundup_pow_of_two(
+ init_attr->cap.max_inline_data);
+
+ return 0;
+}
+
static int set_qp_param(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp,
struct ib_qp_init_attr *init_attr,
struct ib_udata *udata,
@@ -763,6 +779,10 @@ static int set_qp_param(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp,
{
int ret;
+ ret = set_max_inline_data(hr_dev, init_attr);
+ if (ret != 0)
+ return -EINVAL;
+
hr_qp->ibqp.qp_type = init_attr->qp_type;
if (init_attr->sq_sig_type == IB_SIGNAL_ALL_WR)
--
2.9.5

[PATCH openEuler-1.0-LTS 0/2] irqchip/gic-v3-its: Balance LPI affinity across CPUs
by Zhang Changzhong 25 Apr '23
Marc Zyngier (2):
irqchip/gic-v3-its: Track LPI distribution on a per CPU basis
irqchip/gic-v3-its: Balance initial LPI affinity across CPUs
drivers/irqchip/irq-gic-v3-its.c | 171 ++++++++++++++++++++++++++++++++-------
1 file changed, 144 insertions(+), 27 deletions(-)
--
2.9.5

25 Apr '23
Your patch has been converted to a pull request; the pull request link is:
https://gitee.com/openeuler/kernel/pulls/626

25 Apr '23
driver inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I6BSMN
-----------------------------------------------------------------------
For the ROH distributed scenario, the EID is allocated via DHCP.
The driver needs to convert the original MAC address to EID format
and update the destination MAC, chaddr and client id (if present)
when transmitting DHCP packets. Meanwhile, the chaddr field should
follow the source MAC address, so that the DHCP server replies to
the right client. Since the payload of the DHCP packet changes,
the L4 checksum must be recalculated as well.
Signed-off-by: Jian Shen <shenjian15(a)huawei.com>
Signed-off-by: Ke Chen <chenke54(a)huawei.com>
Reviewed-by: Yonglong Liu <liuyonglong(a)huawei.com>
Reviewed-by: Guangbin Huang <huangguangbin2(a)huawei.com>
Reviewed-by: Hao Lan <lanhao(a)huawei.com>
---
.../net/ethernet/hisilicon/hns3/hns3_enet.c | 160 +++++++++++++++++-
.../net/ethernet/hisilicon/hns3/hns3_enet.h | 50 ++++++
.../hisilicon/hns3/hns3pf/hclge_main.c | 16 ++
3 files changed, 221 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index cf79cd69c..35d6b2f20 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -1165,6 +1165,128 @@ static void hns3_tx_spare_reclaim_cb(struct hns3_enet_ring *ring,
}
}
+static struct hns3_dhcp_packet *hns3_get_dhcp_packet(struct sk_buff *skb,
+ int *dhcp_len)
+{
+ struct hns3_dhcp_packet *dhcp;
+ union l4_hdr_info l4;
+ int l4_payload_len;
+
+ l4.hdr = skb_transport_header(skb);
+ if (l4.udp->dest != htons(HNS3_DHCP_CLIENT_PORT) ||
+ l4.udp->source != htons(HNS3_DHCP_SERVER_PORT))
+ return NULL;
+
+ dhcp = (struct hns3_dhcp_packet *)(l4.hdr + sizeof(struct udphdr));
+ l4_payload_len = ntohs(l4.udp->len) - sizeof(struct udphdr);
+ if (l4_payload_len < offsetof(struct hns3_dhcp_packet, options) ||
+ dhcp->hlen != ETH_ALEN ||
+ dhcp->cookie != htonl(HNS3_DHCP_MAGIC))
+ return NULL;
+
+ *dhcp_len = l4_payload_len;
+ return dhcp;
+}
+
+static u8 *hns3_dhcp_option_scan(struct hns3_dhcp_packet *packet,
+ struct hns3_dhcp_opt_state *opt_state)
+{
+ int opt_len;
+ u8 *cur_opt;
+
+ /* option bytes: [code][len][data0~data[len-1]] */
+ while (1) {
+ if (opt_state->rem <= 0)
+ break;
+
+ switch (opt_state->opt_ptr[DHCP_OPT_CODE]) {
+ /* option padding and end have no len and data byte. */
+ case DHCP_OPT_PADDING:
+ opt_state->rem--;
+ opt_state->opt_ptr++;
+ break;
+ case DHCP_OPT_END:
+ if (DHCP_OVERLOAD_USE_FILE(opt_state->overload_flag)) {
+ opt_state->overload_flag |=
+ DHCP_OVERLOAD_FILE_USED;
+ opt_state->opt_ptr = packet->file;
+ opt_state->rem = sizeof(packet->file);
+ break;
+ }
+ if (DHCP_OVERLOAD_USE_SNAME(opt_state->overload_flag)) {
+ opt_state->overload_flag |=
+ DHCP_OVERLOAD_SNAME_USED;
+ opt_state->opt_ptr = packet->sname;
+ opt_state->rem = sizeof(packet->sname);
+ break;
+ }
+ return NULL;
+ default:
+ if (opt_state->rem <= DHCP_OPT_LEN)
+ return NULL;
+ /* opt_len includes code, len and data bytes */
+ opt_len = opt_state->opt_ptr[DHCP_OPT_LEN] +
+ DHCP_OPT_DATA;
+ cur_opt = opt_state->opt_ptr;
+ if (opt_state->rem < opt_len)
+ return NULL;
+
+ opt_state->opt_ptr += opt_len;
+ opt_state->rem -= opt_len;
+ if (cur_opt[DHCP_OPT_CODE] == DHCP_OPT_OVERLOAD) {
+ opt_state->overload_flag |=
+ cur_opt[DHCP_OPT_DATA];
+ break;
+ }
+ return cur_opt;
+ }
+ }
+
+ return NULL;
+}
+
+static void hns3_udhcp_update_option61(struct hns3_nic_priv *priv,
+ struct hns3_dhcp_packet *packet,
+ int dhcp_len)
+{
+ struct hns3_dhcp_opt_state opt_state;
+ u8 *cur_opt;
+
+ opt_state.opt_ptr = packet->options;
+ opt_state.rem = dhcp_len - offsetof(struct hns3_dhcp_packet, options);
+ opt_state.overload_flag = 0;
+
+ cur_opt = hns3_dhcp_option_scan(packet, &opt_state);
+ while (cur_opt) {
+ if (cur_opt[DHCP_OPT_CODE] != DHCP_OPT_CLIENT_ID) {
+ cur_opt = hns3_dhcp_option_scan(packet, &opt_state);
+ continue;
+ }
+ if (cur_opt[DHCP_OPT_LEN] > ETH_ALEN)
+ ether_addr_copy(&cur_opt[DHCP_CLIENT_ID_MAC_OFT],
+ priv->roh_perm_mac);
+ break;
+ }
+}
+
+static void hns3_dhcp_packet_convert(struct hns3_nic_priv *priv,
+ struct sk_buff *skb,
+ struct hns3_dhcp_packet *dhcp,
+ int dhcp_len)
+{
+ const u8 roh_dhcp_mac[ETH_ALEN] = {0x0, 0x0, 0x0, 0xFF, 0xFF, 0x04};
+ struct ethhdr *l2hdr = eth_hdr(skb);
+
+ if (!dhcp)
+ return;
+
+ ether_addr_copy(dhcp->chaddr, priv->roh_perm_mac);
+ hns3_udhcp_update_option61(priv, dhcp, dhcp_len);
+
+ if (is_broadcast_ether_addr(l2hdr->h_source))
+ ether_addr_copy(l2hdr->h_source, roh_dhcp_mac);
+}
+
static int hns3_set_tso(struct sk_buff *skb, u32 *paylen_fdop_ol4cs,
u16 *mss, u32 *type_cs_vlan_tso, u32 *send_bytes)
{
@@ -1716,7 +1838,20 @@ static int hns3_handle_csum_partial(struct hns3_enet_ring *ring,
return 0;
}
-static int hns3_fill_skb_desc(struct hns3_enet_ring *ring,
+static bool hns3_roh_check_udpv4(struct sk_buff *skb)
+{
+ union l3_hdr_info l3;
+
+ l3.hdr = skb_network_header(skb);
+ if (skb->protocol != htons(ETH_P_IP) ||
+ l3.v4->version != IP_VERSION_IPV4)
+ return false;
+
+ return l3.v4->protocol == IPPROTO_UDP;
+}
+
+static int hns3_fill_skb_desc(struct hns3_nic_priv *priv,
+ struct hns3_enet_ring *ring,
struct sk_buff *skb, struct hns3_desc *desc,
struct hns3_desc_cb *desc_cb)
{
@@ -1741,6 +1876,15 @@ static int hns3_fill_skb_desc(struct hns3_enet_ring *ring,
hnae3_set_field(param.paylen_fdop_ol4cs, HNS3_TXD_FD_OP_M,
HNS3_TXD_FD_OP_S, fd_op);
+ if (hnae3_check_roh_mac_type(priv->ae_handle) &&
+ hns3_roh_check_udpv4(skb)) {
+ struct hns3_dhcp_packet *dhcp;
+ int dhcp_len;
+
+ dhcp = hns3_get_dhcp_packet(skb, &dhcp_len);
+ hns3_dhcp_packet_convert(priv, skb, dhcp, dhcp_len);
+ }
+
/* Set txbd */
desc->tx.ol_type_vlan_len_msec =
cpu_to_le32(param.ol_type_vlan_len_msec);
@@ -2338,15 +2482,16 @@ static int hns3_handle_desc_filling(struct hns3_enet_ring *ring,
return hns3_fill_skb_to_desc(ring, skb, DESC_TYPE_SKB);
}
-static int hns3_handle_skb_desc(struct hns3_enet_ring *ring,
+static int hns3_handle_skb_desc(struct hns3_nic_priv *priv,
+ struct hns3_enet_ring *ring,
struct sk_buff *skb,
struct hns3_desc_cb *desc_cb,
int next_to_use_head)
{
int ret;
- ret = hns3_fill_skb_desc(ring, skb, &ring->desc[ring->next_to_use],
- desc_cb);
+ ret = hns3_fill_skb_desc(priv, ring, skb,
+ &ring->desc[ring->next_to_use], desc_cb);
if (unlikely(ret < 0))
goto fill_err;
@@ -2395,7 +2540,7 @@ netdev_tx_t hns3_nic_net_xmit(struct sk_buff *skb, struct net_device *netdev)
goto out_err_tx_ok;
}
- ret = hns3_handle_skb_desc(ring, skb, desc_cb, ring->next_to_use);
+ ret = hns3_handle_skb_desc(priv, ring, skb, desc_cb, ring->next_to_use);
if (unlikely(ret <= 0))
goto out_err_tx_ok;
@@ -5226,6 +5371,10 @@ static int hns3_init_mac_addr(struct net_device *netdev)
return 0;
}
+ if (hnae3_check_roh_mac_type(h) &&
+ is_zero_ether_addr(priv->roh_perm_mac))
+ ether_addr_copy(priv->roh_perm_mac, netdev->dev_addr);
+
if (h->ae_algo->ops->set_mac_addr)
ret = h->ae_algo->ops->set_mac_addr(h, netdev->dev_addr, true);
@@ -5377,6 +5526,7 @@ static int hns3_client_init(struct hnae3_handle *handle)
priv->tx_timeout_count = 0;
priv->max_non_tso_bd_num = ae_dev->dev_specs.max_non_tso_bd_num;
set_bit(HNS3_NIC_STATE_DOWN, &priv->state);
+ eth_zero_addr(priv->roh_perm_mac);
handle->msg_enable = netif_msg_init(debug, DEFAULT_MSG_LEVEL);
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
index ccfd38b00..85c352fff 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
@@ -604,6 +604,56 @@ struct hns3_nic_priv {
struct hns3_enet_coalesce rx_coal;
u32 tx_copybreak;
u32 rx_copybreak;
+ u8 roh_perm_mac[ETH_ALEN];
+};
+
+#define HNS3_DHCP_SERVER_PORT 68
+#define HNS3_DHCP_CLIENT_PORT 67
+#define HNS3_DHCP_MAGIC 0x63825363
+#define DHCP_OPT_CODE 0
+#define DHCP_OPT_LEN 1
+#define DHCP_OPT_DATA 2
+#define DHCP_CLIENT_ID_LEN 7
+#define DHCP_CLIENT_ID_MAC_OFT 3
+#define DHCP_OVERLOAD_FILE 0x1
+#define DHCP_OVERLOAD_SNAME 0x2
+#define DHCP_OVERLOAD_FILE_USED 0x101
+#define DHCP_OVERLOAD_SNAME_USED 0x202
+#define DHCP_OVERLOAD_USE_FILE(x) \
+ (((x) & DHCP_OVERLOAD_FILE_USED) == DHCP_OVERLOAD_FILE)
+#define DHCP_OVERLOAD_USE_SNAME(x) \
+ (((x) & DHCP_OVERLOAD_SNAME_USED) == DHCP_OVERLOAD_SNAME)
+
+enum DHCP_OPTION_CODES {
+ DHCP_OPT_PADDING = 0,
+ DHCP_OPT_OVERLOAD = 52,
+ DHCP_OPT_CLIENT_ID = 61,
+ DHCP_OPT_END = 255
+};
+
+struct hns3_dhcp_packet {
+ u8 op;
+ u8 htype;
+ u8 hlen;
+ u8 hops;
+ u32 xid;
+ u16 secs;
+ u16 flags;
+ u32 ciaddr;
+ u32 yiaddr;
+ u32 siaddr_nip;
+ u32 gateway_nip;
+ u8 chaddr[16]; /* link-layer client hardware address (MAC) */
+ u8 sname[64];
+ u8 file[128];
+ u32 cookie; /* DHCP magic bytes: 0x63825363 */
+ u8 options[312];
+};
+
+struct hns3_dhcp_opt_state {
+ u8 *opt_ptr; /* refer to current option item */
+ int rem; /* remain bytes in options */
+ u32 overload_flag; /* whether use file and sname field as options */
};
union l3_hdr_info {
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index eea175484..947ffde02 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -2866,12 +2866,28 @@ static void hclge_get_fec(struct hnae3_handle *handle, u8 *fec_ability,
if (fec_mode)
*fec_mode = mac->fec_mode;
}
+
+static void hclge_roh_convert_mac_addr(struct hclge_dev *hdev)
+{
+#define HCLGE_ROH_EID_MASK_BYTE 3
+
+ struct hclge_vport *vport = &hdev->vport[0];
+ struct hnae3_handle *handle = &vport->nic;
+
+ if (hnae3_check_roh_mac_type(handle)) {
+ if (!is_valid_ether_addr(hdev->hw.mac.mac_addr))
+ random_ether_addr(hdev->hw.mac.mac_addr);
+ memset(hdev->hw.mac.mac_addr, 0, HCLGE_ROH_EID_MASK_BYTE);
+ }
+}
+
static int hclge_mac_init(struct hclge_dev *hdev)
{
struct hclge_mac *mac = &hdev->hw.mac;
int ret;
hclge_mac_type_init(hdev);
+ hclge_roh_convert_mac_addr(hdev);
hdev->support_sfp_query = true;
hdev->hw.mac.duplex = HCLGE_MAC_FULL;
--
2.30.0

[PATCH openEuler-1.0-LTS] power: supply: da9150: Fix use after free bug in da9150_charger_remove due to race condition
by Yongqiang Liu 25 Apr '23
From: Zheng Wang <zyytlz.wz(a)163.com>
stable inclusion
from stable-v4.19.280
commit 533d915899b4a5a7b5b5a99eec24b2920ccd1f11
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6W80A
CVE: CVE-2023-30772
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
[ Upstream commit 06615d11cc78162dfd5116efb71f29eb29502d37 ]
In da9150_charger_probe, &charger->otg_work is bound to
da9150_charger_otg_work, and da9150_charger_otg_ncb may be
called to start the work.
If we remove the module, da9150_charger_remove is called to
clean up, but an unfinished work item may still be running.
The possible sequence is as follows:

CPU0                          CPU1
                              | da9150_charger_otg_work
da9150_charger_remove         |
  power_supply_unregister     |
  device_unregister           |
  power_supply_dev_release    |
  kfree(psy)                  |
                              |
                              | power_supply_changed(charger->usb);
                              | // use after free

Fix it by canceling the work before the cleanup in
da9150_charger_remove.
Signed-off-by: Zheng Wang <zyytlz.wz(a)163.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel(a)collabora.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Guo Mengqi <guomengqi3(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Reviewed-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/power/supply/da9150-charger.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/power/supply/da9150-charger.c b/drivers/power/supply/da9150-charger.c
index 60099815296e..b2d38eb32288 100644
--- a/drivers/power/supply/da9150-charger.c
+++ b/drivers/power/supply/da9150-charger.c
@@ -666,6 +666,7 @@ static int da9150_charger_remove(struct platform_device *pdev)
if (!IS_ERR_OR_NULL(charger->usb_phy))
usb_unregister_notifier(charger->usb_phy, &charger->otg_nb);
+ cancel_work_sync(&charger->otg_work);
power_supply_unregister(charger->battery);
power_supply_unregister(charger->usb);
--
2.25.1

24 Apr '23
From: Luoyouming <luoyouming(a)huawei.com>
driver inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I6WAZI
---------------------------------------------------------------
Support the driver obtaining num_xrcds and reserved_xrcds from the firmware.
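Each capability is carried as a packed bit range in the firmware's
query response (PF_CAPS_C_NUM_XRCDS in this patch sits at bits 87..91
and is reported as a log2 value, hence the 1 << shift in the diff). A
stand-alone sketch of extracting such a field; this is a simplified
stand-in for the driver's hr_reg_read(), and the response buffer below
is fabricated for illustration:

#include <stdio.h>
#include <stdint.h>

/* Extract bits [lo, hi] from a response stored as an array of
 * 32-bit little-endian words. */
static uint32_t field_get(const uint32_t *words, int hi, int lo)
{
    uint64_t v = 0;
    int i;

    for (i = lo; i <= hi; i++)
        if (words[i / 32] & (1u << (i % 32)))
            v |= 1ull << (i - lo);
    return (uint32_t)v;
}

int main(void)
{
    uint32_t resp[4] = { 0 };

    /* pretend firmware reported a num_xrcds log2 of 20 in bits 87..91 */
    resp[87 / 32] |= 20u << (87 % 32);
    printf("num_xrcds = %u\n", 1u << field_get(resp, 91, 87));
    return 0;
}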
Signed-off-by: Luoyouming <luoyouming(a)huawei.com>
Signed-off-by: Chengchang Tang <tangchengchang(a)huawei.com>
Reviewed-by: Chengchang Tang <tangchengchang(a)huawei.com>
---
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 5 ++---
drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 4 ++--
2 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 36397517e679..042e19fa4039 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -2310,9 +2310,6 @@ static void apply_func_caps(struct hns_roce_dev *hr_dev)
caps->qpc_timer_hop_num = HNS_ROCE_HOP_NUM_0;
caps->cqc_timer_hop_num = HNS_ROCE_HOP_NUM_0;
- caps->num_xrcds = HNS_ROCE_V2_MAX_XRCD_NUM;
- caps->reserved_xrcds = HNS_ROCE_V2_RSV_XRCD_NUM;
-
caps->num_mtt_segs = HNS_ROCE_V2_MAX_MTT_SEGS;
caps->num_srqwqe_segs = HNS_ROCE_V2_MAX_SRQWQE_SEGS;
caps->num_idx_segs = HNS_ROCE_V2_MAX_IDX_SEGS;
@@ -2440,6 +2437,7 @@ static int hns_roce_query_caps(struct hns_roce_dev *hr_dev)
caps->num_cqs = 1 << hr_reg_read(resp_c, PF_CAPS_C_NUM_CQS);
caps->gid_table_len[0] = hr_reg_read(resp_c, PF_CAPS_C_MAX_GID);
caps->max_cqes = 1 << hr_reg_read(resp_c, PF_CAPS_C_CQ_DEPTH);
+ caps->num_xrcds = 1 << hr_reg_read(resp_c, PF_CAPS_C_NUM_XRCDS);
caps->num_mtpts = 1 << hr_reg_read(resp_c, PF_CAPS_C_NUM_MRWS);
caps->num_qps = 1 << hr_reg_read(resp_c, PF_CAPS_C_NUM_QPS);
caps->max_qp_init_rdma = hr_reg_read(resp_c, PF_CAPS_C_MAX_ORD);
@@ -2461,6 +2459,7 @@ static int hns_roce_query_caps(struct hns_roce_dev *hr_dev)
caps->reserved_mrws = hr_reg_read(resp_e, PF_CAPS_E_RSV_MRWS);
caps->chunk_sz = 1 << hr_reg_read(resp_e, PF_CAPS_E_CHUNK_SIZE_SHIFT);
caps->reserved_cqs = hr_reg_read(resp_e, PF_CAPS_E_RSV_CQS);
+ caps->reserved_xrcds = hr_reg_read(resp_e, PF_CAPS_E_RSV_XRCDS);
caps->reserved_srqs = hr_reg_read(resp_e, PF_CAPS_E_RSV_SRQS);
caps->reserved_lkey = hr_reg_read(resp_e, PF_CAPS_E_RSV_LKEYS);
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
index 1605a093dae8..6223cc66f065 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
@@ -42,8 +42,6 @@
#define HNS_ROCE_V2_MAX_MTT_SEGS 0x1000000
#define HNS_ROCE_V2_MAX_SRQWQE_SEGS 0x1000000
#define HNS_ROCE_V2_MAX_IDX_SEGS 0x1000000
-#define HNS_ROCE_V2_MAX_XRCD_NUM 0x1000000
-#define HNS_ROCE_V2_RSV_XRCD_NUM 0
#define HNS_ROCE_V3_SCCC_SZ 64
#define HNS_ROCE_V3_GMV_ENTRY_SZ 32
@@ -1214,6 +1212,7 @@ struct hns_roce_query_pf_caps_c {
#define PF_CAPS_C_NUM_CQS PF_CAPS_C_FIELD_LOC(51, 32)
#define PF_CAPS_C_MAX_GID PF_CAPS_C_FIELD_LOC(60, 52)
#define PF_CAPS_C_CQ_DEPTH PF_CAPS_C_FIELD_LOC(86, 64)
+#define PF_CAPS_C_NUM_XRCDS PF_CAPS_C_FIELD_LOC(91, 87)
#define PF_CAPS_C_NUM_MRWS PF_CAPS_C_FIELD_LOC(115, 96)
#define PF_CAPS_C_NUM_QPS PF_CAPS_C_FIELD_LOC(147, 128)
#define PF_CAPS_C_MAX_ORD PF_CAPS_C_FIELD_LOC(155, 148)
@@ -1273,6 +1272,7 @@ struct hns_roce_query_pf_caps_e {
#define PF_CAPS_E_RSV_MRWS PF_CAPS_E_FIELD_LOC(19, 0)
#define PF_CAPS_E_CHUNK_SIZE_SHIFT PF_CAPS_E_FIELD_LOC(31, 20)
#define PF_CAPS_E_RSV_CQS PF_CAPS_E_FIELD_LOC(51, 32)
+#define PF_CAPS_E_RSV_XRCDS PF_CAPS_E_FIELD_LOC(63, 52)
#define PF_CAPS_E_RSV_SRQS PF_CAPS_E_FIELD_LOC(83, 64)
#define PF_CAPS_E_RSV_LKEYS PF_CAPS_E_FIELD_LOC(115, 96)
--
2.30.0

[PATCH openEuler-1.0-LTS] i2c: xgene-slimpro: Fix out-of-bounds bug in xgene_slimpro_i2c_xfer()
by Yongqiang Liu 23 Apr '23
From: Wei Chen <harperchen1110(a)gmail.com>
mainline inclusion
from mainline-v6.3-rc4
commit 92fbb6d1296f81f41f65effd7f5f8c0f74943d15
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6XHPL
CVE: CVE-2023-2194
--------------------------------
The data->block[0] variable comes from the user and is a number between
0 and 255. Without a proper check, the variable may be large enough to
cause an out-of-bounds access when performing the memcpy in
slimpro_i2c_blkwr.
Fix this bug by checking the value of writelen.
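The essence of the fix is validating a user-controlled length against
the fixed buffer before copying. A minimal stand-alone sketch of the
same check (blkwr and the buffer are illustrative stand-ins for the
driver code):

#include <stdio.h>
#include <string.h>

#define I2C_SMBUS_BLOCK_MAX 32  /* SMBus block limit, as in the patch */

static char dma_buffer[I2C_SMBUS_BLOCK_MAX];

/* Reject user-controlled lengths before copying into the fixed
 * DMA buffer -- the essence of the CVE-2023-2194 fix. */
static int blkwr(const void *data, int writelen)
{
    if (writelen > I2C_SMBUS_BLOCK_MAX)
        return -1;  /* -EINVAL in the kernel */

    memcpy(dma_buffer, data, writelen);
    return 0;
}

int main(void)
{
    char payload[64] = "hello";

    printf("short write: %d\n", blkwr(payload, 6));   /* accepted */
    printf("long write:  %d\n", blkwr(payload, 64));  /* rejected */
    return 0;
}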
Fixes: f6505fbabc42 ("i2c: add SLIMpro I2C device driver on APM X-Gene platform")
Signed-off-by: Wei Chen <harperchen1110(a)gmail.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Andi Shyti <andi.shyti(a)kernel.org>
Signed-off-by: Wolfram Sang <wsa(a)kernel.org>
Signed-off-by: Yang Jihong <yangjihong1(a)huawei.com>
Reviewed-by: Xu Kuohai <xukuohai(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/i2c/busses/i2c-xgene-slimpro.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/i2c/busses/i2c-xgene-slimpro.c b/drivers/i2c/busses/i2c-xgene-slimpro.c
index a7ac746018ad..7a746f413535 100644
--- a/drivers/i2c/busses/i2c-xgene-slimpro.c
+++ b/drivers/i2c/busses/i2c-xgene-slimpro.c
@@ -321,6 +321,9 @@ static int slimpro_i2c_blkwr(struct slimpro_i2c_dev *ctx, u32 chip,
u32 msg[3];
int rc;
+ if (writelen > I2C_SMBUS_BLOCK_MAX)
+ return -EINVAL;
+
memcpy(ctx->dma_buffer, data, writelen);
paddr = dma_map_single(ctx->dev, ctx->dma_buffer, writelen,
DMA_TO_DEVICE);
--
2.25.1

[openEuler-22.03-LTS] rxrpc: Fix race between conn bundle lookup and bundle removal [ZDI-CAN-15975]
by Wang Yufen 23 Apr '23
From: David Howells <dhowells(a)redhat.com>
stable inclusion
from stable-v5.10.157
commit 3535c632e6d16c98f76e615da8dc0cb2750c66cc
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6VK2H
CVE: CVE-2023-2006
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
[ Upstream commit 3bcd6c7eaa53b56c3f584da46a1f7652e759d0e5 ]
After rxrpc_unbundle_conn() has removed a connection from a bundle, it
checks to see if there are any conns with available channels and, if not,
removes and attempts to destroy the bundle.
Whilst it does check, after grabbing client_bundles_lock, that there are
no connections attached, this races with rxrpc_look_up_bundle(), which
may retrieve the bundle but not attach a connection to it until later.
There is therefore a window in which the bundle can get destroyed before we
manage to attach a new connection to it.
Fix this by adding an "active" counter to struct rxrpc_bundle:
(1) rxrpc_connect_call() obtains an active count by prepping/looking up a
bundle and ditches it before returning.
(2) If, during rxrpc_connect_call(), a connection is added to the bundle,
this obtains an active count, which is held until the connection is
discarded.
(3) rxrpc_deactivate_bundle() is created to drop an active count on a
bundle and destroy it when the active count reaches 0. The active
count is checked inside client_bundles_lock() to prevent a race with
rxrpc_look_up_bundle().
(4) rxrpc_unbundle_conn() then calls rxrpc_deactivate_bundle().
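The heart of the fix is the "drop the last active count under the same
lock that lookup takes" idiom (atomic_dec_and_lock() in the kernel). A
compact user-space sketch of that idiom, with hypothetical types and a
pthread mutex standing in for client_bundles_lock:

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

static pthread_mutex_t bundles_lock = PTHREAD_MUTEX_INITIALIZER;

struct bundle {
    atomic_int active;  /* number of active users */
    /* tree linkage etc. elided */
};

/* Analogue of atomic_dec_and_lock(): decrement, and return true with
 * the lock held only if the count reached zero. Taking the lock on
 * the 1 -> 0 transition closes the race with a concurrent lookup. */
static bool dec_and_lock(atomic_int *v, pthread_mutex_t *lock)
{
    int old = atomic_load(v);

    while (old > 1)  /* fast path: we are not the last user */
        if (atomic_compare_exchange_weak(v, &old, old - 1))
            return false;

    pthread_mutex_lock(lock);
    if (atomic_fetch_sub(v, 1) == 1)
        return true;  /* hit zero; caller holds the lock */
    pthread_mutex_unlock(lock);
    return false;
}

static void deactivate_bundle(struct bundle *b)
{
    if (dec_and_lock(&b->active, &bundles_lock)) {
        /* no lookup can revive it now: unlink and free */
        pthread_mutex_unlock(&bundles_lock);
        free(b);
    }
}

int main(void)
{
    struct bundle *b = malloc(sizeof(*b));

    atomic_init(&b->active, 1);
    deactivate_bundle(b);
    printf("bundle destroyed\n");
    return 0;
}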
Fixes: 245500d853e9 ("rxrpc: Rewrite the client connection manager")
Reported-by: zdi-disclosures(a)trendmicro.com # ZDI-CAN-15975
Signed-off-by: David Howells <dhowells(a)redhat.com>
Tested-by: zdi-disclosures(a)trendmicro.com
cc: Marc Dionne <marc.dionne(a)auristor.com>
cc: linux-afs(a)lists.infradead.org
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Conflicts:
net/rxrpc/ar-internal.h
net/rxrpc/conn_client.c
Signed-off-by: Wang Yufen <wangyufen(a)huawei.com>
---
net/rxrpc/ar-internal.h | 1 +
net/rxrpc/conn_client.c | 36 ++++++++++++++++++++++--------------
2 files changed, 23 insertions(+), 14 deletions(-)
diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index ccb6541..90ab125 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -392,6 +392,7 @@ enum rxrpc_conn_proto_state {
struct rxrpc_bundle {
struct rxrpc_conn_parameters params;
atomic_t usage;
+ atomic_t active; /* Number of active users */
unsigned int debug_id;
bool try_upgrade; /* True if the bundle is attempting upgrade */
bool alloc_conn; /* True if someone's getting a conn */
diff --git a/net/rxrpc/conn_client.c b/net/rxrpc/conn_client.c
index f5fb223..34430cc 100644
--- a/net/rxrpc/conn_client.c
+++ b/net/rxrpc/conn_client.c
@@ -40,6 +40,8 @@
DEFINE_IDR(rxrpc_client_conn_ids);
static DEFINE_SPINLOCK(rxrpc_conn_id_lock);
+static void rxrpc_deactivate_bundle(struct rxrpc_bundle *bundle);
+
/*
* Get a connection ID and epoch for a client connection from the global pool.
* The connection struct pointer is then recorded in the idr radix tree. The
@@ -123,6 +125,7 @@ static struct rxrpc_bundle *rxrpc_alloc_bundle(struct rxrpc_conn_parameters *cp,
bundle->params = *cp;
rxrpc_get_peer(bundle->params.peer);
atomic_set(&bundle->usage, 1);
+ atomic_set(&bundle->active, 1);
spin_lock_init(&bundle->channel_lock);
INIT_LIST_HEAD(&bundle->waiting_calls);
}
@@ -341,6 +344,7 @@ static struct rxrpc_bundle *rxrpc_look_up_bundle(struct rxrpc_conn_parameters *c
rxrpc_free_bundle(candidate);
found_bundle:
rxrpc_get_bundle(bundle);
+ atomic_inc(&bundle->active);
spin_unlock(&local->client_bundles_lock);
_leave(" = %u [found]", bundle->debug_id);
return bundle;
@@ -438,6 +442,7 @@ static void rxrpc_add_conn_to_bundle(struct rxrpc_bundle *bundle, gfp_t gfp)
if (old)
trace_rxrpc_client(old, -1, rxrpc_client_replace);
candidate->bundle_shift = shift;
+ atomic_inc(&bundle->active);
bundle->conns[i] = candidate;
for (j = 0; j < RXRPC_MAXCALLS; j++)
set_bit(shift + j, &bundle->avail_chans);
@@ -728,6 +733,7 @@ int rxrpc_connect_call(struct rxrpc_sock *rx,
smp_rmb();
out_put_bundle:
+ rxrpc_deactivate_bundle(bundle);
rxrpc_put_bundle(bundle);
out:
_leave(" = %d", ret);
@@ -903,9 +909,8 @@ void rxrpc_disconnect_client_call(struct rxrpc_bundle *bundle, struct rxrpc_call
static void rxrpc_unbundle_conn(struct rxrpc_connection *conn)
{
struct rxrpc_bundle *bundle = conn->bundle;
- struct rxrpc_local *local = bundle->params.local;
unsigned int bindex;
- bool need_drop = false, need_put = false;
+ bool need_drop = false;
int i;
_enter("C=%x", conn->debug_id);
@@ -924,15 +929,22 @@ static void rxrpc_unbundle_conn(struct rxrpc_connection *conn)
}
spin_unlock(&bundle->channel_lock);
- /* If there are no more connections, remove the bundle */
- if (!bundle->avail_chans) {
- _debug("maybe unbundle");
- spin_lock(&local->client_bundles_lock);
+ if (need_drop) {
+ rxrpc_deactivate_bundle(bundle);
+ rxrpc_put_connection(conn);
+ }
+}
- for (i = 0; i < ARRAY_SIZE(bundle->conns); i++)
- if (bundle->conns[i])
- break;
- if (i == ARRAY_SIZE(bundle->conns) && !bundle->params.exclusive) {
+/*
+ * Drop the active count on a bundle.
+ */
+static void rxrpc_deactivate_bundle(struct rxrpc_bundle *bundle)
+{
+ struct rxrpc_local *local = bundle->params.local;
+ bool need_put = false;
+
+ if (atomic_dec_and_lock(&bundle->active, &local->client_bundles_lock)) {
+ if (!bundle->params.exclusive) {
_debug("erase bundle");
rb_erase(&bundle->local_node, &local->client_bundles);
need_put = true;
@@ -942,10 +954,6 @@ static void rxrpc_unbundle_conn(struct rxrpc_connection *conn)
if (need_put)
rxrpc_put_bundle(bundle);
}
-
- if (need_drop)
- rxrpc_put_connection(conn);
- _leave("");
}
/*
--
1.8.3.1

[OLK-5.10] rxrpc: Fix race between conn bundle lookup and bundle removal [ZDI-CAN-15975]
by Wang Yufen 23 Apr '23
From: David Howells <dhowells(a)redhat.com>
stable inclusion
from stable-v5.10.157
commit 3535c632e6d16c98f76e615da8dc0cb2750c66cc
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6VK2H
CVE: CVE-2023-2006
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
[ Upstream commit 3bcd6c7eaa53b56c3f584da46a1f7652e759d0e5 ]
After rxrpc_unbundle_conn() has removed a connection from a bundle, it
checks to see if there are any conns with available channels and, if not,
removes and attempts to destroy the bundle.
Whilst it does check, after grabbing client_bundles_lock, that there are
no connections attached, this races with rxrpc_look_up_bundle(), which
may retrieve the bundle but not attach a connection to it until later.
There is therefore a window in which the bundle can get destroyed before we
manage to attach a new connection to it.
Fix this by adding an "active" counter to struct rxrpc_bundle:
(1) rxrpc_connect_call() obtains an active count by prepping/looking up a
bundle and ditches it before returning.
(2) If, during rxrpc_connect_call(), a connection is added to the bundle,
this obtains an active count, which is held until the connection is
discarded.
(3) rxrpc_deactivate_bundle() is created to drop an active count on a
bundle and destroy it when the active count reaches 0. The active
count is checked inside client_bundles_lock() to prevent a race with
rxrpc_look_up_bundle().
(4) rxrpc_unbundle_conn() then calls rxrpc_deactivate_bundle().
Fixes: 245500d853e9 ("rxrpc: Rewrite the client connection manager")
Reported-by: zdi-disclosures(a)trendmicro.com # ZDI-CAN-15975
Signed-off-by: David Howells <dhowells(a)redhat.com>
Tested-by: zdi-disclosures(a)trendmicro.com
cc: Marc Dionne <marc.dionne(a)auristor.com>
cc: linux-afs(a)lists.infradead.org
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Conflicts:
net/rxrpc/ar-internal.h
net/rxrpc/conn_client.c
Signed-off-by: Wang Yufen <wangyufen(a)huawei.com>
---
net/rxrpc/ar-internal.h | 1 +
net/rxrpc/conn_client.c | 36 ++++++++++++++++++++++--------------
2 files changed, 23 insertions(+), 14 deletions(-)
diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index ccb6541..90ab125 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -392,6 +392,7 @@ enum rxrpc_conn_proto_state {
struct rxrpc_bundle {
struct rxrpc_conn_parameters params;
atomic_t usage;
+ atomic_t active; /* Number of active users */
unsigned int debug_id;
bool try_upgrade; /* True if the bundle is attempting upgrade */
bool alloc_conn; /* True if someone's getting a conn */
diff --git a/net/rxrpc/conn_client.c b/net/rxrpc/conn_client.c
index f5fb223..34430cc 100644
--- a/net/rxrpc/conn_client.c
+++ b/net/rxrpc/conn_client.c
@@ -40,6 +40,8 @@
DEFINE_IDR(rxrpc_client_conn_ids);
static DEFINE_SPINLOCK(rxrpc_conn_id_lock);
+static void rxrpc_deactivate_bundle(struct rxrpc_bundle *bundle);
+
/*
* Get a connection ID and epoch for a client connection from the global pool.
* The connection struct pointer is then recorded in the idr radix tree. The
@@ -123,6 +125,7 @@ static struct rxrpc_bundle *rxrpc_alloc_bundle(struct rxrpc_conn_parameters *cp,
bundle->params = *cp;
rxrpc_get_peer(bundle->params.peer);
atomic_set(&bundle->usage, 1);
+ atomic_set(&bundle->active, 1);
spin_lock_init(&bundle->channel_lock);
INIT_LIST_HEAD(&bundle->waiting_calls);
}
@@ -341,6 +344,7 @@ static struct rxrpc_bundle *rxrpc_look_up_bundle(struct rxrpc_conn_parameters *c
rxrpc_free_bundle(candidate);
found_bundle:
rxrpc_get_bundle(bundle);
+ atomic_inc(&bundle->active);
spin_unlock(&local->client_bundles_lock);
_leave(" = %u [found]", bundle->debug_id);
return bundle;
@@ -438,6 +442,7 @@ static void rxrpc_add_conn_to_bundle(struct rxrpc_bundle *bundle, gfp_t gfp)
if (old)
trace_rxrpc_client(old, -1, rxrpc_client_replace);
candidate->bundle_shift = shift;
+ atomic_inc(&bundle->active);
bundle->conns[i] = candidate;
for (j = 0; j < RXRPC_MAXCALLS; j++)
set_bit(shift + j, &bundle->avail_chans);
@@ -728,6 +733,7 @@ int rxrpc_connect_call(struct rxrpc_sock *rx,
smp_rmb();
out_put_bundle:
+ rxrpc_deactivate_bundle(bundle);
rxrpc_put_bundle(bundle);
out:
_leave(" = %d", ret);
@@ -903,9 +909,8 @@ void rxrpc_disconnect_client_call(struct rxrpc_bundle *bundle, struct rxrpc_call
static void rxrpc_unbundle_conn(struct rxrpc_connection *conn)
{
struct rxrpc_bundle *bundle = conn->bundle;
- struct rxrpc_local *local = bundle->params.local;
unsigned int bindex;
- bool need_drop = false, need_put = false;
+ bool need_drop = false;
int i;
_enter("C=%x", conn->debug_id);
@@ -924,15 +929,22 @@ static void rxrpc_unbundle_conn(struct rxrpc_connection *conn)
}
spin_unlock(&bundle->channel_lock);
- /* If there are no more connections, remove the bundle */
- if (!bundle->avail_chans) {
- _debug("maybe unbundle");
- spin_lock(&local->client_bundles_lock);
+ if (need_drop) {
+ rxrpc_deactivate_bundle(bundle);
+ rxrpc_put_connection(conn);
+ }
+}
- for (i = 0; i < ARRAY_SIZE(bundle->conns); i++)
- if (bundle->conns[i])
- break;
- if (i == ARRAY_SIZE(bundle->conns) && !bundle->params.exclusive) {
+/*
+ * Drop the active count on a bundle.
+ */
+static void rxrpc_deactivate_bundle(struct rxrpc_bundle *bundle)
+{
+ struct rxrpc_local *local = bundle->params.local;
+ bool need_put = false;
+
+ if (atomic_dec_and_lock(&bundle->active, &local->client_bundles_lock)) {
+ if (!bundle->params.exclusive) {
_debug("erase bundle");
rb_erase(&bundle->local_node, &local->client_bundles);
need_put = true;
@@ -942,10 +954,6 @@ static void rxrpc_unbundle_conn(struct rxrpc_connection *conn)
if (need_put)
rxrpc_put_bundle(bundle);
}
-
- if (need_drop)
- rxrpc_put_connection(conn);
- _leave("");
}
/*
--
1.8.3.1

[PATCH OLK-5.10 v2] KVM: nVMX: Inject #GP, not #UD, if "generic" VMXON CR0/CR4 check fails
by 任敏敏(联通集团联通数字科技有限公司本部) 23 Apr '23
From: Sean Christopherson <seanjc(a)google.com>
stable inclusion
from stable-v5.10.163
commit 43dd254853aa274d356bb1d3ab01a8ca880b07b0
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6XQ34
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 9cc409325ddd776f6fd6293d5ce93ce1248af6e4 upstream
Inject #GP if VMXON is attempted with a CR0/CR4 that fails the
generic "is CRx valid" check but passes the CR4.VMXE check, and do the
generic checks _after_ handling the post-VMXON VM-Fail.
The CR4.VMXE check, and all other #UD cases, are special pre-conditions
that are enforced prior to pivoting on the current VMX mode, i.e. occur
before interception if VMXON is attempted in VMX non-root mode.
All other CR0/CR4 checks generate #GP and effectively have lower priority
than the post-VMXON check.
Per the SDM:
IF (register operand) or (CR0.PE = 0) or (CR4.VMXE = 0) or ...
THEN #UD;
ELSIF not in VMX operation
THEN
IF (CPL > 0) or (in A20M mode) or
(the values of CR0 and CR4 are not supported in VMX operation)
THEN #GP(0);
ELSIF in VMX non-root operation
THEN VMexit;
ELSIF CPL > 0
THEN #GP(0);
ELSE VMfail("VMXON executed in VMX root operation");
FI;
which, if re-written without ELSIF, yields:
IF (register operand) or (CR0.PE = 0) or (CR4.VMXE = 0) or ...
THEN #UD
IF in VMX non-root operation
THEN VMexit;
IF CPL > 0
THEN #GP(0)
IF in VMX operation
THEN VMfail("VMXON executed in VMX root operation");
IF (in A20M mode) or
(the values of CR0 and CR4 are not supported in VMX operation)
THEN #GP(0);
Note, KVM unconditionally forwards VMXON VM-Exits that occur in L2 to L1,
i.e. there is no need to check the vCPU is not in VMX non-root mode. Add
a comment to explain why unconditionally forwarding such exits is
functionally correct.
Reported-by: Eric Li <ercli(a)ucdavis.edu>
Fixes: c7d855c2aff2 ("KVM: nVMX: Inject #UD if VMXON is attempted with incompatible CR0/CR4")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Link: https://lore.kernel.org/r/20221006001956.329314-1-seanjc@google.com
Signed-off-by: rminmin <renmm6(a)chinaunicom.cn>
---
arch/x86/kvm/vmx/nested.c | 44 +++++++++++++++++++++++++++++----------
1 file changed, 33 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 9003b14d72ca..40225b770942 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4899,24 +4899,35 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
| FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
/*
- * Note, KVM cannot rely on hardware to perform the CR0/CR4 #UD checks
- * that have higher priority than VM-Exit (see Intel SDM's pseudocode
- * for VMXON), as KVM must load valid CR0/CR4 values into hardware while
- * running the guest, i.e. KVM needs to check the _guest_ values.
+ * Manually check CR4.VMXE checks, KVM must force CR4.VMXE=1 to enter
+ * the guest and so cannot rely on hardware to perform the check,
+ * which has higher priority than VM-Exit (see Intel SDM's pseudocode
+ * for VMXON).
*
- * Rely on hardware for the other two pre-VM-Exit checks, !VM86 and
- * !COMPATIBILITY modes. KVM may run the guest in VM86 to emulate Real
- * Mode, but KVM will never take the guest out of those modes.
+ * Rely on hardware for the other pre-VM-Exit checks, CR0.PE=1, !VM86
+ * and !COMPATIBILITY modes. For an unrestricted guest, KVM doesn't
+ * force any of the relevant guest state. For a restricted guest, KVM
+ * does force CR0.PE=1, but only to also force VM86 in order to emulate
+ * Real Mode, and so there's no need to check CR0.PE manually.
*/
- if (!nested_host_cr0_valid(vcpu, kvm_read_cr0(vcpu)) ||
- !nested_host_cr4_valid(vcpu, kvm_read_cr4(vcpu))) {
+ if (!kvm_read_cr4_bits(vcpu, X86_CR4_VMXE)) {
kvm_queue_exception(vcpu, UD_VECTOR);
return 1;
}
/*
- * CPL=0 and all other checks that are lower priority than VM-Exit must
- * be checked manually.
+ * The CPL is checked for "not in VMX operation" and for "in VMX root",
+ * and has higher priority than the VM-Fail due to being post-VMXON,
+ * i.e. VMXON #GPs outside of VMX non-root if CPL!=0. In VMX non-root,
+ * VMXON causes VM-Exit and KVM unconditionally forwards VMXON VM-Exits
+ * from L2 to L1, i.e. there's no need to check for the vCPU being in
+ * VMX non-root.
+ *
+ * Forwarding the VM-Exit unconditionally, i.e. without performing the
+ * #UD checks (see above), is functionally ok because KVM doesn't allow
+ * L1 to run L2 without CR4.VMXE=0, and because KVM never modifies L2's
+ * CR0 or CR4, i.e. it's L2's responsibility to emulate #UDs that are
+ * missed by hardware due to shadowing CR0 and/or CR4.
*/
if (vmx_get_cpl(vcpu)) {
kvm_inject_gp(vcpu, 0);
@@ -4926,6 +4937,17 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
if (vmx->nested.vmxon)
return nested_vmx_fail(vcpu, VMXERR_VMXON_IN_VMX_ROOT_OPERATION);
+ /*
+ * Invalid CR0/CR4 generates #GP. These checks are performed if and
+ * only if the vCPU isn't already in VMX operation, i.e. effectively
+ * have lower priority than the VM-Fail above.
+ */
+ if (!nested_host_cr0_valid(vcpu, kvm_read_cr0(vcpu)) ||
+ !nested_host_cr4_valid(vcpu, kvm_read_cr4(vcpu))) {
+ kvm_inject_gp(vcpu, 0);
+ return 1;
+ }
+
if ((vmx->msr_ia32_feature_control & VMXON_NEEDED_FEATURES)
!= VMXON_NEEDED_FEATURES) {
kvm_inject_gp(vcpu, 0);
--
2.33.0

[PATCH OLK-5.10] KVM: nVMX: Inject #GP, not #UD, if "generic" VMXON CR0/CR4 check fails
by 任敏敏(联通集团联通数字科技有限公司本部) 23 Apr '23
From: Sean Christopherson <seanjc(a)google.com>
stable inclusion
from stable-v6.2
commit 9cc409325ddd776f6fd6293d5ce93ce1248af6e4
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6XQ34
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 9cc409325ddd776f6fd6293d5ce93ce1248af6e4 upstream
Inject #GP if VMXON is attempted with a CR0/CR4 that fails the
generic "is CRx valid" check but passes the CR4.VMXE check, and do the
generic checks _after_ handling the post-VMXON VM-Fail.
The CR4.VMXE check, and all other #UD cases, are special pre-conditions
that are enforced prior to pivoting on the current VMX mode, i.e. occur
before interception if VMXON is attempted in VMX non-root mode.
All other CR0/CR4 checks generate #GP and effectively have lower priority
than the post-VMXON check.
Per the SDM:
IF (register operand) or (CR0.PE = 0) or (CR4.VMXE = 0) or ...
THEN #UD;
ELSIF not in VMX operation
THEN
IF (CPL > 0) or (in A20M mode) or
(the values of CR0 and CR4 are not supported in VMX operation)
THEN #GP(0);
ELSIF in VMX non-root operation
THEN VMexit;
ELSIF CPL > 0
THEN #GP(0);
ELSE VMfail("VMXON executed in VMX root operation");
FI;
which, if re-written without ELSIF, yields:
IF (register operand) or (CR0.PE = 0) or (CR4.VMXE = 0) or ...
THEN #UD
IF in VMX non-root operation
THEN VMexit;
IF CPL > 0
THEN #GP(0)
IF in VMX operation
THEN VMfail("VMXON executed in VMX root operation");
IF (in A20M mode) or
(the values of CR0 and CR4 are not supported in VMX operation)
THEN #GP(0);
Note, KVM unconditionally forwards VMXON VM-Exits that occur in L2 to L1,
i.e. there is no need to check the vCPU is not in VMX non-root mode. Add
a comment to explain why unconditionally forwarding such exits is
functionally correct.
Reported-by: Eric Li <ercli(a)ucdavis.edu>
Fixes: c7d855c2aff2 ("KVM: nVMX: Inject #UD if VMXON is attempted with incompatible CR0/CR4")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Link: https://lore.kernel.org/r/20221006001956.329314-1-seanjc@google.com
Signed-off-by: rminmin <renmm6(a)chinaunicom.cn>
---
arch/x86/kvm/vmx/nested.c | 44 +++++++++++++++++++++++++++++----------
1 file changed, 33 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 9003b14d72ca..40225b770942 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4899,24 +4899,35 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
| FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
/*
- * Note, KVM cannot rely on hardware to perform the CR0/CR4 #UD checks
- * that have higher priority than VM-Exit (see Intel SDM's pseudocode
- * for VMXON), as KVM must load valid CR0/CR4 values into hardware while
- * running the guest, i.e. KVM needs to check the _guest_ values.
+ * Manually check CR4.VMXE checks, KVM must force CR4.VMXE=1 to enter
+ * the guest and so cannot rely on hardware to perform the check,
+ * which has higher priority than VM-Exit (see Intel SDM's pseudocode
+ * for VMXON).
*
- * Rely on hardware for the other two pre-VM-Exit checks, !VM86 and
- * !COMPATIBILITY modes. KVM may run the guest in VM86 to emulate Real
- * Mode, but KVM will never take the guest out of those modes.
+ * Rely on hardware for the other pre-VM-Exit checks, CR0.PE=1, !VM86
+ * and !COMPATIBILITY modes. For an unrestricted guest, KVM doesn't
+ * force any of the relevant guest state. For a restricted guest, KVM
+ * does force CR0.PE=1, but only to also force VM86 in order to emulate
+ * Real Mode, and so there's no need to check CR0.PE manually.
*/
- if (!nested_host_cr0_valid(vcpu, kvm_read_cr0(vcpu)) ||
- !nested_host_cr4_valid(vcpu, kvm_read_cr4(vcpu))) {
+ if (!kvm_read_cr4_bits(vcpu, X86_CR4_VMXE)) {
kvm_queue_exception(vcpu, UD_VECTOR);
return 1;
}
/*
- * CPL=0 and all other checks that are lower priority than VM-Exit must
- * be checked manually.
+ * The CPL is checked for "not in VMX operation" and for "in VMX root",
+ * and has higher priority than the VM-Fail due to being post-VMXON,
+ * i.e. VMXON #GPs outside of VMX non-root if CPL!=0. In VMX non-root,
+ * VMXON causes VM-Exit and KVM unconditionally forwards VMXON VM-Exits
+ * from L2 to L1, i.e. there's no need to check for the vCPU being in
+ * VMX non-root.
+ *
+ * Forwarding the VM-Exit unconditionally, i.e. without performing the
+ * #UD checks (see above), is functionally ok because KVM doesn't allow
+ * L1 to run L2 without CR4.VMXE=0, and because KVM never modifies L2's
+ * CR0 or CR4, i.e. it's L2's responsibility to emulate #UDs that are
+ * missed by hardware due to shadowing CR0 and/or CR4.
*/
if (vmx_get_cpl(vcpu)) {
kvm_inject_gp(vcpu, 0);
@@ -4926,6 +4937,17 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
if (vmx->nested.vmxon)
return nested_vmx_fail(vcpu, VMXERR_VMXON_IN_VMX_ROOT_OPERATION);
+ /*
+ * Invalid CR0/CR4 generates #GP. These checks are performed if and
+ * only if the vCPU isn't already in VMX operation, i.e. effectively
+ * have lower priority than the VM-Fail above.
+ */
+ if (!nested_host_cr0_valid(vcpu, kvm_read_cr0(vcpu)) ||
+ !nested_host_cr4_valid(vcpu, kvm_read_cr4(vcpu))) {
+ kvm_inject_gp(vcpu, 0);
+ return 1;
+ }
+
if ((vmx->msr_ia32_feature_control & VMXON_NEEDED_FEATURES)
!= VMXON_NEEDED_FEATURES) {
kvm_inject_gp(vcpu, 0);
--
2.33.0

Re: [openEuler-22.03-LTS,v1,0/6] Add perf metricgroup support for HiSilicon hip08 platform
by patchwork bot 23 Apr '23
Your patch has been converted to a pull request; the pull request link is:
https://gitee.com/openeuler/kernel/pulls/609

[PATCH openEuler-22.03-LTS v1 0/6] Add perf metricgroup support for HiSilicon hip08 platform
by Wei Li 23 Apr '23
Add perf metricgroup support for HiSilicon hip08 platform, tested on Huawei D06 board.
John Garry (6):
perf metricgroup: Make find_metric() public with name change
perf test: Handle metric reuse in pmu-events parsing test
perf pmu: Add pmu_events_map__find() function to find the common PMU
map for the system
perf vendor events arm64: Add Hisi hip08 L1 metrics
perf vendor events arm64: Add Hisi hip08 L2 metrics
perf vendor events arm64: Add Hisi hip08 L3 metrics
tools/perf/arch/arm64/util/Build | 1 +
tools/perf/arch/arm64/util/pmu.c | 25 ++
.../arch/arm64/hisilicon/hip08/metrics.json | 233 ++++++++++++++++++
tools/perf/tests/pmu-events.c | 83 ++++++-
tools/perf/util/metricgroup.c | 11 +-
tools/perf/util/metricgroup.h | 3 +-
tools/perf/util/pmu.c | 5 +
tools/perf/util/pmu.h | 1 +
tools/perf/util/s390-sample-raw.c | 4 +-
9 files changed, 356 insertions(+), 10 deletions(-)
create mode 100644 tools/perf/arch/arm64/util/pmu.c
create mode 100644 tools/perf/pmu-events/arch/arm64/hisilicon/hip08/metrics.json
--
2.25.1

[PATCH openEuler-1.0-LTS] audit: fix a memleak caused by auditing load module
by Yongqiang Liu 23 Apr '23
From: Li RongQing <lirongqing(a)baidu.com>
mainline inclusion
from mainline-v5.2-rc1
commit 95e0b46fcebd7dbf6850dee96046e4c4ddc7f69c
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6X2LV
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
module.name is allocated unconditionally when auditing a module load,
but audit_log_start() can fail for other reasons, or audit_log_exit()
may not be called, in which case module.name is never freed.
Fix this by freeing module.name in audit_free_context and __audit_syscall_exit.
unreferenced object 0xffff88af90837d20 (size 8):
comm "modprobe", pid 1036, jiffies 4294704867 (age 3069.138s)
hex dump (first 8 bytes):
69 78 67 62 65 00 ff ff ixgbe...
backtrace:
[<0000000008da28fe>] __audit_log_kern_module+0x33/0x80
[<00000000c1491e61>] load_module+0x64f/0x3850
[<000000007fc9ae3f>] __do_sys_init_module+0x218/0x250
[<0000000000d4a478>] do_syscall_64+0x117/0x400
[<000000004924ded8>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[<000000007dc331dd>] 0xffffffffffffffff
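The fix follows the common free-and-NULL pattern: centralize the cleanup in
one helper and call it from every exit path. A minimal standalone C sketch of
the idea (illustrative names, not the kernel code):

#include <stdlib.h>
#include <string.h>

struct ctx {
	char *name;
};

/* Idempotent cleanup: safe to call from every exit path. */
static void ctx_free_name(struct ctx *c)
{
	free(c->name);
	c->name = NULL;	/* a second call is now a harmless no-op */
}

int main(void)
{
	struct ctx c = { .name = strdup("ixgbe") };

	ctx_free_name(&c);	/* normal exit path */
	ctx_free_name(&c);	/* duplicate/late path: no double free */
	return 0;
}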
Fixes: ca86cad7380e3 ("audit: log module name on init_module")
Signed-off-by: Zhang Yu <zhangyu31(a)baidu.com>
Signed-off-by: Li RongQing <lirongqing(a)baidu.com>
[PM: manual merge fixup in __audit_syscall_exit()]
Signed-off-by: Paul Moore <paul(a)paul-moore.com>
conflict:
kernel/auditsc.c
Signed-off-by: Yi Yang <yiyang13(a)huawei.com>
Reviewed-by: cuigaosheng <cuigaosheng1(a)huawei.com>
Reviewed-by: guozihua <guozihua(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
kernel/auditsc.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 1513873e23bd..57d30b75f7a2 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -881,6 +881,13 @@ static inline void audit_proctitle_free(struct audit_context *context)
context->proctitle.len = 0;
}
+static inline void audit_free_module(struct audit_context *context)
+{
+ if (context->type == AUDIT_KERN_MODULE) {
+ kfree(context->module.name);
+ context->module.name = NULL;
+ }
+}
static inline void audit_free_names(struct audit_context *context)
{
struct audit_names *n, *next;
@@ -964,6 +971,7 @@ int audit_alloc(struct task_struct *tsk)
static inline void audit_free_context(struct audit_context *context)
{
+ audit_free_module(context);
audit_free_names(context);
unroll_tree_refs(context, NULL, 0);
free_tree_refs(context);
@@ -1281,7 +1289,6 @@ static void show_special(struct audit_context *context, int *call_panic)
audit_log_format(ab, "name=");
if (context->module.name) {
audit_log_untrustedstring(ab, context->module.name);
- kfree(context->module.name);
} else
audit_log_format(ab, "(null)");
@@ -1583,6 +1590,7 @@ void __audit_syscall_exit(int success, long return_code)
if (!list_empty(&context->killed_trees))
audit_kill_trees(&context->killed_trees);
+ audit_free_module(context);
audit_free_names(context);
unroll_tree_refs(context, NULL, 0);
audit_free_aux(context);
--
2.25.1

[PATCH OLK-5.10 v1 0/6] Add perf metricgroup support for HiSilicon hip08 platform
by Wei Li 23 Apr '23
Add perf metricgroup support for HiSilicon hip08 platform, tested on Huawei D06 board.
John Garry (6):
perf metricgroup: Make find_metric() public with name change
perf test: Handle metric reuse in pmu-events parsing test
perf pmu: Add pmu_events_map__find() function to find the common PMU
map for the system
perf vendor events arm64: Add Hisi hip08 L1 metrics
perf vendor events arm64: Add Hisi hip08 L2 metrics
perf vendor events arm64: Add Hisi hip08 L3 metrics
tools/perf/arch/arm64/util/Build | 1 +
tools/perf/arch/arm64/util/pmu.c | 25 ++
.../arch/arm64/hisilicon/hip08/metrics.json | 233 ++++++++++++++++++
tools/perf/tests/pmu-events.c | 83 ++++++-
tools/perf/util/metricgroup.c | 11 +-
tools/perf/util/metricgroup.h | 3 +-
tools/perf/util/pmu.c | 5 +
tools/perf/util/pmu.h | 1 +
tools/perf/util/s390-sample-raw.c | 4 +-
9 files changed, 356 insertions(+), 10 deletions(-)
create mode 100644 tools/perf/arch/arm64/util/pmu.c
create mode 100644 tools/perf/pmu-events/arch/arm64/hisilicon/hip08/metrics.json
--
2.25.1

Re: [PATCH v3] mm: oom: move memcg_print_bad_task() out of mem_cgroup_scan_tasks()
by Zheng Zengkai 22 Apr '23
On 2023/4/17 21:12, Kefeng Wang wrote:
> + Zengkai
>
> Could you merge this patch?
+zhangjialin
Hi Kefeng and Kang Chen,
Is this patch the same as the one in PR
#582 (https://gitee.com/openeuler/kernel/pulls/582)?
If so, let's track it via the gitee PR.
Here are some notes about contributing to the openEuler kernel:
https://gitee.com/openeuler/kernel/issues/I6WKLA
Thank you!
>
> On 2023/4/17 19:13, Kang Chen wrote:
>> hulk inclusion
>> category: bugfix
>> bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6NYW4
>> CVE: NA
>>
>> --------------------------------
>>
>> raw call flow:
>>
>> oom_kill_process
>> -> mem_cgroup_scan_tasks(.., .., message)
>> -> memcg_print_bad_task(message, ..)
>>
>> message is "const char*" type, and incorrectly cast to
>> "oom_control*" type in memcg_print_bad_task.
>
> a blank line,
>
>> Fix it by moving memcg_print_bad_task out of mem_cgroup_scan_tasks
>> and call it in select_bad_process and dump_tasks. Furthermore,
>> use struct oom_control* directly and remove the useless parm `ret`.
>>
>
> Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
>
>> Signed-off-by: Kang Chen <void0red(a)hust.edu.cn>
>> ---
>> v3 -> v2: use type `struct oom_control *` directly and fix bugs
>> v2 -> v1: remove parm `ret` and create a memcg_print_bad_task stub
>>
>> mm/memcontrol.c | 16 +++++++++-------
>> mm/oom_kill.c | 14 ++++++++------
>> 2 files changed, 17 insertions(+), 13 deletions(-)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index 635cb8b65b86..6b4f70d090d6 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -1214,9 +1214,6 @@ int mem_cgroup_scan_tasks(struct mem_cgroup
>> *memcg,
>> break;
>> }
>> }
>> -#ifdef CONFIG_MEMCG_QOS
>> - memcg_print_bad_task(arg, ret);
>> -#endif
>> return ret;
>> }
>> @@ -4004,14 +4001,12 @@ bool memcg_low_priority_scan_tasks(int
>> (*fn)(struct task_struct *, void *),
>> return oc->chosen ? true : false;
>> }
>> -void memcg_print_bad_task(void *arg, int ret)
>> +void memcg_print_bad_task(struct oom_control *oc)
>> {
>> - struct oom_control *oc = arg;
>> -
>> if (!static_branch_likely(&memcg_qos_stat_key))
>> return;
>> - if (!ret && oc->chosen) {
>> + if (oc->chosen) {
>> struct mem_cgroup *memcg;
>> memcg = mem_cgroup_from_task(oc->chosen);
>> @@ -4042,6 +4037,13 @@ int sysctl_memcg_qos_handler(struct ctl_table
>> *table, int write,
>> return ret;
>> }
>> +
>> +#else
>> +
>> +void memcg_print_bad_task(struct oom_control *oc)
>> +{
>> +}
>> +
>> #endif
>> #ifdef CONFIG_NUMA
>> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>> index 0f77eb4c6644..9d595265bbf5 100644
>> --- a/mm/oom_kill.c
>> +++ b/mm/oom_kill.c
>> @@ -408,9 +408,10 @@ static void select_bad_process(struct
>> oom_control *oc)
>> {
>> oc->chosen_points = LONG_MIN;
>> - if (is_memcg_oom(oc))
>> - mem_cgroup_scan_tasks(oc->memcg, oom_evaluate_task, oc);
>> - else {
>> + if (is_memcg_oom(oc)) {
>> + if (!mem_cgroup_scan_tasks(oc->memcg, oom_evaluate_task, oc))
>> + memcg_print_bad_task(oc);
>> + } else {
>> struct task_struct *p;
>> #ifdef CONFIG_MEMCG_QOS
>> @@ -473,9 +474,10 @@ static void dump_tasks(struct oom_control *oc)
>> pr_info("Tasks state (memory values in pages):\n");
>> pr_info("[ pid ] uid tgid total_vm rss pgtables_bytes
>> swapents oom_score_adj name\n");
>> - if (is_memcg_oom(oc))
>> - mem_cgroup_scan_tasks(oc->memcg, dump_task, oc);
>> - else {
>> + if (is_memcg_oom(oc)) {
>> + if (!mem_cgroup_scan_tasks(oc->memcg, dump_task, oc))
>> + memcg_print_bad_task(oc);
>> + } else {
>> struct task_struct *p;
>> rcu_read_lock();
> .
>

[PATCH openEuler-1.0-LTS 0/6] efi: fix crash due to EFI runtime service page faults
by Ding Hui 20 Apr '23
For both arm64 and x86, handle and recover from page faults caused by
EFI runtime services, and disable any subsequent invocation. This avoids
crashing the whole system when running on buggy EFI firmware, producing
a log like this instead:
kernel: [Firmware Bug]: Unable to handle paging request in EFI runtime service
kernel: CPU: 54 PID: 8 Comm: kworker/u256:0 Kdump: loaded Tainted: G IOE 4.19.90-2112.8.0.0131.oe1.aarch64.debug #66
kernel: Hardware name: O.D.M FT-2500 Platform/T1DMFT-E4 , BIOS KL4.26.ODM.S.032.210904.R 09/04/21 13:28:40
kernel: Workqueue: efi_rts_wq efi_call_rts
kernel: Call trace:
kernel: dump_backtrace+0x0/0x170
kernel: show_stack+0x24/0x30
kernel: dump_stack+0xa4/0xe8
kernel: efi_runtime_fixup_exception+0x74/0x8c
kernel: __do_kernel_fault+0x8c/0x150
kernel: do_page_fault+0x78/0x4c8
kernel: do_translation_fault+0xa8/0xbc
kernel: do_mem_abort+0x50/0xe0
kernel: el1_da+0x20/0x94
kernel: 0x213f0c24
kernel: 0x213f0d64
kernel: 0x213f044c
kernel: 0x213f04b4
kernel: 0x213f0178
kernel: 0x212e0664
kernel: __efi_rt_asm_wrapper+0x50/0x6c
kernel: efi_call_rts+0x414/0x430
kernel: process_one_work+0x1f8/0x490
kernel: worker_thread+0x50/0x4b8
kernel: kthread+0x134/0x138
kernel: ret_from_fork+0x10/0x18
kernel: [Firmware Bug]: Synchronous exception occurred in EFI runtime service get_time()
kernel: rtc-efi rtc-efi: can't read time
kernel: efi: EFI Runtime Services are disabled!
Anders Roxell (1):
efi: Fix build error due to enum collision between efi.h and ima.h
Ard Biesheuvel (1):
arm64: efi: Recover from synchronous exceptions occurring in firmware
Sai Praneeth (2):
efi: Make efi_rts_work accessible to efi page fault handler
efi/x86: Handle page faults occurring while running EFI runtime
services
Sami Tolvanen (1):
arm64: efi: Restore register x18 if it was corrupted
Waiman Long (1):
efi: Fix debugobjects warning on 'efi_rts_work'
arch/arm64/include/asm/efi.h | 9 ++
arch/arm64/kernel/efi-rt-wrapper.S | 46 +++++++++-
arch/arm64/kernel/efi.c | 26 ++++++
arch/arm64/mm/fault.c | 4 +
arch/x86/include/asm/efi.h | 1 +
arch/x86/mm/fault.c | 9 ++
arch/x86/platform/efi/quirks.c | 78 +++++++++++++++++
drivers/firmware/efi/runtime-wrappers.c | 107 +++++++++---------------
include/linux/efi.h | 42 ++++++++++
9 files changed, 252 insertions(+), 70 deletions(-)
--
2.17.1

[PATCH openEuler-1.0-LTS] iommu/arm-smmu-v3: Fix UAF when handle evt during iommu group removing
by Ding Hui 20 Apr '23
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6WKM7
CVE: NA
--------
A use-after-free issue like the following:
[ 2257.819189] arm-smmu-v3 arm-smmu-v3.4.auto: EVTQ overflow detected -- events lost
[ 2257.819197] arm-smmu-v3 arm-smmu-v3.4.auto: event 0x10 received:
[ 2257.819199] arm-smmu-v3 arm-smmu-v3.4.auto: 0x0000820000000010
[ 2257.819201] arm-smmu-v3 arm-smmu-v3.4.auto: 0x0000020000000000
[ 2257.819202] arm-smmu-v3 arm-smmu-v3.4.auto: 0x00000000dfea7218
[ 2257.819206] iommu: Removing device 0000:82:00.0 from group 49
[ 2257.819207] arm-smmu-v3 arm-smmu-v3.4.auto: 0x00000000dfea7000
[ 2257.819211] arm-smmu-v3 arm-smmu-v3.4.auto: event 0x10 received:
[ 2257.819212] arm-smmu-v3 arm-smmu-v3.4.auto: 0x0000820000000010
[ 2257.819214] arm-smmu-v3 arm-smmu-v3.4.auto: 0x0000020000000000
[ 2257.819215] arm-smmu-v3 arm-smmu-v3.4.auto: 0x00000000dfea722c
[ 2257.819216] arm-smmu-v3 arm-smmu-v3.4.auto: 0x00000000dfea7000
[ 2257.819218] ==================================================================
[ 2257.819228] BUG: KASAN: use-after-free in iommu_report_device_fault+0x520/0x5c0
[ 2257.819230] Read of size 8 at addr ffffa02c516c1320 by task irq/63-arm-smmu/483
[ 2257.819231]
[ 2257.819235] CPU: 25 PID: 483 Comm: irq/63-arm-smmu Kdump: loaded Tainted: G OE 4.19.90-2205.3.0.0149.oe1.aarch64+debug #1
[ 2257.819236] Hardware name: Huawei S920S00/BC82AMDGK, BIOS 1.68 11/10/2020
[ 2257.819237] Call trace:
[ 2257.819240] dump_backtrace+0x0/0x320
[ 2257.819242] show_stack+0x24/0x30
[ 2257.819246] dump_stack+0xdc/0x128
[ 2257.819251] print_address_description+0x68/0x278
[ 2257.819253] kasan_report+0x1e4/0x308
[ 2257.819254] __asan_report_load8_noabort+0x30/0x40
[ 2257.819257] iommu_report_device_fault+0x520/0x5c0
[ 2257.819260] arm_smmu_handle_evt+0x300/0x428
[ 2257.819261] arm_smmu_evtq_thread+0x27c/0x460
[ 2257.819264] irq_thread_fn+0x88/0x140
[ 2257.819266] irq_thread+0x190/0x318
[ 2257.819268] kthread+0x2a4/0x320
[ 2257.819270] ret_from_fork+0x10/0x18
[ 2257.819271]
[ 2257.819273] Allocated by task 95166:
[ 2257.819275] kasan_kmalloc+0xd0/0x178
[ 2257.819277] kmem_cache_alloc_trace+0x100/0x210
[ 2257.819279] iommu_group_add_device+0x254/0xe18
[ 2257.819280] iommu_group_get_for_dev+0x198/0x480
[ 2257.819282] arm_smmu_add_device+0x424/0x988
[ 2257.819284] iort_iommu_configure+0x33c/0x5b8
[ 2257.819287] acpi_dma_configure+0x9c/0xf8
[ 2257.819289] pci_dma_configure+0x124/0x158
[ 2257.819291] dma_configure+0x5c/0x80
[ 2257.819294] really_probe+0xcc/0x920
[ 2257.819296] driver_probe_device+0x224/0x308
[ 2257.819298] __device_attach_driver+0x154/0x260
[ 2257.819299] bus_for_each_drv+0xe4/0x178
[ 2257.819301] __device_attach+0x1bc/0x2a8
[ 2257.819302] device_attach+0x24/0x30
[ 2257.819304] pci_bus_add_device+0x7c/0xe8
[ 2257.819305] pci_bus_add_devices+0x70/0x168
[ 2257.819307] pci_bus_add_devices+0x114/0x168
[ 2257.819308] pci_rescan_bus+0x38/0x48
[ 2257.819310] bus_rescan_store+0xc4/0xe8
[ 2257.819312] bus_attr_store+0x70/0x98
[ 2257.819314] sysfs_kf_write+0x104/0x170
[ 2257.819316] kernfs_fop_write+0x23c/0x430
[ 2257.819319] __vfs_write+0x7c/0xe8
[ 2257.819320] vfs_write+0x12c/0x3d0
[ 2257.819321] ksys_write+0xd4/0x1d8
[ 2257.819322] __arm64_sys_write+0x70/0xa0
[ 2257.819325] el0_svc_common+0xfc/0x278
[ 2257.819327] el0_svc_handler+0x50/0xc0
[ 2257.819329] el0_svc+0x8/0x1b0
[ 2257.819329]
[ 2257.819330] Freed by task 95166:
[ 2257.819332] __kasan_slab_free+0x114/0x200
[ 2257.819334] kasan_slab_free+0x10/0x18
[ 2257.819335] kfree+0x80/0x1f0
[ 2257.819337] iommu_group_remove_device+0x27c/0x560
[ 2257.819338] arm_smmu_remove_device+0xe8/0x190
[ 2257.819339] iommu_bus_notifier+0x134/0x248
[ 2257.819342] notifier_call_chain+0xb0/0x140
[ 2257.819343] blocking_notifier_call_chain+0x6c/0xd8
[ 2257.819344] device_del+0x578/0x940
[ 2257.819346] pci_remove_bus_device+0x114/0x290
[ 2257.819347] pci_stop_and_remove_bus_device_locked+0x2c/0x40
[ 2257.819349] remove_store+0xdc/0xe8
[ 2257.819352] dev_attr_store+0x60/0x80
[ 2257.819353] sysfs_kf_write+0x104/0x170
[ 2257.819354] kernfs_fop_write+0x23c/0x430
[ 2257.819355] __vfs_write+0x7c/0xe8
[ 2257.819357] vfs_write+0x12c/0x3d0
[ 2257.819358] ksys_write+0xd4/0x1d8
[ 2257.819359] __arm64_sys_write+0x70/0xa0
[ 2257.819361] el0_svc_common+0xfc/0x278
[ 2257.819362] el0_svc_handler+0x50/0xc0
[ 2257.819364] el0_svc+0x8/0x1b0
[ 2257.819364]
T0                                   T1
---------------------------------    ---------------------------------
|- arm_smmu_evtq_thread              |- arm_smmu_remove_device
  |- arm_smmu_handle_evt               |- iommu_group_remove_device
                                         |- kfree(dev->iommu_param)
    |- arm_smmu_find_master
    |- iommu_report_device_fault       |- arm_smmu_remove_master
      |- mutex_lock(
             &dev->iommu_param->lock)
           // UAF
Following upstream mainline commit 395ad89d11fd93f79a6b942e91fc409807a63c4b,
move arm_smmu_remove_master() before iommu_group_remove_device(), and hold
the mutex to protect finding the master and the subsequent handling.
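Condensed from the diff below, the lookup and the subsequent event handling
now share one streams_mutex critical section, so the master cannot be freed
between being found and being used (outline only, not the full function):
	mutex_lock(&smmu->streams_mutex);
	master = arm_smmu_find_master(smmu, sid);	/* now asserts the lock is held */
	if (!master) {
		ret = -EINVAL;
		goto out_unlock;
	}
	/* ... report the fault and send the page response ... */
out_unlock:
	mutex_unlock(&smmu->streams_mutex);
	return ret;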
Fixes: b525f0a6f9b0 ("iommu/arm-smmu-v3: Add stall support for platform devices")
Signed-off-by: Ding Hui <dinghui(a)sangfor.com.cn>
---
drivers/iommu/arm-smmu-v3.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 3309ae6ebc0b..05cb92da6836 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1742,7 +1742,8 @@ arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid)
struct arm_smmu_stream *stream;
struct arm_smmu_master_data *master = NULL;
- mutex_lock(&smmu->streams_mutex);
+ lockdep_assert_held(&smmu->streams_mutex);
+
node = smmu->streams.rb_node;
while (node) {
stream = rb_entry(node, struct arm_smmu_stream, node);
@@ -1755,7 +1756,6 @@ arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid)
break;
}
}
- mutex_unlock(&smmu->streams_mutex);
return master;
}
@@ -1791,9 +1791,12 @@ static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt)
if (evt[1] & EVTQ_1_S2)
return -EFAULT;
+ mutex_lock(&smmu->streams_mutex);
master = arm_smmu_find_master(smmu, sid);
- if (!master)
- return -EINVAL;
+ if (!master) {
+ ret = -EINVAL;
+ goto out_unlock;
+ }
/*
* The domain is valid until the fault returns, because detach() flushes
@@ -1833,6 +1836,8 @@ static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt)
arm_smmu_page_response(master->dev, &resp);
}
+out_unlock:
+ mutex_unlock(&smmu->streams_mutex);
return ret;
}
@@ -2935,8 +2940,8 @@ static void arm_smmu_remove_device(struct device *dev)
iopf_queue_remove_device(dev);
if (master->ste.assigned)
arm_smmu_detach_dev(dev);
- iommu_group_remove_device(dev);
arm_smmu_remove_master(smmu, master);
+ iommu_group_remove_device(dev);
iommu_device_unlink(&smmu->iommu, dev);
kfree(master);
iommu_fwspec_free(dev);
--
2.17.1
Hello!
The Kernel SIG invites you to a Zoom meeting (auto-recorded) to be held at 2023-04-21 16:00.
Subject: ODD2023 Kernel SIG open working meeting
Agenda:
1. Experience in building a kernel contribution team at the HUST School of Cyber Science and Engineering
2. Challenges and technologies in the Internet domain
3. Mining and locating ext4 filesystem issues (consistency, stability)
4. Deployment scenarios and future prospects of programmable kernel technology
Further topic proposals are welcome.
Meeting link: https://us06web.zoom.us/j/88277699752?pwd=VTB3ZjhFR1pKd0syc3NpMDVDcVRvZz09
Meeting minutes: https://etherpad.openeuler.org/p/ODD2023_Kernel_SIG
Note: You are advised to change your participant name after joining the meeting, or use your ID at gitee.com.
More information: https://openeuler.org/en/

[PATCH openEuler-1.0-LTS 1/2] x86/speculation: Allow enabling STIBP with legacy IBRS
by Yongqiang Liu 19 Apr '23
From: KP Singh <kpsingh(a)kernel.org>
stable inclusion
from stable-v4.19.276
commit 10543fb3c9b019e45e2045f08f46fdf526add593
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6V7TU
CVE: CVE-2023-1998
--------------------------------
commit 6921ed9049bc7457f66c1596c5b78aec0dae4a9d upstream.
When plain IBRS is enabled (not enhanced IBRS), the logic in
spectre_v2_user_select_mitigation() determines that STIBP is not needed.
The IBRS bit implicitly protects against cross-thread branch target
injection. However, with legacy IBRS, the IBRS bit is cleared on
returning to userspace for performance reasons which leaves userspace
threads vulnerable to cross-thread branch target injection against which
STIBP protects.
Exclude IBRS from the spectre_v2_in_ibrs_mode() check to allow for
enabling STIBP (through seccomp/prctl() by default or always-on, if
selected by spectre_v2_user kernel cmdline parameter).
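In short (a summary derived from this commit message, not an exhaustive
mitigation matrix):
mode          IBRS bit while in userspace    STIBP still needed on SMT?
eIBRS         stays set                      no, cross-thread protection is implicit
legacy IBRS   cleared on return to user      yes, userspace threads are exposed
Hence the split below: spectre_v2_in_eibrs_mode() continues to suppress
STIBP, while plain SPECTRE_V2_IBRS no longer does.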
[ bp: Massage. ]
Fixes: 7c693f54c873 ("x86/speculation: Add spectre_v2=ibrs option to support Kernel IBRS")
Reported-by: José Oliveira <joseloliveira11(a)gmail.com>
Reported-by: Rodrigo Branco <rodrigo(a)kernelhacking.com>
Signed-off-by: KP Singh <kpsingh(a)kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20230220120127.1975241-1-kpsingh@kernel.org
Link: https://lore.kernel.org/r/20230221184908.2349578-1-kpsingh@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Wei Li <liwei391(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
arch/x86/kernel/cpu/bugs.c | 25 ++++++++++++++++++-------
1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 836000481438..a7becbe9a890 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -973,14 +973,18 @@ spectre_v2_parse_user_cmdline(void)
return SPECTRE_V2_USER_CMD_AUTO;
}
-static inline bool spectre_v2_in_ibrs_mode(enum spectre_v2_mitigation mode)
+static inline bool spectre_v2_in_eibrs_mode(enum spectre_v2_mitigation mode)
{
- return mode == SPECTRE_V2_IBRS ||
- mode == SPECTRE_V2_EIBRS ||
+ return mode == SPECTRE_V2_EIBRS ||
mode == SPECTRE_V2_EIBRS_RETPOLINE ||
mode == SPECTRE_V2_EIBRS_LFENCE;
}
+static inline bool spectre_v2_in_ibrs_mode(enum spectre_v2_mitigation mode)
+{
+ return spectre_v2_in_eibrs_mode(mode) || mode == SPECTRE_V2_IBRS;
+}
+
static void __init
spectre_v2_user_select_mitigation(void)
{
@@ -1043,12 +1047,19 @@ spectre_v2_user_select_mitigation(void)
}
/*
- * If no STIBP, IBRS or enhanced IBRS is enabled, or SMT impossible,
- * STIBP is not required.
+ * If no STIBP, enhanced IBRS is enabled, or SMT impossible, STIBP
+ * is not required.
+ *
+ * Enhanced IBRS also protects against cross-thread branch target
+ * injection in user-mode as the IBRS bit remains always set which
+ * implicitly enables cross-thread protections. However, in legacy IBRS
+ * mode, the IBRS bit is set only on kernel entry and cleared on return
+ * to userspace. This disables the implicit cross-thread protection,
+ * so allow for STIBP to be selected in that case.
*/
if (!boot_cpu_has(X86_FEATURE_STIBP) ||
!smt_possible ||
- spectre_v2_in_ibrs_mode(spectre_v2_enabled))
+ spectre_v2_in_eibrs_mode(spectre_v2_enabled))
return;
/*
@@ -2084,7 +2095,7 @@ static ssize_t mmio_stale_data_show_state(char *buf)
static char *stibp_state(void)
{
- if (spectre_v2_in_ibrs_mode(spectre_v2_enabled))
+ if (spectre_v2_in_eibrs_mode(spectre_v2_enabled))
return "";
switch (spectre_v2_user_stibp) {
--
2.25.1
Pull new CVEs:
CVE-2023-1829
CVE-2022-36280
CVE-2022-1015
CVE-2023-1989
CVE-2023-30456
CVE-2023-1990
xfs bugfixes from Long Li and yangerkun

Backport 5.10.150 LTS patches from upstream.
Conflicts:
Already merged(19):
f039b43cbaea inet: fully convert sk->sk_rx_dst to RCU rules
45c33966759e mm: hugetlb: fix UAF in hugetlb_handle_userfault
c378c479c517 io_uring/af_unix: defer registered files gc to io_uring release
67cbc8865a66 io_uring: correct pinned_vm accounting
904f881b5736 arm64: topology: fix possible overflow in amu_fie_setup()
dbcca76435a6 HID: roccat: Fix use-after-free in roccat_read()
484400d433ca r8152: Rate limit overflow messages
d88b88514ef2 crypto: hisilicon/zip - fix mismatch in get/set sgl_sge_nr
657de36c72f5 arm64: ftrace: fix module PLTs with mcount
29f50bcf0f8b net: mvpp2: fix mvpp2 debugfs leak
6cc0e2afc6a1 bnx2x: fix potential memory leak in bnx2x_tpa_stop()
2a1d03632085 mISDN: fix use-after-free bugs in l1oip timer handlers
0cf6c09dafee ring-buffer: Fix race between reset page and reading page
fbb0e601bd51 ext4: ext4_read_bh_lock() should submit IO if the buffer isn't uptodate
483831ad0440 ext4: fix check for block being out of directory size
f34ab9516276 ext4: fix null-ptr-deref in ext4_write_info
e50472949604 fbdev: smscufx: Fix use-after-free in ufx_ops_open()
7d551b7d6114 block: fix inflight statistics of part0
6b7ae4a904a4 quota: Check next/prev free block number after reading from quota file
Context conflict(1):
c13d0d2f5a48 usb: host: xhci-plat: suspend and resume clocks
Total patches: 389 - 19 = 370

18 Apr '23
From: KP Singh <kpsingh(a)kernel.org>
stable inclusion
from stable-v5.10.173
commit abfed855f05863d292de2d0ebab4656791bab9c8
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6V7TU
CVE: CVE-2023-1998
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 6921ed9049bc7457f66c1596c5b78aec0dae4a9d upstream.
When plain IBRS is enabled (not enhanced IBRS), the logic in
spectre_v2_user_select_mitigation() determines that STIBP is not needed.
The IBRS bit implicitly protects against cross-thread branch target
injection. However, with legacy IBRS, the IBRS bit is cleared on
returning to userspace for performance reasons which leaves userspace
threads vulnerable to cross-thread branch target injection against which
STIBP protects.
Exclude IBRS from the spectre_v2_in_ibrs_mode() check to allow for
enabling STIBP (through seccomp/prctl() by default or always-on, if
selected by spectre_v2_user kernel cmdline parameter).
[ bp: Massage. ]
Fixes: 7c693f54c873 ("x86/speculation: Add spectre_v2=ibrs option to support Kernel IBRS")
Reported-by: José Oliveira <joseloliveira11(a)gmail.com>
Reported-by: Rodrigo Branco <rodrigo(a)kernelhacking.com>
Signed-off-by: KP Singh <kpsingh(a)kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20230220120127.1975241-1-kpsingh@kernel.org
Link: https://lore.kernel.org/r/20230221184908.2349578-1-kpsingh@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Wei Li <liwei391(a)huawei.com>
---
arch/x86/kernel/cpu/bugs.c | 25 ++++++++++++++++++-------
1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 25f3fad210e0..0dd4f8a8d821 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1051,14 +1051,18 @@ spectre_v2_parse_user_cmdline(void)
return SPECTRE_V2_USER_CMD_AUTO;
}
-static inline bool spectre_v2_in_ibrs_mode(enum spectre_v2_mitigation mode)
+static inline bool spectre_v2_in_eibrs_mode(enum spectre_v2_mitigation mode)
{
- return mode == SPECTRE_V2_IBRS ||
- mode == SPECTRE_V2_EIBRS ||
+ return mode == SPECTRE_V2_EIBRS ||
mode == SPECTRE_V2_EIBRS_RETPOLINE ||
mode == SPECTRE_V2_EIBRS_LFENCE;
}
+static inline bool spectre_v2_in_ibrs_mode(enum spectre_v2_mitigation mode)
+{
+ return spectre_v2_in_eibrs_mode(mode) || mode == SPECTRE_V2_IBRS;
+}
+
static void __init
spectre_v2_user_select_mitigation(void)
{
@@ -1121,12 +1125,19 @@ spectre_v2_user_select_mitigation(void)
}
/*
- * If no STIBP, IBRS or enhanced IBRS is enabled, or SMT impossible,
- * STIBP is not required.
+ * If no STIBP, enhanced IBRS is enabled, or SMT impossible, STIBP
+ * is not required.
+ *
+ * Enhanced IBRS also protects against cross-thread branch target
+ * injection in user-mode as the IBRS bit remains always set which
+ * implicitly enables cross-thread protections. However, in legacy IBRS
+ * mode, the IBRS bit is set only on kernel entry and cleared on return
+ * to userspace. This disables the implicit cross-thread protection,
+ * so allow for STIBP to be selected in that case.
*/
if (!boot_cpu_has(X86_FEATURE_STIBP) ||
!smt_possible ||
- spectre_v2_in_ibrs_mode(spectre_v2_enabled))
+ spectre_v2_in_eibrs_mode(spectre_v2_enabled))
return;
/*
@@ -2220,7 +2231,7 @@ static ssize_t mmio_stale_data_show_state(char *buf)
static char *stibp_state(void)
{
- if (spectre_v2_in_ibrs_mode(spectre_v2_enabled))
+ if (spectre_v2_in_eibrs_mode(spectre_v2_enabled))
return "";
switch (spectre_v2_user_stibp) {
--
2.25.1

[PATCH v3] mm: oom: move memcg_print_bad_task() out of mem_cgroup_scan_tasks()
by Kang Chen 17 Apr '23
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6NYW4
CVE: NA
--------------------------------
raw call flow:
oom_kill_process
-> mem_cgroup_scan_tasks(.., .., message)
-> memcg_print_bad_task(message, ..)
message is "const char*" type, and incorrectly cast to
"oom_control*" type in memcg_print_bad_task.
Fix it by moving memcg_print_bad_task out of mem_cgroup_scan_tasks
and call it in select_bad_process and dump_tasks. Furthermore,
use struct oom_control* directly and remove the useless parm `ret`.
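The bug class in a minimal standalone C sketch (illustrative names, not the
kernel code): once the printer takes the concrete type instead of a void *
cookie, a stray "const char *" argument becomes a compile-time error rather
than a silent reinterpretation.

#include <stdio.h>

struct oom_ctl {
	const char *message;
	int chosen;
};

/* After the fix: a typed parameter instead of a blind cast from void *. */
static void print_bad_task(struct oom_ctl *oc)
{
	if (oc->chosen)
		printf("bad task chosen (%s)\n", oc->message);
}

int main(void)
{
	struct oom_ctl oc = { .message = "out of memory", .chosen = 1 };

	print_bad_task(&oc);
	/* print_bad_task("oom"); would no longer compile. */
	return 0;
}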
Signed-off-by: Kang Chen <void0red(a)hust.edu.cn>
---
v3 -> v2: use type `struct oom_control *` directly and fix bugs
v2 -> v1: remove parm `ret` and create a memcg_print_bad_task stub
mm/memcontrol.c | 16 +++++++++-------
mm/oom_kill.c | 14 ++++++++------
2 files changed, 17 insertions(+), 13 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 635cb8b65b86..6b4f70d090d6 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1214,9 +1214,6 @@ int mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
break;
}
}
-#ifdef CONFIG_MEMCG_QOS
- memcg_print_bad_task(arg, ret);
-#endif
return ret;
}
@@ -4004,14 +4001,12 @@ bool memcg_low_priority_scan_tasks(int (*fn)(struct task_struct *, void *),
return oc->chosen ? true : false;
}
-void memcg_print_bad_task(void *arg, int ret)
+void memcg_print_bad_task(struct oom_control *oc)
{
- struct oom_control *oc = arg;
-
if (!static_branch_likely(&memcg_qos_stat_key))
return;
- if (!ret && oc->chosen) {
+ if (oc->chosen) {
struct mem_cgroup *memcg;
memcg = mem_cgroup_from_task(oc->chosen);
@@ -4042,6 +4037,13 @@ int sysctl_memcg_qos_handler(struct ctl_table *table, int write,
return ret;
}
+
+#else
+
+void memcg_print_bad_task(struct oom_control *oc)
+{
+}
+
#endif
#ifdef CONFIG_NUMA
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 0f77eb4c6644..9d595265bbf5 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -408,9 +408,10 @@ static void select_bad_process(struct oom_control *oc)
{
oc->chosen_points = LONG_MIN;
- if (is_memcg_oom(oc))
- mem_cgroup_scan_tasks(oc->memcg, oom_evaluate_task, oc);
- else {
+ if (is_memcg_oom(oc)) {
+ if (!mem_cgroup_scan_tasks(oc->memcg, oom_evaluate_task, oc))
+ memcg_print_bad_task(oc);
+ } else {
struct task_struct *p;
#ifdef CONFIG_MEMCG_QOS
@@ -473,9 +474,10 @@ static void dump_tasks(struct oom_control *oc)
pr_info("Tasks state (memory values in pages):\n");
pr_info("[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name\n");
- if (is_memcg_oom(oc))
- mem_cgroup_scan_tasks(oc->memcg, dump_task, oc);
- else {
+ if (is_memcg_oom(oc)) {
+ if (!mem_cgroup_scan_tasks(oc->memcg, dump_task, oc))
+ memcg_print_bad_task(oc);
+ } else {
struct task_struct *p;
rcu_read_lock();
--
2.34.1

[PATCH openEuler-1.0-LTS] nfc: st-nci: Fix use after free bug in ndlc_remove due to race condition
by Yongqiang Liu 17 Apr '23
From: Zheng Wang <zyytlz.wz(a)163.com>
mainline inclusion
from mainline-v6.3-rc3
commit 5000fe6c27827a61d8250a7e4a1d26c3298ef4f6
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6UW64
CVE: CVE-2023-1990
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?i…
--------------------------------
This bug affects both st_nci_i2c_remove and st_nci_spi_remove.
Take st_nci_i2c_remove as an example.
In st_nci_i2c_probe, ndlc_probe is called and binds &ndlc->sm_work
to llt_ndlc_sm_work.
When ndlc_recv or a timeout handler runs, it eventually calls
schedule_work to start the work.
When st_nci_i2c_remove is called to remove the driver, the following
sequence is possible:
CPU0                        CPU1
                            |llt_ndlc_sm_work
st_nci_i2c_remove           |
  ndlc_remove               |
    st_nci_remove           |
      nci_free_device       |
        kfree(ndev)         |
        //free ndlc->ndev   |
                            |llt_ndlc_rcv_queue
                            |  nci_recv_frame
                            |  //use ndlc->ndev
Fix it by finishing the work before cleanup in ndlc_remove.
Fixes: 35630df68d60 ("NFC: st21nfcb: Add driver for STMicroelectronics ST21NFCB NFC chip")
Signed-off-by: Zheng Wang <zyytlz.wz(a)163.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Link: https://lore.kernel.org/r/20230312160837.2040857-1-zyytlz.wz@163.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Wei Li <liwei391(a)huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/nfc/st-nci/ndlc.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/nfc/st-nci/ndlc.c b/drivers/nfc/st-nci/ndlc.c
index f26d938d240f..12d73f9dbe9f 100644
--- a/drivers/nfc/st-nci/ndlc.c
+++ b/drivers/nfc/st-nci/ndlc.c
@@ -297,13 +297,15 @@ EXPORT_SYMBOL(ndlc_probe);
void ndlc_remove(struct llt_ndlc *ndlc)
{
- st_nci_remove(ndlc->ndev);
-
/* cancel timers */
del_timer_sync(&ndlc->t1_timer);
del_timer_sync(&ndlc->t2_timer);
ndlc->t2_active = false;
ndlc->t1_active = false;
+ /* cancel work */
+ cancel_work_sync(&ndlc->sm_work);
+
+ st_nci_remove(ndlc->ndev);
skb_queue_purge(&ndlc->rcv_q);
skb_queue_purge(&ndlc->send_q);
--
2.25.1

[PATCH v2] mm: oom: move memcg_print_bad_task() out of mem_cgroup_scan_tasks()
by Kang Chen 17 Apr '23
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6NYW4
CVE: NA
--------------------------------
raw call flow:
oom_kill_process
-> mem_cgroup_scan_tasks(.., .., message)
-> memcg_print_bad_task(message, ..)
message is "const char*" type, and incorrectly cast to
"oom_control*" type in memcg_print_bad_task.
Fix it by moving memcg_print_bad_task out of mem_cgroup_scan_tasks
and call it in select_bad_process and dump_tasks.
Furthermore, remove the useless parm `ret`.
Signed-off-by: Kang Chen <void0red(a)hust.edu.cn>
---
v2 -> v1: remove parm `ret` and create a memcg_print_bad_task stub
mm/memcontrol.c | 14 +++++++++-----
mm/oom_kill.c | 14 ++++++++------
2 files changed, 17 insertions(+), 11 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 635cb8b65b86..e87a96a6c712 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1214,9 +1214,6 @@ int mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
break;
}
}
-#ifdef CONFIG_MEMCG_QOS
- memcg_print_bad_task(arg, ret);
-#endif
return ret;
}
@@ -4004,14 +4001,14 @@ bool memcg_low_priority_scan_tasks(int (*fn)(struct task_struct *, void *),
return oc->chosen ? true : false;
}
-void memcg_print_bad_task(void *arg, int ret)
+void memcg_print_bad_task(void *arg)
{
struct oom_control *oc = arg;
if (!static_branch_likely(&memcg_qos_stat_key))
return;
- if (!ret && oc->chosen) {
+ if (oc->chosen) {
struct mem_cgroup *memcg;
memcg = mem_cgroup_from_task(oc->chosen);
@@ -4042,6 +4039,13 @@ int sysctl_memcg_qos_handler(struct ctl_table *table, int write,
return ret;
}
+
+#else
+
+void memcg_print_bad_task(void *arg)
+{
+}
+
#endif
#ifdef CONFIG_NUMA
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 0f77eb4c6644..03df1a9093b5 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -408,9 +408,10 @@ static void select_bad_process(struct oom_control *oc)
{
oc->chosen_points = LONG_MIN;
- if (is_memcg_oom(oc))
- mem_cgroup_scan_tasks(oc->memcg, oom_evaluate_task, oc);
- else {
+ if (is_memcg_oom(oc)) {
+ if (mem_cgroup_scan_tasks(oc->memcg, oom_evaluate_task, oc))
+ memcg_print_bad_task(oc);
+ } else {
struct task_struct *p;
#ifdef CONFIG_MEMCG_QOS
@@ -473,9 +474,10 @@ static void dump_tasks(struct oom_control *oc)
pr_info("Tasks state (memory values in pages):\n");
pr_info("[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name\n");
- if (is_memcg_oom(oc))
- mem_cgroup_scan_tasks(oc->memcg, dump_task, oc);
- else {
+ if (is_memcg_oom(oc)) {
+ if (mem_cgroup_scan_tasks(oc->memcg, dump_task, oc))
+ memcg_print_bad_task(oc);
+ } else {
struct task_struct *p;
rcu_read_lock();
--
2.34.1

[PATCH openEuler-1.0-LTS] Bluetooth: btsdio: fix use after free bug in btsdio_remove due to race condition
by Yongqiang Liu 17 Apr '23
From: Zheng Wang <zyytlz.wz(a)163.com>
mainline inclusion
from mainline-v6.3-rc7
commit 73f7b171b7c09139eb3c6a5677c200dc1be5f318
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6UW68
CVE: CVE-2023-1989
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
---------------------------
In btsdio_probe, data->work is bound to btsdio_work. The work is
started in btsdio_send_frame.
If btsdio_remove runs with an unfinished work, there may be a race
condition in which hdev is freed but still used in btsdio_work. Fix it by
canceling the work before doing cleanup in btsdio_remove.
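The same discipline in a minimal standalone sketch, with pthreads standing in
for the kernel workqueue (all names illustrative): wait for the asynchronous
worker to finish before freeing the state it uses.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct dev_state {
	int id;
};

static void *worker(void *arg)
{
	struct dev_state *st = arg;

	printf("worker using dev %d\n", st->id);	/* must not race with free */
	return NULL;
}

int main(void)
{
	struct dev_state *st = malloc(sizeof(*st));
	pthread_t t;

	st->id = 1;
	pthread_create(&t, NULL, worker, st);

	/* Remove path: like cancel_work_sync(), wait for the worker first ... */
	pthread_join(t, NULL);
	/* ... and only then free the state it was using. */
	free(st);
	return 0;
}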
Signed-off-by: Zheng Wang <zyytlz.wz(a)163.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Signed-off-by: Liu Jian <liujian56(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/bluetooth/btsdio.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/bluetooth/btsdio.c b/drivers/bluetooth/btsdio.c
index 20142bc77554..1325b1df4a8e 100644
--- a/drivers/bluetooth/btsdio.c
+++ b/drivers/bluetooth/btsdio.c
@@ -356,6 +356,7 @@ static void btsdio_remove(struct sdio_func *func)
if (!data)
return;
+ cancel_work_sync(&data->work);
hdev = data->hdev;
sdio_set_drvdata(func, NULL);
--
2.25.1

From: Juan Zhou <zhoujuan51(a)h-partners.com>
Yixing Liu (2):
Update kernel headers
libhns: Add support for SVE Direct WQE
CMakeLists.txt | 1 +
buildlib/RDMA_EnableCStd.cmake | 17 +++++++++++++++++
kernel-headers/rdma/hns-abi.h | 1 +
providers/hns/CMakeLists.txt | 5 +++++
providers/hns/hns_roce_u_hw_v2.c | 21 ++++++++++++++++++++-
5 files changed, 44 insertions(+), 1 deletion(-)
--
2.30.0
uniontech inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6TN56
CVE: NA
--------------------------------
smatch report:
fs/eulerfs/namei.c:118 eufs_lookup() error: 'inode' dereferencing possible ERR_PTR()
Fix it by using the ino value read above (de->inode) in eufs_err instead of dereferencing the error pointer.
Signed-off-by: Kang Chen <void0red(a)hust.edu.cn>
---
I think we need to use IS_ERR to handle all errors;
the error pointer must not reach the BUG_ON line later.
v4 -> v3: fix some bugs and return EIO/ENOMEM anyway.
v3 -> v2: use IS_ERR to handle all kind err
v2 -> v1: use correct string format
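For context, a userspace re-creation of the kernel error-pointer idiom this
fix relies on (illustrative only; the macros mirror include/linux/err.h):

#include <stdio.h>

#define MAX_ERRNO	4095
#define ERR_PTR(err)	((void *)(long)(err))
#define PTR_ERR(ptr)	((long)(ptr))
#define IS_ERR(ptr)	((unsigned long)(ptr) >= (unsigned long)-MAX_ERRNO)

/* Stand-in for eufs_iget(): returns a valid pointer or an encoded errno. */
static void *fake_iget(int fail)
{
	static int obj;

	return fail ? ERR_PTR(-12 /* ENOMEM */) : &obj;
}

int main(void)
{
	void *inode = fake_iget(1);

	if (IS_ERR(inode)) {
		/* Log with data we already hold; never dereference inode here. */
		fprintf(stderr, "iget failed, err %ld\n", PTR_ERR(inode));
		return 1;
	}
	return 0;
}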
fs/eulerfs/namei.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/eulerfs/namei.c b/fs/eulerfs/namei.c
index e4c6c36575f2..7064de644b34 100644
--- a/fs/eulerfs/namei.c
+++ b/fs/eulerfs/namei.c
@@ -114,9 +114,11 @@ static struct dentry *eufs_lookup(struct inode *dir, struct dentry *dentry,
goto not_found;
inode = eufs_iget(dir->i_sb, s2p(dir->i_sb, de->inode));
- if (inode == ERR_PTR(-ESTALE)) {
- eufs_err(dir->i_sb, "deleted inode referenced: 0x%lx",
- inode->i_ino);
+ if (IS_ERR(inode)) {
+ eufs_err(dir->i_sb, "eufs_iget failed ino 0x%llx err %d\n",
+ le64_to_cpu(de->inode), PTR_ERR(inode));
+ if (inode == ERR_PTR(-ENOMEM))
+ return ERR_PTR(-ENOMEM);
return ERR_PTR(-EIO);
}
not_found:
--
2.34.1

[PATCH] mm: move memcg_print_bad_task out of mem_cgroup_scan_tasks to avoid wrong type cast
by Kang Chen 16 Apr '23
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6NYW4
CVE: NA
--------------------------------
raw call flow:
oom_kill_process
-> mem_cgroup_scan_tasks(.., .., message)
-> memcg_print_bad_task(message, ..)
message is "const char*" type, and incorrectly cast to
"oom_control*" type in memcg_print_bad_task.
Fix it by moving memcg_print_bad_task out of mem_cgroup_scan_tasks
and call it in select_bad_process and dump_tasks.
Signed-off-by: Kang Chen <void0red(a)hust.edu.cn>
---
mm/memcontrol.c | 3 ---
mm/oom_kill.c | 24 ++++++++++++++++++------
2 files changed, 18 insertions(+), 9 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 635cb8b65b86..8e0d5d484153 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1214,9 +1214,6 @@ int mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
break;
}
}
-#ifdef CONFIG_MEMCG_QOS
- memcg_print_bad_task(arg, ret);
-#endif
return ret;
}
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 0f77eb4c6644..1e3ba16dd748 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -406,11 +406,17 @@ static int oom_evaluate_task(struct task_struct *task, void *arg)
*/
static void select_bad_process(struct oom_control *oc)
{
+ int ret;
oc->chosen_points = LONG_MIN;
- if (is_memcg_oom(oc))
- mem_cgroup_scan_tasks(oc->memcg, oom_evaluate_task, oc);
- else {
+ if (is_memcg_oom(oc)) {
+ ret = mem_cgroup_scan_tasks(oc->memcg, oom_evaluate_task, oc);
+
+#ifdef CONFIG_MEMCG_QOS
+ memcg_print_bad_task(oc, ret);
+#endif
+
+ } else {
struct task_struct *p;
#ifdef CONFIG_MEMCG_QOS
@@ -470,12 +476,18 @@ static int dump_task(struct task_struct *p, void *arg)
*/
static void dump_tasks(struct oom_control *oc)
{
+ int ret;
pr_info("Tasks state (memory values in pages):\n");
pr_info("[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name\n");
- if (is_memcg_oom(oc))
- mem_cgroup_scan_tasks(oc->memcg, dump_task, oc);
- else {
+ if (is_memcg_oom(oc)) {
+ ret = mem_cgroup_scan_tasks(oc->memcg, dump_task, oc);
+
+#ifdef CONFIG_MEMCG_QOS
+ memcg_print_bad_task(oc, ret);
+#endif
+
+ } else {
struct task_struct *p;
rcu_read_lock();
--
2.34.1
From: xiabing <xiabing12(a)h-partners.com>
Jie Zhan (3):
scsi: libsas: Add smp_ata_check_ready_type()
scsi: hisi_sas: Fix SATA devices missing issue during I_T nexus reset
scsi: libsas: Do not export sas_ata_wait_after_reset()
Martin K. Petersen (1):
scsi: sd: Reorganize DIF/DIX code to avoid calling revalidate twice
Xingui Yang (1):
scsi: sd: Update DIX config every time sd_revalidate_disk() is called
Yihang Li (3):
scsi: hisi_sas: Set a port invalid only if there are no devices
attached when refreshing port id
scsi: hisi_sas: Exit suspending state when usage count is greater than
0
scsi: hisi_sas: Ensure all enabled PHYs up during controller reset
drivers/scsi/hisi_sas/hisi_sas_main.c | 42 ++++++++++++---
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 73 ++++++++++++++++++++------
drivers/scsi/libsas/sas_ata.c | 28 +++++++++-
drivers/scsi/libsas/sas_expander.c | 4 +-
drivers/scsi/libsas/sas_internal.h | 2 +
drivers/scsi/sd.c | 59 +++++++++++----------
drivers/scsi/sd_dif.c | 12 ++---
include/scsi/sas_ata.h | 7 ++-
8 files changed, 161 insertions(+), 66 deletions(-)
--
2.30.0
Backport 5.10.151 LTS patches from upstream.
Makefile | 3 +++
scripts/link-vmlinux.sh | 2 +-
scripts/pahole-flags.sh | 21 +++++++++++++++++++++
3 files changed, 25 insertions(+), 1 deletion(-)
create mode 100755 scripts/pahole-flags.sh
--
2.25.1

13 Apr '23
Backport 5.10.150 LTS patches from upstream.
Conflicts:
Already merged(19):
f039b43cbaea inet: fully convert sk->sk_rx_dst to RCU rules
45c33966759e mm: hugetlb: fix UAF in hugetlb_handle_userfault
c378c479c517 io_uring/af_unix: defer registered files gc to io_uring release
67cbc8865a66 io_uring: correct pinned_vm accounting
904f881b5736 arm64: topology: fix possible overflow in amu_fie_setup()
dbcca76435a6 HID: roccat: Fix use-after-free in roccat_read()
484400d433ca r8152: Rate limit overflow messages
d88b88514ef2 crypto: hisilicon/zip - fix mismatch in get/set sgl_sge_nr
657de36c72f5 arm64: ftrace: fix module PLTs with mcount
29f50bcf0f8b net: mvpp2: fix mvpp2 debugfs leak
6cc0e2afc6a1 bnx2x: fix potential memory leak in bnx2x_tpa_stop()
2a1d03632085 mISDN: fix use-after-free bugs in l1oip timer handlers
0cf6c09dafee ring-buffer: Fix race between reset page and reading page
fbb0e601bd51 ext4: ext4_read_bh_lock() should submit IO if the buffer isn't uptodate
483831ad0440 ext4: fix check for block being out of directory size
f34ab9516276 ext4: fix null-ptr-deref in ext4_write_info
e50472949604 fbdev: smscufx: Fix use-after-free in ufx_ops_open()
7d551b7d6114 block: fix inflight statistics of part0
6b7ae4a904a4 quota: Check next/prev free block number after reading from quota file
Context conflict(1):
c13d0d2f5a48 usb: host: xhci-plat: suspend and resume clocks
Total patches: 389 - 19 = 370
Documentation/ABI/testing/sysfs-bus-iio | 2 +-
Makefile | 6 +-
arch/arm/Kconfig | 1 -
arch/arm/boot/dts/armada-385-turris-omnia.dts | 4 +-
arch/arm/boot/dts/exynos4412-midas.dtsi | 2 +-
arch/arm/boot/dts/exynos4412-origen.dts | 2 +-
arch/arm/boot/dts/imx6dl.dtsi | 3 +
arch/arm/boot/dts/imx6q.dtsi | 3 +
arch/arm/boot/dts/imx6qp.dtsi | 6 +
arch/arm/boot/dts/imx6sl.dtsi | 3 +
arch/arm/boot/dts/imx6sll.dtsi | 3 +
arch/arm/boot/dts/imx6sx.dtsi | 6 +
arch/arm/boot/dts/imx7d-sdb.dts | 7 +-
arch/arm/boot/dts/kirkwood-lsxl.dtsi | 16 +-
arch/arm/mm/dump.c | 2 +-
arch/arm/mm/mmu.c | 4 +
.../boot/dts/freescale/imx8mq-librem5.dtsi | 1 +
arch/ia64/mm/numa.c | 1 +
arch/mips/bcm47xx/prom.c | 4 +-
arch/mips/sgi-ip27/ip27-xtalk.c | 74 +++--
arch/powerpc/Makefile | 2 +-
arch/powerpc/boot/Makefile | 1 +
.../boot/dts/fsl/e500v1_power_isa.dtsi | 51 ++++
arch/powerpc/boot/dts/fsl/mpc8540ads.dts | 2 +-
arch/powerpc/boot/dts/fsl/mpc8541cds.dts | 2 +-
arch/powerpc/boot/dts/fsl/mpc8555cds.dts | 2 +-
arch/powerpc/boot/dts/fsl/mpc8560ads.dts | 2 +-
arch/powerpc/kernel/pci_dn.c | 1 +
arch/powerpc/math-emu/math_efp.c | 1 +
arch/powerpc/platforms/powernv/opal.c | 1 +
arch/powerpc/sysdev/fsl_msi.c | 2 +
arch/riscv/Makefile | 2 +
arch/riscv/include/asm/io.h | 16 +-
arch/riscv/kernel/sys_riscv.c | 3 -
arch/riscv/mm/fault.c | 3 +-
arch/sh/include/asm/sections.h | 2 +-
arch/sh/kernel/machvec.c | 10 +-
arch/um/kernel/um_arch.c | 2 +-
arch/x86/include/asm/hyperv-tlfs.h | 4 +-
arch/x86/include/asm/microcode.h | 1 +
arch/x86/kernel/cpu/feat_ctl.c | 2 +-
arch/x86/kernel/cpu/microcode/amd.c | 3 +-
arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 12 +-
arch/x86/kvm/emulate.c | 2 +-
arch/x86/kvm/vmx/nested.c | 30 +-
arch/x86/kvm/vmx/vmx.c | 12 +-
arch/x86/xen/enlighten_pv.c | 3 +-
block/blk-throttle.c | 8 +-
crypto/akcipher.c | 8 +
drivers/acpi/acpi_video.c | 16 ++
drivers/acpi/apei/ghes.c | 2 +-
drivers/ata/libahci_platform.c | 14 +-
drivers/block/nbd.c | 6 +-
drivers/bluetooth/btusb.c | 47 +++-
drivers/bluetooth/hci_ldisc.c | 7 +-
drivers/bluetooth/hci_serdev.c | 10 +-
drivers/char/hw_random/imx-rngc.c | 14 +-
drivers/clk/baikal-t1/ccu-div.c | 65 +++++
drivers/clk/baikal-t1/ccu-div.h | 10 +
drivers/clk/baikal-t1/clk-ccu-div.c | 26 +-
drivers/clk/bcm/clk-bcm2835.c | 8 +-
drivers/clk/berlin/bg2.c | 5 +-
drivers/clk/berlin/bg2q.c | 6 +-
drivers/clk/clk-ast2600.c | 2 +-
drivers/clk/clk-oxnas.c | 6 +-
drivers/clk/clk-qoriq.c | 10 +-
drivers/clk/clk-versaclock5.c | 2 +-
drivers/clk/mediatek/clk-mt8183-mfgcfg.c | 6 +-
drivers/clk/meson/meson-aoclk.c | 5 +-
drivers/clk/meson/meson-eeclk.c | 5 +-
drivers/clk/meson/meson8b.c | 5 +-
drivers/clk/qcom/apss-ipq6018.c | 2 +-
drivers/clk/sprd/common.c | 9 +-
drivers/clk/tegra/clk-tegra114.c | 1 +
drivers/clk/tegra/clk-tegra20.c | 1 +
drivers/clk/tegra/clk-tegra210.c | 1 +
drivers/clk/ti/clk-dra7-atl.c | 9 +-
drivers/clk/zynqmp/clkc.c | 7 +
drivers/clk/zynqmp/pll.c | 31 +--
drivers/crypto/cavium/cpt/cptpf_main.c | 6 +-
drivers/crypto/ccp/ccp-dmaengine.c | 6 +-
drivers/crypto/inside-secure/safexcel_hash.c | 8 +-
.../crypto/marvell/octeontx/otx_cptpf_ucode.c | 18 +-
drivers/crypto/qat/qat_common/qat_algs.c | 109 +++++---
drivers/crypto/qat/qat_common/qat_crypto.h | 24 ++
drivers/crypto/sahara.c | 18 +-
drivers/dma-buf/udmabuf.c | 9 +-
drivers/dma/hisi_dma.c | 28 +-
drivers/dma/ioat/dma.c | 6 +-
drivers/firmware/efi/libstub/fdt.c | 8 -
drivers/firmware/google/gsmi.c | 9 +
drivers/fpga/dfl.c | 2 +-
drivers/fsi/fsi-core.c | 3 +
drivers/gpu/drm/Kconfig | 1 +
.../gpu/drm/amd/amdgpu/amdgpu_connectors.c | 7 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 2 -
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 5 -
drivers/gpu/drm/amd/amdgpu/soc15.c | 25 ++
.../gpu/drm/amd/display/dc/calcs/bw_fixed.c | 6 +-
drivers/gpu/drm/amd/display/dc/core/dc.c | 16 +-
drivers/gpu/drm/amd/display/dc/dc_stream.h | 6 +-
.../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 35 +--
.../amd/display/dc/dcn10/dcn10_hw_sequencer.h | 3 +-
.../gpu/drm/amd/display/dc/inc/hw_sequencer.h | 8 +-
drivers/gpu/drm/bridge/adv7511/adv7511.h | 5 +-
drivers/gpu/drm/bridge/adv7511/adv7511_cec.c | 4 +-
drivers/gpu/drm/bridge/lontium-lt9611.c | 3 +-
.../bridge/megachips-stdpxxxx-ge-b850v3-fw.c | 4 +-
drivers/gpu/drm/bridge/parade-ps8640.c | 4 +-
drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 13 +-
drivers/gpu/drm/drm_bridge.c | 4 +-
drivers/gpu/drm/drm_dp_helper.c | 9 -
drivers/gpu/drm/drm_dp_mst_topology.c | 6 +-
drivers/gpu/drm/drm_ioctl.c | 8 +-
drivers/gpu/drm/drm_mipi_dsi.c | 1 +
.../gpu/drm/drm_panel_orientation_quirks.c | 6 +
drivers/gpu/drm/i915/intel_pm.c | 8 +-
drivers/gpu/drm/meson/meson_drv.c | 8 +
drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 12 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_vbif.c | 29 +-
drivers/gpu/drm/msm/dp/dp_catalog.c | 2 +-
drivers/gpu/drm/nouveau/nouveau_bo.c | 4 +-
drivers/gpu/drm/nouveau/nouveau_connector.c | 3 +-
drivers/gpu/drm/nouveau/nouveau_prime.c | 1 -
drivers/gpu/drm/omapdrm/dss/dss.c | 3 +
drivers/gpu/drm/pl111/pl111_versatile.c | 1 +
drivers/gpu/drm/udl/udl_modeset.c | 3 -
drivers/gpu/drm/vc4/vc4_vec.c | 4 +-
drivers/gpu/drm/virtio/virtgpu_vq.c | 2 +-
drivers/hid/hid-multitouch.c | 8 +-
drivers/hsi/controllers/omap_ssi_core.c | 1 +
drivers/hsi/controllers/omap_ssi_port.c | 8 +-
drivers/hwmon/gsc-hwmon.c | 1 +
drivers/i2c/busses/i2c-mlxbf.c | 44 ++-
drivers/iio/adc/ad7923.c | 4 +-
drivers/iio/adc/at91-sama5d2_adc.c | 28 +-
drivers/iio/adc/ltc2497.c | 13 +
drivers/iio/dac/ad5593r.c | 46 +--
drivers/iio/inkern.c | 6 +-
drivers/iio/pressure/dps310.c | 262 +++++++++++-------
drivers/infiniband/core/cm.c | 14 +-
drivers/infiniband/core/uverbs_cmd.c | 5 +-
drivers/infiniband/core/verbs.c | 2 +
drivers/infiniband/hw/hns/hns_roce_mr.c | 1 -
drivers/infiniband/hw/mlx4/mr.c | 1 -
drivers/infiniband/sw/rxe/rxe_qp.c | 10 +-
drivers/infiniband/sw/siw/siw_qp_rx.c | 27 +-
drivers/iommu/omap-iommu-debug.c | 6 +-
drivers/leds/leds-lm3601x.c | 2 -
drivers/mailbox/bcm-flexrm-mailbox.c | 8 +-
drivers/md/bcache/writeback.c | 73 +++--
drivers/md/raid0.c | 2 +-
drivers/md/raid5.c | 14 +-
drivers/media/pci/cx88/cx88-vbi.c | 9 +-
drivers/media/pci/cx88/cx88-video.c | 43 +--
drivers/media/platform/exynos4-is/fimc-is.c | 1 +
drivers/media/platform/xilinx/xilinx-vipp.c | 9 +-
drivers/memory/of_memory.c | 2 +
drivers/memory/pl353-smc.c | 1 +
drivers/mfd/fsl-imx25-tsadc.c | 34 ++-
drivers/mfd/intel_soc_pmic_core.c | 1 +
drivers/mfd/lp8788-irq.c | 3 +
drivers/mfd/lp8788.c | 12 +-
drivers/mfd/sm501.c | 7 +-
drivers/misc/ocxl/file.c | 2 +
drivers/mmc/host/au1xmmc.c | 3 +-
drivers/mmc/host/sdhci-msm.c | 1 +
drivers/mmc/host/sdhci-sprd.c | 2 +-
drivers/mmc/host/wmt-sdmmc.c | 5 +-
drivers/mtd/devices/docg3.c | 7 +-
drivers/mtd/nand/raw/atmel/nand-controller.c | 1 +
drivers/mtd/nand/raw/fsl_elbc_nand.c | 28 +-
drivers/mtd/nand/raw/meson_nand.c | 4 +-
drivers/net/can/usb/kvaser_usb/kvaser_usb.h | 2 +
.../net/can/usb/kvaser_usb/kvaser_usb_core.c | 3 +-
.../net/can/usb/kvaser_usb/kvaser_usb_hydra.c | 2 +-
.../net/can/usb/kvaser_usb/kvaser_usb_leaf.c | 79 ++++++
.../net/ethernet/freescale/fs_enet/mac-fec.c | 2 +-
drivers/net/wireless/ath/ath10k/mac.c | 54 ++--
drivers/net/wireless/ath/ath11k/mac.c | 25 +-
drivers/net/wireless/ath/ath9k/htc_hst.c | 43 ++-
.../broadcom/brcm80211/brcmfmac/core.c | 3 +-
.../broadcom/brcm80211/brcmfmac/pno.c | 12 +-
.../net/wireless/ralink/rt2x00/rt2800lib.c | 34 ++-
.../wireless/realtek/rtl8xxxu/rtl8xxxu_core.c | 75 ++++-
drivers/nvme/host/core.c | 3 +-
drivers/nvme/host/pci.c | 3 +-
drivers/nvme/target/tcp.c | 11 +-
drivers/pci/setup-res.c | 11 +
drivers/phy/qualcomm/phy-qcom-usb-hsic.c | 6 +-
drivers/platform/chrome/chromeos_laptop.c | 24 +-
drivers/platform/chrome/cros_ec.c | 8 +-
drivers/platform/chrome/cros_ec_chardev.c | 3 +
drivers/platform/chrome/cros_ec_proto.c | 32 +++
drivers/platform/x86/msi-laptop.c | 14 +-
drivers/power/supply/adp5061.c | 6 +-
drivers/powercap/intel_rapl_common.c | 4 +-
drivers/regulator/core.c | 2 +-
drivers/regulator/qcom_rpm-regulator.c | 24 +-
drivers/scsi/3w-9xxx.c | 2 +-
drivers/scsi/iscsi_tcp.c | 73 +++--
drivers/scsi/iscsi_tcp.h | 2 +
drivers/scsi/libsas/sas_expander.c | 2 +-
drivers/scsi/qedf/qedf_main.c | 21 ++
drivers/soc/qcom/smem_state.c | 3 +-
drivers/soc/qcom/smsm.c | 20 +-
drivers/soc/tegra/Kconfig | 1 -
drivers/soundwire/cadence_master.c | 9 +-
drivers/soundwire/intel.c | 1 -
drivers/spi/spi-dw-bt1.c | 4 +-
drivers/spi/spi-meson-spicc.c | 6 +-
drivers/spi/spi-mt7621.c | 8 +-
drivers/spi/spi-omap-100k.c | 1 +
drivers/spi/spi-qup.c | 21 +-
drivers/spi/spi-s3c64xx.c | 9 +
drivers/spi/spi.c | 2 +
drivers/spmi/spmi-pmic-arb.c | 13 +-
drivers/staging/greybus/audio_helper.c | 11 -
drivers/staging/media/meson/vdec/vdec_hevc.c | 6 +-
drivers/staging/media/sunxi/cedrus/cedrus.c | 4 +-
drivers/staging/rtl8723bs/core/rtw_cmd.c | 16 +-
drivers/staging/vt6655/device_main.c | 8 +-
drivers/thermal/intel/intel_powerclamp.c | 4 +-
drivers/thermal/qcom/tsens-v0_1.c | 2 +-
drivers/thunderbolt/switch.c | 24 ++
drivers/thunderbolt/tb.h | 1 +
drivers/thunderbolt/tb_regs.h | 1 +
drivers/thunderbolt/usb4.c | 20 ++
drivers/tty/serial/8250/8250_core.c | 19 +-
drivers/tty/serial/8250/8250_port.c | 15 +-
drivers/tty/serial/fsl_lpuart.c | 2 +
drivers/tty/serial/jsm/jsm_driver.c | 3 +-
drivers/tty/serial/xilinx_uartps.c | 2 +
drivers/usb/common/common.c | 102 ++++++-
drivers/usb/common/debug.c | 78 +++++-
drivers/usb/core/devices.c | 21 +-
drivers/usb/core/endpoint.c | 35 +--
drivers/usb/core/quirks.c | 4 +
drivers/usb/gadget/function/f_printer.c | 12 +-
drivers/usb/host/xhci-mem.c | 7 +-
drivers/usb/host/xhci-plat.c | 18 +-
drivers/usb/host/xhci.c | 3 +-
drivers/usb/host/xhci.h | 1 +
drivers/usb/misc/idmouse.c | 8 +-
drivers/usb/musb/musb_gadget.c | 3 +
drivers/usb/storage/unusual_devs.h | 6 -
drivers/vhost/vsock.c | 2 +-
drivers/video/fbdev/stifb.c | 2 +-
fs/btrfs/qgroup.c | 15 +
fs/btrfs/scrub.c | 36 +++
fs/cifs/file.c | 9 +
fs/cifs/smb2pdu.c | 7 +-
fs/dlm/ast.c | 6 +-
fs/dlm/lock.c | 16 +-
fs/ext4/fast_commit.c | 40 +--
fs/ext4/file.c | 6 +
fs/ext4/inode.c | 14 +-
fs/ext4/resize.c | 2 +-
fs/ext4/super.c | 1 +
fs/f2fs/checkpoint.c | 23 +-
fs/f2fs/data.c | 4 +-
fs/f2fs/extent_cache.c | 3 +-
fs/f2fs/f2fs.h | 27 +-
fs/f2fs/gc.c | 10 +-
fs/f2fs/recovery.c | 23 +-
fs/f2fs/segment.c | 47 ++--
fs/f2fs/super.c | 4 +-
fs/jbd2/commit.c | 2 +-
fs/jbd2/journal.c | 10 +-
fs/jbd2/recovery.c | 1 +
fs/jbd2/transaction.c | 6 +-
fs/nfsd/nfs4recover.c | 4 +-
fs/nfsd/nfs4state.c | 5 +
fs/nfsd/nfs4xdr.c | 2 +-
fs/userfaultfd.c | 4 +-
include/linux/ata.h | 39 +--
include/linux/dynamic_debug.h | 11 +-
include/linux/iova.h | 2 +-
include/linux/once.h | 28 ++
include/linux/ring_buffer.h | 2 +-
include/linux/serial_8250.h | 1 +
include/linux/tcp.h | 2 +-
include/linux/usb/ch9.h | 62 +----
include/net/ieee802154_netdev.h | 12 +-
include/net/tcp.h | 5 +-
include/uapi/linux/usb/ch9.h | 13 +
kernel/bpf/btf.c | 2 +-
kernel/bpf/syscall.c | 2 +
kernel/cgroup/cpuset.c | 18 +-
kernel/gcov/gcc_4_7.c | 18 +-
kernel/livepatch/transition.c | 18 +-
kernel/rcu/tasks.h | 2 +-
kernel/rcu/tree.c | 17 +-
kernel/trace/ftrace.c | 8 +-
kernel/trace/kprobe_event_gen_test.c | 49 +++-
kernel/trace/ring_buffer.c | 54 +++-
kernel/trace/trace.c | 23 ++
lib/dynamic_debug.c | 45 +--
lib/once.c | 30 ++
mm/mmap.c | 5 +-
net/bluetooth/hci_core.c | 34 ++-
net/bluetooth/hci_sysfs.c | 3 +
net/bluetooth/l2cap_core.c | 17 +-
net/can/bcm.c | 7 +-
net/core/stream.c | 3 +-
net/ieee802154/socket.c | 4 +
net/ipv4/inet_hashtables.c | 4 +-
net/ipv4/netfilter/nft_fib_ipv4.c | 3 +
net/ipv4/tcp.c | 16 +-
net/ipv4/tcp_output.c | 19 +-
net/ipv6/netfilter/nft_fib_ipv6.c | 6 +-
net/mac80211/cfg.c | 3 -
net/openvswitch/datapath.c | 18 +-
net/rds/tcp.c | 2 +-
net/sctp/auth.c | 18 +-
net/vmw_vsock/virtio_transport_common.c | 2 +-
net/xfrm/xfrm_ipcomp.c | 1 +
scripts/Kbuild.include | 23 +-
scripts/package/mkspec | 4 +-
scripts/selinux/install_policy.sh | 2 +-
security/Kconfig.hardening | 63 +++--
sound/core/pcm_dmaengine.c | 8 +-
sound/core/rawmidi.c | 2 -
sound/core/sound_oss.c | 13 +-
sound/pci/hda/hda_beep.c | 15 +-
sound/pci/hda/hda_beep.h | 1 +
sound/pci/hda/patch_hdmi.c | 6 -
sound/pci/hda/patch_realtek.c | 11 +-
sound/pci/hda/patch_sigmatel.c | 25 +-
sound/soc/codecs/da7219.c | 5 +-
sound/soc/codecs/mt6660.c | 8 +-
sound/soc/codecs/tas2764.c | 78 ++----
sound/soc/codecs/wcd9335.c | 2 +-
sound/soc/codecs/wcd934x.c | 2 +-
sound/soc/codecs/wm5102.c | 6 +-
sound/soc/codecs/wm5110.c | 6 +-
sound/soc/codecs/wm8997.c | 6 +-
sound/soc/fsl/eukrea-tlv320.c | 8 +-
sound/soc/sh/rcar/ctu.c | 6 +-
sound/soc/sh/rcar/dvc.c | 6 +-
sound/soc/sh/rcar/mix.c | 6 +-
sound/soc/sh/rcar/src.c | 5 +-
sound/soc/sh/rcar/ssi.c | 4 +-
sound/soc/sof/sof-pci-dev.c | 2 +-
sound/usb/endpoint.c | 6 +-
tools/bpf/bpftool/btf_dumper.c | 2 +-
tools/bpf/bpftool/main.c | 10 +
tools/lib/bpf/xsk.c | 6 +-
tools/objtool/elf.c | 7 +-
tools/perf/util/intel-pt.c | 9 +-
.../arm64/signal/testcases/testcases.c | 2 +-
tools/testing/selftests/tpm2/tpm2.py | 4 +
353 files changed, 3016 insertions(+), 1399 deletions(-)
create mode 100644 arch/powerpc/boot/dts/fsl/e500v1_power_isa.dtsi
--
2.25.1
*** BLURB HERE ***
zhoujiadong (2):
net/hinic3: Huawei Interface Card driver first commit
net/hinic3: Huawei Interface Card hw driver first commit
drivers/net/ethernet/huawei/Kconfig | 1 +
drivers/net/ethernet/huawei/Makefile | 1 +
drivers/net/ethernet/huawei/hinic3/Kconfig | 15 +
drivers/net/ethernet/huawei/hinic3/Makefile | 45 +
.../ethernet/huawei/hinic3/cfg_mgt_comm_pub.h | 212 ++
.../ethernet/huawei/hinic3/comm_cmdq_intf.h | 235 ++
.../net/ethernet/huawei/hinic3/comm_defs.h | 102 +
.../ethernet/huawei/hinic3/comm_msg_intf.h | 663 +++++
.../ethernet/huawei/hinic3/hinic3_comm_cmd.h | 181 ++
.../ethernet/huawei/hinic3/hinic3_common.h | 125 +
.../net/ethernet/huawei/hinic3/hinic3_crm.h | 1166 +++++++++
.../net/ethernet/huawei/hinic3/hinic3_dbg.c | 983 ++++++++
.../net/ethernet/huawei/hinic3/hinic3_dcb.c | 405 +++
.../net/ethernet/huawei/hinic3/hinic3_dcb.h | 78 +
.../ethernet/huawei/hinic3/hinic3_ethtool.c | 1331 ++++++++++
.../huawei/hinic3/hinic3_ethtool_stats.c | 1233 ++++++++++
.../ethernet/huawei/hinic3/hinic3_filter.c | 483 ++++
.../net/ethernet/huawei/hinic3/hinic3_hw.h | 832 +++++++
.../net/ethernet/huawei/hinic3/hinic3_irq.c | 189 ++
.../net/ethernet/huawei/hinic3/hinic3_lld.h | 205 ++
.../ethernet/huawei/hinic3/hinic3_mag_cfg.c | 953 ++++++++
.../net/ethernet/huawei/hinic3/hinic3_main.c | 1125 +++++++++
.../huawei/hinic3/hinic3_mgmt_interface.h | 1245 ++++++++++
.../net/ethernet/huawei/hinic3/hinic3_mt.h | 665 +++++
.../huawei/hinic3/hinic3_netdev_ops.c | 1975 +++++++++++++++
.../net/ethernet/huawei/hinic3/hinic3_nic.h | 183 ++
.../ethernet/huawei/hinic3/hinic3_nic_cfg.c | 1608 ++++++++++++
.../ethernet/huawei/hinic3/hinic3_nic_cfg.h | 621 +++++
.../huawei/hinic3/hinic3_nic_cfg_vf.c | 638 +++++
.../ethernet/huawei/hinic3/hinic3_nic_cmd.h | 162 ++
.../ethernet/huawei/hinic3/hinic3_nic_dbg.c | 146 ++
.../ethernet/huawei/hinic3/hinic3_nic_dbg.h | 21 +
.../ethernet/huawei/hinic3/hinic3_nic_dev.h | 387 +++
.../ethernet/huawei/hinic3/hinic3_nic_event.c | 580 +++++
.../ethernet/huawei/hinic3/hinic3_nic_io.c | 1122 +++++++++
.../ethernet/huawei/hinic3/hinic3_nic_io.h | 326 +++
.../ethernet/huawei/hinic3/hinic3_nic_prof.c | 47 +
.../ethernet/huawei/hinic3/hinic3_nic_prof.h | 60 +
.../ethernet/huawei/hinic3/hinic3_nic_qp.h | 385 +++
.../ethernet/huawei/hinic3/hinic3_ntuple.c | 907 +++++++
.../ethernet/huawei/hinic3/hinic3_prepare.sh | 262 ++
.../ethernet/huawei/hinic3/hinic3_profile.h | 147 ++
.../net/ethernet/huawei/hinic3/hinic3_rss.c | 978 ++++++++
.../net/ethernet/huawei/hinic3/hinic3_rss.h | 100 +
.../ethernet/huawei/hinic3/hinic3_rss_cfg.c | 384 +++
.../net/ethernet/huawei/hinic3/hinic3_rx.c | 1344 ++++++++++
.../net/ethernet/huawei/hinic3/hinic3_rx.h | 155 ++
.../ethernet/huawei/hinic3/hinic3_srv_nic.h | 213 ++
.../net/ethernet/huawei/hinic3/hinic3_tx.c | 1016 ++++++++
.../net/ethernet/huawei/hinic3/hinic3_tx.h | 159 ++
.../net/ethernet/huawei/hinic3/hinic3_wq.h | 134 +
.../huawei/hinic3/hw/hinic3_api_cmd.c | 1211 +++++++++
.../huawei/hinic3/hw/hinic3_api_cmd.h | 286 +++
.../ethernet/huawei/hinic3/hw/hinic3_cmdq.c | 1543 ++++++++++++
.../ethernet/huawei/hinic3/hw/hinic3_cmdq.h | 204 ++
.../ethernet/huawei/hinic3/hw/hinic3_common.c | 93 +
.../ethernet/huawei/hinic3/hw/hinic3_csr.h | 187 ++
.../huawei/hinic3/hw/hinic3_dev_mgmt.c | 803 ++++++
.../huawei/hinic3/hw/hinic3_dev_mgmt.h | 105 +
.../huawei/hinic3/hw/hinic3_devlink.c | 431 ++++
.../huawei/hinic3/hw/hinic3_devlink.h | 149 ++
.../ethernet/huawei/hinic3/hw/hinic3_eqs.c | 1385 +++++++++++
.../ethernet/huawei/hinic3/hw/hinic3_eqs.h | 165 ++
.../ethernet/huawei/hinic3/hw/hinic3_hw_api.c | 453 ++++
.../ethernet/huawei/hinic3/hw/hinic3_hw_api.h | 141 ++
.../ethernet/huawei/hinic3/hw/hinic3_hw_cfg.c | 1480 +++++++++++
.../ethernet/huawei/hinic3/hw/hinic3_hw_cfg.h | 332 +++
.../huawei/hinic3/hw/hinic3_hw_comm.c | 1540 ++++++++++++
.../huawei/hinic3/hw/hinic3_hw_comm.h | 51 +
.../ethernet/huawei/hinic3/hw/hinic3_hw_mt.c | 599 +++++
.../ethernet/huawei/hinic3/hw/hinic3_hw_mt.h | 49 +
.../ethernet/huawei/hinic3/hw/hinic3_hwdev.c | 2141 ++++++++++++++++
.../ethernet/huawei/hinic3/hw/hinic3_hwdev.h | 175 ++
.../ethernet/huawei/hinic3/hw/hinic3_hwif.c | 994 ++++++++
.../ethernet/huawei/hinic3/hw/hinic3_hwif.h | 113 +
.../ethernet/huawei/hinic3/hw/hinic3_lld.c | 1413 +++++++++++
.../ethernet/huawei/hinic3/hw/hinic3_mbox.c | 1842 ++++++++++++++
.../ethernet/huawei/hinic3/hw/hinic3_mbox.h | 267 ++
.../ethernet/huawei/hinic3/hw/hinic3_mgmt.c | 1515 ++++++++++++
.../ethernet/huawei/hinic3/hw/hinic3_mgmt.h | 179 ++
.../huawei/hinic3/hw/hinic3_nictool.c | 974 ++++++++
.../huawei/hinic3/hw/hinic3_nictool.h | 35 +
.../huawei/hinic3/hw/hinic3_pci_id_tbl.h | 15 +
.../huawei/hinic3/hw/hinic3_prof_adap.c | 44 +
.../huawei/hinic3/hw/hinic3_prof_adap.h | 111 +
.../ethernet/huawei/hinic3/hw/hinic3_sm_lt.h | 160 ++
.../ethernet/huawei/hinic3/hw/hinic3_sml_lt.c | 160 ++
.../ethernet/huawei/hinic3/hw/hinic3_sriov.c | 267 ++
.../ethernet/huawei/hinic3/hw/hinic3_sriov.h | 35 +
.../net/ethernet/huawei/hinic3/hw/hinic3_wq.c | 159 ++
.../huawei/hinic3/hw/ossl_knl_linux.c | 533 ++++
drivers/net/ethernet/huawei/hinic3/mag_cmd.h | 879 +++++++
.../ethernet/huawei/hinic3/mgmt_msg_base.h | 27 +
.../net/ethernet/huawei/hinic3/nic_cfg_comm.h | 66 +
drivers/net/ethernet/huawei/hinic3/ossl_knl.h | 38 +
.../ethernet/huawei/hinic3/ossl_knl_linux.h | 2178 +++++++++++++++++
96 files changed, 52060 insertions(+)
create mode 100644 drivers/net/ethernet/huawei/hinic3/Kconfig
create mode 100644 drivers/net/ethernet/huawei/hinic3/Makefile
create mode 100644 drivers/net/ethernet/huawei/hinic3/cfg_mgt_comm_pub.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/comm_cmdq_intf.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/comm_defs.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/comm_msg_intf.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_comm_cmd.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_common.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_crm.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_dbg.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_dcb.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_dcb.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_ethtool.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_ethtool_stats.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_filter.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_hw.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_irq.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_lld.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_mag_cfg.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_main.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_mgmt_interface.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_mt.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_netdev_ops.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_cfg.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_cfg.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_cfg_vf.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_cmd.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_dbg.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_dbg.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_dev.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_event.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_io.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_io.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_prof.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_prof.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_nic_qp.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_ntuple.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_prepare.sh
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_profile.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_rss.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_rss.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_rss_cfg.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_rx.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_rx.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_srv_nic.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_tx.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_tx.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hinic3_wq.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_api_cmd.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_api_cmd.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_cmdq.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_cmdq.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_common.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_csr.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_dev_mgmt.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_dev_mgmt.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_devlink.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_devlink.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_eqs.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_eqs.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_api.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_api.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_cfg.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_cfg.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_comm.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_comm.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_mt.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hw_mt.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hwdev.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hwdev.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hwif.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_hwif.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_lld.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_mbox.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_mbox.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_mgmt.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_mgmt.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_nictool.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_nictool.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_pci_id_tbl.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_prof_adap.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_prof_adap.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_sm_lt.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_sml_lt.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_sriov.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_sriov.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/hinic3_wq.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/hw/ossl_knl_linux.c
create mode 100644 drivers/net/ethernet/huawei/hinic3/mag_cmd.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/mgmt_msg_base.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/nic_cfg_comm.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/ossl_knl.h
create mode 100644 drivers/net/ethernet/huawei/hinic3/ossl_knl_linux.h
--
2.24.0
Backport 5.10.150 LTS patches from upstream.
Conflicts:
Already merged (18):
f039b43cbaea inet: fully convert sk->sk_rx_dst to RCU rules
45c33966759e mm: hugetlb: fix UAF in hugetlb_handle_userfault
67cbc8865a66 io_uring: correct pinned_vm accounting
904f881b5736 arm64: topology: fix possible overflow in amu_fie_setup()
dbcca76435a6 HID: roccat: Fix use-after-free in roccat_read()
484400d433ca r8152: Rate limit overflow messages
d88b88514ef2 crypto: hisilicon/zip - fix mismatch in get/set sgl_sge_nr
657de36c72f5 arm64: ftrace: fix module PLTs with mcount
29f50bcf0f8b net: mvpp2: fix mvpp2 debugfs leak
6cc0e2afc6a1 bnx2x: fix potential memory leak in bnx2x_tpa_stop()
2a1d03632085 mISDN: fix use-after-free bugs in l1oip timer handlers
0cf6c09dafee ring-buffer: Fix race between reset page and reading page
fbb0e601bd51 ext4: ext4_read_bh_lock() should submit IO if the buffer isn't uptodate
483831ad0440 ext4: fix check for block being out of directory size
f34ab9516276 ext4: fix null-ptr-deref in ext4_write_info
e50472949604 fbdev: smscufx: Fix use-after-free in ufx_ops_open()
7d551b7d6114 block: fix inflight statistics of part0
6b7ae4a904a4 quota: Check next/prev free block number after reading from quota file
Context conflict (1):
c13d0d2f5a48 usb: host: xhci-plat: suspend and resume clocks
Total patches: 389 - 18 = 371
Adrian Hunter (1):
perf intel-pt: Fix segfault in intel_pt_print_info() with uClibc
Adrián Larumbe (1):
drm/meson: explicitly remove aggregate driver at module unload time
Albert Briscoe (1):
usb: gadget: function: fix dangling pnp_string in f_printer.c
Alex Deucher (1):
Revert "drm/amdgpu: make sure to init common IP before gmc"
Alexander Aring (4):
fs: dlm: fix race between test_bit() and queue_work()
fs: dlm: handle -EBUSY first in lock arg validation
net: ieee802154: return -EINVAL for unknown addr type
Revert "net/ieee802154: reject zero-sized raw_sendmsg()"
Alexander Coffin (1):
wifi: brcmfmac: fix use-after-free bug in brcmf_netdev_start_xmit()
Alexander Stein (6):
ARM: dts: imx6q: add missing properties for sram
ARM: dts: imx6dl: add missing properties for sram
ARM: dts: imx6qp: add missing properties for sram
ARM: dts: imx6sl: add missing properties for sram
ARM: dts: imx6sll: add missing properties for sram
ARM: dts: imx6sx: add missing properties for sram
Alvin Šipraga (1):
drm: bridge: adv7511: fix CEC power down control register offset
Andreas Pape (1):
ALSA: dmaengine: increment buffer pointer atomically
Andrew Bresticker (2):
riscv: Allow PROT_WRITE-only mmap()
riscv: Make VM_WRITE imply VM_READ
Andrew Perepechko (1):
jbd2: wake up journal waiters in FIFO order, not LIFO
Andri Yngvason (1):
HID: multitouch: Add memory barriers
Anna Schumaker (1):
NFSD: Return nfserr_serverfault if splice_ok but buf->pages have data
Anssi Hannula (4):
can: kvaser_usb: Fix use of uninitialized completion
can: kvaser_usb_leaf: Fix overread with an invalid command
can: kvaser_usb_leaf: Fix TX queue out of sync after restart
can: kvaser_usb_leaf: Fix CAN state after restart
Ard Biesheuvel (1):
efi: libstub: drop pointless get_memory_map() call
Aric Cyr (1):
drm/amd/display: Remove interface for periodic interrupt 1
Arnd Bergmann (1):
Bluetooth: btusb: fix excessive stack usage
Arvid Norlander (1):
ACPI: video: Add Toshiba Satellite/Portege Z830 quirk
Asmaa Mnebhi (1):
i2c: mlxbf: support lock mechanism
Bernard Metzler (1):
RDMA/siw: Always consume all skbuf data in sk_data_ready() upcall.
Bitterblue Smith (4):
wifi: rtl8xxxu: Fix skb misuse in TX queue selection
wifi: rtl8xxxu: gen2: Fix mistake in path B IQ calibration
wifi: rtl8xxxu: Remove copy-paste leftover in gen2_update_rate_mask
wifi: rtl8xxxu: Fix AIFS written to REG_EDCA_*_PARAM
Callum Osmotherly (1):
ALSA: hda/realtek: remove ALC289_FIXUP_DUAL_SPK for Dell 5530
Carlos Llamas (1):
mm/mmap: undo ->mmap() when arch_validate_flags() fails
Chao Qin (1):
powercap: intel_rapl: fix UBSAN shift-out-of-bounds issue
Chao Yu (5):
f2fs: fix to do sanity check on destination blkaddr during recovery
f2fs: fix to do sanity check on summary info
f2fs: fix to avoid REQ_TIME and CP_TIME collision
f2fs: fix to account FS_CP_DATA_IO correctly
f2fs: fix wrong condition to trigger background checkpoint correctly
Chen-Yu Tsai (2):
drm/bridge: parade-ps8640: Fix regulator supply order
clk: mediatek: mt8183: mfgcfg: Propagate rate changes to parent
Christophe JAILLET (11):
MIPS: SGI-IP27: Free some unused memory
nfsd: Fix a memory leak in an error handling path
spi: mt7621: Fix an error message in mt7621_spi_probe()
mmc: au1xmmc: Fix an error handling path in au1xmmc_probe()
ASoC: da7219: Fix an error handling path in da7219_register_dai_clks()
mmc: wmt-sdmmc: Fix an error handling path in wmt_mci_probe()
serial: 8250: Add an empty line and remove some useless {}
mfd: intel_soc_pmic: Fix an error handling path in
intel_soc_pmic_i2c_probe()
mfd: fsl-imx25: Fix an error handling path in mx25_tsadc_setup_irq()
mfd: lp8788: Fix an error handling path in lp8788_probe()
mfd: lp8788: Fix an error handling path in lp8788_irq_init() and
lp8788_irq_init()
Chunfeng Yun (2):
usb: common: add function to get interval expressed in us unit
usb: common: move function's kerneldoc next to its definition
Claudiu Beznea (4):
iio: adc: at91-sama5d2_adc: fix AT91_SAMA5D2_MR_TRACKTIM_MAX
iio: adc: at91-sama5d2_adc: check return status for pressure and touch
iio: adc: at91-sama5d2_adc: lock around oversampling and sample freq
iio: adc: at91-sama5d2_adc: disable/prepare buffer on suspend/resume
Coly Li (1):
bcache: fix set_at_max_writeback_rate() for multiple attached devices
Dai Ngo (1):
NFSD: fix use-after-free on source server when doing inter-server copy
Daisuke Matsuda (1):
IB: Set IOVA/LENGTH on IB_MR in core/uverbs layers
Damian Muszynski (1):
crypto: qat - fix DMA transfer direction
Dan Carpenter (10):
wifi: rtl8xxxu: tighten bounds checking in rtl8xxxu_read_efuse()
drm/bridge: Avoid uninitialized variable warning
platform/chrome: fix memory corruption in ioctl
fpga: prevent integer overflow in dfl_feature_ioctl_set_irq()
mtd: rawnand: meson: fix bit map use in meson_nfc_ecc_correct()
drivers: serial: jsm: fix some leaks in probe
mfd: fsl-imx25: Fix check for platform_get_irq() errors
iommu/omap: Fix buffer overflow in debugfs
crypto: marvell/octeontx - prevent integer overflows
crypto: cavium - prevent integer overflow loading firmware
Daniel Golle (5):
wifi: rt2x00: don't run Rt5592 IQ calibration on MT7620
wifi: rt2x00: set correct TX_SW_CFG1 MAC register for MT7620
wifi: rt2x00: set VGC gain for both chains of MT7620
wifi: rt2x00: set SoC wmac clock register
wifi: rt2x00: correctly set BBP register 86 for MT7620
Dave Jiang (1):
dmaengine: ioat: stop mod_timer from resurrecting deleted timer in
__cleanup()
David Collins (1):
spmi: pmic-arb: correct duplicate APID to PPID mapping logic
David Gow (1):
drm/amd/display: fix overflow on MIN_I64 definition
Dmitry Baryshkov (1):
drm/msm/dpu: index dpu_kms->hw_vbif using vbif_idx
Dmitry Osipenko (3):
drm/virtio: Check whether transferred 2D BO is shmem
media: cedrus: Set the platform driver data earlier
soc/tegra: fuse: Drop Kconfig dependency on TEGRA20_APB_DMA
Dmitry Torokhov (2):
ARM: dts: exynos: correct s5k6a3 reset polarity on Midas family
ARM: dts: exynos: fix polarity of VBUS GPIO of Origen
Dongliang Mu (2):
phy: qualcomm: call clk_disable_unprepare in the error handling
usb: idmouse: fix an uninit-value in idmouse_open
Duoming Zhou (1):
scsi: libsas: Fix use-after-free bug in smp_execute_task_sg()
Eddie James (2):
iio: pressure: dps310: Refactor startup procedure
iio: pressure: dps310: Reset chip after timeout
Eric Dumazet (2):
once: add DO_ONCE_SLOW() for sleepable contexts
tcp: annotate data-race around tcp_md5sig_pool_populated
Fangrui Song (1):
riscv: Pass -mno-relax only on lld < 15.0.0
Filipe Manana (1):
btrfs: fix race between quota enable and quota rescan ioctl
Geert Uytterhoeven (1):
ARM: Drop CMDLINE_* dependency on ATAGS
Giovanni Cabiddu (1):
crypto: qat - use pre-allocated buffers in datapath
Greg Kroah-Hartman (2):
staging: greybus: audio_helper: remove unused and wrong debugfs usage
selinux: use "grep -E" instead of "egrep"
Guilherme G. Piccoli (1):
firmware: google: Test spinlock on panic path to avoid lockups
Haibo Chen (1):
ARM: dts: imx7d-sdb: config the max pressure for tsc2046
Hangyu Hua (1):
misc: ocxl: fix possible refcount leak in afu_ioctl()
Hans de Goede (3):
platform/x86: msi-laptop: Fix old-ec check for backlight registering
platform/x86: msi-laptop: Fix resource cleanup
platform/x86: msi-laptop: Change DMI match / alias strings to fix
module autoloading
Hari Chandrakanthan (1):
wifi: mac80211: allow bw change during channel switch in mesh
Helge Deller (1):
parisc: fbdev/stifb: Align graphics memory size to 4MB
Huacai Chen (1):
UM: cpuinfo: Fix a warning for CONFIG_CPUMASK_OFFSTACK
Hui Tang (1):
crypto: qat - fix use of 'dma_map_single'
Ian Nam (1):
clk: zynqmp: Fix stack-out-of-bounds in strncpy`
Ian Rogers (1):
selftests/xsk: Avoid use-after-free on ctx
Ignat Korchagin (1):
crypto: akcipher - default implementation for setting a private key
Ilpo Järvinen (1):
serial: 8250: Toggle IER bits on only after irq has been set up
Jack Wang (2):
HSI: omap_ssi_port: Fix dma_map_sg error check
mailbox: bcm-ferxrm-mailbox: Fix error check for dma_map_sg
Jaegeuk Kim (1):
f2fs: increase the limit for reserve_root
Jairaj Arava (1):
ASoC: SOF: pci: Change DMI match info to support all Chrome platforms
Jameson Thies (1):
platform/chrome: cros_ec: Notify the PM of wake events during resume
Jan Kara (1):
ext4: avoid crash when inline data creation follows DIO write
Janis Schoetterl-Glausch (1):
kbuild: rpm-pkg: fix breakage when V=1 is used
Javier Martinez Canillas (2):
drm: Use size_t type for len variable in drm_copy_field()
drm: Prevent drm_copy_field() to attempt copying a NULL pointer
Jean-Francois Le Fillatre (1):
usb: add quirks for Lenovo OneLink+ Dock
Jerry Lee 李修賢 (1):
ext4: continue to expand file system when the target size doesn't
reach
Jesus Fernandez Manzano (1):
wifi: ath11k: fix number of VHT beamformee spatial streams
Jianglei Nie (3):
drm/nouveau: fix a use-after-free in
nouveau_gem_prime_import_sg_table()
drm/nouveau/nouveau_bo: fix potential memory leak in
nouveau_bo_alloc()
usb: host: xhci: Fix potential memory leak in xhci_alloc_stream_info()
Jiasheng Jiang (3):
ASoC: rsnd: Add check for rsnd_mod_power_on
fsi: core: Check error number after calling ida_simple_get
mfd: sm501: Add check for platform_driver_register()
Jie Hai (3):
dmaengine: hisilicon: Disable channels when unregister hisi_dma
dmaengine: hisilicon: Fix CQ head update
dmaengine: hisilicon: Add multi-thread support for a DMA channel
Jim Cromie (4):
dyndbg: fix static_branch manipulation
dyndbg: fix module.dyndbg handling
dyndbg: let query-modname override actual module name
dyndbg: drop EXPORTed dynamic_debug_exec_queries
Jinke Han (1):
ext4: place buffer head allocation before handle start
Joel Stanley (1):
clk: ast2600: BCLK comes from EPLL
Jonathan Cameron (1):
iio: ABI: Fix wrong format of differential capacitance channel ABI.
Junichi Uekawa (1):
vhost/vsock: Use kvmalloc/kvfree for larger packets.
Justin Chen (2):
usb: host: xhci-plat: suspend and resume clocks
usb: host: xhci-plat: suspend/resume clks for brcm
Kees Cook (7):
hardening: Clarify Kconfig text for auto-var-init
hardening: Avoid harmless Clang option under
CONFIG_INIT_STACK_ALL_ZERO
hardening: Remove Clang's enable flag for -ftrivial-auto-var-init=zero
sh: machvec: Use char[] for section boundaries
x86/microcode/AMD: Track patch allocation size explicitly
MIPS: BCM47XX: Cast memcmp() of function to (void *)
x86/entry: Work around Clang __bdos() bug
Keith Busch (1):
nvme: copy firmware_rev on each init
Khaled Almahallawy (1):
drm/dp: Don't rewrite link config when setting phy test pattern
Khalid Masum (1):
xfrm: Update ipcomp_scratches with NULL when freed
Koba Ko (1):
crypto: ccp - Release dma channels before dmaengine unrgister
Kohei Tarumizu (1):
x86/resctrl: Fix to restore to original value when re-enabling
hardware prefetch register
Krzysztof Kozlowski (2):
ASoC: wcd9335: fix order of Slimbus unprepare/disable
ASoC: wcd934x: fix order of Slimbus unprepare/disable
Kshitiz Varshney (1):
hwrng: imx-rngc - Moving IRQ handler registering after
imx_rngc_irq_mask_clear()
Kuogee Hsieh (1):
drm/msm/dp: correct 1.62G link rate at dp_catalog_ctrl_config_msa()
Lalith Rajendran (1):
ext4: make ext4_lazyinit_thread freezable
Lam Thai (1):
bpftool: Fix a wrong type cast in btf_dumper_int
Lee Jones (1):
bpf: Ensure correct locking around vulnerable function find_vpid()
Letu Ren (1):
scsi: 3w-9xxx: Avoid disabling device if failing to enable it
Liang He (17):
hwmon: (gsc-hwmon) Call of_node_get() before of_find_xxx API
drm:pl111: Add of_node_put() when breaking out of
for_each_available_child_of_node()
drm/omap: dss: Fix refcount leak bugs
ASoC: eureka-tlv320: Hold reference returned from of_find_xxx API
memory: pl353-smc: Fix refcount leak bug in pl353_smc_probe()
memory: of: Fix refcount leak bug in of_get_ddr_timings()
memory: of: Fix refcount leak bug in of_lpddr3_get_ddr_timings()
soc: qcom: smsm: Fix refcount leak bugs in qcom_smsm_probe()
soc: qcom: smem_state: Add refcounting for the 'state->of_node'
clk: meson: Hold reference returned by of_get_parent()
clk: oxnas: Hold reference returned by of_get_parent()
clk: qoriq: Hold reference returned by of_get_parent()
clk: berlin: Add of_node_put() for of_get_parent()
clk: sprd: Hold reference returned by of_get_parent()
media: exynos4-is: fimc-is: Add of_node_put() when breaking out of
loop
powerpc/sysdev/fsl_msi: Add missing of_node_put()
powerpc/pci_dn: Add missing of_node_put()
Lin Yujun (1):
MIPS: SGI-IP27: Fix platform-device leak in bridge_platform_create()
Linus Walleij (1):
regulator: qcom_rpm: Fix circular deferral regression
Liu Jian (1):
net: If sock is dead don't access sock's sk_wq in
sk_stream_wait_memory
Logan Gunthorpe (2):
md/raid5: Ensure stripe_fill happens on non-read IO with journal
md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d
Lorenz Bauer (1):
bpf: btf: fix truncated last_member_type_id in btf_struct_resolve
Lucas Stach (1):
drm: bridge: dw_hdmi: only trigger hotplug event on link change
Luciano Leão (1):
x86/cpu: Include the header of init_ia32_feat_ctl()'s prototype
Luiz Augusto von Dentz (3):
Bluetooth: hci_core: Fix not handling link timeouts propertly
Bluetooth: hci_sysfs: Fix attempting to call device_add multiple times
Bluetooth: L2CAP: Fix user-after-free
Lukas Czerner (1):
ext4: don't increase iversion counter for ea_inodes
Luke D. Jones (2):
ALSA: hda/realtek: Correct pin configs for ASUS G533Z
ALSA: hda/realtek: Add quirk for ASUS GV601R laptop
Lyude Paul (1):
drm/nouveau/kms/nv140-: Disable interlacing
Maciej W. Rozycki (2):
RISC-V: Make port I/O string accessors actually work
PCI: Sanitise firmware BAR assignments behind a PCI-PCI bridge
Marek Behún (1):
ARM: dts: turris-omnia: Fix mpp26 pin name and comment
Marek Szyprowski (1):
spi: Ensure that sg_table won't be used after being freed
Mario Limonciello (2):
thunderbolt: Explicitly enable lane adapter hotplug events at startup
xhci: Don't show warning for reinit on known broken suspend
Mark Brown (1):
kselftest/arm64: Fix validatation termination record after
EXTRA_CONTEXT
Mark Chen (1):
Bluetooth: btusb: Fine-tune mt7663 mechanism.
Mark Zhang (1):
RDMA/cm: Use SLID in the work completion as the DLID in responder side
Martin Liska (1):
gcov: support GCC 12.1 and newer compilers
Martin Povišer (3):
ASoC: tas2764: Allow mono streams
ASoC: tas2764: Drop conflicting set_bias_level power setting
ASoC: tas2764: Fix mute/unmute
Masahiro Yamada (1):
kbuild: remove the target in signal traps when interrupted
Mateusz Kwiatkowski (1):
drm/vc4: vec: Fix timings for VEC modes
Maxime Ripard (2):
drm/mipi-dsi: Detach devices when removing the host
clk: bcm2835: Make peripheral PLLC critical
Maya Matuszczyk (1):
drm: panel-orientation-quirks: Add quirk for Anbernic Win600
Miaoqian Lin (6):
clk: tegra: Fix refcount leak in tegra210_clock_init
clk: tegra: Fix refcount leak in tegra114_clock_init
clk: tegra20: Fix refcount leak in tegra20_clock_init
HSI: omap_ssi: Fix refcount leak in ssi_probe
media: xilinx: vipp: Fix refcount leak in xvip_graph_dma_init
clk: ti: dra7-atl: Fix reference leak in of_dra7_atl_clk_probe
Michael Hennerich (1):
iio: dac: ad5593r: Fix i2c read protocol requirements
Michael Walle (2):
ARM: dts: kirkwood: lsxl: fix serial line
ARM: dts: kirkwood: lsxl: remove first ethernet port
Michal Hocko (1):
rcu: Back off upon fill_page_cache_func() allocation failure
Michal Luczaj (1):
KVM: x86/emulator: Fix handing of POP SS to correctly set
interruptibility
Mike Christie (1):
scsi: iscsi: iscsi_tcp: Fix null-ptr-deref while calling getpeername()
Mike Pattrick (2):
openvswitch: Fix double reporting of drops in dropwatch
openvswitch: Fix overreporting of drops in dropwatch
Nam Cao (2):
staging: vt6655: fix some erroneous memory clean-up loops
staging: vt6655: fix potential memory leak
Nathan Chancellor (1):
powerpc/math_emu/efp: Include module.h
Neal Cardwell (1):
tcp: fix tcp_cwnd_validate() to not forget is_cwnd_limited
Neil Armstrong (1):
spi: meson-spicc: do not rely on busy flag in pow2 clk ops
Nicholas Piggin (1):
powerpc/64s: Fix GENERIC_CPU build flags for PPC970 / G5
Niklas Cassel (4):
ata: fix ata_id_sense_reporting_enabled() and
ata_id_has_sense_reporting()
ata: fix ata_id_has_devslp()
ata: fix ata_id_has_ncq_autosense()
ata: fix ata_id_has_dipm()
Nuno Sá (2):
iio: adc: ad7923: fix channel readings for some variants
iio: inkern: only release the device node when done with it
Ondrej Mosnacek (1):
userfaultfd: open userfaultfds with O_RDONLY
Pali Rohár (4):
powerpc/boot: Explicitly disable usage of SPE instructions
mtd: rawnand: fsl_elbc: Fix none ECC mode
serial: 8250: Fix restoring termios speed after suspend
powerpc: Fix SPE Power ISA properties for e500v1 platforms
Patrick Rudolph (1):
regulator: core: Prevent integer underflow
Patryk Duda (1):
platform/chrome: cros_ec_proto: Update version on GET_NEXT_EVENT
failure
Pavel Begunkov (1):
io_uring/af_unix: defer registered files gc to io_uring release
Peter Harliman Liem (1):
crypto: inside-secure - Change swab to swab32
Phil Sutter (1):
netfilter: nft_fib: Fix for rpath check with VRF devices
Pierre-Louis Bossart (1):
soundwire: intel: fix error handling on dai registration issues
Qu Wenruo (1):
btrfs: scrub: try to fix super block errors
Quanyang Wang (1):
clk: zynqmp: pll: rectify rate rounding in zynqmp_pll_round_rate
Quentin Monnet (1):
bpftool: Clear errno after libcap's checks
Rafael J. Wysocki (1):
thermal: intel_powerclamp: Use first online CPU as control_cpu
Randy Dunlap (2):
drm: fix drm_mipi_dbi build errors
ia64: export memory_add_physaddr_to_nid to fix cxl build error
Richard Acayan (1):
mmc: sdhci-msm: add compatible string check for sdm670
Richard Fitzgerald (1):
soundwire: cadence: Don't overwrite msg->buf during write commands
Rik van Riel (1):
livepatch: fix race between fork and KLP transition
Rishabh Bhatnagar (1):
nvme-pci: set min_align_mask before calculating max_hw_sectors
Robert Marko (1):
clk: qcom: apss-ipq6018: mark apcs_alias0_core_clk as critical
Robin Guo (1):
usb: musb: Fix musb_gadget.c rxstate overflow bug
Robin Murphy (1):
iommu/iova: Fix module config properly
Ronnie Sahlberg (1):
cifs: destage dirty pages before re-reading them for cache=none
Rustam Subkhankulov (1):
platform/chrome: fix double-free in chromeos_laptop_prepare()
Sami Tolvanen (1):
objtool: Preserve special st_shndx indexes in elf_update_symbol
Saranya Gopal (1):
ALSA: hda/realtek: Add Intel Reference SSID to support headset keys
Saurabh Sengar (1):
md: Replace snprintf with scnprintf
Saurav Kashyap (1):
scsi: qedf: Populate sysfs attributes for vport
Sean Christopherson (2):
KVM: nVMX: Unconditionally purge queued/injected events on nested
"exit"
KVM: VMX: Drop bits 31:16 when shoving exception error code into VMCS
Sean Wang (1):
Bluetooth: btusb: mediatek: fix WMT failure during runtime suspend
Sebastian Krzyszkowiak (1):
arm64: dts: imx8mq-librem5: Add bq25895 as max17055's power supply
Serge Semin (5):
clk: vc5: Fix 5P49V6901 outputs disabling when enabling FOD
clk: baikal-t1: Fix invalid xGMAC PTP clock divider
clk: baikal-t1: Add shared xGMAC ref/ptp clocks internal parent
clk: baikal-t1: Add SATA internal ref clock buffer
ata: libahci_platform: Sanity check the DT child nodes number
Sherry Sun (1):
tty: serial: fsl_lpuart: disable dma rx/tx use flags in
lpuart_dma_shutdown
Shigeru Yoshida (1):
nbd: Fix hung when signal interrupts nbd_start_device_ioctl()
Shuah Khan (2):
Revert "drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for
vega"
Revert "drm/amdgpu: use dirty framebuffer helper"
Shuai Xue (1):
ACPI: APEI: do not add task_work to kernel thread to avoid memory leak
Shubhrajyoti Datta (1):
tty: xilinx_uartps: Fix the ignore_status
Simon Ser (1):
drm/dp_mst: fix drm_dp_dpcd_read return value checks
Srinivas Pandruvada (1):
thermal: intel_powerclamp: Use get_cpu() instead of smp_processor_id()
to avoid crash
Stefan Berger (1):
selftest: tpm2: Add Client.__del__() to close /dev/tpm* handle
Stefan Wahren (1):
clk: bcm2835: fix bcm2835_clock_rate_from_divisor declaration
Steve French (1):
smb3: must initialize two ACL struct fields to zero
Steven Rostedt (Google) (4):
ring-buffer: Allow splice to read previous partially read pages
ring-buffer: Have the shortest_full queue be the shortest not longest
ring-buffer: Check pending waiters when doing wake ups as well
ring-buffer: Add ring_buffer_wake_waiters()
Takashi Iwai (7):
ALSA: oss: Fix potential deadlock at unregistration
ALSA: rawmidi: Drop register_mutex in snd_rawmidi_free()
ALSA: usb-audio: Fix potential memory leaks
ALSA: usb-audio: Fix NULL dererence at error path
drm/udl: Restore display mode on resume
ALSA: hda: beep: Simplify keep-power-at-enable behavior
ALSA: hda/hdmi: Don't skip notification handling during PM operation
Tetsuo Handa (6):
Bluetooth: hci_{ldisc,serdev}: check percpu_init_rwsem() failure
net: rds: don't hold sock lock when cancelling work from
rds_tcp_reset_callbacks()
net/ieee802154: reject zero-sized raw_sendmsg()
wifi: ath9k: avoid uninit memory read in ath9k_htc_rx_msg()
Bluetooth: L2CAP: initialize delayed works at l2cap_chan_create()
net/ieee802154: don't warn zero-sized raw_sendmsg()
Thinh Nguyen (3):
usb: ch9: Add USB 3.2 SSP attributes
usb: common: Parse for USB SSP genXxY
usb: common: debug: Check non-standard control requests
Tudor Ambarus (1):
mtd: rawnand: atmel: Unmap streaming DMA mappings
Uwe Kleine-König (2):
iio: ltc2497: Fix reading conversion results
leds: lm3601x: Don't use mutex after it was destroyed
Varun Prakash (1):
nvmet-tcp: add bounds check on Transfer Tag
Ville Syrjälä (2):
drm/i915: Fix watermark calculations for gen12+ RC CCS modifier
drm/i915: Fix watermark calculations for gen12+ MC CCS modifier
Vincent Knecht (1):
thermal/drivers/qcom/tsens-v0_1: Fix MSM8939 fourth sensor hw_id
Vincent Whitchurch (1):
spi: s3c64xx: Fix large transfers with DMA
Vitaly Kuznetsov (1):
x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition
Vivek Kasireddy (1):
udmabuf: Set ubuf->sg = NULL if the creation of sg table fails
Waiman Long (2):
tracing: Disable interrupt or preemption before acquiring
arch_spinlock_t
cgroup/cpuset: Enable update_tasks_cpumask() on top_cpuset
Wang Kefeng (2):
ARM: 9244/1: dump: Fix wrong pg_level in walk_pmd()
ARM: 9247/1: mm: set readonly for MT_MEMORY_RO with ARM_LPAE
Wei Yongjun (1):
power: supply: adp5061: fix out-of-bounds read in
adp5061_get_chg_type()
Wen Gong (1):
wifi: ath10k: add peer map clean up for peer delete in
ath10k_sta_state()
Wenchao Chen (1):
mmc: sdhci-sprd: Fix minimum clock limit
William Dean (1):
mtd: devices: docg3: check the return value of devm_ioremap() in the
probe
Wright Feng (1):
wifi: brcmfmac: fix invalid address access when enabling SCAN log
level
Xiaoke Wang (1):
staging: rtl8723bs: fix a potential memory leak in rtw_init_cmd_priv()
Xin Long (1):
sctp: handle the error returned from sctp_auth_asoc_init_active_key
Xu Qiang (3):
spi: qup: add missing clk_disable_unprepare on error in
spi_qup_resume()
spi: qup: add missing clk_disable_unprepare on error in
spi_qup_pm_resume_runtime()
media: meson: vdec: add missing clk_disable_unprepare on error in
vdec_hevc_start()
Ye Bin (7):
jbd2: fix potential buffer head reference count leak
jbd2: fix potential use-after-free in jbd2_fc_wait_bufs
jbd2: add miss release buffer head in fc_do_one_pass()
ext4: fix miss release buffer head in ext4_fc_write_inode
ext4: fix potential memory leak in ext4_fc_record_modified_inode()
ext4: fix potential memory leak in ext4_fc_record_regions()
ext4: update 'state->fc_regions_size' after successful memory
allocation
Yipeng Zou (2):
tracing: kprobe: Fix kprobe event gen test module on exit
tracing: kprobe: Make gen test module work in arm and riscv
Yu Kuai (1):
blk-throttle: prevent overflow while calculating wait time
Zeng Jingxiang (1):
gpu: lontium-lt9611: Fix NULL pointer dereference in
lt9611_connector_init()
Zhang Qilong (7):
spi: dw: Fix PM disable depth imbalance in dw_spi_bt1_probe
spi/omap100k:Fix PM disable depth imbalance in omap1_spi100k_probe
ASoC: wm8997: Fix PM disable depth imbalance in wm8997_probe
ASoC: wm5110: Fix PM disable depth imbalance in wm5110_probe
ASoC: wm5102: Fix PM disable depth imbalance in wm5102_probe
ASoC: mt6660: Fix PM disable depth imbalance in mt6660_i2c_probe
f2fs: fix race condition on setting FI_NO_EXTENT flag
Zhang Rui (1):
powercap: intel_rapl: Use standard Energy Unit for SPR Dram RAPL
domain
Zhang Xiaoxu (1):
cifs: Fix the error length of VALIDATE_NEGOTIATE_INFO message
Zheng Yejian (1):
ftrace: Properly unset FTRACE_HASH_FL_MOD
Zheng Yongjun (2):
net: fs_enet: Fix wrong check in do_pd_setup
powerpc/powernv: add missing of_node_put() in opal_export_attrs()
Zhengchao Shao (1):
crypto: sahara - don't sleep when in softirq
Zheyu Ma (2):
drm/bridge: megachips: Fix a null pointer dereference bug
media: cx88: Fix a null-ptr-deref bug in buffer_prepare()
Zhu Yanjun (2):
RDMA/rxe: Fix "kernel NULL pointer dereference" error
RDMA/rxe: Fix the error caused by qp->sk
Ziyang Xuan (1):
can: bcm: check the result of can_send() in bcm_can_tx()
Zqiang (1):
rcu-tasks: Convert RCU_LOCKDEP_WARN() to WARN_ONCE()
hongao (1):
drm/amdgpu: fix initial connector audio value
sunghwan jung (1):
Revert "usb: storage: Add quirk for Samsung Fit flash"
Documentation/ABI/testing/sysfs-bus-iio | 2 +-
Makefile | 6 +-
arch/arm/Kconfig | 1 -
arch/arm/boot/dts/armada-385-turris-omnia.dts | 4 +-
arch/arm/boot/dts/exynos4412-midas.dtsi | 2 +-
arch/arm/boot/dts/exynos4412-origen.dts | 2 +-
arch/arm/boot/dts/imx6dl.dtsi | 3 +
arch/arm/boot/dts/imx6q.dtsi | 3 +
arch/arm/boot/dts/imx6qp.dtsi | 6 +
arch/arm/boot/dts/imx6sl.dtsi | 3 +
arch/arm/boot/dts/imx6sll.dtsi | 3 +
arch/arm/boot/dts/imx6sx.dtsi | 6 +
arch/arm/boot/dts/imx7d-sdb.dts | 7 +-
arch/arm/boot/dts/kirkwood-lsxl.dtsi | 16 +-
arch/arm/mm/dump.c | 2 +-
arch/arm/mm/mmu.c | 4 +
.../boot/dts/freescale/imx8mq-librem5.dtsi | 1 +
arch/ia64/mm/numa.c | 1 +
arch/mips/bcm47xx/prom.c | 4 +-
arch/mips/sgi-ip27/ip27-xtalk.c | 74 +++--
arch/powerpc/Makefile | 2 +-
arch/powerpc/boot/Makefile | 1 +
.../boot/dts/fsl/e500v1_power_isa.dtsi | 51 ++++
arch/powerpc/boot/dts/fsl/mpc8540ads.dts | 2 +-
arch/powerpc/boot/dts/fsl/mpc8541cds.dts | 2 +-
arch/powerpc/boot/dts/fsl/mpc8555cds.dts | 2 +-
arch/powerpc/boot/dts/fsl/mpc8560ads.dts | 2 +-
arch/powerpc/kernel/pci_dn.c | 1 +
arch/powerpc/math-emu/math_efp.c | 1 +
arch/powerpc/platforms/powernv/opal.c | 1 +
arch/powerpc/sysdev/fsl_msi.c | 2 +
arch/riscv/Makefile | 2 +
arch/riscv/include/asm/io.h | 16 +-
arch/riscv/kernel/sys_riscv.c | 3 -
arch/riscv/mm/fault.c | 3 +-
arch/sh/include/asm/sections.h | 2 +-
arch/sh/kernel/machvec.c | 10 +-
arch/um/kernel/um_arch.c | 2 +-
arch/x86/include/asm/hyperv-tlfs.h | 4 +-
arch/x86/include/asm/microcode.h | 1 +
arch/x86/kernel/cpu/feat_ctl.c | 2 +-
arch/x86/kernel/cpu/microcode/amd.c | 3 +-
arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 12 +-
arch/x86/kvm/emulate.c | 2 +-
arch/x86/kvm/vmx/nested.c | 30 +-
arch/x86/kvm/vmx/vmx.c | 12 +-
arch/x86/xen/enlighten_pv.c | 3 +-
block/blk-throttle.c | 8 +-
crypto/akcipher.c | 8 +
drivers/acpi/acpi_video.c | 16 ++
drivers/acpi/apei/ghes.c | 2 +-
drivers/ata/libahci_platform.c | 14 +-
drivers/block/nbd.c | 6 +-
drivers/bluetooth/btusb.c | 47 +++-
drivers/bluetooth/hci_ldisc.c | 7 +-
drivers/bluetooth/hci_serdev.c | 10 +-
drivers/char/hw_random/imx-rngc.c | 14 +-
drivers/clk/baikal-t1/ccu-div.c | 65 +++++
drivers/clk/baikal-t1/ccu-div.h | 10 +
drivers/clk/baikal-t1/clk-ccu-div.c | 26 +-
drivers/clk/bcm/clk-bcm2835.c | 8 +-
drivers/clk/berlin/bg2.c | 5 +-
drivers/clk/berlin/bg2q.c | 6 +-
drivers/clk/clk-ast2600.c | 2 +-
drivers/clk/clk-oxnas.c | 6 +-
drivers/clk/clk-qoriq.c | 10 +-
drivers/clk/clk-versaclock5.c | 2 +-
drivers/clk/mediatek/clk-mt8183-mfgcfg.c | 6 +-
drivers/clk/meson/meson-aoclk.c | 5 +-
drivers/clk/meson/meson-eeclk.c | 5 +-
drivers/clk/meson/meson8b.c | 5 +-
drivers/clk/qcom/apss-ipq6018.c | 2 +-
drivers/clk/sprd/common.c | 9 +-
drivers/clk/tegra/clk-tegra114.c | 1 +
drivers/clk/tegra/clk-tegra20.c | 1 +
drivers/clk/tegra/clk-tegra210.c | 1 +
drivers/clk/ti/clk-dra7-atl.c | 9 +-
drivers/clk/zynqmp/clkc.c | 7 +
drivers/clk/zynqmp/pll.c | 31 +--
drivers/crypto/cavium/cpt/cptpf_main.c | 6 +-
drivers/crypto/ccp/ccp-dmaengine.c | 6 +-
drivers/crypto/inside-secure/safexcel_hash.c | 8 +-
.../crypto/marvell/octeontx/otx_cptpf_ucode.c | 18 +-
drivers/crypto/qat/qat_common/qat_algs.c | 109 +++++---
drivers/crypto/qat/qat_common/qat_crypto.h | 24 ++
drivers/crypto/sahara.c | 18 +-
drivers/dma-buf/udmabuf.c | 9 +-
drivers/dma/hisi_dma.c | 28 +-
drivers/dma/ioat/dma.c | 6 +-
drivers/firmware/efi/libstub/fdt.c | 8 -
drivers/firmware/google/gsmi.c | 9 +
drivers/fpga/dfl.c | 2 +-
drivers/fsi/fsi-core.c | 3 +
drivers/gpu/drm/Kconfig | 1 +
.../gpu/drm/amd/amdgpu/amdgpu_connectors.c | 7 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 2 -
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 5 -
drivers/gpu/drm/amd/amdgpu/soc15.c | 25 ++
.../gpu/drm/amd/display/dc/calcs/bw_fixed.c | 6 +-
drivers/gpu/drm/amd/display/dc/core/dc.c | 16 +-
drivers/gpu/drm/amd/display/dc/dc_stream.h | 6 +-
.../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 35 +--
.../amd/display/dc/dcn10/dcn10_hw_sequencer.h | 3 +-
.../gpu/drm/amd/display/dc/inc/hw_sequencer.h | 8 +-
drivers/gpu/drm/bridge/adv7511/adv7511.h | 5 +-
drivers/gpu/drm/bridge/adv7511/adv7511_cec.c | 4 +-
drivers/gpu/drm/bridge/lontium-lt9611.c | 3 +-
.../bridge/megachips-stdpxxxx-ge-b850v3-fw.c | 4 +-
drivers/gpu/drm/bridge/parade-ps8640.c | 4 +-
drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 13 +-
drivers/gpu/drm/drm_bridge.c | 4 +-
drivers/gpu/drm/drm_dp_helper.c | 9 -
drivers/gpu/drm/drm_dp_mst_topology.c | 6 +-
drivers/gpu/drm/drm_ioctl.c | 8 +-
drivers/gpu/drm/drm_mipi_dsi.c | 1 +
.../gpu/drm/drm_panel_orientation_quirks.c | 6 +
drivers/gpu/drm/i915/intel_pm.c | 8 +-
drivers/gpu/drm/meson/meson_drv.c | 8 +
drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 12 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_vbif.c | 29 +-
drivers/gpu/drm/msm/dp/dp_catalog.c | 2 +-
drivers/gpu/drm/nouveau/nouveau_bo.c | 4 +-
drivers/gpu/drm/nouveau/nouveau_connector.c | 3 +-
drivers/gpu/drm/nouveau/nouveau_prime.c | 1 -
drivers/gpu/drm/omapdrm/dss/dss.c | 3 +
drivers/gpu/drm/pl111/pl111_versatile.c | 1 +
drivers/gpu/drm/udl/udl_modeset.c | 3 -
drivers/gpu/drm/vc4/vc4_vec.c | 4 +-
drivers/gpu/drm/virtio/virtgpu_vq.c | 2 +-
drivers/hid/hid-multitouch.c | 8 +-
drivers/hsi/controllers/omap_ssi_core.c | 1 +
drivers/hsi/controllers/omap_ssi_port.c | 8 +-
drivers/hwmon/gsc-hwmon.c | 1 +
drivers/i2c/busses/i2c-mlxbf.c | 44 ++-
drivers/iio/adc/ad7923.c | 4 +-
drivers/iio/adc/at91-sama5d2_adc.c | 28 +-
drivers/iio/adc/ltc2497.c | 13 +
drivers/iio/dac/ad5593r.c | 46 +--
drivers/iio/inkern.c | 6 +-
drivers/iio/pressure/dps310.c | 262 +++++++++++-------
drivers/infiniband/core/cm.c | 14 +-
drivers/infiniband/core/uverbs_cmd.c | 5 +-
drivers/infiniband/core/verbs.c | 2 +
drivers/infiniband/hw/hns/hns_roce_mr.c | 1 -
drivers/infiniband/hw/mlx4/mr.c | 1 -
drivers/infiniband/sw/rxe/rxe_qp.c | 10 +-
drivers/infiniband/sw/siw/siw_qp_rx.c | 27 +-
drivers/iommu/omap-iommu-debug.c | 6 +-
drivers/leds/leds-lm3601x.c | 2 -
drivers/mailbox/bcm-flexrm-mailbox.c | 8 +-
drivers/md/bcache/writeback.c | 73 +++--
drivers/md/raid0.c | 2 +-
drivers/md/raid5.c | 14 +-
drivers/media/pci/cx88/cx88-vbi.c | 9 +-
drivers/media/pci/cx88/cx88-video.c | 43 +--
drivers/media/platform/exynos4-is/fimc-is.c | 1 +
drivers/media/platform/xilinx/xilinx-vipp.c | 9 +-
drivers/memory/of_memory.c | 2 +
drivers/memory/pl353-smc.c | 1 +
drivers/mfd/fsl-imx25-tsadc.c | 34 ++-
drivers/mfd/intel_soc_pmic_core.c | 1 +
drivers/mfd/lp8788-irq.c | 3 +
drivers/mfd/lp8788.c | 12 +-
drivers/mfd/sm501.c | 7 +-
drivers/misc/ocxl/file.c | 2 +
drivers/mmc/host/au1xmmc.c | 3 +-
drivers/mmc/host/sdhci-msm.c | 1 +
drivers/mmc/host/sdhci-sprd.c | 2 +-
drivers/mmc/host/wmt-sdmmc.c | 5 +-
drivers/mtd/devices/docg3.c | 7 +-
drivers/mtd/nand/raw/atmel/nand-controller.c | 1 +
drivers/mtd/nand/raw/fsl_elbc_nand.c | 28 +-
drivers/mtd/nand/raw/meson_nand.c | 4 +-
drivers/net/can/usb/kvaser_usb/kvaser_usb.h | 2 +
.../net/can/usb/kvaser_usb/kvaser_usb_core.c | 3 +-
.../net/can/usb/kvaser_usb/kvaser_usb_hydra.c | 2 +-
.../net/can/usb/kvaser_usb/kvaser_usb_leaf.c | 79 ++++++
.../net/ethernet/freescale/fs_enet/mac-fec.c | 2 +-
drivers/net/wireless/ath/ath10k/mac.c | 54 ++--
drivers/net/wireless/ath/ath11k/mac.c | 25 +-
drivers/net/wireless/ath/ath9k/htc_hst.c | 43 ++-
.../broadcom/brcm80211/brcmfmac/core.c | 3 +-
.../broadcom/brcm80211/brcmfmac/pno.c | 12 +-
.../net/wireless/ralink/rt2x00/rt2800lib.c | 34 ++-
.../wireless/realtek/rtl8xxxu/rtl8xxxu_core.c | 75 ++++-
drivers/nvme/host/core.c | 3 +-
drivers/nvme/host/pci.c | 3 +-
drivers/nvme/target/tcp.c | 11 +-
drivers/pci/setup-res.c | 11 +
drivers/phy/qualcomm/phy-qcom-usb-hsic.c | 6 +-
drivers/platform/chrome/chromeos_laptop.c | 24 +-
drivers/platform/chrome/cros_ec.c | 8 +-
drivers/platform/chrome/cros_ec_chardev.c | 3 +
drivers/platform/chrome/cros_ec_proto.c | 32 +++
drivers/platform/x86/msi-laptop.c | 14 +-
drivers/power/supply/adp5061.c | 6 +-
drivers/powercap/intel_rapl_common.c | 4 +-
drivers/regulator/core.c | 2 +-
drivers/regulator/qcom_rpm-regulator.c | 24 +-
drivers/scsi/3w-9xxx.c | 2 +-
drivers/scsi/iscsi_tcp.c | 73 +++--
drivers/scsi/iscsi_tcp.h | 2 +
drivers/scsi/libsas/sas_expander.c | 2 +-
drivers/scsi/qedf/qedf_main.c | 21 ++
drivers/soc/qcom/smem_state.c | 3 +-
drivers/soc/qcom/smsm.c | 20 +-
drivers/soc/tegra/Kconfig | 1 -
drivers/soundwire/cadence_master.c | 9 +-
drivers/soundwire/intel.c | 1 -
drivers/spi/spi-dw-bt1.c | 4 +-
drivers/spi/spi-meson-spicc.c | 6 +-
drivers/spi/spi-mt7621.c | 8 +-
drivers/spi/spi-omap-100k.c | 1 +
drivers/spi/spi-qup.c | 21 +-
drivers/spi/spi-s3c64xx.c | 9 +
drivers/spi/spi.c | 2 +
drivers/spmi/spmi-pmic-arb.c | 13 +-
drivers/staging/greybus/audio_helper.c | 11 -
drivers/staging/media/meson/vdec/vdec_hevc.c | 6 +-
drivers/staging/media/sunxi/cedrus/cedrus.c | 4 +-
drivers/staging/rtl8723bs/core/rtw_cmd.c | 16 +-
drivers/staging/vt6655/device_main.c | 8 +-
drivers/thermal/intel/intel_powerclamp.c | 4 +-
drivers/thermal/qcom/tsens-v0_1.c | 2 +-
drivers/thunderbolt/switch.c | 24 ++
drivers/thunderbolt/tb.h | 1 +
drivers/thunderbolt/tb_regs.h | 1 +
drivers/thunderbolt/usb4.c | 20 ++
drivers/tty/serial/8250/8250_core.c | 19 +-
drivers/tty/serial/8250/8250_port.c | 15 +-
drivers/tty/serial/fsl_lpuart.c | 2 +
drivers/tty/serial/jsm/jsm_driver.c | 3 +-
drivers/tty/serial/xilinx_uartps.c | 2 +
drivers/usb/common/common.c | 102 ++++++-
drivers/usb/common/debug.c | 78 +++++-
drivers/usb/core/devices.c | 21 +-
drivers/usb/core/endpoint.c | 35 +--
drivers/usb/core/quirks.c | 4 +
drivers/usb/gadget/function/f_printer.c | 12 +-
drivers/usb/host/xhci-mem.c | 7 +-
drivers/usb/host/xhci-plat.c | 18 +-
drivers/usb/host/xhci.c | 3 +-
drivers/usb/host/xhci.h | 1 +
drivers/usb/misc/idmouse.c | 8 +-
drivers/usb/musb/musb_gadget.c | 3 +
drivers/usb/storage/unusual_devs.h | 6 -
drivers/vhost/vsock.c | 2 +-
drivers/video/fbdev/stifb.c | 2 +-
fs/btrfs/qgroup.c | 15 +
fs/btrfs/scrub.c | 36 +++
fs/cifs/file.c | 9 +
fs/cifs/smb2pdu.c | 7 +-
fs/dlm/ast.c | 6 +-
fs/dlm/lock.c | 16 +-
fs/ext4/fast_commit.c | 40 +--
fs/ext4/file.c | 6 +
fs/ext4/inode.c | 14 +-
fs/ext4/resize.c | 2 +-
fs/ext4/super.c | 1 +
fs/f2fs/checkpoint.c | 23 +-
fs/f2fs/data.c | 4 +-
fs/f2fs/extent_cache.c | 3 +-
fs/f2fs/f2fs.h | 27 +-
fs/f2fs/gc.c | 10 +-
fs/f2fs/recovery.c | 23 +-
fs/f2fs/segment.c | 47 ++--
fs/f2fs/super.c | 4 +-
fs/jbd2/commit.c | 2 +-
fs/jbd2/journal.c | 10 +-
fs/jbd2/recovery.c | 1 +
fs/jbd2/transaction.c | 6 +-
fs/nfsd/nfs4recover.c | 4 +-
fs/nfsd/nfs4state.c | 5 +
fs/nfsd/nfs4xdr.c | 2 +-
fs/userfaultfd.c | 4 +-
include/linux/ata.h | 39 +--
include/linux/dynamic_debug.h | 11 +-
include/linux/iova.h | 2 +-
include/linux/once.h | 28 ++
include/linux/ring_buffer.h | 2 +-
include/linux/serial_8250.h | 1 +
include/linux/skbuff.h | 2 +
include/linux/tcp.h | 2 +-
include/linux/usb/ch9.h | 62 +----
include/net/ieee802154_netdev.h | 12 +-
include/net/tcp.h | 5 +-
include/uapi/linux/usb/ch9.h | 13 +
kernel/bpf/btf.c | 2 +-
kernel/bpf/syscall.c | 2 +
kernel/cgroup/cpuset.c | 18 +-
kernel/gcov/gcc_4_7.c | 18 +-
kernel/livepatch/transition.c | 18 +-
kernel/rcu/tasks.h | 2 +-
kernel/rcu/tree.c | 17 +-
kernel/trace/ftrace.c | 8 +-
kernel/trace/kprobe_event_gen_test.c | 49 +++-
kernel/trace/ring_buffer.c | 54 +++-
kernel/trace/trace.c | 23 ++
lib/dynamic_debug.c | 45 +--
lib/once.c | 30 ++
mm/mmap.c | 5 +-
net/bluetooth/hci_core.c | 34 ++-
net/bluetooth/hci_sysfs.c | 3 +
net/bluetooth/l2cap_core.c | 17 +-
net/can/bcm.c | 7 +-
net/core/stream.c | 3 +-
net/ieee802154/socket.c | 4 +
net/ipv4/inet_hashtables.c | 4 +-
net/ipv4/netfilter/nft_fib_ipv4.c | 3 +
net/ipv4/tcp.c | 16 +-
net/ipv4/tcp_output.c | 19 +-
net/ipv6/netfilter/nft_fib_ipv6.c | 6 +-
net/mac80211/cfg.c | 3 -
net/openvswitch/datapath.c | 18 +-
net/rds/tcp.c | 2 +-
net/sctp/auth.c | 18 +-
net/vmw_vsock/virtio_transport_common.c | 2 +-
net/xfrm/xfrm_ipcomp.c | 1 +
scripts/Kbuild.include | 23 +-
scripts/package/mkspec | 4 +-
scripts/selinux/install_policy.sh | 2 +-
security/Kconfig.hardening | 63 +++--
sound/core/pcm_dmaengine.c | 8 +-
sound/core/rawmidi.c | 2 -
sound/core/sound_oss.c | 13 +-
sound/pci/hda/hda_beep.c | 15 +-
sound/pci/hda/hda_beep.h | 1 +
sound/pci/hda/patch_hdmi.c | 6 -
sound/pci/hda/patch_realtek.c | 11 +-
sound/pci/hda/patch_sigmatel.c | 25 +-
sound/soc/codecs/da7219.c | 5 +-
sound/soc/codecs/mt6660.c | 8 +-
sound/soc/codecs/tas2764.c | 78 ++----
sound/soc/codecs/wcd9335.c | 2 +-
sound/soc/codecs/wcd934x.c | 2 +-
sound/soc/codecs/wm5102.c | 6 +-
sound/soc/codecs/wm5110.c | 6 +-
sound/soc/codecs/wm8997.c | 6 +-
sound/soc/fsl/eukrea-tlv320.c | 8 +-
sound/soc/sh/rcar/ctu.c | 6 +-
sound/soc/sh/rcar/dvc.c | 6 +-
sound/soc/sh/rcar/mix.c | 6 +-
sound/soc/sh/rcar/src.c | 5 +-
sound/soc/sh/rcar/ssi.c | 4 +-
sound/soc/sof/sof-pci-dev.c | 2 +-
sound/usb/endpoint.c | 6 +-
tools/bpf/bpftool/btf_dumper.c | 2 +-
tools/bpf/bpftool/main.c | 10 +
tools/lib/bpf/xsk.c | 6 +-
tools/objtool/elf.c | 7 +-
tools/perf/util/intel-pt.c | 9 +-
.../arm64/signal/testcases/testcases.c | 2 +-
tools/testing/selftests/tpm2/tpm2.py | 4 +
354 files changed, 3018 insertions(+), 1399 deletions(-)
create mode 100644 arch/powerpc/boot/dts/fsl/e500v1_power_isa.dtsi
--
2.25.1
uniontech inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6TN56
CVE: NA
--------------------------------
smatch report:
fs/eulerfs/namei.c:118 eufs_lookup() error: 'inode' dereferencing possible ERR_PTR()
Fix it by using the on-disk inode number (de->inode) in eufs_err instead.
Signed-off-by: Kang Chen <void0red(a)hust.edu.cn>
---
v1 -> v2: use correct string format
fs/eulerfs/namei.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/eulerfs/namei.c b/fs/eulerfs/namei.c
index e4c6c36575f2..dd3bbc0b453c 100644
--- a/fs/eulerfs/namei.c
+++ b/fs/eulerfs/namei.c
@@ -115,8 +115,8 @@ static struct dentry *eufs_lookup(struct inode *dir, struct dentry *dentry,
inode = eufs_iget(dir->i_sb, s2p(dir->i_sb, de->inode));
if (inode == ERR_PTR(-ESTALE)) {
- eufs_err(dir->i_sb, "deleted inode referenced: 0x%lx",
- inode->i_ino);
+ eufs_err(dir->i_sb, "deleted inode referenced: 0x%llx",
+ le64_to_cpu(de->inode));
return ERR_PTR(-EIO);
}
not_found:
--
2.34.1
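The smatch warning is the classic ERR_PTR pattern: eufs_iget() can return an encoded errno, and the error branch then reads a field through that non-pointer. A minimal kernel-style sketch of the bug class and the fix (the wrapper function is hypothetical; eufs_iget()/s2p() are taken from the patch above):
#include <linux/err.h>
#include <linux/fs.h>
#include <linux/printk.h>
static int lookup_demo(struct super_block *sb, __le64 raw_ino)
{
	struct inode *inode = eufs_iget(sb, s2p(sb, raw_ino));
	if (inode == ERR_PTR(-ESTALE)) {
		/* 'inode' encodes an errno here, not a valid object:
		 * reading inode->i_ino would dereference the error
		 * value, which is what smatch flags. Print the
		 * on-disk number that is still at hand instead. */
		pr_err("deleted inode referenced: 0x%llx\n",
		       le64_to_cpu(raw_ino));
		return -EIO;
	}
	if (IS_ERR(inode))
		return PTR_ERR(inode);
	iput(inode);
	return 0;
}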

12 Apr '23
From: Juan Zhou <zhoujuan51(a)h-partners.com>
Yixing Liu (2):
Update kernel headers
libhns: Support congestion control algorithm configuration
kernel-headers/rdma/hns-abi.h | 18 ++++-
providers/hns/hns_roce_u.c | 1 +
providers/hns/hns_roce_u.h | 6 ++
providers/hns/hns_roce_u_verbs.c | 114 +++++++++++++++++++++++++++++--
providers/hns/hnsdv.h | 24 ++++++-
providers/hns/libhns.map | 1 +
6 files changed, 156 insertions(+), 8 deletions(-)
--
2.30.0
From: Juan Zhou <zhoujuan51(a)h-partners.com>
This group of patches implements the congestion control algorithm
configuration.
Yixing Liu (2):
RDMA/hns: Modify congestion abbreviation
RDMA/hns: Support congestion control algorithm configuration at QP
granularity
drivers/infiniband/hw/hns/hns_roce_device.h | 16 +--
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 113 ++++++++++----------
drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 11 +-
drivers/infiniband/hw/hns/hns_roce_main.c | 4 +
drivers/infiniband/hw/hns/hns_roce_qp.c | 46 ++++++++
drivers/infiniband/hw/hns/hns_roce_sysfs.c | 2 +-
include/uapi/rdma/hns-abi.h | 18 +++-
7 files changed, 142 insertions(+), 68 deletions(-)
--
2.30.0

[PATCH openEuler-5.10-LTS 01/27] xfs: log worker needs to start before intent/unlink recovery
by Jialin Zhang 12 Apr '23
From: Dave Chinner <dchinner(a)redhat.com>
mainline inclusion
from mainline-v5.17-rc6
commit a9a4bc8c76d747aa40b30e2dfc176c781f353a08
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
After 963 iterations of generic/530, it deadlocked during recovery
on a pinned inode cluster buffer like so:
XFS (pmem1): Starting recovery (logdev: internal)
INFO: task kworker/8:0:306037 blocked for more than 122 seconds.
Not tainted 5.17.0-rc6-dgc+ #975
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/8:0 state:D stack:13024 pid:306037 ppid: 2 flags:0x00004000
Workqueue: xfs-inodegc/pmem1 xfs_inodegc_worker
Call Trace:
<TASK>
__schedule+0x30d/0x9e0
schedule+0x55/0xd0
schedule_timeout+0x114/0x160
__down+0x99/0xf0
down+0x5e/0x70
xfs_buf_lock+0x36/0xf0
xfs_buf_find+0x418/0x850
xfs_buf_get_map+0x47/0x380
xfs_buf_read_map+0x54/0x240
xfs_trans_read_buf_map+0x1bd/0x490
xfs_imap_to_bp+0x4f/0x70
xfs_iunlink_map_ino+0x66/0xd0
xfs_iunlink_map_prev.constprop.0+0x148/0x2f0
xfs_iunlink_remove_inode+0xf2/0x1d0
xfs_inactive_ifree+0x1a3/0x900
xfs_inode_unlink+0xcc/0x210
xfs_inodegc_worker+0x1ac/0x2f0
process_one_work+0x1ac/0x390
worker_thread+0x56/0x3c0
kthread+0xf6/0x120
ret_from_fork+0x1f/0x30
</TASK>
task:mount state:D stack:13248 pid:324509 ppid:324233 flags:0x00004000
Call Trace:
<TASK>
__schedule+0x30d/0x9e0
schedule+0x55/0xd0
schedule_timeout+0x114/0x160
__down+0x99/0xf0
down+0x5e/0x70
xfs_buf_lock+0x36/0xf0
xfs_buf_find+0x418/0x850
xfs_buf_get_map+0x47/0x380
xfs_buf_read_map+0x54/0x240
xfs_trans_read_buf_map+0x1bd/0x490
xfs_imap_to_bp+0x4f/0x70
xfs_iget+0x300/0xb40
xlog_recover_process_one_iunlink+0x4c/0x170
xlog_recover_process_iunlinks.isra.0+0xee/0x130
xlog_recover_finish+0x57/0x110
xfs_log_mount_finish+0xfc/0x1e0
xfs_mountfs+0x540/0x910
xfs_fs_fill_super+0x495/0x850
get_tree_bdev+0x171/0x270
xfs_fs_get_tree+0x15/0x20
vfs_get_tree+0x24/0xc0
path_mount+0x304/0xba0
__x64_sys_mount+0x108/0x140
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
</TASK>
task:xfsaild/pmem1 state:D stack:14544 pid:324525 ppid: 2 flags:0x00004000
Call Trace:
<TASK>
__schedule+0x30d/0x9e0
schedule+0x55/0xd0
io_schedule+0x4b/0x80
xfs_buf_wait_unpin+0x9e/0xf0
__xfs_buf_submit+0x14a/0x230
xfs_buf_delwri_submit_buffers+0x107/0x280
xfs_buf_delwri_submit_nowait+0x10/0x20
xfsaild+0x27e/0x9d0
kthread+0xf6/0x120
ret_from_fork+0x1f/0x30
We have the mount process waiting on an inode cluster buffer read,
inodegc doing unlink waiting on the same inode cluster buffer, and
the AIL push thread blocked in writeback waiting for the inode
cluster buffer to become unpinned.
What has happened here is that the AIL push thread has raced with
the inodegc process modifying, committing and pinning the inode
cluster buffer, here in xfs_buf_delwri_submit_buffers():
blk_start_plug(&plug);
list_for_each_entry_safe(bp, n, buffer_list, b_list) {
if (!wait_list) {
if (xfs_buf_ispinned(bp)) {
pinned++;
continue;
}
Here >>>>>>
if (!xfs_buf_trylock(bp))
continue;
Basically, the AIL has found the buffer wasn't pinned and got the
lock without blocking, but then the buffer was pinned. This implies
the processing here was pre-empted between the pin check and the
lock, because the pin count can only be increased while holding the
buffer locked. Hence when it has gone to submit the IO, it has
blocked waiting for the buffer to be unpinned.
With all executing threads now waiting on the buffer to be unpinned,
we normally get out of situations like this via the background log
worker issuing a log force which will unpin stuck buffers like
this. But at this point in recovery, we haven't started the log
worker. In fact, the first thing we do after processing intents and
unlinked inodes is *start the log worker*. IOWs, we start it too
late to have it break deadlocks like this.
Avoid this and any other similar deadlock vectors in intent and
unlinked inode recovery by starting the log worker before we recover
intents and unlinked inodes. This part of recovery runs as though
the filesystem is fully active, so we really should have the same
infrastructure running as we normally do at runtime.
Signed-off-by: Dave Chinner <dchinner(a)redhat.com>
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Reviewed-by: Chandan Babu R <chandan.babu(a)oracle.com>
Signed-off-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Long Li <leo.lilong(a)huawei.com>
Reviewed-by: Yang Erkun <yangerkun(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
fs/xfs/xfs_log.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 92c5d6ef47d6..e154c0d44f9c 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -782,10 +782,9 @@ xfs_log_mount_finish(
* mount failure occurs.
*/
mp->m_super->s_flags |= SB_ACTIVE;
+ xfs_log_work_queue(mp);
if (xlog_recovery_needed(log))
error = xlog_recover_finish(log);
- if (!error)
- xfs_log_work_queue(mp);
mp->m_super->s_flags &= ~SB_ACTIVE;
evict_inodes(mp->m_super);
--
2.25.1
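The pin-check/trylock window described in the message is a plain check-then-act race. A self-contained userspace sketch of the same shape (hypothetical types and locking, not the real xfs_buf code):
#include <pthread.h>
#include <stdatomic.h>
struct buf {
	pthread_mutex_t lock;
	atomic_int pin_count;	/* only raised while 'lock' is held */
};
static int try_submit(struct buf *bp)
{
	if (atomic_load(&bp->pin_count) > 0)
		return 0;			/* observed unpinned... */
	/* ...but another thread can lock, pin and unlock right here,
	 * so the trylock below still succeeds on a now-pinned buffer
	 * and the submit path then sleeps until something (normally
	 * the background log worker's log force) unpins it. */
	if (pthread_mutex_trylock(&bp->lock) != 0)
		return 0;
	/* submit I/O: would block while pin_count > 0 */
	pthread_mutex_unlock(&bp->lock);
	return 1;
}
Starting the log worker before intent/unlink recovery, as the diff does, guarantees that a log force eventually arrives to break this kind of wait.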

[PATCH openEuler-5.10-LTS-SP1 01/33] xfs: log worker needs to start before intent/unlink recovery
by Jialin Zhang 12 Apr '23
From: Dave Chinner <dchinner(a)redhat.com>
mainline inclusion
from mainline-v5.17-rc6
commit a9a4bc8c76d747aa40b30e2dfc176c781f353a08
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
After 963 iterations of generic/530, it deadlocked during recovery
on a pinned inode cluster buffer like so:
XFS (pmem1): Starting recovery (logdev: internal)
INFO: task kworker/8:0:306037 blocked for more than 122 seconds.
Not tainted 5.17.0-rc6-dgc+ #975
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/8:0 state:D stack:13024 pid:306037 ppid: 2 flags:0x00004000
Workqueue: xfs-inodegc/pmem1 xfs_inodegc_worker
Call Trace:
<TASK>
__schedule+0x30d/0x9e0
schedule+0x55/0xd0
schedule_timeout+0x114/0x160
__down+0x99/0xf0
down+0x5e/0x70
xfs_buf_lock+0x36/0xf0
xfs_buf_find+0x418/0x850
xfs_buf_get_map+0x47/0x380
xfs_buf_read_map+0x54/0x240
xfs_trans_read_buf_map+0x1bd/0x490
xfs_imap_to_bp+0x4f/0x70
xfs_iunlink_map_ino+0x66/0xd0
xfs_iunlink_map_prev.constprop.0+0x148/0x2f0
xfs_iunlink_remove_inode+0xf2/0x1d0
xfs_inactive_ifree+0x1a3/0x900
xfs_inode_unlink+0xcc/0x210
xfs_inodegc_worker+0x1ac/0x2f0
process_one_work+0x1ac/0x390
worker_thread+0x56/0x3c0
kthread+0xf6/0x120
ret_from_fork+0x1f/0x30
</TASK>
task:mount state:D stack:13248 pid:324509 ppid:324233 flags:0x00004000
Call Trace:
<TASK>
__schedule+0x30d/0x9e0
schedule+0x55/0xd0
schedule_timeout+0x114/0x160
__down+0x99/0xf0
down+0x5e/0x70
xfs_buf_lock+0x36/0xf0
xfs_buf_find+0x418/0x850
xfs_buf_get_map+0x47/0x380
xfs_buf_read_map+0x54/0x240
xfs_trans_read_buf_map+0x1bd/0x490
xfs_imap_to_bp+0x4f/0x70
xfs_iget+0x300/0xb40
xlog_recover_process_one_iunlink+0x4c/0x170
xlog_recover_process_iunlinks.isra.0+0xee/0x130
xlog_recover_finish+0x57/0x110
xfs_log_mount_finish+0xfc/0x1e0
xfs_mountfs+0x540/0x910
xfs_fs_fill_super+0x495/0x850
get_tree_bdev+0x171/0x270
xfs_fs_get_tree+0x15/0x20
vfs_get_tree+0x24/0xc0
path_mount+0x304/0xba0
__x64_sys_mount+0x108/0x140
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
</TASK>
task:xfsaild/pmem1 state:D stack:14544 pid:324525 ppid: 2 flags:0x00004000
Call Trace:
<TASK>
__schedule+0x30d/0x9e0
schedule+0x55/0xd0
io_schedule+0x4b/0x80
xfs_buf_wait_unpin+0x9e/0xf0
__xfs_buf_submit+0x14a/0x230
xfs_buf_delwri_submit_buffers+0x107/0x280
xfs_buf_delwri_submit_nowait+0x10/0x20
xfsaild+0x27e/0x9d0
kthread+0xf6/0x120
ret_from_fork+0x1f/0x30
We have the mount process waiting on an inode cluster buffer read,
inodegc doing unlink waiting on the same inode cluster buffer, and
the AIL push thread blocked in writeback waiting for the inode
cluster buffer to become unpinned.
What has happened here is that the AIL push thread has raced with
the inodegc process modifying, committing and pinning the inode
cluster buffer, here in xfs_buf_delwri_submit_buffers():
blk_start_plug(&plug);
list_for_each_entry_safe(bp, n, buffer_list, b_list) {
if (!wait_list) {
if (xfs_buf_ispinned(bp)) {
pinned++;
continue;
}
Here >>>>>>
if (!xfs_buf_trylock(bp))
continue;
Basically, the AIL has found the buffer wasn't pinned and got the
lock without blocking, but then the buffer was pinned. This implies
the processing here was pre-empted between the pin check and the
lock, because the pin count can only be increased while holding the
buffer locked. Hence when it has gone to submit the IO, it has
blocked waiting for the buffer to be unpinned.
With all executing threads now waiting on the buffer to be unpinned,
we normally get out of situations like this via the background log
worker issuing a log force which will unpin stuck buffers like
this. But at this point in recovery, we haven't started the log
worker. In fact, the first thing we do after processing intents and
unlinked inodes is *start the log worker*. IOWs, we start it too
late to have it break deadlocks like this.
Avoid this and any other similar deadlock vectors in intent and
unlinked inode recovery by starting the log worker before we recover
intents and unlinked inodes. This part of recovery runs as though
the filesystem is fully active, so we really should have the same
infrastructure running as we normally do at runtime.
Signed-off-by: Dave Chinner <dchinner(a)redhat.com>
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Reviewed-by: Chandan Babu R <chandan.babu(a)oracle.com>
Signed-off-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Long Li <leo.lilong(a)huawei.com>
Reviewed-by: Yang Erkun <yangerkun(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
fs/xfs/xfs_log.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 92c5d6ef47d6..e154c0d44f9c 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -782,10 +782,9 @@ xfs_log_mount_finish(
* mount failure occurs.
*/
mp->m_super->s_flags |= SB_ACTIVE;
+ xfs_log_work_queue(mp);
if (xlog_recovery_needed(log))
error = xlog_recover_finish(log);
- if (!error)
- xfs_log_work_queue(mp);
mp->m_super->s_flags &= ~SB_ACTIVE;
evict_inodes(mp->m_super);
--
2.25.1

[PATCH openEuler-1.0-LTS] hwmon: (xgene) Fix use after free bug in xgene_hwmon_remove due to race condition
by Yongqiang Liu 11 Apr '23
From: Zheng Wang <zyytlz.wz(a)163.com>
mainline inclusion
from mainline-v6.3-rc3
commit cb090e64cf25602b9adaf32d5dfc9c8bec493cd1
category: bugfix
bugzilla: 188657, https://gitee.com/src-openeuler/kernel/issues/I6T36A
CVE: CVE-2023-1855
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
In xgene_hwmon_probe, &ctx->workq is bound with xgene_hwmon_evt_work.
Then it will be started.
If we remove the driver, which calls xgene_hwmon_remove to clean up,
there may be unfinished work. The possible sequence is as follows:
CPU0 CPU1
|xgene_hwmon_evt_work
xgene_hwmon_remove |
kfifo_free(&ctx->async_msg_fifo);|
|
|kfifo_out_spinlocked
|//use &ctx->async_msg_fifo
Fix it by finishing the work before cleanup in xgene_hwmon_remove.
Fixes: 2ca492e22cb7 ("hwmon: (xgene) Fix crash when alarm occurs before driver probe")
Signed-off-by: Zheng Wang <zyytlz.wz(a)163.com>
Link: https://lore.kernel.org/r/20230310084007.1403388-1-zyytlz.wz@163.com
Signed-off-by: Guenter Roeck <linux(a)roeck-us.net>
Signed-off-by: Zhao Wenhui <zhaowenhui8(a)huawei.com>
Reviewed-by: songping yu <yusongping(a)huawei.com>
Reviewed-by: Zhang Qiao <zhangqiao22(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/hwmon/xgene-hwmon.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/hwmon/xgene-hwmon.c b/drivers/hwmon/xgene-hwmon.c
index a3cd91f23267..2dd19a420305 100644
--- a/drivers/hwmon/xgene-hwmon.c
+++ b/drivers/hwmon/xgene-hwmon.c
@@ -780,6 +780,7 @@ static int xgene_hwmon_remove(struct platform_device *pdev)
{
struct xgene_hwmon_dev *ctx = platform_get_drvdata(pdev);
+ cancel_work_sync(&ctx->workq);
hwmon_device_unregister(ctx->hwmon_dev);
kfifo_free(&ctx->async_msg_fifo);
if (acpi_disabled)
--
2.25.1
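The fix follows the usual shutdown-ordering rule: cancel any work that can still touch a resource before freeing that resource. A minimal sketch of the ordering the one-line patch enforces (hypothetical driver context; only the ordering matters):
#include <linux/kfifo.h>
#include <linux/workqueue.h>
struct demo_ctx {
	struct work_struct workq;	/* runs xgene_hwmon_evt_work */
	struct kfifo async_msg_fifo;	/* read by the work function */
};
static void demo_remove(struct demo_ctx *ctx)
{
	/* Wait for any queued or running work item before freeing
	 * the fifo it reads from; without this the worker on CPU1
	 * can call kfifo_out_spinlocked() on freed memory (the
	 * reported use-after-free). */
	cancel_work_sync(&ctx->workq);
	kfifo_free(&ctx->async_msg_fifo);
}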

11 Apr '23
Hello,
openEuler Developer Day 2023 (ODD 2023) is the openEuler community's annual developer conference
and the top summit for openEuler and the other open-source projects under the OpenAtom Foundation.
ODD 2023 will be held on April 20-21, 2023 at a hotel in Pudong, Shanghai; everyone is welcome to register and attend.
The agenda and related information: https://www.openeuler.org/zh/interaction/summit-list/devday2023/
Registration link: https://e-campaign.huawei.com/m/RNN3Yz
In addition, the openEuler Kernel SIG will hold an offline working meeting at ODD 2023 on
April 21, 16:00 - 17:30, where developers can meet and discuss face to face.
Topics for the offline working meeting are being collected in advance; please submit them
to the etherpad in the form "topic + name + affiliation":
https://etherpad.openeuler.org/p/ODD2023_Kernel_SIG
uniontech inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6TN56
--------------------------------
smatch report:
fs/eulerfs/namei.c:118 eufs_lookup() error: 'inode' dereferencing possible ERR_PTR()
Signed-off-by: Kang Chen <void0red(a)hust.edu.cn>
---
I ran smatch on the openEuler-2203-sp1 source and found some bugs.
But I have no idea about the fields of the commit message template.
Please let me know if there is anything that can be improved.
fs/eulerfs/namei.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/eulerfs/namei.c b/fs/eulerfs/namei.c
index e4c6c36575f2..cfba6eefbcc8 100644
--- a/fs/eulerfs/namei.c
+++ b/fs/eulerfs/namei.c
@@ -116,7 +116,7 @@ static struct dentry *eufs_lookup(struct inode *dir, struct dentry *dentry,
inode = eufs_iget(dir->i_sb, s2p(dir->i_sb, de->inode));
if (inode == ERR_PTR(-ESTALE)) {
eufs_err(dir->i_sb, "deleted inode referenced: 0x%lx",
- inode->i_ino);
+ le64_to_cpu(de->inode));
return ERR_PTR(-EIO);
}
not_found:
--
2.34.1

[PATCH openEuler-1.0-LTS] xirc2ps_cs: Fix use after free bug in xirc2ps_detach
by Yongqiang Liu 10 Apr '23
From: Zheng Wang <zyytlz.wz(a)163.com>
stable inclusion
from stable-v4.19.279
commit 526660c25d3b93b1232a525b75469048388f0928
category: bugfix
bugzilla: 188641, https://gitee.com/src-openeuler/kernel/issues/I6R4MM
CVE: CVE-2023-1670
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
[ Upstream commit e8d20c3ded59a092532513c9bd030d1ea66f5f44 ]
In xirc2ps_probe, the local->tx_timeout_task was bound
to xirc2ps_tx_timeout_task. When a timeout occurs,
it will call xirc_tx_timeout->schedule_work to start the
work.
When we call xirc2ps_detach to remove the driver, there
may be a sequence as follows:
CPU0 CPU1
|xirc2ps_tx_timeout_task
xirc2ps_detach |
free_netdev |
kfree(dev); |
|
| do_reset
| //use dev
Stop responding to timeout tasks and complete scheduled
tasks before cleanup in xirc2ps_detach, which fixes
the problem.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Zheng Wang <zyytlz.wz(a)163.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Dong Chenchen <dongchenchen2(a)huawei.com>
Reviewed-by: Liu Jian <liujian56(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/net/ethernet/xircom/xirc2ps_cs.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/xircom/xirc2ps_cs.c b/drivers/net/ethernet/xircom/xirc2ps_cs.c
index fd5288ff53b5..e3438cef5f9c 100644
--- a/drivers/net/ethernet/xircom/xirc2ps_cs.c
+++ b/drivers/net/ethernet/xircom/xirc2ps_cs.c
@@ -503,6 +503,11 @@ static void
xirc2ps_detach(struct pcmcia_device *link)
{
struct net_device *dev = link->priv;
+ struct local_info *local = netdev_priv(dev);
+
+ netif_carrier_off(dev);
+ netif_tx_disable(dev);
+ cancel_work_sync(&local->tx_timeout_task);
dev_dbg(&link->dev, "detach\n");
--
2.25.1

[PATCH openEuler-1.0-LTS] 9p/xen : Fix use after free bug in xen_9pfs_front_remove due to race condition
by Yongqiang Liu 10 Apr '23
From: Zheng Wang <zyytlz.wz(a)163.com>
maillist inclusion
category: bugfix
bugzilla: 188655, https://gitee.com/src-openeuler/kernel/issues/I6T36H
CVE: CVE-2023-1859
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/…
----------------------------------------
In xen_9pfs_front_probe, it calls xen_9pfs_front_alloc_dataring
to init priv->rings and binds &ring->work to p9_xen_response.
When it calls xen_9pfs_front_event_handler to handle IRQ requests,
it will finally call schedule_work to start the work.
When we call xen_9pfs_front_remove to remove the driver, there
may be a sequence as follows:
CPU0 CPU1
|p9_xen_response
xen_9pfs_front_remove|
xen_9pfs_front_free|
kfree(priv) |
//free priv |
|p9_tag_lookup
|//use priv->client
Fix it by finishing the work before cleanup in xen_9pfs_front_free.
Note that this bug was found by static analysis, so it might be a
false positive.
Fixes: 71ebd71921e4 ("xen/9pfs: connect to the backend")
Signed-off-by: Zheng Wang <zyytlz.wz(a)163.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski(a)linux.intel.com>
Signed-off-by: Eric Van Hensbergen <ericvh(a)kernel.org>
Signed-off-by: Lu Wei <luwei32(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
net/9p/trans_xen.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
index 81002cba88b3..8e158f09cba8 100644
--- a/net/9p/trans_xen.c
+++ b/net/9p/trans_xen.c
@@ -290,6 +290,10 @@ static void xen_9pfs_front_free(struct xen_9pfs_front_priv *priv)
write_unlock(&xen_9pfs_lock);
for (i = 0; i < priv->num_rings; i++) {
+ struct xen_9pfs_dataring *ring = &priv->rings[i];
+
+ cancel_work_sync(&ring->work);
+
if (!priv->rings[i].intf)
break;
if (priv->rings[i].irq > 0)
--
2.25.1

[PATCH OLK-5.10] KVM: nVMX: Set LDTR to its architecturally defined value on nested VM-Exit
by Ren Minmin (China Unicom Digital Technology Co., Ltd. Headquarters) 10 Apr '23
From: Sean Christopherson <seanjc(a)google.com>
stable inclusion
from stable-v5.15
commit afc8de0118be84f4058b9977d481aeb3e0758dbb
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6SN2F
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
------------------------------
commit afc8de0118be84f4058b9977d481aeb3e0758dbb upstream
Set L1's LDTR on VM-Exit per the Intel SDM:
The host-state area does not contain a selector field for LDTR. LDTR is
established as follows on all VM exits: the selector is cleared to
0000H, the segment is marked unusable and is otherwise undefined
(although the base address is always canonical).
This is likely a benign bug since the LDTR is unusable, as it means the
L1 VMM is conditioned to reload its LDTR in order to function properly on
bare metal.
Fixes: 4704d0befb07 ("KVM: nVMX: Exiting from L2 to L1")
Reviewed-by: Reiji Watanabe <reijiw(a)google.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20210713163324.627647-3-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: rminmin <renmm6(a)chinaunicom.cn>
---
arch/x86/kvm/vmx/nested.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 9003b14d72ca..05284589c14d 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4326,6 +4326,10 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
};
vmx_set_segment(vcpu, &seg, VCPU_SREG_TR);
+ memset(&seg, 0, sizeof(seg));
+ seg.unusable = 1;
+ vmx_set_segment(vcpu, &seg, VCPU_SREG_LDTR);
+
kvm_set_dr(vcpu, 7, 0x400);
vmcs_write64(GUEST_IA32_DEBUGCTL, 0);
--
2.33.0
test report:
[root@localhost at]# bash -x dyn_affinity_interface.sh
+ ret=0
++ cat /proc/sys/kernel/sched_util_low_pct
+ default_sched_util_low_pct=85
+ read_default_val /proc/sys/kernel/sched_util_low_pct 85
+ local interface=/proc/sys/kernel/sched_util_low_pct
+ local expect_val=85
+ echo 'TINFO: test /proc/sys/kernel/sched_util_low_pct default value is 85'
TINFO: test /proc/sys/kernel/sched_util_low_pct default value is 85
+ '[' '!' -f /proc/sys/kernel/sched_util_low_pct ']'
++ cat /proc/sys/kernel/sched_util_low_pct
+ '[' 85 '!=' 85 ']'
+ '[' 0 -ne 0 ']'
+ echo 'TPASS: dyn_affinity_interface test success'
TPASS: dyn_affinity_interface test success
+ write_some_val /proc/sys/kernel/sched_util_low_pct -1 1
+ write_some_val /proc/sys/kernel/sched_util_low_pct 101 1
+ write_some_val /proc/sys/kernel/sched_util_low_pct abc 1
+ write_some_val /proc/sys/kernel/sched_util_low_pct 100 0
+ local interface=/proc/sys/kernel/sched_util_low_pct
+ local write_val=100
+ local expect_ret=0
+ echo 'TINFO: write 100 to /proc/sys/kernel/sched_util_low_pct, expect ret: 0'
TINFO: write 100 to /proc/sys/kernel/sched_util_low_pct, expect ret: 0
+ echo 100
+ '[' 0 -ne 0 ']'
+ echo 'TPASS: write 100 to /proc/sys/kernel/sched_util_low_pct success'
TPASS: write 100 to /proc/sys/kernel/sched_util_low_pct success
+ write_some_val /proc/sys/kernel/sched_util_low_pct 0 0
+ local interface=/proc/sys/kernel/sched_util_low_pct
+ local write_val=0
+ local expect_ret=0
+ echo 'TINFO: write 0 to /proc/sys/kernel/sched_util_low_pct, expect ret: 0'
TINFO: write 0 to /proc/sys/kernel/sched_util_low_pct, expect ret: 0
+ echo 0
+ '[' 0 -ne 0 ']'
+ echo 'TPASS: write 0 to /proc/sys/kernel/sched_util_low_pct success'
TPASS: write 0 to /proc/sys/kernel/sched_util_low_pct success
+ cleanup
+ echo 85
+ exit 0
[root@localhost at]#
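The accepted and rejected writes in the log above (0 and 100 succeed; -1, 101 and "abc" fail, and the default is 85) are consistent with a standard min/max-clamped sysctl. A sketch of the kind of ctl_table entry that produces this behaviour (an assumed shape with assumed variable names, not the actual openEuler definition):
#include <linux/sysctl.h>
static int sysctl_sched_util_low_pct = 85;	/* default seen in the log */
static int demo_zero;
static int demo_one_hundred = 100;
/* proc_dointvec_minmax rejects out-of-range and non-numeric input,
 * matching the expected-failure cases exercised by the test. */
static struct ctl_table sched_util_table[] = {
	{
		.procname	= "sched_util_low_pct",
		.data		= &sysctl_sched_util_low_pct,
		.maxlen		= sizeof(int),
		.mode		= 0644,
		.proc_handler	= proc_dointvec_minmax,
		.extra1		= &demo_zero,
		.extra2		= &demo_one_hundred,
	},
	{ }
};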
[root@localhost at]# bash -x dyn_affinity_low_load.sh
+ TEST_CPUSET_PATH=/sys/fs/cgroup/cpuset/cloud
+ TEST_CPU_PATH=/sys/fs/cgroup/cpu/cloud
+ ret=0
++ seq 1 1
+ for i in $(seq 1 1)
+ setup
+ '[' -d /sys/fs/cgroup/cpuset/cloud ']'
+ mkdir -p /sys/fs/cgroup/cpuset/cloud /sys/fs/cgroup/cpu/cloud
+ echo 0
+ echo 0-7
+ echo 1
+ do_test
+ echo 4166
+ echo 4166
+ local pid=0
++ seq 1 3
+ for i in $(seq 1 3)
+ pid=4170
+ echo 4170
+ echo 4170
+ for i in $(seq 1 3)
+ pid=4171
+ echo 4171
+ echo 4171
+ ./cpu_load_not_while
+ for i in $(seq 1 3)
+ pid=4172
+ ./cpu_load_not_while
+ echo 4172
+ ./cpu_load_not_while
+ echo 4172
+ echo 4166
+ echo 4166
+ sleep 0.1
++ seq 1 100
+ for i in $(seq 1 100)
++ ps -o psr,comm,pid
++ grep load
++ egrep 1 -w
++ wc -l
+ local count=3
+ '[' 3 -ne 3 ']'
(iterations 2-100 of the seq 1 100 loop elided: the same ps/grep/egrep/wc pipeline and count=3 check repeat verbatim)
+ '[' 0 -eq 0 ']'
+ ps -o psr,comm,pid
+ grep cpu_load
1 cpu_load_not_wh 4170
1 cpu_load_not_wh 4171
1 cpu_load_not_wh 4172
+ echo 'TPASS: test success'
TPASS: test success
+ cleanup
+ killall cpu_load_not_while
+ sleep 0.1
+ rmdir /sys/fs/cgroup/cpuset/cloud /sys/fs/cgroup/cpu/cloud
+ exit 0
[root@localhost at]# bash -x dyn_affinity_high_load.sh
+ TEST_CPUSET_PATH=/sys/fs/cgroup/cpuset/cloud
+ TEST_CPU_PATH=/sys/fs/cgroup/cpu/cloud
+ ret=0
++ seq 1 1
+ for i in $(seq 1 1)
+ setup
+ '[' -d /sys/fs/cgroup/cpuset/cloud ']'
+ mkdir -p /sys/fs/cgroup/cpuset/cloud /sys/fs/cgroup/cpu/cloud
+ echo 0
+ echo 0-7
+ echo 1
+ do_test
+ echo 4121
+ echo 4121
+ local pid=0
++ seq 1 30
+ for i in $(seq 1 30)
+ pid=4125
+ echo 4125
+ echo 4125
+ for i in $(seq 1 30)
+ pid=4126
+ echo 4126
+ ./cpu_load_not_while
+ echo 4126
+ for i in $(seq 1 30)
+ pid=4127
+ echo 4127
+ ./cpu_load_not_while
+ ./cpu_load_not_while
+ echo 4127
+ for i in $(seq 1 30)
+ pid=4128
+ ./cpu_load_not_while
+ echo 4128
+ echo 4128
+ for i in $(seq 1 30)
+ pid=4129
+ echo 4129
+ echo 4129
+ for i in $(seq 1 30)
+ ./cpu_load_not_while
+ pid=4130
+ echo 4130
+ ./cpu_load_not_while
+ echo 4130
+ for i in $(seq 1 30)
+ pid=4131
+ ./cpu_load_not_while
+ echo 4131
+ echo 4131
+ for i in $(seq 1 30)
+ pid=4132
+ echo 4132
+ echo 4132
+ ./cpu_load_not_while
+ for i in $(seq 1 30)
+ pid=4133
+ ./cpu_load_not_while
+ echo 4133
+ echo 4133
+ for i in $(seq 1 30)
+ pid=4134
+ ./cpu_load_not_while
+ echo 4134
+ echo 4134
+ for i in $(seq 1 30)
+ pid=4135
+ ./cpu_load_not_while
+ echo 4135
+ echo 4135
+ for i in $(seq 1 30)
+ pid=4136
+ ./cpu_load_not_while
+ echo 4136
+ echo 4136
+ for i in $(seq 1 30)
+ pid=4137
+ ./cpu_load_not_while
+ echo 4137
+ echo 4137
+ for i in $(seq 1 30)
+ pid=4138
+ echo 4138
+ ./cpu_load_not_while
+ echo 4138
+ for i in $(seq 1 30)
+ pid=4139
+ ./cpu_load_not_while
+ echo 4139
+ echo 4139
+ for i in $(seq 1 30)
+ pid=4140
+ ./cpu_load_not_while
+ echo 4140
+ echo 4140
+ for i in $(seq 1 30)
+ pid=4141
+ echo 4141
+ ./cpu_load_not_while
+ echo 4141
+ for i in $(seq 1 30)
+ pid=4142
+ echo 4142
+ ./cpu_load_not_while
+ echo 4142
+ for i in $(seq 1 30)
+ pid=4143
+ echo 4143
+ ./cpu_load_not_while
+ echo 4143
+ for i in $(seq 1 30)
+ pid=4144
+ ./cpu_load_not_while
+ echo 4144
+ echo 4144
+ for i in $(seq 1 30)
+ pid=4145
+ ./cpu_load_not_while
+ echo 4145
+ echo 4145
+ for i in $(seq 1 30)
+ pid=4146
+ echo 4146
+ echo 4146
+ ./cpu_load_not_while
+ for i in $(seq 1 30)
+ pid=4147
+ echo 4147
+ echo 4147
+ ./cpu_load_not_while
+ for i in $(seq 1 30)
+ pid=4148
+ echo 4148
+ echo 4148
+ ./cpu_load_not_while
+ for i in $(seq 1 30)
+ pid=4149
+ echo 4149
+ echo 4149
+ ./cpu_load_not_while
+ for i in $(seq 1 30)
+ pid=4150
+ echo 4150
+ echo 4150
+ ./cpu_load_not_while
+ for i in $(seq 1 30)
+ pid=4151
+ echo 4151
+ echo 4151
+ ./cpu_load_not_while
+ for i in $(seq 1 30)
+ pid=4152
+ echo 4152
+ ./cpu_load_not_while
+ echo 4152
+ for i in $(seq 1 30)
+ pid=4153
+ ./cpu_load_not_while
+ echo 4153
+ echo 4153
+ for i in $(seq 1 30)
+ pid=4154
+ ./cpu_load_not_while
+ echo 4154
+ echo 4154
+ echo 4121
+ echo 4121
+ sleep 0.1
++ seq 1 100
+ for i in $(seq 1 100)
++ ps -o psr,comm,pid
++ grep load
++ egrep 1 -w
++ wc -l
+ local count=19
+ '[' 19 -ne 30 ']'
+ ps -o psr,comm,pid
+ grep cpu_load
1 cpu_load_not_wh 4125
6 cpu_load_not_wh 4126
1 cpu_load_not_wh 4127
1 cpu_load_not_wh 4128
1 cpu_load_not_wh 4129
1 cpu_load_not_wh 4130
1 cpu_load_not_wh 4131
1 cpu_load_not_wh 4132
1 cpu_load_not_wh 4133
1 cpu_load_not_wh 4134
7 cpu_load_not_wh 4135
1 cpu_load_not_wh 4136
1 cpu_load_not_wh 4137
4 cpu_load_not_wh 4138
1 cpu_load_not_wh 4139
1 cpu_load_not_wh 4140
1 cpu_load_not_wh 4141
5 cpu_load_not_wh 4142
1 cpu_load_not_wh 4143
1 cpu_load_not_wh 4144
1 cpu_load_not_wh 4145
1 cpu_load_not_wh 4146
1 cpu_load_not_wh 4147
3 cpu_load_not_wh 4148
5 cpu_load_not_wh 4149
1 cpu_load_not_wh 4150
1 cpu_load_not_wh 4151
1 cpu_load_not_wh 4152
5 cpu_load_not_wh 4153
6 cpu_load_not_wh 4154
+ echo 'TPASS: test success'
TPASS: test success
+ exit
tanghui (2):
sched: Add statistics for scheduler dynamic affinity
config: enable CONFIG_QOS_SCHED_DYNAMIC_AFFINITY by default
arch/arm64/configs/openeuler_defconfig | 1 +
arch/x86/configs/openeuler_defconfig | 1 +
include/linux/sched.h | 6 ++++++
kernel/sched/debug.c | 4 ++++
kernel/sched/fair.c | 11 +++++++++--
5 files changed, 21 insertions(+), 2 deletions(-)
--
2.17.1

08 Apr '23
From: Linus Torvalds <torvalds(a)linux-foundation.org>
stable inclusion
from stable-v4.19.274
commit c7603df97635954165fb599e64e197efc353979b
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6TIG1
CVE: NA
--------------------------------
commit f3dd0c53370e70c0f9b7e931bbec12916f3bb8cc upstream.
Commit 74e19ef0ff80 ("uaccess: Add speculation barrier to
copy_from_user()") built fine on x86-64 and arm64, and that's the extent
of my local build testing.
It turns out those got the <linux/nospec.h> include incidentally through
other header files (<linux/kvm_host.h> in particular), but that was not
true of other architectures, resulting in build errors
kernel/bpf/core.c: In function ‘___bpf_prog_run’:
kernel/bpf/core.c:1913:3: error: implicit declaration of function ‘barrier_nospec’
so just make sure to explicitly include the proper <linux/nospec.h>
header file to make everybody see it.
Fixes: 74e19ef0ff80 ("uaccess: Add speculation barrier to copy_from_user()")
Reported-by: kernel test robot <lkp(a)intel.com>
Reported-by: Viresh Kumar <viresh.kumar(a)linaro.org>
Reported-by: Huacai Chen <chenhuacai(a)loongson.cn>
Tested-by: Geert Uytterhoeven <geert(a)linux-m68k.org>
Tested-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Acked-by: Alexei Starovoitov <alexei.starovoitov(a)gmail.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
kernel/bpf/core.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index abf30609e0d9..56a60d483c8d 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -32,6 +32,7 @@
#include <linux/kallsyms.h>
#include <linux/rcupdate.h>
#include <linux/perf_event.h>
+#include <linux/nospec.h>
#include <asm/barrier.h>
#include <asm/unaligned.h>
--
2.25.1
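The fix is a pure include-what-you-use change: a translation unit that calls barrier_nospec() must include the header that provides it instead of inheriting it from unrelated headers. A trivial sketch:
#include <linux/nospec.h>	/* provides barrier_nospec() */
/* Relying on an indirect include (e.g. via <linux/kvm_host.h>)
 * happens to build on some architectures only; the explicit
 * include works everywhere. */
static inline void demo_speculation_fence(void)
{
	barrier_nospec();
}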
Armin Wolf (1):
ACPI: battery: Fix missing NUL-termination with large strings
Baokun Li (1):
ext4: fail ext4_iget if special inode unallocated
Breno Leitao (1):
x86/bugs: Reset speculation control settings on init
Daniil Tatianin (1):
ACPICA: nsrepair: handle cases without a return value correctly
Dave Hansen (1):
uaccess: Add speculation barrier to copy_from_user()
Herbert Xu (2):
crypto: seqiv - Handle EBUSY correctly
crypto: rsa-pkcs1pad - Use akcipher_request_complete
Jann Horn (1):
timers: Prevent union confusion from unexpected restart_syscall()
Jason A. Donenfeld (1):
random: always mix cycle counter in add_latent_entropy()
Johan Hovold (3):
irqdomain: Fix association race
irqdomain: Fix disassociation race
irqdomain: Drop bogus fwspec-mapping error handling
Nikita Zhandarovich (1):
x86/mm: Fix use of uninitialized buffer in sme_enable()
Thomas Gleixner (1):
alarmtimer: Prevent starvation by small intervals and SIG_IGN
Vishal Verma (1):
ACPI: NFIT: fix a potential deadlock during NFIT teardown
Yang Jihong (2):
x86/kprobes: Fix __recover_optprobed_insn check optimizing logic
x86/kprobes: Fix arch_check_optimized_kprobe check within
optimized_kprobe range
Zhen Lei (1):
genirq: Fix the return type of kstat_cpu_irqs_sum()
Zhihao Cheng (1):
ext4: zero i_disksize when initializing the bootloader inode
arch/x86/include/asm/msr-index.h | 4 ++++
arch/x86/kernel/cpu/bugs.c | 10 ++++++++-
arch/x86/kernel/kprobes/opt.c | 6 +++---
arch/x86/mm/mem_encrypt_identity.c | 3 ++-
crypto/rsa-pkcs1pad.c | 34 +++++++++++++-----------------
crypto/seqiv.c | 2 +-
drivers/acpi/acpica/nsrepair.c | 12 ++++++-----
drivers/acpi/battery.c | 2 +-
drivers/acpi/nfit/core.c | 2 +-
fs/ext4/inode.c | 18 +++++++---------
fs/ext4/ioctl.c | 1 +
include/linux/kernel_stat.h | 2 +-
include/linux/kprobes.h | 2 ++
include/linux/nospec.h | 4 ++++
include/linux/random.h | 6 +++---
kernel/bpf/core.c | 2 --
kernel/irq/irqdomain.c | 31 +++++++++++++++++----------
kernel/kprobes.c | 6 +++---
kernel/time/alarmtimer.c | 33 +++++++++++++++++++++++++----
kernel/time/hrtimer.c | 2 ++
kernel/time/posix-stubs.c | 2 ++
kernel/time/posix-timers.c | 2 ++
lib/usercopy.c | 7 ++++++
23 files changed, 127 insertions(+), 66 deletions(-)
--
2.25.1

[PATCH openEuler-1.0-LTS 01/19] alarmtimer: Prevent starvation by small intervals and SIG_IGN
by Yongqiang Liu 07 Apr '23
From: Thomas Gleixner <tglx(a)linutronix.de>
stable inclusion
from stable-v4.19.274
commit d6a300076d11a6e27b4d4f7fd986ec66ee97a3e1
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6TIG1
CVE: NA
--------------------------------
commit d125d1349abeb46945dc5e98f7824bf688266f13 upstream.
syzbot reported a RCU stall which is caused by setting up an alarmtimer
with a very small interval and ignoring the signal. The reproducer arms the
alarm timer with a relative expiry of 8ns and an interval of 9ns. Not a
problem per se, but that's an issue when the signal is ignored because then
the timer is immediately rearmed because there is no way to delay that
rearming to the signal delivery path. See posix_timer_fn() and commit
58229a189942 ("posix-timers: Prevent softirq starvation by small intervals
and SIG_IGN") for details.
The reproducer does not set SIG_IGN explicitly, but it sets up the timer's
signal with SIGCONT. That has the same effect as explicitly setting
SIG_IGN for a signal, as SIGCONT is ignored if there is no handler set and
the task is not ptraced.
The log clearly shows that:
[pid 5102] --- SIGCONT {si_signo=SIGCONT, si_code=SI_TIMER, si_timerid=0, si_overrun=316014, si_int=0, si_ptr=NULL} ---
It works because the tasks are traced and therefore the signal is queued so
the tracer can see it, which delays the restart of the timer to the signal
delivery path. But then the tracer is killed:
[pid 5087] kill(-5102, SIGKILL <unfinished ...>
...
./strace-static-x86_64: Process 5107 detached
and after it's gone the stall can be observed:
syzkaller login: [ 79.439102][ C0] hrtimer: interrupt took 68471 ns
[ 184.460538][ C1] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
...
[ 184.658237][ C1] rcu: Stack dump where RCU GP kthread last ran:
[ 184.664574][ C1] Sending NMI from CPU 1 to CPUs 0:
[ 184.669821][ C0] NMI backtrace for cpu 0
[ 184.669831][ C0] CPU: 0 PID: 5108 Comm: syz-executor192 Not tainted 6.2.0-rc6-next-20230203-syzkaller #0
...
[ 184.670036][ C0] Call Trace:
[ 184.670041][ C0] <IRQ>
[ 184.670045][ C0] alarmtimer_fired+0x327/0x670
posix_timer_fn() prevents that by checking whether the interval for
timers which have the signal ignored is smaller than a jiffie and
artificially delaying them by shifting the next expiry out by a jiffie.
That's accurate vs. the overrun accounting, but slightly inaccurate
vs. timer_gettime(2).
The comment in that function says what needs to be done and there was a fix
available for the regular userspace induced SIG_IGN mechanism, but that did
not work due to the implicit ignore for SIGCONT and similar signals. This
needs to be worked on, but for now the only available workaround is to do
exactly what posix_timer_fn() does:
Increase the interval of self-rearming timers, which have their signal
ignored, to at least a jiffie.
Interestingly this has been fixed before via commit ff86bf0c65f1
("alarmtimer: Rate limit periodic intervals") already, but that fix got
lost in a later rework.
Reported-by: syzbot+b9564ba6e8e00694511b(a)syzkaller.appspotmail.com
Fixes: f2c45807d399 ("alarmtimer: Switch over to generic set/get/rearm routine")
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: John Stultz <jstultz(a)google.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/87k00q1no2.ffs@tglx
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
kernel/time/alarmtimer.c | 33 +++++++++++++++++++++++++++++----
1 file changed, 29 insertions(+), 4 deletions(-)
diff --git a/kernel/time/alarmtimer.c b/kernel/time/alarmtimer.c
index 6a2ba39889bd..56af8a97cf2d 100644
--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -476,11 +476,35 @@ u64 alarm_forward(struct alarm *alarm, ktime_t now, ktime_t interval)
}
EXPORT_SYMBOL_GPL(alarm_forward);
-u64 alarm_forward_now(struct alarm *alarm, ktime_t interval)
+static u64 __alarm_forward_now(struct alarm *alarm, ktime_t interval, bool throttle)
{
struct alarm_base *base = &alarm_bases[alarm->type];
+ ktime_t now = base->gettime();
+
+ if (IS_ENABLED(CONFIG_HIGH_RES_TIMERS) && throttle) {
+ /*
+ * Same issue as with posix_timer_fn(). Timers which are
+ * periodic but the signal is ignored can starve the system
+ * with a very small interval. The real fix which was
+ * promised in the context of posix_timer_fn() never
+ * materialized, but someone should really work on it.
+ *
+ * To prevent DOS fake @now to be 1 jiffie out which keeps
+ * the overrun accounting correct but creates an
+ * inconsistency vs. timer_gettime(2).
+ */
+ ktime_t kj = NSEC_PER_SEC / HZ;
+
+ if (interval < kj)
+ now = ktime_add(now, kj);
+ }
+
+ return alarm_forward(alarm, now, interval);
+}
- return alarm_forward(alarm, base->gettime(), interval);
+u64 alarm_forward_now(struct alarm *alarm, ktime_t interval)
+{
+ return __alarm_forward_now(alarm, interval, false);
}
EXPORT_SYMBOL_GPL(alarm_forward_now);
@@ -554,9 +578,10 @@ static enum alarmtimer_restart alarm_handle_timer(struct alarm *alarm,
if (posix_timer_event(ptr, si_private) && ptr->it_interval) {
/*
* Handle ignored signals and rearm the timer. This will go
- * away once we handle ignored signals proper.
+ * away once we handle ignored signals proper. Ensure that
+ * small intervals cannot starve the system.
*/
- ptr->it_overrun += alarm_forward_now(alarm, ptr->it_interval);
+ ptr->it_overrun += __alarm_forward_now(alarm, ptr->it_interval, true);
++ptr->it_requeue_pending;
ptr->it_active = 1;
result = ALARMTIMER_RESTART;
--
2.25.1
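A worked instance of the clamp, assuming HZ=1000 for the arithmetic (the 9 ns interval comes from the syzbot reproducer; the function and macro names are illustrative):
#include <stdint.h>
#define DEMO_NSEC_PER_SEC 1000000000ULL
#define DEMO_HZ 1000ULL			/* assumed for the arithmetic */
/* A 9 ns interval is far below one jiffie (1,000,000 ns at HZ=1000),
 * so 'now' is faked one jiffie ahead: the overrun accounting stays
 * correct, but the timer can fire at most HZ times per second
 * instead of rearming back-to-back and starving the CPU. */
static uint64_t throttled_now(uint64_t now_ns, uint64_t interval_ns)
{
	uint64_t kj = DEMO_NSEC_PER_SEC / DEMO_HZ;
	if (interval_ns < kj)
		now_ns += kj;
	return now_ns;
}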

[PATCH openEuler-1.0-LTS 1/7] Revert "cgroup: Add missing cpus_read_lock() to cgroup_attach_task_all()"
by Yongqiang Liu 07 Apr '23
From: Cai Xinchen <caixinchen1(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6TI3Y
CVE: NA
--------------------------------
This reverts commit c2d8355618485dd9108ee9077799a227771af307.
Signed-off-by: Cai Xinchen <caixinchen1(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
kernel/cgroup/cgroup-internal.h | 2 --
kernel/cgroup/cgroup-v1.c | 4 ++--
kernel/cgroup/cgroup.c | 4 ++--
3 files changed, 4 insertions(+), 6 deletions(-)
diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h
index edb45e2f7f54..2e65e4c4d6e7 100644
--- a/kernel/cgroup/cgroup-internal.h
+++ b/kernel/cgroup/cgroup-internal.h
@@ -226,8 +226,6 @@ int cgroup_migrate(struct task_struct *leader, bool threadgroup,
int cgroup_attach_task(struct cgroup *dst_cgrp, struct task_struct *leader,
bool threadgroup);
-void cgroup_attach_lock(void);
-void cgroup_attach_unlock(void);
struct task_struct *cgroup_procs_write_start(char *buf, bool threadgroup)
__acquires(&cgroup_threadgroup_rwsem);
void cgroup_procs_write_finish(struct task_struct *task)
diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index 8bd36f2143eb..c4cc6c1ddacd 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -55,7 +55,7 @@ int cgroup_attach_task_all(struct task_struct *from, struct task_struct *tsk)
int retval = 0;
mutex_lock(&cgroup_mutex);
- cgroup_attach_lock();
+ percpu_down_write(&cgroup_threadgroup_rwsem);
for_each_root(root) {
struct cgroup *from_cgrp;
@@ -70,7 +70,7 @@ int cgroup_attach_task_all(struct task_struct *from, struct task_struct *tsk)
if (retval)
break;
}
- cgroup_attach_unlock();
+ percpu_up_write(&cgroup_threadgroup_rwsem);
mutex_unlock(&cgroup_mutex);
return retval;
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 4a4d8a3f06ab..6487df9a6be0 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -2236,7 +2236,7 @@ EXPORT_SYMBOL_GPL(task_cgroup_path);
* write-locking cgroup_threadgroup_rwsem. This allows ->attach() to assume that
* CPU hotplug is disabled on entry.
*/
-void cgroup_attach_lock(void)
+static void cgroup_attach_lock(void)
{
cpus_read_lock();
percpu_down_write(&cgroup_threadgroup_rwsem);
@@ -2246,7 +2246,7 @@ void cgroup_attach_lock(void)
* cgroup_attach_unlock - Undo cgroup_attach_lock()
* @lock_threadgroup: whether to up_write cgroup_threadgroup_rwsem
*/
-void cgroup_attach_unlock(void)
+static void cgroup_attach_unlock(void)
{
percpu_up_write(&cgroup_threadgroup_rwsem);
cpus_read_unlock();
--
2.25.1

07 Apr '23
mainline inclusion
from mainline-v5.19-rc1
commit 3aba103006bcc4a7472b7c9506b3bc065ffb7992
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6TK1U
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
A connect with O_NONBLOCK will not complete immediately
and returns -EINPROGRESS. It is possible to wait for completion
with select/poll by selecting the socket for writing. After select
indicates writability, a second connect call will return 0 to
indicate a successful connection, as TCP does, but smc returns
-EISCONN. Use the socket state in smc to track the connect state,
which helps smc align its connect behaviour with TCP.
Signed-off-by: Guangguan Wang <guangguan.wang(a)linux.alibaba.com>
Acked-by: Karsten Graul <kgraul(a)linux.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Litao Jiao <jiaolitao(a)sangfor.com.cn>
---
net/smc/af_smc.c | 51 ++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 47 insertions(+), 4 deletions(-)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 5d7710dd9514..8f73da1ee7b4 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -1097,9 +1097,29 @@ static int smc_connect(struct socket *sock, struct sockaddr *addr,
goto out_err;
lock_sock(sk);
+ switch (sock->state) {
+ default:
+ rc = -EINVAL;
+ goto out;
+ case SS_CONNECTED:
+ rc = sk->sk_state == SMC_ACTIVE ? -EISCONN : -EINVAL;
+ goto out;
+ case SS_CONNECTING:
+ if (sk->sk_state == SMC_ACTIVE)
+ goto connected;
+ break;
+ case SS_UNCONNECTED:
+ sock->state = SS_CONNECTING;
+ break;
+ }
+
switch (sk->sk_state) {
default:
goto out;
+ case SMC_CLOSED:
+ rc = sock_error(sk) ? : -ECONNABORTED;
+ sock->state = SS_UNCONNECTED;
+ goto out;
case SMC_ACTIVE:
rc = -EISCONN;
goto out;
@@ -1118,21 +1138,26 @@ static int smc_connect(struct socket *sock, struct sockaddr *addr,
if (rc && rc != -EINPROGRESS)
goto out;
- if (smc->use_fallback)
+ if (smc->use_fallback) {
+ sock->state = rc ? SS_CONNECTING : SS_CONNECTED;
goto out;
+ }
sock_hold(&smc->sk); /* sock put in passive closing */
+
if (flags & O_NONBLOCK) {
if (queue_work(smc_hs_wq, &smc->connect_work))
smc->connect_nonblock = 1;
rc = -EINPROGRESS;
+ goto out;
} else {
rc = __smc_connect(smc);
if (rc < 0)
goto out;
- else
- rc = 0; /* success cases including fallback */
}
+connected:
+ rc = 0;
+ sock->state = SS_CONNECTED;
out:
release_sock(sk);
out_err:
@@ -1234,6 +1259,7 @@ struct sock *smc_accept_dequeue(struct sock *parent,
}
if (new_sock) {
sock_graft(new_sk, new_sock);
+ new_sock->state = SS_CONNECTED;
if (isk->use_fallback) {
smc_sk(new_sk)->clcsock->file = new_sock->file;
isk->clcsock->file->private_data = isk->clcsock;
@@ -1865,7 +1891,7 @@ static int smc_listen(struct socket *sock, int backlog)
rc = -EINVAL;
if ((sk->sk_state != SMC_INIT && sk->sk_state != SMC_LISTEN) ||
- smc->connect_nonblock)
+ smc->connect_nonblock || sock->state != SS_UNCONNECTED)
goto out;
rc = 0;
@@ -2135,6 +2161,17 @@ static int smc_shutdown(struct socket *sock, int how)
lock_sock(sk);
+ if (sock->state == SS_CONNECTING) {
+ if (sk->sk_state == SMC_ACTIVE)
+ sock->state = SS_CONNECTED;
+ else if (sk->sk_state == SMC_PEERCLOSEWAIT1 ||
+ sk->sk_state == SMC_PEERCLOSEWAIT2 ||
+ sk->sk_state == SMC_APPCLOSEWAIT1 ||
+ sk->sk_state == SMC_APPCLOSEWAIT2 ||
+ sk->sk_state == SMC_APPFINCLOSEWAIT)
+ sock->state = SS_DISCONNECTING;
+ }
+
rc = -ENOTCONN;
if ((sk->sk_state != SMC_ACTIVE) &&
(sk->sk_state != SMC_PEERCLOSEWAIT1) &&
@@ -2148,6 +2185,7 @@ static int smc_shutdown(struct socket *sock, int how)
sk->sk_shutdown = smc->clcsock->sk->sk_shutdown;
if (sk->sk_shutdown == SHUTDOWN_MASK) {
sk->sk_state = SMC_CLOSED;
+ sk->sk_socket->state = SS_UNCONNECTED;
sock_put(sk);
}
goto out;
@@ -2173,6 +2211,10 @@ static int smc_shutdown(struct socket *sock, int how)
/* map sock_shutdown_cmd constants to sk_shutdown value range */
sk->sk_shutdown |= how + 1;
+ if (sk->sk_state == SMC_CLOSED)
+ sock->state = SS_UNCONNECTED;
+ else
+ sock->state = SS_DISCONNECTING;
out:
release_sock(sk);
return rc ? rc : rc1;
@@ -2464,6 +2506,7 @@ static int smc_create(struct net *net, struct socket *sock, int protocol,
rc = -ENOBUFS;
sock->ops = &smc_sock_ops;
+ sock->state = SS_UNCONNECTED;
sk = smc_sock_alloc(net, sock, protocol);
if (!sk)
goto out;
--
2.18.0.windows.1

[PATCH openEuler-1.0-LTS 01/13] mm: mem_reliable: Initialize reliable_nr_page when mm_init()
by Yongqiang Liu 07 Apr '23
From: Ma Wupeng <mawupeng1(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6RKHX
CVE: NA
--------------------------------
After a fork, the child process erroneously reports a reliable page
count twice that of its parent.
Examining struct mm_struct shows that reliable_nr_page should be
initialized to 0 in mm_init(), just as the RSS counters are. The fork
problem above is merely one symptom of the missing initialization.
Fix this by setting reliable_nr_page to 0 in mm_init().
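A toy userspace illustration of the bug pattern (struct and function names
are stand-ins for the kernel's mm_struct and dup_mm(), which is modeled here
as a plain struct copy):

#include <stdio.h>
#include <string.h>

struct toy_mm { long reliable_nr_page; };

static struct toy_mm toy_dup_mm(const struct toy_mm *parent)
{
	struct toy_mm child;

	memcpy(&child, parent, sizeof(child));	/* dup_mm copies the struct */
	child.reliable_nr_page = 0;		/* the fix: reset in mm_init() */
	return child;
}

int main(void)
{
	struct toy_mm parent = { .reliable_nr_page = 128 };
	struct toy_mm child = toy_dup_mm(&parent);

	/* Without the reset above, the child would start at 128 and then
	 * account its own pages on top, doubling the reported count. */
	printf("parent=%ld child=%ld\n",
	       parent.reliable_nr_page, child.reliable_nr_page);
	return 0;
}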
Fixes: 094eaabb3fe8 ("proc: Count reliable memory usage of reliable tasks")
Signed-off-by: Ma Wupeng <mawupeng1(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
include/linux/mem_reliable.h | 8 ++++++++
kernel/fork.c | 1 +
2 files changed, 9 insertions(+)
diff --git a/include/linux/mem_reliable.h b/include/linux/mem_reliable.h
index 6d57c36fb676..aa3fe77c8a72 100644
--- a/include/linux/mem_reliable.h
+++ b/include/linux/mem_reliable.h
@@ -123,6 +123,13 @@ static inline bool mem_reliable_shmem_limit_check(void)
shmem_reliable_nr_page;
}
+static inline void reliable_clear_page_counter(struct mm_struct *mm)
+{
+ if (!mem_reliable_is_enabled())
+ return;
+
+ atomic_long_set(&mm->reliable_nr_page, 0);
+}
#else
#define reliable_enabled 0
#define reliable_allow_fb_enabled() false
@@ -171,6 +178,7 @@ static inline void reliable_lru_add_batch(int zid, enum lru_list lru,
int val) {}
static inline bool mem_reliable_counter_initialized(void) { return false; }
+static inline void reliable_clear_page_counter(struct mm_struct *mm) {}
#endif
#endif
diff --git a/kernel/fork.c b/kernel/fork.c
index b5453a26655e..c256525d4ce5 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1007,6 +1007,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
atomic_long_set(&mm->locked_vm, 0);
mm->pinned_vm = 0;
memset(&mm->rss_stat, 0, sizeof(mm->rss_stat));
+ reliable_clear_page_counter(mm);
spin_lock_init(&mm->page_table_lock);
spin_lock_init(&mm->arg_lock);
mm_init_cpumask(mm);
--
2.25.1

[PATCH openEuler-1.0-LTS 1/5] loop: Add parm check in loop_control_ioctl
by Yongqiang Liu 07 Apr '23
From: Zhong Jinghua <zhongjinghua(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 188586, https://gitee.com/openeuler/kernel/issues/I6TFPJ
CVE: NA
----------------------------------------
We found that a kernel panic can easily be triggered through
loop_control_ioctl:
1. syscall(__NR_ioctl, r[1], 0x4c80, 0x80000200000ul);
Creates a loop device with index 0x80000200000ul.
In the code this index is used as the first_minor number, and the
resulting first_minor is 0, so the created loop device number is 7:0.
2. syscall(__NR_ioctl, r[2], 0x4c80, 0ul);
Creates a loop device with index 0x0ul.
Since device 7:0 was already created in step 1, add_disk fails because
the major and first_minor numbers collide.
3. syscall(__NR_ioctl, r[5], 0x4c81, 0ul);
Deletes the device whose creation failed, and the kernel panics.
Panic like below:
BUG: KASAN: null-ptr-deref in device_del+0xb3/0x840 drivers/base/core.c:3107
Call Trace:
kill_device drivers/base/core.c:3079 [inline]
device_del+0xb3/0x840 drivers/base/core.c:3107
del_gendisk+0x463/0x5f0 block/genhd.c:971
loop_remove drivers/block/loop.c:2190 [inline]
loop_control_ioctl drivers/block/loop.c:2289 [inline]
The call paths are as follows:
Create loop device:
loop_control_ioctl
loop_add
add_disk
device_add_disk
bdi_register
bdi_register_va
device_create
device_create_groups_vargs
device_add
kfree(dev->p);
dev->p = NULL;
Remove loop device:
loop_control_ioctl
loop_remove
del_gendisk
device_del
kill_device
if (dev->p->dead) // p is null
Fix it by adding a range check on the loop index parameter.
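A simplified userspace model of the overflow (MINORBITS/MKDEV mirror
include/linux/kdev_t.h; part_shift is assumed to be 0, and the index value
comes from the reproducer above):

#include <stdio.h>

#define MINORBITS	20
#define MINORMASK	((1U << MINORBITS) - 1)
#define MKDEV(ma, mi)	(((ma) << MINORBITS) | (mi))
#define MAJOR(dev)	((unsigned int)((dev) >> MINORBITS))
#define MINOR(dev)	((unsigned int)((dev) & MINORMASK))

int main(void)
{
	int part_shift = 0;			/* assumed: no partition bits */
	int i = (int)0x80000200000ul;		/* ioctl arg truncated to int */
	unsigned int first_minor = i << part_shift;
	unsigned int devt = MKDEV(7, first_minor);

	/* first_minor has bits above MINORBITS set, so they leak into the
	 * major field and the decoded minor collapses to 0. */
	printf("first_minor=%#x -> dev %u:%u\n",
	       first_minor, MAJOR(devt), MINOR(devt));

	/* The fix rejects any index whose shifted value exceeds MINORMASK. */
	if (i > 0 && i > (int)(MINORMASK >> part_shift))
		printf("rejected with -EINVAL\n");
	return 0;
}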
Fixes: 770fe30a46a1 ("loop: add management interface for on-demand device allocation")
Signed-off-by: Zhong Jinghua <zhongjinghua(a)huawei.com>
Reviewed-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Hou Tao <houtao1(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/block/loop.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 108a4ff27bcd..826633aa328c 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1972,6 +1972,17 @@ static int loop_add(struct loop_device **l, int i)
struct gendisk *disk;
int err;
+ /*
+ * i << part_shift is actually used as the first_minor.
+ * So here should avoid i << part_shift overflow.
+ * And, MKDEV() expect that the max bits of
+ * first_minor is 20.
+ */
+ if (i > 0 && i > MINORMASK >> part_shift) {
+ err = -EINVAL;
+ goto out;
+ }
+
err = -ENOMEM;
lo = kzalloc(sizeof(*lo), GFP_KERNEL);
if (!lo)
@@ -1985,7 +1996,8 @@ static int loop_add(struct loop_device **l, int i)
if (err == -ENOSPC)
err = -EEXIST;
} else {
- err = idr_alloc(&loop_index_idr, lo, 0, 0, GFP_KERNEL);
+ err = idr_alloc(&loop_index_idr, lo, 0,
+ (MINORMASK >> part_shift) + 1, GFP_KERNEL);
}
if (err < 0)
goto out_free_dev;
--
2.25.1

[PATCH openEuler-1.0-LTS] block/wbt: enable wbt after switching cfq to other schedulers
by Zhang Changzhong 07 Apr '23
From: Li Lingfeng <lilingfeng3(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6LH5K
CVE: NA
--------------------------------
Commit 80061216078b ("block/wbt: fix negative inflight counter
when remove scsi device") moved wbt_enable_default() from
elv_unregister_queue() to bfq_exit_queue(). As a result, wbt
cannot be enabled when we switch from cfq to another scheduler.
Fixes: 80061216078b ("block/wbt: fix negative inflight counter when remove scsi device")
Signed-off-by: Li Lingfeng <lilingfeng3(a)huawei.com>
Reviewed-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Hou Tao <houtao1(a)huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong(a)huawei.com>
---
block/cfq-iosched.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 88bae55..130854a 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -4563,6 +4563,7 @@ static void cfq_exit_queue(struct elevator_queue *e)
kfree(cfqd->root_group);
#endif
kfree(cfqd);
+ wbt_enable_default(q);
}
static int cfq_init_queue(struct request_queue *q, struct elevator_type *e)
--
2.9.5

06 Apr '23
From: Al Viro <viro(a)zeniv.linux.org.uk>
stable inclusion
from stable-v4.19.245
commit 6ca70982c646cc32e458150ee7f2530a24369b8c
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6T1EY
CVE: CVE-2023-1838
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit fb4554c2232e44d595920f4d5c66cf8f7d13f9bc upstream.
Descriptor table is a shared resource; two fget() on the same descriptor
may return different struct file references. get_tap_ptr_ring() is
called after we'd found (and pinned) the socket we'll be using and it
tries to find the private tun/tap data structures associated with it.
Redoing the lookup by the same file descriptor we'd used to get the
socket is racy - we need to use the same struct file.
Thanks to Jason for spotting a braino in the original variant of the
patch - I'd missed the use of fd == -1 for disabling the backend, and
in that case we can end up with sock == NULL and sock != oldsock.
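A userspace sketch of why two lookups of the same descriptor are racy. The
outcome is timing-dependent, and fcntl() merely stands in for the kernel's
fget()-based tun/tap lookup:

#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static int target_fd;

static void *racer(void *arg)
{
	(void)arg;
	/* Replace the file behind target_fd mid-flight, as another thread
	 * sharing the descriptor table may do at any time. */
	int devnull = open("/dev/null", O_WRONLY);

	dup2(devnull, target_fd);
	return NULL;
}

int main(void)
{
	pthread_t t;

	target_fd = open("/dev/zero", O_RDONLY);
	pthread_create(&t, NULL, racer, NULL);

	/* Two lookups of the same descriptor may observe two different
	 * struct files. */
	int mode1 = fcntl(target_fd, F_GETFL) & O_ACCMODE;
	int mode2 = fcntl(target_fd, F_GETFL) & O_ACCMODE;

	pthread_join(t, NULL);
	printf("first=%d second=%d%s\n", mode1, mode2,
	       mode1 != mode2 ? " (descriptor changed between lookups)" : "");
	return 0;
}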
Cc: stable(a)kernel.org
Acked-by: Michael S. Tsirkin <mst(a)redhat.com>
Signed-off-by: Jason Wang <jasowang(a)redhat.com>
Signed-off-by: Al Viro <viro(a)zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Zhengchao Shao <shaozhengchao(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/vhost/net.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 1d99f5c443ee..4b9151474a24 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1211,13 +1211,9 @@ static struct socket *get_raw_socket(int fd)
return ERR_PTR(r);
}
-static struct ptr_ring *get_tap_ptr_ring(int fd)
+static struct ptr_ring *get_tap_ptr_ring(struct file *file)
{
struct ptr_ring *ring;
- struct file *file = fget(fd);
-
- if (!file)
- return NULL;
ring = tun_get_tx_ring(file);
if (!IS_ERR(ring))
goto out;
@@ -1226,7 +1222,6 @@ static struct ptr_ring *get_tap_ptr_ring(int fd)
goto out;
ring = NULL;
out:
- fput(file);
return ring;
}
@@ -1313,8 +1308,12 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
r = vhost_net_enable_vq(n, vq);
if (r)
goto err_used;
- if (index == VHOST_NET_VQ_RX)
- nvq->rx_ring = get_tap_ptr_ring(fd);
+ if (index == VHOST_NET_VQ_RX) {
+ if (sock)
+ nvq->rx_ring = get_tap_ptr_ring(sock->file);
+ else
+ nvq->rx_ring = NULL;
+ }
oldubufs = nvq->ubufs;
nvq->ubufs = ubufs;
--
2.25.1

[PATCH openEuler-1.0-LTS 1/4] btrfs: fix race between quota disable and quota assign ioctls
by Yongqiang Liu 06 Apr '23
From: Filipe Manana <fdmanana(a)suse.com>
mainline inclusion
from mainline-v6.2-rc8
commit 2f1a6be12ab6c8470d5776e68644726c94257c54
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6PQCT
CVE: CVE-2023-1611
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
The quota assign ioctl can currently run in parallel with a quota disable
ioctl call. The assign ioctl uses the quota root, while the disable ioctl
frees that root, and therefore we can have a use-after-free triggered in
the assign ioctl, leading to a trace like the following when KASAN is
enabled:
[672.723][T736] BUG: KASAN: slab-use-after-free in btrfs_search_slot+0x2962/0x2db0
[672.723][T736] Read of size 8 at addr ffff888022ec0208 by task btrfs_search_sl/27736
[672.724][T736]
[672.725][T736] CPU: 1 PID: 27736 Comm: btrfs_search_sl Not tainted 6.3.0-rc3 #37
[672.723][T736] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[672.727][T736] Call Trace:
[672.728][T736] <TASK>
[672.728][T736] dump_stack_lvl+0xd9/0x150
[672.725][T736] print_report+0xc1/0x5e0
[672.720][T736] ? __virt_addr_valid+0x61/0x2e0
[672.727][T736] ? __phys_addr+0xc9/0x150
[672.725][T736] ? btrfs_search_slot+0x2962/0x2db0
[672.722][T736] kasan_report+0xc0/0xf0
[672.729][T736] ? btrfs_search_slot+0x2962/0x2db0
[672.724][T736] btrfs_search_slot+0x2962/0x2db0
[672.723][T736] ? fs_reclaim_acquire+0xba/0x160
[672.722][T736] ? split_leaf+0x13d0/0x13d0
[672.726][T736] ? rcu_is_watching+0x12/0xb0
[672.723][T736] ? kmem_cache_alloc+0x338/0x3c0
[672.722][T736] update_qgroup_status_item+0xf7/0x320
[672.724][T736] ? add_qgroup_rb+0x3d0/0x3d0
[672.739][T736] ? do_raw_spin_lock+0x12d/0x2b0
[672.730][T736] ? spin_bug+0x1d0/0x1d0
[672.737][T736] btrfs_run_qgroups+0x5de/0x840
[672.730][T736] ? btrfs_qgroup_rescan_worker+0xa70/0xa70
[672.738][T736] ? __del_qgroup_relation+0x4ba/0xe00
[672.738][T736] btrfs_ioctl+0x3d58/0x5d80
[672.735][T736] ? tomoyo_path_number_perm+0x16a/0x550
[672.737][T736] ? tomoyo_execute_permission+0x4a0/0x4a0
[672.731][T736] ? btrfs_ioctl_get_supported_features+0x50/0x50
[672.737][T736] ? __sanitizer_cov_trace_switch+0x54/0x90
[672.734][T736] ? do_vfs_ioctl+0x132/0x1660
[672.730][T736] ? vfs_fileattr_set+0xc40/0xc40
[672.730][T736] ? _raw_spin_unlock_irq+0x2e/0x50
[672.732][T736] ? sigprocmask+0xf2/0x340
[672.737][T736] ? __fget_files+0x26a/0x480
[672.732][T736] ? bpf_lsm_file_ioctl+0x9/0x10
[672.738][T736] ? btrfs_ioctl_get_supported_features+0x50/0x50
[672.736][T736] __x64_sys_ioctl+0x198/0x210
[672.736][T736] do_syscall_64+0x39/0xb0
[672.731][T736] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[672.739][T736] RIP: 0033:0x4556ad
[672.742][T736] </TASK>
[672.743][T736]
[672.748][T736] Allocated by task 27677:
[672.743][T736] kasan_save_stack+0x22/0x40
[672.741][T736] kasan_set_track+0x25/0x30
[672.741][T736] __kasan_kmalloc+0xa4/0xb0
[672.749][T736] btrfs_alloc_root+0x48/0x90
[672.746][T736] btrfs_create_tree+0x146/0xa20
[672.744][T736] btrfs_quota_enable+0x461/0x1d20
[672.743][T736] btrfs_ioctl+0x4a1c/0x5d80
[672.747][T736] __x64_sys_ioctl+0x198/0x210
[672.749][T736] do_syscall_64+0x39/0xb0
[672.744][T736] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[672.756][T736]
[672.757][T736] Freed by task 27677:
[672.759][T736] kasan_save_stack+0x22/0x40
[672.759][T736] kasan_set_track+0x25/0x30
[672.756][T736] kasan_save_free_info+0x2e/0x50
[672.751][T736] ____kasan_slab_free+0x162/0x1c0
[672.758][T736] slab_free_freelist_hook+0x89/0x1c0
[672.752][T736] __kmem_cache_free+0xaf/0x2e0
[672.752][T736] btrfs_put_root+0x1ff/0x2b0
[672.759][T736] btrfs_quota_disable+0x80a/0xbc0
[672.752][T736] btrfs_ioctl+0x3e5f/0x5d80
[672.756][T736] __x64_sys_ioctl+0x198/0x210
[672.753][T736] do_syscall_64+0x39/0xb0
[672.765][T736] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[672.769][T736]
[672.768][T736] The buggy address belongs to the object at ffff888022ec0000
[672.768][T736] which belongs to the cache kmalloc-4k of size 4096
[672.769][T736] The buggy address is located 520 bytes inside of
[672.769][T736] freed 4096-byte region [ffff888022ec0000, ffff888022ec1000)
[672.760][T736]
[672.764][T736] The buggy address belongs to the physical page:
[672.761][T736] page:ffffea00008bb000 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x22ec0
[672.766][T736] head:ffffea00008bb000 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[672.779][T736] flags: 0xfff00000010200(slab|head|node=0|zone=1|lastcpupid=0x7ff)
[672.770][T736] raw: 00fff00000010200 ffff888012842140 ffffea000054ba00 dead000000000002
[672.770][T736] raw: 0000000000000000 0000000000040004 00000001ffffffff 0000000000000000
[672.771][T736] page dumped because: kasan: bad access detected
[672.778][T736] page_owner tracks the page as allocated
[672.777][T736] page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd2040(__GFP_IO|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 88
[672.779][T736] get_page_from_freelist+0x119c/0x2d50
[672.779][T736] __alloc_pages+0x1cb/0x4a0
[672.776][T736] alloc_pages+0x1aa/0x270
[672.773][T736] allocate_slab+0x260/0x390
[672.771][T736] ___slab_alloc+0xa9a/0x13e0
[672.778][T736] __slab_alloc.constprop.0+0x56/0xb0
[672.771][T736] __kmem_cache_alloc_node+0x136/0x320
[672.789][T736] __kmalloc+0x4e/0x1a0
[672.783][T736] tomoyo_realpath_from_path+0xc3/0x600
[672.781][T736] tomoyo_path_perm+0x22f/0x420
[672.782][T736] tomoyo_path_unlink+0x92/0xd0
[672.780][T736] security_path_unlink+0xdb/0x150
[672.788][T736] do_unlinkat+0x377/0x680
[672.788][T736] __x64_sys_unlink+0xca/0x110
[672.789][T736] do_syscall_64+0x39/0xb0
[672.783][T736] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[672.784][T736] page last free stack trace:
[672.787][T736] free_pcp_prepare+0x4e5/0x920
[672.787][T736] free_unref_page+0x1d/0x4e0
[672.784][T736] __unfreeze_partials+0x17c/0x1a0
[672.797][T736] qlist_free_all+0x6a/0x180
[672.796][T736] kasan_quarantine_reduce+0x189/0x1d0
[672.797][T736] __kasan_slab_alloc+0x64/0x90
[672.793][T736] kmem_cache_alloc+0x17c/0x3c0
[672.799][T736] getname_flags.part.0+0x50/0x4e0
[672.799][T736] getname_flags+0x9e/0xe0
[672.792][T736] vfs_fstatat+0x77/0xb0
[672.791][T736] __do_sys_newlstat+0x84/0x100
[672.798][T736] do_syscall_64+0x39/0xb0
[672.796][T736] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[672.790][T736]
[672.791][T736] Memory state around the buggy address:
[672.799][T736] ffff888022ec0100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[672.805][T736] ffff888022ec0180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[672.802][T736] >ffff888022ec0200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[672.809][T736] ^
[672.809][T736] ffff888022ec0280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[672.809][T736] ffff888022ec0300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Fix this by having the qgroup assign ioctl take the qgroup ioctl mutex
before calling btrfs_run_qgroups(), as all qgroup ioctls should.
Reported-by: butt3rflyh4ck <butterflyhuangxx(a)gmail.com>
Link: https://lore.kernel.org/linux-btrfs/CAFcO6XN3VD8ogmHwqRk4kbiwtpUSNySu2VAxN8…
CC: stable(a)vger.kernel.org # 5.10+
Reviewed-by: Qu Wenruo <wqu(a)suse.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
Conflicts:
fs/btrfs/qgroup.c
Signed-off-by: Long Li <leo.lilong(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
fs/btrfs/ioctl.c | 2 ++
fs/btrfs/qgroup.c | 11 ++++++++++-
2 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index a5ae02bf3652..00424d3f3464 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -5240,7 +5240,9 @@ static long btrfs_ioctl_qgroup_assign(struct file *file, void __user *arg)
}
/* update qgroup status and info */
+ mutex_lock(&fs_info->qgroup_ioctl_lock);
err = btrfs_run_qgroups(trans);
+ mutex_unlock(&fs_info->qgroup_ioctl_lock);
if (err < 0)
btrfs_handle_fs_error(fs_info, err,
"failed to update qgroup status and info");
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 7916f711daf5..8e58c58f73a3 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -2196,7 +2196,8 @@ int btrfs_qgroup_account_extents(struct btrfs_trans_handle *trans)
}
/*
- * called from commit_transaction. Writes all changed qgroups to disk.
+ * Writes all changed qgroups to disk.
+ * Called by the transaction commit path and the qgroup assign ioctl.
*/
int btrfs_run_qgroups(struct btrfs_trans_handle *trans)
{
@@ -2204,6 +2205,14 @@ int btrfs_run_qgroups(struct btrfs_trans_handle *trans)
struct btrfs_root *quota_root = fs_info->quota_root;
int ret = 0;
+ /*
+ * In case we are called from the qgroup assign ioctl, assert that we
+ * are holding the qgroup_ioctl_lock, otherwise we can race with a quota
+ * disable operation (ioctl) and access a freed quota root.
+ */
+ if (trans->transaction->state != TRANS_STATE_COMMIT_DOING)
+ lockdep_assert_held(&fs_info->qgroup_ioctl_lock);
+
if (!quota_root)
return ret;
--
2.25.1

06 Apr '23
Agenda for the current regular meeting:
Topic 1: Progress update --- Zhang Jialin & Zheng Zengkai
Topic 2: openEuler-22.03-LTS-SP2 requirement review --- Zhang Jialin
Everyone is welcome to keep submitting topics.
----- Original appointment -----
From: openEuler conference <public(a)openeuler.org>
Sent: April 7, 2023, 9:30
To: dev@openeuler.org,kernel-discuss@openeuler.org,kernel@openeuler.org
Subject: openEuler Kernel SIG biweekly meeting
Time: Friday, April 7, 2023, 14:00-15:30 (UTC+08:00) Beijing, Chongqing, Hong Kong SAR, Urumqi.
Location:
Hello!
The Kernel SIG invites you to a Zoom meeting (auto-recorded) to be held at 2023-04-07 14:00.
Meeting subject: openEuler Kernel SIG biweekly meeting
Agenda:
1. Progress update
2. Topics being collected
Everyone is welcome to propose topics (reply to this email with new topics, or add them to the meeting board).
Meeting link: https://us06web.zoom.us/j/88353599877?pwd=aFZtMjJwUHl1UmNTZFV4eUJQM2xVdz09
Meeting minutes: https://etherpad.openeuler.org/p/Kernel-meetings
Reminder: You are advised to change your participant name after joining the meeting; you may also use your gitee.com ID.
More information: https://openeuler.org/zh/
Hello!
openEuler Kernel SIG invites you to attend the Zoom conference (auto recording) to be held at 2023-04-07 14:00.
The subject of the conference is the openEuler Kernel SIG biweekly meeting.
Summary:
1. Progress update
2. Topics being collected
Everyone is welcome to propose topics (reply to this email with new topics, or add them to the meeting board).
You can join the meeting at https://us06web.zoom.us/j/88353599877?pwd=aFZtMjJwUHl1UmNTZFV4eUJQM2xVdz09.
Add topics at https://etherpad.openeuler.org/p/Kernel-meetings.
Note: You are advised to change the participant name after joining the conference or use your ID at gitee.com.
More information: https://openeuler.org/en/

[PATCH openEuler-1.0-LTS 1/4] ext4: fix race between writepages and remount
by Yongqiang Liu 04 Apr '23
From: Baokun Li <libaokun1(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 188500, https://gitee.com/openeuler/kernel/issues/I6RJ0V
CVE: NA
--------------------------------
We got a WARNING in ext4_add_complete_io:
==================================================================
WARNING: at fs/ext4/page-io.c:231 ext4_put_io_end_defer+0x182/0x250
CPU: 10 PID: 77 Comm: ksoftirqd/10 Tainted: 6.3.0-rc2 #85
RIP: 0010:ext4_put_io_end_defer+0x182/0x250 [ext4]
[...]
Call Trace:
<TASK>
ext4_end_bio+0xa8/0x240 [ext4]
bio_endio+0x195/0x310
blk_update_request+0x184/0x770
scsi_end_request+0x2f/0x240
scsi_io_completion+0x75/0x450
scsi_finish_command+0xef/0x160
scsi_complete+0xa3/0x180
blk_complete_reqs+0x60/0x80
blk_done_softirq+0x25/0x40
__do_softirq+0x119/0x4c8
run_ksoftirqd+0x42/0x70
smpboot_thread_fn+0x136/0x3c0
kthread+0x140/0x1a0
ret_from_fork+0x2c/0x50
==================================================================
Above issue may happen as follows:

cpu1                         cpu2
----------------------------|----------------------------
mount -o dioread_lock
ext4_writepages
 ext4_do_writepages
  *if (ext4_should_dioread_nolock(inode))*
   // rsv_blocks is not assigned here
                             mount -o remount,dioread_nolock
  ext4_journal_start_with_reserve
   __ext4_journal_start
    __ext4_journal_start_sb
     jbd2__journal_start
      *if (rsv_blocks)*
      // h_rsv_handle is not initialized here
  mpage_map_and_submit_extent
   mpage_map_one_extent
    dioread_nolock = ext4_should_dioread_nolock(inode)
    if (dioread_nolock && (map->m_flags & EXT4_MAP_UNWRITTEN))
     mpd->io_submit.io_end->handle = handle->h_rsv_handle
     ext4_set_io_unwritten_flag
      io_end->flag |= EXT4_IO_END_UNWRITTEN
// now io_end->handle is NULL but has EXT4_IO_END_UNWRITTEN flag
scsi_finish_command
 scsi_io_completion
  scsi_io_completion_action
   scsi_end_request
    blk_update_request
     req_bio_endio
      bio_endio
       bio->bi_end_io > ext4_end_bio
        ext4_put_io_end_defer
         ext4_add_complete_io
          // trigger WARN_ON(!io_end->handle && sbi->s_journal);
The immediate cause of this problem is that the ext4_should_dioread_nolock()
function returns inconsistent values in ext4_do_writepages() and
mpage_map_one_extent(). There are four conditions in this function that
can be changed at mount time to cause this problem. These four conditions
can be divided into two categories:
(1) journal_data and EXT4_EXTENTS_FL, which can be changed by ioctl
(2) DELALLOC and DIOREAD_NOLOCK, which can be changed by remount
The two in the first category have been fixed by commit c8585c6fcaf2
("ext4: fix races between changing inode journal mode and ext4_writepages")
and commit cb85f4d23f79 ("ext4: fix race between writepages and enabling
EXT4_EXTENTS_FL") respectively.
The two cases in the second category have not yet been fixed, and the
above issue is caused by them. Following the fix for the first category,
grab s_writepages_rwsem while applying options during remount to avoid
racing with writepages ops.
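A userspace analogue of the locking scheme (a sketch only; a pthread rwlock
stands in for the kernel's percpu rw semaphore, and the function names are
illustrative):

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_rwlock_t s_writepages_rwsem = PTHREAD_RWLOCK_INITIALIZER;
static bool dioread_nolock;

/* Remount path: hold the lock for writing while flipping the option. */
static void remount_apply_options(bool val)
{
	pthread_rwlock_wrlock(&s_writepages_rwsem);
	dioread_nolock = val;		/* parse_options() in the patch */
	pthread_rwlock_unlock(&s_writepages_rwsem);
}

/* Writeback path: read-lock the whole pass so every check of the option
 * sees the same value. */
static void writepages(void)
{
	pthread_rwlock_rdlock(&s_writepages_rwsem);
	bool rsv = dioread_nolock;	/* first check */
	bool again = dioread_nolock;	/* later check: now guaranteed equal */
	printf("consistent: %s\n", rsv == again ? "yes" : "no");
	pthread_rwlock_unlock(&s_writepages_rwsem);
}

int main(void)
{
	remount_apply_options(true);
	writepages();
	return 0;
}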
Fixes: 6b523df4fb5a ("ext4: use transaction reservation for extent conversion in ext4_end_io")
Cc: stable(a)vger.kernel.org
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Yang Erkun <yangerkun(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
fs/ext4/ext4.h | 3 ++-
fs/ext4/super.c | 13 +++++++++++++
2 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 4c88e75180a2..6df919b154b4 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1530,7 +1530,8 @@ struct ext4_sb_info {
/*
* Barrier between writepages ops and changing any inode's JOURNAL_DATA
- * or EXTENTS flag.
+ * or EXTENTS flag or between writepages ops and changing DIOREAD_NOLOCK
+ * mount option on remount.
*/
struct percpu_rw_semaphore s_writepages_rwsem;
struct dax_device *s_daxdev;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 8029a6f6471c..df07222f1cc5 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5605,10 +5605,20 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
vfs_flags = SB_LAZYTIME | SB_I_VERSION;
sb->s_flags = (sb->s_flags & ~vfs_flags) | (*flags & vfs_flags);
+ /*
+ * Changing the DIOREAD_NOLOCK mount option may cause two calls to
+ * ext4_should_dioread_nolock() to return inconsistent values,
+ * triggering WARN_ON in ext4_add_complete_io(). we grab here
+ * s_writepages_rwsem to avoid race between writepages ops and
+ * remount.
+ */
+ percpu_down_write(&sbi->s_writepages_rwsem);
if (!parse_options(data, sb, NULL, &journal_ioprio, 1)) {
err = -EINVAL;
+ percpu_up_write(&sbi->s_writepages_rwsem);
goto restore_opts;
}
+ percpu_up_write(&sbi->s_writepages_rwsem);
if ((old_opts.s_mount_opt & EXT4_MOUNT_JOURNAL_CHECKSUM) ^
test_opt(sb, JOURNAL_CHECKSUM)) {
@@ -5833,6 +5843,7 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
return 0;
restore_opts:
+ percpu_down_write(&sbi->s_writepages_rwsem);
sb->s_flags = old_sb_flags;
sbi->s_mount_opt = old_opts.s_mount_opt;
sbi->s_mount_opt2 = old_opts.s_mount_opt2;
@@ -5841,6 +5852,8 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
sbi->s_commit_interval = old_opts.s_commit_interval;
sbi->s_min_batch_time = old_opts.s_min_batch_time;
sbi->s_max_batch_time = old_opts.s_max_batch_time;
+ percpu_up_write(&sbi->s_writepages_rwsem);
+
if (!test_opt(sb, BLOCK_VALIDITY) && sbi->system_blks)
ext4_release_system_zone(sb);
#ifdef CONFIG_QUOTA
--
2.25.1

[PATCH openEuler-5.10-LTS 01/15] scsi: scsi_dh_alua: fix memleak for 'qdata' in alua_activate()
by Jialin Zhang 04 Apr '23
From: Yu Kuai <yukuai3(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6NBAZ
CVE: NA
--------------------------------
If alua_rtpg_queue() fails when called from alua_activate(), 'qdata' is
not freed, which causes the following memleak:
unreferenced object 0xffff88810b2c6980 (size 32):
comm "kworker/u16:2", pid 635322, jiffies 4355801099 (age 1216426.076s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
40 39 24 c1 ff ff ff ff 00 f8 ea 0a 81 88 ff ff @9$.............
backtrace:
[<0000000098f3a26d>] alua_activate+0xb0/0x320
[<000000003b529641>] scsi_dh_activate+0xb2/0x140
[<000000007b296db3>] activate_path_work+0xc6/0xe0 [dm_multipath]
[<000000007adc9ace>] process_one_work+0x3c5/0x730
[<00000000c457a985>] worker_thread+0x93/0x650
[<00000000cb80e628>] kthread+0x1ba/0x210
[<00000000a1e61077>] ret_from_fork+0x22/0x30
Fix the problem by freeing 'qdata' in the error path.
Fixes: 625fe857e4fa ("scsi: scsi_dh_alua: Check scsi_device_get() return value")
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Hou Tao <houtao1(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
drivers/scsi/device_handler/scsi_dh_alua.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
index fe8a5e5c0df8..bf0b3178f84d 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -1036,10 +1036,12 @@ static int alua_activate(struct scsi_device *sdev,
rcu_read_unlock();
mutex_unlock(&h->init_mutex);
- if (alua_rtpg_queue(pg, sdev, qdata, true))
+ if (alua_rtpg_queue(pg, sdev, qdata, true)) {
fn = NULL;
- else
+ } else {
+ kfree(qdata);
err = SCSI_DH_DEV_OFFLINED;
+ }
kref_put(&pg->kref, release_port_group);
out:
if (fn)
--
2.25.1

[PATCH openEuler-5.10-LTS-SP1 01/16] scsi: scsi_dh_alua: fix memleak for 'qdata' in alua_activate()
by Jialin Zhang 04 Apr '23
From: Yu Kuai <yukuai3(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6NBAZ
CVE: NA
--------------------------------
If alua_rtpg_queue() fails when called from alua_activate(), 'qdata' is
not freed, which causes the following memleak:
unreferenced object 0xffff88810b2c6980 (size 32):
comm "kworker/u16:2", pid 635322, jiffies 4355801099 (age 1216426.076s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
40 39 24 c1 ff ff ff ff 00 f8 ea 0a 81 88 ff ff @9$.............
backtrace:
[<0000000098f3a26d>] alua_activate+0xb0/0x320
[<000000003b529641>] scsi_dh_activate+0xb2/0x140
[<000000007b296db3>] activate_path_work+0xc6/0xe0 [dm_multipath]
[<000000007adc9ace>] process_one_work+0x3c5/0x730
[<00000000c457a985>] worker_thread+0x93/0x650
[<00000000cb80e628>] kthread+0x1ba/0x210
[<00000000a1e61077>] ret_from_fork+0x22/0x30
Fix the problem by freeing 'qdata' in the error path.
Fixes: 625fe857e4fa ("scsi: scsi_dh_alua: Check scsi_device_get() return value")
Signed-off-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Hou Tao <houtao1(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
drivers/scsi/device_handler/scsi_dh_alua.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
index fe8a5e5c0df8..bf0b3178f84d 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -1036,10 +1036,12 @@ static int alua_activate(struct scsi_device *sdev,
rcu_read_unlock();
mutex_unlock(&h->init_mutex);
- if (alua_rtpg_queue(pg, sdev, qdata, true))
+ if (alua_rtpg_queue(pg, sdev, qdata, true)) {
fn = NULL;
- else
+ } else {
+ kfree(qdata);
err = SCSI_DH_DEV_OFFLINED;
+ }
kref_put(&pg->kref, release_port_group);
out:
if (fn)
--
2.25.1

31 Mar '23
From: Clement Lecigne <clecigne(a)google.com>
stable inclusion
from stable-v5.10.162
commit df02234e6b87d2a9a82acd3198e44bdeff8488c6
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6AOWP
CVE: CVE-2023-0266
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
[ Note: this is a fix that works around the bug equivalently as the
two upstream commits:
1fa4445f9adf ("ALSA: control - introduce snd_ctl_notify_one() helper")
56b88b50565c ("ALSA: pcm: Move rwsem lock inside snd_ctl_elem_read to prevent UAF")
but in a simpler way to fit with older stable trees -- tiwai ]
Add the missing locking in ctl_elem_read_user/ctl_elem_write_user; the
race can be easily triggered and turned into a use-after-free.
Example code paths with SNDRV_CTL_IOCTL_ELEM_READ:
64-bits:
snd_ctl_ioctl
snd_ctl_elem_read_user
[takes controls_rwsem]
snd_ctl_elem_read [lock properly held, all good]
[drops controls_rwsem]
32-bits (compat):
snd_ctl_ioctl_compat
snd_ctl_elem_write_read_compat
ctl_elem_write_read
snd_ctl_elem_read [missing lock, not good]
CVE-2023-0266 was assigned for this issue.
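A userspace sketch of the required pattern (a pthread rwlock stands in for
controls_rwsem, and the names are illustrative; the point is that every
entry path, native or compat, must wrap the element access in the lock):

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

static pthread_rwlock_t controls_rwsem = PTHREAD_RWLOCK_INITIALIZER;
static int *ctl_table;

static int ctl_elem_read(int idx)	/* caller must hold the read lock */
{
	return ctl_table[idx];
}

static int ctl_elem_read_user(int idx)	/* native path: correctly locked */
{
	pthread_rwlock_rdlock(&controls_rwsem);
	int v = ctl_elem_read(idx);
	pthread_rwlock_unlock(&controls_rwsem);
	return v;
}

int main(void)
{
	ctl_table = calloc(8, sizeof(*ctl_table));
	/* Before the fix, the compat path was equivalent to calling
	 * ctl_elem_read() here directly, with no lock held, so a
	 * concurrent writer could free the table under it. */
	printf("elem 0 = %d\n", ctl_elem_read_user(0));
	free(ctl_table);
	return 0;
}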
Signed-off-by: Clement Lecigne <clecigne(a)google.com>
Cc: stable(a)kernel.org # 5.12 and older
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Reviewed-by: Jaroslav Kysela <perex(a)perex.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Hui Tang <tanghui20(a)huawei.com>
---
sound/core/control_compat.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/sound/core/control_compat.c b/sound/core/control_compat.c
index 507fd5210c1c..f5e51f5e3437 100644
--- a/sound/core/control_compat.c
+++ b/sound/core/control_compat.c
@@ -316,7 +316,9 @@ static int ctl_elem_read_user(struct snd_card *card,
err = snd_power_wait(card, SNDRV_CTL_POWER_D0);
if (err < 0)
goto error;
+ down_read(&card->controls_rwsem);
err = snd_ctl_elem_read(card, data);
+ up_read(&card->controls_rwsem);
if (err < 0)
goto error;
err = copy_ctl_value_to_user(userdata, valuep, data, type, count);
@@ -344,7 +346,9 @@ static int ctl_elem_write_user(struct snd_ctl_file *file,
err = snd_power_wait(card, SNDRV_CTL_POWER_D0);
if (err < 0)
goto error;
+ down_write(&card->controls_rwsem);
err = snd_ctl_elem_write(card, file, data);
+ up_write(&card->controls_rwsem);
if (err < 0)
goto error;
err = copy_ctl_value_to_user(userdata, valuep, data, type, count);
--
2.17.1

[PATCH openEuler-1.0-LTS] ALSA: pcm: Move rwsem lock inside snd_ctl_elem_read to prevent UAF
by Zhang Changzhong 31 Mar '23
From: Clement Lecigne <clecigne(a)google.com>
stable inclusion
from stable-v5.10.162
commit df02234e6b87d2a9a82acd3198e44bdeff8488c6
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6AOWP
CVE: CVE-2023-0266
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
[ Note: this is a fix that works around the bug equivalently as the
two upstream commits:
1fa4445f9adf ("ALSA: control - introduce snd_ctl_notify_one() helper")
56b88b50565c ("ALSA: pcm: Move rwsem lock inside snd_ctl_elem_read to prevent UAF")
but in a simpler way to fit with older stable trees -- tiwai ]
Add the missing locking in ctl_elem_read_user/ctl_elem_write_user; the
race can be easily triggered and turned into a use-after-free.
Example code paths with SNDRV_CTL_IOCTL_ELEM_READ:
64-bits:
snd_ctl_ioctl
snd_ctl_elem_read_user
[takes controls_rwsem]
snd_ctl_elem_read [lock properly held, all good]
[drops controls_rwsem]
32-bits (compat):
snd_ctl_ioctl_compat
snd_ctl_elem_write_read_compat
ctl_elem_write_read
snd_ctl_elem_read [missing lock, not good]
CVE-2023-0266 was assigned for this issue.
Signed-off-by: Clement Lecigne <clecigne(a)google.com>
Cc: stable(a)kernel.org # 5.12 and older
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Reviewed-by: Jaroslav Kysela <perex(a)perex.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Hui Tang <tanghui20(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Reviewed-by: songping yu <yusongping(a)huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong(a)huawei.com>
---
sound/core/control_compat.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/sound/core/control_compat.c b/sound/core/control_compat.c
index 507fd52..f5e51f5 100644
--- a/sound/core/control_compat.c
+++ b/sound/core/control_compat.c
@@ -316,7 +316,9 @@ static int ctl_elem_read_user(struct snd_card *card,
err = snd_power_wait(card, SNDRV_CTL_POWER_D0);
if (err < 0)
goto error;
+ down_read(&card->controls_rwsem);
err = snd_ctl_elem_read(card, data);
+ up_read(&card->controls_rwsem);
if (err < 0)
goto error;
err = copy_ctl_value_to_user(userdata, valuep, data, type, count);
@@ -344,7 +346,9 @@ static int ctl_elem_write_user(struct snd_ctl_file *file,
err = snd_power_wait(card, SNDRV_CTL_POWER_D0);
if (err < 0)
goto error;
+ down_write(&card->controls_rwsem);
err = snd_ctl_elem_write(card, file, data);
+ up_write(&card->controls_rwsem);
if (err < 0)
goto error;
err = copy_ctl_value_to_user(userdata, valuep, data, type, count);
--
2.9.5
openEuler-22.03-LTS-SP2 requirement collection deadline: Friday, April 7, 2023
openEuler-22.03-LTS-SP2 requirement merge deadline: Friday, May 19, 2023
We are currently in the requirement collection phase. If you have requirements to be merged into openEuler-22.03-LTS-SP2, please submit a requirement issue to the openEuler community as soon as possible.
Requirement issue submission link: https://gitee.com/openeuler/kernel/issues (select "requirement" as the issue type; the issue title should begin with […
We will discuss the submissions at the kernel SIG biweekly meeting on April 7, 2023; accepted requirements will be included in the milestone plan.

30 Mar '23
From: Leon Romanovsky <leonro(a)nvidia.com>
stable inclusion
from stable-v4.19.225
commit 153843e270459b08529f80a0a0d8258d91597594
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6Q364
CVE: CVE-2021-3923
--------------------------------
commit b35a0f4dd544eaa6162b6d2f13a2557a121ae5fd upstream.
If dst->is_global field is not set, the GRH fields are not cleared
and the following infoleak is reported.
=====================================================
BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:121 [inline]
BUG: KMSAN: kernel-infoleak in _copy_to_user+0x1c9/0x270 lib/usercopy.c:33
instrument_copy_to_user include/linux/instrumented.h:121 [inline]
_copy_to_user+0x1c9/0x270 lib/usercopy.c:33
copy_to_user include/linux/uaccess.h:209 [inline]
ucma_init_qp_attr+0x8c7/0xb10 drivers/infiniband/core/ucma.c:1242
ucma_write+0x637/0x6c0 drivers/infiniband/core/ucma.c:1732
vfs_write+0x8ce/0x2030 fs/read_write.c:588
ksys_write+0x28b/0x510 fs/read_write.c:643
__do_sys_write fs/read_write.c:655 [inline]
__se_sys_write fs/read_write.c:652 [inline]
__ia32_sys_write+0xdb/0x120 fs/read_write.c:652
do_syscall_32_irqs_on arch/x86/entry/common.c:114 [inline]
__do_fast_syscall_32+0x96/0xf0 arch/x86/entry/common.c:180
do_fast_syscall_32+0x34/0x70 arch/x86/entry/common.c:205
do_SYSENTER_32+0x1b/0x20 arch/x86/entry/common.c:248
entry_SYSENTER_compat_after_hwframe+0x4d/0x5c
Local variable resp created at:
ucma_init_qp_attr+0xa4/0xb10 drivers/infiniband/core/ucma.c:1214
ucma_write+0x637/0x6c0 drivers/infiniband/core/ucma.c:1732
Bytes 40-59 of 144 are uninitialized
Memory access of size 144 starts at ffff888167523b00
Data copied to user address 0000000020000100
CPU: 1 PID: 25910 Comm: syz-executor.1 Not tainted 5.16.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
=====================================================
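A toy model of the leak (the struct layout and field names below are
illustrative only, borrowed from the report, and are not the real uapi
struct):

#include <stdio.h>
#include <string.h>

struct user_ah_grh {
	unsigned char dgid[16];
	unsigned int  flow_label;
	unsigned char reserved[4];
};

int main(void)
{
	struct user_ah_grh dst;	/* stack garbage, like the local 'resp' */

	/* Old behaviour: only the reserved bytes are cleared, so if the
	 * is_global branch never writes dgid/flow_label, stale stack
	 * bytes would be copied to user space. */
	memset(&dst.reserved, 0, sizeof(dst.reserved));

	/* Fixed behaviour: clear the whole GRH up front. */
	memset(&dst, 0, sizeof(dst));
	printf("flow_label after full clear: %u\n", dst.flow_label);
	return 0;
}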
Fixes: 4ba66093bdc6 ("IB/core: Check for global flag when using ah_attr")
Link: https://lore.kernel.org/r/0e9dd51f93410b7b2f4f5562f52befc878b71afa.16412988…
Reported-by: syzbot+6d532fa8f9463da290bc(a)syzkaller.appspotmail.com
Signed-off-by: Leon Romanovsky <leonro(a)nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Baisong Zhong <zhongbaisong(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/infiniband/core/uverbs_marshall.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/uverbs_marshall.c b/drivers/infiniband/core/uverbs_marshall.c
index b8d715c68ca4..11a080646916 100644
--- a/drivers/infiniband/core/uverbs_marshall.c
+++ b/drivers/infiniband/core/uverbs_marshall.c
@@ -66,7 +66,7 @@ void ib_copy_ah_attr_to_user(struct ib_device *device,
struct rdma_ah_attr *src = ah_attr;
struct rdma_ah_attr conv_ah;
- memset(&dst->grh.reserved, 0, sizeof(dst->grh.reserved));
+ memset(&dst->grh, 0, sizeof(dst->grh));
if ((ah_attr->type == RDMA_AH_ATTR_TYPE_OPA) &&
(rdma_ah_get_dlid(ah_attr) > be16_to_cpu(IB_LID_PERMISSIVE)) &&
--
2.25.1

[PATCH openEuler-1.0-LTS 1/3] cgroup/cpuset: Change cpuset_rwsem and hotplug lock order
by Yongqiang Liu 29 Mar '23
From: Juri Lelli <juri.lelli(a)redhat.com>
mainline inclusion
from mainline-v5.4-rc1
commit d74b27d63a8bebe2fe634944e4ebdc7b10db7a39
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6L46J
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
cpuset_rwsem is going to be acquired from sched_setscheduler() with a
following patch. There are however paths (e.g., spawn_ksoftirqd) in
which sched_setscheduler() is eventually called while holding the
hotplug lock; this creates a dependency between the hotplug lock (to be
always acquired first) and cpuset_rwsem (to be always acquired after
the hotplug lock).
Fix paths which currently take the two locks in the wrong order (after
a following patch is applied).
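A minimal sketch of the ordering rule being established (pthread mutexes
stand in for the kernel locks; names are illustrative). Taking the locks in
opposite orders on two paths is the classic ABBA deadlock this avoids:

#include <pthread.h>

static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER; /* A */
static pthread_mutex_t cpuset_lock  = PTHREAD_MUTEX_INITIALIZER; /* B */

static void rebuild_sched_domains_path(void)
{
	pthread_mutex_lock(&hotplug_lock);  /* A first (get_online_cpus())  */
	pthread_mutex_lock(&cpuset_lock);   /* then B (cpuset_mutex/rwsem)  */
	/* ... generate and apply sched domains ... */
	pthread_mutex_unlock(&cpuset_lock);
	pthread_mutex_unlock(&hotplug_lock);
}

int main(void)
{
	rebuild_sched_domains_path();
	return 0;
}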
Tested-by: Dietmar Eggemann <dietmar.eggemann(a)arm.com>
Signed-off-by: Juri Lelli <juri.lelli(a)redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: bristot(a)redhat.com
Cc: claudio(a)evidence.eu.com
Cc: lizefan(a)huawei.com
Cc: longman(a)redhat.com
Cc: luca.abeni(a)santannapisa.it
Cc: mathieu.poirier(a)linaro.org
Cc: rostedt(a)goodmis.org
Cc: tj(a)kernel.org
Cc: tommaso.cucinotta(a)santannapisa.it
Link: https://lkml.kernel.org/r/20190719140000.31694-7-juri.lelli@redhat.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
conflicts:
kernel/cgroup/cpuset.c
Signed-off-by: Cai Xinchen <caixinchen1(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
include/linux/cpuset.h | 8 ++++----
kernel/cgroup/cpuset.c | 24 +++++++++++++++++-------
2 files changed, 21 insertions(+), 11 deletions(-)
diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
index 934633a05d20..7f1478c26a33 100644
--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -40,14 +40,14 @@ static inline bool cpusets_enabled(void)
static inline void cpuset_inc(void)
{
- static_branch_inc(&cpusets_pre_enable_key);
- static_branch_inc(&cpusets_enabled_key);
+ static_branch_inc_cpuslocked(&cpusets_pre_enable_key);
+ static_branch_inc_cpuslocked(&cpusets_enabled_key);
}
static inline void cpuset_dec(void)
{
- static_branch_dec(&cpusets_enabled_key);
- static_branch_dec(&cpusets_pre_enable_key);
+ static_branch_dec_cpuslocked(&cpusets_enabled_key);
+ static_branch_dec_cpuslocked(&cpusets_pre_enable_key);
}
extern int cpuset_init(void);
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 55bfbc4cdb16..def36c3fc524 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -914,8 +914,8 @@ static void rebuild_sched_domains_locked(void)
cpumask_var_t *doms;
int ndoms;
+ lockdep_assert_cpus_held();
lockdep_assert_held(&cpuset_mutex);
- get_online_cpus();
/*
* We have raced with CPU hotplug. Don't do anything to avoid
@@ -923,15 +923,13 @@ static void rebuild_sched_domains_locked(void)
* Anyways, hotplug work item will rebuild sched domains.
*/
if (!cpumask_equal(top_cpuset.effective_cpus, cpu_active_mask))
- goto out;
+ return;
/* Generate domain masks and attrs */
ndoms = generate_sched_domains(&doms, &attr);
/* Have scheduler rebuild the domains */
partition_sched_domains(ndoms, doms, attr);
-out:
- put_online_cpus();
}
#else /* !CONFIG_SMP */
static void rebuild_sched_domains_locked(void)
@@ -941,9 +939,11 @@ static void rebuild_sched_domains_locked(void)
void rebuild_sched_domains(void)
{
+ get_online_cpus();
mutex_lock(&cpuset_mutex);
rebuild_sched_domains_locked();
mutex_unlock(&cpuset_mutex);
+ put_online_cpus();
}
/**
@@ -1612,13 +1612,13 @@ static void cpuset_attach(struct cgroup_taskset *tset)
cgroup_taskset_first(tset, &css);
cs = css_cs(css);
- mutex_lock(&cpuset_mutex);
-
/*
* It should hold cpus lock because a cpu offline event can
* cause set_cpus_allowed_ptr() failed.
*/
get_online_cpus();
+ mutex_lock(&cpuset_mutex);
+
/* prepare for attach */
if (cs == &top_cpuset)
cpumask_copy(cpus_attach, cpu_possible_mask);
@@ -1644,7 +1644,6 @@ static void cpuset_attach(struct cgroup_taskset *tset)
cpuset_change_task_nodemask(task, &cpuset_attach_nodemask_to);
cpuset_update_task_spread_flag(cs, task);
}
- put_online_cpus();
/*
* Change mm for all threadgroup leaders. This is expensive and may
@@ -1680,6 +1679,7 @@ static void cpuset_attach(struct cgroup_taskset *tset)
wake_up(&cpuset_attach_wq);
mutex_unlock(&cpuset_mutex);
+ put_online_cpus();
}
/* The various types of files and directories in a cpuset file system */
@@ -1711,6 +1711,7 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
cpuset_filetype_t type = cft->private;
int retval = 0;
+ get_online_cpus();
mutex_lock(&cpuset_mutex);
if (!is_cpuset_online(cs)) {
retval = -ENODEV;
@@ -1748,6 +1749,7 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
}
out_unlock:
mutex_unlock(&cpuset_mutex);
+ put_online_cpus();
return retval;
}
@@ -1758,6 +1760,7 @@ static int cpuset_write_s64(struct cgroup_subsys_state *css, struct cftype *cft,
cpuset_filetype_t type = cft->private;
int retval = -ENODEV;
+ get_online_cpus();
mutex_lock(&cpuset_mutex);
if (!is_cpuset_online(cs))
goto out_unlock;
@@ -1772,6 +1775,7 @@ static int cpuset_write_s64(struct cgroup_subsys_state *css, struct cftype *cft,
}
out_unlock:
mutex_unlock(&cpuset_mutex);
+ put_online_cpus();
return retval;
}
@@ -1810,6 +1814,7 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
kernfs_break_active_protection(of->kn);
flush_work(&cpuset_hotplug_work);
+ get_online_cpus();
mutex_lock(&cpuset_mutex);
if (!is_cpuset_online(cs))
goto out_unlock;
@@ -1840,6 +1845,7 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
free_trial_cpuset(trialcs);
out_unlock:
mutex_unlock(&cpuset_mutex);
+ put_online_cpus();
kernfs_unbreak_active_protection(of->kn);
css_put(&cs->css);
flush_workqueue(cpuset_migrate_mm_wq);
@@ -2108,6 +2114,7 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
if (!parent)
return 0;
+ get_online_cpus();
mutex_lock(&cpuset_mutex);
set_bit(CS_ONLINE, &cs->flags);
@@ -2161,6 +2168,7 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
spin_unlock_irq(&callback_lock);
out_unlock:
mutex_unlock(&cpuset_mutex);
+ put_online_cpus();
return 0;
}
@@ -2174,6 +2182,7 @@ static void cpuset_css_offline(struct cgroup_subsys_state *css)
{
struct cpuset *cs = css_cs(css);
+ get_online_cpus();
mutex_lock(&cpuset_mutex);
if (is_sched_load_balance(cs))
@@ -2183,6 +2192,7 @@ static void cpuset_css_offline(struct cgroup_subsys_state *css)
clear_bit(CS_ONLINE, &cs->flags);
mutex_unlock(&cpuset_mutex);
+ put_online_cpus();
}
static void cpuset_css_free(struct cgroup_subsys_state *css)
--
2.25.1
[PATCH openEuler-5.10-LTS-SP1 01/36] bpf, sockmap: Fix an infinite loop error when len is 0 in tcp_bpf_recvmsg_parser()
by Jialin Zhang 29 Mar '23
From: Liu Jian <liujian56(a)huawei.com>
mainline inclusion
from mainline-v6.3-rc2
commit d900f3d20cc3169ce42ec72acc850e662a4d4db2
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I65HYE
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
---------------------------
When the buffer length of the recvmsg system call is 0, we get the
following soft lockup problem:
watchdog: BUG: soft lockup - CPU#3 stuck for 27s! [a.out:6149]
CPU: 3 PID: 6149 Comm: a.out Kdump: loaded Not tainted 6.2.0+ #30
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
RIP: 0010:remove_wait_queue+0xb/0xc0
Code: 5e 41 5f c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 57 <41> 56 41 55 41 54 55 48 89 fd 53 48 89 f3 4c 8d 6b 18 4c 8d 73 20
RSP: 0018:ffff88811b5978b8 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff88811a7d3780 RCX: ffffffffb7a4d768
RDX: dffffc0000000000 RSI: ffff88811b597908 RDI: ffff888115408040
RBP: 1ffff110236b2f1b R08: 0000000000000000 R09: ffff88811a7d37e7
R10: ffffed10234fa6fc R11: 0000000000000001 R12: ffff88811179b800
R13: 0000000000000001 R14: ffff88811a7d38a8 R15: ffff88811a7d37e0
FS: 00007f6fb5398740(0000) GS:ffff888237180000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000000 CR3: 000000010b6ba002 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
tcp_msg_wait_data+0x279/0x2f0
tcp_bpf_recvmsg_parser+0x3c6/0x490
inet_recvmsg+0x280/0x290
sock_recvmsg+0xfc/0x120
____sys_recvmsg+0x160/0x3d0
___sys_recvmsg+0xf0/0x180
__sys_recvmsg+0xea/0x1a0
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
The logic in tcp_bpf_recvmsg_parser is as follows:
msg_bytes_ready:
        copied = sk_msg_recvmsg(sk, psock, msg, len, flags);
        if (!copied) {
                wait data;
                goto msg_bytes_ready;
        }
In this case, "copied" is always 0 and the infinite loop occurs.
According to the Linux system call man page, 0 should be returned in
this case. Therefore, in tcp_bpf_recvmsg_parser(), return immediately
if the length is 0. Also fix several other functions that have the same
problem.
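For reference, a minimal userspace sketch of the contract the fix
restores. The socket setup is omitted and the fd is assumed to be a
sockmap-attached TCP socket; this is illustrative, not part of the
patch:
        #include <stdio.h>
        #include <sys/types.h>
        #include <sys/socket.h>
        #include <sys/uio.h>
        static void probe_zero_len_recvmsg(int fd)
        {
                char dummy;             /* never written: iov_len is 0 */
                struct iovec iov = { .iov_base = &dummy, .iov_len = 0 };
                struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
                /* With the fix, this returns 0 immediately; before it,
                 * the calling thread could spin forever in the kernel. */
                ssize_t n = recvmsg(fd, &msg, 0);
                printf("recvmsg(len=0) -> %zd\n", n);
        }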
Fixes: 1f5be6b3b063 ("udp: Implement udp_bpf_recvmsg() for sockmap")
Fixes: 9825d866ce0d ("af_unix: Implement unix_dgram_bpf_recvmsg()")
Fixes: c5d2177a72a1 ("bpf, sockmap: Fix race in ingress receive verdict with redirect to self")
Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Liu Jian <liujian56(a)huawei.com>
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Acked-by: John Fastabend <john.fastabend(a)gmail.com>
Cc: Jakub Sitnicki <jakub(a)cloudflare.com>
Link: https://lore.kernel.org/bpf/20230303080946.1146638-1-liujian56@huawei.com
(cherry picked from commit d900f3d20cc3169ce42ec72acc850e662a4d4db2)
Signed-off-by: Liu Jian <liujian56(a)huawei.com>
Conflicts:
net/ipv4/udp_bpf.c
net/unix/unix_bpf.c
net/ipv4/tcp_bpf.c
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
net/ipv4/tcp_bpf.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index 1a8f129fb309..89bbb8ee1634 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -275,6 +275,9 @@ static int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
if (unlikely(flags & MSG_ERRQUEUE))
return inet_recv_error(sk, msg, len, addr_len);
+ if (!len)
+ return 0;
+
psock = sk_psock_get(sk);
if (unlikely(!psock))
return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len);
--
2.25.1
[PATCH openEuler-5.10-LTS 01/36] bpf, sockmap: Fix an infinite loop error when len is 0 in tcp_bpf_recvmsg_parser()
by Jialin Zhang 29 Mar '23
From: Liu Jian <liujian56(a)huawei.com>
mainline inclusion
from mainline-v6.3-rc2
commit d900f3d20cc3169ce42ec72acc850e662a4d4db2
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I65HYE
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
---------------------------
When the buffer length of the recvmsg system call is 0, we get the
following soft lockup problem:
watchdog: BUG: soft lockup - CPU#3 stuck for 27s! [a.out:6149]
CPU: 3 PID: 6149 Comm: a.out Kdump: loaded Not tainted 6.2.0+ #30
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
RIP: 0010:remove_wait_queue+0xb/0xc0
Code: 5e 41 5f c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 57 <41> 56 41 55 41 54 55 48 89 fd 53 48 89 f3 4c 8d 6b 18 4c 8d 73 20
RSP: 0018:ffff88811b5978b8 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff88811a7d3780 RCX: ffffffffb7a4d768
RDX: dffffc0000000000 RSI: ffff88811b597908 RDI: ffff888115408040
RBP: 1ffff110236b2f1b R08: 0000000000000000 R09: ffff88811a7d37e7
R10: ffffed10234fa6fc R11: 0000000000000001 R12: ffff88811179b800
R13: 0000000000000001 R14: ffff88811a7d38a8 R15: ffff88811a7d37e0
FS: 00007f6fb5398740(0000) GS:ffff888237180000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000000 CR3: 000000010b6ba002 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
tcp_msg_wait_data+0x279/0x2f0
tcp_bpf_recvmsg_parser+0x3c6/0x490
inet_recvmsg+0x280/0x290
sock_recvmsg+0xfc/0x120
____sys_recvmsg+0x160/0x3d0
___sys_recvmsg+0xf0/0x180
__sys_recvmsg+0xea/0x1a0
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
The logic in tcp_bpf_recvmsg_parser is as follows:
msg_bytes_ready:
        copied = sk_msg_recvmsg(sk, psock, msg, len, flags);
        if (!copied) {
                wait data;
                goto msg_bytes_ready;
        }
In this case, "copied" is always 0 and the infinite loop occurs.
According to the Linux system call man page, 0 should be returned in
this case. Therefore, in tcp_bpf_recvmsg_parser(), return immediately
if the length is 0. Also fix several other functions that have the same
problem.
Fixes: 1f5be6b3b063 ("udp: Implement udp_bpf_recvmsg() for sockmap")
Fixes: 9825d866ce0d ("af_unix: Implement unix_dgram_bpf_recvmsg()")
Fixes: c5d2177a72a1 ("bpf, sockmap: Fix race in ingress receive verdict with redirect to self")
Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Liu Jian <liujian56(a)huawei.com>
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Acked-by: John Fastabend <john.fastabend(a)gmail.com>
Cc: Jakub Sitnicki <jakub(a)cloudflare.com>
Link: https://lore.kernel.org/bpf/20230303080946.1146638-1-liujian56@huawei.com
(cherry picked from commit d900f3d20cc3169ce42ec72acc850e662a4d4db2)
Signed-off-by: Liu Jian <liujian56(a)huawei.com>
Conflicts:
net/ipv4/udp_bpf.c
net/unix/unix_bpf.c
net/ipv4/tcp_bpf.c
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
net/ipv4/tcp_bpf.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index eaf2308c355a..ddb1730cdf9b 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -273,6 +273,9 @@ static int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
if (unlikely(flags & MSG_ERRQUEUE))
return inet_recv_error(sk, msg, len, addr_len);
+ if (!len)
+ return 0;
+
psock = sk_psock_get(sk);
if (unlikely(!psock))
return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len);
--
2.25.1
Push some of ascend features to openEuler
Bixuan Cui (1):
open modules for sig
Chen Jun (4):
mm/sharepool: Fix a double free problem caused by init_local_group
mm/sharepool: use delete_spg_node to replace some repetitive code
mm/sharepool: extract group_add_task
mm/sharepool: check permission of task to operate spg_node->proc_node
Fang Lijun (2):
enable fdm in panic
HZ 12
Jian Zhang (5):
drivers: add RAS support for multiple die cpu
bugfix for RAS acpi map
add SDMA support for 1980B
using mm's pasid rather than a random value when using SDMA
kernel/sdei: enable SDEI for nmi
Lijun Fang (1):
modify ascend910 ARCH_NR_GPIOS
Wang Wensheng (11):
char/sdma: Add driver framework for sdma
char/sdma: Implement probe function for sdma device
char/sdma: Implement sdma_memcpy for kernel-space usage
char/sdma: Export the sdma features to userspace via ioctl
char/sdma: Add support for ACPI
char/sdma: Support multiple sdma devices
char/sdma: Add deffer probe feature
char/sdma: Add discrete memcpy interface for userspace
char/sdma: Pin pages when copying user memory
memcg/ascend: Support not account pages of cdm for memcg
mm/cdm: Extend cdm-nodes to support more nodes than 64
Xiang Rui (1):
Add kernel securec support for 1980.
Xingang Wang (1):
ascend_mpam: Add dts driver to traverse accelerators
Xu Qiang (1):
Fix kernel boot failed in fpga.
Yuan Can (4):
driver/iommu: Introduce IOMMU_SVA_FEAT_SVSP feature
driver/svm: save svsp_mm in mm and release it in __mmput
Add bootdot support for ascend 310B
Set bootdot base to 10
Zhang Jian (2):
drivers: apci parse apei table
drivers: acpi register irq
Zhou Guanghui (5):
Modify memblock number
driver/svm: support SVSP feature, interface definition
driver/svm: reserve address range for svsp
efi: when kaslr is enabled, restrict the random address range.
mm/hugetlb: support disable clear hugepage
chenjunwei (1):
ascend310B_defconfig
.../admin-guide/kernel-parameters.txt | 1 +
arch/arm64/Kconfig | 30 +
arch/arm64/configs/ascend310B_defconfig | 7040 +++++++++++++++++
arch/arm64/kernel/acpi.c | 5 +
arch/arm64/kernel/sdei.c | 11 -
arch/arm64/mm/init.c | 5 +
arch/arm64/mm/numa.c | 39 +-
certs/Makefile | 3 +-
drivers/Kconfig | 3 +
drivers/Makefile | 2 +
drivers/acpi/Kconfig | 11 +
drivers/acpi/Makefile | 1 +
drivers/acpi/apei/ghes.c | 47 +-
drivers/acpi/apei/hest.c | 35 +
drivers/acpi/dt_apei.c | 201 +
drivers/bootdot/Kconfig | 7 +
drivers/bootdot/Makefile | 2 +
drivers/bootdot/bootdot.c | 447 ++
drivers/char/Kconfig | 6 +
drivers/char/Makefile | 1 +
drivers/char/sdma.c | 1088 +++
.../firmware/efi/libstub/efi-stub-helper.c | 3 +
drivers/firmware/efi/libstub/efistub.h | 1 +
drivers/firmware/efi/libstub/randomalloc.c | 11 +
drivers/gpio/gpiolib.c | 45 +-
drivers/hisi/Kconfig | 16 +
drivers/hisi/Makefile | 1 +
drivers/hisi/securec/Kconfig | 1 +
drivers/hisi/securec/Makefile | 1 +
drivers/hisi/securec/src | 1 +
drivers/iommu/Kconfig | 7 +
drivers/iommu/Makefile | 1 +
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 35 +
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 16 +
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 8 +
drivers/iommu/ascend_mpam.c | 286 +
drivers/irqchip/irq-gic-v3-its.c | 13 +-
fs/hugetlbfs/inode.c | 12 +-
include/acpi/apei.h | 7 +
include/acpi/dt_apei.h | 31 +
include/asm-generic/gpio.h | 13 +
include/linux/ascend_smmu.h | 38 +
include/linux/bootdot.h | 40 +
include/linux/hisi_sdma.h | 42 +
include/linux/init.h | 1 +
include/linux/iommu.h | 1 +
include/linux/kernel.h | 6 +
include/linux/mm_types.h | 4 +
include/linux/mman.h | 46 +
include/linux/securec.h | 1 +
include/linux/securectype.h | 1 +
include/uapi/asm-generic/mman-common.h | 1 +
kernel/Kconfig.hz | 6 +
kernel/fork.c | 13 +
kernel/panic.c | 31 +
mm/hugetlb.c | 2 +
mm/memblock.c | 2 +-
mm/memcontrol.c | 8 +
mm/mmap.c | 32 +-
mm/share_pool.c | 105 +-
60 files changed, 9806 insertions(+), 68 deletions(-)
create mode 100644 arch/arm64/configs/ascend310B_defconfig
create mode 100644 drivers/acpi/dt_apei.c
create mode 100644 drivers/bootdot/Kconfig
create mode 100644 drivers/bootdot/Makefile
create mode 100644 drivers/bootdot/bootdot.c
create mode 100644 drivers/char/sdma.c
create mode 100644 drivers/hisi/Kconfig
create mode 100644 drivers/hisi/Makefile
create mode 100644 drivers/hisi/securec/Kconfig
create mode 100644 drivers/hisi/securec/Makefile
create mode 120000 drivers/hisi/securec/src
create mode 100644 drivers/iommu/ascend_mpam.c
create mode 100644 include/acpi/dt_apei.h
create mode 100644 include/linux/bootdot.h
create mode 100644 include/linux/hisi_sdma.h
create mode 120000 include/linux/securec.h
create mode 120000 include/linux/securectype.h
--
2.17.1
Paul E. McKenney (1):
rcu: Upgrade rcu_swap_protected() to rcu_replace_pointer()
Pedro Tammela (2):
net/sched: tcindex: update imperfect hash filters respecting rcu
net/sched: tcindex: search key must be 16 bits
include/linux/rcupdate.h | 18 ++++++++++++++++++
net/sched/cls_tcindex.c | 34 ++++++++++++++++++++++++++++++----
2 files changed, 48 insertions(+), 4 deletions(-)
--
2.9.5
Send 14 patches to test patchwork->PR function
Baisong Zhong (1):
media: dvb-usb: az6027: fix null-ptr-deref in az6027_i2c_xfer()
Chen Zhongjin (1):
ftrace: Fix invalid address access in lookup_rec() when index is 0
Darrick J. Wong (1):
ext4: fix another off-by-one fsmap error on 1k block filesystems
David Hildenbrand (2):
mm: optimize do_wp_page() for exclusive pages in the swapcache
mm: optimize do_wp_page() for fresh pages in local LRU pagevecs
Kuniyuki Iwashima (1):
seccomp: Move copy_seccomp() to no failure path.
Luke D. Jones (1):
HID: asus: Remove check for same LED brightness on set
Nicholas Piggin (1):
mm/vmalloc: huge vmalloc backing pages should be split rather than
compound
Pietro Borrello (2):
HID: asus: use spinlock to protect concurrent accesses
HID: asus: use spinlock to safely schedule workers
Xin Long (2):
tipc: set con sock in tipc_conn_alloc
tipc: add an extra conn_get in tipc_conn_alloc
Zheng Yejian (1):
livepatch/core: Fix hungtask against cpu hotplug on x86
Zhihao Cheng (1):
jbd2: fix data missing when reusing bh which is ready to be
checkpointed
drivers/hid/hid-asus.c | 38 ++++++++++++++++++-----
drivers/media/usb/dvb-usb/az6027.c | 4 +++
fs/ext4/fsmap.c | 2 ++
fs/jbd2/transaction.c | 50 +++++++++++++++++-------------
kernel/fork.c | 17 ++++++----
kernel/livepatch/core.c | 36 ++++++++++++++-------
kernel/trace/ftrace.c | 3 +-
mm/memory.c | 28 +++++++++++++----
mm/vmalloc.c | 22 ++++++++++---
net/tipc/topsrv.c | 20 ++++++------
10 files changed, 154 insertions(+), 66 deletions(-)
--
2.25.1
CVE-2022-29901
Alexandre Chartre (2):
x86/bugs: Report AMD retbleed vulnerability
x86/bugs: Add AMD retbleed= boot parameter
Andrew Cooper (1):
x86/cpu/amd: Enumerate BTC_NO
Daniel Sneddon (1):
x86/speculation: Add RSB VM Exit protections
Ingo Molnar (1):
x86/cpufeature: Fix various quality problems in the
<asm/cpu_device_hd.h> header
Josh Poimboeuf (8):
x86/speculation: Fix RSB filling with CONFIG_RETPOLINE=n
x86/speculation: Fix firmware entry SPEC_CTRL handling
x86/speculation: Fix SPEC_CTRL write on SMT state change
x86/speculation: Use cached host SPEC_CTRL value for guest entry/exit
x86/speculation: Remove x86_spec_ctrl_mask
KVM: VMX: Prevent guest RSB poisoning attacks with eIBRS
KVM: VMX: Fix IBRS handling after vmexit
x86/speculation: Fill RSB on vmexit for IBRS
Kan Liang (1):
x86/cpufeature: Add facility to check for min microcode revisions
Mark Gross (1):
x86/cpu: Add a steppings field to struct x86_cpu_id
Nathan Chancellor (1):
x86/speculation: Use DECLARE_PER_CPU for x86_spec_ctrl_current
Pawan Gupta (4):
x86/speculation: Add spectre_v2=ibrs option to support Kernel IBRS
x86/bugs: Add Cannon lake to RETBleed affected CPU list
x86/speculation: Disable RRSBA behavior
x86/bugs: Warn when "ibrs" mitigation is selected on Enhanced IBRS
parts
Peter Zijlstra (11):
x86/nospec: Fix i386 RSB stuffing
x86/cpufeatures: Move RETPOLINE flags to word 11
x86/bugs: Keep a per-CPU IA32_SPEC_CTRL value
x86/entry: Remove skip_r11rcx
x86/entry: Add kernel IBRS implementation
x86/bugs: Optimize SPEC_CTRL MSR writes
x86/bugs: Split spectre_v2_select_mitigation() and
spectre_v2_user_select_mitigation()
x86/bugs: Report Intel retbleed vulnerability
intel_idle: Disable IBRS during long idle
x86/speculation: Change FILL_RETURN_BUFFER to work with objtool
x86/common: Stamp out the stepping madness
Suleiman Souhlal (2):
Revert "x86/speculation: Add RSB VM Exit protections"
Revert "x86/cpu: Add a steppings field to struct x86_cpu_id"
Thomas Gleixner (2):
x86/devicetable: Move x86 specific macro out of generic code
x86/cpu: Add consistent CPU match macros
.../admin-guide/kernel-parameters.txt | 13 +
arch/x86/entry/calling.h | 68 +++-
arch/x86/entry/entry_32.S | 2 -
arch/x86/entry/entry_64.S | 34 +-
arch/x86/entry/entry_64_compat.S | 11 +-
arch/x86/include/asm/cpu_device_id.h | 168 +++++++-
arch/x86/include/asm/cpufeatures.h | 18 +-
arch/x86/include/asm/intel-family.h | 6 +
arch/x86/include/asm/msr-index.h | 10 +
arch/x86/include/asm/nospec-branch.h | 67 ++--
arch/x86/kernel/cpu/amd.c | 21 +-
arch/x86/kernel/cpu/bugs.c | 368 ++++++++++++++----
arch/x86/kernel/cpu/common.c | 72 ++--
arch/x86/kernel/cpu/match.c | 44 ++-
arch/x86/kernel/cpu/scattered.c | 1 +
arch/x86/kernel/process.c | 2 +-
arch/x86/kvm/svm.c | 1 +
arch/x86/kvm/vmx.c | 53 ++-
arch/x86/kvm/x86.c | 2 +-
drivers/base/cpu.c | 8 +
drivers/cpufreq/acpi-cpufreq.c | 1 +
drivers/cpufreq/amd_freq_sensitivity.c | 1 +
drivers/idle/intel_idle.c | 43 +-
include/linux/cpu.h | 2 +
include/linux/kvm_host.h | 2 +-
include/linux/mod_devicetable.h | 4 +-
tools/arch/x86/include/asm/cpufeatures.h | 2 +-
27 files changed, 832 insertions(+), 192 deletions(-)
--
2.25.1
[PATCH openEuler-1.0-LTS 1/2] ext4: commit super block if fs record error when journal record without error
by Yongqiang Liu 25 Mar '23
From: Ye Bin <yebin10(a)huawei.com>
mainline inclusion
from mainline-v6.3-rc2
commit eee00237fa5ec8f704f7323b54e48cc34e2d9168
category: bugfix
bugzilla: 188471,https://gitee.com/openeuler/kernel/issues/I6MR1V
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Currently, 'es->s_state' may be overwritten by journal recovery, and
the journal errno may not be recorded in the journal superblock as an
IO error. ext4_update_super() only updates error information when
'sbi->s_add_error_count' is larger than zero, so the 'EXT4_ERROR_FS'
flag may be lost.
To solve the above issue, restore the error flag in 'es->s_state' after
journal replay, just like the error info.
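The core of the fix is small; excerpted from the second hunk below,
after the saved error info has been copied back around
jbd2_journal_wipe(), the in-memory error state is OR-ed back into the
on-disk superblock and written out:
        es->s_state |= cpu_to_le16(EXT4_SB(sb)->s_mount_state &
                                   EXT4_ERROR_FS);
        /* Write out restored error information to the superblock */
        if (!bdev_read_only(sb->s_bdev)) {
                int err2 = ext4_commit_super(sb);
                err = err ? : err2;
        }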
Signed-off-by: Ye Bin <yebin10(a)huawei.com>
Reviewed-by: Baokun Li <libaokun1(a)huawei.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Link: https://lore.kernel.org/r/20230307061703.245965-2-yebin@huaweicloud.com
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Yang Erkun <yangerkun(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
fs/ext4/super.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 594fc6dd0e9f..31a4725da745 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5161,6 +5161,7 @@ static int ext4_load_journal(struct super_block *sb,
err = jbd2_journal_wipe(journal, !really_read_only);
if (!err) {
char *save = kmalloc(EXT4_S_ERR_LEN, GFP_KERNEL);
+
if (save)
memcpy(save, ((char *) es) +
EXT4_S_ERR_START, EXT4_S_ERR_LEN);
@@ -5169,6 +5170,14 @@ static int ext4_load_journal(struct super_block *sb,
memcpy(((char *) es) + EXT4_S_ERR_START,
save, EXT4_S_ERR_LEN);
kfree(save);
+ es->s_state |= cpu_to_le16(EXT4_SB(sb)->s_mount_state &
+ EXT4_ERROR_FS);
+ /* Write out restored error information to the superblock */
+ if (!bdev_read_only(sb->s_bdev)) {
+ int err2;
+ err2 = ext4_commit_super(sb);
+ err = err ? : err2;
+ }
}
if (err) {
--
2.25.1
[PATCH openEuler-1.0-LTS 01/10] scsi: hisi_sas: Grab sas_dev lock when traversing the members of sas_dev.list
by Yongqiang Liu 25 Mar '23
From: Xingui Yang <yangxingui(a)huawei.com>
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6J1SI
CVE: NA
----------------------------
When an IO completes and its slot is freed in slot_complete_v3_hw(), it
is possible that sas_dev.list is being traversed elsewhere, which may
trigger a NULL pointer dereference, as follows:
==>cq thread                      ==>scsi_eh_6
                                  ==>scsi_error_handler()
                                  ==>sas_eh_handle_sas_errors()
                                  ==>sas_scsi_find_task()
                                  ==>lldd_abort_task()
==>slot_complete_v3_hw()          ==>hisi_sas_abort_task()
==>hisi_sas_slot_task_free()      ==>dereg_device_v3_hw()
==>list_del_init()                ==>list_for_each_entry_safe()
[ 7165.434918] sas: Enter sas_scsi_recover_host busy: 32 failed: 32
[ 7165.434926] sas: trying to find task 0x00000000769b5ba5
[ 7165.434927] sas: sas_scsi_find_task: aborting task 0x00000000769b5ba5
[ 7165.434940] hisi_sas_v3_hw 0000:b4:02.0: slot complete: task(00000000769b5ba5) aborted
[ 7165.434964] hisi_sas_v3_hw 0000:b4:02.0: slot complete: task(00000000c9f7aa07) ignored
[ 7165.434965] hisi_sas_v3_hw 0000:b4:02.0: slot complete: task(00000000e2a1cf01) ignored
[ 7165.434968] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 7165.434972] hisi_sas_v3_hw 0000:b4:02.0: slot complete: task(0000000022d52d93) ignored
[ 7165.434975] hisi_sas_v3_hw 0000:b4:02.0: slot complete: task(0000000066a7516c) ignored
[ 7165.434976] Mem abort info:
[ 7165.434982] ESR = 0x96000004
[ 7165.434991] Exception class = DABT (current EL), IL = 32 bits
[ 7165.434992] SET = 0, FnV = 0
[ 7165.434993] EA = 0, S1PTW = 0
[ 7165.434994] Data abort info:
[ 7165.434994] ISV = 0, ISS = 0x00000004
[ 7165.434995] CM = 0, WnR = 0
[ 7165.434997] user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000f29543f2
[ 7165.434998] [0000000000000000] pgd=0000000000000000
[ 7165.435003] Internal error: Oops: 96000004 [#1] SMP
[ 7165.439863] Process scsi_eh_6 (pid: 4109, stack limit = 0x00000000c43818d5)
[ 7165.468862] pstate: 00c00009 (nzcv daif +PAN +UAO)
[ 7165.473637] pc : dereg_device_v3_hw+0x68/0xa8 [hisi_sas_v3_hw]
[ 7165.479443] lr : dereg_device_v3_hw+0x2c/0xa8 [hisi_sas_v3_hw]
[ 7165.485247] sp : ffff00001d623bc0
[ 7165.488546] x29: ffff00001d623bc0 x28: ffffa027d03b9508
[ 7165.493835] x27: ffff80278ed50af0 x26: ffffa027dd31e0a8
[ 7165.499123] x25: ffffa027d9b27f88 x24: ffffa027d9b209f8
[ 7165.504411] x23: ffffa027c45b0d60 x22: ffff80278ec07c00
[ 7165.509700] x21: 0000000000000008 x20: ffffa027d9b209f8
[ 7165.514988] x19: ffffa027d9b27f88 x18: ffffffffffffffff
[ 7165.520276] x17: 0000000000000000 x16: 0000000000000000
[ 7165.525564] x15: ffff0000091d9708 x14: ffff0000093b7dc8
[ 7165.530852] x13: ffff0000093b7a23 x12: 6e7265746e692067
[ 7165.536140] x11: 0000000000000000 x10: 0000000000000bb0
[ 7165.541429] x9 : ffff00001d6238f0 x8 : ffffa027d877af00
[ 7165.546718] x7 : ffffa027d6329600 x6 : ffff7e809f58ca00
[ 7165.552006] x5 : 0000000000001f8a x4 : 000000000000088e
[ 7165.557295] x3 : ffffa027d9b27fa8 x2 : 0000000000000000
[ 7165.562583] x1 : 0000000000000000 x0 : 000000003000188e
[ 7165.567872] Call trace:
[ 7165.570309] dereg_device_v3_hw+0x68/0xa8 [hisi_sas_v3_hw]
[ 7165.575775] hisi_sas_abort_task+0x248/0x358 [hisi_sas_main]
[ 7165.581415] sas_eh_handle_sas_errors+0x258/0x8e0 [libsas]
[ 7165.586876] sas_scsi_recover_host+0x134/0x458 [libsas]
[ 7165.592082] scsi_error_handler+0xb4/0x488
[ 7165.596163] kthread+0x134/0x138
[ 7165.599380] ret_from_fork+0x10/0x18
[ 7165.602940] Code: d5033e9f b9000040 aa0103e2 eb03003f (f9400021)
[ 7165.609004] kernel fault(0x1) notification starting on CPU 75
[ 7165.700728] ---[ end trace fc042cbbea224efc ]---
[ 7165.705326] Kernel panic - not syncing: Fatal exception
So grab the sas_dev lock when traversing the members of sas_dev.list in
dereg_device_v3_hw() and hisi_sas_release_tasks() to avoid concurrent
addition and deletion of members, which may cause an exception. And when
hisi_sas_release_tasks() calls hisi_sas_do_release_task() to free a
slot, the lock cannot be grabbed again in hisi_sas_slot_task_free(), so
a bool parameter need_lock is added.
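The resulting pattern, condensed from the hunks below: the list walk in
hisi_sas_release_task() already holds sas_dev->lock and passes
need_lock = false so the deletion does not try to take the lock a
second time, while all other callers keep passing true:
        if (need_lock) {
                spin_lock_irqsave(&sas_dev->lock, flags);
                list_del_init(&slot->entry);
                spin_unlock_irqrestore(&sas_dev->lock, flags);
        } else {
                list_del_init(&slot->entry);    /* lock held by caller */
        }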
Signed-off-by: Xingui Yang <yangxingui(a)huawei.com>
Reviewed-by: kang fenglong <kangfenglong(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/scsi/hisi_sas/hisi_sas.h | 3 ++-
drivers/scsi/hisi_sas/hisi_sas_main.c | 26 +++++++++++++++++---------
drivers/scsi/hisi_sas/hisi_sas_v1_hw.c | 2 +-
drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 2 +-
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 5 ++++-
5 files changed, 25 insertions(+), 13 deletions(-)
diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h
index 533408277156..2880e4e22a73 100644
--- a/drivers/scsi/hisi_sas/hisi_sas.h
+++ b/drivers/scsi/hisi_sas/hisi_sas.h
@@ -601,7 +601,8 @@ extern void hisi_sas_phy_enable(struct hisi_hba *hisi_hba, int phy_no,
extern void hisi_sas_phy_down(struct hisi_hba *hisi_hba, int phy_no, int rdy);
extern void hisi_sas_slot_task_free(struct hisi_hba *hisi_hba,
struct sas_task *task,
- struct hisi_sas_slot *slot);
+ struct hisi_sas_slot *slot,
+ bool need_lock);
extern void hisi_sas_init_mem(struct hisi_hba *hisi_hba);
extern void hisi_sas_rst_work_handler(struct work_struct *work);
extern void hisi_sas_sync_rst_work_handler(struct work_struct *work);
diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index e6ccaedcd8b9..b00bf3b0f418 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -229,7 +229,7 @@ static void hisi_sas_slot_index_init(struct hisi_hba *hisi_hba)
}
void hisi_sas_slot_task_free(struct hisi_hba *hisi_hba, struct sas_task *task,
- struct hisi_sas_slot *slot)
+ struct hisi_sas_slot *slot, bool need_lock)
{
unsigned long flags;
int device_id = slot->device_id;
@@ -260,9 +260,13 @@ void hisi_sas_slot_task_free(struct hisi_hba *hisi_hba, struct sas_task *task,
}
}
- spin_lock_irqsave(&sas_dev->lock, flags);
- list_del_init(&slot->entry);
- spin_unlock_irqrestore(&sas_dev->lock, flags);
+ if (need_lock) {
+ spin_lock_irqsave(&sas_dev->lock, flags);
+ list_del_init(&slot->entry);
+ spin_unlock_irqrestore(&sas_dev->lock, flags);
+ } else {
+ list_del_init(&slot->entry);
+ }
memset(slot, 0, offsetof(struct hisi_sas_slot, buf));
@@ -989,7 +993,7 @@ static void hisi_sas_port_notify_formed(struct asd_sas_phy *sas_phy)
}
static void hisi_sas_do_release_task(struct hisi_hba *hisi_hba, struct sas_task *task,
- struct hisi_sas_slot *slot)
+ struct hisi_sas_slot *slot, bool need_lock)
{
if (task) {
unsigned long flags;
@@ -1011,7 +1015,7 @@ static void hisi_sas_do_release_task(struct hisi_hba *hisi_hba, struct sas_task
task->task_done(task);
}
- hisi_sas_slot_task_free(hisi_hba, task, slot);
+ hisi_sas_slot_task_free(hisi_hba, task, slot, need_lock);
}
static void hisi_sas_release_task(struct hisi_hba *hisi_hba,
@@ -1019,9 +1023,13 @@ static void hisi_sas_release_task(struct hisi_hba *hisi_hba,
{
struct hisi_sas_slot *slot, *slot2;
struct hisi_sas_device *sas_dev = device->lldd_dev;
+ unsigned long flags;
+ spin_lock_irqsave(&sas_dev->lock, flags);
list_for_each_entry_safe(slot, slot2, &sas_dev->list, entry)
- hisi_sas_do_release_task(hisi_hba, slot->task, slot);
+ hisi_sas_do_release_task(hisi_hba, slot->task, slot, false);
+
+ spin_unlock_irqrestore(&sas_dev->lock, flags);
}
void hisi_sas_release_tasks(struct hisi_hba *hisi_hba)
@@ -1701,7 +1709,7 @@ static int hisi_sas_abort_task(struct sas_task *task)
*/
if (rc == TMF_RESP_FUNC_COMPLETE && rc2 != TMF_RESP_FUNC_SUCC) {
if (task->lldd_task)
- hisi_sas_do_release_task(hisi_hba, task, slot);
+ hisi_sas_do_release_task(hisi_hba, task, slot, true);
}
} else if (task->task_proto & SAS_PROTOCOL_SATA ||
task->task_proto & SAS_PROTOCOL_STP) {
@@ -1722,7 +1730,7 @@ static int hisi_sas_abort_task(struct sas_task *task)
*/
if ((sas_dev->dev_status == HISI_SAS_DEV_NCQ_ERR) &&
qc && qc->scsicmd) {
- hisi_sas_do_release_task(hisi_hba, task, slot);
+ hisi_sas_do_release_task(hisi_hba, task, slot, true);
rc = TMF_RESP_FUNC_COMPLETE;
} else {
rc = hisi_sas_softreset_ata_disk(device);
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
index 452665c641a6..6485e2b6456c 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
@@ -1320,7 +1320,7 @@ static int slot_complete_v1_hw(struct hisi_hba *hisi_hba,
}
out:
- hisi_sas_slot_task_free(hisi_hba, task, slot);
+ hisi_sas_slot_task_free(hisi_hba, task, slot, true);
sts = ts->stat;
if (task->task_done)
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
index 46d274582684..2df70c4873f5 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
@@ -2480,7 +2480,7 @@ slot_complete_v2_hw(struct hisi_hba *hisi_hba, struct hisi_sas_slot *slot)
}
task->task_state_flags |= SAS_TASK_STATE_DONE;
spin_unlock_irqrestore(&task->task_state_lock, flags);
- hisi_sas_slot_task_free(hisi_hba, task, slot);
+ hisi_sas_slot_task_free(hisi_hba, task, slot, true);
if (!is_internal && (task->task_proto != SAS_PROTOCOL_SMP)) {
spin_lock_irqsave(&device->done_lock, flags);
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 77f451040c4c..de88dead4b72 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -914,9 +914,11 @@ static void dereg_device_v3_hw(struct hisi_hba *hisi_hba,
struct hisi_sas_slot *slot, *slot2;
struct hisi_sas_device *sas_dev = device->lldd_dev;
u32 cfg_abt_set_query_iptt;
+ unsigned long flags;
cfg_abt_set_query_iptt = hisi_sas_read32(hisi_hba,
CFG_ABT_SET_QUERY_IPTT);
+ spin_lock_irqsave(&sas_dev->lock, flags);
list_for_each_entry_safe(slot, slot2, &sas_dev->list, entry) {
cfg_abt_set_query_iptt &= ~CFG_SET_ABORTED_IPTT_MSK;
cfg_abt_set_query_iptt |= (1 << CFG_SET_ABORTED_EN_OFF) |
@@ -924,6 +926,7 @@ static void dereg_device_v3_hw(struct hisi_hba *hisi_hba,
hisi_sas_write32(hisi_hba, CFG_ABT_SET_QUERY_IPTT,
cfg_abt_set_query_iptt);
}
+ spin_unlock_irqrestore(&sas_dev->lock, flags);
cfg_abt_set_query_iptt &= ~(1 << CFG_SET_ABORTED_EN_OFF);
hisi_sas_write32(hisi_hba, CFG_ABT_SET_QUERY_IPTT,
cfg_abt_set_query_iptt);
@@ -2558,7 +2561,7 @@ slot_complete_v3_hw(struct hisi_hba *hisi_hba, struct hisi_sas_slot *slot)
}
task->task_state_flags |= SAS_TASK_STATE_DONE;
spin_unlock_irqrestore(&task->task_state_lock, flags);
- hisi_sas_slot_task_free(hisi_hba, task, slot);
+ hisi_sas_slot_task_free(hisi_hba, task, slot, true);
if (!is_internal && (task->task_proto != SAS_PROTOCOL_SMP)) {
spin_lock_irqsave(&device->done_lock, flags);
--
2.25.1
[PATCH openEuler-1.0-LTS 1/6] net: tls: fix possible race condition between do_tls_getsockopt_conf() and do_tls_setsockopt_conf()
by Yongqiang Liu 25 Mar '23
From: Hangyu Hua <hbh25y(a)gmail.com>
mainline inclusion
from mainline-v6.3-rc2
commit 49c47cc21b5b7a3d8deb18fc57b0aa2ab1286962
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6NIUR
CVE: CVE-2023-28466
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
---------------------------
ctx->crypto_send.info is not protected by lock_sock in
do_tls_getsockopt_conf(). A race condition between do_tls_getsockopt_conf()
and error paths of do_tls_setsockopt_conf() may lead to a use-after-free
or null-deref.
More discussion: https://lore.kernel.org/all/Y/ht6gQL+u6fj3dG@hog/
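After the change (assembled from the hunks below), the socket lock is
taken once at the top level, so every getsockopt branch, including
do_tls_getsockopt_tx(), reads ctx->crypto_send.info under lock_sock():
        static int do_tls_getsockopt(struct sock *sk, int optname,
                                     char __user *optval, int __user *optlen)
        {
                int rc = 0;
                lock_sock(sk);
                switch (optname) {
                case TLS_TX:
                        rc = do_tls_getsockopt_tx(sk, optval, optlen);
                        break;
                default:
                        rc = -ENOPROTOOPT;
                        break;
                }
                release_sock(sk);
                return rc;
        }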
Fixes: 3c4d7559159b ("tls: kernel TLS support")
Signed-off-by: Hangyu Hua <hbh25y(a)gmail.com>
Link: https://lore.kernel.org/r/20230228023344.9623-1-hbh25y@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Liu Jian <liujian56(a)huawei.com>
Conflicts:
net/tls/tls_main.c
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
net/tls/tls_main.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 3cf4916c541b..19646ef9f6f6 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -364,13 +364,11 @@ static int do_tls_getsockopt_tx(struct sock *sk, char __user *optval,
rc = -EINVAL;
goto out;
}
- lock_sock(sk);
memcpy(crypto_info_aes_gcm_128->iv,
ctx->tx.iv + TLS_CIPHER_AES_GCM_128_SALT_SIZE,
TLS_CIPHER_AES_GCM_128_IV_SIZE);
memcpy(crypto_info_aes_gcm_128->rec_seq, ctx->tx.rec_seq,
TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE);
- release_sock(sk);
if (copy_to_user(optval,
crypto_info_aes_gcm_128,
sizeof(*crypto_info_aes_gcm_128)))
@@ -390,6 +388,8 @@ static int do_tls_getsockopt(struct sock *sk, int optname,
{
int rc = 0;
+ lock_sock(sk);
+
switch (optname) {
case TLS_TX:
rc = do_tls_getsockopt_tx(sk, optval, optlen);
@@ -398,6 +398,9 @@ static int do_tls_getsockopt(struct sock *sk, int optname,
rc = -ENOPROTOOPT;
break;
}
+
+ release_sock(sk);
+
return rc;
}
--
2.25.1
[PATCH openEuler-1.0-LTS] wifi: brcmfmac: slab-out-of-bounds read in brcmf_get_assoc_ies()
by Yongqiang Liu 23 Mar '23
From: Jisoo Jang <jisoo.jang(a)yonsei.ac.kr>
maillist inclusion
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6NCVX
CVE: CVE-2023-1380
Reference: https://patchwork.kernel.org/project/linux-wireless/patch/20230309104457.22…
--------------------------------
Fix a slab-out-of-bounds read that occurs in kmemdup() called from
brcmf_get_assoc_ies().
The bug can occur when assoc_info->req_len, data from a URB provided by
a USB device, is bigger than the size of the buffer, which is defined
as WL_EXTRA_BUF_MAX.
Add a size check for req_len/resp_len of assoc_info.
Found by a modified version of syzkaller.
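The fix itself is the short bounds check below, excerpted from the hunk
at the end of this mail; it rejects firmware-reported IE lengths that
exceed the WL_EXTRA_BUF_MAX-sized buffer before kmemdup() copies from
cfg->extra_buf:
        req_len = le32_to_cpu(assoc_info->req_len);
        resp_len = le32_to_cpu(assoc_info->resp_len);
        if (req_len > WL_EXTRA_BUF_MAX || resp_len > WL_EXTRA_BUF_MAX) {
                brcmf_err("invalid lengths in assoc info: req %u resp %u\n",
                          req_len, resp_len);
                return -EINVAL;
        }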
[ 46.592467][ T7] ==================================================================
[ 46.594687][ T7] BUG: KASAN: slab-out-of-bounds in kmemdup+0x3e/0x50
[ 46.596572][ T7] Read of size 3014656 at addr ffff888019442000 by task kworker/0:1/7
[ 46.598575][ T7]
[ 46.599157][ T7] CPU: 0 PID: 7 Comm: kworker/0:1 Tainted: G O 5.14.0+ #145
[ 46.601333][ T7] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[ 46.604360][ T7] Workqueue: events brcmf_fweh_event_worker
[ 46.605943][ T7] Call Trace:
[ 46.606584][ T7] dump_stack_lvl+0x8e/0xd1
[ 46.607446][ T7] print_address_description.constprop.0.cold+0x93/0x334
[ 46.608610][ T7] ? kmemdup+0x3e/0x50
[ 46.609341][ T7] kasan_report.cold+0x79/0xd5
[ 46.610151][ T7] ? kmemdup+0x3e/0x50
[ 46.610796][ T7] kasan_check_range+0x14e/0x1b0
[ 46.611691][ T7] memcpy+0x20/0x60
[ 46.612323][ T7] kmemdup+0x3e/0x50
[ 46.612987][ T7] brcmf_get_assoc_ies+0x967/0xf60
[ 46.613904][ T7] ? brcmf_notify_vif_event+0x3d0/0x3d0
[ 46.614831][ T7] ? lock_chain_count+0x20/0x20
[ 46.615683][ T7] ? mark_lock.part.0+0xfc/0x2770
[ 46.616552][ T7] ? lock_chain_count+0x20/0x20
[ 46.617409][ T7] ? mark_lock.part.0+0xfc/0x2770
[ 46.618244][ T7] ? lock_chain_count+0x20/0x20
[ 46.619024][ T7] brcmf_bss_connect_done.constprop.0+0x241/0x2e0
[ 46.620019][ T7] ? brcmf_parse_configure_security.isra.0+0x2a0/0x2a0
[ 46.620818][ T7] ? __lock_acquire+0x181f/0x5790
[ 46.621462][ T7] brcmf_notify_connect_status+0x448/0x1950
[ 46.622134][ T7] ? rcu_read_lock_bh_held+0xb0/0xb0
[ 46.622736][ T7] ? brcmf_cfg80211_join_ibss+0x7b0/0x7b0
[ 46.623390][ T7] ? find_held_lock+0x2d/0x110
[ 46.623962][ T7] ? brcmf_fweh_event_worker+0x19f/0xc60
[ 46.624603][ T7] ? mark_held_locks+0x9f/0xe0
[ 46.625145][ T7] ? lockdep_hardirqs_on_prepare+0x3e0/0x3e0
[ 46.625871][ T7] ? brcmf_cfg80211_join_ibss+0x7b0/0x7b0
[ 46.626545][ T7] brcmf_fweh_call_event_handler.isra.0+0x90/0x100
[ 46.627338][ T7] brcmf_fweh_event_worker+0x557/0xc60
[ 46.627962][ T7] ? brcmf_fweh_call_event_handler.isra.0+0x100/0x100
[ 46.628736][ T7] ? rcu_read_lock_sched_held+0xa1/0xd0
[ 46.629396][ T7] ? rcu_read_lock_bh_held+0xb0/0xb0
[ 46.629970][ T7] ? lockdep_hardirqs_on_prepare+0x273/0x3e0
[ 46.630649][ T7] process_one_work+0x92b/0x1460
[ 46.631205][ T7] ? pwq_dec_nr_in_flight+0x330/0x330
[ 46.631821][ T7] ? rwlock_bug.part.0+0x90/0x90
[ 46.632347][ T7] worker_thread+0x95/0xe00
[ 46.632832][ T7] ? __kthread_parkme+0x115/0x1e0
[ 46.633393][ T7] ? process_one_work+0x1460/0x1460
[ 46.633957][ T7] kthread+0x3a1/0x480
[ 46.634369][ T7] ? set_kthread_struct+0x120/0x120
[ 46.634933][ T7] ret_from_fork+0x1f/0x30
[ 46.635431][ T7]
[ 46.635687][ T7] Allocated by task 7:
[ 46.636151][ T7] kasan_save_stack+0x1b/0x40
[ 46.636628][ T7] __kasan_kmalloc+0x7c/0x90
[ 46.637108][ T7] kmem_cache_alloc_trace+0x19e/0x330
[ 46.637696][ T7] brcmf_cfg80211_attach+0x4a0/0x4040
[ 46.638275][ T7] brcmf_attach+0x389/0xd40
[ 46.638739][ T7] brcmf_usb_probe+0x12de/0x1690
[ 46.639279][ T7] usb_probe_interface+0x2aa/0x760
[ 46.639820][ T7] really_probe+0x205/0xb70
[ 46.640342][ T7] __driver_probe_device+0x311/0x4b0
[ 46.640876][ T7] driver_probe_device+0x4e/0x150
[ 46.641445][ T7] __device_attach_driver+0x1cc/0x2a0
[ 46.642000][ T7] bus_for_each_drv+0x156/0x1d0
[ 46.642543][ T7] __device_attach+0x23f/0x3a0
[ 46.643065][ T7] bus_probe_device+0x1da/0x290
[ 46.643644][ T7] device_add+0xb7b/0x1eb0
[ 46.644130][ T7] usb_set_configuration+0xf59/0x16f0
[ 46.644720][ T7] usb_generic_driver_probe+0x82/0xa0
[ 46.645295][ T7] usb_probe_device+0xbb/0x250
[ 46.645786][ T7] really_probe+0x205/0xb70
[ 46.646258][ T7] __driver_probe_device+0x311/0x4b0
[ 46.646804][ T7] driver_probe_device+0x4e/0x150
[ 46.647387][ T7] __device_attach_driver+0x1cc/0x2a0
[ 46.647926][ T7] bus_for_each_drv+0x156/0x1d0
[ 46.648454][ T7] __device_attach+0x23f/0x3a0
[ 46.648939][ T7] bus_probe_device+0x1da/0x290
[ 46.649478][ T7] device_add+0xb7b/0x1eb0
[ 46.649936][ T7] usb_new_device.cold+0x49c/0x1029
[ 46.650526][ T7] hub_event+0x1c98/0x3950
[ 46.650975][ T7] process_one_work+0x92b/0x1460
[ 46.651535][ T7] worker_thread+0x95/0xe00
[ 46.651991][ T7] kthread+0x3a1/0x480
[ 46.652413][ T7] ret_from_fork+0x1f/0x30
[ 46.652885][ T7]
[ 46.653131][ T7] The buggy address belongs to the object at ffff888019442000
[ 46.653131][ T7] which belongs to the cache kmalloc-2k of size 2048
[ 46.654669][ T7] The buggy address is located 0 bytes inside of
[ 46.654669][ T7] 2048-byte region [ffff888019442000, ffff888019442800)
[ 46.656137][ T7] The buggy address belongs to the page:
[ 46.656720][ T7] page:ffffea0000651000 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x19440
[ 46.657792][ T7] head:ffffea0000651000 order:3 compound_mapcount:0 compound_pincount:0
[ 46.658673][ T7] flags: 0x100000000010200(slab|head|node=0|zone=1)
[ 46.659422][ T7] raw: 0100000000010200 0000000000000000 dead000000000122 ffff888100042000
[ 46.660363][ T7] raw: 0000000000000000 0000000000080008 00000001ffffffff 0000000000000000
[ 46.661236][ T7] page dumped because: kasan: bad access detected
[ 46.661956][ T7] page_owner tracks the page as allocated
[ 46.662588][ T7] page last allocated via order 3, migratetype Unmovable, gfp_mask 0x52a20(GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 7, ts 31136961085, free_ts 0
[ 46.664271][ T7] prep_new_page+0x1aa/0x240
[ 46.664763][ T7] get_page_from_freelist+0x159a/0x27c0
[ 46.665340][ T7] __alloc_pages+0x2da/0x6a0
[ 46.665847][ T7] alloc_pages+0xec/0x1e0
[ 46.666308][ T7] allocate_slab+0x380/0x4e0
[ 46.666770][ T7] ___slab_alloc+0x5bc/0x940
[ 46.667264][ T7] __slab_alloc+0x6d/0x80
[ 46.667712][ T7] kmem_cache_alloc_trace+0x30a/0x330
[ 46.668299][ T7] brcmf_usbdev_qinit.constprop.0+0x50/0x470
[ 46.668885][ T7] brcmf_usb_probe+0xc97/0x1690
[ 46.669438][ T7] usb_probe_interface+0x2aa/0x760
[ 46.669988][ T7] really_probe+0x205/0xb70
[ 46.670487][ T7] __driver_probe_device+0x311/0x4b0
[ 46.671031][ T7] driver_probe_device+0x4e/0x150
[ 46.671604][ T7] __device_attach_driver+0x1cc/0x2a0
[ 46.672192][ T7] bus_for_each_drv+0x156/0x1d0
[ 46.672739][ T7] page_owner free stack trace missing
[ 46.673335][ T7]
[ 46.673620][ T7] Memory state around the buggy address:
[ 46.674213][ T7] ffff888019442700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 46.675083][ T7] ffff888019442780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 46.675994][ T7] >ffff888019442800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 46.676875][ T7] ^
[ 46.677323][ T7] ffff888019442880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 46.678190][ T7] ffff888019442900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 46.679052][ T7] ==================================================================
[ 46.679945][ T7] Disabling lock debugging due to kernel taint
[ 46.680725][ T7] Kernel panic - not syncing:
Reviewed-by: Arend van Spriel <arend.vanspriel(a)broadcom.com>
Signed-off-by: Jisoo Jang <jisoo.jang(a)yonsei.ac.kr>
Signed-off-by: Kalle Valo <kvalo(a)kernel.org>
Link: https://lore.kernel.org/r/20230309104457.22628-1-jisoo.jang@yonsei.ac.kr
Signed-off-by: Baisong Zhong <zhongbaisong(a)huawei.com>
Reviewed-by: Liu Jian <liujian56(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
index bbdc6000afb9..4e1bd049dd06 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
@@ -5356,6 +5356,11 @@ static s32 brcmf_get_assoc_ies(struct brcmf_cfg80211_info *cfg,
(struct brcmf_cfg80211_assoc_ielen_le *)cfg->extra_buf;
req_len = le32_to_cpu(assoc_info->req_len);
resp_len = le32_to_cpu(assoc_info->resp_len);
+ if (req_len > WL_EXTRA_BUF_MAX || resp_len > WL_EXTRA_BUF_MAX) {
+ brcmf_err("invalid lengths in assoc info: req %u resp %u\n",
+ req_len, resp_len);
+ return -EINVAL;
+ }
if (req_len) {
err = brcmf_fil_iovar_data_get(ifp, "assoc_req_ies",
cfg->extra_buf,
--
2.25.1
[RFC PATCH openEuler-1.0-LTS v2] sched: memqos: add memqos for dynamic affinity
by Wang ShaoBo 23 Mar '23
Add a debug memband interface to dynamic affinity; this would be useful
for threads that are sensitive to memory bandwidth.
Signed-off-by: Wang ShaoBo <bobo.shaobowang(a)huawei.com>
v2: Fix a failure to update a thread's mpamid.
---
arch/arm64/include/asm/mpam.h | 2 +
arch/arm64/include/asm/mpam_sched.h | 2 +
arch/arm64/kernel/mpam/mpam_device.c | 58 ++-
arch/arm64/kernel/mpam/mpam_resctrl.c | 37 ++
arch/arm64/kernel/process.c | 2 +-
include/linux/memqos.h | 142 +++++++
include/linux/sched.h | 15 +-
include/linux/sysctl.h | 2 +
kernel/cgroup/cpuset.c | 1 +
kernel/exit.c | 3 +
kernel/fork.c | 4 +
kernel/sched/Makefile | 1 +
kernel/sched/core.c | 52 ++-
kernel/sched/fair.c | 14 +-
kernel/sched/memqos/Makefile | 6 +
kernel/sched/memqos/memqos.c | 297 +++++++++++++++
kernel/sched/memqos/phase_feature_sysctl.c | 183 +++++++++
kernel/sched/memqos/phase_memband.c | 179 +++++++++
kernel/sched/memqos/phase_perf.c | 412 +++++++++++++++++++++
kernel/sched/memqos/phase_sim_knn.c | 92 +++++
kernel/sysctl.c | 7 +
mm/mempolicy.c | 10 +-
22 files changed, 1500 insertions(+), 21 deletions(-)
create mode 100644 include/linux/memqos.h
create mode 100644 kernel/sched/memqos/Makefile
create mode 100644 kernel/sched/memqos/memqos.c
create mode 100644 kernel/sched/memqos/phase_feature_sysctl.c
create mode 100644 kernel/sched/memqos/phase_memband.c
create mode 100644 kernel/sched/memqos/phase_perf.c
create mode 100644 kernel/sched/memqos/phase_sim_knn.c
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h
index 6338eab817e75..269a91d8ca907 100644
--- a/arch/arm64/include/asm/mpam.h
+++ b/arch/arm64/include/asm/mpam.h
@@ -4,6 +4,8 @@
#ifdef CONFIG_MPAM
extern int mpam_rmid_to_partid_pmg(int rmid, int *partid, int *pmg);
+
+void mpam_component_config_mbwu_mon(int partid, int pmg, int monitor, int *result, int nr);
#endif
#endif /* _ASM_ARM64_MPAM_H */
diff --git a/arch/arm64/include/asm/mpam_sched.h b/arch/arm64/include/asm/mpam_sched.h
index 08ed349b6efa1..32d08cf654b31 100644
--- a/arch/arm64/include/asm/mpam_sched.h
+++ b/arch/arm64/include/asm/mpam_sched.h
@@ -40,6 +40,8 @@ static inline void mpam_sched_in(void)
__mpam_sched_in();
}
+void __mpam_sched_in_v2(struct task_struct *tsk);
+
#else
static inline void mpam_sched_in(void) {}
diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c
index 6455c69f132fd..48de3982a0b9a 100644
--- a/arch/arm64/kernel/mpam/mpam_device.c
+++ b/arch/arm64/kernel/mpam/mpam_device.c
@@ -84,14 +84,14 @@ void mpam_class_list_lock_held(void)
static inline u32 mpam_read_reg(struct mpam_device *dev, u16 reg)
{
WARN_ON_ONCE(reg > SZ_MPAM_DEVICE);
- assert_spin_locked(&dev->lock);
+ //assert_spin_locked(&dev->lock);
/*
* If we touch a device that isn't accessible from this CPU we may get
* an external-abort.
*/
- WARN_ON_ONCE(preemptible());
- WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &dev->fw_affinity));
+ //WARN_ON_ONCE(preemptible());
+ //WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &dev->fw_affinity));
return readl_relaxed(dev->mapped_hwpage + reg);
}
@@ -99,14 +99,14 @@ static inline u32 mpam_read_reg(struct mpam_device *dev, u16 reg)
static inline void mpam_write_reg(struct mpam_device *dev, u16 reg, u32 val)
{
WARN_ON_ONCE(reg > SZ_MPAM_DEVICE);
- assert_spin_locked(&dev->lock);
+ //assert_spin_locked(&dev->lock);
/*
* If we touch a device that isn't accessible from this CPU we may get
* an external-abort. If we're lucky, we corrupt another mpam:component.
*/
- WARN_ON_ONCE(preemptible());
- WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &dev->fw_affinity));
+ //WARN_ON_ONCE(preemptible());
+ //WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &dev->fw_affinity));
writel_relaxed(val, dev->mapped_hwpage + reg);
}
@@ -1208,6 +1208,7 @@ static u32 mpam_device_read_mbwu_mon(struct mpam_device *dev,
{
u16 mon;
u32 clt, flt, cur_clt, cur_flt;
+ u32 total = 0;
mon = args->mon;
@@ -1249,7 +1250,12 @@ static u32 mpam_device_read_mbwu_mon(struct mpam_device *dev,
wmb();
}
- return mpam_read_reg(dev, MSMON_MBWU);
+ total += mpam_read_reg(dev, MSMON_MBWU);
+ total += mpam_read_reg(dev, MSMON_MBWU);
+ total += mpam_read_reg(dev, MSMON_MBWU);
+ total += mpam_read_reg(dev, MSMON_MBWU);
+ total += mpam_read_reg(dev, MSMON_MBWU);
+ return total / 5;
}
static int mpam_device_frob_mon(struct mpam_device *dev,
@@ -1470,6 +1476,44 @@ static void mpam_component_device_sync(void *__ctx)
cpumask_set_cpu(smp_processor_id(), &ctx->updated_on);
}
+static DEFINE_SPINLOCK(mpam_tmp_lock);
+
+void mpam_component_config_mbwu_mon(int partid, int pmg, int monitor, int *result, int nr)
+{
+ struct mpam_class *class;
+ struct mpam_component *comp;
+ struct mpam_device *dev;
+ struct sync_args args;
+ int i = 0;
+
+ args.pmg = pmg;
+ args.mon = monitor;
+ args.closid.reqpartid = partid;
+ args.match_pmg = 1;
+
+ spin_lock(&mpam_tmp_lock);
+ list_for_each_entry(class, &mpam_classes, classes_list) {
+ if (class->type != MPAM_CLASS_MEMORY)
+ continue;
+
+ list_for_each_entry(comp, &class->components, class_list) {
+ if (i >= nr) {
+ pr_err_once("error, i > result nr");
+ break;
+ }
+ result[i] = 0;
+ list_for_each_entry(dev, &comp->devices, comp_list) {
+ result[i] += mpam_device_read_mbwu_mon(dev, &args);
+ }
+ i++;
+ }
+ break;
+ }
+ spin_unlock(&mpam_tmp_lock);
+
+}
+EXPORT_SYMBOL(mpam_component_config_mbwu_mon);
+
/**
* in some cases/platforms the MSC register access is only possible with
* the associated CPUs. And need to check if those CPUS are online before
diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c
index 60d3d8706a38b..26258f7508ac4 100644
--- a/arch/arm64/kernel/mpam/mpam_resctrl.c
+++ b/arch/arm64/kernel/mpam/mpam_resctrl.c
@@ -2226,6 +2226,43 @@ int mpam_resctrl_init(void)
return resctrl_group_init();
}
+
+void __mpam_sched_in_v2(struct task_struct *tsk)
+{
+ struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
+ u64 rmid = state->default_rmid;
+ u64 closid = state->default_closid;
+
+ /*
+ * If this task has a closid/rmid assigned, use it.
+ * Else use the closid/rmid assigned to this cpu.
+ */
+ if (tsk->closid)
+ closid = tsk->closid;
+
+ if (tsk->rmid)
+ rmid = tsk->rmid;
+
+ if (closid != state->cur_closid || rmid != state->cur_rmid) {
+ u64 reg;
+
+ /* set in EL0 */
+ reg = mpam_read_sysreg_s(SYS_MPAM0_EL1, "SYS_MPAM0_EL1");
+ reg = PARTID_SET(reg, closid);
+ reg = PMG_SET(reg, rmid);
+ mpam_write_sysreg_s(reg, SYS_MPAM0_EL1, "SYS_MPAM0_EL1");
+
+ /* set in EL1 */
+ reg = mpam_read_sysreg_s(SYS_MPAM1_EL1, "SYS_MPAM1_EL1");
+ reg = PARTID_SET(reg, closid);
+ reg = PMG_SET(reg, rmid);
+ mpam_write_sysreg_s(reg, SYS_MPAM1_EL1, "SYS_MPAM1_EL1");
+
+ state->cur_rmid = rmid;
+ state->cur_closid = closid;
+ }
+}
+
/*
* __intel_rdt_sched_in() - Writes the task's CLOSid/RMID to IA32_PQR_MSR
*
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index e5be78915632c..7896bb74ecc49 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -531,7 +531,7 @@ __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
/* the actual thread switch */
last = cpu_switch_to(prev, next);
- mpam_sched_in();
+ //mpam_sched_in();
return last;
}
diff --git a/include/linux/memqos.h b/include/linux/memqos.h
new file mode 100644
index 0000000000000..814e9935590d3
--- /dev/null
+++ b/include/linux/memqos.h
@@ -0,0 +1,142 @@
+#ifndef _MEMQOS_H
+#define _MEMQOS_H
+
+#include <linux/vmstat.h>
+#include <linux/rbtree.h>
+//#include <linux/sched.h>
+
+struct task_struct;
+
+struct memqos_domain {
+ int dom_id;
+ int total_memband_div_10;
+ int total_out_memband_div_10;
+
+ //record 10 timers
+ int memband_ringpos;
+ int memband_div_10_history[4][10];
+};
+
+struct memqos_mpam_profile {
+ int partid;
+ int pmg;
+ int monitor;
+
+ struct task_struct *tsk;
+ int used;
+};
+
+struct memqos_wait_profile {
+ struct memqos_mpam_profile *profile;
+ struct list_head wait_list;
+};
+
+struct memqos_class {
+ struct list_head turbo_list;
+ struct list_head tasks_list;
+};
+
+#include <linux/topology.h>
+//embed in task_struct
+
+struct task_memqos {
+ int ipc_ringpos;
+ int ipcx10;
+ int ipcx10_total[4];
+ int ipcx10_history[10];
+
+ int memband_div_10;
+ int memband_ringpos;
+ int memband_div_10_total[4];
+ int memband_div_10_history[4][10];
+
+ u32 sample_times;
+ int account_ready;
+ int numa_score[4];
+ int turbo;
+
+ struct memqos_wait_profile mpam_profile;
+
+ struct list_head turbo_list;
+ struct list_head task_list;
+
+ struct cpumask *advise_mem_node_mask;
+ int preferred_nid;
+
+ int class_id;
+
+ int corrupt;
+};
+
+#define PHASE_PEVENT_NUM 10
+
+struct phase_event_pcount {
+ u64 data[PHASE_PEVENT_NUM];
+};
+
+struct phase_event_count {
+ struct phase_event_pcount pcount;
+};
+
+void phase_update_mpam_label(struct task_struct *tsk);
+
+void phase_release_mpam_label(struct task_struct *tsk);
+
+static inline void memqos_update_mpam_label(struct task_struct *tsk)
+{
+ phase_update_mpam_label(tsk);
+}
+
+static inline void memqos_release_mpam_label(struct task_struct *tsk)
+{
+ phase_release_mpam_label(tsk);
+}
+
+void phase_destroy_waitqueue(struct task_struct *tsk);
+
+void phase_get_memband(struct memqos_mpam_profile *pm, int *result, int nr);
+
+DECLARE_STATIC_KEY_FALSE(sched_phase);
+DECLARE_STATIC_KEY_FALSE(sched_phase_printk);
+
+int phase_perf_create(void);
+
+void phase_perf_release(void);
+
+void memqos_account_task(struct task_struct *p, int cpu);
+
+void memqos_drop_class(struct task_struct *p);
+
+void phase_account_task(struct task_struct *p, int cpu);
+
+static inline void memqos_task_collect_data(struct task_struct *p, int cpu)
+{
+ phase_account_task(p, cpu);
+}
+
+static inline void memqos_task_account(struct task_struct *p, int cpu)
+{
+ memqos_account_task(p, cpu);
+}
+
+static inline void memqos_task_exit(struct task_struct *p)
+{
+
+ memqos_drop_class(p);
+ phase_destroy_waitqueue(p);
+}
+
+void memqos_select_nicest_cpus(struct task_struct *p);
+
+void memqos_exclude_low_level_task_single(struct task_struct *p);
+
+int knn_get_tag(int ipcx10, int memband_div_10);
+
+void memqos_init_class(struct task_struct *p);
+
+void phase_trace_printk(struct task_struct *p);
+static inline void memqos_trace_printk(struct task_struct *p)
+{
+ phase_trace_printk(p);
+}
+#endif
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 928186f161000..c5b74cd0c5830 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -29,6 +29,7 @@
#include <linux/task_io_accounting.h>
#include <linux/rseq.h>
#include <linux/thread_bits.h>
+#include <linux/memqos.h>
/* task_struct member predeclarations (sorted alphabetically): */
struct audit_context;
@@ -1268,7 +1269,7 @@ struct task_struct {
#if !defined(__GENKSYMS__)
#if defined(CONFIG_QOS_SCHED_DYNAMIC_AFFINITY)
cpumask_t *prefer_cpus;
- const cpumask_t *select_cpus;
+ cpumask_t *select_cpus;
#else
KABI_RESERVE(6)
KABI_RESERVE(7)
@@ -1279,6 +1280,10 @@ struct task_struct {
#endif
KABI_RESERVE(8)
+#ifdef CONFIG_QOS_SCHED_DYNAMIC_AFFINITY
+ struct task_memqos sched_memqos;
+#endif
+
/* CPU-specific state of this task: */
struct thread_struct thread;
@@ -1998,6 +2003,14 @@ int set_prefer_cpus_ptr(struct task_struct *p,
const struct cpumask *new_mask);
int sched_prefer_cpus_fork(struct task_struct *p, struct task_struct *orig);
void sched_prefer_cpus_free(struct task_struct *p);
+static inline bool prefer_cpus_valid(struct task_struct *p)
+{
+ return p->prefer_cpus &&
+ !cpumask_empty(p->prefer_cpus) &&
+ !cpumask_equal(p->prefer_cpus, &p->cpus_allowed) &&
+ cpumask_subset(p->prefer_cpus, &p->cpus_allowed);
+}
+void sched_memqos_task_collect_data_range(int start_cpu, int end_cpu);
#endif
#endif
diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index b769ecfcc3bd4..73bce39107cb3 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -230,6 +230,8 @@ static inline void setup_sysctl_set(struct ctl_table_set *p,
#endif /* CONFIG_SYSCTL */
+extern struct ctl_table phase_table[];
+
int sysctl_max_threads(struct ctl_table *table, int write,
void __user *buffer, size_t *lenp, loff_t *ppos);
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 55bfbc4cdb16c..d94a9065a5605 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -106,6 +106,7 @@ struct cpuset {
nodemask_t mems_allowed;
#ifdef CONFIG_QOS_SCHED_DYNAMIC_AFFINITY
cpumask_var_t prefer_cpus;
+ int mem_turbo;
#endif
/* effective CPUs and Memory Nodes allow to tasks */
diff --git a/kernel/exit.c b/kernel/exit.c
index 2a32d32bdc03d..b731c19618176 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -699,6 +699,8 @@ static void check_stack_usage(void)
static inline void check_stack_usage(void) {}
#endif
+#include <linux/memqos.h>
+
void __noreturn do_exit(long code)
{
struct task_struct *tsk = current;
@@ -806,6 +808,7 @@ void __noreturn do_exit(long code)
* because of cgroup mode, must be called before cgroup_exit()
*/
perf_event_exit_task(tsk);
+ memqos_task_exit(tsk);
sched_autogroup_exit_task(tsk);
cgroup_exit(tsk);
diff --git a/kernel/fork.c b/kernel/fork.c
index b5453a26655e2..0a762b92dc814 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -841,6 +841,8 @@ void set_task_stack_end_magic(struct task_struct *tsk)
*stackend = STACK_END_MAGIC; /* for overflow detection */
}
+
+#include <linux/memqos.h>
static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
{
struct task_struct *tsk;
@@ -923,6 +925,8 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
kcov_task_init(tsk);
+ memqos_init_class(tsk);
+
#ifdef CONFIG_FAULT_INJECTION
tsk->fail_nth = 0;
#endif
diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
index 7fe183404c383..471380d6686e3 100644
--- a/kernel/sched/Makefile
+++ b/kernel/sched/Makefile
@@ -29,3 +29,4 @@ obj-$(CONFIG_CPU_FREQ) += cpufreq.o
obj-$(CONFIG_CPU_FREQ_GOV_SCHEDUTIL) += cpufreq_schedutil.o
obj-$(CONFIG_MEMBARRIER) += membarrier.o
obj-$(CONFIG_CPU_ISOLATION) += isolation.o
+obj-$(CONFIG_QOS_SCHED_DYNAMIC_AFFINITY) += memqos/
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 970616070da86..15c7e1e3408cb 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2787,6 +2787,8 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev)
calculate_sigpending();
}
+#include <linux/memqos.h>
+
/*
* context_switch - switch to the new MM and the new thread's register state.
*/
@@ -2794,6 +2796,8 @@ static __always_inline struct rq *
context_switch(struct rq *rq, struct task_struct *prev,
struct task_struct *next, struct rq_flags *rf)
{
+ struct rq *ret;
+
prepare_task_switch(rq, prev, next);
/*
@@ -2837,6 +2841,18 @@ context_switch(struct rq *rq, struct task_struct *prev,
}
}
+ /* account the outgoing task and release its MPAM label */
+ memqos_task_account(prev, smp_processor_id());
+
+ if (prefer_cpus_valid(prev))
+ memqos_trace_printk(prev);
+
+ memqos_release_mpam_label(prev);
+
+ /* label the incoming task with its MPAM partid */
+ if (prefer_cpus_valid(next))
+ memqos_update_mpam_label(next);
+
rq->clock_update_flags &= ~(RQCF_ACT_SKIP|RQCF_REQ_SKIP);
prepare_lock_switch(rq, next, rf);
@@ -2845,7 +2861,9 @@ context_switch(struct rq *rq, struct task_struct *prev,
switch_to(prev, next, prev);
barrier();
- return finish_task_switch(prev);
+ ret = finish_task_switch(prev);
+
+ return ret;
}
/*
@@ -3051,6 +3069,20 @@ unsigned long long task_sched_runtime(struct task_struct *p)
return ns;
}
+void sched_memqos_task_collect_data_range(int start_cpu, int end_cpu)
+{
+ int cpu;
+ struct task_struct *curr;
+ struct rq *rq_curr;
+
+ for (cpu = start_cpu; cpu <= end_cpu; cpu++) {
+ rq_curr = cpu_rq(cpu);
+ curr = rq_curr->curr;
+ if (curr && prefer_cpus_valid(curr))
+ memqos_task_collect_data(curr, cpu);
+ }
+}
+
/*
* This function gets called by the timer code, with HZ frequency.
* We call it with interrupts disabled.
@@ -3058,8 +3090,12 @@ unsigned long long task_sched_runtime(struct task_struct *p)
void scheduler_tick(void)
{
int cpu = smp_processor_id();
+ //memqos collect next cpu's memband and perf (disabled; sampling is done by the hrtimers in phase_feature_sysctl.c)
+ //int cpu_memqos = (cpu + 1) % nr_cpu_ids;
struct rq *rq = cpu_rq(cpu);
+ //struct rq *rq_next = cpu_rq(cpu_memqos);
struct task_struct *curr = rq->curr;
+ //struct task_struct *curr_memqos = rq_next->curr;
struct rq_flags rf;
sched_clock_tick();
@@ -3075,6 +3111,10 @@ void scheduler_tick(void)
perf_event_task_tick();
+ //only monitor tasks with dynamic affinity enabled (disabled; see the hrtimer path)
+ //if (curr_memqos && prefer_cpus_valid(curr_memqos))
+ // memqos_task_collect_data(curr_memqos, cpu_memqos);
+
#ifdef CONFIG_SMP
rq->idle_balance = idle_cpu(cpu);
trigger_load_balance(rq);
@@ -3524,6 +3564,16 @@ static void __sched notrace __schedule(bool preempt)
/* Also unlocks the rq: */
rq = context_switch(rq, prev, next, &rf);
} else {
+ memqos_task_account(prev, smp_processor_id());
+
+ if (prefer_cpus_valid(prev))
+ memqos_trace_printk(prev);
+
+ memqos_release_mpam_label(prev);
+ //relabel this task's mpamid
+ if (prefer_cpus_valid(prev))
+ memqos_update_mpam_label(prev);
+
rq->clock_update_flags &= ~(RQCF_ACT_SKIP|RQCF_REQ_SKIP);
rq_unlock_irq(rq, &rf);
}
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index af55a26d11fcb..12e9675495d2c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6675,6 +6675,7 @@ static int wake_cap(struct task_struct *p, int cpu, int prev_cpu)
}
#ifdef CONFIG_QOS_SCHED_DYNAMIC_AFFINITY
+#include <linux/memqos.h>
/*
* Low utilization threshold for CPU
*
@@ -6749,14 +6750,6 @@ static inline int cpu_vutil_of(int cpu)
return cputime->vutil;
}
-static inline bool prefer_cpus_valid(struct task_struct *p)
-{
- return p->prefer_cpus &&
- !cpumask_empty(p->prefer_cpus) &&
- !cpumask_equal(p->prefer_cpus, &p->cpus_allowed) &&
- cpumask_subset(p->prefer_cpus, &p->cpus_allowed);
-}
-
/*
* set_task_select_cpus: select the cpu range for task
* @p: the task whose available cpu range will to set
@@ -6828,8 +6821,13 @@ static void set_task_select_cpus(struct task_struct *p, int *idlest_cpu,
if (util_avg_sum < sysctl_sched_util_low_pct *
cpumask_weight(p->prefer_cpus)) {
p->select_cpus = p->prefer_cpus;
+ memqos_select_nicest_cpus(p);
if (sd_flag & SD_BALANCE_WAKE)
schedstat_inc(p->se.dyn_affi_stats->nr_wakeups_preferred_cpus);
+ } else {
+ /* pick a turbo task, or failing that a lower-class task, to move out */
+ memqos_exclude_low_level_task_single(p);
}
}
#endif
diff --git a/kernel/sched/memqos/Makefile b/kernel/sched/memqos/Makefile
new file mode 100644
index 0000000000000..ed8f42649a8a7
--- /dev/null
+++ b/kernel/sched/memqos/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0
+# These files are disabled because they produce non-interesting flaky coverage
+# that is not a function of syscall inputs. E.g. involuntary context switches.
+KCOV_INSTRUMENT := n
+
+obj-y := memqos.o phase_feature_sysctl.o phase_memband.o phase_perf.o phase_sim_knn.o
diff --git a/kernel/sched/memqos/memqos.c b/kernel/sched/memqos/memqos.c
new file mode 100644
index 0000000000000..29fc6af1f02c1
--- /dev/null
+++ b/kernel/sched/memqos/memqos.c
@@ -0,0 +1,297 @@
+#include <linux/memqos.h>
+#include <linux/cpumask.h>
+#include <linux/sched.h>
+
+static void memqos_set_task_classid(struct task_struct *p)
+{
+ int class_id;
+ int memband_div_10 = p->sched_memqos.memband_div_10;
+ int ipcx10 = p->sched_memqos.ipcx10;
+
+ class_id = knn_get_tag((u64)ipcx10, (u64)memband_div_10);
+ p->sched_memqos.class_id = class_id;
+}
+
+//static memqos_domain mq_domains[] = {
+// {.dom_id = 0, .total_memband = 0, .total_out_memband = 0,},
+// {.dom_id = 1, .total_memband = 0, .total_out_memband = 0,},
+// {.dom_id = 2, .total_memband = 0, .total_out_memband = 0,},
+// {.dom_id = 3, .total_memband = 0, .total_out_memband = 0,},
+//};
+
+static DEFINE_PER_CPU(struct memqos_class, memqos_classes[8]);
+//static DEFINE_PER_CPU(spinlock_t, memqos_class_lock);
+static DEFINE_SPINLOCK(memqos_class_lock);
+
+static int memqos_class_online(unsigned int cpu)
+{
+ int class_id = 0;
+ struct memqos_class *class;
+
+ for (class_id = 0; class_id < 8; class_id++) {
+ class = &per_cpu(memqos_classes, cpu)[class_id];
+ INIT_LIST_HEAD(&class->tasks_list);
+ INIT_LIST_HEAD(&class->turbo_list);
+ }
+ return 0;
+}
+
+static int memqos_class_offline(unsigned int cpu)
+{
+ return 0;
+}
+
+#include <linux/cpu.h>
+#include <linux/cacheinfo.h>
+
+static int memqos_init(void)
+{
+ int cpuhp_state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+ "memqos:online", memqos_class_online,
+ memqos_class_offline);
+ /* CPUHP_AP_ONLINE_DYN returns a positive state number on success */
+ if (cpuhp_state <= 0) {
+ pr_err("Failed to register 'dyn' cpuhp callbacks\n");
+ return cpuhp_state ? cpuhp_state : -EINVAL;
+ }
+ return 0;
+}
+late_initcall(memqos_init);
+
+static void memqos_insert_to_class(struct task_struct *p, int cpu)
+{
+ unsigned long flag;
+ int class_id = p->sched_memqos.class_id;
+ struct memqos_class *class;
+ struct task_memqos *memqos;
+
+ if (class_id >= 8)
+ return;
+
+ memqos = &p->sched_memqos;
+
+ class = &per_cpu(memqos_classes, cpu)[class_id];
+
+ spin_lock_irqsave(&memqos_class_lock, flag);
+ if (p->sched_memqos.corrupt) {
+ spin_unlock_irqrestore(&memqos_class_lock, flag);
+ return;
+ }
+
+ list_move_tail(&p->sched_memqos.task_list, &class->tasks_list);
+ if (memqos->turbo)
+ list_move_tail(&p->sched_memqos.turbo_list, &class->turbo_list);
+ spin_unlock_irqrestore(&memqos_class_lock, flag);
+}
+
+static void memqos_drop_class_without_lock(struct task_struct *p)
+{
+ list_del_init(&p->sched_memqos.task_list);
+ list_del_init(&p->sched_memqos.turbo_list);
+}
+
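+/*
+ * numa_score[i] is positive iff node i carries more than half of the
+ * task's sampled bandwidth. The first dominant node marks the task
+ * "turbo": turbo = node id + 1, 0 means no dominant node.
+ */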
+static void memqos_score(struct task_struct *p)
+{
+ int total_n1 = p->sched_memqos.memband_div_10_total[0];
+ int total_n2 = p->sched_memqos.memband_div_10_total[1];
+ int total_n3 = p->sched_memqos.memband_div_10_total[2];
+ int total_n4 = p->sched_memqos.memband_div_10_total[3];
+
+ /* no samples on every node yet: scoring would divide by zero */
+ if (!total_n1 || !total_n2 || !total_n3 || !total_n4) {
+ p->sched_memqos.turbo = 0;
+ return;
+ }
+
+ p->sched_memqos.numa_score[0] = (total_n1 - (total_n2 + total_n3 + total_n4)) * 10 / total_n1;
+ p->sched_memqos.numa_score[1] = (total_n2 - (total_n1 + total_n3 + total_n4)) * 10 / total_n2;
+ p->sched_memqos.numa_score[2] = (total_n3 - (total_n1 + total_n2 + total_n4)) * 10 / total_n3;
+ p->sched_memqos.numa_score[3] = (total_n4 - (total_n1 + total_n2 + total_n3)) * 10 / total_n4;
+
+ /* a positive score means this node dominates the task's traffic */
+ if (p->sched_memqos.numa_score[0] > 0)
+ p->sched_memqos.turbo = 1;
+ else if (p->sched_memqos.numa_score[1] > 0)
+ p->sched_memqos.turbo = 2;
+ else if (p->sched_memqos.numa_score[2] > 0)
+ p->sched_memqos.turbo = 3;
+ else if (p->sched_memqos.numa_score[3] > 0)
+ p->sched_memqos.turbo = 4;
+ else
+ p->sched_memqos.turbo = 0;
+}
+
+void memqos_account_task(struct task_struct *p, int cpu)
+{
+ if (!p->sched_memqos.account_ready ||
+ p->sched_memqos.corrupt)
+ return;
+ memqos_set_task_classid(p);
+ memqos_insert_to_class(p, cpu);
+ memqos_score(p);
+ p->sched_memqos.account_ready = 0;
+}
+
+void memqos_init_class(struct task_struct *p)
+{
+ memset(&p->sched_memqos, 0, sizeof(struct task_memqos));
+ spin_lock(&memqos_class_lock);
+ INIT_LIST_HEAD(&p->sched_memqos.task_list);
+ INIT_LIST_HEAD(&p->sched_memqos.turbo_list);
+ INIT_LIST_HEAD(&p->sched_memqos.mpam_profile.wait_list);
+ spin_unlock(&memqos_class_lock);
+
+ p->closid = 0;
+ p->rmid = 0;
+}
+
+/* detach the task from its class lists; it is no longer tracked by memqos */
+void memqos_drop_class(struct task_struct *p)
+{
+ spin_lock(&memqos_class_lock);
+ memqos_drop_class_without_lock(p);
+ p->sched_memqos.corrupt = 1;
+ spin_unlock(&memqos_class_lock);
+}
+
+void memqos_select_nicest_cpus(struct task_struct *p)
+{
+ int i = 0;
+ int max_score = -10000;
+ int select_node = 0;
+ struct task_memqos *memqos = &p->sched_memqos;
+
+ if (!memqos->turbo) {
+ for (i = 0; i < 4; i++) {
+ if (!cpumask_intersects(cpumask_of_node(i), p->select_cpus))
+ continue;
+
+ if (memqos->numa_score[i] > max_score) {
+ select_node = i;
+ max_score = memqos->numa_score[i];
+ }
+ }
+
+ cpumask_and(p->select_cpus, p->select_cpus, cpumask_of_node(select_node));
+ //p->sched_memqos.advise_mem_node_mask = cpumask_of_node(select_node);
+ p->sched_memqos.preferred_nid = memqos->turbo; /* 0 here: no dominant node, so no page-placement override */
+ return;
+ }
+
+ select_node = memqos->turbo - 1;
+ if (cpumask_intersects(cpumask_of_node(select_node), p->select_cpus)) {
+ cpumask_and(p->select_cpus, p->select_cpus, cpumask_of_node(select_node));
+ //p->sched_memqos.advise_mem_node_mask = cpumask_of_node(select_node);
+ p->sched_memqos.preferred_nid = memqos->turbo;
+ }
+
+ /* turbo node not reachable from select_cpus: leave placement as is and wait */
+ return;
+}
+
+void memqos_exclude_low_level_task_single(struct task_struct *p)
+{
+ int i, j, cpu;
+ int find = 0;
+ int select_node = 0;
+ const struct cpumask *cpumask;
+ struct cpumask cpumask_med;
+ struct memqos_class *class;
+ struct task_memqos *memqos = &p->sched_memqos;
+ struct task_memqos *iter;
+ struct task_struct *tsk = NULL;
+ int max_score = -100000;
+
+ if (memqos->turbo) {
+ select_node = memqos->turbo - 1;
+ cpumask = cpumask_of_node(select_node);
+ if (!cpumask_intersects(cpumask, p->prefer_cpus) &&
+ (cpumask_intersects(&p->cpus_allowed, cpumask))) {
+ cpumask_and(p->select_cpus, &p->cpus_allowed, cpumask);
+ //go out!
+ spin_lock(&memqos_class_lock);
+ memqos_drop_class_without_lock(p);
+ spin_unlock(&memqos_class_lock);
+ //p->sched_memqos.advise_mem_node_mask = cpumask_of_node(select_node);
+ p->sched_memqos.preferred_nid = memqos->turbo;
+ return;
+ } else if (cpumask_intersects(p->prefer_cpus, cpumask)) {
+ cpumask_and(p->select_cpus, p->prefer_cpus, cpumask);
+ //p->sched_memqos.advise_mem_node_mask = cpumask_of_node(select_node);
+ p->sched_memqos.preferred_nid = memqos->turbo;
+ }
+ }
+
+ /* first pass: look for a turbo task we can move to its preferred node */
+ for (cpu = 0; cpu < nr_cpu_ids; cpu++) {
+ if (!cpumask_test_cpu(cpu, p->prefer_cpus))
+ continue;
+
+ spin_lock(&memqos_class_lock);
+ for (i = 7; i >= 0; i--) {
+ class = &per_cpu(memqos_classes, cpu)[i];
+ /* use a separate cursor so "memqos" keeps pointing at p's own state */
+ list_for_each_entry(iter, &class->turbo_list, turbo_list) {
+ if (!iter->turbo)
+ continue;
+ select_node = iter->turbo - 1;
+ cpumask = cpumask_of_node(select_node);
+ if (!cpumask_intersects(cpumask, p->prefer_cpus)) {
+ tsk = container_of(iter, struct task_struct, sched_memqos);
+ if (!cpumask_intersects(cpumask, &tsk->cpus_allowed))
+ continue;
+ cpumask_and(tsk->select_cpus, &tsk->cpus_allowed, cpumask);
+ //mem preferred
+ //tsk->sched_memqos.advise_mem_node_mask = cpumask_of_node(select_node);
+ tsk->sched_memqos.preferred_nid = iter->turbo;
+ find = 1;
+ break;
+ }
+ }
+ if (find) {
+ memqos_drop_class_without_lock(tsk);
+ spin_unlock(&memqos_class_lock);
+ return;
+ }
+ }
+ spin_unlock(&memqos_class_lock);
+ }
+
+ find = 0;
+
+ /* second pass: otherwise pick a lower-class task to move out */
+ for (cpu = 0; cpu < nr_cpu_ids; cpu++) {
+ if (!cpumask_test_cpu(cpu, p->prefer_cpus))
+ continue;
+
+ spin_lock(&memqos_class_lock);
+ /* only consider classes below p's own */
+ for (i = 0; i < memqos->class_id; i++) {
+ class = &per_cpu(memqos_classes, cpu)[i];
+ list_for_each_entry(iter, &class->tasks_list, task_list) {
+ if (iter->turbo)
+ continue;
+
+ tsk = container_of(iter, struct task_struct, sched_memqos);
+ for (j = 0; j < 4; j++) {
+ if (!cpumask_intersects(cpumask_of_node(j), &tsk->cpus_allowed))
+ continue;
+ if (iter->numa_score[j] > max_score) {
+ select_node = j;
+ max_score = iter->numa_score[j];
+ }
+ find = 1;
+ }
+ if (!find)
+ continue;
+
+ cpumask_and(&cpumask_med, cpumask_of_node(select_node), &tsk->cpus_allowed);
+ cpumask_andnot(&cpumask_med, &cpumask_med, p->prefer_cpus);
+ if (cpumask_empty(&cpumask_med))
+ continue;
+ cpumask_copy(tsk->select_cpus, &cpumask_med);
+ //mem preferred
+ //tsk->sched_memqos.advise_mem_node_mask = cpumask_of_node(select_node);
+ tsk->sched_memqos.preferred_nid = select_node + 1; /* destination node, encoded as node + 1 */
+ memqos_drop_class_without_lock(tsk);
+ spin_unlock(&memqos_class_lock);
+ return;
+ }
+ }
+ spin_unlock(&memqos_class_lock);
+ }
+
+ /* nothing suitable found; this task may itself be migrated out later */
+ return;
+}
+
diff --git a/kernel/sched/memqos/phase_feature_sysctl.c b/kernel/sched/memqos/phase_feature_sysctl.c
new file mode 100644
index 0000000000000..9106a90868a3d
--- /dev/null
+++ b/kernel/sched/memqos/phase_feature_sysctl.c
@@ -0,0 +1,183 @@
+#include <linux/sched.h>
+#include <linux/sysctl.h>
+#include <linux/capability.h>
+#include <linux/cpumask.h>
+#include <linux/topology.h>
+#include <linux/sched/task.h>
+
+#include <linux/memqos.h>
+
+#ifdef CONFIG_PROC_SYSCTL
+
+//setup timer for counting
+#include <linux/sched.h>
+#include <linux/timer.h>
+#include <asm/ioctl.h>
+
+/* needs at least 2 CPUs: each hrtimer samples the half of the CPUs it does not run on */
+static enum hrtimer_restart timer_fn_twin_a(struct hrtimer *timer_data)
+{
+ sched_memqos_task_collect_data_range(0, nr_cpu_ids / 2 - 1);
+ hrtimer_forward_now(timer_data, 1 * NSEC_PER_MSEC);
+ return HRTIMER_RESTART;
+}
+
+static enum hrtimer_restart timer_fn_twin_b(struct hrtimer *timer_data)
+{
+ sched_memqos_task_collect_data_range(nr_cpu_ids / 2, nr_cpu_ids - 1);
+ hrtimer_forward_now(timer_data, 1 * NSEC_PER_MSEC);
+ return HRTIMER_RESTART;
+}
+
+static struct hrtimer timer_twin_a;
+static struct hrtimer timer_twin_b;
+
+static void memqos_timer_init_func_a(void *info) {
+ hrtimer_init(&timer_twin_a, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
+ timer_twin_a.function = timer_fn_twin_a;
+ hrtimer_start(&timer_twin_a, ktime_add_ns(ktime_get(), 10000000), HRTIMER_MODE_ABS);
+}
+
+static void memqos_timer_init_func_b(void *info) {
+ hrtimer_init(&timer_twin_b, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
+ timer_twin_b.function = timer_fn_twin_b;
+ hrtimer_start(&timer_twin_b, ktime_add_ns(ktime_get(), 10000000), HRTIMER_MODE_ABS);
+}
+
+static void memqos_timer_init_a(void)
+{
+ smp_call_function_single(0, memqos_timer_init_func_b, NULL, 0);
+}
+
+static void memqos_timer_init_b(void)
+{
+ smp_call_function_single(nr_cpu_ids / 2, memqos_timer_init_func_a, NULL, 0);
+}
+
+static void memqos_timer_twin_init(void) {
+ memqos_timer_init_a();
+ memqos_timer_init_b();
+}
+
+static void memqos_timer_twin_exit(void) {
+ hrtimer_cancel(&timer_twin_a);
+ hrtimer_cancel(&timer_twin_b);
+}
+
+DEFINE_STATIC_KEY_FALSE(sched_phase);
+DEFINE_STATIC_KEY_FALSE(sched_phase_printk);
+
+static int set_phase_state(bool enabled)
+{
+ int err;
+ int state = static_branch_likely(&sched_phase);
+
+ if (enabled == state) {
+ pr_warn("phase has already %s\n", state ? "enabled" : "disabled");
+ return 0;
+ }
+
+ if (enabled) {
+ err = phase_perf_create();
+ if (err) {
+ pr_err("phase enable failed\n");
+ return err;
+ }
+ static_branch_enable(&sched_phase);
+ pr_info("phase enabled\n");
+ memqos_timer_twin_init();
+ } else {
+ static_branch_disable(&sched_phase);
+ phase_perf_release();
+ pr_info("phase disabled\n");
+ memqos_timer_twin_exit();
+ }
+
+ return 0;
+}
+
+/*
+ * the other procfs files of phase cannot be modified if sched_phase is already enabled
+ */
+static int phase_proc_state(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+ struct ctl_table t;
+ int err;
+ int state = static_branch_likely(&sched_phase);
+
+ if (write && !capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ t = *table;
+ t.data = &state;
+ err = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
+ if (err < 0)
+ return err;
+ if (write)
+ err = set_phase_state(state);
+
+ return err;
+}
+
+static int set_phase_state_printk(bool enabled)
+{
+ if (enabled) {
+ static_branch_enable(&sched_phase_printk);
+ } else {
+ static_branch_disable(&sched_phase_printk);
+ }
+
+ return 0;
+}
+
+/*
+ * the other procfs files of phase cannot be modified if sched_phase is already enabled
+ */
+static int phase_proc_state_printk(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+ struct ctl_table t;
+ int err;
+ int state = static_branch_likely(&sched_phase);
+
+ if (write && !capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ t = *table;
+ t.data = &state;
+ err = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
+ if (err < 0)
+ return err;
+ if (write)
+ err = set_phase_state_printk(state);
+
+ return err;
+}
+
+
+static int __maybe_unused zero;
+static int __maybe_unused one = 1;
+
+struct ctl_table phase_table[] = {
+ {
+ .procname = "enabled",
+ .data = NULL,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+ .proc_handler = phase_proc_state,
+ .extra1 = &zero,
+ .extra2 = &one,
+ },
+ {
+ .procname = "trace_enabled",
+ .data = NULL,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+ .proc_handler = phase_proc_state_printk,
+ .extra1 = &zero,
+ .extra2 = &one,
+ },
+ { }
+};
+#endif /* CONFIG_PROC_SYSCTL */
diff --git a/kernel/sched/memqos/phase_memband.c b/kernel/sched/memqos/phase_memband.c
new file mode 100644
index 0000000000000..df8b2811f6ab7
--- /dev/null
+++ b/kernel/sched/memqos/phase_memband.c
@@ -0,0 +1,179 @@
+#include <linux/types.h>
+#include <linux/cpu.h>
+#include <linux/memqos.h>
+
+#include <asm/cpu.h>
+#include <asm/cputype.h>
+#include <asm/cpufeature.h>
+#include <asm/mpam_sched.h>
+
+static const int nr_partid = 15;
+static const int nr_monitor = 4;
+
+static LIST_HEAD(phase_mpam_waitqueue);
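+/* FIFO of tasks parked while all partids are busy; serviced on label release */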
+
+/* mpam_profile_res[0] is reserved as the "queued, waiting for a free partid" sentinel */
+struct memqos_mpam_profile mpam_profile_res[16] = {
+ { .partid = 0, .monitor = 0, .used = 1},
+ { .partid = 1, .monitor = 0,},
+ { .partid = 2, .monitor = 1,},
+ { .partid = 3, .monitor = 2,},
+ { .partid = 4, .monitor = 3,},
+ { .partid = 5, .monitor = 0,},
+ { .partid = 6, .monitor = 1,},
+ { .partid = 7, .monitor = 2,},
+ { .partid = 8, .monitor = 3,},
+ { .partid = 9, .monitor = 0,},
+ { .partid = 10, .monitor = 1,},
+ { .partid = 11, .monitor = 2,},
+ { .partid = 12, .monitor = 3,},
+ { .partid = 13, .monitor = 0,},
+ { .partid = 14, .monitor = 1,},
+ { .partid = 15, .monitor = 2,},
+};
+
+static DEFINE_SPINLOCK(phase_partid_lock);
+
+void phase_update_mpam_label(struct task_struct *tsk)
+{
+ int i = 0;
+ //unsigned long flag;
+
+ WARN_ON_ONCE(tsk->closid);
+
+ if (tsk->sched_memqos.corrupt) {
+ phase_release_mpam_label(tsk);
+ return;
+ }
+
+ spin_lock(&phase_partid_lock);
+ if (tsk->sched_memqos.mpam_profile.profile != &mpam_profile_res[0] &&
+ tsk->sched_memqos.mpam_profile.profile != NULL) {
+ tsk->closid = tsk->sched_memqos.mpam_profile.profile->partid;
+ tsk->sched_memqos.mpam_profile.profile->tsk = tsk;
+ //tsk->sched_memqos.mpam_profile.profile->used = 1;
+ tsk->rmid = 0;
+ spin_unlock(&phase_partid_lock);
+ //if (static_branch_unlikely(&sched_phase_printk)) {
+ // trace_printk("task pid:%d get partid%d succeed\n", tsk->pid, tsk->closid);
+ //}
+ __mpam_sched_in_v2(tsk);
+ return;
+ }
+
+ //is in profile queue, wait...
+ if (tsk->sched_memqos.mpam_profile.profile == &mpam_profile_res[0]) {
+ spin_unlock(&phase_partid_lock);
+ return;
+ }
+
+ for (i = 1; i < 16; i++) {
+ if (mpam_profile_res[i].used) {
+ if (static_branch_unlikely(&sched_phase_printk)) {
+ //if (mpam_profile_res[i].tsk)
+ // trace_printk("i%d want get partid, butpartid:%d get by pid:%d closid:%d\n",
+ //tsk->pid, i, mpam_profile_res[i].tsk->pid, mpam_profile_res[i].tsk->closid);
+ //else
+ // trace_printk("i%d want get partid, butpartid:%d get by pid:%d(NULL)\n",
+ //tsk->pid, i, tsk->pid);
+ }
+
+ continue;
+ }
+
+ tsk->sched_memqos.mpam_profile.profile = NULL;
+ break;
+ }
+
+ if (i == 16) {
+ list_move_tail(&tsk->sched_memqos.mpam_profile.wait_list, &phase_mpam_waitqueue);
+ tsk->sched_memqos.mpam_profile.profile = &mpam_profile_res[0];
+ spin_unlock(&phase_partid_lock);
+ //if (static_branch_unlikely(&sched_phase_printk)) {
+ // trace_printk("task pid:%d no partid found, go to list\n", tsk->pid);
+ //}
+ //wait...
+ return;
+ }
+
+ mpam_profile_res[i].used = 1;
+ tsk->closid = mpam_profile_res[i].partid;
+ mpam_profile_res[i].tsk = tsk;
+ tsk->sched_memqos.mpam_profile.profile = &mpam_profile_res[i];
+ tsk->rmid = 0;
+ spin_unlock(&phase_partid_lock);
+ //if (static_branch_unlikely(&sched_phase_printk)) {
+ //trace_printk("task pid:%d get partid%d succeed\n", tsk->pid, i);
+ //}
+
+ __mpam_sched_in_v2(tsk);
+}
+
+static void phase_release_mpam_label_without_lock(struct task_struct *tsk)
+{
+ int closid;
+ struct memqos_wait_profile *next;
+
+ lockdep_assert_held(&phase_partid_lock);
+
+ if (tsk->sched_memqos.mpam_profile.profile &&
+ tsk->sched_memqos.mpam_profile.profile->partid) {
+ closid = tsk->sched_memqos.mpam_profile.profile->partid;
+ } else if (tsk->closid == 0) {
+ return;
+ } else {
+ closid = tsk->closid;
+ }
+
+ tsk->closid = 0;
+ tsk->sched_memqos.mpam_profile.profile = NULL;
+ mpam_profile_res[closid].used = 0;
+ mpam_profile_res[closid].tsk = NULL;
+
+ //if (static_branch_unlikely(&sched_phase_printk)) {
+ // trace_printk("task pid:%d release partid%d, list empty:%d\n", tsk->pid, closid, list_empty(&phase_mpam_waitqueue));
+ //}
+
+ next = list_first_entry_or_null(&phase_mpam_waitqueue, struct memqos_wait_profile, wait_list);
+ if (next) {
+ list_del_init(&next->wait_list);
+ mpam_profile_res[closid].used = 1;
+ next->profile = &mpam_profile_res[closid];
+ }
+
+ return;
+}
+
+//task shutdown
+void phase_destroy_waitqueue(struct task_struct *tsk)
+{
+ spin_lock(&phase_partid_lock);
+
+ //if (tsk->sched_memqos.mpam_profile.profile == &mpam_profile_res[0]) {
+ list_del_init(&tsk->sched_memqos.mpam_profile.wait_list);
+ //} else {
+ phase_release_mpam_label_without_lock(tsk);
+ //}
+ spin_unlock(&phase_partid_lock);
+}
+
+void phase_release_mpam_label(struct task_struct *tsk)
+{
+ spin_lock(&phase_partid_lock);
+ phase_release_mpam_label_without_lock(tsk);
+ spin_unlock(&phase_partid_lock);
+}
+
+#include <asm/mpam.h>
+void phase_get_memband(struct memqos_mpam_profile *pm, int *result, int nr)
+{
+ if (pm == &mpam_profile_res[0] || pm == NULL) {
+ result[0] = 0;
+ result[1] = 0;
+ result[2] = 0;
+ result[3] = 0;
+ return;
+ }
+
+ mpam_component_config_mbwu_mon(pm->partid, pm->pmg, pm->monitor, result, nr);
+}
diff --git a/kernel/sched/memqos/phase_perf.c b/kernel/sched/memqos/phase_perf.c
new file mode 100644
index 0000000000000..7b7f37e46f76c
--- /dev/null
+++ b/kernel/sched/memqos/phase_perf.c
@@ -0,0 +1,412 @@
+#include <linux/kernel.h>
+#include <linux/perf_event.h>
+#include <linux/percpu-defs.h>
+#include <linux/slab.h>
+#include <linux/stop_machine.h>
+#include <linux/memqos.h>
+#include <linux/sched.h>
+
+#define PHASE_FEVENT_NUM 3
+
+int *phase_perf_pevents = NULL;
+
+static DEFINE_PER_CPU(__typeof__(struct perf_event *)[PHASE_PEVENT_NUM], cpu_phase_perf_events);
+
+/******************************************
+ * Helpers for phase perf event
+ *****************************************/
+static inline struct perf_event *perf_event_of_cpu(int cpu, int index)
+{
+ return per_cpu(cpu_phase_perf_events, cpu)[index];
+}
+
+static inline struct perf_event **perf_events_of_cpu(int cpu)
+{
+ return per_cpu(cpu_phase_perf_events, cpu);
+}
+
+static inline u64 perf_event_local_pmu_read(struct perf_event *event)
+{
+ return 0; /* early return leaves the code below unreachable: perf reads are disabled in this RFC */
+ if (event->state == PERF_EVENT_STATE_ACTIVE)
+ event->pmu->read(event);
+ return local64_read(&event->count);
+}
+
+/******************************************
+ * Helpers for cpu counters
+ *****************************************/
+static inline u64 read_cpu_counter(int cpu, int index)
+{
+ struct perf_event *event = perf_event_of_cpu(cpu, index);
+
+ if (!event || !event->pmu)
+ return 0;
+
+ return perf_event_local_pmu_read(event);
+}
+
+static struct perf_event_attr *alloc_attr(int event_id)
+{
+ struct perf_event_attr *attr;
+
+ attr = kzalloc(sizeof(struct perf_event_attr), GFP_KERNEL);
+ if (!attr)
+ return ERR_PTR(-ENOMEM);
+
+ attr->type = PERF_TYPE_RAW;
+ attr->config = event_id;
+ attr->size = sizeof(struct perf_event_attr);
+ attr->pinned = 1;
+ attr->disabled = 1;
+ //attr->exclude_hv;
+ //attr->exclude_idle;
+ //attr->exclude_kernel;
+
+ return attr;
+}
+
+static int create_cpu_counter(int cpu, int event_id, int index)
+{
+ struct perf_event_attr *attr = NULL;
+ struct perf_event **events = perf_events_of_cpu(cpu);
+ struct perf_event *event = NULL;
+
+ return 0; /* early return: no perf counters are actually created in this RFC */
+ attr = alloc_attr(event_id);
+ if (IS_ERR(attr))
+ return PTR_ERR(attr);
+
+ event = perf_event_create_kernel_counter(attr, cpu, NULL, NULL, NULL);
+ if (IS_ERR(event)) {
+ pr_err("unable to create perf event (cpu:%i-type:%d-pinned:%d-config:0x%llx) : %ld",
+ cpu, attr->type, attr->pinned, attr->config, PTR_ERR(event));
+ kfree(attr);
+ return PTR_ERR(event);
+ } else {
+ events[index] = event;
+ perf_event_enable(events[index]);
+ if (event->hw.idx == -1) {
+ pr_err("pinned event unable to get onto hardware, perf event (cpu:%i-type:%d-config:0x%llx)",
+ cpu, attr->type, attr->config);
+ kfree(attr);
+ return -EINVAL;
+ }
+ pr_info("create perf_event (cpu:%i-idx:%d-type:%d-pinned:%d-exclude_hv:%d"
+ "-exclude_idle:%d-exclude_kernel:%d-config:0x%llx-addr:%px)",
+ event->cpu, event->hw.idx,
+ event->attr.type, event->attr.pinned, event->attr.exclude_hv,
+ event->attr.exclude_idle, event->attr.exclude_kernel,
+ event->attr.config, event);
+ }
+
+ kfree(attr);
+ return 0;
+}
+
+static int release_cpu_counter(int cpu, int event_id, int index)
+{
+ struct perf_event **events = perf_events_of_cpu(cpu);
+ struct perf_event *event = NULL;
+
+ return 0; /* early return: matches the stubbed create path above */
+ event = events[index];
+
+ if (!event)
+ return 0;
+
+ pr_info("release perf_event (cpu:%i-idx:%d-type:%d-pinned:%d-exclude_hv:%d"
+ "-exclude_idle:%d-exclude_kernel:%d-config:0x%llx)",
+ event->cpu, event->hw.idx,
+ event->attr.type, event->attr.pinned, event->attr.exclude_hv,
+ event->attr.exclude_idle, event->attr.exclude_kernel,
+ event->attr.config);
+
+ perf_event_release_kernel(event);
+ events[index] = NULL;
+
+ return 0;
+}
+
+enum {
+ CYCLES_INDEX = 0,
+ INST_RETIRED_INDEX,
+ PHASE_EVENT_FINAL_TERMINATOR
+};
+
+#define CYCLES 0x0011
+#define INST_RETIRED 0x0008
+
+static int pevents[PHASE_PEVENT_NUM] = {
+ CYCLES,
+ INST_RETIRED,
+ PHASE_EVENT_FINAL_TERMINATOR,
+};
+
+#define for_each_phase_pevents(index, events) \
+ for (index = 0; events != NULL && index < PHASE_PEVENT_NUM && \
+ events[index] != PHASE_EVENT_FINAL_TERMINATOR; index++)
+
+
+/******************************************
+ * Helpers for phase perf
+ *****************************************/
+static int do_pevents(int (*fn)(int, int, int), int cpu)
+{
+ int index;
+ int err;
+
+ for_each_phase_pevents(index, phase_perf_pevents) {
+ err = fn(cpu, phase_perf_pevents[index], index);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+static int __phase_perf_create(void *args)
+{
+ int err;
+ int cpu = raw_smp_processor_id();
+
+ /* create pinned events */
+ pr_info("create pinned events\n");
+ err = do_pevents(create_cpu_counter, cpu);
+ if (err) {
+ pr_err("create pinned events failed\n");
+ do_pevents(release_cpu_counter, cpu);
+ return err;
+ }
+
+ pr_info("[%d] phase class event create success\n", cpu);
+ return 0;
+}
+
+static int do_phase_perf_create(int *pevents, const struct cpumask *cpus)
+{
+ phase_perf_pevents = pevents;
+ return stop_machine(__phase_perf_create, NULL, cpus);
+}
+
+static int __do_phase_perf_release(void *args)
+{
+ int cpu = raw_smp_processor_id();
+
+ /* release pinned events */
+ pr_info("release pinned events\n");
+ do_pevents(release_cpu_counter, cpu);
+
+ pr_info("[%d] phase class event release success\n", cpu);
+ return 0;
+}
+
+static void do_phase_perf_release(const struct cpumask *cpus)
+{
+ stop_machine(__do_phase_perf_release, NULL, cpus);
+ phase_perf_pevents = NULL;
+}
+
+int phase_perf_create(void)
+{
+ return do_phase_perf_create(pevents, cpu_possible_mask);
+}
+
+void phase_perf_release(void)
+{
+ do_phase_perf_release(cpu_possible_mask);
+}
+
+DECLARE_STATIC_KEY_FALSE(sched_phase);
+DECLARE_STATIC_KEY_FALSE(sched_phase_printk);
+
+#define PHASE_EVENT_OVERFLOW (~0ULL)
+
+static inline u64 phase_event_count_sub(u64 curr, u64 prev)
+{
+ if (curr < prev) { /* overflow */
+ u64 tmp = PHASE_EVENT_OVERFLOW - prev;
+ return curr + tmp;
+ } else {
+ return curr - prev;
+ }
+}
+
+static inline void phase_calc_delta(struct task_struct *p,
+ struct phase_event_count *prev,
+ struct phase_event_count *curr,
+ struct phase_event_count *delta)
+{
+ int *pevents = phase_perf_pevents;
+ int index;
+
+ for_each_phase_pevents(index, pevents) {
+ delta->pcount.data[index] = phase_event_count_sub(curr->pcount.data[index], prev->pcount.data[index]);
+ }
+}
+
+static inline u64 phase_data_of_pevent(struct phase_event_pcount *counter, int event_id)
+{
+ int index;
+ int *events = phase_perf_pevents;
+
+ for_each_phase_pevents(index, events) {
+ if (event_id == events[index])
+ return counter->data[index];
+ }
+
+ return 0;
+}
+
+static int cal_ring_history_average(int *history, int nr, int s_pos, int c_nr)
+{
+ int average = 0;
+ int start = ((s_pos - c_nr) + nr) % nr;
+
+ if (start < 0)
+ return 0;
+
+ /* walk the c_nr slots before s_pos; always advance to avoid spinning */
+ for (; start != s_pos; start = (start + 1) % nr) {
+ if (history[start] == 0) {
+ c_nr--;
+ if (c_nr == 0)
+ return 0;
+ continue;
+ }
+ average += history[start];
+ }
+
+ return average / c_nr;
+}
+
+static void __phase_cal_ipcx10(struct task_struct *p, struct phase_event_count *delta)
+{
+ u64 ins;
+ u64 cycles;
+ /* 0 denotes "no valid sample" in the ring history */
+ int ipcx10 = 0;
+
+ /* phase_data_of_pevent() matches raw event ids, not array indices */
+ ins = phase_data_of_pevent(&delta->pcount, INST_RETIRED);
+ cycles = phase_data_of_pevent(&delta->pcount, CYCLES);
+
+ if (cycles)
+ ipcx10 = (ins * 10) / cycles;
+
+ //if (static_branch_unlikely(&sched_phase_printk)) {
+ // trace_printk("ins:%lld cycles:%lld\n", ins, cycles);
+ //}
+
+ p->sched_memqos.ipcx10_history[p->sched_memqos.ipc_ringpos] = ipcx10;
+ p->sched_memqos.ipc_ringpos = (p->sched_memqos.ipc_ringpos + 1) % 10;
+ p->sched_memqos.ipcx10 = cal_ring_history_average(p->sched_memqos.ipcx10_history, 10, p->sched_memqos.ipc_ringpos, 5);
+}
+
+static void __phase_cal_memband_div_10(struct task_struct *p)
+{
+ int pos;
+ int result[4];
+
+ pos = p->sched_memqos.memband_ringpos;
+
+ phase_get_memband(p->sched_memqos.mpam_profile.profile, result, 4);
+
+ //if (static_branch_unlikely(&sched_phase_printk)) {
+ // trace_printk("memband:%d %d %d %d profile:%llx\n", result[0], result[1], result[2], result[3], p->sched_memqos.mpam_profile.profile);
+ //}
+
+ p->sched_memqos.memband_div_10_total[0] = p->sched_memqos.memband_div_10_total[0] - p->sched_memqos.memband_div_10_history[0][pos];
+ p->sched_memqos.memband_div_10_total[0] = p->sched_memqos.memband_div_10_total[0] + result[0] / 10;
+ p->sched_memqos.memband_div_10_history[0][p->sched_memqos.memband_ringpos] = result[0] / 10;
+
+ p->sched_memqos.memband_div_10_total[1] = p->sched_memqos.memband_div_10_total[1] - p->sched_memqos.memband_div_10_history[1][pos];
+ p->sched_memqos.memband_div_10_total[1] = p->sched_memqos.memband_div_10_total[1] + result[1] / 10;
+ p->sched_memqos.memband_div_10_history[1][p->sched_memqos.memband_ringpos] = result[1] / 10;
+
+ p->sched_memqos.memband_div_10_total[2] = p->sched_memqos.memband_div_10_total[2] - p->sched_memqos.memband_div_10_history[2][pos];
+ p->sched_memqos.memband_div_10_total[2] = p->sched_memqos.memband_div_10_total[2] + result[2] / 10;
+ p->sched_memqos.memband_div_10_history[2][p->sched_memqos.memband_ringpos] = result[2] / 10;
+
+ p->sched_memqos.memband_div_10_total[3] = p->sched_memqos.memband_div_10_total[3] - p->sched_memqos.memband_div_10_history[3][pos];
+ p->sched_memqos.memband_div_10_total[3] = p->sched_memqos.memband_div_10_total[3] + result[3] / 10;
+ p->sched_memqos.memband_div_10_history[3][p->sched_memqos.memband_ringpos] = result[3] / 10;
+
+ p->sched_memqos.memband_ringpos = (pos + 1) % 10;
+
+ /* assumed intent: memband_div_10 is the smoothed total across the four nodes */
+ p->sched_memqos.memband_div_10 =
+ cal_ring_history_average(p->sched_memqos.memband_div_10_history[0], 10, pos, 5) +
+ cal_ring_history_average(p->sched_memqos.memband_div_10_history[1], 10, pos, 5) +
+ cal_ring_history_average(p->sched_memqos.memband_div_10_history[2], 10, pos, 5) +
+ cal_ring_history_average(p->sched_memqos.memband_div_10_history[3], 10, pos, 5);
+}
+
+static DEFINE_PER_CPU(struct phase_event_count, prev_phase_event_count);
+static DEFINE_PER_CPU(struct phase_event_count, curr_phase_event_count);
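+/* per-CPU snapshots: phase_account_task() diffs curr against prev for the last slice */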
+
+static void phase_perf_read_events(int cpu, u64 *pdata)
+{
+ int index;
+
+ for_each_phase_pevents(index, phase_perf_pevents) {
+ pdata[index] = read_cpu_counter(cpu, index);
+ }
+}
+
+static inline struct phase_event_count *phase_read_prev(unsigned int cpu)
+{
+ return &per_cpu(prev_phase_event_count, cpu);
+}
+
+static inline struct phase_event_count *phase_read_curr(unsigned int cpu)
+{
+ struct phase_event_count *curr = &per_cpu(curr_phase_event_count, cpu);
+
+ phase_perf_read_events(cpu, curr->pcount.data);
+
+ return curr;
+}
+
+void phase_account_task(struct task_struct *p, int cpu)
+{
+ struct phase_event_count delta;
+ struct phase_event_count *prev, *curr;
+
+ if (!static_branch_likely(&sched_phase))
+ return;
+
+ //if (!sched_core_enabled(cpu_rq(cpu)))
+ // return;
+
+ /* update phase_event_count */
+ prev = phase_read_prev(cpu);
+ curr = phase_read_curr(cpu);
+ phase_calc_delta(p, prev, curr, &delta);
+ *prev = *curr;
+
+ /* calculate phase */
+ __phase_cal_ipcx10(p, &delta);
+ __phase_cal_memband_div_10(p);
+ p->sched_memqos.sample_times++;
+ if ((p->sched_memqos.sample_times % 3) == 0)
+ p->sched_memqos.account_ready = 1;
+}
+
+
+void phase_trace_printk(struct task_struct *p)
+{
+ if (!static_branch_unlikely(&sched_phase_printk))
+ return;
+
+ trace_printk("p->comm:%s(%d) ipcpos:%d ipcx10:%d membandpos:%d memband_div_10:%d numa_score[0]:%d numa_score[1]:%d numa_score[2]:%d numa_score[3]:%d turbo:%d prefered_nid:%d classid:%d partid:%d\n",
+ p->comm, p->pid, p->sched_memqos.ipc_ringpos,\
+ p->sched_memqos.ipcx10, \
+ p->sched_memqos.memband_ringpos,\
+ p->sched_memqos.memband_div_10, \
+ p->sched_memqos.numa_score[0], \
+ p->sched_memqos.numa_score[1], \
+ p->sched_memqos.numa_score[2], \
+ p->sched_memqos.numa_score[3], \
+ p->sched_memqos.turbo, \
+ p->sched_memqos.preferred_nid, \
+ p->sched_memqos.class_id, \
+ p->closid);
+}
diff --git a/kernel/sched/memqos/phase_sim_knn.c b/kernel/sched/memqos/phase_sim_knn.c
new file mode 100644
index 0000000000000..b80bb6b9ae0a3
--- /dev/null
+++ b/kernel/sched/memqos/phase_sim_knn.c
@@ -0,0 +1,92 @@
+#include <linux/types.h>
+
+#define DATA_ROW 16 /* matches the number of initialized training rows below */
+void QuickSort(u64 arr[DATA_ROW][2], int L, int R) {
+ int i = L;
+ int j = R;
+ int kk = (L + R) / 2;
+ u64 pivot = arr[kk][0];
+
+ while (i <= j) {
+ while (pivot > arr[i][0]) {
+ i++;
+ }
+ while (pivot < arr[j][0]) {
+ j--;
+ }
+ if (i <= j) {
+ u64 temp = arr[i][0];
+
+ /* keep the tag column paired with its distance while swapping */
+ arr[i][0] = arr[j][0];
+ arr[j][0] = temp;
+ temp = arr[i][1];
+ arr[i][1] = arr[j][1];
+ arr[j][1] = temp;
+ i++; j--;
+ }
+ }
+ if (L < j) {
+ QuickSort(arr, L, j);
+ }
+ if (i < R) {
+ QuickSort(arr, i, R);
+ }
+}
+
+u64 euclidean_distance(u64 *row1, u64 *row2, int col) {
+ u64 distance = 0;
+ int i;
+
+ for (i = 0; i < col - 1; i++) {
+ distance += ((row1[i] - row2[i]) * (row1[i] - row2[i]));
+ }
+ return distance;
+}
+
+#define num_neighbors 6
+#define MAX_TAG 8
+
+int get_neighbors_tag(u64 train_data[DATA_ROW][3], int train_row, int col, u64 *test_row) {
+ int i;
+ u64 neighbors[MAX_TAG] = {0};
+ int max_tag = 0;
+ u64 distances[DATA_ROW][2];
+
+ for (i = 0; i < train_row; i++) {
+ distances[i][0] = euclidean_distance(train_data[i], test_row, col);
+ distances[i][1] = train_data[i][col - 1];
+ }
+ QuickSort(distances, 0, train_row - 1);
+ for (i = 0; i < num_neighbors; i++) {
+ neighbors[distances[i][1]]++;
+ if (neighbors[distances[i][1]] > neighbors[max_tag])
+ max_tag = distances[i][1];
+ }
+ return max_tag;
+}
+
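+/*
+ * Training rows are {ipc*10, memband/10, class tag}. The IPC column is
+ * all zeros here, so classification effectively keys off memory
+ * bandwidth alone; tags 0-7 map onto the eight memqos classes.
+ */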
+static u64 train_data[DATA_ROW][3] = {
+ {0, 1, 0},
+ {0, 9, 0},
+ {0, 20, 1},
+ {0, 30, 1},
+ {0, 40, 2},
+ {0, 50, 3},
+ {0, 60, 3},
+ {0, 70, 3},
+ {0, 80, 4},
+ {0, 90, 4},
+ {0, 100, 4},
+ {0, 110, 5},
+ {0, 120, 5},
+ {0, 130, 6},
+ {0, 140, 6},
+ {0, 150, 7},
+};
+
+int knn_get_tag(int ipcx10, int memband_div_10)
+{
+ u64 test_data[2];
+
+ test_data[0] = ipcx10;
+ test_data[1] = memband_div_10;
+
+ return get_neighbors_tag(train_data, DATA_ROW, 3, test_data);
+}
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 685f9881b8e23..0d2764c4449ce 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -465,6 +465,13 @@ static struct ctl_table kern_table[] = {
.extra2 = &one,
},
#endif /* CONFIG_NUMA_BALANCING */
+#ifdef CONFIG_QOS_SCHED_DYNAMIC_AFFINITY
+ {
+ .procname = "phase",
+ .mode = 0555,
+ .child = phase_table,
+ },
+#endif
#endif /* CONFIG_SCHED_DEBUG */
{
.procname = "sched_rt_period_us",
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 4cac46d56f387..d748c291e7047 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2164,12 +2164,15 @@ alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma,
{
struct mempolicy *pol;
struct page *page;
- int preferred_nid;
+ int preferred_nid = -1;
nodemask_t *nmask;
+ if (current->sched_memqos.preferred_nid)
+ preferred_nid = current->sched_memqos.preferred_nid - 1;
+
pol = get_vma_policy(vma, addr);
- if (pol->mode == MPOL_INTERLEAVE) {
+ /* a memqos-preferred node overrides interleaving */
+ if (pol->mode == MPOL_INTERLEAVE && preferred_nid == -1) {
unsigned nid;
nid = interleave_nid(pol, vma, addr, PAGE_SHIFT + order);
@@ -2233,7 +2236,8 @@ alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma,
}
nmask = policy_nodemask(gfp, pol);
- preferred_nid = policy_node(gfp, pol, node);
+ if (preferred_nid == -1)
+ preferred_nid = policy_node(gfp, pol, node);
page = __alloc_pages_nodemask(gfp, order, preferred_nid, nmask);
mark_vma_cdm(nmask, page, vma);
mpol_cond_put(pol);
--
2.25.1

23 Mar '23
From: Yixing Liu <liuyixing1(a)huawei.com>
mainline inclusion
from mainline-master
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I6K9B6
CVE: NA
Reference: https://patchwork.kernel.org/project/linux-rdma/cover/20230304091555.224129…
---------------------------------------------------------------
The current resource query for VF caps is driven
by the driver, which is unreasonable. This patch
adds a new command HNS_ROCE_OPC_QUERY_VF_CAPS_NUM
so that VF caps information is obtained from firmware instead.
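In short (a sketch; query_caps_opcode() is an illustrative name only,
the actual change inlines this choice in hns_roce_query_caps() below):

    static enum hns_roce_opcode_type query_caps_opcode(struct hns_roce_dev *hr_dev)
    {
        /* PF and VF share one query path; only the firmware opcode differs */
        return hr_dev->is_vf ? HNS_ROCE_OPC_QUERY_VF_CAPS_NUM :
                               HNS_ROCE_OPC_QUERY_PF_CAPS_NUM;
    }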
Signed-off-by: Yixing Liu <liuyixing1(a)huawei.com>
Signed-off-by: Haoyue Xu <xuhaoyue1(a)hisilicon.com>
Reviewed-by: Yangyang Li <liyangyang20(a)huawei.com>
---
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 203 +++++++--------------
drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 33 +---
2 files changed, 64 insertions(+), 172 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index c9826a010f38..cba78f0eac14 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -2157,102 +2157,6 @@ static int hns_roce_v2_set_bt(struct hns_roce_dev *hr_dev)
return hns_roce_cmq_send(hr_dev, &desc, 1);
}
-/* Use default caps when hns_roce_query_pf_caps() failed or init VF profile */
-static void set_default_caps(struct hns_roce_dev *hr_dev)
-{
- struct hns_roce_caps *caps = &hr_dev->caps;
-
- caps->num_qps = HNS_ROCE_V2_MAX_QP_NUM;
- caps->max_wqes = HNS_ROCE_V2_MAX_WQE_NUM;
- caps->num_cqs = HNS_ROCE_V2_MAX_CQ_NUM;
- caps->num_srqs = HNS_ROCE_V2_MAX_SRQ_NUM;
- caps->min_cqes = HNS_ROCE_MIN_CQE_NUM;
- caps->max_cqes = HNS_ROCE_V2_MAX_CQE_NUM;
- caps->max_sq_sg = HNS_ROCE_V2_MAX_SQ_SGE_NUM;
- caps->max_extend_sg = HNS_ROCE_V2_MAX_EXTEND_SGE_NUM;
- caps->max_rq_sg = HNS_ROCE_V2_MAX_RQ_SGE_NUM;
-
- caps->num_uars = HNS_ROCE_V2_UAR_NUM;
- caps->phy_num_uars = HNS_ROCE_V2_PHY_UAR_NUM;
- caps->num_aeq_vectors = HNS_ROCE_V2_AEQE_VEC_NUM;
- caps->num_other_vectors = HNS_ROCE_V2_ABNORMAL_VEC_NUM;
- caps->num_comp_vectors = 0;
-
- caps->num_mtpts = HNS_ROCE_V2_MAX_MTPT_NUM;
- caps->num_pds = HNS_ROCE_V2_MAX_PD_NUM;
- caps->num_qpc_timer = HNS_ROCE_V2_MAX_QPC_TIMER_NUM;
- caps->cqc_timer_bt_num = HNS_ROCE_V2_MAX_CQC_TIMER_BT_NUM;
-
- caps->max_qp_init_rdma = HNS_ROCE_V2_MAX_QP_INIT_RDMA;
- caps->max_qp_dest_rdma = HNS_ROCE_V2_MAX_QP_DEST_RDMA;
- caps->max_sq_desc_sz = HNS_ROCE_V2_MAX_SQ_DESC_SZ;
- caps->max_rq_desc_sz = HNS_ROCE_V2_MAX_RQ_DESC_SZ;
- caps->max_srq_desc_sz = HNS_ROCE_V2_MAX_SRQ_DESC_SZ;
- caps->irrl_entry_sz = HNS_ROCE_V2_IRRL_ENTRY_SZ;
- caps->trrl_entry_sz = HNS_ROCE_V2_EXT_ATOMIC_TRRL_ENTRY_SZ;
- caps->cqc_entry_sz = HNS_ROCE_V2_CQC_ENTRY_SZ;
- caps->srqc_entry_sz = HNS_ROCE_V2_SRQC_ENTRY_SZ;
- caps->mtpt_entry_sz = HNS_ROCE_V2_MTPT_ENTRY_SZ;
- caps->idx_entry_sz = HNS_ROCE_V2_IDX_ENTRY_SZ;
- caps->page_size_cap = HNS_ROCE_V2_PAGE_SIZE_SUPPORTED;
- caps->reserved_lkey = 0;
- caps->reserved_pds = 0;
- caps->reserved_mrws = 1;
- caps->reserved_uars = 0;
- caps->reserved_cqs = 0;
- caps->reserved_srqs = 0;
- caps->reserved_qps = HNS_ROCE_V2_RSV_QPS;
-
- caps->qpc_hop_num = HNS_ROCE_CONTEXT_HOP_NUM;
- caps->srqc_hop_num = HNS_ROCE_CONTEXT_HOP_NUM;
- caps->cqc_hop_num = HNS_ROCE_CONTEXT_HOP_NUM;
- caps->mpt_hop_num = HNS_ROCE_CONTEXT_HOP_NUM;
- caps->sccc_hop_num = HNS_ROCE_SCCC_HOP_NUM;
-
- caps->mtt_hop_num = HNS_ROCE_MTT_HOP_NUM;
- caps->wqe_sq_hop_num = HNS_ROCE_SQWQE_HOP_NUM;
- caps->wqe_sge_hop_num = HNS_ROCE_EXT_SGE_HOP_NUM;
- caps->wqe_rq_hop_num = HNS_ROCE_RQWQE_HOP_NUM;
- caps->cqe_hop_num = HNS_ROCE_CQE_HOP_NUM;
- caps->srqwqe_hop_num = HNS_ROCE_SRQWQE_HOP_NUM;
- caps->idx_hop_num = HNS_ROCE_IDX_HOP_NUM;
- caps->chunk_sz = HNS_ROCE_V2_TABLE_CHUNK_SIZE;
-
- caps->flags = HNS_ROCE_CAP_FLAG_REREG_MR |
- HNS_ROCE_CAP_FLAG_ROCE_V1_V2 |
- HNS_ROCE_CAP_FLAG_CQ_RECORD_DB |
- HNS_ROCE_CAP_FLAG_QP_RECORD_DB;
-
- caps->pkey_table_len[0] = 1;
- caps->ceqe_depth = HNS_ROCE_V2_COMP_EQE_NUM;
- caps->aeqe_depth = HNS_ROCE_V2_ASYNC_EQE_NUM;
- caps->local_ca_ack_delay = 0;
- caps->max_mtu = IB_MTU_4096;
-
- caps->max_srq_wrs = HNS_ROCE_V2_MAX_SRQ_WR;
- caps->max_srq_sges = HNS_ROCE_V2_MAX_SRQ_SGE;
-
- caps->flags |= HNS_ROCE_CAP_FLAG_ATOMIC | HNS_ROCE_CAP_FLAG_MW |
- HNS_ROCE_CAP_FLAG_SRQ | HNS_ROCE_CAP_FLAG_FRMR |
- HNS_ROCE_CAP_FLAG_QP_FLOW_CTRL | HNS_ROCE_CAP_FLAG_XRC;
-
- caps->gid_table_len[0] = HNS_ROCE_V2_GID_INDEX_NUM;
-
- if (hr_dev->pci_dev->revision >= PCI_REVISION_ID_HIP09) {
- caps->flags |= HNS_ROCE_CAP_FLAG_STASH |
- HNS_ROCE_CAP_FLAG_DIRECT_WQE |
- HNS_ROCE_CAP_FLAG_DCA_MODE;
- caps->max_sq_inline = HNS_ROCE_V3_MAX_SQ_INLINE;
- } else {
- caps->max_sq_inline = HNS_ROCE_V2_MAX_SQ_INLINE;
-
- /* The following configuration are only valid for HIP08 */
- caps->qpc_sz = HNS_ROCE_V2_QPC_SZ;
- caps->sccc_sz = HNS_ROCE_V2_SCCC_SZ;
- caps->cqe_sz = HNS_ROCE_V2_CQE_SIZE;
- }
-}
-
static void calc_pg_sz(u32 obj_num, u32 obj_size, u32 hop_num, u32 ctx_bt_num,
u32 *buf_page_size, u32 *bt_page_size, u32 hem_type)
{
@@ -2395,7 +2299,8 @@ static void apply_func_caps(struct hns_roce_dev *hr_dev)
if (!caps->num_comp_vectors)
caps->num_comp_vectors = min_t(u32, caps->eqc_bt_num - 1,
- (u32)priv->handle->rinfo.num_vectors - 2);
+ (u32)priv->handle->rinfo.num_vectors -
+ (HNS_ROCE_V2_AEQE_VEC_NUM + HNS_ROCE_V2_ABNORMAL_VEC_NUM));
if (hr_dev->pci_dev->revision >= PCI_REVISION_ID_HIP09) {
caps->eqe_hop_num = HNS_ROCE_V3_EQE_HOP_NUM;
@@ -2437,7 +2342,7 @@ static void apply_func_caps(struct hns_roce_dev *hr_dev)
set_hem_page_size(hr_dev);
}
-static int hns_roce_query_pf_caps(struct hns_roce_dev *hr_dev)
+static int hns_roce_query_caps(struct hns_roce_dev *hr_dev)
{
struct hns_roce_cmq_desc desc[HNS_ROCE_QUERY_PF_CAPS_CMD_NUM];
struct hns_roce_caps *caps = &hr_dev->caps;
@@ -2446,15 +2351,17 @@ static int hns_roce_query_pf_caps(struct hns_roce_dev *hr_dev)
struct hns_roce_query_pf_caps_c *resp_c;
struct hns_roce_query_pf_caps_d *resp_d;
struct hns_roce_query_pf_caps_e *resp_e;
+ enum hns_roce_opcode_type cmd;
int ctx_hop_num;
int pbl_hop_num;
int ret;
int i;
+ cmd = hr_dev->is_vf ? HNS_ROCE_OPC_QUERY_VF_CAPS_NUM :
+ HNS_ROCE_OPC_QUERY_PF_CAPS_NUM;
+
for (i = 0; i < HNS_ROCE_QUERY_PF_CAPS_CMD_NUM; i++) {
- hns_roce_cmq_setup_basic_desc(&desc[i],
- HNS_ROCE_OPC_QUERY_PF_CAPS_NUM,
- true);
+ hns_roce_cmq_setup_basic_desc(&desc[i], cmd, true);
if (i < (HNS_ROCE_QUERY_PF_CAPS_CMD_NUM - 1))
desc[i].flag |= cpu_to_le16(HNS_ROCE_CMD_FLAG_NEXT);
else
@@ -2471,38 +2378,38 @@ static int hns_roce_query_pf_caps(struct hns_roce_dev *hr_dev)
resp_d = (struct hns_roce_query_pf_caps_d *)desc[3].data;
resp_e = (struct hns_roce_query_pf_caps_e *)desc[4].data;
- caps->local_ca_ack_delay = resp_a->local_ca_ack_delay;
- caps->max_sq_sg = le16_to_cpu(resp_a->max_sq_sg);
- caps->max_sq_inline = le16_to_cpu(resp_a->max_sq_inline);
- caps->max_rq_sg = le16_to_cpu(resp_a->max_rq_sg);
+ caps->local_ca_ack_delay = resp_a->local_ca_ack_delay;
+ caps->max_sq_sg = le16_to_cpu(resp_a->max_sq_sg);
+ caps->max_sq_inline = le16_to_cpu(resp_a->max_sq_inline);
+ caps->max_rq_sg = le16_to_cpu(resp_a->max_rq_sg);
caps->max_rq_sg = roundup_pow_of_two(caps->max_rq_sg);
- caps->max_extend_sg = le32_to_cpu(resp_a->max_extend_sg);
- caps->num_qpc_timer = le16_to_cpu(resp_a->num_qpc_timer);
- caps->max_srq_sges = le16_to_cpu(resp_a->max_srq_sges);
+ caps->max_extend_sg = le32_to_cpu(resp_a->max_extend_sg);
+ caps->num_qpc_timer = le16_to_cpu(resp_a->num_qpc_timer);
+ caps->max_srq_sges = le16_to_cpu(resp_a->max_srq_sges);
caps->max_srq_sges = roundup_pow_of_two(caps->max_srq_sges);
- caps->num_aeq_vectors = resp_a->num_aeq_vectors;
- caps->num_other_vectors = resp_a->num_other_vectors;
- caps->max_sq_desc_sz = resp_a->max_sq_desc_sz;
- caps->max_rq_desc_sz = resp_a->max_rq_desc_sz;
- caps->max_srq_desc_sz = resp_a->max_srq_desc_sz;
- caps->cqe_sz = resp_a->cqe_sz;
-
- caps->mtpt_entry_sz = resp_b->mtpt_entry_sz;
- caps->irrl_entry_sz = resp_b->irrl_entry_sz;
- caps->trrl_entry_sz = resp_b->trrl_entry_sz;
- caps->cqc_entry_sz = resp_b->cqc_entry_sz;
- caps->srqc_entry_sz = resp_b->srqc_entry_sz;
- caps->idx_entry_sz = resp_b->idx_entry_sz;
- caps->sccc_sz = resp_b->sccc_sz;
- caps->max_mtu = resp_b->max_mtu;
- caps->qpc_sz = le16_to_cpu(resp_b->qpc_sz);
- caps->min_cqes = resp_b->min_cqes;
- caps->min_wqes = resp_b->min_wqes;
- caps->page_size_cap = le32_to_cpu(resp_b->page_size_cap);
- caps->pkey_table_len[0] = resp_b->pkey_table_len;
- caps->phy_num_uars = resp_b->phy_num_uars;
- ctx_hop_num = resp_b->ctx_hop_num;
- pbl_hop_num = resp_b->pbl_hop_num;
+ caps->num_aeq_vectors = resp_a->num_aeq_vectors;
+ caps->num_other_vectors = resp_a->num_other_vectors;
+ caps->max_sq_desc_sz = resp_a->max_sq_desc_sz;
+ caps->max_rq_desc_sz = resp_a->max_rq_desc_sz;
+ caps->max_srq_desc_sz = resp_a->max_srq_desc_sz;
+ caps->cqe_sz = resp_a->cqe_sz;
+
+ caps->mtpt_entry_sz = resp_b->mtpt_entry_sz;
+ caps->irrl_entry_sz = resp_b->irrl_entry_sz;
+ caps->trrl_entry_sz = resp_b->trrl_entry_sz;
+ caps->cqc_entry_sz = resp_b->cqc_entry_sz;
+ caps->srqc_entry_sz = resp_b->srqc_entry_sz;
+ caps->idx_entry_sz = resp_b->idx_entry_sz;
+ caps->sccc_sz = resp_b->sccc_sz;
+ caps->max_mtu = resp_b->max_mtu;
+ caps->qpc_sz = le16_to_cpu(resp_b->qpc_sz);
+ caps->min_cqes = resp_b->min_cqes;
+ caps->min_wqes = resp_b->min_wqes;
+ caps->page_size_cap = le32_to_cpu(resp_b->page_size_cap);
+ caps->pkey_table_len[0] = resp_b->pkey_table_len;
+ caps->phy_num_uars = resp_b->phy_num_uars;
+ ctx_hop_num = resp_b->ctx_hop_num;
+ pbl_hop_num = resp_b->pbl_hop_num;
caps->num_pds = 1 << hr_reg_read(resp_c, PF_CAPS_C_NUM_PDS);
@@ -2525,8 +2432,6 @@ static int hns_roce_query_pf_caps(struct hns_roce_dev *hr_dev)
caps->ceqe_depth = 1 << hr_reg_read(resp_d, PF_CAPS_D_CEQ_DEPTH);
caps->num_comp_vectors = hr_reg_read(resp_d, PF_CAPS_D_NUM_CEQS);
caps->aeqe_depth = 1 << hr_reg_read(resp_d, PF_CAPS_D_AEQ_DEPTH);
- caps->default_aeq_arm_st = hr_reg_read(resp_d, PF_CAPS_D_AEQ_ARM_ST);
- caps->default_ceq_arm_st = hr_reg_read(resp_d, PF_CAPS_D_CEQ_ARM_ST);
caps->reserved_pds = hr_reg_read(resp_d, PF_CAPS_D_RSV_PDS);
caps->num_uars = 1 << hr_reg_read(resp_d, PF_CAPS_D_NUM_UARS);
caps->reserved_qps = hr_reg_read(resp_d, PF_CAPS_D_RSV_QPS);
@@ -2537,10 +2442,6 @@ static int hns_roce_query_pf_caps(struct hns_roce_dev *hr_dev)
caps->reserved_cqs = hr_reg_read(resp_e, PF_CAPS_E_RSV_CQS);
caps->reserved_srqs = hr_reg_read(resp_e, PF_CAPS_E_RSV_SRQS);
caps->reserved_lkey = hr_reg_read(resp_e, PF_CAPS_E_RSV_LKEYS);
- caps->default_ceq_max_cnt = le16_to_cpu(resp_e->ceq_max_cnt);
- caps->default_ceq_period = le16_to_cpu(resp_e->ceq_period);
- caps->default_aeq_max_cnt = le16_to_cpu(resp_e->aeq_max_cnt);
- caps->default_aeq_period = le16_to_cpu(resp_e->aeq_period);
caps->qpc_hop_num = ctx_hop_num;
caps->sccc_hop_num = ctx_hop_num;
@@ -2557,6 +2458,20 @@ static int hns_roce_query_pf_caps(struct hns_roce_dev *hr_dev)
if (!(caps->page_size_cap & PAGE_SIZE))
caps->page_size_cap = HNS_ROCE_V2_PAGE_SIZE_SUPPORTED;
+
+ if (!hr_dev->is_vf) {
+ caps->cqe_sz = resp_a->cqe_sz;
+ caps->qpc_sz = le16_to_cpu(resp_b->qpc_sz);
+ caps->default_aeq_arm_st =
+ hr_reg_read(resp_d, PF_CAPS_D_AEQ_ARM_ST);
+ caps->default_ceq_arm_st =
+ hr_reg_read(resp_d, PF_CAPS_D_CEQ_ARM_ST);
+ caps->default_ceq_max_cnt = le16_to_cpu(resp_e->ceq_max_cnt);
+ caps->default_ceq_period = le16_to_cpu(resp_e->ceq_period);
+ caps->default_aeq_max_cnt = le16_to_cpu(resp_e->aeq_max_cnt);
+ caps->default_aeq_period = le16_to_cpu(resp_e->aeq_period);
+ }
+
return 0;
}
@@ -2626,7 +2541,11 @@ static int hns_roce_v2_vf_profile(struct hns_roce_dev *hr_dev)
hr_dev->func_num = 1;
- set_default_caps(hr_dev);
+ ret = hns_roce_query_caps(hr_dev);
+ if (ret) {
+ dev_err(dev, "failed to query VF caps, ret = %d.\n", ret);
+ return ret;
+ }
ret = hns_roce_query_vf_resource(hr_dev);
if (ret) {
@@ -2666,9 +2585,11 @@ static int hns_roce_v2_pf_profile(struct hns_roce_dev *hr_dev)
return ret;
}
- ret = hns_roce_query_pf_caps(hr_dev);
- if (ret)
- set_default_caps(hr_dev);
+ ret = hns_roce_query_caps(hr_dev);
+ if (ret) {
+ dev_err(dev, "failed to query PF caps, ret = %d.\n", ret);
+ return ret;
+ }
ret = hns_roce_query_pf_resource(hr_dev);
if (ret) {
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
index 90401577865e..e5f3a4639bf3 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
@@ -35,46 +35,16 @@
#include <linux/bitops.h>
-#define HNS_ROCE_V2_MAX_QP_NUM 0x1000
-#define HNS_ROCE_V2_MAX_QPC_TIMER_NUM 0x200
-#define HNS_ROCE_V2_MAX_WQE_NUM 0x8000
-#define HNS_ROCE_V2_MAX_SRQ_WR 0x8000
-#define HNS_ROCE_V2_MAX_SRQ_SGE 64
-#define HNS_ROCE_V2_MAX_CQ_NUM 0x100000
-#define HNS_ROCE_V2_MAX_CQC_TIMER_BT_NUM 0x100
-#define HNS_ROCE_V2_MAX_SRQ_NUM 0x100000
-#define HNS_ROCE_V2_MAX_CQE_NUM 0x400000
-#define HNS_ROCE_V2_MAX_RQ_SGE_NUM 64
-#define HNS_ROCE_V2_MAX_SQ_SGE_NUM 64
-#define HNS_ROCE_V2_MAX_EXTEND_SGE_NUM 0x200000
-#define HNS_ROCE_V2_MAX_SQ_INLINE 0x20
-#define HNS_ROCE_V3_MAX_SQ_INLINE 0x400
#define HNS_ROCE_V2_MAX_RC_INL_INN_SZ 32
-#define HNS_ROCE_V2_UAR_NUM 256
-#define HNS_ROCE_V2_PHY_UAR_NUM 1
+#define HNS_ROCE_V2_MTT_ENTRY_SZ 64
#define HNS_ROCE_V2_AEQE_VEC_NUM 1
#define HNS_ROCE_V2_ABNORMAL_VEC_NUM 1
-#define HNS_ROCE_V2_MAX_MTPT_NUM 0x100000
#define HNS_ROCE_V2_MAX_MTT_SEGS 0x1000000
#define HNS_ROCE_V2_MAX_SRQWQE_SEGS 0x1000000
#define HNS_ROCE_V2_MAX_IDX_SEGS 0x1000000
-#define HNS_ROCE_V2_MAX_PD_NUM 0x1000000
#define HNS_ROCE_V2_MAX_XRCD_NUM 0x1000000
#define HNS_ROCE_V2_RSV_XRCD_NUM 0
-#define HNS_ROCE_V2_MAX_QP_INIT_RDMA 128
-#define HNS_ROCE_V2_MAX_QP_DEST_RDMA 128
-#define HNS_ROCE_V2_MAX_SQ_DESC_SZ 64
-#define HNS_ROCE_V2_MAX_RQ_DESC_SZ 16
-#define HNS_ROCE_V2_MAX_SRQ_DESC_SZ 64
-#define HNS_ROCE_V2_IRRL_ENTRY_SZ 64
-#define HNS_ROCE_V2_EXT_ATOMIC_TRRL_ENTRY_SZ 100
-#define HNS_ROCE_V2_CQC_ENTRY_SZ 64
-#define HNS_ROCE_V2_SRQC_ENTRY_SZ 64
-#define HNS_ROCE_V2_MTPT_ENTRY_SZ 64
-#define HNS_ROCE_V2_MTT_ENTRY_SZ 64
-#define HNS_ROCE_V2_IDX_ENTRY_SZ 4
-#define HNS_ROCE_V2_SCCC_SZ 32
#define HNS_ROCE_V3_SCCC_SZ 64
#define HNS_ROCE_V3_GMV_ENTRY_SZ 32
@@ -242,6 +212,7 @@ enum hns_roce_opcode_type {
HNS_ROCE_OPC_QUERY_FUNC_INFO = 0x8407,
HNS_ROCE_OPC_QUERY_PF_CAPS_NUM = 0x8408,
HNS_ROCE_OPC_CFG_ENTRY_SIZE = 0x8409,
+ HNS_ROCE_OPC_QUERY_VF_CAPS_NUM = 0x8410,
HNS_ROCE_OPC_CFG_SGID_TB = 0x8500,
HNS_ROCE_OPC_CFG_SMAC_TB = 0x8501,
HNS_ROCE_OPC_POST_MB = 0x8504,
--
2.30.0

[RFC PATCH openEuler-1.0-LTS] sched: memqos: add memqos for dynamic affinity
by Wang ShaoBo 22 Mar '23
22 Mar '23
Add a debug memband interface to dynamic affinity; this is
useful for threads that are sensitive to memory bandwidth.
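The new knobs land under /proc/sys/kernel/phase/ (see the sysctl hunks
below). A minimal userspace sketch to switch sampling and tracing on;
phase_sysctl_write() is an illustrative helper, the paths come from this
patch's phase_table/kern_table:

    #include <stdio.h>

    /* illustrative helper: write a value to one of the phase sysctls */
    static int phase_sysctl_write(const char *name, const char *val)
    {
        char path[128];
        FILE *f;

        snprintf(path, sizeof(path), "/proc/sys/kernel/phase/%s", name);
        f = fopen(path, "w");
        if (!f)
            return -1;
        fputs(val, f);
        return fclose(f);
    }

    int main(void)
    {
        if (phase_sysctl_write("enabled", "1"))
            return 1;
        return phase_sysctl_write("trace_enabled", "1") ? 1 : 0;
    }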
Signed-off-by: Wang ShaoBo <bobo.shaobowang(a)huawei.com>
---
arch/arm64/include/asm/mpam.h | 2 +
arch/arm64/include/asm/mpam_sched.h | 2 +
arch/arm64/kernel/mpam/mpam_device.c | 58 ++-
arch/arm64/kernel/mpam/mpam_resctrl.c | 65 ++++
include/linux/memqos.h | 142 +++++++
include/linux/sched.h | 14 +-
include/linux/sysctl.h | 2 +
kernel/cgroup/cpuset.c | 1 +
kernel/exit.c | 3 +
kernel/fork.c | 4 +
kernel/sched/Makefile | 1 +
kernel/sched/core.c | 29 +-
kernel/sched/fair.c | 14 +-
kernel/sched/memqos/Makefile | 6 +
kernel/sched/memqos/memqos.c | 297 +++++++++++++++
kernel/sched/memqos/phase_feature_sysctl.c | 126 +++++++
kernel/sched/memqos/phase_memband.c | 145 ++++++++
kernel/sched/memqos/phase_perf.c | 409 +++++++++++++++++++++
kernel/sched/memqos/phase_sim_knn.c | 92 +++++
kernel/sysctl.c | 7 +
mm/mempolicy.c | 10 +-
21 files changed, 1409 insertions(+), 20 deletions(-)
create mode 100644 include/linux/memqos.h
create mode 100644 kernel/sched/memqos/Makefile
create mode 100644 kernel/sched/memqos/memqos.c
create mode 100644 kernel/sched/memqos/phase_feature_sysctl.c
create mode 100644 kernel/sched/memqos/phase_memband.c
create mode 100644 kernel/sched/memqos/phase_perf.c
create mode 100644 kernel/sched/memqos/phase_sim_knn.c
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h
index 6338eab817e75..269a91d8ca907 100644
--- a/arch/arm64/include/asm/mpam.h
+++ b/arch/arm64/include/asm/mpam.h
@@ -4,6 +4,8 @@
#ifdef CONFIG_MPAM
extern int mpam_rmid_to_partid_pmg(int rmid, int *partid, int *pmg);
+
+void mpam_component_config_mbwu_mon(int partid, int pmg, int monitor, int *result, int nr);
#endif
#endif /* _ASM_ARM64_MPAM_H */
diff --git a/arch/arm64/include/asm/mpam_sched.h b/arch/arm64/include/asm/mpam_sched.h
index 08ed349b6efa1..32d08cf654b31 100644
--- a/arch/arm64/include/asm/mpam_sched.h
+++ b/arch/arm64/include/asm/mpam_sched.h
@@ -40,6 +40,8 @@ static inline void mpam_sched_in(void)
__mpam_sched_in();
}
+void __mpam_sched_in_v2(struct task_struct *tsk);
+
#else
static inline void mpam_sched_in(void) {}
diff --git a/arch/arm64/kernel/mpam/mpam_device.c b/arch/arm64/kernel/mpam/mpam_device.c
index 6455c69f132fd..48de3982a0b9a 100644
--- a/arch/arm64/kernel/mpam/mpam_device.c
+++ b/arch/arm64/kernel/mpam/mpam_device.c
@@ -84,14 +84,14 @@ void mpam_class_list_lock_held(void)
static inline u32 mpam_read_reg(struct mpam_device *dev, u16 reg)
{
WARN_ON_ONCE(reg > SZ_MPAM_DEVICE);
- assert_spin_locked(&dev->lock);
+ //assert_spin_locked(&dev->lock);
/*
* If we touch a device that isn't accessible from this CPU we may get
* an external-abort.
*/
- WARN_ON_ONCE(preemptible());
- WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &dev->fw_affinity));
+ //WARN_ON_ONCE(preemptible());
+ //WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &dev->fw_affinity));
return readl_relaxed(dev->mapped_hwpage + reg);
}
@@ -99,14 +99,14 @@ static inline u32 mpam_read_reg(struct mpam_device *dev, u16 reg)
static inline void mpam_write_reg(struct mpam_device *dev, u16 reg, u32 val)
{
WARN_ON_ONCE(reg > SZ_MPAM_DEVICE);
- assert_spin_locked(&dev->lock);
+ //assert_spin_locked(&dev->lock);
/*
* If we touch a device that isn't accessible from this CPU we may get
* an external-abort. If we're lucky, we corrupt another mpam:component.
*/
- WARN_ON_ONCE(preemptible());
- WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &dev->fw_affinity));
+ //WARN_ON_ONCE(preemptible());
+ //WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &dev->fw_affinity));
writel_relaxed(val, dev->mapped_hwpage + reg);
}
@@ -1208,6 +1208,7 @@ static u32 mpam_device_read_mbwu_mon(struct mpam_device *dev,
{
u16 mon;
u32 clt, flt, cur_clt, cur_flt;
+ u32 total = 0;
mon = args->mon;
@@ -1249,7 +1250,12 @@ static u32 mpam_device_read_mbwu_mon(struct mpam_device *dev,
wmb();
}
- return mpam_read_reg(dev, MSMON_MBWU);
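+ /* average five back-to-back reads to smooth out sampling jitter */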
+ total += mpam_read_reg(dev, MSMON_MBWU);
+ total += mpam_read_reg(dev, MSMON_MBWU);
+ total += mpam_read_reg(dev, MSMON_MBWU);
+ total += mpam_read_reg(dev, MSMON_MBWU);
+ total += mpam_read_reg(dev, MSMON_MBWU);
+ return total / 5;
}
static int mpam_device_frob_mon(struct mpam_device *dev,
@@ -1470,6 +1476,44 @@ static void mpam_component_device_sync(void *__ctx)
cpumask_set_cpu(smp_processor_id(), &ctx->updated_on);
}
+static DEFINE_SPINLOCK(mpam_tmp_lock);
+
+void mpam_component_config_mbwu_mon(int partid, int pmg, int monitor, int *result, int nr)
+{
+ struct mpam_class *class;
+ struct mpam_component *comp;
+ struct mpam_device *dev;
+ struct sync_args args;
+ int i = 0;
+
+ args.pmg = pmg;
+ args.mon = monitor;
+ args.closid.reqpartid = partid;
+ args.match_pmg = 1;
+
+ spin_lock(&mpam_tmp_lock);
+ list_for_each_entry(class, &mpam_classes, classes_list) {
+ if (class->type != MPAM_CLASS_MEMORY)
+ continue;
+
+ list_for_each_entry(comp, &class->components, class_list) {
+ if (i >= nr) {
+ pr_err_once("error, i >= result nr");
+ break;
+ }
+ result[i] = 0;
+ list_for_each_entry(dev, &comp->devices, comp_list) {
+ result[i] += mpam_device_read_mbwu_mon(dev, &args);
+ }
+ i++;
+ }
+ break;
+ }
+ spin_unlock(&mpam_tmp_lock);
+
+}
+EXPORT_SYMBOL(mpam_component_config_mbwu_mon);
+
/**
* in some cases/platforms the MSC register access is only possible with
* the associated CPUs. And need to check if those CPUS are online before
diff --git a/arch/arm64/kernel/mpam/mpam_resctrl.c b/arch/arm64/kernel/mpam/mpam_resctrl.c
index 60d3d8706a38b..f4d87964616f2 100644
--- a/arch/arm64/kernel/mpam/mpam_resctrl.c
+++ b/arch/arm64/kernel/mpam/mpam_resctrl.c
@@ -2226,6 +2226,71 @@ int mpam_resctrl_init(void)
return resctrl_group_init();
}
+
+void __mpam_sched_in_v2(struct task_struct *tsk)
+{
+ struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
+ u64 partid_d, partid_i;
+ u64 rmid = state->default_rmid;
+ u64 closid = state->default_closid;
+ u64 reqpartid = 0;
+ u64 pmg = 0;
+
+ /*
+ * If this task has a closid/rmid assigned, use it.
+ * Else use the closid/rmid assigned to this cpu.
+ */
+ if (static_branch_likely(&resctrl_alloc_enable_key)) {
+ if (tsk->closid)
+ closid = tsk->closid;
+ }
+
+ if (static_branch_likely(&resctrl_mon_enable_key)) {
+ if (tsk->rmid)
+ rmid = tsk->rmid;
+ }
+
+ if (closid != state->cur_closid || rmid != state->cur_rmid) {
+ u64 reg;
+
+ resctrl_navie_rmid_partid_pmg(rmid, (int *)&reqpartid, (int *)&pmg);
+
+ if (resctrl_cdp_enabled) {
+ resctrl_cdp_mpamid_map_val(reqpartid, CDP_DATA, partid_d);
+ resctrl_cdp_mpamid_map_val(reqpartid, CDP_CODE, partid_i);
+
+ /* set in EL0 */
+ reg = mpam_read_sysreg_s(SYS_MPAM0_EL1, "SYS_MPAM0_EL1");
+ reg = PARTID_D_SET(reg, partid_d);
+ reg = PARTID_I_SET(reg, partid_i);
+ reg = PMG_SET(reg, pmg);
+ mpam_write_sysreg_s(reg, SYS_MPAM0_EL1, "SYS_MPAM0_EL1");
+
+ /* set in EL1 */
+ reg = mpam_read_sysreg_s(SYS_MPAM1_EL1, "SYS_MPAM1_EL1");
+ reg = PARTID_D_SET(reg, partid_d);
+ reg = PARTID_I_SET(reg, partid_i);
+ reg = PMG_SET(reg, pmg);
+ mpam_write_sysreg_s(reg, SYS_MPAM1_EL1, "SYS_MPAM1_EL1");
+ } else {
+ /* set in EL0 */
+ reg = mpam_read_sysreg_s(SYS_MPAM0_EL1, "SYS_MPAM0_EL1");
+ reg = PARTID_SET(reg, reqpartid);
+ reg = PMG_SET(reg, pmg);
+ mpam_write_sysreg_s(reg, SYS_MPAM0_EL1, "SYS_MPAM0_EL1");
+
+ /* set in EL1 */
+ reg = mpam_read_sysreg_s(SYS_MPAM1_EL1, "SYS_MPAM1_EL1");
+ reg = PARTID_SET(reg, reqpartid);
+ reg = PMG_SET(reg, pmg);
+ mpam_write_sysreg_s(reg, SYS_MPAM1_EL1, "SYS_MPAM1_EL1");
+ }
+
+ state->cur_rmid = rmid;
+ state->cur_closid = closid;
+ }
+}
+
/*
* __intel_rdt_sched_in() - Writes the task's CLOSid/RMID to IA32_PQR_MSR
*
diff --git a/include/linux/memqos.h b/include/linux/memqos.h
new file mode 100644
index 0000000000000..814e9935590d3
--- /dev/null
+++ b/include/linux/memqos.h
@@ -0,0 +1,142 @@
+#ifndef _MEMQOS_H
+#define _MEMQOS_H
+
+#include <linux/vmstat.h>
+#include <linux/rbtree.h>
+//#include <linux/sched.h>
+
+struct task_struct;
+
+struct memqos_domain {
+ int dom_id;
+ int total_memband_div_10;
+ int total_out_memband_div_10;
+
+ //record the last 10 samples
+ int memband_ringpos;
+ int memband_div_10_history[4][10];
+};
+
+struct memqos_mpam_profile {
+ int partid;
+ int pmg;
+ int monitor;
+
+ struct task_struct *tsk;
+ int used;
+};
+
+struct memqos_wait_profile {
+ struct memqos_mpam_profile *profile;
+ struct list_head wait_list;
+};
+
+struct memqos_class {
+ struct list_head turbo_list;
+ struct list_head tasks_list;
+};
+
+#include <linux/topology.h>
+//embed in task_struct
+
+struct task_memqos {
+ int ipc_ringpos;
+ int ipcx10;
+ int ipcx10_total[4];
+ int ipcx10_history[10];
+
+ int memband_div_10;
+ int memband_ringpos;
+ int memband_div_10_total[4];
+ int memband_div_10_history[4][10];
+
+ u32 sample_times;
+ int account_ready;
+ int numa_score[4];
+ int turbo;
+
+ struct memqos_wait_profile mpam_profile;
+
+ struct list_head turbo_list;
+ struct list_head task_list;
+
+ struct cpumask *advise_mem_node_mask;
+ int preferred_nid;
+
+ int class_id;
+
+ int corrupt;
+};
+
+#define PHASE_PEVENT_NUM 10
+
+struct phase_event_pcount {
+ u64 data[PHASE_PEVENT_NUM];
+};
+
+struct phase_event_count {
+ struct phase_event_pcount pcount;
+};
+
+void phase_update_mpam_label(struct task_struct *tsk);
+
+void phase_release_mpam_label(struct task_struct *tsk);
+
+static inline void memqos_update_mpam_label(struct task_struct *tsk)
+{
+ phase_update_mpam_label(tsk);
+}
+
+static inline void memqos_release_mpam_label(struct task_struct *tsk)
+{
+ phase_release_mpam_label(tsk);
+}
+
+void phase_destroy_waitqueue(struct task_struct *tsk);
+
+void phase_get_memband(struct memqos_mpam_profile *pm, int *result, int nr);
+
+DECLARE_STATIC_KEY_FALSE(sched_phase);
+DECLARE_STATIC_KEY_FALSE(sched_phase_printk);
+
+int phase_perf_create(void);
+
+void phase_perf_release(void);
+
+void memqos_account_task(struct task_struct *p, int cpu);
+
+void memqos_drop_class(struct task_struct *p);
+
+void phase_account_task(struct task_struct *p, int cpu);
+
+static inline void memqos_task_collect_data(struct task_struct *p, int cpu)
+{
+ phase_account_task(p, cpu);
+}
+
+static inline void memqos_task_account(struct task_struct *p, int cpu)
+{
+ memqos_account_task(p, cpu);
+}
+
+static inline void memqos_task_exit(struct task_struct *p)
+{
+
+ memqos_drop_class(p);
+ phase_destroy_waitqueue(p);
+}
+
+void memqos_select_nicest_cpus(struct task_struct *p);
+
+void memqos_exclude_low_level_task_single(struct task_struct *p);
+
+int knn_get_tag(int ipcx10, int memband_div_10);
+
+void memqos_init_class(struct task_struct *p);
+
+void phase_trace_printk(struct task_struct *p);
+static inline void memqos_trace_printk(struct task_struct *p)
+{
+ phase_trace_printk(p);
+}
+#endif
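Taken together, the header wires memqos into four scheduler paths used
later in this patch: fork (memqos_init_class), context switch
(memqos_task_account, memqos_release_mpam_label, memqos_update_mpam_label),
the scheduler tick (memqos_task_collect_data) and task exit
(memqos_task_exit).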
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 928186f161000..5f710dc5bc03b 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -29,6 +29,7 @@
#include <linux/task_io_accounting.h>
#include <linux/rseq.h>
#include <linux/thread_bits.h>
+#include <linux/memqos.h>
/* task_struct member predeclarations (sorted alphabetically): */
struct audit_context;
@@ -1268,7 +1269,7 @@ struct task_struct {
#if !defined(__GENKSYMS__)
#if defined(CONFIG_QOS_SCHED_DYNAMIC_AFFINITY)
cpumask_t *prefer_cpus;
- const cpumask_t *select_cpus;
+ cpumask_t *select_cpus;
#else
KABI_RESERVE(6)
KABI_RESERVE(7)
@@ -1279,6 +1280,10 @@ struct task_struct {
#endif
KABI_RESERVE(8)
+#ifdef CONFIG_QOS_SCHED_DYNAMIC_AFFINITY
+ struct task_memqos sched_memqos;
+#endif
+
/* CPU-specific state of this task: */
struct thread_struct thread;
@@ -1998,6 +2003,13 @@ int set_prefer_cpus_ptr(struct task_struct *p,
const struct cpumask *new_mask);
int sched_prefer_cpus_fork(struct task_struct *p, struct task_struct *orig);
void sched_prefer_cpus_free(struct task_struct *p);
+static inline bool prefer_cpus_valid(struct task_struct *p)
+{
+ return p->prefer_cpus &&
+ !cpumask_empty(p->prefer_cpus) &&
+ !cpumask_equal(p->prefer_cpus, &p->cpus_allowed) &&
+ cpumask_subset(p->prefer_cpus, &p->cpus_allowed);
+}
#endif
#endif
diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index b769ecfcc3bd4..73bce39107cb3 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -230,6 +230,8 @@ static inline void setup_sysctl_set(struct ctl_table_set *p,
#endif /* CONFIG_SYSCTL */
+extern struct ctl_table phase_table[];
+
int sysctl_max_threads(struct ctl_table *table, int write,
void __user *buffer, size_t *lenp, loff_t *ppos);
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 55bfbc4cdb16c..d94a9065a5605 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -106,6 +106,7 @@ struct cpuset {
nodemask_t mems_allowed;
#ifdef CONFIG_QOS_SCHED_DYNAMIC_AFFINITY
cpumask_var_t prefer_cpus;
+ int mem_turbo;
#endif
/* effective CPUs and Memory Nodes allow to tasks */
diff --git a/kernel/exit.c b/kernel/exit.c
index 2a32d32bdc03d..b731c19618176 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -699,6 +699,8 @@ static void check_stack_usage(void)
static inline void check_stack_usage(void) {}
#endif
+#include <linux/memqos.h>
+
void __noreturn do_exit(long code)
{
struct task_struct *tsk = current;
@@ -806,6 +808,7 @@ void __noreturn do_exit(long code)
* because of cgroup mode, must be called before cgroup_exit()
*/
perf_event_exit_task(tsk);
+ memqos_task_exit(tsk);
sched_autogroup_exit_task(tsk);
cgroup_exit(tsk);
diff --git a/kernel/fork.c b/kernel/fork.c
index b5453a26655e2..0a762b92dc814 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -841,6 +841,8 @@ void set_task_stack_end_magic(struct task_struct *tsk)
*stackend = STACK_END_MAGIC; /* for overflow detection */
}
+
+#include <linux/memqos.h>
static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
{
struct task_struct *tsk;
@@ -923,6 +925,8 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
kcov_task_init(tsk);
+ memqos_init_class(tsk);
+
#ifdef CONFIG_FAULT_INJECTION
tsk->fail_nth = 0;
#endif
diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
index 7fe183404c383..471380d6686e3 100644
--- a/kernel/sched/Makefile
+++ b/kernel/sched/Makefile
@@ -29,3 +29,4 @@ obj-$(CONFIG_CPU_FREQ) += cpufreq.o
obj-$(CONFIG_CPU_FREQ_GOV_SCHEDUTIL) += cpufreq_schedutil.o
obj-$(CONFIG_MEMBARRIER) += membarrier.o
obj-$(CONFIG_CPU_ISOLATION) += isolation.o
+obj-$(CONFIG_QOS_SCHED_DYNAMIC_AFFINITY) += memqos/
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 970616070da86..1171025aaa440 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2787,6 +2787,8 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev)
calculate_sigpending();
}
+#include <linux/memqos.h>
+
/*
* context_switch - switch to the new MM and the new thread's register state.
*/
@@ -2794,6 +2796,8 @@ static __always_inline struct rq *
context_switch(struct rq *rq, struct task_struct *prev,
struct task_struct *next, struct rq_flags *rf)
{
+ struct rq *ret;
+
prepare_task_switch(rq, prev, next);
/*
@@ -2837,6 +2841,18 @@ context_switch(struct rq *rq, struct task_struct *prev,
}
}
+ //account and release
+ memqos_task_account(prev, smp_processor_id());
+
+ if (prefer_cpus_valid(prev))
+ memqos_trace_printk(prev);
+
+ memqos_release_mpam_label(prev);
+
+ //label new task's mpamid
+ if (prefer_cpus_valid(next))
+ memqos_update_mpam_label(next);
+
rq->clock_update_flags &= ~(RQCF_ACT_SKIP|RQCF_REQ_SKIP);
prepare_lock_switch(rq, next, rf);
@@ -2845,7 +2861,9 @@ context_switch(struct rq *rq, struct task_struct *prev,
switch_to(prev, next, prev);
barrier();
- return finish_task_switch(prev);
+ ret = finish_task_switch(prev);
+
+ return ret;
}
/*
@@ -3058,8 +3076,12 @@ unsigned long long task_sched_runtime(struct task_struct *p)
void scheduler_tick(void)
{
int cpu = smp_processor_id();
+ //memqos collects the next cpu's memband and perf
+ int cpu_memqos = (cpu + 1) % nr_cpu_ids;
struct rq *rq = cpu_rq(cpu);
+ struct rq *rq_next = cpu_rq(cpu_memqos);
struct task_struct *curr = rq->curr;
+ struct task_struct *curr_memqos = rq_next->curr;
struct rq_flags rf;
sched_clock_tick();
@@ -3075,6 +3097,10 @@ void scheduler_tick(void)
perf_event_task_tick();
+ //only monitor task enabled dynamic affinity
+ if (curr_memqos && prefer_cpus_valid(curr_memqos))
+ memqos_task_collect_data(curr_memqos, cpu_memqos);
+
#ifdef CONFIG_SMP
rq->idle_balance = idle_cpu(cpu);
trigger_load_balance(rq);
@@ -3524,6 +3550,7 @@ static void __sched notrace __schedule(bool preempt)
/* Also unlocks the rq: */
rq = context_switch(rq, prev, next, &rf);
} else {
+ memqos_task_account(prev, smp_processor_id());
rq->clock_update_flags &= ~(RQCF_ACT_SKIP|RQCF_REQ_SKIP);
rq_unlock_irq(rq, &rf);
}
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index af55a26d11fcb..12e9675495d2c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6675,6 +6675,7 @@ static int wake_cap(struct task_struct *p, int cpu, int prev_cpu)
}
#ifdef CONFIG_QOS_SCHED_DYNAMIC_AFFINITY
+#include <linux/memqos.h>
/*
* Low utilization threshold for CPU
*
@@ -6749,14 +6750,6 @@ static inline int cpu_vutil_of(int cpu)
return cputime->vutil;
}
-static inline bool prefer_cpus_valid(struct task_struct *p)
-{
- return p->prefer_cpus &&
- !cpumask_empty(p->prefer_cpus) &&
- !cpumask_equal(p->prefer_cpus, &p->cpus_allowed) &&
- cpumask_subset(p->prefer_cpus, &p->cpus_allowed);
-}
-
/*
* set_task_select_cpus: select the cpu range for task
* @p: the task whose available cpu range will to set
@@ -6828,8 +6821,13 @@ static void set_task_select_cpus(struct task_struct *p, int *idlest_cpu,
if (util_avg_sum < sysctl_sched_util_low_pct *
cpumask_weight(p->prefer_cpus)) {
p->select_cpus = p->prefer_cpus;
+ memqos_select_nicest_cpus(p);
if (sd_flag & SD_BALANCE_WAKE)
schedstat_inc(p->se.dyn_affi_stats->nr_wakeups_preferred_cpus);
+ } else {
+ //select turbo task
+ //select low class task
+ memqos_exclude_low_level_task_single(p);
}
}
#endif
diff --git a/kernel/sched/memqos/Makefile b/kernel/sched/memqos/Makefile
new file mode 100644
index 0000000000000..ed8f42649a8a7
--- /dev/null
+++ b/kernel/sched/memqos/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0
+# These files are disabled because they produce non-interesting flaky coverage
+# that is not a function of syscall inputs. E.g. involuntary context switches.
+KCOV_INSTRUMENT := n
+
+obj-y := memqos.o phase_feature_sysctl.o phase_memband.o phase_perf.o phase_sim_knn.o
diff --git a/kernel/sched/memqos/memqos.c b/kernel/sched/memqos/memqos.c
new file mode 100644
index 0000000000000..ddf8785439aa6
--- /dev/null
+++ b/kernel/sched/memqos/memqos.c
@@ -0,0 +1,297 @@
+#include <linux/memqos.h>
+#include <linux/cpumask.h>
+#include <linux/sched.h>
+
+static void memqos_set_task_classid(struct task_struct *p)
+{
+ int class_id;
+ int memband_div_10 = p->sched_memqos.memband_div_10;
+ int ipcx10 = p->sched_memqos.ipcx10;
+
+ class_id = knn_get_tag((u64)ipcx10, (u64)memband_div_10);
+ p->sched_memqos.class_id = class_id;
+}
+
+//static memqos_domain mq_domains[] = {
+// {.dom_id = 0, .total_memband = 0, .total_out_memband = 0,},
+// {.dom_id = 1, .total_memband = 0, .total_out_memband = 0,},
+// {.dom_id = 2, .total_memband = 0, .total_out_memband = 0,},
+// {.dom_id = 3, .total_memband = 0, .total_out_memband = 0,},
+//};
+
+static DEFINE_PER_CPU(struct memqos_class, memqos_classes[8]);
+//static DEFINE_PER_CPU(spinlock_t, memqos_class_lock);
+static DEFINE_SPINLOCK(memqos_class_lock);
+
+static int memqos_class_online(unsigned int cpu)
+{
+ int class_id = 0;
+ struct memqos_class *class;
+
+ for (class_id = 0; class_id < 8; class_id++) {
+ class = &per_cpu(memqos_classes, cpu)[class_id];
+ INIT_LIST_HEAD(&class->tasks_list);
+ INIT_LIST_HEAD(&class->turbo_list);
+ }
+ return 0;
+}
+
+static int memqos_class_offline(unsigned int cpu)
+{
+ return 0;
+}
+
+#include <linux/cpu.h>
+#include <linux/cacheinfo.h>
+
+static int memqos_init(void)
+{
+ int cpuhp_state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+ "memqos:online", memqos_class_online,
+ memqos_class_offline);
+ if (cpuhp_state <= 0) {
+ pr_err("Failed to register 'dyn' cpuhp callbacks\n");
+ return cpuhp_state ? cpuhp_state : -EINVAL;
+ }
+ return 0;
+}
+late_initcall(memqos_init);
+
+static void memqos_insert_to_class(struct task_struct *p, int cpu)
+{
+ unsigned long flag;
+ int class_id = p->sched_memqos.class_id;
+ struct memqos_class *class;
+ struct task_memqos *memqos;
+
+ if (class_id >= 8)
+ return;
+
+ memqos = &p->sched_memqos;
+
+ class = &per_cpu(memqos_classes, cpu)[class_id];
+
+ spin_lock_irqsave(&memqos_class_lock, flag);
+ if (p->sched_memqos.corrupt) {
+ spin_unlock_irqrestore(&memqos_class_lock, flag);
+ return;
+ }
+
+ //pr_info("count:%d %d add (%llx) %llx %llx to list %llx!!!!!!!!!!!!!\n", count, p->pid, &p->sched_memqos.task_list, p->sched_memqos.task_list.next, p->sched_memqos.task_list.prev, &class->tasks_list);
+ list_move_tail(&p->sched_memqos.task_list, &class->tasks_list);
+ if (memqos->turbo)
+ list_move_tail(&p->sched_memqos.turbo_list, &class->turbo_list);
+ spin_unlock_irqrestore(&memqos_class_lock, flag);
+}
+
+static void memqos_drop_class_without_lock(struct task_struct *p)
+{
+ //pr_info("%d drop (%llx) %llx %llx to list %llx!!!!!!!!!!!!!\n", p->pid, &p->sched_memqos.task_list, p->sched_memqos.task_list.next, p->sched_memqos.task_list.prev);
+ list_del_init(&p->sched_memqos.task_list);
+ list_del_init(&p->sched_memqos.turbo_list);
+}
+
+static void memqos_score(struct task_struct *p)
+{
+ int total_n1 = p->sched_memqos.memband_div_10_total[0];
+ int total_n2 = p->sched_memqos.memband_div_10_total[1];
+ int total_n3 = p->sched_memqos.memband_div_10_total[2];
+ int total_n4 = p->sched_memqos.memband_div_10_total[3];
+
+ /* avoid dividing by zero before any bandwidth has been accounted */
+ if (!total_n1 || !total_n2 || !total_n3 || !total_n4)
+ return;
+
+ p->sched_memqos.numa_score[0] = (total_n1 - (total_n2 + total_n3 + total_n4)) * 10 / total_n1;
+ p->sched_memqos.numa_score[1] = (total_n2 - (total_n1 + total_n3 + total_n4)) * 10 / total_n2;
+ p->sched_memqos.numa_score[2] = (total_n3 - (total_n1 + total_n2 + total_n4)) * 10 / total_n3;
+ p->sched_memqos.numa_score[3] = (total_n4 - (total_n1 + total_n2 + total_n3)) * 10 / total_n4;
+
+ //over x% percent
+ if (p->sched_memqos.numa_score[0] > 0)
+ p->sched_memqos.turbo = 1;
+ else if (p->sched_memqos.numa_score[1] > 0)
+ p->sched_memqos.turbo = 2;
+ else if (p->sched_memqos.numa_score[2] > 0)
+ p->sched_memqos.turbo = 3;
+ else if (p->sched_memqos.numa_score[3] > 0)
+ p->sched_memqos.turbo = 4;
+ else
+ p->sched_memqos.turbo = 0;
+}
+
+void memqos_account_task(struct task_struct *p, int cpu)
+{
+ if (!p->sched_memqos.account_ready ||
+ p->sched_memqos.corrupt)
+ return;
+ memqos_set_task_classid(p);
+ memqos_insert_to_class(p, cpu);
+ memqos_score(p);
+ p->sched_memqos.account_ready = 0;
+}
+
+void memqos_init_class(struct task_struct *p)
+{
+ memset(&p->sched_memqos, 0, sizeof(struct task_memqos));
+ spin_lock(&memqos_class_lock);
+ INIT_LIST_HEAD(&p->sched_memqos.task_list);
+ INIT_LIST_HEAD(&p->sched_memqos.turbo_list);
+ INIT_LIST_HEAD(&p->sched_memqos.mpam_profile.wait_list);
+ spin_unlock(&memqos_class_lock);
+
+ p->closid = 0;
+ p->rmid = 0;
+}
+
+//destroy ?
+void memqos_drop_class(struct task_struct *p)
+{
+ spin_lock(&memqos_class_lock);
+ memqos_drop_class_without_lock(p);
+ p->sched_memqos.corrupt = 1;
+ spin_unlock(&memqos_class_lock);
+}
+
+void memqos_select_nicest_cpus(struct task_struct *p)
+{
+ int i = 0;
+ int max_score = -10000;
+ int select_node = 0;
+ struct task_memqos *memqos = &p->sched_memqos;
+
+ if (!memqos->turbo) {
+ for (i = 0; i < 4; i++) {
+ if (!cpumask_intersects(cpumask_of_node(i), p->select_cpus))
+ continue;
+
+ if (memqos->numa_score[i] > max_score) {
+ select_node = i;
+ max_score = memqos->numa_score[i];
+ }
+ }
+
+ cpumask_and(p->select_cpus, p->select_cpus, cpumask_of_node(select_node));
+ //p->sched_memqos.advise_mem_node_mask = cpumask_of_node(select_node);
+ p->sched_memqos.preferred_nid = memqos->turbo;
+ return;
+ }
+
+ select_node = memqos->turbo - 1;
+ if (cpumask_intersects(cpumask_of_node(select_node), p->select_cpus)) {
+ cpumask_and(p->select_cpus, p->select_cpus, cpumask_of_node(select_node));
+ //p->sched_memqos.advise_mem_node_mask = cpumask_of_node(select_node);
+ p->sched_memqos.preferred_nid = memqos->turbo;
+ }
+
+ return;
+
+ //if turbo prefers another node's cpus, wait...
+}
+
+void memqos_exclude_low_level_task_single(struct task_struct *p)
+{
+ int i, j, cpu;
+ int find = 0;
+ int select_node = 0;
+ const struct cpumask *cpumask;
+ cpumask_t cpumask_med;
+ struct memqos_class *class;
+ struct task_memqos *memqos = &p->sched_memqos;
+ struct task_struct *tsk = NULL;
+ int max_score = -100000;
+
+ if (memqos->turbo) {
+ select_node = memqos->turbo - 1;
+ cpumask = cpumask_of_node(select_node);
+ if (!cpumask_intersects(cpumask, p->prefer_cpus) &&
+ (cpumask_intersects(&p->cpus_allowed, cpumask))) {
+ cpumask_and(p->select_cpus, &p->cpus_allowed, cpumask);
+ memqos_drop_class(p);
+ //p->sched_memqos.advise_mem_node_mask = cpumask_of_node(select_node);
+ p->sched_memqos.preferred_nid = memqos->turbo;
+ return;
+ } else if (cpumask_intersects(p->prefer_cpus, cpumask)) {
+ cpumask_and(p->select_cpus, p->prefer_cpus, cpumask);
+ //p->sched_memqos.advise_mem_node_mask = cpumask_of_node(select_node);
+ p->sched_memqos.preferred_nid = memqos->turbo;
+ }
+ }
+
+ //select turbo one
+ for (cpu = 0; cpu < nr_cpu_ids; cpu++) {
+ if (!cpumask_test_cpu(cpu, p->prefer_cpus))
+ continue;
+
+ spin_lock(&memqos_class_lock);
+ for (i = 7; i >= 0; i--) {
+ class = &per_cpu(memqos_classes, cpu)[i];
+ list_for_each_entry(memqos, &class->turbo_list, turbo_list) {
+ if (!memqos->turbo)
+ continue;
+ select_node = memqos->turbo - 1;
+ cpumask = cpumask_of_node(select_node);
+ if (!cpumask_intersects(cpumask, p->prefer_cpus)) {
+ tsk = container_of(memqos, struct task_struct, sched_memqos);
+ if (!cpumask_intersects(cpumask, &tsk->cpus_allowed))
+ continue;
+ cpumask_and(tsk->select_cpus, &tsk->cpus_allowed, cpumask);
+ //mem prefered
+ //tsk->sched_memqos.advise_mem_node_mask = cpumask_of_node(select_node);
+ tsk->sched_memqos.preferred_nid = memqos->turbo;
+ find = 1;
+ break;
+ }
+ }
+ if (find) {
+ memqos_drop_class_without_lock(tsk);
+ spin_unlock(&memqos_class_lock);
+ return;
+ }
+ }
+ spin_unlock(&memqos_class_lock);
+ }
+
+ find = 0;
+
+ //if not, select lower class's tsk
+ for (cpu = 0; cpu < nr_cpu_ids; cpu++) {
+ if (!cpumask_test_cpu(cpu, p->prefer_cpus))
+ continue;
+
+ spin_lock(&memqos_class_lock);
+ //only find below class tsk
+ for (i = 0; i < memqos->class_id; i++) {
+ class = &per_cpu(memqos_classes, cpu)[i];
+ list_for_each_entry(memqos, &class->tasks_list, task_list) {
+ if (memqos->turbo)
+ continue;
+
+ tsk = container_of(memqos, struct task_struct, sched_memqos);
+ for (j = 0; j < 4; j++) {
+ if (!cpumask_intersects(cpumask_of_node(j), &tsk->cpus_allowed))
+ continue;
+ if (memqos->numa_score[j] > max_score) {
+ select_node = j;
+ max_score = memqos->numa_score[j];
+ }
+ find = 1;
+ }
+ if (!find)
+ continue;
+
+ cpumask_and(&cpumask_med, cpumask_of_node(select_node), &tsk->cpus_allowed);
+ cpumask_andnot(&cpumask_med, &cpumask_med, p->prefer_cpus);
+ if (cpumask_empty(&cpumask_med))
+ continue;
+ cpumask_copy(tsk->select_cpus, &cpumask_med);
+ //mem prefered
+ //tsk->sched_memqos.advise_mem_node_mask = cpumask_of_node(select_node);
+ tsk->sched_memqos.preferred_nid = memqos->turbo;
+ memqos_drop_class_without_lock(tsk);
+ spin_unlock(&memqos_class_lock);
+ return;
+ }
+ }
+ spin_unlock(&memqos_class_lock);
+ }
+
+ //do not care, this task may move out
+ return;
+}
+
diff --git a/kernel/sched/memqos/phase_feature_sysctl.c b/kernel/sched/memqos/phase_feature_sysctl.c
new file mode 100644
index 0000000000000..443ae03275605
--- /dev/null
+++ b/kernel/sched/memqos/phase_feature_sysctl.c
@@ -0,0 +1,126 @@
+#include <linux/sched.h>
+#include <linux/sysctl.h>
+#include <linux/capability.h>
+#include <linux/cpumask.h>
+#include <linux/topology.h>
+#include <linux/sched/task.h>
+
+#include <linux/memqos.h>
+
+#ifdef CONFIG_PROC_SYSCTL
+
+DEFINE_STATIC_KEY_FALSE(sched_phase);
+DEFINE_STATIC_KEY_FALSE(sched_phase_printk);
+
+static int set_phase_state(bool enabled)
+{
+ int err;
+ int state = static_branch_likely(&sched_phase);
+
+ if (enabled == state) {
+ pr_warn("phase has already %s\n", state ? "enabled" : "disabled");
+ return 0;
+ }
+
+ if (enabled) {
+ err = phase_perf_create();
+ if (err) {
+ pr_err("phase enable failed\n");
+ return err;
+ }
+ static_branch_enable(&sched_phase);
+ pr_info("phase enabled\n");
+ } else {
+ static_branch_disable(&sched_phase);
+ phase_perf_release();
+ pr_info("phase disabled\n");
+ }
+
+ return 0;
+}
+
+/*
+ * the other procfs files of phase cannot be modified if sched_phase is already enabled
+ */
+static int phase_proc_state(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+ struct ctl_table t;
+ int err;
+ int state = static_branch_likely(&sched_phase);
+
+ if (write && !capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ t = *table;
+ t.data = &state;
+ err = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
+ if (err < 0)
+ return err;
+ if (write)
+ err = set_phase_state(state);
+
+ return err;
+}
+
+static int set_phase_state_printk(bool enabled)
+{
+ if (enabled) {
+ static_branch_enable(&sched_phase_printk);
+ } else {
+ static_branch_disable(&sched_phase_printk);
+ }
+
+ return 0;
+}
+
+/*
+ * the other procfs files of phase cannot be modified if sched_phase is already enabled
+ */
+static int phase_proc_state_printk(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+ struct ctl_table t;
+ int err;
+ int state = static_branch_likely(&sched_phase_printk);
+
+ if (write && !capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ t = *table;
+ t.data = &state;
+ err = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
+ if (err < 0)
+ return err;
+ if (write)
+ err = set_phase_state_printk(state);
+
+ return err;
+}
+
+
+static int __maybe_unused zero;
+static int __maybe_unused one = 1;
+
+struct ctl_table phase_table[] = {
+ {
+ .procname = "enabled",
+ .data = NULL,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+ .proc_handler = phase_proc_state,
+ .extra1 = &zero,
+ .extra2 = &one,
+ },
+ {
+ .procname = "trace_enabled",
+ .data = NULL,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+ .proc_handler = phase_proc_state_printk,
+ .extra1 = &zero,
+ .extra2 = &one,
+ },
+ { }
+};
+#endif /* CONFIG_PROC_SYSCTL */
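Since phase_table is attached under a "phase" directory of kern_table in
the kernel/sysctl.c hunk below (inside the CONFIG_SCHED_DEBUG block), the
two switches would surface as /proc/sys/kernel/phase/enabled and
/proc/sys/kernel/phase/trace_enabled.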
diff --git a/kernel/sched/memqos/phase_memband.c b/kernel/sched/memqos/phase_memband.c
new file mode 100644
index 0000000000000..d83909c8eca45
--- /dev/null
+++ b/kernel/sched/memqos/phase_memband.c
@@ -0,0 +1,145 @@
+#include <linux/types.h>
+#include <linux/cpu.h>
+#include <linux/memqos.h>
+
+#include <asm/cpu.h>
+#include <asm/cputype.h>
+#include <asm/cpufeature.h>
+#include <asm/mpam_sched.h>
+
+static const int nr_partid = 15;
+static const int nr_monitor = 4;
+
+static LIST_HEAD(phase_mpam_waitqueue);
+
+//mpam_profile_res[0] not used
+struct memqos_mpam_profile mpam_profile_res[16] = {
+ { .partid = 0, .monitor = 0, .used = 1},
+ { .partid = 1, .monitor = 0,},
+ { .partid = 2, .monitor = 1,},
+ { .partid = 3, .monitor = 2,},
+ { .partid = 4, .monitor = 3,},
+ { .partid = 5, .monitor = 0,},
+ { .partid = 6, .monitor = 1,},
+ { .partid = 7, .monitor = 2,},
+ { .partid = 8, .monitor = 3,},
+ { .partid = 9, .monitor = 0,},
+ { .partid = 10, .monitor = 1,},
+ { .partid = 11, .monitor = 2,},
+ { .partid = 12, .monitor = 3,},
+ { .partid = 13, .monitor = 0,},
+ { .partid = 14, .monitor = 1,},
+ { .partid = 15, .monitor = 2,},
+};
+
+static DEFINE_SPINLOCK(phase_partid_lock);
+
+void phase_update_mpam_label(struct task_struct *tsk)
+{
+ int i = 0;
+ //unsigned long flag;
+
+ WARN_ON_ONCE(tsk->closid);
+
+ if (tsk->sched_memqos.mpam_profile.profile != &mpam_profile_res[0] &&
+ tsk->sched_memqos.mpam_profile.profile != NULL) {
+ tsk->closid = tsk->sched_memqos.mpam_profile.profile->partid;
+ tsk->rmid = 0;
+ mpam_profile_res[tsk->closid].tsk = tsk;
+ __mpam_sched_in_v2(tsk);
+ return;
+ }
+
+ spin_lock(&phase_partid_lock);
+ //is in profile queue, wait...
+ if (tsk->sched_memqos.mpam_profile.profile == &mpam_profile_res[0]) {
+ spin_unlock(&phase_partid_lock);
+ return;
+ }
+
+ for (i = 1; i < 16; i++) {
+ if (mpam_profile_res[i].used) {
+ continue;
+ }
+
+ tsk->sched_memqos.mpam_profile.profile = NULL;
+ break;
+ }
+
+ if (i == 16) {
+ list_move_tail(&tsk->sched_memqos.mpam_profile.wait_list, &phase_mpam_waitqueue);
+ tsk->sched_memqos.mpam_profile.profile = &mpam_profile_res[0];
+ spin_unlock(&phase_partid_lock);
+ //wait...
+ return;
+ }
+
+ mpam_profile_res[i].used = 1;
+ spin_unlock(&phase_partid_lock);
+
+ tsk->closid = mpam_profile_res[i].partid;
+ mpam_profile_res[i].tsk = tsk;
+ tsk->sched_memqos.mpam_profile.profile = &mpam_profile_res[i];
+ tsk->rmid = 0;
+ __mpam_sched_in_v2(tsk);
+}
+
+static void phase_release_mpam_label_without_lock(struct task_struct *tsk)
+{
+ int closid;
+ struct memqos_wait_profile *next;
+
+ //assert locked
+
+ if (tsk->closid == 0)
+ return;
+
+ closid = tsk->closid;
+ tsk->closid = 0;
+ tsk->sched_memqos.mpam_profile.profile = NULL;
+ mpam_profile_res[closid].used = 0;
+ mpam_profile_res[closid].tsk = NULL;
+
+ next = list_first_entry_or_null(&phase_mpam_waitqueue, struct memqos_wait_profile, wait_list);
+ if (next) {
+ list_del_init(&next->wait_list);
+ next->profile = &mpam_profile_res[closid];
+ mpam_profile_res[closid].used = 1;
+ }
+
+ return;
+}
+
+//task shutdown
+void phase_destroy_waitqueue(struct task_struct *tsk)
+{
+ spin_lock(&phase_partid_lock);
+
+ //if (tsk->sched_memqos.mpam_profile.profile == &mpam_profile_res[0]) {
+ list_del_init(&tsk->sched_memqos.mpam_profile.wait_list);
+ //} else {
+ phase_release_mpam_label_without_lock(tsk);
+ //}
+ spin_unlock(&phase_partid_lock);
+}
+
+void phase_release_mpam_label(struct task_struct *tsk)
+{
+ spin_lock(&phase_partid_lock);
+ phase_release_mpam_label_without_lock(tsk);
+ spin_unlock(&phase_partid_lock);
+}
+
+#include <asm/mpam.h>
+void phase_get_memband(struct memqos_mpam_profile *pm, int *result, int nr)
+{
+ if (pm == &mpam_profile_res[0] || pm == NULL) {
+ result[0] = 0;
+ result[1] = 0;
+ result[2] = 0;
+ result[3] = 0;
+ return;
+ }
+
+ mpam_component_config_mbwu_mon(pm->partid, pm->pmg, pm->monitor, result, nr);
+}
diff --git a/kernel/sched/memqos/phase_perf.c b/kernel/sched/memqos/phase_perf.c
new file mode 100644
index 0000000000000..9b450a20e808f
--- /dev/null
+++ b/kernel/sched/memqos/phase_perf.c
@@ -0,0 +1,409 @@
+#include <linux/kernel.h>
+#include <linux/perf_event.h>
+#include <linux/percpu-defs.h>
+#include <linux/slab.h>
+#include <linux/stop_machine.h>
+#include <linux/memqos.h>
+#include <linux/sched.h>
+
+#define PHASE_FEVENT_NUM 3
+
+int *phase_perf_pevents = NULL;
+
+static DEFINE_PER_CPU(__typeof__(struct perf_event *)[PHASE_PEVENT_NUM], cpu_phase_perf_events);
+
+/******************************************
+ * Helpers for phase perf event
+ *****************************************/
+static inline struct perf_event *perf_event_of_cpu(int cpu, int index)
+{
+ return per_cpu(cpu_phase_perf_events, cpu)[index];
+}
+
+static inline struct perf_event **perf_events_of_cpu(int cpu)
+{
+ return per_cpu(cpu_phase_perf_events, cpu);
+}
+
+static inline u64 perf_event_local_pmu_read(struct perf_event *event)
+{
+ if (event->state == PERF_EVENT_STATE_ACTIVE)
+ event->pmu->read(event);
+ return local64_read(&event->count);
+}
+
+/******************************************
+ * Helpers for cpu counters
+ *****************************************/
+static inline u64 read_cpu_counter(int cpu, int index)
+{
+ struct perf_event *event = perf_event_of_cpu(cpu, index);
+
+ if (!event || !event->pmu)
+ return 0;
+
+ return perf_event_local_pmu_read(event);
+}
+
+static struct perf_event_attr *alloc_attr(int event_id)
+{
+ struct perf_event_attr *attr;
+
+ attr = kzalloc(sizeof(struct perf_event_attr), GFP_KERNEL);
+ if (!attr)
+ return ERR_PTR(-ENOMEM);
+
+ attr->type = PERF_TYPE_RAW;
+ attr->config = event_id;
+ attr->size = sizeof(struct perf_event_attr);
+ attr->pinned = 1;
+ attr->disabled = 1;
+ //attr->exclude_hv;
+ //attr->exclude_idle;
+ //attr->exclude_kernel;
+
+ return attr;
+}
+
+static int create_cpu_counter(int cpu, int event_id, int index)
+{
+ struct perf_event_attr *attr = NULL;
+ struct perf_event **events = perf_events_of_cpu(cpu);
+ struct perf_event *event = NULL;
+
+ attr = alloc_attr(event_id);
+ if (IS_ERR(attr))
+ return PTR_ERR(attr);
+
+ event = perf_event_create_kernel_counter(attr, cpu, NULL, NULL, NULL);
+ if (IS_ERR(event)) {
+ pr_err("unable to create perf event (cpu:%i-type:%d-pinned:%d-config:0x%llx) : %ld",
+ cpu, attr->type, attr->pinned, attr->config, PTR_ERR(event));
+ kfree(attr);
+ return PTR_ERR(event);
+ } else {
+ events[index] = event;
+ perf_event_enable(events[index]);
+ if (event->hw.idx == -1) {
+ pr_err("pinned event unable to get onto hardware, perf event (cpu:%i-type:%d-config:0x%llx)",
+ cpu, attr->type, attr->config);
+ kfree(attr);
+ return -EINVAL;
+ }
+ pr_info("create perf_event (cpu:%i-idx:%d-type:%d-pinned:%d-exclude_hv:%d"
+ "-exclude_idle:%d-exclude_kernel:%d-config:0x%llx-addr:%px)",
+ event->cpu, event->hw.idx,
+ event->attr.type, event->attr.pinned, event->attr.exclude_hv,
+ event->attr.exclude_idle, event->attr.exclude_kernel,
+ event->attr.config, event);
+ }
+
+ kfree(attr);
+ return 0;
+}
+
+static int release_cpu_counter(int cpu, int event_id, int index)
+{
+ struct perf_event **events = perf_events_of_cpu(cpu);
+ struct perf_event *event = NULL;
+
+ event = events[index];
+
+ if (!event)
+ return 0;
+
+ pr_info("release perf_event (cpu:%i-idx:%d-type:%d-pinned:%d-exclude_hv:%d"
+ "-exclude_idle:%d-exclude_kernel:%d-config:0x%llx)",
+ event->cpu, event->hw.idx,
+ event->attr.type, event->attr.pinned, event->attr.exclude_hv,
+ event->attr.exclude_idle, event->attr.exclude_kernel,
+ event->attr.config);
+
+ perf_event_release_kernel(event);
+ events[index] = NULL;
+
+ return 0;
+}
+
+enum {
+ CYCLES_INDEX = 0,
+ INST_RETIRED_INDEX,
+ PHASE_EVENT_FINAL_TERMINATOR
+};
+
+#define CYCLES 0x0011
+#define INST_RETIRED 0x0008
+
+static int pevents[PHASE_PEVENT_NUM] = {
+ CYCLES,
+ INST_RETIRED,
+ PHASE_EVENT_FINAL_TERMINATOR,
+};
+
+#define for_each_phase_pevents(index, events) \
+ for (index = 0; events != NULL && index < PHASE_PEVENT_NUM && \
+ events[index] != PHASE_EVENT_FINAL_TERMINATOR; index++)
+
+
+/******************************************
+ * Helpers for phase perf
+ *****************************************/
+static int do_pevents(int (*fn)(int, int, int), int cpu)
+{
+ int index;
+ int err;
+
+ for_each_phase_pevents(index, phase_perf_pevents) {
+ err = fn(cpu, phase_perf_pevents[index], index);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+static int __phase_perf_create(void *args)
+{
+ int err;
+ int cpu = raw_smp_processor_id();
+
+ /* create pinned events */
+ pr_info("create pinned events\n");
+ err = do_pevents(create_cpu_counter, cpu);
+ if (err) {
+ pr_err("create pinned events failed\n");
+ do_pevents(release_cpu_counter, cpu);
+ return err;
+ }
+
+ pr_info("[%d] phase class event create success\n", cpu);
+ return 0;
+}
+
+static int do_phase_perf_create(int *pevents, const struct cpumask *cpus)
+{
+ phase_perf_pevents = pevents;
+ return stop_machine(__phase_perf_create, NULL, cpus);
+}
+
+static int __do_phase_perf_release(void *args)
+{
+ int cpu = raw_smp_processor_id();
+
+ /* release pinned events */
+ pr_info("release pinned events\n");
+ do_pevents(release_cpu_counter, cpu);
+
+ pr_info("[%d] phase class event release success\n", cpu);
+ return 0;
+}
+
+static void do_phase_perf_release(const struct cpumask *cpus)
+{
+ stop_machine(__do_phase_perf_release, NULL, cpus);
+ phase_perf_pevents = NULL;
+}
+
+int phase_perf_create(void)
+{
+ return do_phase_perf_create(pevents, cpu_possible_mask);
+}
+
+void phase_perf_release(void)
+{
+ do_phase_perf_release(cpu_possible_mask);
+}
+
+DECLARE_STATIC_KEY_FALSE(sched_phase);
+DECLARE_STATIC_KEY_FALSE(sched_phase_printk);
+
+#define PHASE_EVENT_OVERFLOW (~0ULL)
+
+static inline u64 phase_event_count_sub(u64 curr, u64 prev)
+{
+ if (curr < prev) { /* overflow */
+ u64 tmp = PHASE_EVENT_OVERFLOW - prev;
+ return curr + tmp + 1; /* +1: the wrap passes through zero */
+ } else {
+ return curr - prev;
+ }
+}
+
+static inline void phase_calc_delta(struct task_struct *p,
+ struct phase_event_count *prev,
+ struct phase_event_count *curr,
+ struct phase_event_count *delta)
+{
+ int *pevents = phase_perf_pevents;
+ int index;
+
+ for_each_phase_pevents(index, pevents) {
+ delta->pcount.data[index] = phase_event_count_sub(curr->pcount.data[index], prev->pcount.data[index]);
+ }
+}
+
+static inline u64 phase_data_of_pevent(struct phase_event_pcount *counter, int event_id)
+{
+ int index;
+ int *events = phase_perf_pevents;
+
+ for_each_phase_pevents(index, events) {
+ if (event_id == events[index])
+ return counter->data[index];
+ }
+
+ return 0;
+}
+
+static int cal_ring_history_average(int *history, int nr, int s_pos, int c_nr)
+{
+ int average = 0;
+ int start = ((s_pos - c_nr) + nr) % nr;
+
+ if (start < 0)
+ return 0;
+
+ for (; start != s_pos; start = (start + 1) % nr) {
+ if (history[start] == 0) {
+ c_nr--;
+ if (c_nr == 0)
+ return 0;
+ continue;
+ }
+ average += history[start];
+ }
+
+ return average / c_nr;
+}
+
+static void __phase_cal_ipcx10(struct task_struct *p, struct phase_event_count *delta)
+{
+ u64 ins;
+ u64 cycles;
+ //invalid zero
+ int ipcx10 = 0;
+
+ ins = phase_data_of_pevent(&delta->pcount, INST_RETIRED);
+ cycles = phase_data_of_pevent(&delta->pcount, CYCLES);
+
+ if (cycles)
+ ipcx10 = (ins * 10) / cycles;
+
+ if (static_branch_unlikely(&sched_phase_printk)) {
+ trace_printk("ins:%lld cycles:%lld\n", ins, cycles);
+ }
+
+ p->sched_memqos.ipcx10_history[p->sched_memqos.ipc_ringpos] = ipcx10;
+ p->sched_memqos.ipc_ringpos = (p->sched_memqos.ipc_ringpos + 1) % 10;
+ cal_ring_history_average(p->sched_memqos.ipcx10_history, 10, p->sched_memqos.ipc_ringpos, 5);
+}
+
+static void __phase_cal_memband_div_10(struct task_struct *p)
+{
+ int pos;
+ int result[4];
+
+ pos = p->sched_memqos.memband_ringpos;
+
+ phase_get_memband(p->sched_memqos.mpam_profile.profile, result, 4);
+
+ if (static_branch_unlikely(&sched_phase_printk)) {
+ trace_printk("memband:%d %d %d %d profile:%llx\n", result[0], result[1], result[2], result[3], p->sched_memqos.mpam_profile.profile);
+ }
+
+ p->sched_memqos.memband_div_10_total[0] = p->sched_memqos.memband_div_10_total[0] - p->sched_memqos.memband_div_10_history[0][pos];
+ p->sched_memqos.memband_div_10_total[0] = p->sched_memqos.memband_div_10_total[0] + result[0] / 10;
+ p->sched_memqos.memband_div_10_history[0][p->sched_memqos.memband_ringpos] = result[0] / 10;
+
+ p->sched_memqos.memband_div_10_total[1] = p->sched_memqos.memband_div_10_total[1] - p->sched_memqos.memband_div_10_history[1][pos];
+ p->sched_memqos.memband_div_10_total[1] = p->sched_memqos.memband_div_10_total[1] + result[1] / 10;
+ p->sched_memqos.memband_div_10_history[1][p->sched_memqos.memband_ringpos] = result[1] / 10;
+
+ p->sched_memqos.memband_div_10_total[2] = p->sched_memqos.memband_div_10_total[2] - p->sched_memqos.memband_div_10_history[2][pos];
+ p->sched_memqos.memband_div_10_total[2] = p->sched_memqos.memband_div_10_total[2] + result[2] / 10;
+ p->sched_memqos.memband_div_10_history[2][p->sched_memqos.memband_ringpos] = result[2] / 10;
+
+ p->sched_memqos.memband_div_10_total[3] = p->sched_memqos.memband_div_10_total[3] - p->sched_memqos.memband_div_10_history[3][pos];
+ p->sched_memqos.memband_div_10_total[3] = p->sched_memqos.memband_div_10_total[3] + result[3] / 10;
+ p->sched_memqos.memband_div_10_history[3][p->sched_memqos.memband_ringpos] = result[3] / 10;
+
+ p->sched_memqos.memband_ringpos = (pos + 1) % 10;
+
+ cal_ring_history_average(p->sched_memqos.memband_div_10_history[0], 10, pos, 5);
+ cal_ring_history_average(p->sched_memqos.memband_div_10_history[1], 10, pos, 5);
+ cal_ring_history_average(p->sched_memqos.memband_div_10_history[2], 10, pos, 5);
+ cal_ring_history_average(p->sched_memqos.memband_div_10_history[3], 10, pos, 5);
+}
+
+static DEFINE_PER_CPU(struct phase_event_count, prev_phase_event_count);
+static DEFINE_PER_CPU(struct phase_event_count, curr_phase_event_count);
+
+static void phase_perf_read_events(int cpu, u64 *pdata)
+{
+ int index;
+
+ for_each_phase_pevents(index, phase_perf_pevents) {
+ pdata[index] = read_cpu_counter(cpu, index);
+ }
+}
+
+static inline struct phase_event_count *phase_read_prev(unsigned int cpu)
+{
+ return &per_cpu(prev_phase_event_count, cpu);
+}
+
+static inline struct phase_event_count *phase_read_curr(unsigned int cpu)
+{
+ struct phase_event_count *curr = &per_cpu(curr_phase_event_count, cpu);
+
+ phase_perf_read_events(cpu, curr->pcount.data);
+
+ return curr;
+}
+
+void phase_account_task(struct task_struct *p, int cpu)
+{
+ struct phase_event_count delta;
+ struct phase_event_count *prev, *curr;
+
+ if (!static_branch_likely(&sched_phase))
+ return;
+
+ //if (!sched_core_enabled(cpu_rq(cpu)))
+ // return;
+
+ /* update phase_event_count */
+ prev = phase_read_prev(cpu);
+ curr = phase_read_curr(cpu);
+ phase_calc_delta(p, prev, curr, &delta);
+ *prev = *curr;
+
+ /* calculate phase */
+ __phase_cal_ipcx10(p, &delta);
+ __phase_cal_memband_div_10(p);
+ p->sched_memqos.sample_times++;
+ if ((p->sched_memqos.sample_times % 3) == 0)
+ p->sched_memqos.account_ready = 1;
+}
+
+
+void phase_trace_printk(struct task_struct *p)
+{
+ if (!static_branch_unlikely(&sched_phase_printk))
+ return;
+
+ trace_printk("p->comm:%s(%d) ipcpos:%d ipcx10:%d membandpos:%d memband_div_10:%d numa_score[0]:%d numa_score[1]:%d numa_score[2]:%d numa_score[3]:%d turbo:%d prefered_nid:%d classid:%d partid:%d\n",
+ p->comm, p->pid, p->sched_memqos.ipc_ringpos,\
+ p->sched_memqos.ipcx10, \
+ p->sched_memqos.memband_ringpos,\
+ p->sched_memqos.memband_div_10, \
+ p->sched_memqos.numa_score[0], \
+ p->sched_memqos.numa_score[1], \
+ p->sched_memqos.numa_score[2], \
+ p->sched_memqos.numa_score[3], \
+ p->sched_memqos.turbo, \
+ p->sched_memqos.preferred_nid, \
+ p->sched_memqos.class_id, \
+ p->closid);
+}
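Note the cadence this sets up: phase_account_task() samples the counters
on every context switch, and only every third sample flips account_ready,
so the classification in memqos_account_task() effectively runs on a
rolling three-sample window.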
diff --git a/kernel/sched/memqos/phase_sim_knn.c b/kernel/sched/memqos/phase_sim_knn.c
new file mode 100644
index 0000000000000..b80bb6b9ae0a3
--- /dev/null
+++ b/kernel/sched/memqos/phase_sim_knn.c
@@ -0,0 +1,92 @@
+#include <linux/types.h>
+
+#define DATA_ROW 20
+void QuickSort(u64 arr[DATA_ROW][2], int L, int R) {
+ int i = L;
+ int j = R;
+ int kk = (L + R) / 2;
+ u64 pivot = arr[kk][0];
+
+ while (i <= j) {
+ while (pivot > arr[i][0]) {
+ i++;
+ }
+ while (pivot < arr[j][0]) {
+ j--;
+ }
+ if (i <= j) {
+ u64 temp = arr[i][0];
+
+ /* swap both columns so each distance keeps its tag */
+ arr[i][0] = arr[j][0];
+ arr[j][0] = temp;
+ temp = arr[i][1];
+ arr[i][1] = arr[j][1];
+ arr[j][1] = temp;
+ i++; j--;
+ }
+ }
+ if (L < j) {
+ QuickSort(arr, L, j);
+ }
+ if (i < R) {
+ QuickSort(arr, i, R);
+ }
+}
+
+u64 euclidean_distance(u64 *row1, u64 *row2, int col) {
+ u64 distance = 0;
+ int i;
+
+ for (i = 0; i < col - 1; i++) {
+ distance += ((row1[i] - row2[i]) * (row1[i] - row2[i]));
+ }
+ return distance;
+}
+
+#define num_neighbors 6
+#define MAX_TAG 8
+
+int get_neighbors_tag(u64 train_data[DATA_ROW][3], int train_row, int col, u64 *test_row) {
+ int i;
+ u64 neighbors[MAX_TAG] = {0};
+ int max_tag = 0;
+ u64 distances[DATA_ROW][2];
+
+ for (i = 0; i < train_row; i++) {
+ distances[i][0] = euclidean_distance(train_data[i], test_row, col);
+ distances[i][1] = train_data[i][col - 1];
+ }
+ QuickSort(distances, 0, train_row - 1);
+ for (i = 0; i < num_neighbors; i++) {
+ neighbors[distances[i][1]]++;
+ if (neighbors[distances[i][1]] > neighbors[max_tag])
+ max_tag = distances[i][1];
+ }
+ return max_tag;
+}
+
+static u64 train_data[DATA_ROW][3] = {
+ {0, 1, 0},
+ {0, 9, 0},
+ {0, 20, 1},
+ {0, 30, 1},
+ {0, 40, 2},
+ {0, 50, 3},
+ {0, 60, 3},
+ {0, 70, 3},
+ {0, 80, 4},
+ {0, 90, 4},
+ {0, 100, 4},
+ {0, 110, 5},
+ {0, 120, 5},
+ {0, 130, 6},
+ {0, 140, 6},
+ {0, 150, 7},
+};
+
+int knn_get_tag(int ipcx10, int memband_div_10)
+{
+ u64 test_data[2];
+
+ test_data[0] = ipcx10;
+ test_data[1] = memband_div_10;
+
+ return get_neighbors_tag(train_data, DATA_ROW, 3, test_data);
+}
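As a quick sanity check of the classifier, a hand-worked example with
hypothetical inputs: for ipcx10 = 5 and memband_div_10 = 45 the six
nearest train_data rows by squared distance are the memband values 40,
50, 30, 60, 20 and 70, carrying tags {2, 3, 1, 3, 1, 3}, so tag 3 wins
the vote:

    int class_id = knn_get_tag(5, 45); /* expected: 3 */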
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 685f9881b8e23..0d2764c4449ce 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -465,6 +465,13 @@ static struct ctl_table kern_table[] = {
.extra2 = &one,
},
#endif /* CONFIG_NUMA_BALANCING */
+#ifdef CONFIG_QOS_SCHED_DYNAMIC_AFFINITY
+ {
+ .procname = "phase",
+ .mode = 0555,
+ .child = phase_table,
+ },
+#endif
#endif /* CONFIG_SCHED_DEBUG */
{
.procname = "sched_rt_period_us",
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 4cac46d56f387..d748c291e7047 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2164,12 +2164,15 @@ alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma,
{
struct mempolicy *pol;
struct page *page;
- int preferred_nid;
+ int preferred_nid = -1;
nodemask_t *nmask;
+#ifdef CONFIG_QOS_SCHED_DYNAMIC_AFFINITY
+ if (current->sched_memqos.preferred_nid)
+ preferred_nid = current->sched_memqos.preferred_nid - 1;
+#endif
+
pol = get_vma_policy(vma, addr);
- if (pol->mode == MPOL_INTERLEAVE) {
+ if (pol->mode == MPOL_INTERLEAVE && preferred_nid == -1) {
unsigned nid;
nid = interleave_nid(pol, vma, addr, PAGE_SHIFT + order);
@@ -2233,7 +2236,8 @@ alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma,
}
nmask = policy_nodemask(gfp, pol);
- preferred_nid = policy_node(gfp, pol, node);
+ if (preferred_nid == -1)
+ preferred_nid = policy_node(gfp, pol, node);
page = __alloc_pages_nodemask(gfp, order, preferred_nid, nmask);
mark_vma_cdm(nmask, page, vma);
mpol_cond_put(pol);
--
2.25.1

22 Mar '23
Send 16 patches to test patchwork->PR function
Baisong Zhong (1):
media: dvb-usb: az6027: fix null-ptr-deref in az6027_i2c_xfer()
Chen Zhongjin (1):
ftrace: Fix invalid address access in lookup_rec() when index is 0
Darrick J. Wong (1):
ext4: fix another off-by-one fsmap error on 1k block filesystems
David Hildenbrand (2):
mm: optimize do_wp_page() for exclusive pages in the swapcache
mm: optimize do_wp_page() for fresh pages in local LRU pagevecs
Kuniyuki Iwashima (1):
seccomp: Move copy_seccomp() to no failure path.
Li Huafei (2):
livepatch: Cleanup klp_mem_prepare()
livepatch: Narrow the scope of the 'text_mutex' lock
Luke D. Jones (1):
HID: asus: Remove check for same LED brightness on set
Nicholas Piggin (1):
mm/vmalloc: huge vmalloc backing pages should be split rather than
compound
Pietro Borrello (2):
HID: asus: use spinlock to protect concurrent accesses
HID: asus: use spinlock to safely schedule workers
Xin Long (2):
tipc: set con sock in tipc_conn_alloc
tipc: add an extra conn_get in tipc_conn_alloc
Zheng Yejian (1):
livepatch/core: Fix hungtask against cpu hotplug on x86
Zhihao Cheng (1):
jbd2: fix data missing when reusing bh which is ready to be
checkpointed
arch/x86/kernel/livepatch.c | 11 +++++--
drivers/hid/hid-asus.c | 38 ++++++++++++++++++-----
drivers/media/usb/dvb-usb/az6027.c | 4 +++
fs/ext4/fsmap.c | 2 ++
fs/jbd2/transaction.c | 50 +++++++++++++++++-------------
kernel/fork.c | 17 ++++++----
kernel/livepatch/core.c | 49 ++++++++++++++++++++---------
kernel/trace/ftrace.c | 3 +-
mm/memory.c | 28 +++++++++++++----
mm/vmalloc.c | 22 ++++++++++---
net/tipc/topsrv.c | 20 ++++++------
11 files changed, 172 insertions(+), 72 deletions(-)
--
2.25.1

[PATCH openEuler-5.10-LTS-SP1 01/16] jbd2: fix data missing when reusing bh which is ready to be checkpointed
by Jialin Zhang 22 Mar '23
From: Zhihao Cheng <chengzhihao1(a)huawei.com>
mainline inclusion
from mainline-v6.3-rc1
commit e6b9bd7290d334451ce054e98e752abc055e0034
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6C5HV
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
The following sequence can lose data and could lead to filesystem
corruption:
1. jh(bh) is inserted into T1->t_checkpoint_list, bh is dirty, and
jh->b_transaction = NULL
2. T1 is added into journal->j_checkpoint_transactions.
3. Get bh prepared to write while doing checkpointing:
PA PB
do_get_write_access jbd2_log_do_checkpoint
spin_lock(&jh->b_state_lock)
if (buffer_dirty(bh))
clear_buffer_dirty(bh) // clear buffer dirty
set_buffer_jbddirty(bh)
transaction =
journal->j_checkpoint_transactions
jh = transaction->t_checkpoint_list
if (!buffer_dirty(bh))
__jbd2_journal_remove_checkpoint(jh)
// bh won't be flushed
jbd2_cleanup_journal_tail
__jbd2_journal_file_buffer(jh, transaction, BJ_Reserved)
4. Aborting the journal/power-cut before writing the latest bh to the journal area.
In this way we get a corrupted filesystem with the bh's data lost.
Fix it by moving the clearing of buffer_dirty bit just before the call
to __jbd2_journal_file_buffer(), both bit clearing and jh->b_transaction
assignment are under journal->j_list_lock locked, so that
jbd2_log_do_checkpoint() will wait until jh's new transaction has finished
even though bh is currently not dirty. And journal_shrink_one_cp_list() won't
remove jh from checkpoint list if the buffer head is reused in
do_get_write_access().
Fetch a reproducer in [Link].
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216898
Cc: <stable(a)kernel.org>
Signed-off-by: Zhihao Cheng <chengzhihao1(a)huawei.com>
Signed-off-by: zhanchengbin <zhanchengbin1(a)huawei.com>
Suggested-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Link: https://lore.kernel.org/r/20230110015327.1181863-1-chengzhihao1@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Reviewed-by: Yang Erkun <yangerkun(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
fs/jbd2/transaction.c | 50 +++++++++++++++++++++++++------------------
1 file changed, 29 insertions(+), 21 deletions(-)
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index cefee2dead54..8fa88c42fcb4 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -984,36 +984,28 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
* ie. locked but not dirty) or tune2fs (which may actually have
* the buffer dirtied, ugh.) */
- if (buffer_dirty(bh)) {
+ if (buffer_dirty(bh) && jh->b_transaction) {
+ warn_dirty_buffer(bh);
/*
- * First question: is this buffer already part of the current
- * transaction or the existing committing transaction?
- */
- if (jh->b_transaction) {
- J_ASSERT_JH(jh,
- jh->b_transaction == transaction ||
- jh->b_transaction ==
- journal->j_committing_transaction);
- if (jh->b_next_transaction)
- J_ASSERT_JH(jh, jh->b_next_transaction ==
- transaction);
- warn_dirty_buffer(bh);
- }
- /*
- * In any case we need to clean the dirty flag and we must
- * do it under the buffer lock to be sure we don't race
- * with running write-out.
+ * We need to clean the dirty flag and we must do it under the
+ * buffer lock to be sure we don't race with running write-out.
*/
JBUFFER_TRACE(jh, "Journalling dirty buffer");
clear_buffer_dirty(bh);
+ /*
+ * The buffer is going to be added to BJ_Reserved list now and
+ * nothing guarantees jbd2_journal_dirty_metadata() will be
+ * ever called for it. So we need to set jbddirty bit here to
+ * make sure the buffer is dirtied and written out when the
+ * journaling machinery is done with it.
+ */
set_buffer_jbddirty(bh);
}
- unlock_buffer(bh);
-
error = -EROFS;
if (is_handle_aborted(handle)) {
spin_unlock(&jh->b_state_lock);
+ unlock_buffer(bh);
goto out;
}
error = 0;
@@ -1023,8 +1015,10 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
* b_next_transaction points to it
*/
if (jh->b_transaction == transaction ||
- jh->b_next_transaction == transaction)
+ jh->b_next_transaction == transaction) {
+ unlock_buffer(bh);
goto done;
+ }
/*
* this is the first time this transaction is touching this buffer,
@@ -1048,10 +1042,24 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
*/
smp_wmb();
spin_lock(&journal->j_list_lock);
+ if (test_clear_buffer_dirty(bh)) {
+ /*
+ * Execute buffer dirty clearing and jh->b_transaction
+ * assignment under journal->j_list_lock locked to
+ * prevent bh being removed from checkpoint list if
+ * the buffer is in an intermediate state (not dirty
+ * and jh->b_transaction is NULL).
+ */
+ JBUFFER_TRACE(jh, "Journalling dirty buffer");
+ set_buffer_jbddirty(bh);
+ }
__jbd2_journal_file_buffer(jh, transaction, BJ_Reserved);
spin_unlock(&journal->j_list_lock);
+ unlock_buffer(bh);
goto done;
}
+ unlock_buffer(bh);
+
/*
* If there is already a copy-out version of this buffer, then we don't
* need to make another one
--
2.25.1

[PATCH openEuler-5.10-LTS-SP1 01/14] jbd2: fix data missing when reusing bh which is ready to be checkpointed
by Jialin Zhang 22 Mar '23
From: Zhihao Cheng <chengzhihao1(a)huawei.com>
mainline inclusion
from mainline-v6.3-rc1
commit e6b9bd7290d334451ce054e98e752abc055e0034
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6C5HV
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
The following sequence can lose data and could lead to filesystem
corruption:
1. jh(bh) is inserted into T1->t_checkpoint_list, bh is dirty, and
jh->b_transaction = NULL
2. T1 is added into journal->j_checkpoint_transactions.
3. Get bh prepared to write while doing checkpointing:
PA PB
do_get_write_access jbd2_log_do_checkpoint
spin_lock(&jh->b_state_lock)
if (buffer_dirty(bh))
clear_buffer_dirty(bh) // clear buffer dirty
set_buffer_jbddirty(bh)
transaction =
journal->j_checkpoint_transactions
jh = transaction->t_checkpoint_list
if (!buffer_dirty(bh))
__jbd2_journal_remove_checkpoint(jh)
// bh won't be flushed
jbd2_cleanup_journal_tail
__jbd2_journal_file_buffer(jh, transaction, BJ_Reserved)
4. Aborting the journal/power-cut before writing the latest bh to the journal area.
In this way we get a corrupted filesystem with the bh's data lost.
Fix it by moving the clearing of buffer_dirty bit just before the call
to __jbd2_journal_file_buffer(), both bit clearing and jh->b_transaction
assignment are under journal->j_list_lock locked, so that
jbd2_log_do_checkpoint() will wait until jh's new transaction has finished
even though bh is currently not dirty. And journal_shrink_one_cp_list() won't
remove jh from checkpoint list if the buffer head is reused in
do_get_write_access().
Fetch a reproducer in [Link].
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216898
Cc: <stable(a)kernel.org>
Signed-off-by: Zhihao Cheng <chengzhihao1(a)huawei.com>
Signed-off-by: zhanchengbin <zhanchengbin1(a)huawei.com>
Suggested-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Link: https://lore.kernel.org/r/20230110015327.1181863-1-chengzhihao1@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Reviewed-by: Yang Erkun <yangerkun(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
fs/jbd2/transaction.c | 50 +++++++++++++++++++++++++------------------
1 file changed, 29 insertions(+), 21 deletions(-)
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index cefee2dead54..8fa88c42fcb4 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -984,36 +984,28 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
* ie. locked but not dirty) or tune2fs (which may actually have
* the buffer dirtied, ugh.) */
- if (buffer_dirty(bh)) {
+ if (buffer_dirty(bh) && jh->b_transaction) {
+ warn_dirty_buffer(bh);
/*
- * First question: is this buffer already part of the current
- * transaction or the existing committing transaction?
- */
- if (jh->b_transaction) {
- J_ASSERT_JH(jh,
- jh->b_transaction == transaction ||
- jh->b_transaction ==
- journal->j_committing_transaction);
- if (jh->b_next_transaction)
- J_ASSERT_JH(jh, jh->b_next_transaction ==
- transaction);
- warn_dirty_buffer(bh);
- }
- /*
- * In any case we need to clean the dirty flag and we must
- * do it under the buffer lock to be sure we don't race
- * with running write-out.
+ * We need to clean the dirty flag and we must do it under the
+ * buffer lock to be sure we don't race with running write-out.
*/
JBUFFER_TRACE(jh, "Journalling dirty buffer");
clear_buffer_dirty(bh);
+ /*
+ * The buffer is going to be added to BJ_Reserved list now and
+ * nothing guarantees jbd2_journal_dirty_metadata() will be
+ * ever called for it. So we need to set jbddirty bit here to
+ * make sure the buffer is dirtied and written out when the
+ * journaling machinery is done with it.
+ */
set_buffer_jbddirty(bh);
}
- unlock_buffer(bh);
-
error = -EROFS;
if (is_handle_aborted(handle)) {
spin_unlock(&jh->b_state_lock);
+ unlock_buffer(bh);
goto out;
}
error = 0;
@@ -1023,8 +1015,10 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
* b_next_transaction points to it
*/
if (jh->b_transaction == transaction ||
- jh->b_next_transaction == transaction)
+ jh->b_next_transaction == transaction) {
+ unlock_buffer(bh);
goto done;
+ }
/*
* this is the first time this transaction is touching this buffer,
@@ -1048,10 +1042,24 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
*/
smp_wmb();
spin_lock(&journal->j_list_lock);
+ if (test_clear_buffer_dirty(bh)) {
+ /*
+ * Execute buffer dirty clearing and jh->b_transaction
+ * assignment under journal->j_list_lock locked to
+ * prevent bh being removed from checkpoint list if
+ * the buffer is in an intermediate state (not dirty
+ * and jh->b_transaction is NULL).
+ */
+ JBUFFER_TRACE(jh, "Journalling dirty buffer");
+ set_buffer_jbddirty(bh);
+ }
__jbd2_journal_file_buffer(jh, transaction, BJ_Reserved);
spin_unlock(&journal->j_list_lock);
+ unlock_buffer(bh);
goto done;
}
+ unlock_buffer(bh);
+
/*
* If there is already a copy-out version of this buffer, then we don't
* need to make another one
--
2.25.1
[PATCH openEuler-1.0-LTS] ext4: fix another off-by-one fsmap error on 1k block filesystems
by Yongqiang Liu 22 Mar '23
From: "Darrick J. Wong" <djwong(a)kernel.org>
mainline inclusion
from mainline-v6.3-rc2
commit c993799baf9c5861f8df91beb80e1611b12efcbd
category: bugfix
bugzilla: 188522,https://gitee.com/openeuler/kernel/issues/I6N7ZP
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Apparently syzbot figured out that issuing this FSMAP call:
	struct fsmap_head cmd = {
		.fmh_count = ...;
		.fmh_keys = {
			{ .fmr_device = /* ext4 dev */, .fmr_physical = 0, },
			{ .fmr_device = /* ext4 dev */, .fmr_physical = 0, },
		},
		...
	};
ret = ioctl(fd, FS_IOC_GETFSMAP, &cmd);
Produces this crash if the underlying filesystem is a 1k-block ext4
filesystem:
kernel BUG at fs/ext4/ext4.h:3331!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU: 3 PID: 3227965 Comm: xfs_io Tainted: G W O 6.2.0-rc8-achx
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
RIP: 0010:ext4_mb_load_buddy_gfp+0x47c/0x570 [ext4]
RSP: 0018:ffffc90007c03998 EFLAGS: 00010246
RAX: ffff888004978000 RBX: ffffc90007c03a20 RCX: ffff888041618000
RDX: 0000000000000000 RSI: 00000000000005a4 RDI: ffffffffa0c99b11
RBP: ffff888012330000 R08: ffffffffa0c2b7d0 R09: 0000000000000400
R10: ffffc90007c03950 R11: 0000000000000000 R12: 0000000000000001
R13: 00000000ffffffff R14: 0000000000000c40 R15: ffff88802678c398
FS: 00007fdf2020c880(0000) GS:ffff88807e100000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffd318a5fe8 CR3: 000000007f80f001 CR4: 00000000001706e0
Call Trace:
<TASK>
ext4_mballoc_query_range+0x4b/0x210 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80]
ext4_getfsmap_datadev+0x713/0x890 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80]
ext4_getfsmap+0x2b7/0x330 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80]
ext4_ioc_getfsmap+0x153/0x2b0 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80]
__ext4_ioctl+0x2a7/0x17e0 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80]
__x64_sys_ioctl+0x82/0xa0
do_syscall_64+0x2b/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7fdf20558aff
RSP: 002b:00007ffd318a9e30 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000000200c0 RCX: 00007fdf20558aff
RDX: 00007fdf1feb2010 RSI: 00000000c0c0583b RDI: 0000000000000003
RBP: 00005625c0634be0 R08: 00005625c0634c40 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fdf1feb2010
R13: 00005625be70d994 R14: 0000000000000800 R15: 0000000000000000
For GETFSMAP calls, the caller selects a physical block device by
writing its device number into fsmap_head.fmh_keys[01].fmr_device.
To query mappings for a subrange of the device, the starting byte of the
range is written to fsmap_head.fmh_keys[0].fmr_physical and the last
byte of the range goes in fsmap_head.fmh_keys[1].fmr_physical.
IOWs, to query what mappings overlap with bytes 3-14 of /dev/sda, you'd
set the inputs as follows:
	fmh_keys[0] = { .fmr_device = major(8, 0), .fmr_physical = 3},
	fmh_keys[1] = { .fmr_device = major(8, 0), .fmr_physical = 14},
Which would return you whatever is mapped in the 12 bytes starting at
physical offset 3.
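A minimal userspace sketch of a whole-device query (illustrative only:
it assumes linux/fsmap.h is available and trims error handling):

	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <fcntl.h>
	#include <unistd.h>
	#include <sys/ioctl.h>
	#include <linux/fsmap.h>

	int main(int argc, char **argv)
	{
		unsigned int i, nr = 128;
		struct fsmap_head *head;
		int fd;

		head = calloc(1, sizeof(*head) + nr * sizeof(struct fsmap));
		fd = open(argv[1], O_RDONLY);	/* any file on the fs */
		if (!head || fd < 0)
			return 1;

		/* keys[0] stays all-zeroes; all-ones keys[1] = "to the end" */
		head->fmh_count = nr;
		memset(&head->fmh_keys[1], 0xff, sizeof(head->fmh_keys[1]));

		if (ioctl(fd, FS_IOC_GETFSMAP, head) < 0)
			return 1;

		for (i = 0; i < head->fmh_entries; i++)
			printf("phys 0x%llx len 0x%llx\n",
			       (unsigned long long)head->fmh_recs[i].fmr_physical,
			       (unsigned long long)head->fmh_recs[i].fmr_length);
		return 0;
	}

Paging through a large filesystem would advance fmh_keys[0] past the
last returned record and repeat the ioctl.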
The crash is due to insufficient range validation of keys[1] in
ext4_getfsmap_datadev. On 1k-block filesystems, block 0 is not part of
the filesystem, which means that s_first_data_block is nonzero.
ext4_get_group_no_and_offset subtracts this quantity from the blocknr
argument before cracking it into a group number and a block number
within a group. IOWs, block group 0 spans blocks 1-8192 (1-based)
instead of 0-8191 (0-based) like what happens with larger blocksizes.
The net result of this encoding is that blocknr < s_first_data_block is
not a valid input to this function. The end_fsb variable is set from
the keys that are copied from userspace, which means that in the above
example, its value is zero. That leads to an underflow here:
	blocknr = blocknr - le32_to_cpu(es->s_first_data_block);
The division then operates on -1:
	offset = do_div(blocknr, EXT4_BLOCKS_PER_GROUP(sb)) >>
			EXT4_SB(sb)->s_cluster_bits;
Leaving an impossibly large group number (2^32-1) in blocknr.
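The wraparound is easy to reproduce in isolation; a tiny userspace
sketch with illustrative constants (not the kernel code):

	#include <stdio.h>
	#include <stdint.h>

	int main(void)
	{
		uint64_t blocknr = 0;		/* end_fsb taken from keys[1] */
		uint64_t first_data_block = 1;	/* nonzero on 1k-block ext4 */
		uint64_t blocks_per_group = 8192;

		blocknr -= first_data_block;	/* wraps to 0xffffffffffffffff */
		printf("group %llu\n",
		       (unsigned long long)(blocknr / blocks_per_group));
		return 0;
	}

which prints a group number no filesystem could ever contain.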
ext4_getfsmap_check_keys checked that keys[0].fmr_physical and
keys[1].fmr_physical are in increasing order, but
ext4_getfsmap_datadev adjusts keys[0].fmr_physical to be at least
s_first_data_block. This implies that we have to check it again after
the adjustment, which is the piece that I forgot.
Reported-by: syzbot+6be2b977c89f79b6b153(a)syzkaller.appspotmail.com
Fixes: 4a4956249dac ("ext4: fix off-by-one fsmap error on 1k block filesystems")
Link: https://syzkaller.appspot.com/bug?id=79d5768e9bfe362911ac1a5057a36fc6b5c300…
Cc: stable(a)vger.kernel.org
Signed-off-by: Darrick J. Wong <djwong(a)kernel.org>
Link: https://lore.kernel.org/r/Y+58NPTH7VNGgzdd@magnolia
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
fs/ext4/fsmap.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/ext4/fsmap.c b/fs/ext4/fsmap.c
index 6f3f245f3a80..6b52ace1463c 100644
--- a/fs/ext4/fsmap.c
+++ b/fs/ext4/fsmap.c
@@ -486,6 +486,8 @@ static int ext4_getfsmap_datadev(struct super_block *sb,
keys[0].fmr_physical = bofs;
if (keys[1].fmr_physical >= eofs)
keys[1].fmr_physical = eofs - 1;
+ if (keys[1].fmr_physical < keys[0].fmr_physical)
+ return 0;
start_fsb = keys[0].fmr_physical;
end_fsb = keys[1].fmr_physical;
--
2.25.1
Hello!
The Kernel SIG invites you to a Zoom meeting (auto-recorded) at
2023-03-24 14:00.
Subject: openEuler Kernel SIG biweekly meeting
Agenda:
1. Progress updates
2. Topic collection in progress — proposals are welcome (reply to this
mail with new topics, or add them to the meeting board)
Meeting link: https://us06web.zoom.us/j/85131900036?pwd=RFZVYlZFK3B1RVhpTHROOW82OTdLQT09
Meeting minutes and topics: https://etherpad.openeuler.org/p/Kernel-meetings
Note: You are advised to change your participant name after joining the
meeting; you may use your gitee.com ID.
More information: https://openeuler.org/zh/ | https://openeuler.org/en/
21 Mar '23
From: Xin Long <lucien.xin(a)gmail.com>
stable inclusion
from stable-v4.19.268
commit 2c9c64a95d97727c9ada0d35abc90ee5fdbaeff7
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6NCRH
CVE: CVE-2023-1382
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
[ Upstream commit 0e5d56c64afcd6fd2d132ea972605b66f8a7d3c4 ]
A crash was reported by Wei Chen:
BUG: kernel NULL pointer dereference, address: 0000000000000018
RIP: 0010:tipc_conn_close+0x12/0x100
Call Trace:
tipc_topsrv_exit_net+0x139/0x320
ops_exit_list.isra.9+0x49/0x80
cleanup_net+0x31a/0x540
process_one_work+0x3fa/0x9f0
worker_thread+0x42/0x5c0
It was caused by !con->sock in tipc_conn_close(). In tipc_topsrv_accept(),
con is allocated in conn_idr and only then is its sock set:

	con = tipc_conn_alloc();
	...                      <----[1]
	con->sock = newsock;

If tipc_conn_close() is called at any point during [1], the NULL pointer
dereference is triggered by con->sock->sk, because con->sock is not yet
set.
This patch fixes it by moving the con->sock assignment into
tipc_conn_alloc(), under s->idr_lock, so that con->sock can never be
NULL when the con is looked up in s->conn_idr. It is also safer to move
the con->server and CF_CONNECTED flag setting under s->idr_lock, as
they should all be set before tipc_conn_alloc() returns.
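Condensed from the diff below, the allocation path then publishes the
conn only once it is fully initialized:

	spin_lock_bh(&s->idr_lock);
	ret = idr_alloc(&s->conn_idr, con, 0, 0, GFP_ATOMIC);
	if (ret < 0) {
		kfree(con);
		spin_unlock_bh(&s->idr_lock);
		return ERR_PTR(-EDQUOT);
	}
	con->conid = ret;
	s->idr_in_use++;
	set_bit(CF_CONNECTED, &con->flags);
	con->server = s;
	con->sock = sock;	/* set before the lock is dropped */
	spin_unlock_bh(&s->idr_lock);

Any lookup in s->conn_idr that takes s->idr_lock therefore sees either
no conn at all or a complete one.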
Fixes: c5fa7b3cf3cb ("tipc: introduce new TIPC server infrastructure")
Reported-by: Wei Chen <harperchen1110(a)gmail.com>
Signed-off-by: Xin Long <lucien.xin(a)gmail.com>
Acked-by: Jon Maloy <jmaloy(a)redhat.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
conflict:
net/tipc/topsrv.c
Signed-off-by: Lu Wei <luwei32(a)huawei.com>
Reviewed-by: Liu Jian <liujian56(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
net/tipc/topsrv.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/net/tipc/topsrv.c b/net/tipc/topsrv.c
index 1c4733153d74..89a1f127dfaf 100644
--- a/net/tipc/topsrv.c
+++ b/net/tipc/topsrv.c
@@ -184,7 +184,7 @@ static void tipc_conn_close(struct tipc_conn *con)
conn_put(con);
}
-static struct tipc_conn *tipc_conn_alloc(struct tipc_topsrv *s)
+static struct tipc_conn *tipc_conn_alloc(struct tipc_topsrv *s, struct socket *sock)
{
struct tipc_conn *con;
int ret;
@@ -210,10 +210,11 @@ static struct tipc_conn *tipc_conn_alloc(struct tipc_topsrv *s)
}
con->conid = ret;
s->idr_in_use++;
- spin_unlock_bh(&s->idr_lock);
set_bit(CF_CONNECTED, &con->flags);
con->server = s;
+ con->sock = sock;
+ spin_unlock_bh(&s->idr_lock);
return con;
}
@@ -467,7 +468,7 @@ static void tipc_topsrv_accept(struct work_struct *work)
ret = kernel_accept(lsock, &newsock, O_NONBLOCK);
if (ret < 0)
return;
- con = tipc_conn_alloc(srv);
+ con = tipc_conn_alloc(srv, newsock);
if (IS_ERR(con)) {
ret = PTR_ERR(con);
sock_release(newsock);
@@ -479,7 +480,6 @@ static void tipc_topsrv_accept(struct work_struct *work)
newsk->sk_data_ready = tipc_conn_data_ready;
newsk->sk_write_space = tipc_conn_write_space;
newsk->sk_user_data = con;
- con->sock = newsock;
write_unlock_bh(&newsk->sk_callback_lock);
/* Wake up receive process in case of 'SYN+' message */
@@ -577,12 +577,11 @@ bool tipc_topsrv_kern_subscr(struct net *net, u32 port, u32 type, u32 lower,
sub.filter = filter;
*(u32 *)&sub.usr_handle = port;
- con = tipc_conn_alloc(tipc_topsrv(net));
+ con = tipc_conn_alloc(tipc_topsrv(net), NULL);
if (IS_ERR(con))
return false;
*conid = con->conid;
- con->sock = NULL;
rc = tipc_conn_rcv_sub(tipc_topsrv(net), con, &sub);
if (rc >= 0)
return true;
--
2.25.1
[PATCH PR openEuler-22.03-LTS-SP1] mm/vmalloc: huge vmalloc backing pages should be split rather than compound
by Jialin Zhang 21 Mar '23
From: Nicholas Piggin <npiggin(a)gmail.com>
mainline inclusion
from mainline-v5.18-rc4
commit 3b8000ae185cb068adbda5f966a3835053c85fd4
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6LD0S
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Huge vmalloc higher-order backing pages were allocated with __GFP_COMP
in order to allow the sub-pages to be refcounted by callers such as
"remap_vmalloc_page [sic]" (remap_vmalloc_range).
However a similar problem exists for other struct page fields callers
use, for example fb_deferred_io_fault() takes a vmalloc'ed page and
not only refcounts it but uses ->lru, ->mapping, ->index.
This is not compatible with compound sub-pages, and can cause bad page
state issues like
BUG: Bad page state in process swapper/0 pfn:00743
page:(____ptrval____) refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x743
flags: 0x7ffff000000000(node=0|zone=0|lastcpupid=0x7ffff)
raw: 007ffff000000000 c00c00000001d0c8 c00c00000001d0c8 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: corrupted mapping in tail page
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc3-00082-gfc6fff4a7ce1-dirty #2810
Call Trace:
dump_stack_lvl+0x74/0xa8 (unreliable)
bad_page+0x12c/0x170
free_tail_pages_check+0xe8/0x190
free_pcp_prepare+0x31c/0x4e0
free_unref_page+0x40/0x1b0
__vunmap+0x1d8/0x420
...
The correct approach is to use split high-order pages for the huge
vmalloc backing. These allow callers to treat them in exactly the same
way as individually-allocated order-0 pages.
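A minimal sketch of the split-page pattern (assumed order and GFP
flags; not the vmalloc code itself):

	/* an order-2 block becomes four independent order-0 pages */
	struct page *page = alloc_pages(GFP_KERNEL, 2);

	if (page) {
		int i;

		split_page(page, 2);
		/* each sub-page now has its own refcount and usable
		 * ->lru/->mapping/->index, and is freed on its own */
		for (i = 0; i < 4; i++)
			__free_page(page + i);
	}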
Link: https://lore.kernel.org/all/14444103-d51b-0fb3-ee63-c3f182f0b546@molgen.mpg…
Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com>
Cc: Paul Menzel <pmenzel(a)molgen.mpg.de>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: Rick Edgecombe <rick.p.edgecombe(a)intel.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
conflicts:
mm/vmalloc.c
Signed-off-by: ZhangPeng <zhangpeng362(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
mm/vmalloc.c | 22 +++++++++++++++++-----
1 file changed, 17 insertions(+), 5 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e27cd716ca95..2ca2c1bc0db9 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2641,14 +2641,17 @@ static void __vunmap(const void *addr, int deallocate_pages)
vm_remove_mappings(area, deallocate_pages);
if (deallocate_pages) {
- unsigned int page_order = vm_area_page_order(area);
int i;
- for (i = 0; i < area->nr_pages; i += 1U << page_order) {
+ for (i = 0; i < area->nr_pages; i++) {
struct page *page = area->pages[i];
BUG_ON(!page);
- __free_pages(page, page_order);
+ /*
+ * High-order allocs for huge vmallocs are split, so
+ * can be freed as an array of order-0 allocations
+ */
+ __free_pages(page, 0);
}
atomic_long_sub(area->nr_pages, &nr_vmalloc_pages);
@@ -2930,8 +2933,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
struct page *page;
int p;
- /* Compound pages required for remap_vmalloc_page */
- page = alloc_pages_node(node, gfp_mask | __GFP_COMP, page_order);
+ page = alloc_pages_node(node, gfp_mask, page_order);
if (unlikely(!page)) {
/* Successfully allocated i pages, free them in __vfree() */
area->nr_pages = i;
@@ -2943,6 +2945,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
goto fail;
}
+ /*
+ * Higher order allocations must be able to be treated as
+ * independent small pages by callers (as they can with
+ * small-page vmallocs). Some drivers do their own refcounting
+ * on vmalloc_to_page() pages, some use page->mapping,
+ * page->lru, etc.
+ */
+ if (page_order)
+ split_page(page, page_order);
+
for (p = 0; p < (1U << page_order); p++)
area->pages[i + p] = page + p;
--
2.25.1
[PATCH openEuler-23.03] Fix CVE-2023-23005, CVE-2023-0597 and CVE-2022-4269
by Jialin Zhang 18 Mar '23
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
...NULL-vs-IS_ERR-checking-in-memory_ti.patch | 50 +++++
...-x86-mm-Randomize-per-cpu-entry-area.patch | 172 ++++++++++++++++++
...ap-shadow-for-percpu-pages-on-demand.patch | 126 +++++++++++++
...-physical-address-for-every-page-of-.patch | 50 +++++
...KASAN-shadow-for-entire-per-CPU-rang.patch | 121 ++++++++++++
...-local-CPU_ENTRY_AREA-variables-to-s.patch | 88 +++++++++
...lpers-to-align-shadow-addresses-up-a.patch | 113 ++++++++++++
...te-shadow-for-shared-chunk-of-the-CP.patch | 99 ++++++++++
...rred-better-wording-on-protection-ag.patch | 97 ++++++++++
...he-backlog-for-nested-calls-to-mirre.patch | 149 +++++++++++++++
kernel.spec | 27 ++-
11 files changed, 1090 insertions(+), 2 deletions(-)
create mode 100644 0015-mm-demotion-fix-NULL-vs-IS_ERR-checking-in-memory_ti.patch
create mode 100644 0016-x86-mm-Randomize-per-cpu-entry-area.patch
create mode 100644 0017-x86-kasan-Map-shadow-for-percpu-pages-on-demand.patch
create mode 100644 0018-x86-mm-Recompute-physical-address-for-every-page-of-.patch
create mode 100644 0019-x86-mm-Populate-KASAN-shadow-for-entire-per-CPU-rang.patch
create mode 100644 0020-x86-kasan-Rename-local-CPU_ENTRY_AREA-variables-to-s.patch
create mode 100644 0021-x86-kasan-Add-helpers-to-align-shadow-addresses-up-a.patch
create mode 100644 0022-x86-kasan-Populate-shadow-for-shared-chunk-of-the-CP.patch
create mode 100644 0023-net-sched-act_mirred-better-wording-on-protection-ag.patch
create mode 100644 0024-act_mirred-use-the-backlog-for-nested-calls-to-mirre.patch
diff --git a/0015-mm-demotion-fix-NULL-vs-IS_ERR-checking-in-memory_ti.patch b/0015-mm-demotion-fix-NULL-vs-IS_ERR-checking-in-memory_ti.patch
new file mode 100644
index 0000000..f598fc5
--- /dev/null
+++ b/0015-mm-demotion-fix-NULL-vs-IS_ERR-checking-in-memory_ti.patch
@@ -0,0 +1,50 @@
+From e11c121e73d4e98ed13259d6b19830f33ca60d76 Mon Sep 17 00:00:00 2001
+From: Miaoqian Lin <linmq006(a)gmail.com>
+Date: Fri, 17 Mar 2023 10:05:08 +0800
+Subject: [PATCH 15/24] mm/demotion: fix NULL vs IS_ERR checking in
+ memory_tier_init
+
+mainline inclusion
+from mainline-v6.2-rc1
+commit 4a625ceee8a0ab0273534cb6b432ce6b331db5ee
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6IXO8
+CVE: CVE-2023-23005
+
+Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
+
+--------------------------------
+
+alloc_memory_type() returns error pointers on error instead of NULL. Use
+IS_ERR() to check the return value to fix this.
+
+Link: https://lkml.kernel.org/r/20221110030751.1627266-1-linmq006@gmail.com
+Fixes: 7b88bda3761b ("mm/demotion/dax/kmem: set node's abstract distance to MEMTIER_DEFAULT_DAX_ADISTANCE")
+Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
+Reviewed-by: "Huang, Ying" <ying.huang(a)intel.com>
+Cc: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
+Cc: Wei Xu <weixugc(a)google.com>
+Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
+Signed-off-by: Ma Wupeng <mawupeng1(a)huawei.com>
+Reviewed-by: tong tiangen <tongtiangen(a)huawei.com>
+Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
+---
+ mm/memory-tiers.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/mm/memory-tiers.c b/mm/memory-tiers.c
+index ba863f46759d..96022973c9ba 100644
+--- a/mm/memory-tiers.c
++++ b/mm/memory-tiers.c
+@@ -645,7 +645,7 @@ static int __init memory_tier_init(void)
+ * than default DRAM tier.
+ */
+ default_dram_type = alloc_memory_type(MEMTIER_ADISTANCE_DRAM);
+- if (!default_dram_type)
++ if (IS_ERR(default_dram_type))
+ panic("%s() failed to allocate default DRAM tier\n", __func__);
+
+ /*
+--
+2.25.1
+
diff --git a/0016-x86-mm-Randomize-per-cpu-entry-area.patch b/0016-x86-mm-Randomize-per-cpu-entry-area.patch
new file mode 100644
index 0000000..6f5dd2d
--- /dev/null
+++ b/0016-x86-mm-Randomize-per-cpu-entry-area.patch
@@ -0,0 +1,172 @@
+From 0324d3cd1b57c06b0cf31b6db643ced5b29b0947 Mon Sep 17 00:00:00 2001
+From: Peter Zijlstra <peterz(a)infradead.org>
+Date: Fri, 17 Mar 2023 03:07:41 +0000
+Subject: [PATCH 16/24] x86/mm: Randomize per-cpu entry area
+
+mainline inclusion
+from mainline-v6.2-rc1
+commit 97e3d26b5e5f371b3ee223d94dd123e6c442ba80
+category: bugfix
+bugzilla: 188336
+CVE: CVE-2023-0597
+
+Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
+
+--------------------------------
+
+Seth found that the CPU-entry-area; the piece of per-cpu data that is
+mapped into the userspace page-tables for kPTI is not subject to any
+randomization -- irrespective of kASLR settings.
+
+On x86_64 a whole P4D (512 GB) of virtual address space is reserved for
+this structure, which is plenty large enough to randomize things a
+little.
+
+As such, use a straight forward randomization scheme that avoids
+duplicates to spread the existing CPUs over the available space.
+
+ [ bp: Fix le build. ]
+
+Reported-by: Seth Jenkins <sethjenkins(a)google.com>
+Reviewed-by: Kees Cook <keescook(a)chromium.org>
+Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
+Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
+Signed-off-by: Borislav Petkov <bp(a)suse.de>
+Signed-off-by: Tong Tiangen <tongtiangen(a)huawei.com>
+Reviewed-by: Nanyong Sun <sunnanyong(a)huawei.com>
+Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
+Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
+---
+ arch/x86/include/asm/cpu_entry_area.h | 4 ---
+ arch/x86/include/asm/pgtable_areas.h | 8 ++++-
+ arch/x86/kernel/hw_breakpoint.c | 2 +-
+ arch/x86/mm/cpu_entry_area.c | 46 ++++++++++++++++++++++++---
+ 4 files changed, 50 insertions(+), 10 deletions(-)
+
+diff --git a/arch/x86/include/asm/cpu_entry_area.h b/arch/x86/include/asm/cpu_entry_area.h
+index 75efc4c6f076..462fc34f1317 100644
+--- a/arch/x86/include/asm/cpu_entry_area.h
++++ b/arch/x86/include/asm/cpu_entry_area.h
+@@ -130,10 +130,6 @@ struct cpu_entry_area {
+ };
+
+ #define CPU_ENTRY_AREA_SIZE (sizeof(struct cpu_entry_area))
+-#define CPU_ENTRY_AREA_ARRAY_SIZE (CPU_ENTRY_AREA_SIZE * NR_CPUS)
+-
+-/* Total size includes the readonly IDT mapping page as well: */
+-#define CPU_ENTRY_AREA_TOTAL_SIZE (CPU_ENTRY_AREA_ARRAY_SIZE + PAGE_SIZE)
+
+ DECLARE_PER_CPU(struct cpu_entry_area *, cpu_entry_area);
+ DECLARE_PER_CPU(struct cea_exception_stacks *, cea_exception_stacks);
+diff --git a/arch/x86/include/asm/pgtable_areas.h b/arch/x86/include/asm/pgtable_areas.h
+index d34cce1b995c..4f056fb88174 100644
+--- a/arch/x86/include/asm/pgtable_areas.h
++++ b/arch/x86/include/asm/pgtable_areas.h
+@@ -11,6 +11,12 @@
+
+ #define CPU_ENTRY_AREA_RO_IDT_VADDR ((void *)CPU_ENTRY_AREA_RO_IDT)
+
+-#define CPU_ENTRY_AREA_MAP_SIZE (CPU_ENTRY_AREA_PER_CPU + CPU_ENTRY_AREA_ARRAY_SIZE - CPU_ENTRY_AREA_BASE)
++#ifdef CONFIG_X86_32
++#define CPU_ENTRY_AREA_MAP_SIZE (CPU_ENTRY_AREA_PER_CPU + \
++ (CPU_ENTRY_AREA_SIZE * NR_CPUS) - \
++ CPU_ENTRY_AREA_BASE)
++#else
++#define CPU_ENTRY_AREA_MAP_SIZE P4D_SIZE
++#endif
+
+ #endif /* _ASM_X86_PGTABLE_AREAS_H */
+diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
+index 668a4a6533d9..bbb0f737aab1 100644
+--- a/arch/x86/kernel/hw_breakpoint.c
++++ b/arch/x86/kernel/hw_breakpoint.c
+@@ -266,7 +266,7 @@ static inline bool within_cpu_entry(unsigned long addr, unsigned long end)
+
+ /* CPU entry erea is always used for CPU entry */
+ if (within_area(addr, end, CPU_ENTRY_AREA_BASE,
+- CPU_ENTRY_AREA_TOTAL_SIZE))
++ CPU_ENTRY_AREA_MAP_SIZE))
+ return true;
+
+ /*
+diff --git a/arch/x86/mm/cpu_entry_area.c b/arch/x86/mm/cpu_entry_area.c
+index 6c2f1b76a0b6..20844cf141fb 100644
+--- a/arch/x86/mm/cpu_entry_area.c
++++ b/arch/x86/mm/cpu_entry_area.c
+@@ -15,16 +15,53 @@ static DEFINE_PER_CPU_PAGE_ALIGNED(struct entry_stack_page, entry_stack_storage)
+ #ifdef CONFIG_X86_64
+ static DEFINE_PER_CPU_PAGE_ALIGNED(struct exception_stacks, exception_stacks);
+ DEFINE_PER_CPU(struct cea_exception_stacks*, cea_exception_stacks);
+-#endif
+
+-#ifdef CONFIG_X86_32
++static DEFINE_PER_CPU_READ_MOSTLY(unsigned long, _cea_offset);
++
++static __always_inline unsigned int cea_offset(unsigned int cpu)
++{
++ return per_cpu(_cea_offset, cpu);
++}
++
++static __init void init_cea_offsets(void)
++{
++ unsigned int max_cea;
++ unsigned int i, j;
++
++ max_cea = (CPU_ENTRY_AREA_MAP_SIZE - PAGE_SIZE) / CPU_ENTRY_AREA_SIZE;
++
++ /* O(sodding terrible) */
++ for_each_possible_cpu(i) {
++ unsigned int cea;
++
++again:
++ cea = prandom_u32_max(max_cea);
++
++ for_each_possible_cpu(j) {
++ if (cea_offset(j) == cea)
++ goto again;
++
++ if (i == j)
++ break;
++ }
++
++ per_cpu(_cea_offset, i) = cea;
++ }
++}
++#else /* !X86_64 */
+ DECLARE_PER_CPU_PAGE_ALIGNED(struct doublefault_stack, doublefault_stack);
++
++static __always_inline unsigned int cea_offset(unsigned int cpu)
++{
++ return cpu;
++}
++static inline void init_cea_offsets(void) { }
+ #endif
+
+ /* Is called from entry code, so must be noinstr */
+ noinstr struct cpu_entry_area *get_cpu_entry_area(int cpu)
+ {
+- unsigned long va = CPU_ENTRY_AREA_PER_CPU + cpu * CPU_ENTRY_AREA_SIZE;
++ unsigned long va = CPU_ENTRY_AREA_PER_CPU + cea_offset(cpu) * CPU_ENTRY_AREA_SIZE;
+ BUILD_BUG_ON(sizeof(struct cpu_entry_area) % PAGE_SIZE != 0);
+
+ return (struct cpu_entry_area *) va;
+@@ -205,7 +242,6 @@ static __init void setup_cpu_entry_area_ptes(void)
+
+ /* The +1 is for the readonly IDT: */
+ BUILD_BUG_ON((CPU_ENTRY_AREA_PAGES+1)*PAGE_SIZE != CPU_ENTRY_AREA_MAP_SIZE);
+- BUILD_BUG_ON(CPU_ENTRY_AREA_TOTAL_SIZE != CPU_ENTRY_AREA_MAP_SIZE);
+ BUG_ON(CPU_ENTRY_AREA_BASE & ~PMD_MASK);
+
+ start = CPU_ENTRY_AREA_BASE;
+@@ -221,6 +257,8 @@ void __init setup_cpu_entry_areas(void)
+ {
+ unsigned int cpu;
+
++ init_cea_offsets();
++
+ setup_cpu_entry_area_ptes();
+
+ for_each_possible_cpu(cpu)
+--
+2.25.1
+
diff --git a/0017-x86-kasan-Map-shadow-for-percpu-pages-on-demand.patch b/0017-x86-kasan-Map-shadow-for-percpu-pages-on-demand.patch
new file mode 100644
index 0000000..b9129b6
--- /dev/null
+++ b/0017-x86-kasan-Map-shadow-for-percpu-pages-on-demand.patch
@@ -0,0 +1,126 @@
+From 68992563c4b6b1776bd90dafe76caa88ff6dbfe8 Mon Sep 17 00:00:00 2001
+From: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
+Date: Fri, 17 Mar 2023 03:07:42 +0000
+Subject: [PATCH 17/24] x86/kasan: Map shadow for percpu pages on demand
+
+mainline inclusion
+from mainline-v6.2-rc1
+commit 3f148f3318140035e87decc1214795ff0755757b
+category: bugfix
+bugzilla: 188336
+CVE: CVE-2023-0597
+
+Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
+
+--------------------------------
+
+KASAN maps shadow for the entire CPU-entry-area:
+ [CPU_ENTRY_AREA_BASE, CPU_ENTRY_AREA_BASE + CPU_ENTRY_AREA_MAP_SIZE]
+
+This will explode once the per-cpu entry areas are randomized since it
+will increase CPU_ENTRY_AREA_MAP_SIZE to 512 GB and KASAN fails to
+allocate shadow for such big area.
+
+Fix this by allocating KASAN shadow only for really used cpu entry area
+addresses mapped by cea_map_percpu_pages()
+
+Thanks to the 0day folks for finding and reporting this to be an issue.
+
+[ dhansen: tweak changelog since this will get committed before peterz's
+ actual cpu-entry-area randomization ]
+
+Signed-off-by: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
+Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
+Tested-by: Yujie Liu <yujie.liu(a)intel.com>
+Cc: kernel test robot <yujie.liu(a)intel.com>
+Link: https://lore.kernel.org/r/202210241508.2e203c3d-yujie.liu@intel.com
+Signed-off-by: Tong Tiangen <tongtiangen(a)huawei.com>
+Reviewed-by: Nanyong Sun <sunnanyong(a)huawei.com>
+Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
+Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
+---
+ arch/x86/include/asm/kasan.h | 3 +++
+ arch/x86/mm/cpu_entry_area.c | 8 +++++++-
+ arch/x86/mm/kasan_init_64.c | 15 ++++++++++++---
+ 3 files changed, 22 insertions(+), 4 deletions(-)
+
+diff --git a/arch/x86/include/asm/kasan.h b/arch/x86/include/asm/kasan.h
+index 13e70da38bed..de75306b932e 100644
+--- a/arch/x86/include/asm/kasan.h
++++ b/arch/x86/include/asm/kasan.h
+@@ -28,9 +28,12 @@
+ #ifdef CONFIG_KASAN
+ void __init kasan_early_init(void);
+ void __init kasan_init(void);
++void __init kasan_populate_shadow_for_vaddr(void *va, size_t size, int nid);
+ #else
+ static inline void kasan_early_init(void) { }
+ static inline void kasan_init(void) { }
++static inline void kasan_populate_shadow_for_vaddr(void *va, size_t size,
++ int nid) { }
+ #endif
+
+ #endif
+diff --git a/arch/x86/mm/cpu_entry_area.c b/arch/x86/mm/cpu_entry_area.c
+index 20844cf141fb..dff9001e5e12 100644
+--- a/arch/x86/mm/cpu_entry_area.c
++++ b/arch/x86/mm/cpu_entry_area.c
+@@ -9,6 +9,7 @@
+ #include <asm/cpu_entry_area.h>
+ #include <asm/fixmap.h>
+ #include <asm/desc.h>
++#include <asm/kasan.h>
+
+ static DEFINE_PER_CPU_PAGE_ALIGNED(struct entry_stack_page, entry_stack_storage);
+
+@@ -90,8 +91,13 @@ void cea_set_pte(void *cea_vaddr, phys_addr_t pa, pgprot_t flags)
+ static void __init
+ cea_map_percpu_pages(void *cea_vaddr, void *ptr, int pages, pgprot_t prot)
+ {
++ phys_addr_t pa = per_cpu_ptr_to_phys(ptr);
++
++ kasan_populate_shadow_for_vaddr(cea_vaddr, pages * PAGE_SIZE,
++ early_pfn_to_nid(PFN_DOWN(pa)));
++
+ for ( ; pages; pages--, cea_vaddr+= PAGE_SIZE, ptr += PAGE_SIZE)
+- cea_set_pte(cea_vaddr, per_cpu_ptr_to_phys(ptr), prot);
++ cea_set_pte(cea_vaddr, pa, prot);
+ }
+
+ static void __init percpu_setup_debug_store(unsigned int cpu)
+diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
+index e7b9b464a82f..d1416926ad52 100644
+--- a/arch/x86/mm/kasan_init_64.c
++++ b/arch/x86/mm/kasan_init_64.c
+@@ -316,6 +316,18 @@ void __init kasan_early_init(void)
+ kasan_map_early_shadow(init_top_pgt);
+ }
+
++void __init kasan_populate_shadow_for_vaddr(void *va, size_t size, int nid)
++{
++ unsigned long shadow_start, shadow_end;
++
++ shadow_start = (unsigned long)kasan_mem_to_shadow(va);
++ shadow_start = round_down(shadow_start, PAGE_SIZE);
++ shadow_end = (unsigned long)kasan_mem_to_shadow(va + size);
++ shadow_end = round_up(shadow_end, PAGE_SIZE);
++
++ kasan_populate_shadow(shadow_start, shadow_end, nid);
++}
++
+ void __init kasan_init(void)
+ {
+ int i;
+@@ -393,9 +405,6 @@ void __init kasan_init(void)
+ kasan_mem_to_shadow((void *)VMALLOC_END + 1),
+ shadow_cpu_entry_begin);
+
+- kasan_populate_shadow((unsigned long)shadow_cpu_entry_begin,
+- (unsigned long)shadow_cpu_entry_end, 0);
+-
+ kasan_populate_early_shadow(shadow_cpu_entry_end,
+ kasan_mem_to_shadow((void *)__START_KERNEL_map));
+
+--
+2.25.1
+
diff --git a/0018-x86-mm-Recompute-physical-address-for-every-page-of-.patch b/0018-x86-mm-Recompute-physical-address-for-every-page-of-.patch
new file mode 100644
index 0000000..2696052
--- /dev/null
+++ b/0018-x86-mm-Recompute-physical-address-for-every-page-of-.patch
@@ -0,0 +1,50 @@
+From 12867b242d6e431f6f947e53abd1094cd0075b55 Mon Sep 17 00:00:00 2001
+From: Sean Christopherson <seanjc(a)google.com>
+Date: Fri, 17 Mar 2023 03:07:43 +0000
+Subject: [PATCH 18/24] x86/mm: Recompute physical address for every page of
+ per-CPU CEA mapping
+
+mainline inclusion
+from mainline-v6.2-rc1
+commit 80d72a8f76e8f3f0b5a70b8c7022578e17bde8e7
+category: bugfix
+bugzilla: 188336
+CVE: CVE-2023-0597
+
+Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
+
+--------------------------------
+
+Recompute the physical address for each per-CPU page in the CPU entry
+area, a recent commit inadvertently modified cea_map_percpu_pages() such
+that every PTE is mapped to the physical address of the first page.
+
+Fixes: 9fd429c28073 ("x86/kasan: Map shadow for percpu pages on demand")
+Signed-off-by: Sean Christopherson <seanjc(a)google.com>
+Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
+Reviewed-by: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
+Link: https://lkml.kernel.org/r/20221110203504.1985010-2-seanjc@google.com
+Signed-off-by: Tong Tiangen <tongtiangen(a)huawei.com>
+Reviewed-by: Nanyong Sun <sunnanyong(a)huawei.com>
+Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
+Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
+---
+ arch/x86/mm/cpu_entry_area.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/arch/x86/mm/cpu_entry_area.c b/arch/x86/mm/cpu_entry_area.c
+index dff9001e5e12..d831aae94b41 100644
+--- a/arch/x86/mm/cpu_entry_area.c
++++ b/arch/x86/mm/cpu_entry_area.c
+@@ -97,7 +97,7 @@ cea_map_percpu_pages(void *cea_vaddr, void *ptr, int pages, pgprot_t prot)
+ early_pfn_to_nid(PFN_DOWN(pa)));
+
+ for ( ; pages; pages--, cea_vaddr+= PAGE_SIZE, ptr += PAGE_SIZE)
+- cea_set_pte(cea_vaddr, pa, prot);
++ cea_set_pte(cea_vaddr, per_cpu_ptr_to_phys(ptr), prot);
+ }
+
+ static void __init percpu_setup_debug_store(unsigned int cpu)
+--
+2.25.1
+
diff --git a/0019-x86-mm-Populate-KASAN-shadow-for-entire-per-CPU-rang.patch b/0019-x86-mm-Populate-KASAN-shadow-for-entire-per-CPU-rang.patch
new file mode 100644
index 0000000..8fa96dd
--- /dev/null
+++ b/0019-x86-mm-Populate-KASAN-shadow-for-entire-per-CPU-rang.patch
@@ -0,0 +1,121 @@
+From 91bb861cfc95653af4223af2e00b9e637c501d5a Mon Sep 17 00:00:00 2001
+From: Sean Christopherson <seanjc(a)google.com>
+Date: Fri, 17 Mar 2023 03:07:44 +0000
+Subject: [PATCH 19/24] x86/mm: Populate KASAN shadow for entire per-CPU range
+ of CPU entry area
+
+mainline inclusion
+from mainline-v6.2-rc1
+commit 97650148a15e0b30099d6175ffe278b9f55ec66a
+category: bugfix
+bugzilla: 188336
+CVE: CVE-2023-0597
+
+Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
+
+--------------------------------
+
+Populate a KASAN shadow for the entire possible per-CPU range of the CPU
+entry area instead of requiring that each individual chunk map a shadow.
+Mapping shadows individually is error prone, e.g. the per-CPU GDT mapping
+was left behind, which can lead to not-present page faults during KASAN
+validation if the kernel performs a software lookup into the GDT. The DS
+buffer is also likely affected.
+
+The motivation for mapping the per-CPU areas on-demand was to avoid
+mapping the entire 512GiB range that's reserved for the CPU entry area,
+shaving a few bytes by not creating shadows for potentially unused memory
+was not a goal.
+
+The bug is most easily reproduced by doing a sigreturn with a garbage
+CS in the sigcontext, e.g.
+
+ int main(void)
+ {
+ struct sigcontext regs;
+
+ syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
+ syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
+ syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
+
+ memset(&regs, 0, sizeof(regs));
+ regs.cs = 0x1d0;
+ syscall(__NR_rt_sigreturn);
+ return 0;
+ }
+
+to coerce the kernel into doing a GDT lookup to compute CS.base when
+reading the instruction bytes on the subsequent #GP to determine whether
+or not the #GP is something the kernel should handle, e.g. to fixup UMIP
+violations or to emulate CLI/STI for IOPL=3 applications.
+
+ BUG: unable to handle page fault for address: fffffbc8379ace00
+ #PF: supervisor read access in kernel mode
+ #PF: error_code(0x0000) - not-present page
+ PGD 16c03a067 P4D 16c03a067 PUD 15b990067 PMD 15b98f067 PTE 0
+ Oops: 0000 [#1] PREEMPT SMP KASAN
+ CPU: 3 PID: 851 Comm: r2 Not tainted 6.1.0-rc3-next-20221103+ #432
+ Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
+ RIP: 0010:kasan_check_range+0xdf/0x190
+ Call Trace:
+ <TASK>
+ get_desc+0xb0/0x1d0
+ insn_get_seg_base+0x104/0x270
+ insn_fetch_from_user+0x66/0x80
+ fixup_umip_exception+0xb1/0x530
+ exc_general_protection+0x181/0x210
+ asm_exc_general_protection+0x22/0x30
+ RIP: 0003:0x0
+ Code: Unable to access opcode bytes at 0xffffffffffffffd6.
+ RSP: 0003:0000000000000000 EFLAGS: 00000202
+ RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000000001d0
+ RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
+ RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
+ R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
+ R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
+ </TASK>
+
+Fixes: 9fd429c28073 ("x86/kasan: Map shadow for percpu pages on demand")
+Reported-by: syzbot+ffb4f000dc2872c93f62(a)syzkaller.appspotmail.com
+Suggested-by: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
+Signed-off-by: Sean Christopherson <seanjc(a)google.com>
+Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
+Reviewed-by: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
+Link: https://lkml.kernel.org/r/20221110203504.1985010-3-seanjc@google.com
+Signed-off-by: Tong Tiangen <tongtiangen(a)huawei.com>
+Reviewed-by: Nanyong Sun <sunnanyong(a)huawei.com>
+Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
+Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
+---
+ arch/x86/mm/cpu_entry_area.c | 8 +++-----
+ 1 file changed, 3 insertions(+), 5 deletions(-)
+
+diff --git a/arch/x86/mm/cpu_entry_area.c b/arch/x86/mm/cpu_entry_area.c
+index d831aae94b41..7c855dffcdc2 100644
+--- a/arch/x86/mm/cpu_entry_area.c
++++ b/arch/x86/mm/cpu_entry_area.c
+@@ -91,11 +91,6 @@ void cea_set_pte(void *cea_vaddr, phys_addr_t pa, pgprot_t flags)
+ static void __init
+ cea_map_percpu_pages(void *cea_vaddr, void *ptr, int pages, pgprot_t prot)
+ {
+- phys_addr_t pa = per_cpu_ptr_to_phys(ptr);
+-
+- kasan_populate_shadow_for_vaddr(cea_vaddr, pages * PAGE_SIZE,
+- early_pfn_to_nid(PFN_DOWN(pa)));
+-
+ for ( ; pages; pages--, cea_vaddr+= PAGE_SIZE, ptr += PAGE_SIZE)
+ cea_set_pte(cea_vaddr, per_cpu_ptr_to_phys(ptr), prot);
+ }
+@@ -195,6 +190,9 @@ static void __init setup_cpu_entry_area(unsigned int cpu)
+ pgprot_t tss_prot = PAGE_KERNEL;
+ #endif
+
++ kasan_populate_shadow_for_vaddr(cea, CPU_ENTRY_AREA_SIZE,
++ early_cpu_to_node(cpu));
++
+ cea_set_pte(&cea->gdt, get_cpu_gdt_paddr(cpu), gdt_prot);
+
+ cea_map_percpu_pages(&cea->entry_stack_page,
+--
+2.25.1
+
diff --git a/0020-x86-kasan-Rename-local-CPU_ENTRY_AREA-variables-to-s.patch b/0020-x86-kasan-Rename-local-CPU_ENTRY_AREA-variables-to-s.patch
new file mode 100644
index 0000000..e4f9f63
--- /dev/null
+++ b/0020-x86-kasan-Rename-local-CPU_ENTRY_AREA-variables-to-s.patch
@@ -0,0 +1,88 @@
+From 0560fceb4d3c76133f1a89decbf1c3334afdbd00 Mon Sep 17 00:00:00 2001
+From: Sean Christopherson <seanjc(a)google.com>
+Date: Fri, 17 Mar 2023 03:07:45 +0000
+Subject: [PATCH 20/24] x86/kasan: Rename local CPU_ENTRY_AREA variables to
+ shorten names
+
+mainline inclusion
+from mainline-v6.2-rc1
+commit 7077d2ccb94dafd00b29cc2d601c9f6891648f5b
+category: bugfix
+bugzilla: 188336
+CVE: CVE-2023-0597
+
+Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
+
+--------------------------------
+
+Rename the CPU entry area variables in kasan_init() to shorten their
+names, a future fix will reference the beginning of the per-CPU portion
+of the CPU entry area, and shadow_cpu_entry_per_cpu_begin is a bit much.
+
+No functional change intended.
+
+Signed-off-by: Sean Christopherson <seanjc(a)google.com>
+Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
+Reviewed-by: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
+Link: https://lkml.kernel.org/r/20221110203504.1985010-4-seanjc@google.com
+Signed-off-by: Tong Tiangen <tongtiangen(a)huawei.com>
+Reviewed-by: Nanyong Sun <sunnanyong(a)huawei.com>
+Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
+Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
+---
+ arch/x86/mm/kasan_init_64.c | 22 +++++++++++-----------
+ 1 file changed, 11 insertions(+), 11 deletions(-)
+
+diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
+index d1416926ad52..ad7872ae10ed 100644
+--- a/arch/x86/mm/kasan_init_64.c
++++ b/arch/x86/mm/kasan_init_64.c
+@@ -331,7 +331,7 @@ void __init kasan_populate_shadow_for_vaddr(void *va, size_t size, int nid)
+ void __init kasan_init(void)
+ {
+ int i;
+- void *shadow_cpu_entry_begin, *shadow_cpu_entry_end;
++ void *shadow_cea_begin, *shadow_cea_end;
+
+ memcpy(early_top_pgt, init_top_pgt, sizeof(early_top_pgt));
+
+@@ -372,16 +372,16 @@ void __init kasan_init(void)
+ map_range(&pfn_mapped[i]);
+ }
+
+- shadow_cpu_entry_begin = (void *)CPU_ENTRY_AREA_BASE;
+- shadow_cpu_entry_begin = kasan_mem_to_shadow(shadow_cpu_entry_begin);
+- shadow_cpu_entry_begin = (void *)round_down(
+- (unsigned long)shadow_cpu_entry_begin, PAGE_SIZE);
++ shadow_cea_begin = (void *)CPU_ENTRY_AREA_BASE;
++ shadow_cea_begin = kasan_mem_to_shadow(shadow_cea_begin);
++ shadow_cea_begin = (void *)round_down(
++ (unsigned long)shadow_cea_begin, PAGE_SIZE);
+
+- shadow_cpu_entry_end = (void *)(CPU_ENTRY_AREA_BASE +
++ shadow_cea_end = (void *)(CPU_ENTRY_AREA_BASE +
+ CPU_ENTRY_AREA_MAP_SIZE);
+- shadow_cpu_entry_end = kasan_mem_to_shadow(shadow_cpu_entry_end);
+- shadow_cpu_entry_end = (void *)round_up(
+- (unsigned long)shadow_cpu_entry_end, PAGE_SIZE);
++ shadow_cea_end = kasan_mem_to_shadow(shadow_cea_end);
++ shadow_cea_end = (void *)round_up(
++ (unsigned long)shadow_cea_end, PAGE_SIZE);
+
+ kasan_populate_early_shadow(
+ kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM),
+@@ -403,9 +403,9 @@ void __init kasan_init(void)
+
+ kasan_populate_early_shadow(
+ kasan_mem_to_shadow((void *)VMALLOC_END + 1),
+- shadow_cpu_entry_begin);
++ shadow_cea_begin);
+
+- kasan_populate_early_shadow(shadow_cpu_entry_end,
++ kasan_populate_early_shadow(shadow_cea_end,
+ kasan_mem_to_shadow((void *)__START_KERNEL_map));
+
+ kasan_populate_shadow((unsigned long)kasan_mem_to_shadow(_stext),
+--
+2.25.1
+
diff --git a/0021-x86-kasan-Add-helpers-to-align-shadow-addresses-up-a.patch b/0021-x86-kasan-Add-helpers-to-align-shadow-addresses-up-a.patch
new file mode 100644
index 0000000..5f3f362
--- /dev/null
+++ b/0021-x86-kasan-Add-helpers-to-align-shadow-addresses-up-a.patch
@@ -0,0 +1,113 @@
+From ec4ebad1a3ed5a1ff3301de4df9a12ebf81b09c1 Mon Sep 17 00:00:00 2001
+From: Sean Christopherson <seanjc(a)google.com>
+Date: Fri, 17 Mar 2023 03:07:46 +0000
+Subject: [PATCH 21/24] x86/kasan: Add helpers to align shadow addresses up and
+ down
+
+mainline inclusion
+from mainline-v6.2-rc1
+commit bde258d97409f2a45243cb393a55ea9ecfc7aba5
+category: bugfix
+bugzilla: 188336
+CVE: CVE-2023-0597
+
+Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
+
+--------------------------------
+
+Add helpers to dedup code for aligning shadow address up/down to page
+boundaries when translating an address to its shadow.
+
+No functional change intended.
+
+Signed-off-by: Sean Christopherson <seanjc(a)google.com>
+Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
+Reviewed-by: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
+Link: https://lkml.kernel.org/r/20221110203504.1985010-5-seanjc@google.com
+Signed-off-by: Tong Tiangen <tongtiangen(a)huawei.com>
+Reviewed-by: Nanyong Sun <sunnanyong(a)huawei.com>
+Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
+Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
+---
+ arch/x86/mm/kasan_init_64.c | 40 ++++++++++++++++++++-----------------
+ 1 file changed, 22 insertions(+), 18 deletions(-)
+
+diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
+index ad7872ae10ed..afc5e129ca7b 100644
+--- a/arch/x86/mm/kasan_init_64.c
++++ b/arch/x86/mm/kasan_init_64.c
+@@ -316,22 +316,33 @@ void __init kasan_early_init(void)
+ kasan_map_early_shadow(init_top_pgt);
+ }
+
++static unsigned long kasan_mem_to_shadow_align_down(unsigned long va)
++{
++ unsigned long shadow = (unsigned long)kasan_mem_to_shadow((void *)va);
++
++ return round_down(shadow, PAGE_SIZE);
++}
++
++static unsigned long kasan_mem_to_shadow_align_up(unsigned long va)
++{
++ unsigned long shadow = (unsigned long)kasan_mem_to_shadow((void *)va);
++
++ return round_up(shadow, PAGE_SIZE);
++}
++
+ void __init kasan_populate_shadow_for_vaddr(void *va, size_t size, int nid)
+ {
+ unsigned long shadow_start, shadow_end;
+
+- shadow_start = (unsigned long)kasan_mem_to_shadow(va);
+- shadow_start = round_down(shadow_start, PAGE_SIZE);
+- shadow_end = (unsigned long)kasan_mem_to_shadow(va + size);
+- shadow_end = round_up(shadow_end, PAGE_SIZE);
+-
++ shadow_start = kasan_mem_to_shadow_align_down((unsigned long)va);
++ shadow_end = kasan_mem_to_shadow_align_up((unsigned long)va + size);
+ kasan_populate_shadow(shadow_start, shadow_end, nid);
+ }
+
+ void __init kasan_init(void)
+ {
++ unsigned long shadow_cea_begin, shadow_cea_end;
+ int i;
+- void *shadow_cea_begin, *shadow_cea_end;
+
+ memcpy(early_top_pgt, init_top_pgt, sizeof(early_top_pgt));
+
+@@ -372,16 +383,9 @@ void __init kasan_init(void)
+ map_range(&pfn_mapped[i]);
+ }
+
+- shadow_cea_begin = (void *)CPU_ENTRY_AREA_BASE;
+- shadow_cea_begin = kasan_mem_to_shadow(shadow_cea_begin);
+- shadow_cea_begin = (void *)round_down(
+- (unsigned long)shadow_cea_begin, PAGE_SIZE);
+-
+- shadow_cea_end = (void *)(CPU_ENTRY_AREA_BASE +
+- CPU_ENTRY_AREA_MAP_SIZE);
+- shadow_cea_end = kasan_mem_to_shadow(shadow_cea_end);
+- shadow_cea_end = (void *)round_up(
+- (unsigned long)shadow_cea_end, PAGE_SIZE);
++ shadow_cea_begin = kasan_mem_to_shadow_align_down(CPU_ENTRY_AREA_BASE);
++ shadow_cea_end = kasan_mem_to_shadow_align_up(CPU_ENTRY_AREA_BASE +
++ CPU_ENTRY_AREA_MAP_SIZE);
+
+ kasan_populate_early_shadow(
+ kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM),
+@@ -403,9 +407,9 @@ void __init kasan_init(void)
+
+ kasan_populate_early_shadow(
+ kasan_mem_to_shadow((void *)VMALLOC_END + 1),
+- shadow_cea_begin);
++ (void *)shadow_cea_begin);
+
+- kasan_populate_early_shadow(shadow_cea_end,
++ kasan_populate_early_shadow((void *)shadow_cea_end,
+ kasan_mem_to_shadow((void *)__START_KERNEL_map));
+
+ kasan_populate_shadow((unsigned long)kasan_mem_to_shadow(_stext),
+--
+2.25.1
+
diff --git a/0022-x86-kasan-Populate-shadow-for-shared-chunk-of-the-CP.patch b/0022-x86-kasan-Populate-shadow-for-shared-chunk-of-the-CP.patch
new file mode 100644
index 0000000..d6f9b77
--- /dev/null
+++ b/0022-x86-kasan-Populate-shadow-for-shared-chunk-of-the-CP.patch
@@ -0,0 +1,99 @@
+From 885cbab14224aca9bcf6df23a432a84e55b55dd5 Mon Sep 17 00:00:00 2001
+From: Sean Christopherson <seanjc(a)google.com>
+Date: Fri, 17 Mar 2023 03:07:47 +0000
+Subject: [PATCH 22/24] x86/kasan: Populate shadow for shared chunk of the CPU
+ entry area
+
+mainline inclusion
+from mainline-v6.2-rc1
+commit 1cfaac2400c73378e78182a706be0f3ac8b93cd7
+category: bugfix
+bugzilla: 188336
+CVE: CVE-2023-0597
+
+Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
+
+--------------------------------
+
+Populate the shadow for the shared portion of the CPU entry area, i.e.
+the read-only IDT mapping, during KASAN initialization. A recent change
+modified KASAN to map the per-CPU areas on-demand, but forgot to keep a
+shadow for the common area that is shared amongst all CPUs.
+
+Map the common area in KASAN init instead of letting idt_map_in_cea() do
+the dirty work so that it Just Works in the unlikely event more shared
+data is shoved into the CPU entry area.
+
+The bug manifests as a not-present #PF when software attempts to lookup
+an IDT entry, e.g. when KVM is handling IRQs on Intel CPUs (KVM performs
+direct CALL to the IRQ handler to avoid the overhead of INTn):
+
+ BUG: unable to handle page fault for address: fffffbc0000001d8
+ #PF: supervisor read access in kernel mode
+ #PF: error_code(0x0000) - not-present page
+ PGD 16c03a067 P4D 16c03a067 PUD 0
+ Oops: 0000 [#1] PREEMPT SMP KASAN
+ CPU: 5 PID: 901 Comm: repro Tainted: G W 6.1.0-rc3+ #410
+ Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
+ RIP: 0010:kasan_check_range+0xdf/0x190
+ vmx_handle_exit_irqoff+0x152/0x290 [kvm_intel]
+ vcpu_run+0x1d89/0x2bd0 [kvm]
+ kvm_arch_vcpu_ioctl_run+0x3ce/0xa70 [kvm]
+ kvm_vcpu_ioctl+0x349/0x900 [kvm]
+ __x64_sys_ioctl+0xb8/0xf0
+ do_syscall_64+0x2b/0x50
+ entry_SYSCALL_64_after_hwframe+0x46/0xb0
+
+Fixes: 9fd429c28073 ("x86/kasan: Map shadow for percpu pages on demand")
+Reported-by: syzbot+8cdd16fd5a6c0565e227(a)syzkaller.appspotmail.com
+Signed-off-by: Sean Christopherson <seanjc(a)google.com>
+Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
+Link: https://lkml.kernel.org/r/20221110203504.1985010-6-seanjc@google.com
+Signed-off-by: Tong Tiangen <tongtiangen(a)huawei.com>
+Reviewed-by: Nanyong Sun <sunnanyong(a)huawei.com>
+Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
+Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
+---
+ arch/x86/mm/kasan_init_64.c | 12 +++++++++++-
+ 1 file changed, 11 insertions(+), 1 deletion(-)
+
+diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
+index afc5e129ca7b..0302491d799d 100644
+--- a/arch/x86/mm/kasan_init_64.c
++++ b/arch/x86/mm/kasan_init_64.c
+@@ -341,7 +341,7 @@ void __init kasan_populate_shadow_for_vaddr(void *va, size_t size, int nid)
+
+ void __init kasan_init(void)
+ {
+- unsigned long shadow_cea_begin, shadow_cea_end;
++ unsigned long shadow_cea_begin, shadow_cea_per_cpu_begin, shadow_cea_end;
+ int i;
+
+ memcpy(early_top_pgt, init_top_pgt, sizeof(early_top_pgt));
+@@ -384,6 +384,7 @@ void __init kasan_init(void)
+ }
+
+ shadow_cea_begin = kasan_mem_to_shadow_align_down(CPU_ENTRY_AREA_BASE);
++ shadow_cea_per_cpu_begin = kasan_mem_to_shadow_align_up(CPU_ENTRY_AREA_PER_CPU);
+ shadow_cea_end = kasan_mem_to_shadow_align_up(CPU_ENTRY_AREA_BASE +
+ CPU_ENTRY_AREA_MAP_SIZE);
+
+@@ -409,6 +410,15 @@ void __init kasan_init(void)
+ kasan_mem_to_shadow((void *)VMALLOC_END + 1),
+ (void *)shadow_cea_begin);
+
++ /*
++ * Populate the shadow for the shared portion of the CPU entry area.
++ * Shadows for the per-CPU areas are mapped on-demand, as each CPU's
++ * area is randomly placed somewhere in the 512GiB range and mapping
++ * the entire 512GiB range is prohibitively expensive.
++ */
++ kasan_populate_shadow(shadow_cea_begin,
++ shadow_cea_per_cpu_begin, 0);
++
+ kasan_populate_early_shadow((void *)shadow_cea_end,
+ kasan_mem_to_shadow((void *)__START_KERNEL_map));
+
+--
+2.25.1
+
diff --git a/0023-net-sched-act_mirred-better-wording-on-protection-ag.patch b/0023-net-sched-act_mirred-better-wording-on-protection-ag.patch
new file mode 100644
index 0000000..8065822
--- /dev/null
+++ b/0023-net-sched-act_mirred-better-wording-on-protection-ag.patch
@@ -0,0 +1,97 @@
+From 1420d4aeb4cecca648b494e6d875c222da1d9309 Mon Sep 17 00:00:00 2001
+From: Davide Caratti <dcaratti(a)redhat.com>
+Date: Sat, 18 Mar 2023 16:46:22 +0800
+Subject: [PATCH 23/24] net/sched: act_mirred: better wording on protection
+ against excessive stack growth
+
+mainline inclusion
+from mainline-v6.3-rc1
+commit 78dcdffe0418ac8f3f057f26fe71ccf4d8ed851f
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/kernel/issues/I64END
+CVE: CVE-2022-4269
+
+Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
+
+--------------------------------
+
+with commit e2ca070f89ec ("net: sched: protect against stack overflow in
+TC act_mirred"), act_mirred protected itself against excessive stack growth
+using per_cpu counter of nested calls to tcf_mirred_act(), and capping it
+to MIRRED_RECURSION_LIMIT. However, such protection does not detect
+recursion/loops in case the packet is enqueued to the backlog (for example,
+when the mirred target device has RPS or skb timestamping enabled). Change
+the wording from "recursion" to "nesting" to make it more clear to readers.
+
+CC: Jamal Hadi Salim <jhs(a)mojatatu.com>
+Signed-off-by: Davide Caratti <dcaratti(a)redhat.com>
+Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
+Acked-by: Jamal Hadi Salim <jhs(a)mojatatu.com>
+Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
+Signed-off-by: Ziyang Xuan <william.xuanziyang(a)huawei.com>
+Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
+Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
+---
+ net/sched/act_mirred.c | 16 ++++++++--------
+ 1 file changed, 8 insertions(+), 8 deletions(-)
+
+diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
+index b8ad6ae282c0..ded6ee054be1 100644
+--- a/net/sched/act_mirred.c
++++ b/net/sched/act_mirred.c
+@@ -28,8 +28,8 @@
+ static LIST_HEAD(mirred_list);
+ static DEFINE_SPINLOCK(mirred_list_lock);
+
+-#define MIRRED_RECURSION_LIMIT 4
+-static DEFINE_PER_CPU(unsigned int, mirred_rec_level);
++#define MIRRED_NEST_LIMIT 4
++static DEFINE_PER_CPU(unsigned int, mirred_nest_level);
+
+ static bool tcf_mirred_is_act_redirect(int action)
+ {
+@@ -224,7 +224,7 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
+ struct sk_buff *skb2 = skb;
+ bool m_mac_header_xmit;
+ struct net_device *dev;
+- unsigned int rec_level;
++ unsigned int nest_level;
+ int retval, err = 0;
+ bool use_reinsert;
+ bool want_ingress;
+@@ -235,11 +235,11 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
+ int mac_len;
+ bool at_nh;
+
+- rec_level = __this_cpu_inc_return(mirred_rec_level);
+- if (unlikely(rec_level > MIRRED_RECURSION_LIMIT)) {
++ nest_level = __this_cpu_inc_return(mirred_nest_level);
++ if (unlikely(nest_level > MIRRED_NEST_LIMIT)) {
+ net_warn_ratelimited("Packet exceeded mirred recursion limit on dev %s\n",
+ netdev_name(skb->dev));
+- __this_cpu_dec(mirred_rec_level);
++ __this_cpu_dec(mirred_nest_level);
+ return TC_ACT_SHOT;
+ }
+
+@@ -308,7 +308,7 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
+ err = tcf_mirred_forward(want_ingress, skb);
+ if (err)
+ tcf_action_inc_overlimit_qstats(&m->common);
+- __this_cpu_dec(mirred_rec_level);
++ __this_cpu_dec(mirred_nest_level);
+ return TC_ACT_CONSUMED;
+ }
+ }
+@@ -320,7 +320,7 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
+ if (tcf_mirred_is_act_redirect(m_eaction))
+ retval = TC_ACT_SHOT;
+ }
+- __this_cpu_dec(mirred_rec_level);
++ __this_cpu_dec(mirred_nest_level);
+
+ return retval;
+ }
+--
+2.25.1
+
diff --git a/0024-act_mirred-use-the-backlog-for-nested-calls-to-mirre.patch b/0024-act_mirred-use-the-backlog-for-nested-calls-to-mirre.patch
new file mode 100644
index 0000000..edfc0ba
--- /dev/null
+++ b/0024-act_mirred-use-the-backlog-for-nested-calls-to-mirre.patch
@@ -0,0 +1,149 @@
+From a6bb3989ccb7d3493c20e709179904733c6db856 Mon Sep 17 00:00:00 2001
+From: Davide Caratti <dcaratti(a)redhat.com>
+Date: Sat, 18 Mar 2023 16:46:40 +0800
+Subject: [PATCH 24/24] act_mirred: use the backlog for nested calls to mirred
+ ingress
+
+mainline inclusion
+from mainline-v6.3-rc1
+commit ca22da2fbd693b54dc8e3b7b54ccc9f7e9ba3640
+category: bugfix
+bugzilla: https://gitee.com/src-openeuler/kernel/issues/I64END
+CVE: CVE-2022-4269
+
+Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
+
+--------------------------------
+
+William reports kernel soft-lockups on some OVS topologies when TC mirred
+egress->ingress action is hit by local TCP traffic [1].
+The same can also be reproduced with SCTP (thanks Xin for verifying), when
+client and server reach themselves through mirred egress to ingress, and
+one of the two peers sends a "heartbeat" packet (from within a timer).
+
+Enqueueing to backlog proved to fix this soft lockup; however, as Cong
+noticed [2], we should preserve - when possible - the current mirred
+behavior that counts as "overlimits" any eventual packet drop subsequent to
+the mirred forwarding action [3]. A compromise solution might use the
+backlog only when tcf_mirred_act() has a nest level greater than one:
+change tcf_mirred_forward() accordingly.
+
+Also, add a kselftest that can reproduce the lockup and verifies TC mirred
+ability to account for further packet drops after TC mirred egress->ingress
+(when the nest level is 1).
+
+ [1] https://lore.kernel.org/netdev/33dc43f587ec1388ba456b4915c75f02a8aae226.166…
+ [2] https://lore.kernel.org/netdev/Y0w%2FWWY60gqrtGLp@pop-os.localdomain/
+ [3] such behavior is not guaranteed: for example, if RPS or skb RX
+ timestamping is enabled on the mirred target device, the kernel
+ can defer receiving the skb and return NET_RX_SUCCESS inside
+ tcf_mirred_forward().
+
+Reported-by: William Zhao <wizhao(a)redhat.com>
+CC: Xin Long <lucien.xin(a)gmail.com>
+Signed-off-by: Davide Caratti <dcaratti(a)redhat.com>
+Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
+Acked-by: Jamal Hadi Salim <jhs(a)mojatatu.com>
+Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
+Signed-off-by: Ziyang Xuan <william.xuanziyang(a)huawei.com>
+Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
+Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
+---
+ net/sched/act_mirred.c | 7 +++
+ .../selftests/net/forwarding/tc_actions.sh | 49 ++++++++++++++++++-
+ 2 files changed, 55 insertions(+), 1 deletion(-)
+
+diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
+index ded6ee054be1..baeae5e5c8f0 100644
+--- a/net/sched/act_mirred.c
++++ b/net/sched/act_mirred.c
+@@ -205,12 +205,19 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
+ return err;
+ }
+
++static bool is_mirred_nested(void)
++{
++ return unlikely(__this_cpu_read(mirred_nest_level) > 1);
++}
++
+ static int tcf_mirred_forward(bool want_ingress, struct sk_buff *skb)
+ {
+ int err;
+
+ if (!want_ingress)
+ err = tcf_dev_queue_xmit(skb, dev_queue_xmit);
++ else if (is_mirred_nested())
++ err = netif_rx(skb);
+ else
+ err = netif_receive_skb(skb);
+
+diff --git a/tools/testing/selftests/net/forwarding/tc_actions.sh b/tools/testing/selftests/net/forwarding/tc_actions.sh
+index 1e0a62f638fe..919c0dd9fe4b 100755
+--- a/tools/testing/selftests/net/forwarding/tc_actions.sh
++++ b/tools/testing/selftests/net/forwarding/tc_actions.sh
+@@ -3,7 +3,8 @@
+
+ ALL_TESTS="gact_drop_and_ok_test mirred_egress_redirect_test \
+ mirred_egress_mirror_test matchall_mirred_egress_mirror_test \
+- gact_trap_test mirred_egress_to_ingress_test"
++ gact_trap_test mirred_egress_to_ingress_test \
++ mirred_egress_to_ingress_tcp_test"
+ NUM_NETIFS=4
+ source tc_common.sh
+ source lib.sh
+@@ -198,6 +199,52 @@ mirred_egress_to_ingress_test()
+ log_test "mirred_egress_to_ingress ($tcflags)"
+ }
+
++mirred_egress_to_ingress_tcp_test()
++{
++ local tmpfile=$(mktemp) tmpfile1=$(mktemp)
++
++ RET=0
++ dd conv=sparse status=none if=/dev/zero bs=1M count=2 of=$tmpfile
++ tc filter add dev $h1 protocol ip pref 100 handle 100 egress flower \
++ $tcflags ip_proto tcp src_ip 192.0.2.1 dst_ip 192.0.2.2 \
++ action ct commit nat src addr 192.0.2.2 pipe \
++ action ct clear pipe \
++ action ct commit nat dst addr 192.0.2.1 pipe \
++ action ct clear pipe \
++ action skbedit ptype host pipe \
++ action mirred ingress redirect dev $h1
++ tc filter add dev $h1 protocol ip pref 101 handle 101 egress flower \
++ $tcflags ip_proto icmp \
++ action mirred ingress redirect dev $h1
++ tc filter add dev $h1 protocol ip pref 102 handle 102 ingress flower \
++ ip_proto icmp \
++ action drop
++
++ ip vrf exec v$h1 nc --recv-only -w10 -l -p 12345 -o $tmpfile1 &
++ local rpid=$!
++ ip vrf exec v$h1 nc -w1 --send-only 192.0.2.2 12345 <$tmpfile
++ wait -n $rpid
++ cmp -s $tmpfile $tmpfile1
++ check_err $? "server output check failed"
++
++ $MZ $h1 -c 10 -p 64 -a $h1mac -b $h1mac -A 192.0.2.1 -B 192.0.2.1 \
++ -t icmp "ping,id=42,seq=5" -q
++ tc_check_packets "dev $h1 egress" 101 10
++ check_err $? "didn't mirred redirect ICMP"
++ tc_check_packets "dev $h1 ingress" 102 10
++ check_err $? "didn't drop mirred ICMP"
++ local overlimits=$(tc_rule_stats_get ${h1} 101 egress .overlimits)
++ test ${overlimits} = 10
++ check_err $? "wrong overlimits, expected 10 got ${overlimits}"
++
++ tc filter del dev $h1 egress protocol ip pref 100 handle 100 flower
++ tc filter del dev $h1 egress protocol ip pref 101 handle 101 flower
++ tc filter del dev $h1 ingress protocol ip pref 102 handle 102 flower
++
++ rm -f $tmpfile $tmpfile1
++ log_test "mirred_egress_to_ingress_tcp ($tcflags)"
++}
++
+ setup_prepare()
+ {
+ h1=${NETIFS[p1]}
+--
+2.25.1
+
diff --git a/kernel.spec b/kernel.spec
index 0eb5b16..f4bc7a4 100644
--- a/kernel.spec
+++ b/kernel.spec
@@ -10,9 +10,9 @@
%global upstream_version 6.1
%global upstream_sublevel 19
-%global devel_release 6
+%global devel_release 7
%global maintenance_release .0.0
-%global pkg_release .16
+%global pkg_release .17
%define with_debuginfo 0
# Do not recompute the build-id of vmlinux in find-debuginfo.sh
@@ -84,6 +84,16 @@ Patch0011: 0011-bpf-Two-helper-functions-are-introduced-to-parse-use.patch
Patch0012: 0012-net-bpf-Add-a-writeable_tracepoint-to-inet_stream_co.patch
Patch0013: 0013-nfs-client-multipath.patch
Patch0014: 0014-nfs-client-multipath-config.patch
+Patch0015: 0015-mm-demotion-fix-NULL-vs-IS_ERR-checking-in-memory_ti.patch
+Patch0016: 0016-x86-mm-Randomize-per-cpu-entry-area.patch
+Patch0017: 0017-x86-kasan-Map-shadow-for-percpu-pages-on-demand.patch
+Patch0018: 0018-x86-mm-Recompute-physical-address-for-every-page-of-.patch
+Patch0019: 0019-x86-mm-Populate-KASAN-shadow-for-entire-per-CPU-rang.patch
+Patch0020: 0020-x86-kasan-Rename-local-CPU_ENTRY_AREA-variables-to-s.patch
+Patch0021: 0021-x86-kasan-Add-helpers-to-align-shadow-addresses-up-a.patch
+Patch0022: 0022-x86-kasan-Populate-shadow-for-shared-chunk-of-the-CP.patch
+Patch0023: 0023-net-sched-act_mirred-better-wording-on-protection-ag.patch
+Patch0024: 0024-act_mirred-use-the-backlog-for-nested-calls-to-mirre.patch
#BuildRequires:
@@ -323,6 +333,16 @@ Applypatches series.conf %{_builddir}/kernel-%{version}/linux-%{KernelVer}
%patch0012 -p1
%patch0013 -p1
%patch0014 -p1
+%patch0015 -p1
+%patch0016 -p1
+%patch0017 -p1
+%patch0018 -p1
+%patch0019 -p1
+%patch0020 -p1
+%patch0021 -p1
+%patch0022 -p1
+%patch0023 -p1
+%patch0024 -p1
touch .scmversion
find . \( -name "*.orig" -o -name "*~" \) -exec rm -f {} \; >/dev/null
@@ -905,6 +925,9 @@ fi
%endif
%changelog
+* Fri Mar 18 2023 Jialin Zhang <zhangjialin11(a)huawei.com> - 6.1.19-7.0.0.17
+- Fix CVE-2023-23005, CVE-2023-0597 and CVE-2022-4269
+
* Fri Mar 17 2023 Zheng Zengkai <zhengzengkai(a)huawei.com> - 6.1.19-6.0.0.16
- Fix kernel rpm build failure that libperf-jvmti.so is missing
--
2.25.1
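As a rough illustration of the guard implemented by patches 0023/0024 above: a per-CPU nest counter decides between synchronous receive and the backlog. This is a minimal sketch of the pattern, not the kernel source; mirred_deliver and the -ELOOP return are made up for the example, while mirred_nest_level, MIRRED_NEST_LIMIT, netif_rx() and netif_receive_skb() mirror the patch.

#include <linux/netdevice.h>
#include <linux/percpu.h>

static DEFINE_PER_CPU(unsigned int, mirred_nest_level);
#define MIRRED_NEST_LIMIT 4

/* Sketch: deliver a mirred packet, using the backlog for nested ingress */
static int mirred_deliver(struct sk_buff *skb, bool want_ingress)
{
	int err;

	if (__this_cpu_inc_return(mirred_nest_level) > MIRRED_NEST_LIMIT) {
		/* recursion guard: drop instead of looping forever */
		__this_cpu_dec(mirred_nest_level);
		kfree_skb(skb);
		return -ELOOP;
	}

	if (!want_ingress)
		err = dev_queue_xmit(skb);
	else if (__this_cpu_read(mirred_nest_level) > 1)
		err = netif_rx(skb);            /* nested: defer via backlog */
	else
		err = netif_receive_skb(skb);   /* first level: synchronous */

	__this_cpu_dec(mirred_nest_level);
	return err;
}

The netif_rx() branch is what breaks the soft-lockup loop: nested ingress delivery is queued to the per-CPU backlog instead of recursing straight back into the stack.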
[PATCH PR openEuler-22.03-LTS-SP1] mm/vmalloc: huge vmalloc backing pages should be split rather than compound
by Jialin Zhang 17 Mar '23
From: Nicholas Piggin <npiggin(a)gmail.com>
mainline inclusion
from mainline-v5.18-rc4
commit 3b8000ae185cb068adbda5f966a3835053c85fd4
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6LD0S
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Huge vmalloc higher-order backing pages were allocated with __GFP_COMP
in order to allow the sub-pages to be refcounted by callers such as
"remap_vmalloc_page [sic]" (remap_vmalloc_range).
However a similar problem exists for other struct page fields callers
use, for example fb_deferred_io_fault() takes a vmalloc'ed page and
not only refcounts it but uses ->lru, ->mapping, ->index.
This is not compatible with compound sub-pages, and can cause bad page
state issues like
BUG: Bad page state in process swapper/0 pfn:00743
page:(____ptrval____) refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x743
flags: 0x7ffff000000000(node=0|zone=0|lastcpupid=0x7ffff)
raw: 007ffff000000000 c00c00000001d0c8 c00c00000001d0c8 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: corrupted mapping in tail page
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc3-00082-gfc6fff4a7ce1-dirty #2810
Call Trace:
dump_stack_lvl+0x74/0xa8 (unreliable)
bad_page+0x12c/0x170
free_tail_pages_check+0xe8/0x190
free_pcp_prepare+0x31c/0x4e0
free_unref_page+0x40/0x1b0
__vunmap+0x1d8/0x420
...
The correct approach is to use split high-order pages for the huge
vmalloc backing. These allow callers to treat them in exactly the same
way as individually-allocated order-0 pages.
Link: https://lore.kernel.org/all/14444103-d51b-0fb3-ee63-c3f182f0b546@molgen.mpg…
Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com>
Cc: Paul Menzel <pmenzel(a)molgen.mpg.de>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: Rick Edgecombe <rick.p.edgecombe(a)intel.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
conflicts:
mm/vmalloc.c
Signed-off-by: ZhangPeng <zhangpeng362(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
mm/vmalloc.c | 22 +++++++++++++++++-----
1 file changed, 17 insertions(+), 5 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e27cd716ca95..2ca2c1bc0db9 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2641,14 +2641,17 @@ static void __vunmap(const void *addr, int deallocate_pages)
vm_remove_mappings(area, deallocate_pages);
if (deallocate_pages) {
- unsigned int page_order = vm_area_page_order(area);
int i;
- for (i = 0; i < area->nr_pages; i += 1U << page_order) {
+ for (i = 0; i < area->nr_pages; i++) {
struct page *page = area->pages[i];
BUG_ON(!page);
- __free_pages(page, page_order);
+ /*
+ * High-order allocs for huge vmallocs are split, so
+ * can be freed as an array of order-0 allocations
+ */
+ __free_pages(page, 0);
}
atomic_long_sub(area->nr_pages, &nr_vmalloc_pages);
@@ -2930,8 +2933,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
struct page *page;
int p;
- /* Compound pages required for remap_vmalloc_page */
- page = alloc_pages_node(node, gfp_mask | __GFP_COMP, page_order);
+ page = alloc_pages_node(node, gfp_mask, page_order);
if (unlikely(!page)) {
/* Successfully allocated i pages, free them in __vfree() */
area->nr_pages = i;
@@ -2943,6 +2945,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
goto fail;
}
+ /*
+ * Higher order allocations must be able to be treated as
+ * indepdenent small pages by callers (as they can with
+ * small-page vmallocs). Some drivers do their own refcounting
+ * on vmalloc_to_page() pages, some use page->mapping,
+ * page->lru, etc.
+ */
+ if (page_order)
+ split_page(page, page_order);
+
for (p = 0; p < (1U << page_order); p++)
area->pages[i + p] = page + p;
--
2.25.1
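As a rough illustration of the allocation pattern the patch switches to, here is a minimal sketch (alloc_backing is a hypothetical helper, not the patched mm/vmalloc.c; error unwinding and remainder handling are omitted): allocate a high-order block without __GFP_COMP, split_page() it, and from then on every sub-page is an independent order-0 page with its own refcount, ->mapping and ->lru, freeable with __free_pages(page, 0).

#include <linux/gfp.h>
#include <linux/mm.h>

/* Sketch: fill 'pages' with nr order-0 pages backed by high-order allocs */
static int alloc_backing(struct page **pages, int nr, unsigned int order)
{
	int i = 0;

	while (i < nr) {
		/* no __GFP_COMP: callers must see individual pages */
		struct page *page = alloc_pages(GFP_KERNEL, order);
		int p;

		if (!page)
			return -ENOMEM;
		if (order)
			split_page(page, order); /* 1 << order order-0 pages now */
		for (p = 0; p < (1 << order) && i < nr; p++)
			pages[i++] = page + p;
	}
	return 0;
}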
[PATCH openEuler-1.0-LTS 01/10] media: dvb-usb: az6027: fix null-ptr-deref in az6027_i2c_xfer()
by Yongqiang Liu 17 Mar '23
From: Baisong Zhong <zhongbaisong(a)huawei.com>
mainline inclusion
from mainline-v6.2-rc1
commit 0ed554fd769a19ea8464bb83e9ac201002ef74ad
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6NCQH
CVE: CVE-2023-28328
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Wei Chen reports a kernel bug as blew:
general protection fault, probably for non-canonical address
KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
...
Call Trace:
<TASK>
__i2c_transfer+0x77e/0x1930 drivers/i2c/i2c-core-base.c:2109
i2c_transfer+0x1d5/0x3d0 drivers/i2c/i2c-core-base.c:2170
i2cdev_ioctl_rdwr+0x393/0x660 drivers/i2c/i2c-dev.c:297
i2cdev_ioctl+0x75d/0x9f0 drivers/i2c/i2c-dev.c:458
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:870 [inline]
__se_sys_ioctl+0xfb/0x170 fs/ioctl.c:856
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fd834a8bded
In az6027_i2c_xfer(), if msg[i].addr is 0x99,
a null-ptr-deref is caused when accessing msg[i].buf,
because msg[i].len is 0 and msg[i].buf is NULL.
Fix this by checking msg[i].len in az6027_i2c_xfer().
Link: https://lore.kernel.org/lkml/CAO4mrfcPHB5aQJO=mpqV+p8mPLNg-Fok0gw8gZ=zemAfM…
Link: https://lore.kernel.org/linux-media/20221120065918.2160782-1-zhongbaisong@h…
Fixes: 76f9a820c867 ("V4L/DVB: AZ6027: Initial import of the driver")
Reported-by: Wei Chen <harperchen1110(a)gmail.com>
Signed-off-by: Baisong Zhong <zhongbaisong(a)huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Signed-off-by: ZhangPeng <zhangpeng362(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Reviewed-by: Nanyong Sun <sunnanyong(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/media/usb/dvb-usb/az6027.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/media/usb/dvb-usb/az6027.c b/drivers/media/usb/dvb-usb/az6027.c
index 6321b8e30261..555c8ac44881 100644
--- a/drivers/media/usb/dvb-usb/az6027.c
+++ b/drivers/media/usb/dvb-usb/az6027.c
@@ -977,6 +977,10 @@ static int az6027_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg msg[], int n
if (msg[i].addr == 0x99) {
req = 0xBE;
index = 0;
+ if (msg[i].len < 1) {
+ i = -EOPNOTSUPP;
+ break;
+ }
value = msg[i].buf[0] & 0x00ff;
length = 1;
az6027_usb_out_op(d, req, value, index, data, length);
--
2.25.1
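The defensive pattern behind the fix, reduced to a sketch (xfer_one is a hypothetical helper, not the driver code): userspace can reach the adapter through /dev/i2c-* and submit a zero-length message whose buf is NULL, so the length must be validated before the first dereference.

#include <linux/errno.h>
#include <linux/i2c.h>

/* Sketch: validate msg->len before touching msg->buf */
static int xfer_one(const struct i2c_msg *msg)
{
	u16 value;

	if (msg->len < 1)               /* buf may be NULL when len == 0 */
		return -EOPNOTSUPP;

	value = msg->buf[0] & 0x00ff;   /* safe: len >= 1 */
	/* ... build and send the USB control request using 'value' ... */
	return 0;
}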
[PATCH OLK-5.10] Revert "scsi: hisi_sas: Disable SATA disk phy for severe I_T nexus reset failure"
by Yihang Li 17 Mar '23
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6NX2M
CVE: NA
----------------------------------------------------------------------
This reverts commit c723ada86707c6afe524b51126c301d689c64d8e.
In that commit, if the softreset failed under certain conditions, the
PHY associated with the disk was simply disabled, and the user had to
restore the PHY manually.
SATA disks do not support simultaneous connection of multiple hosts.
Therefore, when multiple controllers are connected to a SATA disk at the
same time, the controller which is connected later failed to issue an ATA
softreset to the SATA disk. As a result, the PHY associated with the disk
is disabled and cannot be automatically recovered.
Now we no longer depend on the execution result of the softreset: whether
or not it succeeds, we directly carry out the I_T nexus reset.
Signed-off-by: Yihang Li <liyihang9(a)huawei.com>
Signed-off-by: xiabing <xiabing12(a)h-partners.com>
---
drivers/scsi/hisi_sas/hisi_sas_main.c | 29 +++++----------------------
1 file changed, 5 insertions(+), 24 deletions(-)
diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index b8249a055fbb..0f5578e52558 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -1895,33 +1895,14 @@ static int hisi_sas_I_T_nexus_reset(struct domain_device *device)
}
hisi_sas_dereg_device(hisi_hba, device);
- rc = hisi_sas_debug_I_T_nexus_reset(device);
- if (rc == TMF_RESP_FUNC_COMPLETE && dev_is_sata(device)) {
- struct sas_phy *local_phy;
-
+ if (dev_is_sata(device)) {
rc = hisi_sas_softreset_ata_disk(device);
- switch (rc) {
- case -ECOMM:
- rc = -ENODEV;
- break;
- case TMF_RESP_FUNC_FAILED:
- case -EMSGSIZE:
- case -EIO:
- local_phy = sas_get_local_phy(device);
- rc = sas_phy_enable(local_phy, 0);
- if (!rc) {
- local_phy->enabled = 0;
- dev_err(dev, "Disabled local phy of ATA disk %016llx due to softreset fail (%d)\n",
- SAS_ADDR(device->sas_addr), rc);
- rc = -ENODEV;
- }
- sas_put_local_phy(local_phy);
- break;
- default:
- break;
- }
+ if (rc == TMF_RESP_FUNC_FAILED)
+ dev_err(dev, "ata disk %016llx reset (%d)\n",
+ SAS_ADDR(device->sas_addr), rc);
}
+ rc = hisi_sas_debug_I_T_nexus_reset(device);
if ((rc == TMF_RESP_FUNC_COMPLETE) || (rc == -ENODEV))
hisi_sas_release_task(hisi_hba, device);
--
2.30.0
[PATCH openEuler-1.0-LTS 1/8] fs/proc: task_mmu.c: don't read mapcount for migration entry
by Yongqiang Liu 17 Mar '23
From: Yang Shi <shy828301(a)gmail.com>
stable inclusion
from stable-v5.10.102
commit db3f3636e4aed2cba3e4e7897a053323f7a62249
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6EVUJ
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 24d7275ce2791829953ed4e72f68277ceb2571c6 upstream.
The syzbot reported the below BUG:
kernel BUG at include/linux/page-flags.h:785!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 4392 Comm: syz-executor560 Not tainted 5.16.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:PageDoubleMap include/linux/page-flags.h:785 [inline]
RIP: 0010:__page_mapcount+0x2d2/0x350 mm/util.c:744
Call Trace:
page_mapcount include/linux/mm.h:837 [inline]
smaps_account+0x470/0xb10 fs/proc/task_mmu.c:466
smaps_pte_entry fs/proc/task_mmu.c:538 [inline]
smaps_pte_range+0x611/0x1250 fs/proc/task_mmu.c:601
walk_pmd_range mm/pagewalk.c:128 [inline]
walk_pud_range mm/pagewalk.c:205 [inline]
walk_p4d_range mm/pagewalk.c:240 [inline]
walk_pgd_range mm/pagewalk.c:277 [inline]
__walk_page_range+0xe23/0x1ea0 mm/pagewalk.c:379
walk_page_vma+0x277/0x350 mm/pagewalk.c:530
smap_gather_stats.part.0+0x148/0x260 fs/proc/task_mmu.c:768
smap_gather_stats fs/proc/task_mmu.c:741 [inline]
show_smap+0xc6/0x440 fs/proc/task_mmu.c:822
seq_read_iter+0xbb0/0x1240 fs/seq_file.c:272
seq_read+0x3e0/0x5b0 fs/seq_file.c:162
vfs_read+0x1b5/0x600 fs/read_write.c:479
ksys_read+0x12d/0x250 fs/read_write.c:619
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
The reproducer was reading /proc/$PID/smaps while calling MADV_FREE at
the same time. MADV_FREE may split THPs if it is called for a partial
THP. That may trigger the below race:
CPU A                           CPU B
-----                           -----
smaps walk:                     MADV_FREE:
page_mapcount()
  PageCompound()
                                split_huge_page()
  page = compound_head(page)
  PageDoubleMap(page)
When calling PageDoubleMap() this page is not a tail page of THP anymore
so the BUG is triggered.
This could be fixed by elevating the refcount of the page before calling
page_mapcount(), but that would prevent migration entries from being
counted, and it seems overkill because the race can only happen when the
PMD is split, so all PTE entries of the tail pages are actually migration
entries, and smaps_account() does treat migration entries as
mapcount == 1, as Kirill pointed out.
Add a new parameter to smaps_account() to indicate that the entry is a
migration entry, and in that case skip calling page_mapcount(). Don't
skip getting the mapcount for device private entries, since they do track
references with mapcount.
Pagemap has a similar issue, although it was not reported. Fixed it as
well.
[shy828301(a)gmail.com: v4]
Link: https://lkml.kernel.org/r/20220203182641.824731-1-shy828301@gmail.com
[nathan(a)kernel.org: avoid unused variable warning in pagemap_pmd_range()]
Link: https://lkml.kernel.org/r/20220207171049.1102239-1-nathan@kernel.org
Link: https://lkml.kernel.org/r/20220120202805.3369-1-shy828301@gmail.com
Fixes: e9b61f19858a ("thp: reintroduce split_huge_page()")
Signed-off-by: Yang Shi <shy828301(a)gmail.com>
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
Reported-by: syzbot+1f52b3a18d5633fa7f82(a)syzkaller.appspotmail.com
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Alexey Dobriyan <adobriyan(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
conflicts:
fs/proc/task_mmu.c
Signed-off-by: Zhang Peng <zhangpeng362(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
fs/proc/task_mmu.c | 37 +++++++++++++++++++++++++++----------
1 file changed, 27 insertions(+), 10 deletions(-)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 93815b2b8440..0175fd7b3598 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -429,7 +429,8 @@ struct mem_size_stats {
};
static void smaps_account(struct mem_size_stats *mss, struct page *page,
- bool compound, bool young, bool dirty, bool locked)
+ bool compound, bool young, bool dirty, bool locked,
+ bool migration)
{
int i, nr = compound ? 1 << compound_order(page) : 1;
unsigned long size = nr * PAGE_SIZE;
@@ -449,8 +450,15 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
* page_count(page) == 1 guarantees the page is mapped exactly once.
* If any subpage of the compound page mapped with PTE it would elevate
* page_count().
+ *
+ * The page_mapcount() is called to get a snapshot of the mapcount.
+ * Without holding the page lock this snapshot can be slightly wrong as
+ * we cannot always read the mapcount atomically. It is not safe to
+ * call page_mapcount() even with PTL held if the page is not mapped,
+ * especially for migration entries. Treat regular migration entries
+ * as mapcount == 1.
*/
- if (page_count(page) == 1) {
+ if ((page_count(page) == 1) || migration) {
if (dirty || PageDirty(page))
mss->private_dirty += size;
else
@@ -505,6 +513,7 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
struct vm_area_struct *vma = walk->vma;
bool locked = !!(vma->vm_flags & VM_LOCKED);
struct page *page = NULL;
+ bool migration = false;
if (pte_present(*pte)) {
page = vm_normal_page(vma, addr, *pte);
@@ -524,9 +533,10 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
} else {
mss->swap_pss += (u64)PAGE_SIZE << PSS_SHIFT;
}
- } else if (is_migration_entry(swpent))
+ } else if (is_migration_entry(swpent)) {
+ migration = true;
page = migration_entry_to_page(swpent);
- else if (is_device_private_entry(swpent))
+ } else if (is_device_private_entry(swpent))
page = device_private_entry_to_page(swpent);
} else if (unlikely(IS_ENABLED(CONFIG_SHMEM) && mss->check_shmem_swap
&& pte_none(*pte))) {
@@ -546,7 +556,8 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
if (!page)
return;
- smaps_account(mss, page, false, pte_young(*pte), pte_dirty(*pte), locked);
+ smaps_account(mss, page, false, pte_young(*pte), pte_dirty(*pte),
+ locked, migration);
}
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -570,7 +581,8 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
/* pass */;
else
VM_BUG_ON_PAGE(1, page);
- smaps_account(mss, page, true, pmd_young(*pmd), pmd_dirty(*pmd), locked);
+ smaps_account(mss, page, true, pmd_young(*pmd), pmd_dirty(*pmd),
+ locked, false);
}
#else
static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
@@ -1285,6 +1297,7 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm,
{
u64 frame = 0, flags = 0;
struct page *page = NULL;
+ bool migration = false;
if (pte_present(pte)) {
if (pm->show_pfn)
@@ -1302,8 +1315,10 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm,
frame = swp_type(entry) |
(swp_offset(entry) << MAX_SWAPFILES_SHIFT);
flags |= PM_SWAP;
- if (is_migration_entry(entry))
+ if (is_migration_entry(entry)) {
+ migration = true;
page = migration_entry_to_page(entry);
+ }
if (is_device_private_entry(entry))
page = device_private_entry_to_page(entry);
@@ -1311,7 +1326,7 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm,
if (page && !PageAnon(page))
flags |= PM_FILE;
- if (page && page_mapcount(page) == 1)
+ if (page && !migration && page_mapcount(page) == 1)
flags |= PM_MMAP_EXCLUSIVE;
if (vma->vm_flags & VM_SOFTDIRTY)
flags |= PM_SOFT_DIRTY;
@@ -1327,8 +1342,9 @@ static int pagemap_pmd_range(pmd_t *pmdp, unsigned long addr, unsigned long end,
spinlock_t *ptl;
pte_t *pte, *orig_pte;
int err = 0;
-
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+ bool migration = false;
+
ptl = pmd_trans_huge_lock(pmdp, vma);
if (ptl) {
u64 flags = 0, frame = 0;
@@ -1363,11 +1379,12 @@ static int pagemap_pmd_range(pmd_t *pmdp, unsigned long addr, unsigned long end,
if (pmd_swp_soft_dirty(pmd))
flags |= PM_SOFT_DIRTY;
VM_BUG_ON(!is_pmd_migration_entry(pmd));
+ migration = is_migration_entry(entry);
page = migration_entry_to_page(entry);
}
#endif
- if (page && page_mapcount(page) == 1)
+ if (page && !migration && page_mapcount(page) == 1)
flags |= PM_MMAP_EXCLUSIVE;
for (; addr != end; addr += PAGE_SIZE) {
--
2.25.1
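The rule the patch enforces can be condensed into a sketch (page_counts_as_exclusive is a hypothetical helper used only for illustration): when the entry is a regular migration entry the page is not mapped, its mapcount is unstable, and it must be treated as mapcount == 1 instead of being read.

#include <linux/mm.h>

/* Sketch: decide "mapped exactly once" without an unstable mapcount read */
static bool page_counts_as_exclusive(struct page *page, bool migration)
{
	/*
	 * A page reached only through a migration entry is not mapped;
	 * calling page_mapcount() on it is unsafe, so assume a single
	 * mapping, matching what smaps_account() does after the fix.
	 */
	if (migration)
		return true;

	return page_mapcount(page) == 1;
}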
From: xiabing <xiabing12(a)h-partners.com>
1. In the NCQ scenario, if multiple I/Os are delivered and one of them is faulty, a group of disks is always reported as slow.
2. While I/O is running, a null pointer dereference occurs under pressure testing when a disk is removed.
3. When the length of the DMA Setup frame returned by the disk is abnormal, a group of disks is again reported as slow.
John Garry (2):
scsi: libsas: Add sas_ata_device_link_abort()
scsi: libsas: Update SATA dev FIS in sas_ata_task_done()
Xingui Yang (6):
scsi: hisi_sas: Move slot variable definition in hisi_sas_abort_task()
scsi: hisi_sas: Add SATA_DISK_ERR bit handling for v3 hw
scsi: hisi_sas: Use abort task set to reset SAS disks when discovered
scsi: libsas: Grab the ATA port lock in sas_ata_device_link_abort()
{topost} scsi: hisi_sas: Handle NCQ error when IPTT is valid
{topost} scsi: hisi_sas: Grab sas_dev lock when traversing the members
of sas_dev.list
drivers/scsi/hisi_sas/hisi_sas.h | 4 +-
drivers/scsi/hisi_sas/hisi_sas_main.c | 52 ++++++++++++++++-------
drivers/scsi/hisi_sas/hisi_sas_v1_hw.c | 8 +++-
drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 8 +++-
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 58 ++++++++++++++++++++++++--
drivers/scsi/libsas/sas_ata.c | 22 +++++++++-
include/scsi/sas_ata.h | 6 +++
7 files changed, 132 insertions(+), 26 deletions(-)
--
2.30.0
[PATCH PR openEuler-22.03-LTS-SP1] mm/vmalloc: huge vmalloc backing pages should be split rather than compound
by Jialin Zhang 16 Mar '23
From: Nicholas Piggin <npiggin(a)gmail.com>
mainline inclusion
from mainline-v5.18-rc4
commit 3b8000ae185cb068adbda5f966a3835053c85fd4
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6LD0S
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Huge vmalloc higher-order backing pages were allocated with __GFP_COMP
in order to allow the sub-pages to be refcounted by callers such as
"remap_vmalloc_page [sic]" (remap_vmalloc_range).
However a similar problem exists for other struct page fields callers
use, for example fb_deferred_io_fault() takes a vmalloc'ed page and
not only refcounts it but uses ->lru, ->mapping, ->index.
This is not compatible with compound sub-pages, and can cause bad page
state issues like
BUG: Bad page state in process swapper/0 pfn:00743
page:(____ptrval____) refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x743
flags: 0x7ffff000000000(node=0|zone=0|lastcpupid=0x7ffff)
raw: 007ffff000000000 c00c00000001d0c8 c00c00000001d0c8 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: corrupted mapping in tail page
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc3-00082-gfc6fff4a7ce1-dirty #2810
Call Trace:
dump_stack_lvl+0x74/0xa8 (unreliable)
bad_page+0x12c/0x170
free_tail_pages_check+0xe8/0x190
free_pcp_prepare+0x31c/0x4e0
free_unref_page+0x40/0x1b0
__vunmap+0x1d8/0x420
...
The correct approach is to use split high-order pages for the huge
vmalloc backing. These allow callers to treat them in exactly the same
way as individually-allocated order-0 pages.
Link: https://lore.kernel.org/all/14444103-d51b-0fb3-ee63-c3f182f0b546@molgen.mpg…
Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com>
Cc: Paul Menzel <pmenzel(a)molgen.mpg.de>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: Rick Edgecombe <rick.p.edgecombe(a)intel.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
conflicts:
mm/vmalloc.c
Signed-off-by: ZhangPeng <zhangpeng362(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
mm/vmalloc.c | 22 +++++++++++++++++-----
1 file changed, 17 insertions(+), 5 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e27cd716ca95..2ca2c1bc0db9 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2641,14 +2641,17 @@ static void __vunmap(const void *addr, int deallocate_pages)
vm_remove_mappings(area, deallocate_pages);
if (deallocate_pages) {
- unsigned int page_order = vm_area_page_order(area);
int i;
- for (i = 0; i < area->nr_pages; i += 1U << page_order) {
+ for (i = 0; i < area->nr_pages; i++) {
struct page *page = area->pages[i];
BUG_ON(!page);
- __free_pages(page, page_order);
+ /*
+ * High-order allocs for huge vmallocs are split, so
+ * can be freed as an array of order-0 allocations
+ */
+ __free_pages(page, 0);
}
atomic_long_sub(area->nr_pages, &nr_vmalloc_pages);
@@ -2930,8 +2933,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
struct page *page;
int p;
- /* Compound pages required for remap_vmalloc_page */
- page = alloc_pages_node(node, gfp_mask | __GFP_COMP, page_order);
+ page = alloc_pages_node(node, gfp_mask, page_order);
if (unlikely(!page)) {
/* Successfully allocated i pages, free them in __vfree() */
area->nr_pages = i;
@@ -2943,6 +2945,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
goto fail;
}
+ /*
+ * Higher order allocations must be able to be treated as
+ * indepdenent small pages by callers (as they can with
+ * small-page vmallocs). Some drivers do their own refcounting
+ * on vmalloc_to_page() pages, some use page->mapping,
+ * page->lru, etc.
+ */
+ if (page_order)
+ split_page(page, page_order);
+
for (p = 0; p < (1U << page_order); p++)
area->pages[i + p] = page + p;
--
2.25.1
[PATCH PR OLK-5.10] mm/vmalloc: huge vmalloc backing pages should be split rather than compound
by Jialin Zhang 16 Mar '23
From: Nicholas Piggin <npiggin(a)gmail.com>
mainline inclusion
from mainline-v5.18-rc4
commit 3b8000ae185cb068adbda5f966a3835053c85fd4
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6LD0S
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Huge vmalloc higher-order backing pages were allocated with __GFP_COMP
in order to allow the sub-pages to be refcounted by callers such as
"remap_vmalloc_page [sic]" (remap_vmalloc_range).
However a similar problem exists for other struct page fields callers
use, for example fb_deferred_io_fault() takes a vmalloc'ed page and
not only refcounts it but uses ->lru, ->mapping, ->index.
This is not compatible with compound sub-pages, and can cause bad page
state issues like
BUG: Bad page state in process swapper/0 pfn:00743
page:(____ptrval____) refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x743
flags: 0x7ffff000000000(node=0|zone=0|lastcpupid=0x7ffff)
raw: 007ffff000000000 c00c00000001d0c8 c00c00000001d0c8 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: corrupted mapping in tail page
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc3-00082-gfc6fff4a7ce1-dirty #2810
Call Trace:
dump_stack_lvl+0x74/0xa8 (unreliable)
bad_page+0x12c/0x170
free_tail_pages_check+0xe8/0x190
free_pcp_prepare+0x31c/0x4e0
free_unref_page+0x40/0x1b0
__vunmap+0x1d8/0x420
...
The correct approach is to use split high-order pages for the huge
vmalloc backing. These allow callers to treat them in exactly the same
way as individually-allocated order-0 pages.
Link: https://lore.kernel.org/all/14444103-d51b-0fb3-ee63-c3f182f0b546@molgen.mpg…
Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com>
Cc: Paul Menzel <pmenzel(a)molgen.mpg.de>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: Rick Edgecombe <rick.p.edgecombe(a)intel.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
conflicts:
mm/vmalloc.c
Signed-off-by: ZhangPeng <zhangpeng362(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
mm/vmalloc.c | 22 +++++++++++++++++-----
1 file changed, 17 insertions(+), 5 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e27cd716ca95..2ca2c1bc0db9 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2641,14 +2641,17 @@ static void __vunmap(const void *addr, int deallocate_pages)
vm_remove_mappings(area, deallocate_pages);
if (deallocate_pages) {
- unsigned int page_order = vm_area_page_order(area);
int i;
- for (i = 0; i < area->nr_pages; i += 1U << page_order) {
+ for (i = 0; i < area->nr_pages; i++) {
struct page *page = area->pages[i];
BUG_ON(!page);
- __free_pages(page, page_order);
+ /*
+ * High-order allocs for huge vmallocs are split, so
+ * can be freed as an array of order-0 allocations
+ */
+ __free_pages(page, 0);
}
atomic_long_sub(area->nr_pages, &nr_vmalloc_pages);
@@ -2930,8 +2933,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
struct page *page;
int p;
- /* Compound pages required for remap_vmalloc_page */
- page = alloc_pages_node(node, gfp_mask | __GFP_COMP, page_order);
+ page = alloc_pages_node(node, gfp_mask, page_order);
if (unlikely(!page)) {
/* Successfully allocated i pages, free them in __vfree() */
area->nr_pages = i;
@@ -2943,6 +2945,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
goto fail;
}
+ /*
+ * Higher order allocations must be able to be treated as
+ * indepdenent small pages by callers (as they can with
+ * small-page vmallocs). Some drivers do their own refcounting
+ * on vmalloc_to_page() pages, some use page->mapping,
+ * page->lru, etc.
+ */
+ if (page_order)
+ split_page(page, page_order);
+
for (p = 0; p < (1U << page_order); p++)
area->pages[i + p] = page + p;
--
2.25.1
[PATCH openEuler-5.10-LTS 01/13] ext4: fix incorrect options show of original mount_opt and extend mount_opt2
by Jialin Zhang 15 Mar '23
From: Zhang Yi <yi.zhang(a)huawei.com>
maillist inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6D5XF
Reference: https://lore.kernel.org/linux-ext4/20230130111138.76tp6pij3yhh4brh@quack3/T…
--------------------------------
Currently _ext4_show_options() does not distinguish the MOPT_2 flag, so
it mixes the extended sbi->s_mount_opt2 options with sbi->s_mount_opt,
which can lead to incorrect options being shown: e.g. fc_debug_force is
shown if we mount with errors=continue mode, and is missed if we actually
set it.
$ mkfs.ext4 /dev/pmem0
$ mount -o errors=remount-ro /dev/pmem0 /mnt
$ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force
#empty
$ mount -o remount,errors=continue /mnt
$ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force
fc_debug_force
$ mount -o remount,errors=remount-ro,fc_debug_force /mnt
$ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force
#empty
Fixes: 995a3ed67fc8 ("ext4: add fast_commit feature and handling for extended mount options")
Signed-off-by: Zhang Yi <yi.zhang(a)huawei.com>
Conflict:
fs/ext4/super.c
Reviewed-by: Zhihao Cheng <chengzhihao1(a)huawei.com>
Reviewed-by: Zhang Xiaoxu <zhangxiaoxu5(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
fs/ext4/ext4.h | 1 +
fs/ext4/super.c | 28 +++++++++++++++++++++-------
2 files changed, 22 insertions(+), 7 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index c470a5fb2f20..7e5abaa31fea 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1459,6 +1459,7 @@ struct ext4_sb_info {
unsigned int s_mount_opt2;
unsigned long s_mount_flags;
unsigned int s_def_mount_opt;
+ unsigned int s_def_mount_opt2;
ext4_fsblk_t s_sb_block;
atomic64_t s_resv_clusters;
kuid_t s_resuid;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index dd3b72ba67e8..da8bd8031119 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -2585,7 +2585,7 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
{
struct ext4_sb_info *sbi = EXT4_SB(sb);
struct ext4_super_block *es = sbi->s_es;
- int def_errors, def_mount_opt = sbi->s_def_mount_opt;
+ int def_errors;
const struct mount_opts *m;
char sep = nodefs ? '\n' : ',';
@@ -2597,15 +2597,28 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
for (m = ext4_mount_opts; m->token != Opt_err; m++) {
int want_set = m->flags & MOPT_SET;
+ int opt_2 = m->flags & MOPT_2;
+ unsigned int mount_opt, def_mount_opt;
+
if (((m->flags & (MOPT_SET|MOPT_CLEAR)) == 0) ||
(m->flags & MOPT_CLEAR_ERR) || m->flags & MOPT_SKIP)
continue;
- if (!nodefs && !(m->mount_opt & (sbi->s_mount_opt ^ def_mount_opt)))
- continue; /* skip if same as the default */
+
+ if (opt_2) {
+ mount_opt = sbi->s_mount_opt2;
+ def_mount_opt = sbi->s_def_mount_opt2;
+ } else {
+ mount_opt = sbi->s_mount_opt;
+ def_mount_opt = sbi->s_def_mount_opt;
+ }
+ /* skip if same as the default */
+ if (!nodefs && !(m->mount_opt & (mount_opt ^ def_mount_opt)))
+ continue;
+ /* select Opt_noFoo vs Opt_Foo */
if ((want_set &&
- (sbi->s_mount_opt & m->mount_opt) != m->mount_opt) ||
- (!want_set && (sbi->s_mount_opt & m->mount_opt)))
- continue; /* select Opt_noFoo vs Opt_Foo */
+ (mount_opt & m->mount_opt) != m->mount_opt) ||
+ (!want_set && (mount_opt & m->mount_opt)))
+ continue;
SEQ_OPTS_PRINT("%s", token2str(m->token));
}
@@ -2635,7 +2648,7 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
if (nodefs || sbi->s_stripe)
SEQ_OPTS_PRINT("stripe=%lu", sbi->s_stripe);
if (nodefs || EXT4_MOUNT_DATA_FLAGS &
- (sbi->s_mount_opt ^ def_mount_opt)) {
+ (sbi->s_mount_opt ^ sbi->s_def_mount_opt)) {
if (test_opt(sb, DATA_FLAGS) == EXT4_MOUNT_JOURNAL_DATA)
SEQ_OPTS_PUTS("data=journal");
else if (test_opt(sb, DATA_FLAGS) == EXT4_MOUNT_ORDERED_DATA)
@@ -4340,6 +4353,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
kfree(s_mount_opts);
}
sbi->s_def_mount_opt = sbi->s_mount_opt;
+ sbi->s_def_mount_opt2 = sbi->s_mount_opt2;
if (!parse_options((char *) data, sb, &journal_devnum,
&journal_ioprio, 0))
goto failed_mount;
--
2.25.1
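The core test in _ext4_show_options() is the XOR against the mount-time defaults: (mount_opt ^ def_mount_opt) yields exactly the bits that changed since mount, and only those options are printed. A tiny userspace demo with made-up flag bits (not the real ext4 values):

#include <stdio.h>

int main(void)
{
	unsigned int def_opt2 = 0x4;        /* recorded at mount time */
	unsigned int opt2     = 0x4 | 0x1;  /* extra bit set on remount */
	unsigned int m_opt    = 0x1;        /* bit owned by the table entry */

	/* print the option only if its bit differs from the default */
	if (m_opt & (opt2 ^ def_opt2))
		printf("option shown\n");   /* this prints */
	return 0;
}

The bug was that the comparison always used s_mount_opt and its default even for MOPT_2 entries, so s_mount_opt2 bits were tested against the wrong word.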
[PATCH openEuler-5.10-LTS-SP1 01/13] ext4: fix incorrect options show of original mount_opt and extend mount_opt2
by Jialin Zhang 15 Mar '23
From: Zhang Yi <yi.zhang(a)huawei.com>
maillist inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6D5XF
Reference: https://lore.kernel.org/linux-ext4/20230130111138.76tp6pij3yhh4brh@quack3/T…
--------------------------------
Currently _ext4_show_options() does not distinguish the MOPT_2 flag, so
it mixes the extended sbi->s_mount_opt2 options with sbi->s_mount_opt,
which can lead to incorrect options being shown: e.g. fc_debug_force is
shown if we mount with errors=continue mode, and is missed if we actually
set it.
$ mkfs.ext4 /dev/pmem0
$ mount -o errors=remount-ro /dev/pmem0 /mnt
$ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force
#empty
$ mount -o remount,errors=continue /mnt
$ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force
fc_debug_force
$ mount -o remount,errors=remount-ro,fc_debug_force /mnt
$ cat /proc/fs/ext4/pmem0/options | grep fc_debug_force
#empty
Fixes: 995a3ed67fc8 ("ext4: add fast_commit feature and handling for extended mount options")
Signed-off-by: Zhang Yi <yi.zhang(a)huawei.com>
Conflict:
fs/ext4/super.c
Reviewed-by: Zhihao Cheng <chengzhihao1(a)huawei.com>
Reviewed-by: Zhang Xiaoxu <zhangxiaoxu5(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
fs/ext4/ext4.h | 1 +
fs/ext4/super.c | 28 +++++++++++++++++++++-------
2 files changed, 22 insertions(+), 7 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 8245b94d8fc6..f1d36671bc2b 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1462,6 +1462,7 @@ struct ext4_sb_info {
unsigned int s_mount_opt2;
unsigned long s_mount_flags;
unsigned int s_def_mount_opt;
+ unsigned int s_def_mount_opt2;
ext4_fsblk_t s_sb_block;
atomic64_t s_resv_clusters;
kuid_t s_resuid;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index c94ea845ea57..4fd680507948 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -2586,7 +2586,7 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
{
struct ext4_sb_info *sbi = EXT4_SB(sb);
struct ext4_super_block *es = sbi->s_es;
- int def_errors, def_mount_opt = sbi->s_def_mount_opt;
+ int def_errors;
const struct mount_opts *m;
char sep = nodefs ? '\n' : ',';
@@ -2598,15 +2598,28 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
for (m = ext4_mount_opts; m->token != Opt_err; m++) {
int want_set = m->flags & MOPT_SET;
+ int opt_2 = m->flags & MOPT_2;
+ unsigned int mount_opt, def_mount_opt;
+
if (((m->flags & (MOPT_SET|MOPT_CLEAR)) == 0) ||
(m->flags & MOPT_CLEAR_ERR) || m->flags & MOPT_SKIP)
continue;
- if (!nodefs && !(m->mount_opt & (sbi->s_mount_opt ^ def_mount_opt)))
- continue; /* skip if same as the default */
+
+ if (opt_2) {
+ mount_opt = sbi->s_mount_opt2;
+ def_mount_opt = sbi->s_def_mount_opt2;
+ } else {
+ mount_opt = sbi->s_mount_opt;
+ def_mount_opt = sbi->s_def_mount_opt;
+ }
+ /* skip if same as the default */
+ if (!nodefs && !(m->mount_opt & (mount_opt ^ def_mount_opt)))
+ continue;
+ /* select Opt_noFoo vs Opt_Foo */
if ((want_set &&
- (sbi->s_mount_opt & m->mount_opt) != m->mount_opt) ||
- (!want_set && (sbi->s_mount_opt & m->mount_opt)))
- continue; /* select Opt_noFoo vs Opt_Foo */
+ (mount_opt & m->mount_opt) != m->mount_opt) ||
+ (!want_set && (mount_opt & m->mount_opt)))
+ continue;
SEQ_OPTS_PRINT("%s", token2str(m->token));
}
@@ -2636,7 +2649,7 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
if (nodefs || sbi->s_stripe)
SEQ_OPTS_PRINT("stripe=%lu", sbi->s_stripe);
if (nodefs || EXT4_MOUNT_DATA_FLAGS &
- (sbi->s_mount_opt ^ def_mount_opt)) {
+ (sbi->s_mount_opt ^ sbi->s_def_mount_opt)) {
if (test_opt(sb, DATA_FLAGS) == EXT4_MOUNT_JOURNAL_DATA)
SEQ_OPTS_PUTS("data=journal");
else if (test_opt(sb, DATA_FLAGS) == EXT4_MOUNT_ORDERED_DATA)
@@ -4341,6 +4354,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
kfree(s_mount_opts);
}
sbi->s_def_mount_opt = sbi->s_mount_opt;
+ sbi->s_def_mount_opt2 = sbi->s_mount_opt2;
if (!parse_options((char *) data, sb, &journal_devnum,
&journal_ioprio, 0))
goto failed_mount;
--
2.25.1
15 Mar '23
From: tanghui <tanghui20(a)huawei.com>
Optimise the way CPU utilization is obtained for dynamic affinity:
sample per-CPU busy time from the kcpustat counters at a configurable
interval instead of relying on the task group's util_avg.
Signed-off-by: tanghui <tanghui20(a)huawei.com>
Signed-off-by: Wang ShaoBo <bobo.shaobowang(a)huawei.com>
v2:
- rework how the low_pct threshold is updated so that the change goes
  through sysctl.
---
fs/proc/stat.c | 4 ++
include/linux/sched/cputime.h | 3 ++
include/linux/sched/sysctl.h | 2 +
kernel/sched/fair.c | 83 +++++++++++++++++++++++++++++++++++
kernel/sched/sched.h | 1 +
kernel/sysctl.c | 9 ++++
6 files changed, 102 insertions(+)
diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index 7e832b24847dd..3fe60a77b0b4d 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -63,7 +63,11 @@ u64 get_idle_time(int cpu)
return idle;
}
+#ifdef CONFIG_QOS_SCHED_DYNAMIC_AFFINITY
+u64 get_iowait_time(int cpu)
+#else
static u64 get_iowait_time(int cpu)
+#endif
{
u64 iowait, iowait_usecs = -1ULL;
diff --git a/include/linux/sched/cputime.h b/include/linux/sched/cputime.h
index 6b1793606fc95..4a092e006f5b2 100644
--- a/include/linux/sched/cputime.h
+++ b/include/linux/sched/cputime.h
@@ -189,6 +189,9 @@ task_sched_runtime(struct task_struct *task);
extern int use_sched_idle_time;
extern int sched_idle_time_adjust(int cpu, u64 *utime, u64 *stime);
extern unsigned long long sched_get_idle_time(int cpu);
+#ifdef CONFIG_QOS_SCHED_DYNAMIC_AFFINITY
+extern u64 get_iowait_time(int cpu);
+#endif
#ifdef CONFIG_PROC_FS
extern u64 get_idle_time(int cpu);
diff --git a/include/linux/sched/sysctl.h b/include/linux/sched/sysctl.h
index 04eb5b127867b..8223a1fce176c 100644
--- a/include/linux/sched/sysctl.h
+++ b/include/linux/sched/sysctl.h
@@ -31,6 +31,8 @@ extern unsigned int sysctl_sched_latency;
extern unsigned int sysctl_sched_min_granularity;
extern unsigned int sysctl_sched_wakeup_granularity;
extern unsigned int sysctl_sched_child_runs_first;
+extern int sysctl_sched_util_update_interval;
+extern unsigned long sysctl_sched_util_update_interval_max;
#ifdef CONFIG_QOS_SCHED_DYNAMIC_AFFINITY
extern int sysctl_sched_util_low_pct;
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ad6a7923c9edb..af55a26d11fcb 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6682,6 +6682,73 @@ static int wake_cap(struct task_struct *p, int cpu, int prev_cpu)
*/
int sysctl_sched_util_low_pct = 85;
+struct cpu_timeinfo {
+ u64 systime;
+ u64 idletime;
+ unsigned long next_update;
+ int vutil;
+};
+
+/*
+ * The time interval to update CPU utilization
+ * (default 1ms, max 10min)
+ */
+int sysctl_sched_util_update_interval = 1;
+unsigned long sysctl_sched_util_update_interval_max = 600000;
+
+static DEFINE_PER_CPU(struct cpu_timeinfo, qos_cputime);
+
+static inline u64 cpu_systime(int cpu)
+{
+ u64 user, nice, system, idle, iowait, irq, softirq, steal;
+
+ user = kcpustat_cpu(cpu).cpustat[CPUTIME_USER];
+ system = kcpustat_cpu(cpu).cpustat[CPUTIME_SYSTEM];
+ iowait = get_iowait_time(cpu);
+ irq = kcpustat_cpu(cpu).cpustat[CPUTIME_IRQ];
+ softirq = kcpustat_cpu(cpu).cpustat[CPUTIME_SOFTIRQ];
+ nice = kcpustat_cpu(cpu).cpustat[CPUTIME_NICE];
+ steal = kcpustat_cpu(cpu).cpustat[CPUTIME_STEAL];
+ idle = get_idle_time(cpu);
+
+ return user + system + iowait + irq + softirq + nice + idle + steal;
+}
+
+static inline u64 cpu_idletime(int cpu)
+{
+ return get_idle_time(cpu) + get_iowait_time(cpu);
+}
+
+static inline void update_cpu_vutil(void)
+{
+ struct cpu_timeinfo *cputime = per_cpu_ptr(&qos_cputime, smp_processor_id());
+ u64 delta_systime, delta_idle, systime, idletime;
+ int cpu = smp_processor_id();
+ unsigned long interval;
+
+ if (time_after(jiffies, cputime->next_update)) {
+ interval = msecs_to_jiffies(sysctl_sched_util_update_interval);
+ cputime->next_update = jiffies + interval;
+ systime = cpu_systime(cpu);
+ idletime = cpu_idletime(cpu);
+ delta_systime = systime - cputime->systime;
+ delta_idle = idletime - cputime->idletime;
+ if (!delta_systime)
+ return;
+
+ cputime->systime = systime;
+ cputime->idletime = idletime;
+ cputime->vutil = (delta_systime - delta_idle) * 100 / delta_systime;
+ }
+}
+
+static inline int cpu_vutil_of(int cpu)
+{
+ struct cpu_timeinfo *cputime = per_cpu_ptr(&qos_cputime, cpu);
+
+ return cputime->vutil;
+}
+
static inline bool prefer_cpus_valid(struct task_struct *p)
{
return p->prefer_cpus &&
@@ -6741,17 +6808,29 @@ static void set_task_select_cpus(struct task_struct *p, int *idlest_cpu,
return;
}
+#if 0
util_avg_sum += tg->se[cpu]->avg.util_avg;
tg_capacity += capacity_of(cpu);
+#endif
+ util_avg_sum += cpu_vutil_of(cpu);
}
rcu_read_unlock();
+#if 0
if (tg_capacity > cpumask_weight(p->prefer_cpus) &&
util_avg_sum * 100 <= tg_capacity * sysctl_sched_util_low_pct) {
p->select_cpus = p->prefer_cpus;
if (sd_flag & SD_BALANCE_WAKE)
schedstat_inc(p->se.dyn_affi_stats->nr_wakeups_preferred_cpus);
}
+#endif
+
+ if (util_avg_sum < sysctl_sched_util_low_pct *
+ cpumask_weight(p->prefer_cpus)) {
+ p->select_cpus = p->prefer_cpus;
+ if (sd_flag & SD_BALANCE_WAKE)
+ schedstat_inc(p->se.dyn_affi_stats->nr_wakeups_preferred_cpus);
+ }
}
#endif
@@ -10610,6 +10689,10 @@ static void task_tick_fair(struct rq *rq, struct task_struct *curr, int queued)
if (static_branch_unlikely(&sched_numa_balancing))
task_tick_numa(rq, curr);
+
+#ifdef CONFIG_QOS_SCHED_DYNAMIC_AFFINITY
+ update_cpu_vutil();
+#endif
}
/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index ae30681530938..045fbb3871bbe 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2354,3 +2354,4 @@ static inline void membarrier_switch_mm(struct rq *rq,
{
}
#endif
+
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index ad62ea156afd9..685f9881b8e23 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -352,6 +352,15 @@ static struct ctl_table kern_table[] = {
.mode = 0644,
.proc_handler = proc_dointvec,
},
+ {
+ .procname = "sched_util_update_interval_ms",
+ .data = &sysctl_sched_util_update_interval,
+ .maxlen = sizeof(sysctl_sched_util_update_interval),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &one,
+ .extra2 = &sysctl_sched_util_update_interval_max,
+ },
#ifdef CONFIG_SCHED_DEBUG
{
.procname = "sched_min_granularity_ns",
--
2.25.1
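The utilization figure the patch maintains per CPU is simply busy time over total time across the sampling window, computed from cumulative counters. A worked userspace example with illustrative numbers (not taken from a real system):

#include <stdio.h>

int main(void)
{
	/* cumulative counters sampled at two points in time */
	unsigned long long total0 = 100000, idle0 = 70000;
	unsigned long long total1 = 101000, idle1 = 70400;

	unsigned long long d_total = total1 - total0;   /* 1000 ticks */
	unsigned long long d_idle  = idle1 - idle0;     /*  400 ticks */

	/* vutil = (busy time / total time) * 100, as in update_cpu_vutil() */
	unsigned long long vutil = (d_total - d_idle) * 100 / d_total;

	printf("vutil = %llu%%\n", vutil);  /* prints: vutil = 60% */
	return 0;
}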
From: tanghui <tanghui20(a)huawei.com>
sched: optimise the way to get util; this path can only be used when
CONFIG_QOS_SCHED_DYNAMIC_AFFINITY is set.
Signed-off-by: tanghui <tanghui20(a)huawei.com>
---
fs/proc/stat.c | 4 ++
include/linux/sched/cputime.h | 3 ++
include/linux/sched/sysctl.h | 2 +
kernel/sched/fair.c | 97 ++++++++++++++++++++++++++++++-----
kernel/sched/sched.h | 1 +
kernel/sysctl.c | 9 ++++
6 files changed, 104 insertions(+), 12 deletions(-)
diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index c83a10e895f4..2eaba2b78f47 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -63,7 +63,11 @@ u64 get_idle_time(int cpu)
return idle;
}
+#ifdef CONFIG_QOS_SCHED_DYNAMIC_AFFINITY
+u64 get_iowait_time(int cpu)
+#else
static u64 get_iowait_time(int cpu)
+#endif
{
u64 iowait, iowait_usecs = -1ULL;
diff --git a/include/linux/sched/cputime.h b/include/linux/sched/cputime.h
index 1ebbeec02051..f6244d48f357 100644
--- a/include/linux/sched/cputime.h
+++ b/include/linux/sched/cputime.h
@@ -190,5 +190,8 @@ extern int use_sched_idle_time;
extern int sched_idle_time_adjust(int cpu, u64 *utime, u64 *stime);
extern unsigned long long sched_get_idle_time(int cpu);
extern u64 get_idle_time(int cpu);
+#ifdef CONFIG_QOS_SCHED_DYNAMIC_AFFINITY
+extern u64 get_iowait_time(int cpu);
+#endif
#endif /* _LINUX_SCHED_CPUTIME_H */
diff --git a/include/linux/sched/sysctl.h b/include/linux/sched/sysctl.h
index f5031a607df8..386ef53017ca 100644
--- a/include/linux/sched/sysctl.h
+++ b/include/linux/sched/sysctl.h
@@ -28,6 +28,8 @@ extern unsigned int sysctl_sched_child_runs_first;
extern int sysctl_sched_util_low_pct;
extern int sysctl_sched_util_higher_pct;
extern int sysctl_sched_load_higher_pct;
+extern int sysctl_sched_util_update_interval;
+extern unsigned long sysctl_sched_util_update_interval_max;
#endif
enum sched_tunable_scaling {
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5344a68a463e..3217e3998fdf 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6458,6 +6458,15 @@ static int wake_cap(struct task_struct *p, int cpu, int prev_cpu)
}
#ifdef CONFIG_QOS_SCHED_DYNAMIC_AFFINITY
+#define UTIL_PCT_HIGH 85
+
+struct cpu_timeinfo {
+ u64 systime;
+ u64 idletime;
+ unsigned long next_update;
+ int vutil;
+};
+
/*
* Light load threshold for CPU: just use cpu Utilization to measure
*
@@ -6484,15 +6493,75 @@ int sysctl_sched_util_higher_pct = 100;
*/
int sysctl_sched_load_higher_pct = 10;
+/*
+ * The time interval to update CPU utilization
+ * (default 1ms, max 10min)
+ */
+int sysctl_sched_util_update_interval = 1;
+unsigned long sysctl_sched_util_update_interval_max = 600000;
+
+static DEFINE_PER_CPU(struct cpu_timeinfo, qos_cputime);
+
+static inline u64 cpu_systime(int cpu)
+{
+ u64 user, nice, system, idle, iowait, irq, softirq, steal;
+
+ user = kcpustat_cpu(cpu).cpustat[CPUTIME_USER];
+ system = kcpustat_cpu(cpu).cpustat[CPUTIME_SYSTEM];
+ iowait = get_iowait_time(cpu);
+ irq = kcpustat_cpu(cpu).cpustat[CPUTIME_IRQ];
+ softirq = kcpustat_cpu(cpu).cpustat[CPUTIME_SOFTIRQ];
+ nice = kcpustat_cpu(cpu).cpustat[CPUTIME_NICE];
+ steal = kcpustat_cpu(cpu).cpustat[CPUTIME_STEAL];
+ idle = get_idle_time(cpu);
+
+ return user + system + iowait + irq + softirq + nice + idle + steal;
+}
+
+static inline u64 cpu_idletime(int cpu)
+{
+ return get_idle_time(cpu) + get_iowait_time(cpu);
+}
+
+static inline void update_cpu_vutil(void)
+{
+ struct cpu_timeinfo *cputime = per_cpu_ptr(&qos_cputime, smp_processor_id());
+ u64 delta_systime, delta_idle, systime, idletime;
+ int cpu = smp_processor_id();
+ unsigned long interval;
+
+ if (time_after(jiffies, cputime->next_update)) {
+ interval = msecs_to_jiffies(sysctl_sched_util_update_interval);
+ cputime->next_update = jiffies + interval;
+ systime = cpu_systime(cpu);
+ idletime = cpu_idletime(cpu);
+ delta_systime = systime - cputime->systime;
+ delta_idle = idletime - cputime->idletime;
+ if (!delta_systime)
+ return;
+
+ cputime->systime = systime;
+ cputime->idletime = idletime;
+ cputime->vutil = (delta_systime - delta_idle) * 100 / delta_systime;
+ }
+}
+
+static inline int cpu_vutil_of(int cpu)
+{
+ struct cpu_timeinfo *cputime = per_cpu_ptr(&qos_cputime, cpu);
+
+ return cputime->vutil;
+}
+
static inline bool prefer_cpu_util_low(int cpu)
{
unsigned long capacity = capacity_of(cpu);
- unsigned long util = cpu_util(cpu);
+ unsigned long util_pct = cpu_vutil_of(cpu);
- if (util >= capacity || capacity <= 1)
+ if (util_pct >= 100 || capacity <= 1)
return sysctl_sched_util_low_pct == 100;
- return util * 100 <= capacity * sysctl_sched_util_low_pct;
+ return util_pct <= sysctl_sched_util_low_pct;
}
/*
@@ -6504,8 +6573,8 @@ static inline int compare_cpu_util(int preferred_cpu, int external_cpu)
{
unsigned long capacity_cpux = capacity_of(preferred_cpu);
unsigned long capacity_cpuy = capacity_of(external_cpu);
- unsigned long cpu_util_x = cpu_util(preferred_cpu);
- unsigned long cpu_util_y = cpu_util(external_cpu);
+ unsigned long cpu_util_x = cpu_vutil_of(preferred_cpu);
+ unsigned long cpu_util_y = cpu_vutil_of(external_cpu);
int ratio;
/*
@@ -6519,7 +6588,7 @@ static inline int compare_cpu_util(int preferred_cpu, int external_cpu)
if (capacity_cpux <= 1)
return 1;
- if (cpu_util_x >= capacity_cpux && available_idle_cpu(external_cpu))
+ if (cpu_util_x >= UTIL_PCT_HIGH && available_idle_cpu(external_cpu))
return 1;
if (!cpu_util_x)
@@ -6529,12 +6598,12 @@ static inline int compare_cpu_util(int preferred_cpu, int external_cpu)
* The lower the CPU utilization, the larger the ratio of
* CPU utilization gap.
*/
- ratio = cpu_util_x >= capacity_cpux ? 1 : capacity_cpux / cpu_util_x;
+ ratio = cpu_util_x >= 100 ? 0 : 100 / cpu_util_x;
if (ratio > 10)
ratio = 10;
- return (sysctl_sched_util_higher_pct * ratio + 100) * cpu_util_y *
- capacity_cpux < 100 * cpu_util_x * capacity_cpuy;
+ return (sysctl_sched_util_higher_pct * ratio + 100) *
+ cpu_util_y * capacity_cpux < 100 * cpu_util_x * capacity_cpuy;
}
static inline bool prefer_cpus_valid(struct task_struct *p)
@@ -6549,7 +6618,7 @@ static inline bool prefer_cpus_valid(struct task_struct *p)
* @p: the task whose available cpu range will to set
* @idlest_cpu: the cpu which is the idlest in prefer cpus
*
- * x: the cpu of min util_avg from preferred set
+ * x: the cpu of min util from preferred set
* y: the cpu from allowed set but exclude preferred set
*
* If x's utilization is low, select preferred cpu range for task
@@ -6578,7 +6647,7 @@ static void set_task_select_cpus(struct task_struct *p, int *idlest_cpu,
goto cpus_allowed;
cpu = cpumask_first(p->se.prefer_cpus);
- min_util = cpu_util(cpu);
+ min_util = cpu_vutil_of(cpu);
for_each_cpu(i, p->se.prefer_cpus) {
if (prefer_cpu_util_low(i) || available_idle_cpu(i)) {
cpu = i;
@@ -6589,7 +6658,7 @@ static void set_task_select_cpus(struct task_struct *p, int *idlest_cpu,
if (capacity_of(i) <= 1)
continue;
- c_util = cpu_util(i);
+ c_util = cpu_vutil_of(i);
if (min_util * capacity_of(i) > c_util * capacity_of(cpu)) {
min_util = c_util;
cpu = i;
@@ -10278,6 +10347,10 @@ static void task_tick_fair(struct rq *rq, struct task_struct *curr, int queued)
if (static_branch_unlikely(&sched_numa_balancing))
task_tick_numa(rq, curr);
+
+#ifdef CONFIG_QOS_SCHED_DYNAMIC_AFFINITY
+ update_cpu_vutil();
+#endif
}
/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 55216a03d327..4de9c966da01 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2348,3 +2348,4 @@ static inline void membarrier_switch_mm(struct rq *rq,
{
}
#endif
+
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index b309aa206697..d52e003f3f56 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1274,6 +1274,15 @@ static struct ctl_table kern_table[] = {
.extra1 = &zero,
.extra2 = &one_hundred,
},
+ {
+ .procname = "sched_util_update_interval_ms",
+ .data = &sysctl_sched_util_update_interval,
+ .maxlen = sizeof(sysctl_sched_util_update_interval),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &one,
+ .extra2 = &sysctl_sched_util_update_interval_max,
+ },
#endif
#endif
{ }
--
2.23.0
[PATCH openEuler-1.0-LTS 1/2] scsi: fix use-after-free problem in scsi_remove_target
by Yongqiang Liu 13 Mar '23
From: Zhong Jinghua <zhongjinghua(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 188355, https://gitee.com/openeuler/kernel/issues/I6E4JF
CVE: NA
----------------------------------------
A use-after-free problem like below:
BUG: KASAN: use-after-free in scsi_target_reap+0x6c/0x70
Workqueue: scsi_wq_1 __iscsi_unbind_session [scsi_transport_iscsi]
Call trace:
dump_backtrace+0x0/0x320
show_stack+0x24/0x30
dump_stack+0xdc/0x128
print_address_description+0x68/0x278
kasan_report+0x1e4/0x308
__asan_report_load4_noabort+0x30/0x40
scsi_target_reap+0x6c/0x70
scsi_remove_target+0x430/0x640
__iscsi_unbind_session+0x164/0x268 [scsi_transport_iscsi]
process_one_work+0x67c/0x1350
worker_thread+0x370/0xf90
kthread+0x2a4/0x320
ret_from_fork+0x10/0x18
The problem is caused by a concurrency scenario:
T0: delete target
// echo 1 > /sys/devices/platform/host1/session1/target1:0:0/1:0:0:1/delete
T1: logout
// iscsiadm -m node --logout
T0 T1
sdev_store_delete
scsi_remove_device
device_remove_file
__scsi_remove_device
__iscsi_unbind_session
scsi_remove_target
spin_lock_irqsave
list_for_each_entry
scsi_target_reap
// starget->reap_ref 1 -> 0
kref_get(&starget->reap_ref);
// warn use-after-free.
spin_unlock_irqrestore
scsi_target_reap_ref_release
scsi_target_destroy
... // delete starget
scsi_target_reap
// UAF
When T0 has dropped the reference count to 0 but the target has not yet
been released, T1 can still reach it via list_for_each_entry(), and the
subsequent kref_get() then operates on a dying object, which triggers
the use-after-free report.
Fix it by using kref_get_unless_zero(), which skips targets whose
reference count has already reached 0.
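For illustration, a minimal userspace sketch of the pattern (illustrative
types and names, not the kernel's kref implementation): a conditional get
refuses to take a reference once the count has dropped to 0, which is
exactly what lets the removal loop skip a dying starget.
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
struct obj {
	atomic_int refcount;
};
/* Unconditional get: wrong once the count may already be 0. */
static void obj_get(struct obj *o)
{
	atomic_fetch_add(&o->refcount, 1);
}
/* Conditional get: refuses to resurrect a dying object. */
static bool obj_get_unless_zero(struct obj *o)
{
	int c = atomic_load(&o->refcount);
	while (c != 0)
		if (atomic_compare_exchange_weak(&o->refcount, &c, c + 1))
			return true;
	return false;
}
int main(void)
{
	struct obj a = { .refcount = 0 };	/* release already under way */
	struct obj b = { .refcount = 0 };
	obj_get(&a);	/* old behaviour: silently revives, count 0 -> 1 */
	printf("plain get:       refcount=%d (use-after-free waiting to happen)\n",
	       atomic_load(&a.refcount));
	printf("get_unless_zero: %s\n",
	       obj_get_unless_zero(&b) ? "took ref" : "skipped dying object");
	return 0;
}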
Signed-off-by: Zhong Jinghua <zhongjinghua(a)huawei.com>
Reviewed-by: Hou Tao <houtao1(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/scsi/scsi_sysfs.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 4030e1fa57e5..453f5a6fb96b 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1500,7 +1500,16 @@ void scsi_remove_target(struct device *dev)
starget->state == STARGET_CREATED_REMOVE)
continue;
if (starget->dev.parent == dev || &starget->dev == dev) {
- kref_get(&starget->reap_ref);
+ /*
+ * If the reference count is already zero, skip
+ * this target. Calling kref_get_unless_zero() if
+ * the reference count is zero is safe because
+ * scsi_target_destroy() will wait until the host
+ * lock has been released before freeing starget.
+ */
+ if (!kref_get_unless_zero(&starget->reap_ref))
+ continue;
+
if (starget->state == STARGET_CREATED)
starget->state = STARGET_CREATED_REMOVE;
else
--
2.25.1
[OLK-5.10 v1] RDMA/hns: Support congestion control algorithm parameter configuration
by Chengchang Tang 13 Mar '23
driver inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I6J5O7
---------------------------------------------------------------
The hns RoCE driver supports 4 congestion control algorithms, each of
which involves multiple parameters. This patch adds a per-port sysfs
directory for each algorithm, allowing users to modify the parameters
of these algorithms.
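For illustration, a minimal usage sketch. The device name "hns_0" and
port number are hypothetical; the "ports/<n>/" layout comes from the RDMA
core, and the "cc_param"/"dcqcn_cc_param" path components come from the
kobject and attribute-group names added below.
#include <stdio.h>
int main(void)
{
	const char *path = "/sys/class/infiniband/hns_0/ports/1/"
			   "cc_param/dcqcn_cc_param/lifespan";
	FILE *f = fopen(path, "w");
	if (!f) {
		perror("fopen");
		return 1;
	}
	/* Let query results be cached for 500 ms before re-reading HW. */
	fprintf(f, "500\n");
	return fclose(f) ? 1 : 0;
}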
Signed-off-by: Chengchang Tang <tangchengchang(a)huawei.com>
Reviewed-by: Yangyang Li <liyangyang20(a)huawei.com>
---
drivers/infiniband/hw/hns/Makefile | 2 +-
drivers/infiniband/hw/hns/hns_roce_device.h | 42 +-
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 77 ++++
drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 132 ++++++
drivers/infiniband/hw/hns/hns_roce_main.c | 2 +
drivers/infiniband/hw/hns/hns_roce_sysfs.c | 445 ++++++++++++++++++++
6 files changed, 695 insertions(+), 5 deletions(-)
create mode 100644 drivers/infiniband/hw/hns/hns_roce_sysfs.c
diff --git a/drivers/infiniband/hw/hns/Makefile b/drivers/infiniband/hw/hns/Makefile
index 77d8e41ffe7b..09f95fedd15d 100644
--- a/drivers/infiniband/hw/hns/Makefile
+++ b/drivers/infiniband/hw/hns/Makefile
@@ -10,7 +10,7 @@ ccflags-y += -I $(srctree)/drivers/net/ethernet/hisilicon/hns3/hns3_common
hns-roce-objs := hns_roce_main.o hns_roce_cmd.o hns_roce_pd.o \
hns_roce_ah.o hns_roce_hem.o hns_roce_mr.o hns_roce_qp.o \
hns_roce_cq.o hns_roce_alloc.o hns_roce_db.o hns_roce_srq.o hns_roce_restrack.o \
- hns_roce_bond.o hns_roce_dca.o hns_roce_debugfs.o
+ hns_roce_bond.o hns_roce_dca.o hns_roce_debugfs.o hns_roce_sysfs.o
ifdef CONFIG_INFINIBAND_HNS_HIP08
hns-roce-hw-v2-objs := hns_roce_hw_v2.o $(hns-roce-objs)
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 869d952b32b2..9d5580040e6b 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -767,11 +767,19 @@ struct hns_roce_eq_table {
struct hns_roce_eq *eq;
};
+enum hns_roce_scc_algo {
+ HNS_ROCE_SCC_ALGO_DCQCN = 0,
+ HNS_ROCE_SCC_ALGO_LDCP,
+ HNS_ROCE_SCC_ALGO_HC3,
+ HNS_ROCE_SCC_ALGO_DIP,
+ HNS_ROCE_SCC_ALGO_TOTAL,
+};
+
enum cong_type {
- CONG_TYPE_DCQCN,
- CONG_TYPE_LDCP,
- CONG_TYPE_HC3,
- CONG_TYPE_DIP,
+ CONG_TYPE_DCQCN = 1 << HNS_ROCE_SCC_ALGO_DCQCN,
+ CONG_TYPE_LDCP = 1 << HNS_ROCE_SCC_ALGO_LDCP,
+ CONG_TYPE_HC3 = 1 << HNS_ROCE_SCC_ALGO_HC3,
+ CONG_TYPE_DIP = 1 << HNS_ROCE_SCC_ALGO_DIP,
};
struct hns_roce_caps {
@@ -1016,6 +1024,28 @@ struct hns_roce_hw {
int (*bond_init)(struct hns_roce_dev *hr_dev);
bool (*bond_is_active)(struct hns_roce_dev *hr_dev);
struct net_device *(*get_bond_netdev)(struct hns_roce_dev *hr_dev);
+ int (*config_scc_param)(struct hns_roce_dev *hr_dev, u8 port_num,
+ enum hns_roce_scc_algo algo);
+ int (*query_scc_param)(struct hns_roce_dev *hr_dev, u8 port_num,
+ enum hns_roce_scc_algo algo);
+};
+
+#define HNS_ROCE_SCC_PARAM_SIZE 4
+struct hns_roce_scc_param {
+ __le32 param[HNS_ROCE_SCC_PARAM_SIZE];
+ u32 lifespan;
+ unsigned long timestamp;
+ enum hns_roce_scc_algo algo_type;
+ struct delayed_work scc_cfg_dwork;
+ struct hns_roce_dev *hr_dev;
+ u8 port_num;
+};
+
+struct hns_roce_port {
+ struct hns_roce_dev *hr_dev;
+ u8 port_num;
+ struct kobject kobj;
+ struct hns_roce_scc_param *scc_param;
};
struct hns_roce_dev {
@@ -1094,6 +1124,7 @@ struct hns_roce_dev {
struct delayed_work bond_work;
struct hns_roce_bond_group *bond_grp;
struct netdev_lag_lower_state_info slave_state;
+ struct hns_roce_port port_data[HNS_ROCE_MAX_PORTS];
atomic64_t *dfx_cnt;
};
@@ -1379,4 +1410,7 @@ struct hns_user_mmap_entry *
hns_roce_user_mmap_entry_insert(struct ib_ucontext *ucontext, u64 address,
size_t length,
enum hns_roce_mmap_type mmap_type);
+int hns_roce_create_port_files(struct ib_device *ibdev, u8 port_num,
+ struct kobject *kobj);
+void hns_roce_unregister_sysfs(struct hns_roce_dev *hr_dev);
#endif /* _HNS_ROCE_DEVICE_H */
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index a76a57c7e36a..c9826a010f38 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -7146,6 +7146,81 @@ static void hns_roce_v2_cleanup_eq_table(struct hns_roce_dev *hr_dev)
kfree(eq_table->eq);
}
+static enum hns_roce_opcode_type scc_opcode[] = {
+ HNS_ROCE_OPC_CFG_DCQCN_PARAM,
+ HNS_ROCE_OPC_CFG_LDCP_PARAM,
+ HNS_ROCE_OPC_CFG_HC3_PARAM,
+ HNS_ROCE_OPC_CFG_DIP_PARAM,
+};
+
+static int hns_roce_v2_config_scc_param(struct hns_roce_dev *hr_dev,
+ u8 port_num,
+ enum hns_roce_scc_algo algo)
+{
+ struct hns_roce_scc_param *scc_param;
+ struct hns_roce_cmq_desc desc;
+ struct hns_roce_port *pdata;
+ int ret;
+
+ if (port_num > hr_dev->caps.num_ports) {
+ ibdev_err_ratelimited(&hr_dev->ib_dev,
+ "invalid port num %u.\n", port_num);
+ return -ENODEV;
+ }
+
+ if (algo >= HNS_ROCE_SCC_ALGO_TOTAL) {
+ ibdev_err_ratelimited(&hr_dev->ib_dev, "invalid SCC algo.\n");
+ return -EINVAL;
+ }
+
+ hns_roce_cmq_setup_basic_desc(&desc, scc_opcode[algo], false);
+ pdata = &hr_dev->port_data[port_num - 1];
+ scc_param = &pdata->scc_param[algo];
+ memcpy(&desc.data, scc_param, sizeof(scc_param->param));
+
+ ret = hns_roce_cmq_send(hr_dev, &desc, 1);
+ if (ret)
+ ibdev_err_ratelimited(&hr_dev->ib_dev,
+ "failed to configure scc param, opcode: 0x%x, ret = %d.\n",
+ le16_to_cpu(desc.opcode), ret);
+ return ret;
+}
+
+static int hns_roce_v2_query_scc_param(struct hns_roce_dev *hr_dev,
+ u8 port_num, enum hns_roce_scc_algo algo)
+{
+ struct hns_roce_scc_param *scc_param;
+ struct hns_roce_cmq_desc desc;
+ struct hns_roce_port *pdata;
+ int ret;
+
+ if (port_num > hr_dev->caps.num_ports) {
+ ibdev_err_ratelimited(&hr_dev->ib_dev,
+ "invalid port num %u.\n", port_num);
+ return -ENODEV;
+ }
+
+ if (algo >= HNS_ROCE_SCC_ALGO_TOTAL) {
+ ibdev_err_ratelimited(&hr_dev->ib_dev, "invalid SCC algo.\n");
+ return -EINVAL;
+ }
+
+ hns_roce_cmq_setup_basic_desc(&desc, scc_opcode[algo], true);
+ ret = hns_roce_cmq_send(hr_dev, &desc, 1);
+ if (ret) {
+ ibdev_err_ratelimited(&hr_dev->ib_dev,
+ "failed to query scc param, opcode: 0x%x, ret = %d.\n",
+ le16_to_cpu(desc.opcode), ret);
+ return ret;
+ }
+
+ pdata = &hr_dev->port_data[port_num - 1];
+ scc_param = &pdata->scc_param[algo];
+ memcpy(scc_param, &desc.data, sizeof(scc_param->param));
+
+ return 0;
+}
+
static const struct ib_device_ops hns_roce_v2_dev_ops = {
.destroy_qp = hns_roce_v2_destroy_qp,
.modify_cq = hns_roce_v2_modify_cq,
@@ -7198,6 +7273,8 @@ static const struct hns_roce_hw hns_roce_hw_v2 = {
.bond_is_active = hns_roce_bond_is_active,
.get_bond_netdev = hns_roce_get_bond_netdev,
.query_hw_counter = hns_roce_hw_v2_query_counter,
+ .config_scc_param = hns_roce_v2_config_scc_param,
+ .query_scc_param = hns_roce_v2_query_scc_param,
};
static const struct pci_device_id hns_roce_hw_v2_pci_tbl[] = {
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
index 7d5c304a8342..90401577865e 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
@@ -227,6 +227,10 @@ enum {
enum hns_roce_opcode_type {
HNS_QUERY_FW_VER = 0x0001,
HNS_QUERY_MAC_TYPE = 0x0389,
+ HNS_ROCE_OPC_CFG_DCQCN_PARAM = 0x1A80,
+ HNS_ROCE_OPC_CFG_LDCP_PARAM = 0x1A81,
+ HNS_ROCE_OPC_CFG_HC3_PARAM = 0x1A82,
+ HNS_ROCE_OPC_CFG_DIP_PARAM = 0x1A83,
HNS_ROCE_OPC_QUERY_HW_VER = 0x8000,
HNS_ROCE_OPC_CFG_GLOBAL_PARAM = 0x8001,
HNS_ROCE_OPC_ALLOC_PF_RES = 0x8004,
@@ -1468,6 +1472,134 @@ struct hns_roce_wqe_atomic_seg {
__le64 cmp_data;
};
+#define HNS_ROCE_DCQCN_AI_OFS 0
+#define HNS_ROCE_DCQCN_AI_SZ sizeof(u16)
+#define HNS_ROCE_DCQCN_AI_MAX ((u16)(~0U))
+#define HNS_ROCE_DCQCN_F_OFS (HNS_ROCE_DCQCN_AI_OFS + HNS_ROCE_DCQCN_AI_SZ)
+#define HNS_ROCE_DCQCN_F_SZ sizeof(u8)
+#define HNS_ROCE_DCQCN_F_MAX ((u8)(~0U))
+#define HNS_ROCE_DCQCN_TKP_OFS (HNS_ROCE_DCQCN_F_OFS + HNS_ROCE_DCQCN_F_SZ)
+#define HNS_ROCE_DCQCN_TKP_SZ sizeof(u8)
+#define HNS_ROCE_DCQCN_TKP_MAX 15
+#define HNS_ROCE_DCQCN_TMP_OFS (HNS_ROCE_DCQCN_TKP_OFS + HNS_ROCE_DCQCN_TKP_SZ)
+#define HNS_ROCE_DCQCN_TMP_SZ sizeof(u16)
+#define HNS_ROCE_DCQCN_TMP_MAX 15
+#define HNS_ROCE_DCQCN_ALP_OFS (HNS_ROCE_DCQCN_TMP_OFS + HNS_ROCE_DCQCN_TMP_SZ)
+#define HNS_ROCE_DCQCN_ALP_SZ sizeof(u16)
+#define HNS_ROCE_DCQCN_ALP_MAX ((u16)(~0U))
+#define HNS_ROCE_DCQCN_MAX_SPEED_OFS (HNS_ROCE_DCQCN_ALP_OFS + \
+ HNS_ROCE_DCQCN_ALP_SZ)
+#define HNS_ROCE_DCQCN_MAX_SPEED_SZ sizeof(u32)
+#define HNS_ROCE_DCQCN_MAX_SPEED_MAX ((u32)(~0U))
+#define HNS_ROCE_DCQCN_G_OFS (HNS_ROCE_DCQCN_MAX_SPEED_OFS + \
+ HNS_ROCE_DCQCN_MAX_SPEED_SZ)
+#define HNS_ROCE_DCQCN_G_SZ sizeof(u8)
+#define HNS_ROCE_DCQCN_G_MAX 15
+#define HNS_ROCE_DCQCN_AL_OFS (HNS_ROCE_DCQCN_G_OFS + HNS_ROCE_DCQCN_G_SZ)
+#define HNS_ROCE_DCQCN_AL_SZ sizeof(u8)
+#define HNS_ROCE_DCQCN_AL_MAX ((u8)(~0U))
+#define HNS_ROCE_DCQCN_CNP_TIME_OFS (HNS_ROCE_DCQCN_AL_OFS + \
+ HNS_ROCE_DCQCN_AL_SZ)
+#define HNS_ROCE_DCQCN_CNP_TIME_SZ sizeof(u8)
+#define HNS_ROCE_DCQCN_CNP_TIME_MAX ((u8)(~0U))
+#define HNS_ROCE_DCQCN_ASHIFT_OFS (HNS_ROCE_DCQCN_CNP_TIME_OFS + \
+ HNS_ROCE_DCQCN_CNP_TIME_SZ)
+#define HNS_ROCE_DCQCN_ASHIFT_SZ sizeof(u8)
+#define HNS_ROCE_DCQCN_ASHIFT_MAX 15
+#define HNS_ROCE_DCQCN_LIFESPAN_OFS (HNS_ROCE_DCQCN_ASHIFT_OFS + \
+ HNS_ROCE_DCQCN_ASHIFT_SZ)
+#define HNS_ROCE_DCQCN_LIFESPAN_SZ sizeof(u32)
+#define HNS_ROCE_DCQCN_LIFESPAN_MAX 1000
+
+#define HNS_ROCE_LDCP_CWD0_OFS 0
+#define HNS_ROCE_LDCP_CWD0_SZ sizeof(u32)
+#define HNS_ROCE_LDCP_CWD0_MAX ((u32)(~0U))
+#define HNS_ROCE_LDCP_ALPHA_OFS (HNS_ROCE_LDCP_CWD0_OFS + HNS_ROCE_LDCP_CWD0_SZ)
+#define HNS_ROCE_LDCP_ALPHA_SZ sizeof(u8)
+#define HNS_ROCE_LDCP_ALPHA_MAX ((u8)(~0U))
+#define HNS_ROCE_LDCP_GAMMA_OFS (HNS_ROCE_LDCP_ALPHA_OFS + \
+ HNS_ROCE_LDCP_ALPHA_SZ)
+#define HNS_ROCE_LDCP_GAMMA_SZ sizeof(u8)
+#define HNS_ROCE_LDCP_GAMMA_MAX ((u8)(~0U))
+#define HNS_ROCE_LDCP_BETA_OFS (HNS_ROCE_LDCP_GAMMA_OFS + \
+ HNS_ROCE_LDCP_GAMMA_SZ)
+#define HNS_ROCE_LDCP_BETA_SZ sizeof(u8)
+#define HNS_ROCE_LDCP_BETA_MAX ((u8)(~0U))
+#define HNS_ROCE_LDCP_ETA_OFS (HNS_ROCE_LDCP_BETA_OFS + HNS_ROCE_LDCP_BETA_SZ)
+#define HNS_ROCE_LDCP_ETA_SZ sizeof(u8)
+#define HNS_ROCE_LDCP_ETA_MAX ((u8)(~0U))
+#define HNS_ROCE_LDCP_LIFESPAN_OFS (4 * sizeof(u32))
+#define HNS_ROCE_LDCP_LIFESPAN_SZ sizeof(u32)
+#define HNS_ROCE_LDCP_LIFESPAN_MAX 1000
+
+#define HNS_ROCE_HC3_INITIAL_WINDOW_OFS 0
+#define HNS_ROCE_HC3_INITIAL_WINDOW_SZ sizeof(u32)
+#define HNS_ROCE_HC3_INITIAL_WINDOW_MAX ((u32)(~0U))
+#define HNS_ROCE_HC3_BANDWIDTH_OFS (HNS_ROCE_HC3_INITIAL_WINDOW_OFS + \
+ HNS_ROCE_HC3_INITIAL_WINDOW_SZ)
+#define HNS_ROCE_HC3_BANDWIDTH_SZ sizeof(u32)
+#define HNS_ROCE_HC3_BANDWIDTH_MAX ((u32)(~0U))
+#define HNS_ROCE_HC3_QLEN_SHIFT_OFS (HNS_ROCE_HC3_BANDWIDTH_OFS + \
+ HNS_ROCE_HC3_BANDWIDTH_SZ)
+#define HNS_ROCE_HC3_QLEN_SHIFT_SZ sizeof(u8)
+#define HNS_ROCE_HC3_QLEN_SHIFT_MAX ((u8)(~0U))
+#define HNS_ROCE_HC3_PORT_USAGE_SHIFT_OFS (HNS_ROCE_HC3_QLEN_SHIFT_OFS + \
+ HNS_ROCE_HC3_QLEN_SHIFT_SZ)
+#define HNS_ROCE_HC3_PORT_USAGE_SHIFT_SZ sizeof(u8)
+#define HNS_ROCE_HC3_PORT_USAGE_SHIFT_MAX ((u8)(~0U))
+#define HNS_ROCE_HC3_OVER_PERIOD_OFS (HNS_ROCE_HC3_PORT_USAGE_SHIFT_OFS + \
+ HNS_ROCE_HC3_PORT_USAGE_SHIFT_SZ)
+#define HNS_ROCE_HC3_OVER_PERIOD_SZ sizeof(u8)
+#define HNS_ROCE_HC3_OVER_PERIOD_MAX ((u8)(~0U))
+#define HNS_ROCE_HC3_MAX_STAGE_OFS (HNS_ROCE_HC3_OVER_PERIOD_OFS + \
+ HNS_ROCE_HC3_OVER_PERIOD_SZ)
+#define HNS_ROCE_HC3_MAX_STAGE_SZ sizeof(u8)
+#define HNS_ROCE_HC3_MAX_STAGE_MAX ((u8)(~0U))
+#define HNS_ROCE_HC3_GAMMA_SHIFT_OFS (HNS_ROCE_HC3_MAX_STAGE_OFS + \
+ HNS_ROCE_HC3_MAX_STAGE_SZ)
+#define HNS_ROCE_HC3_GAMMA_SHIFT_SZ sizeof(u8)
+#define HNS_ROCE_HC3_GAMMA_SHIFT_MAX 15
+#define HNS_ROCE_HC3_LIFESPAN_OFS (4 * sizeof(u32))
+#define HNS_ROCE_HC3_LIFESPAN_SZ sizeof(u32)
+#define HNS_ROCE_HC3_LIFESPAN_MAX 1000
+
+#define HNS_ROCE_DIP_AI_OFS 0
+#define HNS_ROCE_DIP_AI_SZ sizeof(u16)
+#define HNS_ROCE_DIP_AI_MAX ((u16)(~0U))
+#define HNS_ROCE_DIP_F_OFS (HNS_ROCE_DIP_AI_OFS + HNS_ROCE_DIP_AI_SZ)
+#define HNS_ROCE_DIP_F_SZ sizeof(u8)
+#define HNS_ROCE_DIP_F_MAX ((u8)(~0U))
+#define HNS_ROCE_DIP_TKP_OFS (HNS_ROCE_DIP_F_OFS + HNS_ROCE_DIP_F_SZ)
+#define HNS_ROCE_DIP_TKP_SZ sizeof(u8)
+#define HNS_ROCE_DIP_TKP_MAX 15
+#define HNS_ROCE_DIP_TMP_OFS (HNS_ROCE_DIP_TKP_OFS + HNS_ROCE_DIP_TKP_SZ)
+#define HNS_ROCE_DIP_TMP_SZ sizeof(u16)
+#define HNS_ROCE_DIP_TMP_MAX 15
+#define HNS_ROCE_DIP_ALP_OFS (HNS_ROCE_DIP_TMP_OFS + HNS_ROCE_DIP_TMP_SZ)
+#define HNS_ROCE_DIP_ALP_SZ sizeof(u16)
+#define HNS_ROCE_DIP_ALP_MAX ((u16)(~0U))
+#define HNS_ROCE_DIP_MAX_SPEED_OFS (HNS_ROCE_DIP_ALP_OFS + HNS_ROCE_DIP_ALP_SZ)
+#define HNS_ROCE_DIP_MAX_SPEED_SZ sizeof(u32)
+#define HNS_ROCE_DIP_MAX_SPEED_MAX ((u32)(~0U))
+#define HNS_ROCE_DIP_G_OFS (HNS_ROCE_DIP_MAX_SPEED_OFS + \
+ HNS_ROCE_DIP_MAX_SPEED_SZ)
+#define HNS_ROCE_DIP_G_SZ sizeof(u8)
+#define HNS_ROCE_DIP_G_MAX 15
+#define HNS_ROCE_DIP_AL_OFS (HNS_ROCE_DIP_G_OFS + HNS_ROCE_DIP_G_SZ)
+#define HNS_ROCE_DIP_AL_SZ sizeof(u8)
+#define HNS_ROCE_DIP_AL_MAX ((u8)(~0U))
+#define HNS_ROCE_DIP_CNP_TIME_OFS (HNS_ROCE_DIP_AL_OFS + HNS_ROCE_DIP_AL_SZ)
+#define HNS_ROCE_DIP_CNP_TIME_SZ sizeof(u8)
+#define HNS_ROCE_DIP_CNP_TIME_MAX ((u8)(~0U))
+#define HNS_ROCE_DIP_ASHIFT_OFS (HNS_ROCE_DIP_CNP_TIME_OFS + \
+ HNS_ROCE_DIP_CNP_TIME_SZ)
+#define HNS_ROCE_DIP_ASHIFT_SZ sizeof(u8)
+#define HNS_ROCE_DIP_ASHIFT_MAX 15
+#define HNS_ROCE_DIP_LIFESPAN_OFS (HNS_ROCE_DIP_ASHIFT_OFS + \
+ HNS_ROCE_DIP_ASHIFT_SZ)
+#define HNS_ROCE_DIP_LIFESPAN_SZ sizeof(u32)
+#define HNS_ROCE_DIP_LIFESPAN_MAX 1000
+
struct hns_roce_sccc_clr {
__le32 qpn;
__le32 rsv[5];
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index d98825154a82..4d1d2629d8f0 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -870,6 +870,7 @@ static const struct ib_device_ops hns_roce_dev_ops = {
.reg_user_mr = hns_roce_reg_user_mr,
.alloc_hw_stats = hns_roce_alloc_hw_port_stats,
.get_hw_stats = hns_roce_get_hw_stats,
+ .init_port = hns_roce_create_port_files,
INIT_RDMA_OBJ_SIZE(ib_ah, hns_roce_ah, ibah),
INIT_RDMA_OBJ_SIZE(ib_cq, hns_roce_cq, ib_cq),
@@ -1441,6 +1442,7 @@ int hns_roce_init(struct hns_roce_dev *hr_dev)
void hns_roce_exit(struct hns_roce_dev *hr_dev)
{
+ hns_roce_unregister_sysfs(hr_dev);
hns_roce_unregister_device(hr_dev);
hns_roce_unregister_debugfs(hr_dev);
diff --git a/drivers/infiniband/hw/hns/hns_roce_sysfs.c b/drivers/infiniband/hw/hns/hns_roce_sysfs.c
new file mode 100644
index 000000000000..507cc7147a2e
--- /dev/null
+++ b/drivers/infiniband/hw/hns/hns_roce_sysfs.c
@@ -0,0 +1,445 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/*
+ * Copyright (c) 2023 Hisilicon Limited.
+ */
+
+#include "hnae3.h"
+#include "hns_roce_device.h"
+#include "hns_roce_hw_v2.h"
+
+struct hns_port_attribute {
+ struct attribute attr;
+ ssize_t (*show)(struct hns_roce_port *pdata,
+ struct hns_port_attribute *attr, char *buf);
+ ssize_t (*store)(struct hns_roce_port *pdata,
+ struct hns_port_attribute *attr, const char *buf,
+ size_t count);
+};
+
+static void scc_param_config_work(struct work_struct *work)
+{
+ struct hns_roce_scc_param *scc_param = container_of(work,
+ struct hns_roce_scc_param, scc_cfg_dwork.work);
+ struct hns_roce_dev *hr_dev = scc_param->hr_dev;
+
+ hr_dev->hw->config_scc_param(hr_dev, scc_param->port_num,
+ scc_param->algo_type);
+}
+
+static int alloc_scc_param(struct hns_roce_dev *hr_dev,
+ struct hns_roce_port *pdata)
+{
+ struct hns_roce_scc_param *scc_param;
+ int i;
+
+ scc_param = kcalloc(HNS_ROCE_SCC_ALGO_TOTAL, sizeof(*scc_param),
+ GFP_KERNEL);
+ if (!scc_param)
+ return -ENOMEM;
+
+ for (i = 0; i < HNS_ROCE_SCC_ALGO_TOTAL; i++) {
+ scc_param[i].algo_type = i;
+ scc_param[i].timestamp = jiffies;
+ scc_param[i].hr_dev = hr_dev;
+ scc_param[i].port_num = pdata->port_num;
+ INIT_DELAYED_WORK(&scc_param[i].scc_cfg_dwork,
+ scc_param_config_work);
+ }
+
+ pdata->scc_param = scc_param;
+ return 0;
+}
+
+struct hns_port_cc_attr {
+ struct hns_port_attribute port_attr;
+ enum hns_roce_scc_algo algo_type;
+ u32 offset;
+ u32 size;
+ u32 max;
+ u32 min;
+};
+
+static int scc_attr_check(struct hns_port_cc_attr *scc_attr)
+{
+ if (WARN_ON(scc_attr->size > sizeof(u32)))
+ return -EINVAL;
+
+ if (WARN_ON(scc_attr->algo_type >= HNS_ROCE_SCC_ALGO_TOTAL))
+ return -EINVAL;
+ return 0;
+}
+
+static ssize_t scc_attr_show(struct hns_roce_port *pdata,
+ struct hns_port_attribute *attr, char *buf)
+{
+ struct hns_port_cc_attr *scc_attr =
+ container_of(attr, struct hns_port_cc_attr, port_attr);
+ struct hns_roce_scc_param *scc_param;
+ unsigned long exp_time;
+ __le32 val = 0;
+ int ret;
+
+ ret = scc_attr_check(scc_attr);
+ if (ret)
+ return ret;
+
+ scc_param = &pdata->scc_param[scc_attr->algo_type];
+
+ /* Only HW param need be queried */
+ if (scc_attr->offset < offsetof(typeof(*scc_param), lifespan)) {
+ exp_time = scc_param->timestamp +
+ msecs_to_jiffies(scc_param->lifespan);
+
+ if (time_is_before_eq_jiffies(exp_time)) {
+ scc_param->timestamp = jiffies;
+ pdata->hr_dev->hw->query_scc_param(pdata->hr_dev,
+ pdata->port_num, scc_attr->algo_type);
+ }
+ }
+
+ memcpy(&val, (void *)scc_param + scc_attr->offset, scc_attr->size);
+
+ return sysfs_emit(buf, "%u\n", le32_to_cpu(val));
+}
+
+static ssize_t scc_attr_store(struct hns_roce_port *pdata,
+ struct hns_port_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct hns_port_cc_attr *scc_attr =
+ container_of(attr, struct hns_port_cc_attr, port_attr);
+ struct hns_roce_scc_param *scc_param;
+ unsigned long lifespan_jiffies;
+ unsigned long exp_time;
+ __le32 attr_val;
+ u32 val;
+ int ret;
+
+ ret = scc_attr_check(scc_attr);
+ if (ret)
+ return ret;
+
+ if (kstrtou32(buf, 0, &val))
+ return -EINVAL;
+
+ if (val > scc_attr->max || val < scc_attr->min)
+ return -EINVAL;
+
+ attr_val = cpu_to_le32(val);
+ scc_param = &pdata->scc_param[scc_attr->algo_type];
+ memcpy((void *)scc_param + scc_attr->offset, &attr_val,
+ scc_attr->size);
+
+ /* lifespan is only used for driver */
+ if (scc_attr->offset >= offsetof(typeof(*scc_param), lifespan))
+ return count;
+
+ lifespan_jiffies = msecs_to_jiffies(scc_param->lifespan);
+ exp_time = scc_param->timestamp + lifespan_jiffies;
+
+ if (time_is_before_eq_jiffies(exp_time)) {
+ scc_param->timestamp = jiffies;
+ queue_delayed_work(pdata->hr_dev->irq_workq,
+ &scc_param->scc_cfg_dwork, lifespan_jiffies);
+ }
+
+ return count;
+}
+
+static umode_t scc_attr_is_visible(struct kobject *kobj,
+ struct attribute *attr, int i)
+{
+ struct hns_port_attribute *port_attr =
+ container_of(attr, struct hns_port_attribute, attr);
+ struct hns_port_cc_attr *scc_attr =
+ container_of(port_attr, struct hns_port_cc_attr, port_attr);
+ struct hns_roce_port *pdata =
+ container_of(kobj, struct hns_roce_port, kobj);
+ struct hns_roce_dev *hr_dev = pdata->hr_dev;
+
+ if (hr_dev->is_vf ||
+ !(hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_QP_FLOW_CTRL))
+ return 0;
+
+ if (!(hr_dev->caps.cong_type & (1 << scc_attr->algo_type)))
+ return 0;
+
+ return 0644;
+}
+
+#define __HNS_SCC_ATTR(_name, _type, _offset, _size, _min, _max) { \
+ .port_attr = __ATTR(_name, 0644, scc_attr_show, scc_attr_store), \
+ .algo_type = _type, \
+ .offset = _offset, \
+ .size = _size, \
+ .min = _min, \
+ .max = _max, \
+}
+
+#define HNS_PORT_DCQCN_CC_ATTR_RW(_name, NAME) \
+ struct hns_port_cc_attr hns_roce_port_attr_dcqcn_##_name = \
+ __HNS_SCC_ATTR(_name, HNS_ROCE_SCC_ALGO_DCQCN, \
+ HNS_ROCE_DCQCN_##NAME##_OFS, \
+ HNS_ROCE_DCQCN_##NAME##_SZ, \
+ 0, HNS_ROCE_DCQCN_##NAME##_MAX)
+
+HNS_PORT_DCQCN_CC_ATTR_RW(ai, AI);
+HNS_PORT_DCQCN_CC_ATTR_RW(f, F);
+HNS_PORT_DCQCN_CC_ATTR_RW(tkp, TKP);
+HNS_PORT_DCQCN_CC_ATTR_RW(tmp, TMP);
+HNS_PORT_DCQCN_CC_ATTR_RW(alp, ALP);
+HNS_PORT_DCQCN_CC_ATTR_RW(max_speed, MAX_SPEED);
+HNS_PORT_DCQCN_CC_ATTR_RW(g, G);
+HNS_PORT_DCQCN_CC_ATTR_RW(al, AL);
+HNS_PORT_DCQCN_CC_ATTR_RW(cnp_time, CNP_TIME);
+HNS_PORT_DCQCN_CC_ATTR_RW(ashift, ASHIFT);
+HNS_PORT_DCQCN_CC_ATTR_RW(lifespan, LIFESPAN);
+
+static struct attribute *dcqcn_param_attrs[] = {
+ &hns_roce_port_attr_dcqcn_ai.port_attr.attr,
+ &hns_roce_port_attr_dcqcn_f.port_attr.attr,
+ &hns_roce_port_attr_dcqcn_tkp.port_attr.attr,
+ &hns_roce_port_attr_dcqcn_tmp.port_attr.attr,
+ &hns_roce_port_attr_dcqcn_alp.port_attr.attr,
+ &hns_roce_port_attr_dcqcn_max_speed.port_attr.attr,
+ &hns_roce_port_attr_dcqcn_g.port_attr.attr,
+ &hns_roce_port_attr_dcqcn_al.port_attr.attr,
+ &hns_roce_port_attr_dcqcn_cnp_time.port_attr.attr,
+ &hns_roce_port_attr_dcqcn_ashift.port_attr.attr,
+ &hns_roce_port_attr_dcqcn_lifespan.port_attr.attr,
+ NULL,
+};
+
+static const struct attribute_group dcqcn_cc_param_group = {
+ .name = "dcqcn_cc_param",
+ .attrs = dcqcn_param_attrs,
+ .is_visible = scc_attr_is_visible,
+};
+
+#define HNS_PORT_LDCP_CC_ATTR_RW(_name, NAME) \
+ struct hns_port_cc_attr hns_roce_port_attr_ldcp_##_name = \
+ __HNS_SCC_ATTR(_name, HNS_ROCE_SCC_ALGO_LDCP, \
+ HNS_ROCE_LDCP_##NAME##_OFS, \
+ HNS_ROCE_LDCP_##NAME##_SZ, \
+ 0, HNS_ROCE_LDCP_##NAME##_MAX)
+
+HNS_PORT_LDCP_CC_ATTR_RW(cwd0, CWD0);
+HNS_PORT_LDCP_CC_ATTR_RW(alpha, ALPHA);
+HNS_PORT_LDCP_CC_ATTR_RW(gamma, GAMMA);
+HNS_PORT_LDCP_CC_ATTR_RW(beta, BETA);
+HNS_PORT_LDCP_CC_ATTR_RW(eta, ETA);
+HNS_PORT_LDCP_CC_ATTR_RW(lifespan, LIFESPAN);
+
+static struct attribute *ldcp_param_attrs[] = {
+ &hns_roce_port_attr_ldcp_cwd0.port_attr.attr,
+ &hns_roce_port_attr_ldcp_alpha.port_attr.attr,
+ &hns_roce_port_attr_ldcp_gamma.port_attr.attr,
+ &hns_roce_port_attr_ldcp_beta.port_attr.attr,
+ &hns_roce_port_attr_ldcp_eta.port_attr.attr,
+ &hns_roce_port_attr_ldcp_lifespan.port_attr.attr,
+ NULL,
+};
+
+static const struct attribute_group ldcp_cc_param_group = {
+ .name = "ldcp_cc_param",
+ .attrs = ldcp_param_attrs,
+ .is_visible = scc_attr_is_visible,
+};
+
+#define HNS_PORT_HC3_CC_ATTR_RW(_name, NAME) \
+ struct hns_port_cc_attr hns_roce_port_attr_hc3_##_name = \
+ __HNS_SCC_ATTR(_name, HNS_ROCE_SCC_ALGO_HC3, \
+ HNS_ROCE_HC3_##NAME##_OFS, \
+ HNS_ROCE_HC3_##NAME##_SZ, \
+ 0, HNS_ROCE_HC3_##NAME##_MAX)
+
+HNS_PORT_HC3_CC_ATTR_RW(initial_window, INITIAL_WINDOW);
+HNS_PORT_HC3_CC_ATTR_RW(bandwidth, BANDWIDTH);
+HNS_PORT_HC3_CC_ATTR_RW(qlen_shift, QLEN_SHIFT);
+HNS_PORT_HC3_CC_ATTR_RW(port_usage_shift, PORT_USAGE_SHIFT);
+HNS_PORT_HC3_CC_ATTR_RW(over_period, OVER_PERIOD);
+HNS_PORT_HC3_CC_ATTR_RW(max_stage, MAX_STAGE);
+HNS_PORT_HC3_CC_ATTR_RW(gamma_shift, GAMMA_SHIFT);
+HNS_PORT_HC3_CC_ATTR_RW(lifespan, LIFESPAN);
+
+static struct attribute *hc3_param_attrs[] = {
+ &hns_roce_port_attr_hc3_initial_window.port_attr.attr,
+ &hns_roce_port_attr_hc3_bandwidth.port_attr.attr,
+ &hns_roce_port_attr_hc3_qlen_shift.port_attr.attr,
+ &hns_roce_port_attr_hc3_port_usage_shift.port_attr.attr,
+ &hns_roce_port_attr_hc3_over_period.port_attr.attr,
+ &hns_roce_port_attr_hc3_max_stage.port_attr.attr,
+ &hns_roce_port_attr_hc3_gamma_shift.port_attr.attr,
+ &hns_roce_port_attr_hc3_lifespan.port_attr.attr,
+ NULL,
+};
+
+static const struct attribute_group hc3_cc_param_group = {
+ .name = "hc3_cc_param",
+ .attrs = hc3_param_attrs,
+ .is_visible = scc_attr_is_visible,
+};
+
+#define HNS_PORT_DIP_CC_ATTR_RW(_name, NAME) \
+ struct hns_port_cc_attr hns_roce_port_attr_dip_##_name = \
+ __HNS_SCC_ATTR(_name, HNS_ROCE_SCC_ALGO_DIP, \
+ HNS_ROCE_DIP_##NAME##_OFS, \
+ HNS_ROCE_DIP_##NAME##_SZ, \
+ 0, HNS_ROCE_DIP_##NAME##_MAX)
+
+HNS_PORT_DIP_CC_ATTR_RW(ai, AI);
+HNS_PORT_DIP_CC_ATTR_RW(f, F);
+HNS_PORT_DIP_CC_ATTR_RW(tkp, TKP);
+HNS_PORT_DIP_CC_ATTR_RW(tmp, TMP);
+HNS_PORT_DIP_CC_ATTR_RW(alp, ALP);
+HNS_PORT_DIP_CC_ATTR_RW(max_speed, MAX_SPEED);
+HNS_PORT_DIP_CC_ATTR_RW(g, G);
+HNS_PORT_DIP_CC_ATTR_RW(al, AL);
+HNS_PORT_DIP_CC_ATTR_RW(cnp_time, CNP_TIME);
+HNS_PORT_DIP_CC_ATTR_RW(ashift, ASHIFT);
+HNS_PORT_DIP_CC_ATTR_RW(lifespan, LIFESPAN);
+
+static struct attribute *dip_param_attrs[] = {
+ &hns_roce_port_attr_dip_ai.port_attr.attr,
+ &hns_roce_port_attr_dip_f.port_attr.attr,
+ &hns_roce_port_attr_dip_tkp.port_attr.attr,
+ &hns_roce_port_attr_dip_tmp.port_attr.attr,
+ &hns_roce_port_attr_dip_alp.port_attr.attr,
+ &hns_roce_port_attr_dip_max_speed.port_attr.attr,
+ &hns_roce_port_attr_dip_g.port_attr.attr,
+ &hns_roce_port_attr_dip_al.port_attr.attr,
+ &hns_roce_port_attr_dip_cnp_time.port_attr.attr,
+ &hns_roce_port_attr_dip_ashift.port_attr.attr,
+ &hns_roce_port_attr_dip_lifespan.port_attr.attr,
+ NULL,
+};
+
+static const struct attribute_group dip_cc_param_group = {
+ .name = "dip_cc_param",
+ .attrs = dip_param_attrs,
+ .is_visible = scc_attr_is_visible,
+};
+
+const struct attribute_group *hns_attr_port_groups[] = {
+ &dcqcn_cc_param_group,
+ &ldcp_cc_param_group,
+ &hc3_cc_param_group,
+ &dip_cc_param_group,
+ NULL,
+};
+
+static ssize_t hns_roce_port_attr_show(struct kobject *kobj,
+ struct attribute *attr, char *buf)
+{
+ struct hns_port_attribute *port_attr =
+ container_of(attr, struct hns_port_attribute, attr);
+ struct hns_roce_port *p =
+ container_of(kobj, struct hns_roce_port, kobj);
+
+ if (!port_attr->show)
+ return -EIO;
+
+ return port_attr->show(p, port_attr, buf);
+}
+
+static ssize_t hns_roce_port_attr_store(struct kobject *kobj,
+ struct attribute *attr,
+ const char *buf, size_t count)
+{
+ struct hns_port_attribute *port_attr =
+ container_of(attr, struct hns_port_attribute, attr);
+ struct hns_roce_port *p =
+ container_of(kobj, struct hns_roce_port, kobj);
+
+ if (!port_attr->store)
+ return -EIO;
+
+ return port_attr->store(p, port_attr, buf, count);
+}
+
+static void hns_roce_port_release(struct kobject *kobj)
+{
+ struct hns_roce_port *pdata =
+ container_of(kobj, struct hns_roce_port, kobj);
+
+ kfree(pdata->scc_param);
+}
+
+static const struct sysfs_ops hns_roce_port_ops = {
+ .show = hns_roce_port_attr_show,
+ .store = hns_roce_port_attr_store,
+};
+
+static struct kobj_type hns_roce_port_ktype = {
+ .release = hns_roce_port_release,
+ .sysfs_ops = &hns_roce_port_ops,
+};
+
+int hns_roce_create_port_files(struct ib_device *ibdev, u8 port_num,
+ struct kobject *kobj)
+{
+ struct hns_roce_dev *hr_dev = to_hr_dev(ibdev);
+ struct hns_roce_port *pdata;
+ int ret;
+
+ if (!port_num || port_num > hr_dev->caps.num_ports) {
+ ibdev_err(ibdev, "fail to create port sysfs for invalid port %u.\n",
+ port_num);
+ return -ENODEV;
+ }
+
+ pdata = &hr_dev->port_data[port_num - 1];
+ pdata->hr_dev = hr_dev;
+ pdata->port_num = port_num;
+ ret = kobject_init_and_add(&pdata->kobj, &hns_roce_port_ktype,
+ kobj, "cc_param");
+ if (ret) {
+ ibdev_err(ibdev, "fail to create port(%u) sysfs, ret = %d.\n",
+ port_num, ret);
+ goto fail_kobj;
+ }
+ kobject_uevent(&pdata->kobj, KOBJ_ADD);
+
+ ret = sysfs_create_groups(&pdata->kobj, hns_attr_port_groups);
+ if (ret) {
+ ibdev_err(ibdev,
+ "fail to create port(%u) cc param sysfs, ret = %d.\n",
+ port_num, ret);
+ goto fail_kobj;
+ }
+
+ ret = alloc_scc_param(hr_dev, pdata);
+ if (ret) {
+ dev_err(hr_dev->dev, "alloc scc param failed, ret = %d!\n",
+ ret);
+ goto fail_group;
+ }
+
+ return ret;
+
+fail_group:
+ sysfs_remove_groups(&pdata->kobj, hns_attr_port_groups);
+
+fail_kobj:
+ kobject_put(&pdata->kobj);
+
+ return ret;
+}
+
+static void hns_roce_unregister_port_sysfs(struct hns_roce_dev *hr_dev,
+ u8 port_num)
+{
+ struct hns_roce_port *pdata;
+
+ pdata = &hr_dev->port_data[port_num];
+ sysfs_remove_groups(&pdata->kobj, hns_attr_port_groups);
+ kobject_put(&pdata->kobj);
+}
+
+void hns_roce_unregister_sysfs(struct hns_roce_dev *hr_dev)
+{
+ int i;
+
+ for (i = 0; i < hr_dev->caps.num_ports; i++)
+ hns_roce_unregister_port_sysfs(hr_dev, i);
+}
--
2.30.0
10 Mar '23
Current meeting topics:
Topic 1: Progress update (10 min) --- Zhang Jialin & Zheng Zengkai
Topic 2: Kernel hot upgrade usage, technical discussion, and planning --- Sang Yan
Further topic submissions are welcome.
Open items from the previous meeting:
1. Introduction to the openEuler defect issue handling process
2. Next-step planning and requirement discussion for kernel hot upgrade
-----Original appointment-----
From: openEuler conference <public(a)openeuler.org>
Sent: March 7, 2023, 15:49
To: dev@openeuler.org,kernel-discuss@openeuler.org,kernel@openeuler.org
Subject: openEuler Kernel SIG biweekly meeting
Time: Friday, March 10, 2023, 14:00-15:30 (UTC+08:00) Beijing, Chongqing, Hong Kong SAR, Urumqi.
Location:
Hello!
Kernel SIG invites you to a Zoom meeting (auto-recorded) to be held at 2023-03-10 14:00.
Meeting subject: openEuler Kernel SIG biweekly meeting
Meeting content:
Welcome to the Kernel SIG biweekly meeting. Current topics:
1. Progress update
2. Topics being collected
Everyone is welcome to submit topics (reply to this email or add them to the meeting board).
Meeting link: https://us06web.zoom.us/j/87191608489?pwd=eG96T0p2Y0NDRUdHOW9SYys5SElTQT09
Meeting minutes: https://etherpad.openeuler.org/p/Kernel-meetings
Note: You are advised to change your participant name after joining the meeting, or use your gitee.com ID.
More information: https://openeuler.org/zh/ and https://openeuler.org/en/
[PATCH openEuler-5.10-LTS-SP1] Revert "scsi: fix iscsi rescan fails to create block"
by Jialin Zhang 09 Mar '23
From: Zhong Jinghua <zhongjinghua(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 188150, https://gitee.com/openeuler/kernel/issues/I643OL
----------------------------------------
This reverts commit 7f10ea522db56188ae46c5bbee7052a2b2797515.
This commit has a soft lockup problem:
watchdog: BUG: soft lockup - CPU#22 stuck for 67s! [iscsid:16369]
Call Trace:
scsi_remove_target+0x548/0x7b0
? sdev_store_delete+0x90/0x90
? __mutex_lock_slowpath+0x10/0x10
? device_remove_class_symlinks+0x1b0/0x1b0
__iscsi_unbind_session+0x16b/0x250 [scsi_transport_iscsi]
iscsi_remove_session+0x1d3/0x2f0 [scsi_transport_iscsi]
iscsi_session_remove+0x5c/0x80 [libiscsi]
iscsi_sw_tcp_session_destroy+0xd3/0x160 [iscsi_tcp]
iscsi_if_rx+0x2369/0x5060 [scsi_transport_iscsi]
The reason is that if another thread still holds a reference to the
kobject while the removal path waits for the device to be released,
the removal keeps waiting in a loop and never makes progress.
Fixes: 7f10ea522db5 ("scsi: fix iscsi rescan fails to create block")
Signed-off-by: Zhong Jinghua <zhongjinghua(a)huawei.com>
Reviewed-by: Hou Tao <houtao1(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
drivers/scsi/scsi_sysfs.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index e7893835b99a..42db9c52208e 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1503,13 +1503,6 @@ void scsi_remove_device(struct scsi_device *sdev)
}
EXPORT_SYMBOL(scsi_remove_device);
-static int scsi_device_try_get(struct scsi_device *sdev)
-{
- if (!kobject_get_unless_zero(&sdev->sdev_gendev.kobj))
- return -ENXIO;
- return 0;
-}
-
static void __scsi_remove_target(struct scsi_target *starget)
{
struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
@@ -1528,7 +1521,9 @@ static void __scsi_remove_target(struct scsi_target *starget)
if (sdev->channel != starget->channel ||
sdev->id != starget->id)
continue;
- if (scsi_device_try_get(sdev))
+ if (sdev->sdev_state == SDEV_DEL ||
+ sdev->sdev_state == SDEV_CANCEL ||
+ !get_device(&sdev->sdev_gendev))
continue;
spin_unlock_irqrestore(shost->host_lock, flags);
scsi_remove_device(sdev);
--
2.25.1
[PATCH openEuler-1.0-LTS 1/3] HID: asus: Remove check for same LED brightness on set
by Yongqiang Liu 09 Mar '23
From: "Luke D. Jones" <luke(a)ljones.dev>
mainline inclusion
from mainline-v5.14-rc4
commit 3fdcf7cdfc229346d028242e73562704ad644dd0
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6I7U9
CVE: CVE-2023-1079
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Remove the early return on LED brightness set so that any controller
application, daemon, or desktop may set the same brightness at any stage.
This is required because many ASUS ROG keyboards will default to max
brightness on laptop resume if the LEDs were set to off before sleep.
Signed-off-by: Luke D Jones <luke(a)ljones.dev>
Signed-off-by: Jiri Kosina <jkosina(a)suse.cz>
Signed-off-by: Yuyao Lin <linyuyao1(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/hid/hid-asus.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/drivers/hid/hid-asus.c b/drivers/hid/hid-asus.c
index 800b2364e29e..9ae8e3d5edf1 100644
--- a/drivers/hid/hid-asus.c
+++ b/drivers/hid/hid-asus.c
@@ -318,9 +318,6 @@ static void asus_kbd_backlight_set(struct led_classdev *led_cdev,
{
struct asus_kbd_leds *led = container_of(led_cdev, struct asus_kbd_leds,
cdev);
- if (led->brightness == brightness)
- return;
-
led->brightness = brightness;
schedule_work(&led->work);
}
--
2.25.1
[PATCH openEuler-1.0-LTS 01/16] netfilter: conntrack: do not renew entry stuck in tcp SYN_SENT state
by Yongqiang Liu 08 Mar '23
From: Florian Westphal <fw(a)strlen.de>
stable inclusion
from stable-v4.19.272
commit 01687e35df44dd09cc6943306db35d9efc507907
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6KOHU
CVE: NA
--------------------------------
[ Upstream commit e15d4cdf27cb0c1e977270270b2cea12e0955edd ]
Consider:
client -----> conntrack ---> Host
client sends a SYN, but $Host is unreachable/silent.
Client eventually gives up and the conntrack entry will time out.
However, if the client is restarted with the same addr/port pair, it
may prevent the conntrack entry from timing out.
This is noticeable when the existing conntrack entry has no NAT
transformation or an outdated one and port reuse happens either
on client or due to a NAT middlebox.
This change prevents refresh of the timeout for SYN retransmits,
so entry is going away after nf_conntrack_tcp_timeout_syn_sent
seconds (default: 60).
Entry will be re-created on next connection attempt, but then
nat rules will be evaluated again.
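As a toy illustration of the timing (not kernel code; it assumes a 20 s
retransmit interval and the 60 s default timeout):
#include <stdio.h>
#define SYN_SENT_TIMEOUT 60			/* default, in seconds */
/* A client retransmits a SYN every 20 s against a silent host.  With
 * per-packet refresh (old), the entry's expiry keeps moving and never
 * fires; without refresh (new), the entry expires SYN_SENT_TIMEOUT
 * seconds after the first SYN. */
int main(void)
{
	int old_expiry = SYN_SENT_TIMEOUT;
	int new_expiry = SYN_SENT_TIMEOUT;
	for (int t = 0; t <= 140; t += 20) {
		if (t > 0)			/* SYN retransmit arrives */
			old_expiry = t + SYN_SENT_TIMEOUT;  /* refreshed */
		printf("t=%3ds  old entry %s  new entry %s\n", t,
		       t < old_expiry ? "alive" : "gone ",
		       t < new_expiry ? "alive" : "gone ");
	}
	return 0;
}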
Signed-off-by: Florian Westphal <fw(a)strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
net/netfilter/nf_conntrack_proto_tcp.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index aab532b8c8c6..1600f35bfd49 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -1089,6 +1089,16 @@ static int tcp_packet(struct nf_conn *ct,
nf_ct_kill_acct(ct, ctinfo, skb);
return NF_ACCEPT;
}
+
+ if (index == TCP_SYN_SET && old_state == TCP_CONNTRACK_SYN_SENT) {
+ /* do not renew timeout on SYN retransmit.
+ *
+ * Else port reuse by client or NAT middlebox can keep
+ * entry alive indefinitely (including nat info).
+ */
+ return NF_ACCEPT;
+ }
+
/* ESTABLISHED without SEEN_REPLY, i.e. mid-connection
* pickup with loose=1. Avoid large ESTABLISHED timeout.
*/
--
2.25.1
[PATCH openEuler-1.0-LTS 01/31] binder: use standard functions to allocate fds
by Yongqiang Liu 08 Mar '23
From: Todd Kjos <tkjos(a)android.com>
mainline inclusion
from mainline-v4.20-rc1
commit 44d8047f1d87adc2fd7eccc88533794f6d88c15e
category: bugfix
bugzilla: 188431, https://gitee.com/src-openeuler/kernel/issues/I6DKVG
CVE: CVE-2023-20938
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Binder uses internal fs interfaces to allocate and install fds:
__alloc_fd
__fd_install
__close_fd
get_files_struct
put_files_struct
These were used to support the passing of fds between processes
as part of a transaction. The actual allocation and installation
of the fds in the target process was handled by the sending
process so the standard functions, alloc_fd() and fd_install()
which assume task==current couldn't be used.
This patch refactors this mechanism so that the fds are
allocated and installed by the target process allowing the
standard functions to be used.
The sender now creates a list of fd fixups that contains the
struct *file and the address to fixup with the new fd once
it is allocated. This list is processed by the target process
when the transaction is dequeued.
A new error case is introduced by this change. If an async
transaction with file descriptors cannot allocate new
fds in the target (probably because it is out of file descriptors),
the transaction is discarded with a log message. In the old
implementation this would have been detected in the sender
context and failed prior to sending.
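For illustration, a minimal userspace sketch of the fixup-list pattern
with made-up names (record_fixup/apply_fixups), not the driver's own
code: the sender records where in the buffer a value must be patched,
and the receiver resolves and patches it in its own context.
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
struct fixup {
	size_t offset;		/* where in the buffer the fd slot lives */
	struct fixup *next;
};
static struct fixup *fixups;
/* Sender side: remember the slot instead of resolving the fd now. */
static void record_fixup(size_t offset)
{
	struct fixup *f = malloc(sizeof(*f));
	if (!f)
		exit(1);
	f->offset = offset;
	f->next = fixups;
	fixups = f;
}
/* Target side: allocate the resource in our own context and patch. */
static void apply_fixups(uint32_t *buf)
{
	uint32_t next_fd = 3;	/* stand-in for get_unused_fd_flags() */
	struct fixup *f = fixups;
	while (f) {
		struct fixup *next = f->next;
		buf[f->offset / sizeof(uint32_t)] = next_fd++;
		free(f);
		f = next;
	}
	fixups = NULL;
}
int main(void)
{
	uint32_t buf[4] = { 0 };
	record_fixup(2 * sizeof(uint32_t));	/* fd slot at byte offset 8 */
	apply_fixups(buf);
	printf("patched fd slot: %" PRIu32 "\n", buf[2]);
	return 0;
}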
Signed-off-by: Todd Kjos <tkjos(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Conflicts:
drivers/android/binder.c
Signed-off-by: Li Huafei <lihuafei1(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Reviewed-by: Zheng Yejian <zhengyejian1(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/android/Kconfig | 2 +-
drivers/android/binder.c | 391 ++++++++++++++++++++-------------
drivers/android/binder_trace.h | 36 ++-
3 files changed, 262 insertions(+), 167 deletions(-)
diff --git a/drivers/android/Kconfig b/drivers/android/Kconfig
index 432e9ad77070..51e8250d113f 100644
--- a/drivers/android/Kconfig
+++ b/drivers/android/Kconfig
@@ -10,7 +10,7 @@ if ANDROID
config ANDROID_BINDER_IPC
bool "Android Binder IPC Driver"
- depends on MMU
+ depends on MMU && !CPU_CACHE_VIVT
default n
---help---
Binder is used in Android for both communication between processes,
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index c5a39b8ae886..8a72aacd7a39 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -71,6 +71,7 @@
#include <linux/security.h>
#include <linux/spinlock.h>
#include <linux/ratelimit.h>
+#include <linux/syscalls.h>
#include <uapi/linux/android/binder.h>
@@ -457,9 +458,8 @@ struct binder_ref {
};
enum binder_deferred_state {
- BINDER_DEFERRED_PUT_FILES = 0x01,
- BINDER_DEFERRED_FLUSH = 0x02,
- BINDER_DEFERRED_RELEASE = 0x04,
+ BINDER_DEFERRED_FLUSH = 0x01,
+ BINDER_DEFERRED_RELEASE = 0x02,
};
/**
@@ -480,9 +480,6 @@ enum binder_deferred_state {
* (invariant after initialized)
* @tsk task_struct for group_leader of process
* (invariant after initialized)
- * @files files_struct for process
- * (protected by @files_lock)
- * @files_lock mutex to protect @files
* @cred struct cred associated with the `struct file`
* in binder_open()
* (invariant after initialized)
@@ -530,8 +527,6 @@ struct binder_proc {
struct list_head waiting_threads;
int pid;
struct task_struct *tsk;
- struct files_struct *files;
- struct mutex files_lock;
const struct cred *cred;
struct hlist_node deferred_work_node;
int deferred_work;
@@ -615,6 +610,23 @@ struct binder_thread {
bool is_dead;
};
+/**
+ * struct binder_txn_fd_fixup - transaction fd fixup list element
+ * @fixup_entry: list entry
+ * @file: struct file to be associated with new fd
+ * @offset: offset in buffer data to this fixup
+ *
+ * List element for fd fixups in a transaction. Since file
+ * descriptors need to be allocated in the context of the
+ * target process, we pass each fd to be processed in this
+ * struct.
+ */
+struct binder_txn_fd_fixup {
+ struct list_head fixup_entry;
+ struct file *file;
+ size_t offset;
+};
+
struct binder_transaction {
int debug_id;
struct binder_work work;
@@ -632,6 +644,7 @@ struct binder_transaction {
long priority;
long saved_priority;
kuid_t sender_euid;
+ struct list_head fd_fixups;
/**
* @lock: protects @from, @to_proc, and @to_thread
*
@@ -905,66 +918,6 @@ static void binder_free_thread(struct binder_thread *thread);
static void binder_free_proc(struct binder_proc *proc);
static void binder_inc_node_tmpref_ilocked(struct binder_node *node);
-static int task_get_unused_fd_flags(struct binder_proc *proc, int flags)
-{
- unsigned long rlim_cur;
- unsigned long irqs;
- int ret;
-
- mutex_lock(&proc->files_lock);
- if (proc->files == NULL) {
- ret = -ESRCH;
- goto err;
- }
- if (!lock_task_sighand(proc->tsk, &irqs)) {
- ret = -EMFILE;
- goto err;
- }
- rlim_cur = task_rlimit(proc->tsk, RLIMIT_NOFILE);
- unlock_task_sighand(proc->tsk, &irqs);
-
- ret = __alloc_fd(proc->files, 0, rlim_cur, flags);
-err:
- mutex_unlock(&proc->files_lock);
- return ret;
-}
-
-/*
- * copied from fd_install
- */
-static void task_fd_install(
- struct binder_proc *proc, unsigned int fd, struct file *file)
-{
- mutex_lock(&proc->files_lock);
- if (proc->files)
- __fd_install(proc->files, fd, file);
- mutex_unlock(&proc->files_lock);
-}
-
-/*
- * copied from sys_close
- */
-static long task_close_fd(struct binder_proc *proc, unsigned int fd)
-{
- int retval;
-
- mutex_lock(&proc->files_lock);
- if (proc->files == NULL) {
- retval = -ESRCH;
- goto err;
- }
- retval = __close_fd(proc->files, fd);
- /* can't restart close syscall because file table entry was cleared */
- if (unlikely(retval == -ERESTARTSYS ||
- retval == -ERESTARTNOINTR ||
- retval == -ERESTARTNOHAND ||
- retval == -ERESTART_RESTARTBLOCK))
- retval = -EINTR;
-err:
- mutex_unlock(&proc->files_lock);
- return retval;
-}
-
static bool binder_has_work_ilocked(struct binder_thread *thread,
bool do_proc_work)
{
@@ -1948,6 +1901,27 @@ static struct binder_thread *binder_get_txn_from_and_acq_inner(
return NULL;
}
+/**
+ * binder_free_txn_fixups() - free unprocessed fd fixups
+ * @t: binder transaction for t->from
+ *
+ * If the transaction is being torn down prior to being
+ * processed by the target process, free all of the
+ * fd fixups and fput the file structs. It is safe to
+ * call this function after the fixups have been
+ * processed -- in that case, the list will be empty.
+ */
+static void binder_free_txn_fixups(struct binder_transaction *t)
+{
+ struct binder_txn_fd_fixup *fixup, *tmp;
+
+ list_for_each_entry_safe(fixup, tmp, &t->fd_fixups, fixup_entry) {
+ fput(fixup->file);
+ list_del(&fixup->fixup_entry);
+ kfree(fixup);
+ }
+}
+
static void binder_free_transaction(struct binder_transaction *t)
{
struct binder_proc *target_proc = t->to_proc;
@@ -1962,6 +1936,7 @@ static void binder_free_transaction(struct binder_transaction *t)
* If the transaction has no target_proc, then
* t->buffer->transaction has already been cleared.
*/
+ binder_free_txn_fixups(t);
kfree(t);
binder_stats_deleted(BINDER_STAT_TRANSACTION);
}
@@ -2262,12 +2237,17 @@ static void binder_transaction_buffer_release(struct binder_proc *proc,
} break;
case BINDER_TYPE_FD: {
- struct binder_fd_object *fp = to_binder_fd_object(hdr);
-
- binder_debug(BINDER_DEBUG_TRANSACTION,
- " fd %d\n", fp->fd);
- if (failed_at)
- task_close_fd(proc, fp->fd);
+ /*
+ * No need to close the file here since user-space
+ * closes it for successfully delivered
+ * transactions. For transactions that weren't
+ * delivered, the new fd was never allocated so
+ * there is no need to close and the fput on the
+ * file is done when the transaction is torn
+ * down.
+ */
+ WARN_ON(failed_at &&
+ proc->tsk == current->group_leader);
} break;
case BINDER_TYPE_PTR:
/*
@@ -2283,6 +2263,15 @@ static void binder_transaction_buffer_release(struct binder_proc *proc,
size_t fd_index;
binder_size_t fd_buf_size;
+ if (proc->tsk != current->group_leader) {
+ /*
+ * Nothing to do if running in sender context
+ * The fd fixups have not been applied so no
+ * fds need to be closed.
+ */
+ continue;
+ }
+
fda = to_binder_fd_array_object(hdr);
parent = binder_validate_ptr(buffer, fda->parent,
off_start,
@@ -2315,7 +2304,7 @@ static void binder_transaction_buffer_release(struct binder_proc *proc,
}
fd_array = (u32 *)(parent_buffer + (uintptr_t)fda->parent_offset);
for (fd_index = 0; fd_index < fda->num_fds; fd_index++)
- task_close_fd(proc, fd_array[fd_index]);
+ ksys_close(fd_array[fd_index]);
} break;
default:
pr_err("transaction release %d bad object type %x\n",
@@ -2447,17 +2436,18 @@ static int binder_translate_handle(struct flat_binder_object *fp,
return ret;
}
-static int binder_translate_fd(int fd,
+static int binder_translate_fd(u32 *fdp,
struct binder_transaction *t,
struct binder_thread *thread,
struct binder_transaction *in_reply_to)
{
struct binder_proc *proc = thread->proc;
struct binder_proc *target_proc = t->to_proc;
- int target_fd;
+ struct binder_txn_fd_fixup *fixup;
struct file *file;
- int ret;
+ int ret = 0;
bool target_allows_fd;
+ int fd = *fdp;
if (in_reply_to)
target_allows_fd = !!(in_reply_to->flags & TF_ACCEPT_FDS);
@@ -2485,19 +2475,24 @@ static int binder_translate_fd(int fd,
goto err_security;
}
- target_fd = task_get_unused_fd_flags(target_proc, O_CLOEXEC);
- if (target_fd < 0) {
+ /*
+ * Add fixup record for this transaction. The allocation
+ * of the fd in the target needs to be done from a
+ * target thread.
+ */
+ fixup = kzalloc(sizeof(*fixup), GFP_KERNEL);
+ if (!fixup) {
ret = -ENOMEM;
- goto err_get_unused_fd;
+ goto err_alloc;
}
- task_fd_install(target_proc, target_fd, file);
- trace_binder_transaction_fd(t, fd, target_fd);
- binder_debug(BINDER_DEBUG_TRANSACTION, " fd %d -> %d\n",
- fd, target_fd);
+ fixup->file = file;
+ fixup->offset = (uintptr_t)fdp - (uintptr_t)t->buffer->data;
+ trace_binder_transaction_fd_send(t, fd, fixup->offset);
+ list_add_tail(&fixup->fixup_entry, &t->fd_fixups);
- return target_fd;
+ return ret;
-err_get_unused_fd:
+err_alloc:
err_security:
fput(file);
err_fget:
@@ -2511,8 +2506,7 @@ static int binder_translate_fd_array(struct binder_fd_array_object *fda,
struct binder_thread *thread,
struct binder_transaction *in_reply_to)
{
- binder_size_t fdi, fd_buf_size, num_installed_fds;
- int target_fd;
+ binder_size_t fdi, fd_buf_size;
uintptr_t parent_buffer;
u32 *fd_array;
struct binder_proc *proc = thread->proc;
@@ -2544,23 +2538,12 @@ static int binder_translate_fd_array(struct binder_fd_array_object *fda,
return -EINVAL;
}
for (fdi = 0; fdi < fda->num_fds; fdi++) {
- target_fd = binder_translate_fd(fd_array[fdi], t, thread,
+ int ret = binder_translate_fd(&fd_array[fdi], t, thread,
in_reply_to);
- if (target_fd < 0)
- goto err_translate_fd_failed;
- fd_array[fdi] = target_fd;
+ if (ret < 0)
+ return ret;
}
return 0;
-
-err_translate_fd_failed:
- /*
- * Failed to allocate fd or security error, free fds
- * installed so far.
- */
- num_installed_fds = fdi;
- for (fdi = 0; fdi < num_installed_fds; fdi++)
- task_close_fd(target_proc, fd_array[fdi]);
- return target_fd;
}
static int binder_fixup_parent(struct binder_transaction *t,
@@ -2935,6 +2918,7 @@ static void binder_transaction(struct binder_proc *proc,
return_error_line = __LINE__;
goto err_alloc_t_failed;
}
+ INIT_LIST_HEAD(&t->fd_fixups);
binder_stats_created(BINDER_STAT_TRANSACTION);
spin_lock_init(&t->lock);
@@ -3089,17 +3073,16 @@ static void binder_transaction(struct binder_proc *proc,
case BINDER_TYPE_FD: {
struct binder_fd_object *fp = to_binder_fd_object(hdr);
- int target_fd = binder_translate_fd(fp->fd, t, thread,
- in_reply_to);
+ int ret = binder_translate_fd(&fp->fd, t, thread,
+ in_reply_to);
- if (target_fd < 0) {
+ if (ret < 0) {
return_error = BR_FAILED_REPLY;
- return_error_param = target_fd;
+ return_error_param = ret;
return_error_line = __LINE__;
goto err_translate_failed;
}
fp->pad_binder = 0;
- fp->fd = target_fd;
} break;
case BINDER_TYPE_FDA: {
struct binder_fd_array_object *fda =
@@ -3256,6 +3239,7 @@ static void binder_transaction(struct binder_proc *proc,
err_bad_offset:
err_bad_parent:
err_copy_data_failed:
+ binder_free_txn_fixups(t);
trace_binder_transaction_failed_buffer_release(t->buffer);
binder_transaction_buffer_release(target_proc, t->buffer, offp);
if (target_node)
@@ -3318,6 +3302,49 @@ static void binder_transaction(struct binder_proc *proc,
}
}
+/**
+ * binder_free_buf() - free the specified buffer
+ * @proc: binder proc that owns buffer
+ * @buffer: buffer to be freed
+ *
+ * If buffer for an async transaction, enqueue the next async
+ * transaction from the node.
+ *
+ * Cleanup buffer and free it.
+ */
+void
+binder_free_buf(struct binder_proc *proc, struct binder_buffer *buffer)
+{
+ binder_inner_proc_lock(proc);
+ if (buffer->transaction) {
+ buffer->transaction->buffer = NULL;
+ buffer->transaction = NULL;
+ }
+ binder_inner_proc_unlock(proc);
+ if (buffer->async_transaction && buffer->target_node) {
+ struct binder_node *buf_node;
+ struct binder_work *w;
+
+ buf_node = buffer->target_node;
+ binder_node_inner_lock(buf_node);
+ BUG_ON(!buf_node->has_async_transaction);
+ BUG_ON(buf_node->proc != proc);
+ w = binder_dequeue_work_head_ilocked(
+ &buf_node->async_todo);
+ if (!w) {
+ buf_node->has_async_transaction = false;
+ } else {
+ binder_enqueue_work_ilocked(
+ w, &proc->todo);
+ binder_wakeup_proc_ilocked(proc);
+ }
+ binder_node_inner_unlock(buf_node);
+ }
+ trace_binder_transaction_buffer_release(buffer);
+ binder_transaction_buffer_release(proc, buffer, NULL);
+ binder_alloc_free_buf(&proc->alloc, buffer);
+}
+
static int binder_thread_write(struct binder_proc *proc,
struct binder_thread *thread,
binder_uintptr_t binder_buffer, size_t size,
@@ -3508,35 +3535,7 @@ static int binder_thread_write(struct binder_proc *proc,
proc->pid, thread->pid, (u64)data_ptr,
buffer->debug_id,
buffer->transaction ? "active" : "finished");
-
- binder_inner_proc_lock(proc);
- if (buffer->transaction) {
- buffer->transaction->buffer = NULL;
- buffer->transaction = NULL;
- }
- binder_inner_proc_unlock(proc);
- if (buffer->async_transaction && buffer->target_node) {
- struct binder_node *buf_node;
- struct binder_work *w;
-
- buf_node = buffer->target_node;
- binder_node_inner_lock(buf_node);
- BUG_ON(!buf_node->has_async_transaction);
- BUG_ON(buf_node->proc != proc);
- w = binder_dequeue_work_head_ilocked(
- &buf_node->async_todo);
- if (!w) {
- buf_node->has_async_transaction = false;
- } else {
- binder_enqueue_work_ilocked(
- w, &proc->todo);
- binder_wakeup_proc_ilocked(proc);
- }
- binder_node_inner_unlock(buf_node);
- }
- trace_binder_transaction_buffer_release(buffer);
- binder_transaction_buffer_release(proc, buffer, NULL);
- binder_alloc_free_buf(&proc->alloc, buffer);
+ binder_free_buf(proc, buffer);
break;
}
@@ -3859,6 +3858,76 @@ static int binder_wait_for_work(struct binder_thread *thread,
return ret;
}
+/**
+ * binder_apply_fd_fixups() - finish fd translation
+ * @t: binder transaction with list of fd fixups
+ *
+ * Now that we are in the context of the transaction target
+ * process, we can allocate and install fds. Process the
+ * list of fds to translate and fixup the buffer with the
+ * new fds.
+ *
+ * If we fail to allocate an fd, then free the resources by
+ * fput'ing files that have not been processed and ksys_close'ing
+ * any fds that have already been allocated.
+ */
+static int binder_apply_fd_fixups(struct binder_transaction *t)
+{
+ struct binder_txn_fd_fixup *fixup, *tmp;
+ int ret = 0;
+
+ list_for_each_entry(fixup, &t->fd_fixups, fixup_entry) {
+ int fd = get_unused_fd_flags(O_CLOEXEC);
+ u32 *fdp;
+
+ if (fd < 0) {
+ binder_debug(BINDER_DEBUG_TRANSACTION,
+ "failed fd fixup txn %d fd %d\n",
+ t->debug_id, fd);
+ ret = -ENOMEM;
+ break;
+ }
+ binder_debug(BINDER_DEBUG_TRANSACTION,
+ "fd fixup txn %d fd %d\n",
+ t->debug_id, fd);
+ trace_binder_transaction_fd_recv(t, fd, fixup->offset);
+ fd_install(fd, fixup->file);
+ fixup->file = NULL;
+ fdp = (u32 *)(t->buffer->data + fixup->offset);
+ /*
+ * This store can cause problems for CPUs with a
+ * VIVT cache (eg ARMv5) since the cache cannot
+ * detect virtual aliases to the same physical cacheline.
+ * To support VIVT, this address and the user-space VA
+ * would both need to be flushed. Since this kernel
+ * VA is not constructed via page_to_virt(), we can't
+ * use flush_dcache_page() on it, so we'd have to use
+ * an internal function. If devices with VIVT ever
+ * need to run Android, we'll either need to go back
+ * to patching the translated fd from the sender side
+ * (using the non-standard kernel functions), or rework
+ * how the kernel uses the buffer to use page_to_virt()
+ * addresses instead of allocating in our own vm area.
+ *
+ * For now, we disable compilation if CONFIG_CPU_CACHE_VIVT.
+ */
+ *fdp = fd;
+ }
+ list_for_each_entry_safe(fixup, tmp, &t->fd_fixups, fixup_entry) {
+ if (fixup->file) {
+ fput(fixup->file);
+ } else if (ret) {
+ u32 *fdp = (u32 *)(t->buffer->data + fixup->offset);
+
+ ksys_close(*fdp);
+ }
+ list_del(&fixup->fixup_entry);
+ kfree(fixup);
+ }
+
+ return ret;
+}
+
static int binder_thread_read(struct binder_proc *proc,
struct binder_thread *thread,
binder_uintptr_t binder_buffer, size_t size,
@@ -4140,6 +4209,34 @@ static int binder_thread_read(struct binder_proc *proc,
tr.sender_pid = 0;
}
+ ret = binder_apply_fd_fixups(t);
+ if (ret) {
+ struct binder_buffer *buffer = t->buffer;
+ bool oneway = !!(t->flags & TF_ONE_WAY);
+ int tid = t->debug_id;
+
+ if (t_from)
+ binder_thread_dec_tmpref(t_from);
+ buffer->transaction = NULL;
+ binder_cleanup_transaction(t, "fd fixups failed",
+ BR_FAILED_REPLY);
+ binder_free_buf(proc, buffer);
+ binder_debug(BINDER_DEBUG_FAILED_TRANSACTION,
+ "%d:%d %stransaction %d fd fixups failed %d/%d, line %d\n",
+ proc->pid, thread->pid,
+ oneway ? "async " :
+ (cmd == BR_REPLY ? "reply " : ""),
+ tid, BR_FAILED_REPLY, ret, __LINE__);
+ if (cmd == BR_REPLY) {
+ cmd = BR_FAILED_REPLY;
+ if (put_user(cmd, (uint32_t __user *)ptr))
+ return -EFAULT;
+ ptr += sizeof(uint32_t);
+ binder_stat_br(proc, thread, cmd);
+ break;
+ }
+ continue;
+ }
tr.data_size = t->buffer->data_size;
tr.offsets_size = t->buffer->offsets_size;
tr.data.ptr.buffer = (binder_uintptr_t)
@@ -4730,7 +4827,6 @@ static void binder_vma_close(struct vm_area_struct *vma)
(vma->vm_end - vma->vm_start) / SZ_1K, vma->vm_flags,
(unsigned long)pgprot_val(vma->vm_page_prot));
binder_alloc_vma_close(&proc->alloc);
- binder_defer_work(proc, BINDER_DEFERRED_PUT_FILES);
}
static vm_fault_t binder_vm_fault(struct vm_fault *vmf)
@@ -4776,9 +4872,6 @@ static int binder_mmap(struct file *filp, struct vm_area_struct *vma)
ret = binder_alloc_mmap_handler(&proc->alloc, vma);
if (ret)
return ret;
- mutex_lock(&proc->files_lock);
- proc->files = get_files_struct(current);
- mutex_unlock(&proc->files_lock);
return 0;
err_bad_arg:
@@ -4802,7 +4895,6 @@ static int binder_open(struct inode *nodp, struct file *filp)
spin_lock_init(&proc->outer_lock);
get_task_struct(current->group_leader);
proc->tsk = current->group_leader;
- mutex_init(&proc->files_lock);
proc->cred = get_cred(filp->f_cred);
INIT_LIST_HEAD(&proc->todo);
proc->default_priority = task_nice(current);
@@ -4953,8 +5045,6 @@ static void binder_deferred_release(struct binder_proc *proc)
struct rb_node *n;
int threads, nodes, incoming_refs, outgoing_refs, active_transactions;
- BUG_ON(proc->files);
-
mutex_lock(&binder_procs_lock);
hlist_del(&proc->proc_node);
mutex_unlock(&binder_procs_lock);
@@ -5036,7 +5126,6 @@ static void binder_deferred_release(struct binder_proc *proc)
static void binder_deferred_func(struct work_struct *work)
{
struct binder_proc *proc;
- struct files_struct *files;
int defer;
@@ -5054,23 +5143,11 @@ static void binder_deferred_func(struct work_struct *work)
}
mutex_unlock(&binder_deferred_lock);
- files = NULL;
- if (defer & BINDER_DEFERRED_PUT_FILES) {
- mutex_lock(&proc->files_lock);
- files = proc->files;
- if (files)
- proc->files = NULL;
- mutex_unlock(&proc->files_lock);
- }
-
if (defer & BINDER_DEFERRED_FLUSH)
binder_deferred_flush(proc);
if (defer & BINDER_DEFERRED_RELEASE)
binder_deferred_release(proc); /* frees proc */
-
- if (files)
- put_files_struct(files);
} while (proc);
}
static DECLARE_WORK(binder_deferred_work, binder_deferred_func);
diff --git a/drivers/android/binder_trace.h b/drivers/android/binder_trace.h
index 588eb3ec3507..14de7ac57a34 100644
--- a/drivers/android/binder_trace.h
+++ b/drivers/android/binder_trace.h
@@ -223,22 +223,40 @@ TRACE_EVENT(binder_transaction_ref_to_ref,
__entry->dest_ref_debug_id, __entry->dest_ref_desc)
);
-TRACE_EVENT(binder_transaction_fd,
- TP_PROTO(struct binder_transaction *t, int src_fd, int dest_fd),
- TP_ARGS(t, src_fd, dest_fd),
+TRACE_EVENT(binder_transaction_fd_send,
+ TP_PROTO(struct binder_transaction *t, int fd, size_t offset),
+ TP_ARGS(t, fd, offset),
TP_STRUCT__entry(
__field(int, debug_id)
- __field(int, src_fd)
- __field(int, dest_fd)
+ __field(int, fd)
+ __field(size_t, offset)
+ ),
+ TP_fast_assign(
+ __entry->debug_id = t->debug_id;
+ __entry->fd = fd;
+ __entry->offset = offset;
+ ),
+ TP_printk("transaction=%d src_fd=%d offset=%zu",
+ __entry->debug_id, __entry->fd, __entry->offset)
+);
+
+TRACE_EVENT(binder_transaction_fd_recv,
+ TP_PROTO(struct binder_transaction *t, int fd, size_t offset),
+ TP_ARGS(t, fd, offset),
+
+ TP_STRUCT__entry(
+ __field(int, debug_id)
+ __field(int, fd)
+ __field(size_t, offset)
),
TP_fast_assign(
__entry->debug_id = t->debug_id;
- __entry->src_fd = src_fd;
- __entry->dest_fd = dest_fd;
+ __entry->fd = fd;
+ __entry->offset = offset;
),
- TP_printk("transaction=%d src_fd=%d ==> dest_fd=%d",
- __entry->debug_id, __entry->src_fd, __entry->dest_fd)
+ TP_printk("transaction=%d dest_fd=%d offset=%zu",
+ __entry->debug_id, __entry->fd, __entry->offset)
);
DECLARE_EVENT_CLASS(binder_buffer_class,
--
2.25.1
[PATCH openEuler-1.0-LTS 1/7] sctp: fail if no bound addresses can be used for a given scope
by Yongqiang Liu 08 Mar '23
From: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
stable inclusion
from stable-v4.19.271
commit 26436553aabfd9b40e1daa537a099bf5bb13fb55
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6I7U3
CVE: CVE-2023-1074
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
[ Upstream commit 458e279f861d3f61796894cd158b780765a1569f ]
Currently, if you bind the socket to something like:
servaddr.sin6_family = AF_INET6;
servaddr.sin6_port = htons(0);
servaddr.sin6_scope_id = 0;
inet_pton(AF_INET6, "::1", &servaddr.sin6_addr);
And then request a connect to:
connaddr.sin6_family = AF_INET6;
connaddr.sin6_port = htons(20000);
connaddr.sin6_scope_id = if_nametoindex("lo");
inet_pton(AF_INET6, "fe88::1", &connaddr.sin6_addr);
What the stack does is:
- bind the socket
- create a new asoc
- to handle the connect
- copy the addresses that can be used for the given scope
- try to connect
But the copy returns 0 addresses, and the effect is that it ends up
trying to connect as if the socket wasn't bound, which is not the
desired behavior. This unexpected behavior also allows KASLR leaks
through the SCTP diag interface.
The fix, then, is to bail out if copying the addresses usable for the
scope used in connect() returns 0 addresses. This is what TCP does
with a similar reproducer.
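For reference, a minimal userspace reproducer sketch assembled from the
snippets above (illustrative only; error handling is omitted and the
port/interface values are simply the ones quoted):
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <net/if.h>

int main(void)
{
	int fd = socket(AF_INET6, SOCK_STREAM, IPPROTO_SCTP);
	struct sockaddr_in6 servaddr = { .sin6_family = AF_INET6 };
	struct sockaddr_in6 connaddr = {
		.sin6_family = AF_INET6,
		.sin6_port = htons(20000),
		.sin6_scope_id = if_nametoindex("lo"),
	};

	/* Bind to a loopback-scope address (port 0, as above)... */
	inet_pton(AF_INET6, "::1", &servaddr.sin6_addr);
	bind(fd, (struct sockaddr *)&servaddr, sizeof(servaddr));

	/* ...then connect to a link-local-scope address. The scoped
	 * address copy finds 0 usable addresses; with this fix,
	 * connect() fails with ENETUNREACH instead of behaving as if
	 * the socket were unbound. */
	inet_pton(AF_INET6, "fe88::1", &connaddr.sin6_addr);
	return connect(fd, (struct sockaddr *)&connaddr, sizeof(connaddr));
}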
Reported-by: Pietro Borrello <borrello(a)diag.uniroma1.it>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
Reviewed-by: Xin Long <lucien.xin(a)gmail.com>
Link: https://lore.kernel.org/r/9fcd182f1099f86c6661f3717f63712ddd1c676c.16744967…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Dong Chenchen <dongchenchen2(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
net/sctp/bind_addr.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
index f8a283245672..d723942e5e65 100644
--- a/net/sctp/bind_addr.c
+++ b/net/sctp/bind_addr.c
@@ -88,6 +88,12 @@ int sctp_bind_addr_copy(struct net *net, struct sctp_bind_addr *dest,
}
}
+ /* If somehow no addresses were found that can be used with this
+ * scope, it's an error.
+ */
+ if (list_empty(&dest->address_list))
+ error = -ENETUNREACH;
+
out:
if (error)
sctp_bind_addr_clean(dest);
--
2.25.1
[PATCH openEuler-1.0-LTS] sctp: fail if no bound addresses can be used for a given scope
by Yongqiang Liu 08 Mar '23
From: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
stable inclusion
from stable-v4.19.271
commit 26436553aabfd9b40e1daa537a099bf5bb13fb55
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6I7U3
CVE: CVE-2023-1074
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
[ Upstream commit 458e279f861d3f61796894cd158b780765a1569f ]
Currently, if you bind the socket to something like:
servaddr.sin6_family = AF_INET6;
servaddr.sin6_port = htons(0);
servaddr.sin6_scope_id = 0;
inet_pton(AF_INET6, "::1", &servaddr.sin6_addr);
And then request a connect to:
connaddr.sin6_family = AF_INET6;
connaddr.sin6_port = htons(20000);
connaddr.sin6_scope_id = if_nametoindex("lo");
inet_pton(AF_INET6, "fe88::1", &connaddr.sin6_addr);
What the stack does is:
- bind the socket
- create a new asoc
- to handle the connect
- copy the addresses that can be used for the given scope
- try to connect
But the copy returns 0 addresses, and the effect is that it ends up
trying to connect as if the socket wasn't bound, which is not the
desired behavior. This unexpected behavior also allows KASLR leaks
through the SCTP diag interface.
The fix, then, is to bail out if copying the addresses usable for the
scope used in connect() returns 0 addresses. This is what TCP does
with a similar reproducer.
Reported-by: Pietro Borrello <borrello(a)diag.uniroma1.it>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
Reviewed-by: Xin Long <lucien.xin(a)gmail.com>
Link: https://lore.kernel.org/r/9fcd182f1099f86c6661f3717f63712ddd1c676c.16744967…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Dong Chenchen <dongchenchen2(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
net/sctp/bind_addr.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
index f8a283245672..d723942e5e65 100644
--- a/net/sctp/bind_addr.c
+++ b/net/sctp/bind_addr.c
@@ -88,6 +88,12 @@ int sctp_bind_addr_copy(struct net *net, struct sctp_bind_addr *dest,
}
}
+ /* If somehow no addresses were found that can be used with this
+ * scope, it's an error.
+ */
+ if (list_empty(&dest->address_list))
+ error = -ENETUNREACH;
+
out:
if (error)
sctp_bind_addr_clean(dest);
--
2.25.1
[PATCH openEuler-5.10-LTS-SP1 01/30] ovl: fix use inode directly in rcu-walk mode
by Zheng Zengkai 08 Mar '23
From: Chen Zhongjin <chenzhongjin(a)huawei.com>
stable inclusion
from stable-v5.10.163
commit 740c537f52c1f54aff9094744483d1515c7c8b7b
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6GCCV
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 672e4268b2863d7e4978dfed29552b31c2f9bd4e upstream.
ovl_dentry_revalidate_common() can be called in rcu-walk mode. As the
documentation says, "in rcu-walk mode, d_parent and d_inode should not
be used without care".
Check the inode here to protect access under rcu-walk mode; returning
-ECHILD makes the VFS retry the operation in ref-walk mode.
Fixes: bccece1ead36 ("ovl: allow remote upper")
Reported-and-tested-by: syzbot+a4055c78774bbf3498bb(a)syzkaller.appspotmail.com
Signed-off-by: Chen Zhongjin <chenzhongjin(a)huawei.com>
Cc: <stable(a)vger.kernel.org> # v5.7
Signed-off-by: Miklos Szeredi <mszeredi(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Zhihao Cheng <chengzhihao1(a)huawei.com>
Reviewed-by: Yang Erkun <yangerkun(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
fs/overlayfs/super.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 45c596dfe3a3..e3cd5a00f880 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -138,11 +138,16 @@ static int ovl_dentry_revalidate_common(struct dentry *dentry,
unsigned int flags, bool weak)
{
struct ovl_entry *oe = dentry->d_fsdata;
+ struct inode *inode = d_inode_rcu(dentry);
struct dentry *upper;
unsigned int i;
int ret = 1;
- upper = ovl_dentry_upper(dentry);
+ /* Careful in RCU mode */
+ if (!inode)
+ return -ECHILD;
+
+ upper = ovl_i_dentry_upper(inode);
if (upper)
ret = ovl_revalidate_real(upper, flags, weak);
--
2.25.1
[PATCH openEuler-5.10-LTS 01/30] ovl: fix use inode directly in rcu-walk mode
by Zheng Zengkai 08 Mar '23
From: Chen Zhongjin <chenzhongjin(a)huawei.com>
stable inclusion
from stable-v5.10.163
commit 740c537f52c1f54aff9094744483d1515c7c8b7b
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6GCCV
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit 672e4268b2863d7e4978dfed29552b31c2f9bd4e upstream.
ovl_dentry_revalidate_common() can be called in rcu-walk mode. As the
documentation says, "in rcu-walk mode, d_parent and d_inode should not
be used without care".
Check the inode here to protect access under rcu-walk mode; returning
-ECHILD makes the VFS retry the operation in ref-walk mode.
Fixes: bccece1ead36 ("ovl: allow remote upper")
Reported-and-tested-by: syzbot+a4055c78774bbf3498bb(a)syzkaller.appspotmail.com
Signed-off-by: Chen Zhongjin <chenzhongjin(a)huawei.com>
Cc: <stable(a)vger.kernel.org> # v5.7
Signed-off-by: Miklos Szeredi <mszeredi(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Zhihao Cheng <chengzhihao1(a)huawei.com>
Reviewed-by: Yang Erkun <yangerkun(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
fs/overlayfs/super.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 45c596dfe3a3..e3cd5a00f880 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -138,11 +138,16 @@ static int ovl_dentry_revalidate_common(struct dentry *dentry,
unsigned int flags, bool weak)
{
struct ovl_entry *oe = dentry->d_fsdata;
+ struct inode *inode = d_inode_rcu(dentry);
struct dentry *upper;
unsigned int i;
int ret = 1;
- upper = ovl_dentry_upper(dentry);
+ /* Careful in RCU mode */
+ if (!inode)
+ return -ECHILD;
+
+ upper = ovl_i_dentry_upper(inode);
if (upper)
ret = ovl_revalidate_real(upper, flags, weak);
--
2.25.1
[PATCH openEuler-1.0-LTS 1/2] dhugetlb: use mutex lock in update_reserve_pages()
by Yongqiang Liu 07 Mar '23
From: Liu Shixin <liushixin2(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 46904, https://gitee.com/openeuler/kernel/issues/I6KOXL
CVE: NA
--------------------------------
When memory is fragmented, update_reserve_pages() may call migrate_pages()
to collect contiguous memory. That function can sleep, so we should use
a mutex instead of a spinlock.
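As a sketch of why this matters (illustrative, mirroring the hunks
below): a spinlock-protected section runs with preemption disabled and
must not sleep, while a mutex-protected section may:

	/* Before: invalid, migrate_pages() may block while atomic
	 * ("BUG: sleeping function called from invalid context"). */
	spin_lock(&hpool->reserved_lock);
	dhugetlb_reserve_hugepages(hpool, size, gigantic);
	spin_unlock(&hpool->reserved_lock);

	/* After: sleeping inside a mutex-held section is allowed. */
	mutex_lock(&hpool->reserved_lock);
	dhugetlb_reserve_hugepages(hpool, size, gigantic);
	mutex_unlock(&hpool->reserved_lock);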
Fixes: 3eb69101b5e6 ("mm: Add two interface for dhugetlb")
Signed-off-by: Liu Shixin <liushixin2(a)huawei.com>
Reviewed-by: Nanyong Sun <sunnanyong(a)huawei.com>
Reviewed-by: tong tiangen <tongtiangen(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
include/linux/hugetlb.h | 2 +-
mm/hugetlb.c | 2 +-
mm/memcontrol.c | 4 ++--
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index d44eea25c0d6..dfd9a8c945e1 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -696,7 +696,7 @@ struct small_page_pool {
struct dhugetlb_pool {
int nid;
spinlock_t lock;
- spinlock_t reserved_lock;
+ struct mutex reserved_lock;
atomic_t refcnt;
struct mem_cgroup *attach_memcg;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 9f974270c84d..a468d94bc16a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3327,7 +3327,7 @@ struct dhugetlb_pool *hpool_alloc(unsigned long nid)
return NULL;
spin_lock_init(&hpool->lock);
- spin_lock_init(&hpool->reserved_lock);
+ mutex_init(&hpool->reserved_lock);
hpool->nid = nid;
atomic_set(&hpool->refcnt, 1);
INIT_LIST_HEAD(&hpool->dhugetlb_1G_freelists);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index fd40fef49e45..886d6b0a4fce 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4950,9 +4950,9 @@ static int update_reserve_pages(struct kernfs_open_file *of,
hpool = get_dhugetlb_pool_from_memcg(memcg);
if (!hpool)
return -EINVAL;
- spin_lock(&hpool->reserved_lock);
+ mutex_lock(&hpool->reserved_lock);
dhugetlb_reserve_hugepages(hpool, size, gigantic);
- spin_unlock(&hpool->reserved_lock);
+ mutex_unlock(&hpool->reserved_lock);
dhugetlb_pool_put(hpool);
return 0;
}
--
2.25.1
[PATCH openEuler-1.0-LTS 1/2] ntfs: fix use-after-free in ntfs_ucsncmp()
by Yongqiang Liu 07 Mar '23
From: ChenXiaoSong <chenxiaosong2(a)huawei.com>
stable inclusion
from stable-v4.19.254
commit 6c0355ca7ac434d84d8b93336462b698573ca3b3
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6HWOS
CVE: CVE-2023-26607
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=…
--------------------------------
commit 38c9c22a85aeed28d0831f230136e9cf6fa2ed44 upstream.
Syzkaller reported use-after-free bug as follows:
==================================================================
BUG: KASAN: use-after-free in ntfs_ucsncmp+0x123/0x130
Read of size 2 at addr ffff8880751acee8 by task a.out/879
CPU: 7 PID: 879 Comm: a.out Not tainted 5.19.0-rc4-next-20220630-00001-gcc5218c8bd2c-dirty #7
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x1c0/0x2b0
print_address_description.constprop.0.cold+0xd4/0x484
print_report.cold+0x55/0x232
kasan_report+0xbf/0xf0
ntfs_ucsncmp+0x123/0x130
ntfs_are_names_equal.cold+0x2b/0x41
ntfs_attr_find+0x43b/0xb90
ntfs_attr_lookup+0x16d/0x1e0
ntfs_read_locked_attr_inode+0x4aa/0x2360
ntfs_attr_iget+0x1af/0x220
ntfs_read_locked_inode+0x246c/0x5120
ntfs_iget+0x132/0x180
load_system_files+0x1cc6/0x3480
ntfs_fill_super+0xa66/0x1cf0
mount_bdev+0x38d/0x460
legacy_get_tree+0x10d/0x220
vfs_get_tree+0x93/0x300
do_new_mount+0x2da/0x6d0
path_mount+0x496/0x19d0
__x64_sys_mount+0x284/0x300
do_syscall_64+0x3b/0xc0
entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f3f2118d9ea
Code: 48 8b 0d a9 f4 0b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 76 f4 0b 00 f7 d8 64 89 01 48
RSP: 002b:00007ffc269deac8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3f2118d9ea
RDX: 0000000020000000 RSI: 0000000020000100 RDI: 00007ffc269dec00
RBP: 00007ffc269dec80 R08: 00007ffc269deb00 R09: 00007ffc269dec44
R10: 0000000000000000 R11: 0000000000000202 R12: 000055f81ab1d220
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>
The buggy address belongs to the physical page:
page:0000000085430378 refcount:1 mapcount:1 mapping:0000000000000000 index:0x555c6a81d pfn:0x751ac
memcg:ffff888101f7e180
anon flags: 0xfffffc00a0014(uptodate|lru|mappedtodisk|swapbacked|node=0|zone=1|lastcpupid=0x1fffff)
raw: 000fffffc00a0014 ffffea0001bf2988 ffffea0001de2448 ffff88801712e201
raw: 0000000555c6a81d 0000000000000000 0000000100000000 ffff888101f7e180
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff8880751acd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff8880751ace00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff8880751ace80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^
ffff8880751acf00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff8880751acf80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
The reason is that struct ATTR_RECORD->name_offset is 6485, so the end
address of the name string is out of bounds.
Fix this by adding a sanity check on the end address of the attribute
name string.
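As a worked example with illustrative numbers: if
le32_to_cpu(ctx->mrec->bytes_allocated) were 1024, the MFT record would
span [mrec, mrec + 1024), while name_end = (u8 *)a + 6485 +
a->name_length * sizeof(ntfschar) would point far past mrec_end, so the
added name_end > mrec_end test breaks out of the loop before
ntfs_ucsncmp() can read beyond the record.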
[akpm(a)linux-foundation.org: coding-style cleanups]
[chenxiaosong2(a)huawei.com: cleanup suggested by Hawkins Jiawei]
Link: https://lkml.kernel.org/r/20220709064511.3304299-1-chenxiaosong2@huawei.com
Link: https://lkml.kernel.org/r/20220707105329.4020708-1-chenxiaosong2@huawei.com
Signed-off-by: ChenXiaoSong <chenxiaosong2(a)huawei.com>
Signed-off-by: Hawkins Jiawei <yin31149(a)gmail.com>
Cc: Anton Altaparmakov <anton(a)tuxera.com>
Cc: ChenXiaoSong <chenxiaosong2(a)huawei.com>
Cc: Yongqiang Liu <liuyongqiang13(a)huawei.com>
Cc: Zhang Yi <yi.zhang(a)huawei.com>
Cc: Zhang Xiaoxu <zhangxiaoxu5(a)huawei.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Long Li <leo.lilong(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
fs/ntfs/attrib.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/ntfs/attrib.c b/fs/ntfs/attrib.c
index 44a39a099b54..62b49197e5f6 100644
--- a/fs/ntfs/attrib.c
+++ b/fs/ntfs/attrib.c
@@ -606,8 +606,12 @@ static int ntfs_attr_find(const ATTR_TYPE type, const ntfschar *name,
a = (ATTR_RECORD*)((u8*)ctx->attr +
le32_to_cpu(ctx->attr->length));
for (;; a = (ATTR_RECORD*)((u8*)a + le32_to_cpu(a->length))) {
- if ((u8*)a < (u8*)ctx->mrec || (u8*)a > (u8*)ctx->mrec +
- le32_to_cpu(ctx->mrec->bytes_allocated))
+ u8 *mrec_end = (u8 *)ctx->mrec +
+ le32_to_cpu(ctx->mrec->bytes_allocated);
+ u8 *name_end = (u8 *)a + le16_to_cpu(a->name_offset) +
+ a->name_length * sizeof(ntfschar);
+ if ((u8*)a < (u8*)ctx->mrec || (u8*)a > mrec_end ||
+ name_end > mrec_end)
break;
ctx->attr = a;
if (unlikely(le32_to_cpu(a->type) > le32_to_cpu(type) ||
--
2.25.1
07 Mar '23
From: Kuniyuki Iwashima <kuniyu(a)amazon.com>
stable inclusion
from stable-v5.15.95
commit fdaf88531cfd17b2a710cceb3141ef6f9085ff40
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6H3MB
CVE: CVE-2023-0461
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
---------------------------
When we backport dadd0dcaa67d ("net/ulp: prevent ULP without clone op from
entering the LISTEN status"), we have accidentally backported a part of
7a7160edf1bf ("net: Return errno in sk->sk_prot->get_port().") and removed
err = -EADDRINUSE in inet_csk_listen_start().
Thus, listen() no longer returns -EADDRINUSE even if ->get_port() failed
as reported in [0].
We set err to -EADDRINUSE just before ->get_port() to fix the regression.
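A sketch of the control flow this restores (illustrative; the error
unwind after the if-block is elided in the hunk below):

	err = -EADDRINUSE;			/* restored by this patch */
	inet_sk_state_store(sk, TCP_LISTEN);
	if (!sk->sk_prot->get_port(sk, inet->inet_num)) {
		/* success path: publish the port and return 0 */
	}
	/* failure path: falls through to the unwind and returns err,
	 * which without the assignment no longer held -EADDRINUSE */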
[0]: https://lore.kernel.org/stable/EF8A45D0-768A-4CD5-9A8A-0FA6E610ABF7@winter.…
Reported-by: Winter <winter(a)winter.cafe>
Signed-off-by: Kuniyuki Iwashima <kuniyu(a)amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Liu Jian <liujian56(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
net/ipv4/inet_connection_sock.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index f1f3dc6a7d63..0d85871b5cda 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -941,6 +941,7 @@ int inet_csk_listen_start(struct sock *sk, int backlog)
* It is OK, because this socket enters to hash table only
* after validation is complete.
*/
+ err = -EADDRINUSE;
inet_sk_state_store(sk, TCP_LISTEN);
if (!sk->sk_prot->get_port(sk, inet->inet_num)) {
inet->inet_sport = htons(inet->inet_num);
--
2.25.1
Hello!
The Kernel SIG invites you to a Zoom meeting (auto-recorded) to be held at 2023-03-10 14:00.
Subject: openEuler Kernel SIG biweekly meeting
Agenda:
Welcome to the Kernel SIG biweekly meeting. Current topics:
1. Progress updates
2. Topic collection in progress
Everyone is welcome to propose topics (reply to this mail directly, or add them to the meeting board).
Meeting link: https://us06web.zoom.us/j/87191608489?pwd=eG96T0p2Y0NDRUdHOW9SYys5SElTQT09
Meeting minutes and topic board: https://etherpad.openeuler.org/p/Kernel-meetings
Note: You are advised to change your participant name after joining the meeting, or use your ID at gitee.com.
More information: https://openeuler.org/zh/ (Chinese), https://openeuler.org/en/ (English)
[OLK-5.10] phy: tegra: xusb: Fix return value of tegra_xusb_find_port_node function
by Wang Yufen 06 Mar '23
From: Miaoqian Lin <linmq006(a)gmail.com>
mainline inclusion
from mainline-v5.17-rc1
commit 045a31b95509c8f25f5f04ec5e0dec5cd09f2c5f
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6IXQP
CVE: CVE-2023-23000
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Callers of the tegra_xusb_find_port_node() function only do NULL
checking on the return value. Return NULL instead of ERR_PTR(-ENOMEM)
to keep this consistent.
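A sketch of the caller-side pattern this keeps consistent (illustrative
only, not a verbatim caller from the driver):

	struct device_node *np;

	np = tegra_xusb_find_port_node(padctl, type, index);
	if (!np)	/* an ERR_PTR(-ENOMEM) is non-NULL and would slip
			 * past this check, only to be dereferenced later;
			 * returning NULL keeps the NULL check sufficient */
		return NULL;
	/* ... use np, then of_node_put(np) ... */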
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
Acked-by: Thierry Reding <treding(a)nvidia.com>
Link: https://lore.kernel.org/r/20211213020507.1458-1-linmq006@gmail.com
Signed-off-by: Vinod Koul <vkoul(a)kernel.org>
Signed-off-by: Wang Yufen <wangyufen(a)huawei.com>
---
drivers/phy/tegra/xusb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/phy/tegra/xusb.c b/drivers/phy/tegra/xusb.c
index 181a1be..02da8c0 100644
--- a/drivers/phy/tegra/xusb.c
+++ b/drivers/phy/tegra/xusb.c
@@ -449,7 +449,7 @@ struct tegra_xusb_lane *
name = kasprintf(GFP_KERNEL, "%s-%u", type, index);
if (!name) {
of_node_put(ports);
- return ERR_PTR(-ENOMEM);
+ return NULL;
}
np = of_get_child_by_name(ports, name);
kfree(name);
--
1.8.3.1
[openEuler-22.03-LTS] phy: tegra: xusb: Fix return value of tegra_xusb_find_port_node function
by Wang Yufen 06 Mar '23
From: Miaoqian Lin <linmq006(a)gmail.com>
mainline inclusion
from mainline-v5.17-rc1
commit 045a31b95509c8f25f5f04ec5e0dec5cd09f2c5f
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6IXQP
CVE: CVE-2023-23000
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Callers of the tegra_xusb_find_port_node() function only do NULL
checking on the return value. Return NULL instead of ERR_PTR(-ENOMEM)
to keep this consistent.
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
Acked-by: Thierry Reding <treding(a)nvidia.com>
Link: https://lore.kernel.org/r/20211213020507.1458-1-linmq006@gmail.com
Signed-off-by: Vinod Koul <vkoul(a)kernel.org>
Signed-off-by: Wang Yufen <wangyufen(a)huawei.com>
---
drivers/phy/tegra/xusb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/phy/tegra/xusb.c b/drivers/phy/tegra/xusb.c
index 181a1be..02da8c0 100644
--- a/drivers/phy/tegra/xusb.c
+++ b/drivers/phy/tegra/xusb.c
@@ -449,7 +449,7 @@ struct tegra_xusb_lane *
name = kasprintf(GFP_KERNEL, "%s-%u", type, index);
if (!name) {
of_node_put(ports);
- return ERR_PTR(-ENOMEM);
+ return NULL;
}
np = of_get_child_by_name(ports, name);
kfree(name);
--
1.8.3.1
[openEuler-1.0-LTS] phy: tegra: xusb: Fix return value of tegra_xusb_find_port_node function
by Wang Yufen 06 Mar '23
From: Miaoqian Lin <linmq006(a)gmail.com>
mainline inclusion
from mainline-v5.17-rc1
commit 045a31b95509c8f25f5f04ec5e0dec5cd09f2c5f
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6IXQP
CVE: CVE-2023-23000
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Callers of the tegra_xusb_find_port_node() function only do NULL
checking on the return value. Return NULL instead of ERR_PTR(-ENOMEM)
to keep this consistent.
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
Acked-by: Thierry Reding <treding(a)nvidia.com>
Link: https://lore.kernel.org/r/20211213020507.1458-1-linmq006@gmail.com
Signed-off-by: Vinod Koul <vkoul(a)kernel.org>
Signed-off-by: Wang Yufen <wangyufen(a)huawei.com>
---
drivers/phy/tegra/xusb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/phy/tegra/xusb.c b/drivers/phy/tegra/xusb.c
index de1b4eb..9e8fa18 100644
--- a/drivers/phy/tegra/xusb.c
+++ b/drivers/phy/tegra/xusb.c
@@ -441,7 +441,7 @@ struct tegra_xusb_lane *
name = kasprintf(GFP_KERNEL, "%s-%u", type, index);
if (!name) {
of_node_put(ports);
- return ERR_PTR(-ENOMEM);
+ return NULL;
}
np = of_get_child_by_name(ports, name);
kfree(name);
--
1.8.3.1
[PATCH openEuler-1.0-LTS 1/4] rds: rds_rm_zerocopy_callback() use list_first_entry()
by Yongqiang Liu 06 Mar '23
From: Pietro Borrello <borrello(a)diag.uniroma1.it>
stable inclusion
from stable-v4.19.273
commit 909d5eef5ce792bb76d7b5a9b7a6852b813d8cac
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6I7UF
CVE: CVE-2023-1078
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
-------------------------------------------------
[ Upstream commit f753a68980cf4b59a80fe677619da2b1804f526d ]
rds_rm_zerocopy_callback() uses list_entry() on the head of a list,
causing type confusion.
Use list_first_entry() to actually access the first element of the
rs_zcookie_queue list.
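For reference (a sketch of the list API semantics, not new driver
code): list_first_entry(ptr, type, member) expands to
list_entry((ptr)->next, type, member), so the two calls resolve
different objects:

	/* container_of() on the head itself: a bogus rds_msg_zcopy_info
	 * overlaying the rs_zcookie_queue structure. */
	info = list_entry(head, struct rds_msg_zcopy_info, rs_zcookie_next);

	/* container_of() on head->next: the actual first element. */
	info = list_first_entry(head, struct rds_msg_zcopy_info,
				rs_zcookie_next);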
Fixes: 9426bbc6de99 ("rds: use list structure to track information for zerocopy completion notification")
Reviewed-by: Willem de Bruijn <willemb(a)google.com>
Signed-off-by: Pietro Borrello <borrello(a)diag.uniroma1.it>
Link: https://lore.kernel.org/r/20230202-rds-zerocopy-v3-1-83b0df974f9a@diag.unir…
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Lu Wei <luwei32(a)huawei.com>
Reviewed-by: Liu Jian <liujian56(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
net/rds/message.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/rds/message.c b/net/rds/message.c
index 4b00b1152a5f..309b54cc62ae 100644
--- a/net/rds/message.c
+++ b/net/rds/message.c
@@ -104,9 +104,9 @@ static void rds_rm_zerocopy_callback(struct rds_sock *rs,
spin_lock_irqsave(&q->lock, flags);
head = &q->zcookie_head;
if (!list_empty(head)) {
- info = list_entry(head, struct rds_msg_zcopy_info,
- rs_zcookie_next);
- if (info && rds_zcookie_add(info, cookie)) {
+ info = list_first_entry(head, struct rds_msg_zcopy_info,
+ rs_zcookie_next);
+ if (rds_zcookie_add(info, cookie)) {
spin_unlock_irqrestore(&q->lock, flags);
kfree(rds_info_from_znotifier(znotif));
/* caller invokes rds_wake_sk_sleep() */
--
2.25.1
04 Mar '23
v7:
- check that a container (PNP0A06) has a memory device
(PNP0C80) before creating device attributes under the container
subsys
v6:
- get container device from container_subsys
v5:
- check that the container has PNP0C80 before creating
attribute files
v4:
- prettify the code
v3:
- prettify the code
- add a hisi_internal.h to hold common code
v2:
- remove the !adev check in patch 9, as it will always be
true.
patches 1-3: Add support for iterating through the
child devices of an ACPI device.
patches 4-9: Add support for the HBM memory device and
HBM cache.
Rafael J. Wysocki (3):
ACPI: bus: Introduce acpi_dev_for_each_child()
ACPI: bus: Avoid non-ACPI device objects in walks over children
ACPI: bus: Export acpi_dev_for_each_child() to modules
Zhang Zekun (5):
ACPI: OSL: Export the symbol of acpi_hotplug_schedule
soc: hisilicon: hisi_hbmdev: Add power domain control methods
ACPI: memhotplug: export the state of each hotplug device
soc: hisilicon: hisi_hbmdev: Provide extra memory topology information
soc: hbmcache: Add support for online and offline the hbm cache
drivers/acpi/acpi_memhotplug.c | 6 +
drivers/acpi/bus.c | 27 +++
drivers/acpi/internal.h | 1 -
drivers/acpi/osl.c | 1 +
drivers/base/container.c | 1 +
drivers/soc/Kconfig | 1 +
drivers/soc/Makefile | 1 +
drivers/soc/hisilicon/Kconfig | 33 +++
drivers/soc/hisilicon/Makefile | 4 +
drivers/soc/hisilicon/hisi_hbmcache.c | 147 +++++++++++++
drivers/soc/hisilicon/hisi_hbmdev.c | 301 ++++++++++++++++++++++++++
drivers/soc/hisilicon/hisi_internal.h | 31 +++
include/acpi/acpi_bus.h | 2 +
include/linux/acpi.h | 1 +
include/linux/memory_hotplug.h | 2 +
15 files changed, 558 insertions(+), 1 deletion(-)
create mode 100644 drivers/soc/hisilicon/Kconfig
create mode 100644 drivers/soc/hisilicon/Makefile
create mode 100644 drivers/soc/hisilicon/hisi_hbmcache.c
create mode 100644 drivers/soc/hisilicon/hisi_hbmdev.c
create mode 100644 drivers/soc/hisilicon/hisi_internal.h
--
2.17.1
v2:
- remove patch "mm/sharepool: Add mg_sp_alloc_nodemask"
Chen Jun (3):
mm/sharepool: Fix a double free problem caused by init_local_group
mm/sharepool: extract group_add_task
mm/sharepool: use delete_spg_node to replace some repetitive code
Guo Mengqi (1):
mm: sharepool: add static modifier to find_spg_node_by_spg()
Wang Wensheng (8):
mm/sharepool: Fix NULL pointer dereference in mg_sp_group_del_task
mm/sharepool: Reorganize create_spg()
mm/sharepool: Simplify sp_make_share_k2u()
mm/sharepool: Rename sp_group operations
mm/sharepool: Simplify sp_unshare_uva()
mm/sharepool: Fix null-pointer-deference in sp_free_area
mm/sharepool: Modify error message in mg_sp_group_del_task
mm/sharepool: Fix double delete list in sp_group_exit
Xu Qiang (17):
mm/sharepool: Refactoring proc file interface similar code
mm/sharepool: Add helper for master_list
mm/sharepool: Delete unused spg_id and hugepage_failures.
mm/sharepool: Delete unused mm in sp_proc_stat.
mm/sharepool: Move spa_num field to sp_group.
mm/sharepool: Rename sp_spg_stat to sp_meminfo.
mm/sharepool: Split meminfo_update into meminfo_inc_usage and
meminfo_dec_usage.
mm/sharepool: split meminfo_update_k2u into meminfo_inc_k2u and
meminfo_dec_k2u.
mm/sharepool: Delete redundant tgid in sp_proc_stat.
mm/sharepool: Move comm from sp_proc_stat to sp_group_master.
mm/sharepool: replace sp_proc_stat with sp_meminfo.
mm/sharepool: Delete unused tgid and spg_id in spg_proc_stat.
mm/sharepool: Replace spg_proc_stat with sp_meminfo.
mm/sharepool: Add meminfo_alloc_sum_byKB and meminfo_alloc_sum.
mm/sharepool: Add meminfo_k2u_size.
mm/sharepool: Delete unused kthread_stat.
mm/sharepool: Delete redundant size and alloc_size in sp_meminfo.
Zhang Zekun (6):
perf: hisi: Add configs for PMU isolation
driver: Add CONFIG_ACPI_APEI_GHES_TS_CORE for code isolation
ACPI / APEI: Add config to isolate Notify all ras err
vmalloc: Add config for Extend for hugepages mapping
iommu/arm-smmu-v3: Add config to Add support for suspend and resume
hugetlbfs: Add config to isolate the code of share_pool
Zhou Guanghui (2):
mm/sharepool: Don't display sharepool statistics in the container
mm: sharepool: Charge Buddy hugepage to memcg
arch/arm64/Kconfig | 2 +-
arch/arm64/configs/openeuler_defconfig | 6 +
drivers/acpi/apei/Kconfig | 14 +
drivers/acpi/apei/ghes.c | 10 +-
drivers/iommu/Kconfig | 7 +
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 9 +-
drivers/perf/hisilicon/Kconfig | 19 +
drivers/perf/hisilicon/Makefile | 6 +-
fs/hugetlbfs/inode.c | 21 +-
include/acpi/ghes.h | 2 +
include/linux/cper.h | 3 +
include/linux/mm.h | 2 +
include/linux/vmalloc.h | 7 +
mm/Kconfig | 6 +
mm/hugetlb.c | 2 +
mm/share_pool.c | 1027 ++++++++-----------
mm/share_pool_internal.h | 22 +
mm/vmalloc.c | 16 +
18 files changed, 569 insertions(+), 612 deletions(-)
create mode 100644 mm/share_pool_internal.h
--
2.17.1
Re: [PATCH v5 OLK-5.10 5/8] soc: hisilicon: hisi_hbmdev: Add power domain control methods
by zhangzekun (A) 03 Mar '23
On 2023/3/3 14:19, Kefeng Wang wrote:
>
>
> On 2023/3/3 11:49, Zhang Zekun wrote:
>> Offering: HULK
>> hulk inclusion
>> category: feature
>> bugzilla: https://gitee.com/openeuler/kernel/issues/I67QNJ
>> CVE: NA
>>
>> ------------------------------------------------------------------
>>
>> Platform devices which supports power control are often required to be
>> power off/on together with the devices in the same power domain.
>> However,
>> there isn't a generic driver that support the power control logic of
>> these devices.
>>
>> ACPI container seems to be a good place to hold these control logic. Add
>> platform devices in the same power domain in a ACPI container, we can
>> easily get the locality information about these devices and can monitor
>> the power of these devices in the same power domain together.
>>
>> This patch provide three userspace control interface to control the
>> power
>> of devices together in the container:
>> - state: Echo online to state to power up the devices in the
>> container and
>> then online these devices which will be triggered by BIOS. Echo
>> offline
>> to the state to offline and eject the child devices in the container
>> which are ejectable.
>> - pxms: show the pxms of devices which are present in the container.
>>
>> In our scenario, we need to control the power of HBM memory devices
>> which
>> can be power consuming and will only be used in some specialized
>> scenarios,
>> such as HPC. HBM memory devices in a socket are in the same power
>> domain,
>> and should be powered off/on together. We have come up with an idea
>> that put
>> these power control logic in a specialized driver, but ACPI container
>> seems
>> to be a more generic place to hold these control logic.
>>
>> Signed-off-by: Zhang Zekun <zhangzekun11(a)huawei.com>
>> ---
>> v6:
>> - get container dev from container_subsys
>>
>> v5:
>> - check containers have PNP0C80 before create attibute
>> files
>>
>> v4:
>> - prettify the code
>>
>> v3:
>> - move the common code to hisi_internal.h
>>
>> drivers/base/container.c | 1 +
>> drivers/soc/Kconfig | 1 +
>> drivers/soc/Makefile | 1 +
>> drivers/soc/hisilicon/Kconfig | 19 +++
>> drivers/soc/hisilicon/Makefile | 3 +
>> drivers/soc/hisilicon/hisi_hbmdev.c | 196 ++++++++++++++++++++++++++
>> drivers/soc/hisilicon/hisi_internal.h | 31 ++++
>> 7 files changed, 252 insertions(+)
>> create mode 100644 drivers/soc/hisilicon/Kconfig
>> create mode 100644 drivers/soc/hisilicon/Makefile
>> create mode 100644 drivers/soc/hisilicon/hisi_hbmdev.c
>> create mode 100644 drivers/soc/hisilicon/hisi_internal.h
>>
>> diff --git a/drivers/base/container.c b/drivers/base/container.c
>> index 1ba42d2d3532..12e572d0c69b 100644
>> --- a/drivers/base/container.c
>> +++ b/drivers/base/container.c
>> @@ -30,6 +30,7 @@ struct bus_type container_subsys = {
>> .online = trivial_online,
>> .offline = container_offline,
>> };
>> +EXPORT_SYMBOL_GPL(container_subsys);
>> void __init container_dev_init(void)
>> {
>> diff --git a/drivers/soc/Kconfig b/drivers/soc/Kconfig
>> index 425ab6f7e375..f7c59b063321 100644
>> --- a/drivers/soc/Kconfig
>> +++ b/drivers/soc/Kconfig
>> @@ -23,5 +23,6 @@ source "drivers/soc/versatile/Kconfig"
>> source "drivers/soc/xilinx/Kconfig"
>> source "drivers/soc/zte/Kconfig"
>> source "drivers/soc/kendryte/Kconfig"
>> +source "drivers/soc/hisilicon/Kconfig"
>> endmenu
>> diff --git a/drivers/soc/Makefile b/drivers/soc/Makefile
>> index 36452bed86ef..68f186e00e44 100644
>> --- a/drivers/soc/Makefile
>> +++ b/drivers/soc/Makefile
>> @@ -29,3 +29,4 @@ obj-$(CONFIG_PLAT_VERSATILE) += versatile/
>> obj-y += xilinx/
>> obj-$(CONFIG_ARCH_ZX) += zte/
>> obj-$(CONFIG_SOC_KENDRYTE) += kendryte/
>> +obj-y += hisilicon/
>> diff --git a/drivers/soc/hisilicon/Kconfig
>> b/drivers/soc/hisilicon/Kconfig
>> new file mode 100644
>> index 000000000000..497787af004e
>> --- /dev/null
>> +++ b/drivers/soc/hisilicon/Kconfig
>> @@ -0,0 +1,19 @@
>> +# SPDX-License-Identifier: GPL-2.0
>> +#
>> +# Hisilicon SoC drivers
>> +#
>> +menu "Hisilicon SoC driver support"
>> +
>> +config HISI_HBMDEV
>> + tristate "add extra support for hbm memory device"
>> + depends on ACPI_HOTPLUG_MEMORY
>> + select ACPI_CONTAINER
>> + help
>> + This driver add extra supports for memory devices. The driver
>> + provides methods for userpace to control the power of memory
>> + devices in a container.
>> +
>> + To compile this driver as a module, choose M here:
>> + the module will be called hisi_hbmdev.
>> +
>> +endmenu
>> diff --git a/drivers/soc/hisilicon/Makefile
>> b/drivers/soc/hisilicon/Makefile
>> new file mode 100644
>> index 000000000000..22e87acb1ab3
>> --- /dev/null
>> +++ b/drivers/soc/hisilicon/Makefile
>> @@ -0,0 +1,3 @@
>> +# SPDX-License-Identifier: GPL-2.0
>> +
>> +obj-$(CONFIG_HISI_HBMDEV) += hisi_hbmdev.o
>> diff --git a/drivers/soc/hisilicon/hisi_hbmdev.c
>> b/drivers/soc/hisilicon/hisi_hbmdev.c
>> new file mode 100644
>> index 000000000000..8485efc00684
>> --- /dev/null
>> +++ b/drivers/soc/hisilicon/hisi_hbmdev.c
>> @@ -0,0 +1,196 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Copyright (C) Huawei Technologies Co., Ltd. 2023. All rights
>> reserved.
>> + */
>> +
>> +#include <linux/kobject.h>
>> +#include <linux/module.h>
>> +#include <linux/nodemask.h>
>> +#include <linux/acpi.h>
>> +#include <linux/container.h>
>> +
>> +#include "hisi_internal.h"
>> +
>> +#define ACPI_MEMORY_DEVICE_HID "PNP0C80"
>> +#define ACPI_GENERIC_CONTAINER_DEVICE_HID "PNP0A06"
>> +
>> +struct memory_dev {
>> + struct kobject *memdev_kobj;
>> +};
>> +
>> +static struct memory_dev *mdev;
>> +
>> +static int hbmdev_find(struct acpi_device *adev, void *arg)
>> +{
>> + const char *hid = acpi_device_hid(adev);
>> + bool *found = arg;
>> +
>> + if (!strcmp(hid, ACPI_MEMORY_DEVICE_HID)) {
>> + *found = true;
>> + return -1;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static bool has_hbmdev(struct device *dev)
>> +{
>> + struct acpi_device *adev = ACPI_COMPANION(dev);
>> + const char *hid = acpi_device_hid(adev);
>> + bool found = false;
>> +
>> + if (strcmp(hid, ACPI_GENERIC_CONTAINER_DEVICE_HID))
>> + return found;
>> +
>> + acpi_dev_for_each_child(adev, hbmdev_find, &found);
>> +
>> + return found;
>> +}
>> +
>> +static int get_pxm(struct acpi_device *acpi_device, void *arg)
>> +{
>> + acpi_handle handle = acpi_device->handle;
>> + nodemask_t *mask = arg;
>> + unsigned long long sta;
>> + acpi_status status;
>> + int nid;
>> +
>> + status = acpi_evaluate_integer(handle, "_STA", NULL, &sta);
>> + if (ACPI_SUCCESS(status) && (sta & ACPI_STA_DEVICE_ENABLED)) {
>> + nid = acpi_get_node(handle);
>> + if (nid >= 0)
>> + node_set(nid, *mask);
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static ssize_t pxms_show(struct device *dev,
>> + struct device_attribute *attr,
>> + char *buf)
>> +{
>> + struct acpi_device *adev = ACPI_COMPANION(dev);
>> + nodemask_t mask;
>> +
>> + nodes_clear(mask);
>> + acpi_dev_for_each_child(adev, get_pxm, &mask);
>> +
>> + return sysfs_emit(buf, "%*pbl\n",
>> + nodemask_pr_args(&mask));
>> +}
>> +static DEVICE_ATTR_RO(pxms);
>> +
>> +static int memdev_power_on(struct acpi_device *adev)
>> +{
>> + acpi_handle handle = adev->handle;
>> + acpi_status status;
>> +
>> + status = acpi_evaluate_object(handle, "_ON", NULL, NULL);
>> + if (ACPI_FAILURE(status)) {
>> + acpi_handle_warn(handle, "Power on failed (0x%x)\n", status);
>> + return -ENODEV;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static int eject_device(struct acpi_device *acpi_device, void
>> *not_used)
>> +{
>> + acpi_object_type unused;
>> + acpi_status status;
>> +
>> + status = acpi_get_type(acpi_device->handle, &unused);
>> + if (ACPI_FAILURE(status) || !acpi_device->flags.ejectable)
>> + return -ENODEV;
>> +
>> + get_device(&acpi_device->dev);
>> + status = acpi_hotplug_schedule(acpi_device,
>> ACPI_OST_EC_OSPM_EJECT);
>> + if (ACPI_SUCCESS(status))
>> + return 0;
>> +
>> + put_device(&acpi_device->dev);
>> + acpi_evaluate_ost(acpi_device->handle, ACPI_OST_EC_OSPM_EJECT,
>> + ACPI_OST_SC_NON_SPECIFIC_FAILURE, NULL);
>> +
>> + return status == AE_NO_MEMORY ? -ENOMEM : -EAGAIN;
>> +}
>> +
>> +static int memdev_power_off(struct acpi_device *adev)
>> +{
>> + return acpi_dev_for_each_child(adev, eject_device, NULL);
>> +}
>> +
>> +static ssize_t state_store(struct device *dev, struct
>> device_attribute *attr,
>> + const char *buf, size_t count)
>> +{
>> + struct acpi_device *adev = ACPI_COMPANION(dev);
>> + const int type = online_type_from_str(buf);
>> + int ret = -EINVAL;
>> +
>> + switch (type) {
>> + case STATE_ONLINE:
>> + ret = memdev_power_on(adev);
>> + break;
>> + case STATE_OFFLINE:
>> + ret = memdev_power_off(adev);
>> + break;
>> + default:
>> + break;
>> + }
>> +
>> + if (ret)
>> + return ret;
>> +
>> + return count;
>> +}
>> +static DEVICE_ATTR_WO(state);
>> +
>> +static int container_add(struct device *dev, void *data)
>> +{
>> + if (!has_hbmdev(dev))
>> + return 0;
>> +
>> + device_create_file(dev, &dev_attr_state);
>> + device_create_file(dev, &dev_attr_pxms);
>> +
>> + return 0;
>> +}
>> +
>> +static int container_remove(struct device *dev, void *data)
>> +{
>> + if (!has_hbmdev(dev))
>> + return 0;
>> +
>> + device_remove_file(dev, &dev_attr_state);
>> + device_remove_file(dev, &dev_attr_pxms);
>> +
>> + return 0;
>> +}
>> +
>> +static int __init mdev_init(void)
>> +{
>> + mdev = kzalloc(sizeof(struct memory_dev), GFP_KERNEL);
>> + if (!mdev)
>> + return -ENOMEM;
>> +
>> + mdev->memdev_kobj = kobject_create_and_add("hbm_memory",
>> kernel_kobj);
>> + if (!mdev->memdev_kobj) {
>> + kfree(mdev);
>> + return -ENOMEM;
>> + }
>> +
>> + bus_for_each_dev(&container_subsys, NULL, NULL, container_add);
>> + return 0;
>> +}
>
> This still does not settle whether the module should be loaded at all:
> 1) The walk already hands you every container device, so there is no
> need to walk the bus again on removal; keep the devices on a list.
> 2) Also use a global flag for whether an HBM device exists; only run
> the rest of the initialization when one is found, otherwise return
> immediately.
>
Following that idea, it could be reworked like this:
module_init
1. First check whether any container holds an HBM device, and build a
list of those containers.
2. If the list is not empty, run the module initialization and create
the hbm_memory directory; if the list is empty, free the list and
return immediately.
module_exit
Free the resources held on the list, then release the list.
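A rough sketch of that flow (hypothetical; collect_hbm_container and
hbm_containers are illustrative names, not the final code):

static LIST_HEAD(hbm_containers);

static int collect_hbm_container(struct device *dev, void *data)
{
	struct cdev_node *cnode;

	if (!has_hbmdev(dev))
		return 0;

	cnode = kzalloc(sizeof(*cnode), GFP_KERNEL);
	if (!cnode)
		return -ENOMEM;

	cnode->dev = dev;
	list_add_tail(&cnode->clist, &hbm_containers);
	return 0;
}

static int __init mdev_init(void)
{
	bus_for_each_dev(&container_subsys, NULL, NULL, collect_hbm_container);
	if (list_empty(&hbm_containers))
		return 0;	/* no HBM device in any container: skip setup */

	/* ... allocate mdev, create the hbm_memory kobject, and create
	 * the per-container attribute files from the saved list ... */
	return 0;
}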
>> +module_init(mdev_init);
>> +
>> +static void __exit mdev_exit(void)
>> +{
>> + bus_for_each_dev(&container_subsys, NULL, NULL, container_remove);
>> + kobject_put(mdev->memdev_kobj);
>> + kfree(mdev);
>> +}
>> +module_exit(mdev_exit);
>> +
>> +MODULE_LICENSE("GPL v2");
>> +MODULE_AUTHOR("Zhang Zekun <zhangzekun11(a)huawei.com>");
>> diff --git a/drivers/soc/hisilicon/hisi_internal.h
>> b/drivers/soc/hisilicon/hisi_internal.h
>> new file mode 100644
>> index 000000000000..5345174f6b84
>> --- /dev/null
>> +++ b/drivers/soc/hisilicon/hisi_internal.h
>> @@ -0,0 +1,31 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * Copyright (C) Huawei Technologies Co., Ltd. 2023. All rights
>> reserved.
>> + */
>> +
>> +#ifndef _HISI_INTERNAL_H
>> +#define _HISI_INTERNAL_H
>> +
>> +enum {
>> + STATE_ONLINE,
>> + STATE_OFFLINE,
>> +};
>> +
>> +static const char *const online_type_to_str[] = {
>> + [STATE_ONLINE] = "online",
>> + [STATE_OFFLINE] = "offline",
>> +};
>> +
>> +static inline int online_type_from_str(const char *str)
>> +{
>> + int i;
>> +
>> + for (i = 0; i < ARRAY_SIZE(online_type_to_str); i++) {
>> + if (sysfs_streq(str, online_type_to_str[i]))
>> + return i;
>> + }
>> +
>> + return -EINVAL;
>> +}
>> +
>> +#endif
03 Mar '23
v6:
- get container device from container_subsys
v5:
- check that the container has PNP0C80 before creating
attribute files
v4:
- prettify the code
v3:
- prettify the code
- add a hisi_internal.h to hold common code
v2:
- remove the !adev check in patch 9, as it will always be
true.
patches 1-3: Add support for iterating through the
child devices of an ACPI device.
patches 4-9: Add support for the HBM memory device and
HBM cache.
Rafael J. Wysocki (3):
ACPI: bus: Introduce acpi_dev_for_each_child()
ACPI: bus: Avoid non-ACPI device objects in walks over children
ACPI: bus: Export acpi_dev_for_each_child() to modules
Zhang Zekun (5):
ACPI: OSL: Export the symbol of acpi_hotplug_schedule
soc: hisilicon: hisi_hbmdev: Add power domain control methods
ACPI: memhotplug: export the state of each hotplug device
soc: hisilicon: hisi_hbmdev: Provide extra memory topology information
ACPI: hbmcache: Add support for online and offline the hbm cache
drivers/acpi/acpi_memhotplug.c | 6 +
drivers/acpi/bus.c | 27 +++
drivers/acpi/internal.h | 1 -
drivers/acpi/osl.c | 1 +
drivers/base/container.c | 1 +
drivers/soc/Kconfig | 1 +
drivers/soc/Makefile | 1 +
drivers/soc/hisilicon/Kconfig | 33 ++++
drivers/soc/hisilicon/Makefile | 4 +
drivers/soc/hisilicon/hisi_hbmcache.c | 147 +++++++++++++++
drivers/soc/hisilicon/hisi_hbmdev.c | 257 ++++++++++++++++++++++++++
drivers/soc/hisilicon/hisi_internal.h | 31 ++++
include/acpi/acpi_bus.h | 2 +
include/linux/acpi.h | 1 +
include/linux/memory_hotplug.h | 2 +
15 files changed, 514 insertions(+), 1 deletion(-)
create mode 100644 drivers/soc/hisilicon/Kconfig
create mode 100644 drivers/soc/hisilicon/Makefile
create mode 100644 drivers/soc/hisilicon/hisi_hbmcache.c
create mode 100644 drivers/soc/hisilicon/hisi_hbmdev.c
create mode 100644 drivers/soc/hisilicon/hisi_internal.h
--
2.17.1
02 Mar '23
v5:
- check that the container has PNP0C80 before creating
attribute files
v4:
- prettify the code
v3:
- prettify the code
- add a hisi_internal.h to hold common code
v2:
- remove the !adev check in patch 9, as it will always be
true.
patches 1-3: Add support for iterating through the
child devices of an ACPI device.
patches 4-9: Add support for the HBM memory device and
HBM cache.
Rafael J. Wysocki (3):
ACPI: bus: Introduce acpi_dev_for_each_child()
ACPI: bus: Avoid non-ACPI device objects in walks over children
ACPI: bus: Export acpi_dev_for_each_child() to modules
Zhang Zekun (6):
ACPI: container: export the container list in the system
ACPI: OSL: Export the symbol of acpi_hotplug_schedule
soc: hisilicon: hisi_hbmdev: Add power domain control methods
ACPI: memhotplug: export the state of each hotplug device
soc: hisilicon: hisi_hbmdev: Provide extra memory topology information
ACPI: hbmcache: Add support for online and offline the hbm cache
drivers/acpi/acpi_memhotplug.c | 6 +
drivers/acpi/bus.c | 27 +++
drivers/acpi/container.c | 51 ++++++
drivers/acpi/internal.h | 1 -
drivers/acpi/osl.c | 1 +
drivers/soc/Kconfig | 1 +
drivers/soc/Makefile | 1 +
drivers/soc/hisilicon/Kconfig | 33 ++++
drivers/soc/hisilicon/Makefile | 4 +
drivers/soc/hisilicon/hisi_hbmcache.c | 147 ++++++++++++++++
drivers/soc/hisilicon/hisi_hbmdev.c | 245 ++++++++++++++++++++++++++
drivers/soc/hisilicon/hisi_internal.h | 33 ++++
include/acpi/acpi_bus.h | 2 +
include/linux/acpi.h | 1 +
include/linux/container.h | 8 +
include/linux/memory_hotplug.h | 2 +
16 files changed, 562 insertions(+), 1 deletion(-)
create mode 100644 drivers/soc/hisilicon/Kconfig
create mode 100644 drivers/soc/hisilicon/Makefile
create mode 100644 drivers/soc/hisilicon/hisi_hbmcache.c
create mode 100644 drivers/soc/hisilicon/hisi_hbmdev.c
create mode 100644 drivers/soc/hisilicon/hisi_internal.h
--
2.17.1
v4:
- prettify the code
v3:
- prettify the code
- add a hisi_internal.h to hold common code
v2:
- remove the !adev check in patch 9, as it will always be
true.
patches 1-3: Add support for iterating through the
child devices of an ACPI device.
patches 4-9: Add support for the HBM memory device and
HBM cache.
Rafael J. Wysocki (3):
ACPI: bus: Introduce acpi_dev_for_each_child()
ACPI: bus: Avoid non-ACPI device objects in walks over children
ACPI: bus: Export acpi_dev_for_each_child() to modules
Zhang Zekun (6):
ACPI: container: export the container list in the system
ACPI: OSL: Export the symbol of acpi_hotplug_schedule
soc: hisilicon: hisi_hbmdev: Add power domain control methods
ACPI: memhotplug: export the state of each hotplug device
soc: hisilicon: hisi_hbmdev: Provide extra memory topology information
ACPI: hbmcache: Add support for online and offline the hbm cache
drivers/acpi/acpi_memhotplug.c | 6 +
drivers/acpi/bus.c | 27 ++++
drivers/acpi/container.c | 51 ++++++
drivers/acpi/internal.h | 1 -
drivers/acpi/osl.c | 1 +
drivers/soc/Kconfig | 1 +
drivers/soc/Makefile | 1 +
drivers/soc/hisilicon/Kconfig | 33 ++++
drivers/soc/hisilicon/Makefile | 4 +
drivers/soc/hisilicon/hisi_hbmcache.c | 147 +++++++++++++++++
drivers/soc/hisilicon/hisi_hbmdev.c | 218 ++++++++++++++++++++++++++
drivers/soc/hisilicon/hisi_internal.h | 31 ++++
include/acpi/acpi_bus.h | 2 +
include/linux/acpi.h | 1 +
include/linux/container.h | 8 +
include/linux/memory_hotplug.h | 2 +
16 files changed, 533 insertions(+), 1 deletion(-)
create mode 100644 drivers/soc/hisilicon/Kconfig
create mode 100644 drivers/soc/hisilicon/Makefile
create mode 100644 drivers/soc/hisilicon/hisi_hbmcache.c
create mode 100644 drivers/soc/hisilicon/hisi_hbmdev.c
create mode 100644 drivers/soc/hisilicon/hisi_internal.h
--
2.17.1
Re: [PATCH v3 OLK-5.10 6/9] soc: hisilicon: hisi_hbmdev: Add power domain control methods
by zhangzekun (A) 02 Mar '23
On 2023/3/2 15:12, Kefeng Wang wrote:
>
>
> On 2023/3/2 14:51, Zhang Zekun wrote:
>> Offering: HULK
>> hulk inclusion
>> category: feature
>> bugzilla: https://gitee.com/openeuler/kernel/issues/I67QNJ
>> CVE: NA
>>
>> ------------------------------------------------------------------
>>
>> Platform devices which supports power control are often required to be
>> power off/on together with the devices in the same power domain.
>> However,
>> there isn't a generic driver that support the power control logic of
>> these devices.
>>
>> ACPI container seems to be a good place to hold these control logic. Add
>> platform devices in the same power domain in a ACPI container, we can
>> easily get the locality information about these devices and can monitor
>> the power of these devices in the same power domain together.
>>
>> This patch provide three userspace control interface to control the
>> power
>> of devices together in the container:
>> - state: Echo online to state to power up the devices in the
>> container and
>> then online these devices which will be triggered by BIOS. Echo
>> offline
>> to the state to offline and eject the child devices in the container
>> which are ejectable.
>> - pxms: show the pxms of devices which are present in the container.
>>
>> In our scenario, we need to control the power of HBM memory devices
>> which
>> can be power consuming and will only be used in some specialized
>> scenarios,
>> such as HPC. HBM memory devices in a socket are in the same power
>> domain,
>> and should be powered off/on together. We have come up with an idea
>> that put
>> these power control logic in a specialized driver, but ACPI container
>> seems
>> to be a more generic place to hold these control logic.
>>
>> Signed-off-by: Zhang Zekun <zhangzekun11(a)huawei.com>
>> ---
>> v3:
>> - move the common code to hisi_internal.h
>>
>> drivers/soc/Kconfig | 1 +
>> drivers/soc/Makefile | 1 +
>> drivers/soc/hisilicon/Kconfig | 19 +++
>> drivers/soc/hisilicon/Makefile | 3 +
>> drivers/soc/hisilicon/hisi_hbmdev.c | 166 ++++++++++++++++++++++++++
>> drivers/soc/hisilicon/hisi_internal.h | 31 +++++
>> 6 files changed, 221 insertions(+)
>> create mode 100644 drivers/soc/hisilicon/Kconfig
>> create mode 100644 drivers/soc/hisilicon/Makefile
>> create mode 100644 drivers/soc/hisilicon/hisi_hbmdev.c
>> create mode 100644 drivers/soc/hisilicon/hisi_internal.h
>>
>> diff --git a/drivers/soc/Kconfig b/drivers/soc/Kconfig
>> index 425ab6f7e375..f7c59b063321 100644
>> --- a/drivers/soc/Kconfig
>> +++ b/drivers/soc/Kconfig
>> @@ -23,5 +23,6 @@ source "drivers/soc/versatile/Kconfig"
>> source "drivers/soc/xilinx/Kconfig"
>> source "drivers/soc/zte/Kconfig"
>> source "drivers/soc/kendryte/Kconfig"
>> +source "drivers/soc/hisilicon/Kconfig"
>> endmenu
>> diff --git a/drivers/soc/Makefile b/drivers/soc/Makefile
>> index 36452bed86ef..68f186e00e44 100644
>> --- a/drivers/soc/Makefile
>> +++ b/drivers/soc/Makefile
>> @@ -29,3 +29,4 @@ obj-$(CONFIG_PLAT_VERSATILE) += versatile/
>> obj-y += xilinx/
>> obj-$(CONFIG_ARCH_ZX) += zte/
>> obj-$(CONFIG_SOC_KENDRYTE) += kendryte/
>> +obj-y += hisilicon/
>> diff --git a/drivers/soc/hisilicon/Kconfig
>> b/drivers/soc/hisilicon/Kconfig
>> new file mode 100644
>> index 000000000000..497787af004e
>> --- /dev/null
>> +++ b/drivers/soc/hisilicon/Kconfig
>> @@ -0,0 +1,19 @@
>> +# SPDX-License-Identifier: GPL-2.0
>> +#
>> +# Hisilicon SoC drivers
>> +#
>> +menu "Hisilicon SoC driver support"
>> +
>> +config HISI_HBMDEV
>> + tristate "add extra support for hbm memory device"
>> + depends on ACPI_HOTPLUG_MEMORY
>> + select ACPI_CONTAINER
>> + help
>> + This driver add extra supports for memory devices. The driver
>> + provides methods for userpace to control the power of memory
>> + devices in a container.
>> +
>> + To compile this driver as a module, choose M here:
>> + the module will be called hisi_hbmdev.
>> +
>> +endmenu
>> diff --git a/drivers/soc/hisilicon/Makefile
>> b/drivers/soc/hisilicon/Makefile
>> new file mode 100644
>> index 000000000000..22e87acb1ab3
>> --- /dev/null
>> +++ b/drivers/soc/hisilicon/Makefile
>> @@ -0,0 +1,3 @@
>> +# SPDX-License-Identifier: GPL-2.0
>> +
>> +obj-$(CONFIG_HISI_HBMDEV) += hisi_hbmdev.o
>> diff --git a/drivers/soc/hisilicon/hisi_hbmdev.c
>> b/drivers/soc/hisilicon/hisi_hbmdev.c
>> new file mode 100644
>> index 000000000000..82943cd35fa2
>> --- /dev/null
>> +++ b/drivers/soc/hisilicon/hisi_hbmdev.c
>> @@ -0,0 +1,166 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Copyright (C) Huawei Technologies Co., Ltd. 2023. All rights
>> reserved.
>> + */
>> +
>> +#include <linux/kobject.h>
>> +#include <linux/module.h>
>> +#include <linux/nodemask.h>
>> +#include <linux/acpi.h>
>> +#include <linux/container.h>
>> +
>> +#include "hisi_internal.h"
>> +
>> +struct memory_dev {
>> + struct kobject *memdev_kobj;
>> +};
>> +
>> +static struct memory_dev *mdev;
>> +
>> +static int get_pxm(struct acpi_device *acpi_device, void *arg)
>> +{
>> + int nid;
>> + unsigned long long sta;
>> + acpi_handle handle;
>> + nodemask_t *mask;
>> + acpi_status status;
>> +
>> + mask = arg;
>> + handle = acpi_device->handle;
>
> Use inverted-pyramid (reverse Christmas tree) ordering for the declarations:
> acpi_handle handle = acpi_device->handle;
> nodemask_t *mask = arg;
> unsigned long long sta;
> acpi_status status;
> int nid;
>> +
>> + status = acpi_evaluate_integer(handle, "_STA", NULL, &sta);
>> + if (ACPI_SUCCESS(status) && (sta & ACPI_STA_DEVICE_ENABLED)) {
>> + nid = acpi_get_node(handle);
>> + if (nid >= 0)
>> + node_set(nid, *mask);
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static ssize_t pxms_show(struct device *dev,
>> + struct device_attribute *attr,
>> + char *buf)
>> +{
>> + nodemask_t mask;
>> + struct acpi_device *adev;
>> +
>> + adev = to_acpi_device(dev);
> Same as above.
>
>> + nodes_clear(mask);
>> + acpi_dev_for_each_child(adev, get_pxm, &mask);
>> +
>> + return sysfs_emit(buf, "%*pbl\n",
>> + nodemask_pr_args(&mask));
>> +}
>> +static DEVICE_ATTR_RO(pxms);
>> +
>> +static int memdev_power_on(struct acpi_device *adev)
>> +{
>> + acpi_status status;
>> + acpi_handle handle;
>> +
>> + handle = adev->handle;
> ...
>> + status = acpi_evaluate_object(handle, "_ON", NULL, NULL);
>> + if (ACPI_FAILURE(status)) {
>> + acpi_handle_warn(handle, "Power on failed (0x%x)\n", status);
>> + return -ENODEV;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static int eject_device(struct acpi_device *acpi_device, void *not_used)
>> +{
>> + acpi_object_type unused;
>> + acpi_status status;
>> +
>> + status = acpi_get_type(acpi_device->handle, &unused);
>> + if (ACPI_FAILURE(status) || !acpi_device->flags.ejectable)
>> + return -ENODEV;
>> +
>> + get_device(&acpi_device->dev);
>> + status = acpi_hotplug_schedule(acpi_device, ACPI_OST_EC_OSPM_EJECT);
>> + if (ACPI_SUCCESS(status))
>> + return 0;
>> +
>> + put_device(&acpi_device->dev);
>> + acpi_evaluate_ost(acpi_device->handle, ACPI_OST_EC_OSPM_EJECT,
>> + ACPI_OST_SC_NON_SPECIFIC_FAILURE, NULL);
>> +
>> + return status == AE_NO_MEMORY ? -ENOMEM : -EAGAIN;
>> +}
>> +
>> +static int memdev_power_off(struct acpi_device *adev)
>> +{
>> + return acpi_dev_for_each_child(adev, eject_device, NULL);
>> +}
>> +
>> +static ssize_t state_store(struct device *d, struct device_attribute *attr,
>> + const char *buf, size_t count)
>> +{
>> + int ret;
>> + struct acpi_device *adev;
>> + const int online_type = online_type_from_str(buf);
>> +
>> + if (online_type < 0)
>> + return -EINVAL;
>> +
>> + adev = to_acpi_device(d);
>
>
> const int online_type = online_type_from_str(buf);
> struct acpi_device *adev = to_acpi_device(d);
> int ret;
> ...
>
>
>> + switch (online_type) {
>> + case STATE_ONLINE:
>> + ret = memdev_power_on(adev);
>> + if (!ret)
>> + return count;
>> + break;
>> + case STATE_OFFLINE:
>> + ret = memdev_power_off(adev);
>> + if (!ret)
>> + return count;
>> + break;
>> + default:
>> + return -EINVAL;
>> + }
> This looks a bit odd. Can't the hotplug functions for the hbm cache and hbm memory share similar logic?
Let's change this to the following:
{
        struct acpi_device *adev = to_acpi_device(d);
        const int type = online_type_from_str(buf);
        int ret = -EINVAL;

        switch (type) {
        case STATE_ONLINE:
                ret = memdev_power_on(adev);
                break;
        case STATE_OFFLINE:
                ret = memdev_power_off(adev);
                break;
        default:
                break;
        }

        if (ret)
                return ret;
        return count;
}
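For context, a hypothetical user-space sketch of driving the resulting attribute; the sysfs path is illustrative only, since the real path depends on where the ACPI container device is registered:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        /* Path is an assumption for illustration. */
        int fd = open("/sys/devices/platform/ACPI0004:00/state", O_WRONLY);

        if (fd < 0) {
                perror("open");
                return 1;
        }
        if (write(fd, "offline", 7) != 7)       /* or "online" */
                perror("write");
        close(fd);
        return 0;
}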
>> +
>> + return ret;
>> +}
>> +static DEVICE_ATTR_WO(state);
>> +
>> +static int __init mdev_init(void)
>> +{
>> + struct cdev_node *cnode;
>> +
>> + mdev = kzalloc(sizeof(struct memory_dev), GFP_KERNEL);
>> + if (!mdev)
>> + return -ENOMEM;
>> +
>> + mdev->memdev_kobj = kobject_create_and_add("hbm_memory",
>> kernel_kobj);
>> + if (!mdev->memdev_kobj) {
>> + kfree(mdev);
>> + return -ENOMEM;
>> + }
>> +
>> + list_for_each_entry(cnode, &cdev_list->clist, clist) {
>> + device_create_file(cnode->dev, &dev_attr_state);
>> + device_create_file(cnode->dev, &dev_attr_pxms);
>> + }
>> +
>> + return 0;
>> +}
>> +module_init(mdev_init);
>> +
>> +static void __exit mdev_exit(void)
>> +{
>> + struct cdev_node *cnode;
>> +
>> + list_for_each_entry(cnode, &cdev_list->clist, clist) {
>> + device_remove_file(cnode->dev, &dev_attr_state);
>> + device_remove_file(cnode->dev, &dev_attr_pxms);
>> + }
>> +
>> + kobject_put(mdev->memdev_kobj);
>> + kfree(mdev);
>> +}
>> +module_exit(mdev_exit);
>> +
>> +MODULE_LICENSE("GPL v2");
>> +MODULE_AUTHOR("Zhang Zekun <zhangzekun11(a)huawei.com>");
>> diff --git a/drivers/soc/hisilicon/hisi_internal.h b/drivers/soc/hisilicon/hisi_internal.h
>> new file mode 100644
>> index 000000000000..f14596f58a05
>> --- /dev/null
>> +++ b/drivers/soc/hisilicon/hisi_internal.h
>> @@ -0,0 +1,31 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * Copyright (C) Huawei Technologies Co., Ltd. 2023. All rights reserved.
>> + */
>> +
>> +#ifndef _HISI_INTERNAL_H
>> +#define _HISI_INTERNAL_H
>> +
>> +enum {
>> + STATE_ONLINE,
>> + STATE_OFFLINE,
>> +};
>> +
>> +static const char *const online_type_to_str[] = {
>> + [STATE_ONLINE] = "online",
>> + [STATE_OFFLINE] = "offline",
>> +};
>> +
>> +static int online_type_from_str(const char *str)
> Doesn't this need to be inline?
>> +{
>> + int i;
>> +
>> + for (i = 0; i < ARRAY_SIZE(online_type_to_str); i++) {
>> + if (sysfs_streq(str, online_type_to_str[i]))
>> + return i;
>> + }
>> +
>> + return -EINVAL;
>> +}
>> +
>> +#endif
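On the inline question above: marking the header helper static inline avoids an unused (and possibly warned-about) copy of the function in every file that includes the header. A minimal sketch of the helper with the qualifier applied, not the posted patch:

static inline int online_type_from_str(const char *str)
{
        int i;

        /* sysfs_streq() tolerates the trailing newline of sysfs writes. */
        for (i = 0; i < ARRAY_SIZE(online_type_to_str); i++) {
                if (sysfs_streq(str, online_type_to_str[i]))
                        return i;
        }

        return -EINVAL;
}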
v3:
- prettify the code
- add hisi_internal.h to hold common code
v2:
- remove the !adev check in patch 9, as it will always be
  true.
Patches 1-3: add support for iterating through the child
devices of an ACPI device.
Patches 4-9: add support for the HBM memory device and the
HBM cache.
Rafael J. Wysocki (3):
ACPI: bus: Introduce acpi_dev_for_each_child()
ACPI: bus: Avoid non-ACPI device objects in walks over children
ACPI: bus: Export acpi_dev_for_each_child() to modules
Zhang Zekun (6):
ACPI: container: export the container list in the system
ACPI: OSL: Export the symbol of acpi_hotplug_schedule
soc: hisilicon: hisi_hbmdev: Add power domain control methods
ACPI: memhotplug: export the state of each hotplug device
soc: hisilicon: hisi_hbmdev: Provide extra memory topology information
ACPI: hbmcache: Add support for online and offline the hbm cache
drivers/acpi/acpi_memhotplug.c | 6 +
drivers/acpi/bus.c | 27 +++
drivers/acpi/container.c | 51 ++++++
drivers/acpi/internal.h | 1 -
drivers/acpi/osl.c | 1 +
drivers/soc/Kconfig | 1 +
drivers/soc/Makefile | 1 +
drivers/soc/hisilicon/Kconfig | 33 ++++
drivers/soc/hisilicon/Makefile | 4 +
drivers/soc/hisilicon/hisi_hbmcache.c | 147 +++++++++++++++++
drivers/soc/hisilicon/hisi_hbmdev.c | 229 ++++++++++++++++++++++++++
drivers/soc/hisilicon/hisi_internal.h | 31 ++++
include/acpi/acpi_bus.h | 2 +
include/linux/acpi.h | 1 +
include/linux/container.h | 8 +
include/linux/memory_hotplug.h | 2 +
16 files changed, 544 insertions(+), 1 deletion(-)
create mode 100644 drivers/soc/hisilicon/Kconfig
create mode 100644 drivers/soc/hisilicon/Makefile
create mode 100644 drivers/soc/hisilicon/hisi_hbmcache.c
create mode 100644 drivers/soc/hisilicon/hisi_hbmdev.c
create mode 100644 drivers/soc/hisilicon/hisi_internal.h
--
2.17.1
02 Mar '23
hulk inclusion
category: bugfix
bugzilla: 188150, https://gitee.com/openeuler/kernel/issues/I643OL
----------------------------------------
This reverts commit 7f10ea522db56188ae46c5bbee7052a2b2797515.
That commit causes a soft lockup problem:
watchdog: BUG: soft lockup - CPU#22 stuck for 67s! [iscsid:16369]
Call Trace:
scsi_remove_target+0x548/0x7b0
? sdev_store_delete+0x90/0x90
? __mutex_lock_slowpath+0x10/0x10
? device_remove_class_symlinks+0x1b0/0x1b0
__iscsi_unbind_session+0x16b/0x250 [scsi_transport_iscsi]
iscsi_remove_session+0x1d3/0x2f0 [scsi_transport_iscsi]
iscsi_session_remove+0x5c/0x80 [libiscsi]
iscsi_sw_tcp_session_destroy+0xd3/0x160 [iscsi_tcp]
iscsi_if_rx+0x2369/0x5060 [scsi_transport_iscsi]
The reason is that if another thread holds a reference to the kobject
while we wait for the device to be released, the wait loop never
terminates.
Fixes: 7f10ea522db5 ("scsi: fix iscsi rescan fails to create block")
Signed-off-by: Zhong Jinghua <zhongjinghua(a)huawei.com>
---
drivers/scsi/scsi_sysfs.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 4468b92bf83b..6433476d3e67 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1507,13 +1507,6 @@ void scsi_remove_device(struct scsi_device *sdev)
}
EXPORT_SYMBOL(scsi_remove_device);
-static int scsi_device_try_get(struct scsi_device *sdev)
-{
- if (!kobject_get_unless_zero(&sdev->sdev_gendev.kobj))
- return -ENXIO;
- return 0;
-}
-
static void __scsi_remove_target(struct scsi_target *starget)
{
struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
@@ -1532,7 +1525,9 @@ static void __scsi_remove_target(struct scsi_target *starget)
if (sdev->channel != starget->channel ||
sdev->id != starget->id)
continue;
- if (scsi_device_try_get(sdev))
+ if (sdev->sdev_state == SDEV_DEL ||
+ sdev->sdev_state == SDEV_CANCEL ||
+ !get_device(&sdev->sdev_gendev))
continue;
spin_unlock_irqrestore(shost->host_lock, flags);
scsi_remove_device(sdev);
--
2.31.1
From: Juan Zhou <zhoujuan51(a)h-partners.com>
1. Support hns HW stats
2. Add dfx cnt stats
Chengchang Tang (2):
RDMA/hns: Support hns HW stats
RDMA/hns: Add dfx cnt stats
drivers/infiniband/hw/hns/hns_roce_ah.c | 8 +-
drivers/infiniband/hw/hns/hns_roce_cmd.c | 17 ++-
drivers/infiniband/hw/hns/hns_roce_cq.c | 17 ++-
drivers/infiniband/hw/hns/hns_roce_device.h | 50 +++++++
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 59 ++++++++
drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 1 +
drivers/infiniband/hw/hns/hns_roce_main.c | 146 +++++++++++++++++++-
drivers/infiniband/hw/hns/hns_roce_mr.c | 22 ++-
drivers/infiniband/hw/hns/hns_roce_pd.c | 10 +-
drivers/infiniband/hw/hns/hns_roce_qp.c | 16 ++-
drivers/infiniband/hw/hns/hns_roce_srq.c | 7 +-
11 files changed, 320 insertions(+), 33 deletions(-)
--
2.30.0
[PATCH openEuler-1.0-LTS 1/2] scsi: iscsi_tcp: Fix UAF during logout when accessing the shost ipaddress
by Yongqiang Liu 28 Feb '23
by Yongqiang Liu 28 Feb '23
28 Feb '23
From: Mike Christie <michael.christie(a)oracle.com>
mainline inclusion
from mainline-v6.2-rc6~31
commit 6f1d64b13097e85abda0f91b5638000afc5f9a06
category: bugfix
bugzilla: 188443, https://gitee.com/openeuler/kernel/issues/I6I8YD
CVE: NA
----------------------------------------
Bug report and analysis from Ding Hui.
During iSCSI session logout, if another task accesses the shost ipaddress
attr, we can get a KASAN UAF report like this:
[ 276.942144] BUG: KASAN: use-after-free in _raw_spin_lock_bh+0x78/0xe0
[ 276.942535] Write of size 4 at addr ffff8881053b45b8 by task cat/4088
[ 276.943511] CPU: 2 PID: 4088 Comm: cat Tainted: G E 6.1.0-rc8+ #3
[ 276.943997] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[ 276.944470] Call Trace:
[ 276.944943] <TASK>
[ 276.945397] dump_stack_lvl+0x34/0x48
[ 276.945887] print_address_description.constprop.0+0x86/0x1e7
[ 276.946421] print_report+0x36/0x4f
[ 276.947358] kasan_report+0xad/0x130
[ 276.948234] kasan_check_range+0x35/0x1c0
[ 276.948674] _raw_spin_lock_bh+0x78/0xe0
[ 276.949989] iscsi_sw_tcp_host_get_param+0xad/0x2e0 [iscsi_tcp]
[ 276.951765] show_host_param_ISCSI_HOST_PARAM_IPADDRESS+0xe9/0x130 [scsi_transport_iscsi]
[ 276.952185] dev_attr_show+0x3f/0x80
[ 276.953005] sysfs_kf_seq_show+0x1fb/0x3e0
[ 276.953401] seq_read_iter+0x402/0x1020
[ 276.954260] vfs_read+0x532/0x7b0
[ 276.955113] ksys_read+0xed/0x1c0
[ 276.955952] do_syscall_64+0x38/0x90
[ 276.956347] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 276.956769] RIP: 0033:0x7f5d3a679222
[ 276.957161] Code: c0 e9 b2 fe ff ff 50 48 8d 3d 32 c0 0b 00 e8 a5 fe 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[ 276.958009] RSP: 002b:00007ffc864d16a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 276.958431] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f5d3a679222
[ 276.958857] RDX: 0000000000020000 RSI: 00007f5d3a4fe000 RDI: 0000000000000003
[ 276.959281] RBP: 00007f5d3a4fe000 R08: 00000000ffffffff R09: 0000000000000000
[ 276.959682] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000020000
[ 276.960126] R13: 0000000000000003 R14: 0000000000000000 R15: 0000557a26dada58
[ 276.960536] </TASK>
[ 276.961357] Allocated by task 2209:
[ 276.961756] kasan_save_stack+0x1e/0x40
[ 276.962170] kasan_set_track+0x21/0x30
[ 276.962557] __kasan_kmalloc+0x7e/0x90
[ 276.962923] __kmalloc+0x5b/0x140
[ 276.963308] iscsi_alloc_session+0x28/0x840 [scsi_transport_iscsi]
[ 276.963712] iscsi_session_setup+0xda/0xba0 [libiscsi]
[ 276.964078] iscsi_sw_tcp_session_create+0x1fd/0x330 [iscsi_tcp]
[ 276.964431] iscsi_if_create_session.isra.0+0x50/0x260 [scsi_transport_iscsi]
[ 276.964793] iscsi_if_recv_msg+0xc5a/0x2660 [scsi_transport_iscsi]
[ 276.965153] iscsi_if_rx+0x198/0x4b0 [scsi_transport_iscsi]
[ 276.965546] netlink_unicast+0x4d5/0x7b0
[ 276.965905] netlink_sendmsg+0x78d/0xc30
[ 276.966236] sock_sendmsg+0xe5/0x120
[ 276.966576] ____sys_sendmsg+0x5fe/0x860
[ 276.966923] ___sys_sendmsg+0xe0/0x170
[ 276.967300] __sys_sendmsg+0xc8/0x170
[ 276.967666] do_syscall_64+0x38/0x90
[ 276.968028] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 276.968773] Freed by task 2209:
[ 276.969111] kasan_save_stack+0x1e/0x40
[ 276.969449] kasan_set_track+0x21/0x30
[ 276.969789] kasan_save_free_info+0x2a/0x50
[ 276.970146] __kasan_slab_free+0x106/0x190
[ 276.970470] __kmem_cache_free+0x133/0x270
[ 276.970816] device_release+0x98/0x210
[ 276.971145] kobject_cleanup+0x101/0x360
[ 276.971462] iscsi_session_teardown+0x3fb/0x530 [libiscsi]
[ 276.971775] iscsi_sw_tcp_session_destroy+0xd8/0x130 [iscsi_tcp]
[ 276.972143] iscsi_if_recv_msg+0x1bf1/0x2660 [scsi_transport_iscsi]
[ 276.972485] iscsi_if_rx+0x198/0x4b0 [scsi_transport_iscsi]
[ 276.972808] netlink_unicast+0x4d5/0x7b0
[ 276.973201] netlink_sendmsg+0x78d/0xc30
[ 276.973544] sock_sendmsg+0xe5/0x120
[ 276.973864] ____sys_sendmsg+0x5fe/0x860
[ 276.974248] ___sys_sendmsg+0xe0/0x170
[ 276.974583] __sys_sendmsg+0xc8/0x170
[ 276.974891] do_syscall_64+0x38/0x90
[ 276.975216] entry_SYSCALL_64_after_hwframe+0x63/0xcd
We can easily reproduce by two tasks:
1. while :; do iscsiadm -m node --login; iscsiadm -m node --logout; done
2. while :; do cat \
/sys/devices/platform/host*/iscsi_host/host*/ipaddress; done
iscsid                          | cat
--------------------------------+---------------------------------------
|- iscsi_sw_tcp_session_destroy |
|- iscsi_session_teardown       |
|- device_release               |
|- iscsi_session_release        | |- dev_attr_show
   |- kfree                     | |- show_host_param_
                                |        ISCSI_HOST_PARAM_IPADDRESS
                                |    |- iscsi_sw_tcp_host_get_param
                                |      |- r/w tcp_sw_host->session (UAF)
|- iscsi_host_remove            |
|- iscsi_host_free              |
Fix the above bug by splitting the session removal into 2 parts:
1. removal from iSCSI class which includes sysfs and removal from host
tracking.
2. freeing of session.
During iscsi_tcp host and session removal we can remove the session from
sysfs then remove the host from sysfs. At this point we know userspace is
not accessing the kernel via sysfs so we can free the session and host.
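In outline, the destroy path after this patch looks like the sketch below, a condensed reading of the hunks that follow rather than verbatim kernel code:

static void iscsi_sw_tcp_session_destroy_sketch(
                struct iscsi_cls_session *cls_session,
                struct Scsi_Host *shost)
{
        iscsi_session_remove(cls_session);  /* 1. session out of sysfs */
        iscsi_host_remove(shost);           /* 2. host out of sysfs; no
                                             *    show_host_param_* can
                                             *    run past this point */
        iscsi_tcp_r2tpool_free(cls_session->dd_data);
        iscsi_session_free(cls_session);    /* 3. now safe to free */
        iscsi_host_free(shost);
}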
Link: https://lore.kernel.org/r/20230117193937.21244-2-michael.christie@oracle.com
Signed-off-by: Mike Christie <michael.christie(a)oracle.com>
Reviewed-by: Lee Duncan <lduncan(a)suse.com>
Acked-by: Ding Hui <dinghui(a)sangfor.com.cn>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Wenchao Hao <haowenchao2(a)huawei.com>
Signed-off-by: Zhong Jinghua <zhongjinghua(a)huawei.com>
Reviewed-by: Hou Tao <houtao1(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/scsi/iscsi_tcp.c | 11 +++++++++--
drivers/scsi/libiscsi.c | 39 +++++++++++++++++++++++++++++++--------
include/scsi/libiscsi.h | 2 ++
3 files changed, 42 insertions(+), 10 deletions(-)
diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 241b1a310519..a5259fd39d6d 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -910,10 +910,17 @@ static void iscsi_sw_tcp_session_destroy(struct iscsi_cls_session *cls_session)
if (WARN_ON_ONCE(session->leadconn))
return;
+ iscsi_session_remove(cls_session);
+ /*
+ * Our get_host_param needs to access the session, so remove the
+ * host from sysfs before freeing the session to make sure userspace
+ * is no longer accessing the callout.
+ */
+ iscsi_host_remove(shost);
+
iscsi_tcp_r2tpool_free(cls_session->dd_data);
- iscsi_session_teardown(cls_session);
- iscsi_host_remove(shost);
+ iscsi_session_free(cls_session);
iscsi_host_free(shost);
}
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 9f625a4d53c0..72463874d7b4 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -3018,20 +3018,34 @@ iscsi_session_setup(struct iscsi_transport *iscsit, struct Scsi_Host *shost,
}
EXPORT_SYMBOL_GPL(iscsi_session_setup);
+/*
+ * iscsi_session_remove - Remove session from iSCSI class.
+ */
+void iscsi_session_remove(struct iscsi_cls_session *cls_session)
+{
+ struct iscsi_session *session = cls_session->dd_data;
+ struct Scsi_Host *shost = session->host;
+
+ iscsi_remove_session(cls_session);
+ /*
+ * host removal only has to wait for its children to be removed from
+ * sysfs, and iscsi_tcp needs to do iscsi_host_remove before freeing
+ * the session, so drop the session count here.
+ */
+ iscsi_host_dec_session_cnt(shost);
+}
+EXPORT_SYMBOL_GPL(iscsi_session_remove);
+
/**
- * iscsi_session_teardown - destroy session, host, and cls_session
+ * iscsi_session_free - Free iscsi session and its resources
* @cls_session: iscsi session
*/
-void iscsi_session_teardown(struct iscsi_cls_session *cls_session)
+void iscsi_session_free(struct iscsi_cls_session *cls_session)
{
struct iscsi_session *session = cls_session->dd_data;
struct module *owner = cls_session->transport->owner;
- struct Scsi_Host *shost = session->host;
iscsi_pool_free(&session->cmdpool);
-
- iscsi_remove_session(cls_session);
-
kfree(session->password);
kfree(session->password_in);
kfree(session->username);
@@ -3047,10 +3061,19 @@ void iscsi_session_teardown(struct iscsi_cls_session *cls_session)
kfree(session->discovery_parent_type);
iscsi_free_session(cls_session);
-
- iscsi_host_dec_session_cnt(shost);
module_put(owner);
}
+EXPORT_SYMBOL_GPL(iscsi_session_free);
+
+/**
+ * iscsi_session_teardown - destroy session and cls_session
+ * @cls_session: iscsi session
+ */
+void iscsi_session_teardown(struct iscsi_cls_session *cls_session)
+{
+ iscsi_session_remove(cls_session);
+ iscsi_session_free(cls_session);
+}
EXPORT_SYMBOL_GPL(iscsi_session_teardown);
/**
diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h
index 254e72b46d10..2a8d1de70290 100644
--- a/include/scsi/libiscsi.h
+++ b/include/scsi/libiscsi.h
@@ -425,6 +425,8 @@ extern int iscsi_host_get_max_scsi_cmds(struct Scsi_Host *shost,
extern struct iscsi_cls_session *
iscsi_session_setup(struct iscsi_transport *, struct Scsi_Host *shost,
uint16_t, int, int, uint32_t, unsigned int);
+void iscsi_session_remove(struct iscsi_cls_session *cls_session);
+void iscsi_session_free(struct iscsi_cls_session *cls_session);
extern void iscsi_session_teardown(struct iscsi_cls_session *);
extern void iscsi_session_recovery_timedout(struct iscsi_cls_session *);
extern int iscsi_set_param(struct iscsi_cls_conn *cls_conn,
--
2.25.1
[PATCH openEuler-1.0-LTS] pciehp: fix the problem that the slot is powered on again after being powered off
by jiazhenyuan@uniontech.com 28 Feb '23
From: jiazhenyuan <jiazhenyuan(a)uniontech.com>
mainline inclusion
from mainline-4.19-lts
commit 32a8cef274feacd00b748a4f13b84d60aa6d82ff
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6ICDL
CVE: NA
----------------------------------------
The DISABLE_SLOT event is lost when the slot is powered off.
Signed-off-by: jiazhenyuan <jiazhenyuan(a)uniontech.com>
---
drivers/pci/hotplug/pciehp_ctrl.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index 2d549c97ac42..dd67dc540279 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -463,6 +463,7 @@ int pciehp_sysfs_disable_slot(struct slot *p_slot)
mutex_unlock(&p_slot->lock);
pci_config_pm_runtime_get(pdev);
down_read(&ctrl->reset_lock);
+ atomic_or(DISABLE_SLOT, &ctrl->pending_events);
pciehp_handle_disable_request(p_slot);
up_read(&ctrl->reset_lock);
pci_config_pm_runtime_put(pdev);
--
2.27.0
[PATCH openEuler-1.0-LTS] net: mpls: fix stale pointer if allocation fails during device rename
by Yongqiang Liu 28 Feb '23
From: Jakub Kicinski <kuba(a)kernel.org>
stable inclusion
from stable-v4.19.273
commit aa07c86e43ed8780d610ecfb2ce13da326729201
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6HZHU
CVE: CVE-2023-26545
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit fda6c89fe3d9aca073495a664e1d5aea28cd4377 upstream.
lianhui reports that when MPLS fails to register the sysctl table
under new location (during device rename) the old pointers won't
get overwritten and may be freed again (double free).
Handle this gracefully. The best option would be unregistering
the MPLS from the device completely on failure, but unfortunately
mpls_ifdown() can fail. So failing fully is also unreliable.
Another option is to register the new table first then only
remove old one if the new one succeeds. That requires more
code, changes order of notifications and two tables may be
visible at the same time.
sysctl point is not used in the rest of the code - set to NULL
on failures and skip unregister if already NULL.
Reported-by: lianhui tang <bluetlh(a)gmail.com>
Fixes: 0fae3bf018d9 ("mpls: handle device renames for per-device sysctls")
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Zhengchao Shao <shaozhengchao(a)huawei.com>
Reviewed-by: Liu Jian <liujian56(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Reviewed-by: Yue Haibing <yuehaibing(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
net/mpls/af_mpls.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
index 7623d9aec636..c7fd387baa61 100644
--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -1375,6 +1375,7 @@ static int mpls_dev_sysctl_register(struct net_device *dev,
free:
kfree(table);
out:
+ mdev->sysctl = NULL;
return -ENOBUFS;
}
@@ -1384,6 +1385,9 @@ static void mpls_dev_sysctl_unregister(struct net_device *dev,
struct net *net = dev_net(dev);
struct ctl_table *table;
+ if (!mdev->sysctl)
+ return;
+
table = mdev->sysctl->ctl_table_arg;
unregister_net_sysctl_table(mdev->sysctl);
kfree(table);
--
2.25.1
[PATCH openEuler-5.10-LTS-SP1] net: mpls: fix stale pointer if allocation fails during device rename
by Jialin Zhang 28 Feb '23
From: Jakub Kicinski <kuba(a)kernel.org>
stable inclusion
from stable-v5.10.169
commit 7ff0fdba82298d1f456c685e24930da89703c0fb
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6HZHU
CVE: CVE-2023-26545
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
commit fda6c89fe3d9aca073495a664e1d5aea28cd4377 upstream.
lianhui reports that when MPLS fails to register the sysctl table
under new location (during device rename) the old pointers won't
get overwritten and may be freed again (double free).
Handle this gracefully. The best option would be unregistering
the MPLS from the device completely on failure, but unfortunately
mpls_ifdown() can fail. So failing fully is also unreliable.
Another option is to register the new table first then only
remove old one if the new one succeeds. That requires more
code, changes order of notifications and two tables may be
visible at the same time.
sysctl point is not used in the rest of the code - set to NULL
on failures and skip unregister if already NULL.
Reported-by: lianhui tang <bluetlh(a)gmail.com>
Fixes: 0fae3bf018d9 ("mpls: handle device renames for per-device sysctls")
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Zhengchao Shao <shaozhengchao(a)huawei.com>
Reviewed-by: Liu Jian <liujian56(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
net/mpls/af_mpls.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
index 72398149e4d4..1dcbdab9319b 100644
--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -1427,6 +1427,7 @@ static int mpls_dev_sysctl_register(struct net_device *dev,
free:
kfree(table);
out:
+ mdev->sysctl = NULL;
return -ENOBUFS;
}
@@ -1436,6 +1437,9 @@ static void mpls_dev_sysctl_unregister(struct net_device *dev,
struct net *net = dev_net(dev);
struct ctl_table *table;
+ if (!mdev->sysctl)
+ return;
+
table = mdev->sysctl->ctl_table_arg;
unregister_net_sysctl_table(mdev->sysctl);
kfree(table);
--
2.25.1
[PATCH openEuler-5.10-LTS 01/50] Revert "[Huawei] io_uring:drop identity before creating a private one"
by Jialin Zhang 28 Feb '23
From: Li Lingfeng <lilingfeng3(a)huawei.com>
Offering: HULK
hulk inclusion
category: feature
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC
-------------------------------
This reverts commit ab45921343c63dc9d461740e23003098697e0333.
We need to apply patch 788d0824269bef (io_uring: import 5.15-stable
io_uring) to move io_uring to a separate directory and solve
the problem of CVE-2023-0240.
The reverted patch fixed a UAF problem of io_identity; it can safely
be reverted since io_identity is removed in patch 788d0824269bef.
Signed-off-by: Li Lingfeng <lilingfeng3(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
fs/io_uring.c | 42 ------------------------------------------
1 file changed, 42 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 4ace89ae4832..2397c2a1d919 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1364,47 +1364,6 @@ static bool io_identity_cow(struct io_kiocb *req)
return true;
}
-static void io_drop_identity(struct io_kiocb *req)
-{
- struct io_identity *id = req->work.identity;
-
- if (req->work.flags & IO_WQ_WORK_MM) {
- mmdrop(id->mm);
- req->work.flags &= ~IO_WQ_WORK_MM;
- }
-#ifdef CONFIG_BLK_CGROUP
- if (req->work.flags & IO_WQ_WORK_BLKCG) {
- css_put(id->blkcg_css);
- req->work.flags &= ~IO_WQ_WORK_BLKCG;
- }
-#endif
- if (req->work.flags & IO_WQ_WORK_CREDS) {
- put_cred(id->creds);
- req->work.flags &= ~IO_WQ_WORK_CREDS;
- }
- if (req->work.flags & IO_WQ_WORK_FILES) {
- put_files_struct(req->work.identity->files);
- put_nsproxy(req->work.identity->nsproxy);
- req->work.flags &= ~IO_WQ_WORK_FILES;
- }
- if (req->work.flags & IO_WQ_WORK_CANCEL)
- req->work.flags &= ~IO_WQ_WORK_CANCEL;
- if (req->work.flags & IO_WQ_WORK_FS) {
- struct fs_struct *fs = id->fs;
-
- spin_lock(&id->fs->lock);
- if (--fs->users)
- fs = NULL;
- spin_unlock(&id->fs->lock);
-
- if (fs)
- free_fs_struct(fs);
- req->work.flags &= ~IO_WQ_WORK_FS;
- }
- if (req->work.flags & IO_WQ_WORK_FSIZE)
- req->work.flags &= ~IO_WQ_WORK_FSIZE;
-}
-
static bool io_grab_identity(struct io_kiocb *req)
{
const struct io_op_def *def = &io_op_defs[req->opcode];
@@ -1510,7 +1469,6 @@ static void io_prep_async_work(struct io_kiocb *req)
if (io_grab_identity(req))
return;
- io_drop_identity(req);
if (!io_identity_cow(req))
return;
--
2.25.1
[PATCH openEuler-5.10-LTS-SP1 01/52] Revert "[Huawei] io_uring:drop identity before creating a private one"
by Jialin Zhang 28 Feb '23
From: Li Lingfeng <lilingfeng3(a)huawei.com>
Offering: HULK
hulk inclusion
category: feature
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6BTWC
-------------------------------
This reverts commit 6d9aaec1edb197ca7e05873c1930c418b7ecaa1e.
We need to apply patch 788d0824269bef (io_uring: import 5.15-stable
io_uring) to move io_uring to a separate directory and solve
the problem of CVE-2023-0240.
The reverted patch fixed a UAF problem of io_identity; it can safely
be reverted since io_identity is removed in patch 788d0824269bef.
Signed-off-by: Li Lingfeng <lilingfeng3(a)huawei.com>
Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
fs/io_uring.c | 42 ------------------------------------------
1 file changed, 42 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 21adb7ff7b2e..adb8fcef738a 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1364,47 +1364,6 @@ static bool io_identity_cow(struct io_kiocb *req)
return true;
}
-static void io_drop_identity(struct io_kiocb *req)
-{
- struct io_identity *id = req->work.identity;
-
- if (req->work.flags & IO_WQ_WORK_MM) {
- mmdrop(id->mm);
- req->work.flags &= ~IO_WQ_WORK_MM;
- }
-#ifdef CONFIG_BLK_CGROUP
- if (req->work.flags & IO_WQ_WORK_BLKCG) {
- css_put(id->blkcg_css);
- req->work.flags &= ~IO_WQ_WORK_BLKCG;
- }
-#endif
- if (req->work.flags & IO_WQ_WORK_CREDS) {
- put_cred(id->creds);
- req->work.flags &= ~IO_WQ_WORK_CREDS;
- }
- if (req->work.flags & IO_WQ_WORK_FILES) {
- put_files_struct(req->work.identity->files);
- put_nsproxy(req->work.identity->nsproxy);
- req->work.flags &= ~IO_WQ_WORK_FILES;
- }
- if (req->work.flags & IO_WQ_WORK_CANCEL)
- req->work.flags &= ~IO_WQ_WORK_CANCEL;
- if (req->work.flags & IO_WQ_WORK_FS) {
- struct fs_struct *fs = id->fs;
-
- spin_lock(&id->fs->lock);
- if (--fs->users)
- fs = NULL;
- spin_unlock(&id->fs->lock);
-
- if (fs)
- free_fs_struct(fs);
- req->work.flags &= ~IO_WQ_WORK_FS;
- }
- if (req->work.flags & IO_WQ_WORK_FSIZE)
- req->work.flags &= ~IO_WQ_WORK_FSIZE;
-}
-
static bool io_grab_identity(struct io_kiocb *req)
{
const struct io_op_def *def = &io_op_defs[req->opcode];
@@ -1510,7 +1469,6 @@ static void io_prep_async_work(struct io_kiocb *req)
if (io_grab_identity(req))
return;
- io_drop_identity(req);
if (!io_identity_cow(req))
return;
--
2.25.1
[PATCH openEuler-1.0-LTS 1/3] selinux: further adjust init order for file_alloc_security hook
by Yongqiang Liu 27 Feb '23
From: "GONG, Ruiqi" <gongruiqi1(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6DRJ1
CVE: NA
----------------------------------------
After backporting commit cfff75d8973a ("selinux: reorder hooks to make
runtime disable less broken") to the 4.19 kernel of openEuler-1.0-LTS,
another kernel panic was triggered by running the POC of the
aforementioned commit:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
PGD 800000001840b067 P4D 800000001840b067 PUD 1840c067 PMD 0
Oops: 0002 [#1] SMP PTI
CPU: 7 PID: 273 Comm: exe Tainted: G OE 4.19.90+ #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
RIP: 0010:selinux_file_open+0x49/0xf0
Code: 00 00 00 65 48 8b 04 25 28 00 00 00 48 89 44 24 20 31 c0 4c 89 e7 e8 a6 ec ff ff 49 8b 44 24 38 48 c7 c7 e0 a5 13 97 8b 40 1c <89> 45 08 e8 6f 80 ff ff ba 02 00 00 00 89 45 0c 8b 43 44 8b 73 40
RSP: 0018:ffffbb7300867ba0 EFLAGS: 00010246
RAX: 0000000000000003 RBX: ffff9dc301961400 RCX: 00000000000081ed
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffff9713a5e0
RBP: 0000000000000000 R08: 0000000000000001 R09: ffff9dc301fedcb0
R10: 0000000000000007 R11: 7fffffffffffffff R12: ffff9dc30204fd70
R13: 0000000000000000 R14: ffff9dc301961410 R15: ffffbb7300867c70
FS: 0000000000d258c0(0000) GS:ffff9dc33e9c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 00000000022bc000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
? generic_permission+0x10a/0x190
security_file_open+0x26/0x90
do_dentry_open+0xd9/0x380
do_last+0x197/0x8d0
path_openat+0x89/0x280
do_filp_open+0x91/0x100
do_open_execat+0x79/0x180
__do_execve_file.isra.0+0x6dd/0x8b0
__x64_sys_execve+0x35/0x40
do_syscall_64+0x63/0x250
? async_page_fault+0x8/0x30
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x49a5db
Code: 41 89 01 eb da 66 2e 0f 1f 84 00 00 00 00 00 f7 d8 64 41 89 01 eb d6 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 3b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffe7b1cebd8 EFLAGS: 00000246 ORIG_RAX: 000000000000003b
RAX: ffffffffffffffda RBX: 0000000000d27ee0 RCX: 000000000049a5db
RDX: 0000000000d27f08 RSI: 0000000000d27ee0 RDI: 0000000000d27f48
RBP: 0000000000d27f48 R08: fefefefefefefeff R09: fefefeff666d686f
R10: 0000000000d25b90 R11: 0000000000000246 R12: 0000000000d27f08
R13: 0000000000655894 R14: 0000000000d27f08 R15: 0000000000d26ed0
Modules linked in: e1000(OE)
CR2: 0000000000000008
---[ end trace e4eb884974c22e2d ]---
RIP: 0010:selinux_file_open+0x49/0xf0
Code: 00 00 00 65 48 8b 04 25 28 00 00 00 48 89 44 24 20 31 c0 4c 89 e7 e8 a6 ec ff ff 49 8b 44 24 38 48 c7 c7 e0 a5 13 97 8b 40 1c <89> 45 08 e8 6f 80 ff ff ba 02 00 00 00 89 45 0c 8b 43 44 8b 73 40
RSP: 0018:ffffbb7300867ba0 EFLAGS: 00010246
RAX: 0000000000000003 RBX: ffff9dc301961400 RCX: 00000000000081ed
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffff9713a5e0
RBP: 0000000000000000 R08: 0000000000000001 R09: ffff9dc301fedcb0
R10: 0000000000000007 R11: 7fffffffffffffff R12: ffff9dc30204fd70
R13: 0000000000000000 R14: ffff9dc301961410 R15: ffffbb7300867c70
FS: 0000000000d258c0(0000) GS:ffff9dc33e9c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 00000000022bc000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x14400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---
The problem was caused by selinux_file_open() dereferencing a file's
fsec that was NULL, which indicates that the file_alloc_security hook
should be removed later (at least after the file_open hook) when
disabling SELinux at runtime. Here I move it into the "allocating" part.
Signed-off-by: GONG, Ruiqi <gongruiqi1(a)huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
security/selinux/hooks.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index cd26c1199353..85ac12d6b2f4 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -7040,7 +7040,6 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
LSM_HOOK_INIT(inode_copy_up_xattr, selinux_inode_copy_up_xattr),
LSM_HOOK_INIT(file_permission, selinux_file_permission),
- LSM_HOOK_INIT(file_alloc_security, selinux_file_alloc_security),
LSM_HOOK_INIT(file_free_security, selinux_file_free_security),
LSM_HOOK_INIT(file_ioctl, selinux_file_ioctl),
LSM_HOOK_INIT(mmap_file, selinux_mmap_file),
@@ -7207,6 +7206,7 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
LSM_HOOK_INIT(shm_alloc_security, selinux_shm_alloc_security),
LSM_HOOK_INIT(sb_alloc_security, selinux_sb_alloc_security),
LSM_HOOK_INIT(inode_alloc_security, selinux_inode_alloc_security),
+ LSM_HOOK_INIT(file_alloc_security, selinux_file_alloc_security),
LSM_HOOK_INIT(cred_alloc_blank, selinux_cred_alloc_blank),
LSM_HOOK_INIT(sem_alloc_security, selinux_sem_alloc_security),
LSM_HOOK_INIT(secid_to_secctx, selinux_secid_to_secctx),
--
2.25.1
[PATCH openEuler-1.0-LTS 1/1] nbd: fix assignment error for first_minor in nbd_dev_add
by Yongqiang Liu 25 Feb '23
From: Zhong Jinghua <zhongjinghua(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 188413, https://gitee.com/openeuler/kernel/issues/I6GWYG
CVE: NA
----------------------------------------
A panic error is like below:
nbd_genl_connect
nbd_dev_add
first_minor = index << part_shift; // index =-1
...
__device_add_disk
blk_alloc_devt
*devt = MKDEV(disk->major, disk->first_minor + part->partno);
// part->partno = 0, first_minor = 11...110000 major is covered
There, index < 0 will reassign an index, but here disk->first_minor is
assigned -1 << part_shift.
This causes to the creation of the device with the same major and minor
device numbers each time the incoming index<0, and this will lead to
creation of kobject failed:
Warning: kobject_add_internal failed for 4095:1048544 with -EEXIST, don't
try to register things with the same name in the same directory.
Fix it by moving the first_minor assignment down to after getting the new
index.
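For illustration only (not part of this patch): a small user-space sketch of the arithmetic, assuming part_shift is 5 and the MKDEV()/MINORBITS layout from include/linux/kdev_t.h; it reproduces the 4095:1048544 pair from the warning above.

#include <stdio.h>

/* Mirrors include/linux/kdev_t.h: 12-bit major, 20-bit minor. */
#define MINORBITS       20
#define MKDEV(ma, mi)   (((ma) << MINORBITS) | (mi))
#define NBD_MAJOR       43

int main(void)
{
        int part_shift = 5;             /* assumed value of the module param */
        int index = -1;                 /* index before reassignment */
        unsigned int first_minor = (unsigned int)index << part_shift;
        unsigned int devt = MKDEV(NBD_MAJOR, first_minor);

        /* The sign-extended "minor" spills into the major field:
         * prints 4095:1048544, matching the kobject warning. */
        printf("%u:%u\n", devt >> MINORBITS,
               devt & ((1u << MINORBITS) - 1));
        return 0;
}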
Fixes: 01f7594e62e9 ("nbd: Fix use-after-free in blk_mq_free_rqs")
Signed-off-by: Zhong Jinghua <zhongjinghua(a)huawei.com>
Reviewed-by: Yu Kuai <yukuai3(a)huawei.com>
Reviewed-by: Hou Tao <houtao1(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
drivers/block/nbd.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 8dbbd676d275..551b2f7da8ef 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1706,7 +1706,6 @@ static int nbd_dev_add(int index)
struct gendisk *disk;
struct request_queue *q;
int err = -ENOMEM;
- int first_minor = index << part_shift;
nbd = kzalloc(sizeof(struct nbd_device), GFP_KERNEL);
if (!nbd)
@@ -1770,7 +1769,7 @@ static int nbd_dev_add(int index)
refcount_set(&nbd->refs, 1);
INIT_LIST_HEAD(&nbd->list);
disk->major = NBD_MAJOR;
- disk->first_minor = first_minor;
+ disk->first_minor = index << part_shift;
disk->fops = &nbd_fops;
disk->private_data = nbd;
sprintf(disk->disk_name, "nbd%d", index);
--
2.25.1
*** BLURB HERE ***
Andy Shevchenko (1):
ipmi: use %*ph to print small buffer
Dan Carpenter (1):
evm: Fix a small race in init_desc()
Eric Biggers (1):
crypto: rsa-pkcs1pad - fix buffer overread in
pkcs1pad_verify_complete()
Greg Kroah-Hartman (1):
iommu: Properly export iommu_group_get_for_dev()
Herbert Xu (2):
crypto: algif_skcipher - EBUSY on aio should be an error
crypto: algif_skcipher - Use chunksize instead of blocksize
Lubomir Rintel (1):
component: do not dereference opaque pointer in debugfs
Nishka Dasgupta (2):
of: unittest: Add of_node_put() before return
of: resolver: Add of_node_put() before return and break
Ondrej Mosnacek (1):
selinux: reorder hooks to make runtime disable less broken
Roberto Sassu (1):
evm: Check also if *tfm is an error pointer in init_desc()
Will Deacon (2):
drivers/iommu: Export core IOMMU API symbols to permit modular drivers
drivers/iommu: Allow IOMMU bus ops to be unregistered
crypto/algif_skcipher.c | 4 +-
crypto/rsa-pkcs1pad.c | 2 +
drivers/base/component.c | 8 +--
drivers/char/ipmi/ipmi_msghandler.c | 27 ++-------
drivers/iommu/iommu-sysfs.c | 5 ++
drivers/iommu/iommu.c | 12 ++++
drivers/of/resolver.c | 12 +++-
drivers/of/unittest.c | 4 +-
security/integrity/evm/evm_crypto.c | 45 +++++++-------
security/selinux/hooks.c | 93 ++++++++++++++++++++---------
10 files changed, 129 insertions(+), 83 deletions(-)
--
2.25.1
Agenda for this meeting:
Topic 1: progress update (10 min) -- Zhang Jialin & Zheng Zengkai
Topic 2: planning discussion for the openEuler real-time kernel (PREEMPT_RT) -- Ding Xiang
Further topic proposals are welcome.
-----Original Appointment-----
From: openEuler conference <public(a)openeuler.org>
Sent: 2023-02-21 10:06
To: dev@openeuler.org,kernel-discuss@openeuler.org,kernel@openeuler.org
Subject: openEuler Kernel SIG biweekly meeting
Time: Friday, 2023-02-24, 14:00-15:30 (UTC+08:00) Beijing, Chongqing, Hong Kong SAR, Urumqi
Location:
Hello!
The Kernel SIG invites you to a Zoom conference (auto-recorded) to be held at 2023-02-24 14:00.
Subject: openEuler Kernel SIG biweekly meeting
Agenda:
1. Progress update
2. Call for topics
New topics can be proposed by replying to this mail or adding them to the meeting board.
Meeting link: https://us06web.zoom.us/j/82111503728?pwd=TWRGa2REc0V2ZmlGZ2ZreUF1OXA3dz09
Minutes and topic board: https://etherpad.openeuler.org/p/Kernel-meetings
Note: You are advised to change the participant name after joining the conference, or use your ID at gitee.com.
More information: https://openeuler.org/zh/ or https://openeuler.org/en/
[PATCH openEuler-1.0-LTS] dhugetlb: isolate hwpoison hugepage when release
by Yongqiang Liu 23 Feb '23
From: Liu Shixin <liushixin2(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: 46904, https://gitee.com/openeuler/kernel/issues/I6GSKP
CVE: NA
--------------------------------
For a hwpoisoned hugetlb page, the page is freed first. If that succeeds,
it is dissolved and released to the buddy system, and the hwpoisoned page
is then isolated. For a hwpoisoned hugepage belonging to dynamic hugetlb,
we isolate the hugepage without dissolving it. Add a check in
free_huge_page_to_dhugetlb_pool() to isolate the hwpoisoned hugepage
directly, and keep HUGETLB_PAGE_DTOR after free to ensure that the
PageHuge() check returns true in dissolve_free_huge_page().
Fixes: 0f0535e57da ("dhugetlb: skip dissolve hugepage belonging to dynamic hugetlb")
Signed-off-by: Liu Shixin <liushixin2(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
mm/hugetlb.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ed89df6fc5de..9f974270c84d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1348,8 +1348,9 @@ static void free_huge_page_to_dhugetlb_pool(struct page *page,
}
spin_lock(&hpool->lock);
+ if (PageHWPoison(page))
+ goto out;
ClearPagePool(page);
- set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
if (!hstate_is_gigantic(h)) {
list_add(&page->lru, &hpool->dhugetlb_2M_freelists);
hpool->free_reserved_2M++;
@@ -1375,6 +1376,7 @@ static void free_huge_page_to_dhugetlb_pool(struct page *page,
trace_dhugetlb_alloc_free(hpool, page, hpool->free_reserved_1G,
DHUGETLB_FREE_1G);
}
+out:
spin_unlock(&hpool->lock);
dhugetlb_pool_put(hpool);
}
--
2.25.1
1
0

[PATCH openEuler-5.10-LTS-SP1 01/29] media: vivid: fix compose size exceed boundary
by Jialin Zhang 22 Feb '23
From: Liu Shixin <liushixin2(a)huawei.com>
stable inclusion
from stable-v5.10.163
commit f9d19f3a044ca651b0be52a4bf951ffe74259b9f
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6CIF8
CVE: CVE-2023-0615
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
[ Upstream commit 94a7ad9283464b75b12516c5512541d467cefcf8 ]
syzkaller found a bug:
BUG: unable to handle page fault for address: ffffc9000a3b1000
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 100000067 P4D 100000067 PUD 10015f067 PMD 1121ca067 PTE 0
Oops: 0002 [#1] PREEMPT SMP
CPU: 0 PID: 23489 Comm: vivid-000-vid-c Not tainted 6.1.0-rc1+ #512
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:memcpy_erms+0x6/0x10
[...]
Call Trace:
<TASK>
? tpg_fill_plane_buffer+0x856/0x15b0
vivid_fillbuff+0x8ac/0x1110
vivid_thread_vid_cap_tick+0x361/0xc90
vivid_thread_vid_cap+0x21a/0x3a0
kthread+0x143/0x180
ret_from_fork+0x1f/0x30
</TASK>
This is because we forget to check the boundary after adjusting
compose->height in the V4L2_SEL_TGT_CROP case. Add v4l2_rect_map_inside()
to fix this problem for this case.
Fixes: ef834f7836ec ("[media] vivid: add the video capture and output parts")
Signed-off-by: Liu Shixin <liushixin2(a)huawei.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Wang Hai <wanghai38(a)huawei.com>
Signed-off-by: Longlong Xia <xialonglong1(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
drivers/media/test-drivers/vivid/vivid-vid-cap.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/media/test-drivers/vivid/vivid-vid-cap.c b/drivers/media/test-drivers/vivid/vivid-vid-cap.c
index eadf28ab1e39..eeb0aeb62f79 100644
--- a/drivers/media/test-drivers/vivid/vivid-vid-cap.c
+++ b/drivers/media/test-drivers/vivid/vivid-vid-cap.c
@@ -953,6 +953,7 @@ int vivid_vid_cap_s_selection(struct file *file, void *fh, struct v4l2_selection
if (dev->has_compose_cap) {
v4l2_rect_set_min_size(compose, &min_rect);
v4l2_rect_set_max_size(compose, &max_rect);
+ v4l2_rect_map_inside(compose, &fmt);
}
dev->fmt_cap_rect = fmt;
tpg_s_buf_height(&dev->tpg, fmt.height);
--
2.25.1
[PATCH openEuler-5.10-LTS 01/18] media: vivid: fix compose size exceed boundary
by Jialin Zhang 22 Feb '23
From: Liu Shixin <liushixin2(a)huawei.com>
stable inclusion
from stable-v5.10.163
commit f9d19f3a044ca651b0be52a4bf951ffe74259b9f
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6CIF8
CVE: CVE-2023-0615
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
--------------------------------
[ Upstream commit 94a7ad9283464b75b12516c5512541d467cefcf8 ]
syzkaller found a bug:
BUG: unable to handle page fault for address: ffffc9000a3b1000
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 100000067 P4D 100000067 PUD 10015f067 PMD 1121ca067 PTE 0
Oops: 0002 [#1] PREEMPT SMP
CPU: 0 PID: 23489 Comm: vivid-000-vid-c Not tainted 6.1.0-rc1+ #512
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:memcpy_erms+0x6/0x10
[...]
Call Trace:
<TASK>
? tpg_fill_plane_buffer+0x856/0x15b0
vivid_fillbuff+0x8ac/0x1110
vivid_thread_vid_cap_tick+0x361/0xc90
vivid_thread_vid_cap+0x21a/0x3a0
kthread+0x143/0x180
ret_from_fork+0x1f/0x30
</TASK>
This is because we forget to check the boundary after adjusting
compose->height in the V4L2_SEL_TGT_CROP case. Add v4l2_rect_map_inside()
to fix this problem for this case.
Fixes: ef834f7836ec ("[media] vivid: add the video capture and output parts")
Signed-off-by: Liu Shixin <liushixin2(a)huawei.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Wang Hai <wanghai38(a)huawei.com>
Signed-off-by: Longlong Xia <xialonglong1(a)huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11(a)huawei.com>
---
drivers/media/test-drivers/vivid/vivid-vid-cap.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/media/test-drivers/vivid/vivid-vid-cap.c b/drivers/media/test-drivers/vivid/vivid-vid-cap.c
index eadf28ab1e39..eeb0aeb62f79 100644
--- a/drivers/media/test-drivers/vivid/vivid-vid-cap.c
+++ b/drivers/media/test-drivers/vivid/vivid-vid-cap.c
@@ -953,6 +953,7 @@ int vivid_vid_cap_s_selection(struct file *file, void *fh, struct v4l2_selection
if (dev->has_compose_cap) {
v4l2_rect_set_min_size(compose, &min_rect);
v4l2_rect_set_max_size(compose, &max_rect);
+ v4l2_rect_map_inside(compose, &fmt);
}
dev->fmt_cap_rect = fmt;
tpg_s_buf_height(&dev->tpg, fmt.height);
--
2.25.1
[PATCH openEuler-1.0-LTS] mm/sharepool: Fix null-pointer-deference in sp_free_area
by Yongqiang Liu 22 Feb '23
From: Wang Wensheng <wangwensheng4(a)huawei.com>
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6G76L
CVE: NA
----------------------------------------------
Once a process has been deleted from a group, it must no longer allocate
memory from the shared group; otherwise a UAF occurs. We checked for
this, but the check did not actually prevent sp_alloc and del_task from
running concurrently: a process could still allocate memory after passing
the check, which violates that requirement and causes the problem. The
fix is to move the check into the critical section, ensuring that no
memory can be allocated once the check has passed.
[ T7596] Unable to handle kernel NULL pointer dereference at virtual
address 0000000000000098
[ T7596] Mem abort info:
[ T7596] ESR = 0x96000004
[ T7596] EC = 0x25: DABT (current EL), IL = 32 bits
[ T7596] SET = 0, FnV = 0
[ T7596] EA = 0, S1PTW = 0
[ T7596] Data abort info:
[ T7596] ISV = 0, ISS = 0x00000004
[ T7596] CM = 0, WnR = 0
[ T7596] user pgtable: 4k pages, 48-bit VAs, pgdp=00000001040a3000
[ T7596] [0000000000000098] pgd=0000000000000000, p4d=0000000000000000
[ T7596] Internal error: Oops: 96000004 [#1] SMP
[ T7596] Modules linked in: sharepool_dev(OE) [last unloaded: demo]
[ T7596] CPU: 1 PID: 7596 Comm: test_sp_group_d Tainted: G OE 5.10.0+ #8
[ T7596] Hardware name: linux,dummy-virt (DT)
[ T7596] pstate: 20000005 (nzCv daif -PAN -UAO -TCO BTYPE=--)
[ T7596] pc : sp_free_area+0x34/0x120
[ T7596] lr : sp_free_area+0x30/0x120
[ T7596] sp : ffff80001c6a3b20
[ T7596] x29: ffff80001c6a3b20 x28: 0000000000000009
[ T7596] x27: 0000000000000000 x26: ffff800011c49d20
[ T7596] x25: ffff0000c227f6c0 x24: 0000000000000008
[ T7596] x23: ffff0000c0cf0ce8 x22: 0000000000000001
[ T7596] x21: ffff0000c4082b30 x20: 0000000000000000
[ T7596] x19: ffff0000c4082b00 x18: 0000000000000000
[ T7596] x17: 0000000000000000 x16: 0000000000000000
[ T7596] x15: 0000000000000000 x14: 0000000000000000
[ T7596] x13: 0000000000000000 x12: ffff0005fffe12c0
[ T7596] x11: 0000000000000008 x10: ffff0005fffe12c0
[ T7596] x9 : ffff8000103eb690 x8 : 0000000000000001
[ T7596] x7 : 0000000000210d00 x6 : 0000000000000000
[ T7596] x5 : ffff8000123edea0 x4 : 0000000000000030
[ T7596] x3 : ffffeff000000000 x2 : 0000eff000000000
[ T7596] x1 : 0000e80000000000 x0 : 0000000000000000
[ T7596] Call trace:
[ T7596] sp_free_area+0x34/0x120
[ T7596] __sp_area_drop_locked+0x3c/0x60
[ T7596] sp_area_drop+0x80/0xbc
[ T7596] remove_vma+0x54/0x70
[ T7596] exit_mmap+0x114/0x1d0
[ T7596] mmput+0x90/0x1ec
[ T7596] exit_mm+0x1d0/0x2f0
[ T7596] do_exit+0x180/0x400
[ T7596] do_group_exit+0x40/0x114
[ T7596] get_signal+0x1e8/0x720
[ T7596] do_signal+0x11c/0x1e4
[ T7596] do_notify_resume+0x15c/0x250
[ T7596] work_pending+0xc/0x6d8
[ T7596] Code: f9400001 f9402c00 97fff0e5 aa0003f4 (f9404c00)
[ T7596] ---[ end trace 3c8368d77e758ebd ]---
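For illustration only (not part of the patch): a generic user-space sketch of the check-then-act race described above, with hypothetical names; the cure, as in the hunk below, is doing the emptiness check under the same lock that guards the removal.

#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t grp_lock = PTHREAD_MUTEX_INITIALIZER;
static int grp_alloc_count;     /* stand-in for spg->spa_list */

/* Buggy: the check runs before the lock is taken, so an allocation
 * can slip in between the check and the removal. */
static bool del_task_buggy(void)
{
        if (grp_alloc_count != 0)       /* unlocked check */
                return false;
        pthread_mutex_lock(&grp_lock);
        /* ... remove the task from the group ... */
        pthread_mutex_unlock(&grp_lock);
        return true;
}

/* Fixed: check and removal form one atomic step under the lock,
 * mirroring how the patch moves the spa_list check under
 * spg->rw_lock. */
static bool del_task_fixed(void)
{
        bool ok = false;

        pthread_mutex_lock(&grp_lock);
        if (grp_alloc_count == 0) {
                /* ... remove the task from the group ... */
                ok = true;
        }
        pthread_mutex_unlock(&grp_lock);
        return ok;
}

int main(void)
{
        return del_task_buggy() && del_task_fixed() ? 0 : 1;
}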
Signed-off-by: Wang Wensheng <wangwensheng4(a)huawei.com>
Reviewed-by: Weilong Chen <chenweilong(a)huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13(a)huawei.com>
---
mm/share_pool.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/mm/share_pool.c b/mm/share_pool.c
index 5d8344fe805a..a136c70b7588 100644
--- a/mm/share_pool.c
+++ b/mm/share_pool.c
@@ -1712,14 +1712,6 @@ int mg_sp_group_del_task(int pid, int spg_id)
ret = -EINVAL;
goto out;
}
-
- if (!list_empty(&spg->spa_list)) {
- up_write(&sp_group_sem);
- pr_err_ratelimited("spa is not empty");
- ret = -EINVAL;
- goto out;
- }
-
ret = get_task(pid, &tsk);
if (ret) {
up_write(&sp_group_sem);
@@ -1743,6 +1735,15 @@ int mg_sp_group_del_task(int pid, int spg_id)
}
down_write(&spg->rw_lock);
+
+ if (!list_empty(&spg->spa_list)) {
+ up_write(&spg->rw_lock);
+ up_write(&sp_group_sem);
+ pr_err_ratelimited("spa is not empty");
+ ret = -EINVAL;
+ goto out_put_mm;
+ }
+
if (list_is_singular(&spg->procs))
is_alive = spg->is_alive = false;
spg->proc_num--;
--
2.25.1
Hello!
The Kernel SIG invites you to a Zoom conference (auto-recorded) to be held at 2023-02-24 14:00.
Subject: openEuler Kernel SIG biweekly meeting
Agenda:
1. Progress update
2. Call for topics
New topics can be proposed by replying to this mail or adding them to the meeting board.
Meeting link: https://us06web.zoom.us/j/82111503728?pwd=TWRGa2REc0V2ZmlGZ2ZreUF1OXA3dz09
Minutes and topic board: https://etherpad.openeuler.org/p/Kernel-meetings
Note: You are advised to change the participant name after joining the conference, or use your ID at gitee.com.
More information: https://openeuler.org/zh/ or https://openeuler.org/en/
Hello!
The sig-Intel-Arch SIG invites you to a Zoom conference to be held at 2023-02-21 10:00.
Subject: Intel Arch SIG regular meeting
Agenda:
1. Support plan for several complex Intel features
2. Discussion of QEMU and kernel version selection, and of the release schedule, for future Intel platform support
3. User requirements and feedback for openEuler on SPR
Meeting link: https://us06web.zoom.us/j/87058792984?pwd=UFlzSzA4OHNyMEIvRkpCdWFwdEdRQT09
Minutes and topic board: https://etherpad.openeuler.org/p/sig-Intel-Arch-meetings
Note: You are advised to change the participant name after joining the conference, or use your ID at gitee.com.
More information: https://openeuler.org/zh/ or https://openeuler.org/en/
kernel-leave(a)openeuler.org
Best Regards
Yang Guofeng, Embedded Software Development Dept.
Amicro Semiconductor Co., Ltd.
Hengqin HQ: 27/F, Tower 2, ICC Hengqin International Commerce Center, Hengqin New Area, Zhuhai
Xiangzhou R&D Center: 13/F, Yangguang Building, 26 Hongshan Road, Xiangzhou District
Shenzhen Office: Room 1102, Tower A, National Engineering Laboratory Building, Nanshan District, Shenzhen
Tel: 0756-2666456
Postcode: 519000
Phone: 18814181955
E-Mail: guofeng.yang(a)amicro.com.cn